Reverse Tunnels - Implementation Guide
What’s Been Implemented (Backend Complete ✅)
1. Database Layer
- ✅ reverse_tunnels table with all required fields
- ✅ Migration: 20251114_2139_58c521e3f181_add_reverse_tunnels_for_public_urls.py
- ✅ ReverseTunnel SQLAlchemy model with relationships
2. Service Layer
- ✅ Extended IoTSecureTunnelService with reverse tunnel methods:
  - create_reverse_tunnel() - Creates AWS IoT Secure Tunnel
  - close_reverse_tunnel() - Closes tunnel
  - rotate_reverse_tunnel() - Rotates tunnel (close old, create new)
  - ensure_reverse_tunnel_active() - Ensures tunnel is active (auto-creates/rotates)
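The auto-create/rotate behaviour of ensure_reverse_tunnel_active() can be sketched roughly as below. This is a minimal illustration, not the service's actual code: the TunnelRecord shape, the 60-minute rotation threshold, and the assumption that the service methods accept a record argument are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold: rotate when less than this much lifetime remains.
ROTATE_BEFORE_EXPIRY = timedelta(minutes=60)


@dataclass
class TunnelRecord:
    """Simplified stand-in for a row in the reverse_tunnels table (assumed shape)."""
    tunnel_id: Optional[str]        # AWS tunnel ID, None if never created
    expires_at: Optional[datetime]  # when the AWS tunnel reaches its max lifetime


def decide_action(record: TunnelRecord, now: datetime) -> str:
    """Return 'create', 'rotate', or 'keep' for a tunnel record."""
    if record.tunnel_id is None or record.expires_at is None:
        return "create"
    if record.expires_at <= now:
        return "create"  # already expired, so a fresh tunnel is needed
    if record.expires_at - now <= ROTATE_BEFORE_EXPIRY:
        return "rotate"  # close old, create new
    return "keep"


def ensure_active(record: TunnelRecord, service, now: Optional[datetime] = None) -> str:
    """Apply the decision via a service object exposing the methods listed above."""
    now = now or datetime.now(timezone.utc)
    action = decide_action(record, now)
    if action == "create":
        service.create_reverse_tunnel(record)   # assumed signature
    elif action == "rotate":
        service.rotate_reverse_tunnel(record)   # assumed signature
    return action
```

Keeping the decision in a pure function makes the threshold logic easy to unit-test without touching AWS.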
3. API Endpoints
- ✅ GET /reverse-tunnels/devices/{dsn}/reverse-tunnels - List tunnels
- ✅ POST /reverse-tunnels/devices/{dsn}/reverse-tunnels - Create tunnel (creates AWS tunnel immediately)
- ✅ GET /reverse-tunnels/devices/{dsn}/reverse-tunnels/{id} - Get tunnel details
- ✅ PUT /reverse-tunnels/devices/{dsn}/reverse-tunnels/{id} - Update tunnel
- ✅ DELETE /reverse-tunnels/devices/{dsn}/reverse-tunnels/{id} - Delete tunnel (closes AWS tunnel)
- ✅ GET /reverse-tunnels/devices/{dsn}/reverse-tunnels-config - Device config endpoint (ensures tunnels active)
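As an illustration of calling the create endpoint, the sketch below builds (but does not send) the POST request. The path comes from the list above. The payload field names (name, port, protocol), the Bearer auth header, and the base URL are assumptions and may not match the real API schema.

```python
import json
import urllib.parse
import urllib.request


def build_create_tunnel_request(base_url: str, dsn: str, token: str,
                                payload: dict) -> urllib.request.Request:
    """Build a POST request for the create-tunnel endpoint without sending it."""
    safe_dsn = urllib.parse.quote(dsn, safe="")  # avoid path injection via the DSN
    url = f"{base_url.rstrip('/')}/reverse-tunnels/devices/{safe_dsn}/reverse-tunnels"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


# Hypothetical payload; field names are guesses based on the create dialog.
example_request = build_create_tunnel_request(
    "https://api.example.com", "abc123", "my-token",
    {"name": "web-ui", "port": 8080, "protocol": "http"},
)
```

Passing the result to urllib.request.urlopen() would send it; the remaining endpoints follow the same pattern with a different method and path.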
4. Automatic Maintenance
- ✅ Celery task: cleanup_expired_reverse_tunnels (hourly)
- ✅ Celery task: rotate_expiring_reverse_tunnels (every 30 min)
5. Frontend UI
- ✅ Reverse Tunnels page at /reverse-tunnels
- ✅ Device selector dropdown
- ✅ Create tunnel dialog (name, port, protocol, auth)
- ✅ Tunnels table with enable/disable, copy URL, delete
- ✅ Info banner about config applying on startup
- ✅ Navigation menu item added
6. Infrastructure (AWS CDK)
- ✅ DynamoDB table for tunnel configuration cache
- ✅ Lambda proxy function (NodeJS) for routing traffic
- ✅ Lambda cache warmer (Python) to sync PostgreSQL → DynamoDB
- ✅ EventBridge rule to run cache warmer every 5 minutes
- ✅ IAM permissions for backend to manage IoT Secure Tunnels
- ✅ IAM permissions for Lambda proxy to access tunnels
- ✅ Backend environment variable for tunnel cache table
- ✅ API Gateway HTTP API with Lambda integration
- ✅ ACM certificate for *.fleet.roboticks.io
- ✅ CloudFront distribution for TLS termination
- ✅ Route53 wildcard DNS records (A and AAAA)
CDK resources (infrastructure/lib/roboticks-stack.ts):
- ReverseTunnelCache - DynamoDB table with GSI on tunnel_id
- ReverseTunnelProxy - Lambda function for traffic proxying
- TunnelCacheWarmer - Lambda function to sync configs
- TunnelCacheWarmerRule - EventBridge schedule (5 min intervals)
- ReverseTunnelApi - API Gateway HTTP API
- FleetCertificate - ACM wildcard certificate for *.fleet.roboticks.io
- TunnelDistribution - CloudFront distribution with custom domain
- FleetWildcardARecord - Route53 A record for *.fleet.roboticks.io
- FleetWildcardAAAARecord - Route53 AAAA record for IPv6
- Backend IAM policy for IoT Secure Tunneling operations
- Comprehensive outputs (API URL, CloudFront domain, certificate ARN)
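The PostgreSQL → DynamoDB sync performed by the cache warmer can be sketched as follows. This is an assumption-heavy illustration: the row keys, the item attribute names, and the choice to skip disabled tunnels are hypothetical. Only the put_item(Item=...) call mirrors the boto3 DynamoDB Table resource, and the table object is injected so the logic can be exercised without AWS.

```python
from typing import Any, Dict, Iterable


def to_cache_item(row: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one PostgreSQL row (as a dict) into a DynamoDB item (assumed schema)."""
    return {
        "dsn": row["dsn"],                    # assumed partition key
        "tunnel_id": str(row["tunnel_id"]),   # indexed by the GSI mentioned above
        "local_port": int(row["local_port"]),
        "enabled": bool(row["enabled"]),
    }


def warm_cache(rows: Iterable[Dict[str, Any]], table) -> int:
    """Write enabled tunnel rows to the cache table; return how many were written."""
    written = 0
    for row in rows:
        if not row.get("enabled"):
            continue  # assumption: disabled tunnels are not cached
        table.put_item(Item=to_cache_item(row))
        written += 1
    return written
```

A production warmer would presumably also remove cache entries for tunnels that were deleted or disabled since the previous run; that step is omitted here.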
What Needs To Be Implemented
1. Device Agent (Roboticks SDK)
Location: /Users/mujacic/roboticks-sdk (Python SDK for devices)
Required Changes:
A. Add Reverse Tunnel Manager Module
Create roboticks_sdk/reverse_tunnel_manager.py:
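A minimal sketch of what such a module might contain is shown below, assuming the device runs the AWS IoT Secure Tunneling localproxy binary in destination mode. The -r and -d flags and the AWSIOT_TUNNEL_ACCESS_TOKEN environment variable come from the localproxy tool. The config keys (local_port, destination_access_token) and function names are hypothetical.

```python
import os
import subprocess
from typing import Dict, List, Optional


def build_localproxy_command(region: str, local_port: int,
                             binary: str = "localproxy") -> List[str]:
    """Build the localproxy command line for destination mode."""
    return [binary, "-r", region, "-d", f"localhost:{local_port}"]


def build_localproxy_env(access_token: str,
                         base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Pass the token via the environment so it does not appear in process listings."""
    env = dict(os.environ if base_env is None else base_env)
    env["AWSIOT_TUNNEL_ACCESS_TOKEN"] = access_token
    return env


def start_tunnel(config: dict, region: str) -> subprocess.Popen:
    """Launch localproxy for one tunnel config entry (assumed config shape)."""
    cmd = build_localproxy_command(region, config["local_port"])
    env = build_localproxy_env(config["destination_access_token"])
    return subprocess.Popen(cmd, env=env)
```

The config entries would come from the reverse-tunnels-config endpoint; its actual response schema should be checked before relying on these key names.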
B. Integrate into Device Startup
Modify roboticks_sdk/device.py (main device class):
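One possible integration pattern, sketched under the assumption that tunnel setup should never block or crash device startup, is to run it in a background thread and log failures. The helper name and logger name are hypothetical; the real device class may be structured differently.

```python
import logging
import threading
from typing import Callable

logger = logging.getLogger("roboticks_sdk.device")  # hypothetical logger name


def start_tunnels_in_background(start_fn: Callable[[], None]) -> threading.Thread:
    """Run tunnel startup in a daemon thread so device startup is not blocked."""
    def _run() -> None:
        try:
            start_fn()
        except Exception:
            # A tunnel failure should not take the whole device agent down.
            logger.exception("Reverse tunnel startup failed; continuing without tunnels")

    thread = threading.Thread(target=_run, name="reverse-tunnel-startup", daemon=True)
    thread.start()
    return thread
```

The device class would call this once during startup, passing a function that fetches the tunnel config and launches the tunnels.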
C. Add Dependencies
Add to roboticks-sdk/requirements.txt:
2. DNS Configuration
Status: ✅ Route53 DNS Configured Automatically
The CDK stack automatically creates:
- ✅ Wildcard A record: *.fleet.roboticks.io → CloudFront
- ✅ Wildcard AAAA record (IPv6): *.fleet.roboticks.io → CloudFront
- ✅ ACM certificate for TLS with DNS validation
No manual DNS setup is required: the CDK deployment configures *.fleet.roboticks.io automatically.
Optional - Cloudflare Proxy (if using Cloudflare):
If you want to add Cloudflare in front of Route53 for additional DDoS protection:
- Log into Cloudflare Dashboard
- Add DNS Record:
  - Type: CNAME
  - Name: *.fleet
  - Target: CloudFront distribution domain (from CDK output)
  - Proxy status: DNS only (grey cloud) - CloudFront already provides DDoS protection
  - TTL: Auto
- SSL/TLS Settings:
  - Mode: Full (strict)
  - Always Use HTTPS: On
  - Minimum TLS Version: 1.2
3. Proxy Service (Traffic Forwarder)
Status: ✅ Complete infrastructure, ⏳ WebSocket/HTTP proxy logic pending
What’s Complete:
- ✅ Lambda function created (ReverseTunnelProxy in CDK stack)
- ✅ DynamoDB cache for fast tunnel lookups
- ✅ Cache warmer syncing PostgreSQL → DynamoDB every 5 minutes
- ✅ IAM permissions for IoT Secure Tunneling access
- ✅ API Gateway HTTP API with Lambda integration
- ✅ CloudFront distribution for TLS termination
- ✅ Route53 wildcard DNS (*.fleet.roboticks.io)
- ✅ ACM certificate for HTTPS
- ✅ Tunnel verification and status checking
What’s Pending:
- ⏳ WebSocket-to-WebSocket bidirectional proxy in Lambda
- ⏳ HTTP-to-IoT-Tunnel proxy in Lambda
- ⏳ AWS IoT Secure Tunneling Local Proxy protocol implementation
Current Lambda implementation (infrastructure/lambda/reverse-tunnel-proxy/index.js):
- Extracts DSN from subdomain (device-<dsn>.fleet.roboticks.io)
- Looks up tunnel configuration in DynamoDB cache
- Verifies tunnel is open in AWS IoT
- Returns success response (proxy logic marked as TODO)
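The DSN extraction step can be illustrated with the sketch below. The deployed Lambda is Node.js; this Python version only shows the parsing logic. The allowed DSN character set and the lowercasing of the result are assumptions.

```python
import re
from typing import Optional

# Assumed DSN alphabet: lowercase letters, digits and hyphens.
_HOST_RE = re.compile(r"device-([a-z0-9-]+)\.fleet\.roboticks\.io")


def extract_dsn(host: str) -> Optional[str]:
    """Return the DSN encoded in a Host header value, or None if it does not match."""
    hostname = host.strip().lower().split(":", 1)[0]  # drop any :port suffix
    match = _HOST_RE.fullmatch(hostname)
    return match.group(1) if match else None
```

Using fullmatch rejects hosts that merely contain the fleet domain as a substring, for example device-x.fleet.roboticks.io.evil.example.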
Option A: AWS Lambda + API Gateway (Serverless) ✅ Partially Implemented
Create Lambda function to proxy requests:
Option B: EC2/ECS Service (Persistent)
Deploy persistent proxy service:
4. Celery Beat Configuration
File: backend/celeryconfig.py or backend/app/core/celery.py
Add to beat schedule:
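A beat schedule matching the intervals stated earlier (hourly cleanup, rotation every 30 minutes) might look like the sketch below. The task names come from this guide, but the dotted module path app.tasks.reverse_tunnels is an assumption and should be replaced with wherever the tasks are registered.

```python
from datetime import timedelta

# Celery accepts a timedelta (or a number of seconds) as the "schedule" value.
beat_schedule = {
    "cleanup-expired-reverse-tunnels": {
        # Hypothetical module path; adjust to the real task location.
        "task": "app.tasks.reverse_tunnels.cleanup_expired_reverse_tunnels",
        "schedule": timedelta(hours=1),
    },
    "rotate-expiring-reverse-tunnels": {
        # Hypothetical module path; adjust to the real task location.
        "task": "app.tasks.reverse_tunnels.rotate_expiring_reverse_tunnels",
        "schedule": timedelta(minutes=30),
    },
}
```

With an existing Celery app object, this would typically be applied by assigning the dict to the app's conf.beat_schedule setting.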
Testing Checklist
Backend Testing
- Run migration: alembic upgrade head
- Create reverse tunnel via API
- Verify tunnel created in AWS IoT console
- Update tunnel (disable/enable)
- Delete tunnel (verify closed in AWS)
- Device config endpoint returns tunnel config
- Celery tasks run without errors
Device Testing
- Device fetches tunnel config on startup
- localproxy binary downloads/runs correctly
- Tunnel establishes connection to AWS
- Local service (e.g., web server on port 8080) accessible
End-to-End Testing
- Access device via public URL: https://device-abc123.fleet.roboticks.io
- Traffic routes correctly: Browser → Cloudflare → Proxy → IoT Tunnel → Device
- Tunnel auto-rotates before expiration
- Expired tunnels cleaned up by Celery task
Deployment Steps
- Deploy Backend
- Deploy Frontend
- Configure Cloudflare DNS: Add wildcard CNAME for *.fleet.roboticks.io
- Deploy Proxy Service: Deploy Lambda or EC2 proxy
- Update Device SDK: Push new version with tunnel manager
- Update Devices: Devices will fetch new SDK on next update
Monitoring
CloudWatch Metrics
- Tunnel creation success/failure rate
- Tunnel rotation count
- Active tunnels count
- Request count per tunnel
Logs
- Backend: /var/log/roboticks-api.log
- Celery: /var/log/celery-beat.log
- Device: Check device logs for tunnel establishment
Alerts
- Alert if tunnel creation fails > 5 times/hour
- Alert if Celery tasks fail
- Alert if > 10% of devices fail to establish tunnels
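The three alert conditions above can be expressed as a small check, sketched here with a hypothetical TunnelStats structure. In practice the inputs would come from CloudWatch metrics or the database, and the alerts would normally be CloudWatch alarms rather than application code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TunnelStats:
    """Hypothetical snapshot of the counters needed for alerting."""
    creation_failures_last_hour: int
    celery_task_failures: int
    devices_total: int
    devices_failed_to_connect: int


def evaluate_alerts(stats: TunnelStats) -> List[str]:
    """Return the names of the alert conditions that are currently firing."""
    alerts: List[str] = []
    if stats.creation_failures_last_hour > 5:
        alerts.append("tunnel_creation_failures")
    if stats.celery_task_failures > 0:
        alerts.append("celery_task_failures")
    # Guard against division by zero when no devices are registered.
    if (stats.devices_total > 0
            and stats.devices_failed_to_connect / stats.devices_total > 0.10):
        alerts.append("device_tunnel_failure_rate")
    return alerts
```

The thresholds are strict inequalities, matching the wording above: exactly 5 failures per hour or exactly 10% of devices does not fire an alert.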
Cost Estimation
AWS IoT Secure Tunneling
- First 500 tunnels/month: Free
- Additional tunnels: $0.01 per tunnel per month
- Data transfer: $0.10 per GB
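For a rough monthly estimate, the figures listed above can be combined as in this sketch. The rates are taken from this guide as-is and exposed as parameters, since they should be checked against current AWS pricing before being relied on.

```python
def estimate_monthly_cost(tunnels: int, data_gb: float,
                          free_tunnels: int = 500,
                          per_tunnel_usd: float = 0.01,
                          per_gb_usd: float = 0.10) -> float:
    """Estimate monthly cost in USD using the rates listed in this guide."""
    billable_tunnels = max(0, tunnels - free_tunnels)
    return round(billable_tunnels * per_tunnel_usd + data_gb * per_gb_usd, 2)
```

For example, 1,000 tunnels and 50 GB of transfer would come to 500 × $0.01 + 50 × $0.10 = $10.00 under these rates.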
Cloudflare
- Free plan supports unlimited DNS queries
- Pro plan ($20/month) for advanced features
Summary
✅ Backend Implementation: 100% Complete
- Database schema
- API endpoints
- AWS IoT integration
- Automatic maintenance
- Frontend UI
- DynamoDB tunnel cache table
- Lambda proxy function (infrastructure complete)
- Cache warmer Lambda (syncs PostgreSQL → DynamoDB)
- EventBridge schedules
- IAM permissions
- Backend environment variables
- API Gateway HTTP API with Lambda integration
- CloudFront distribution with custom domain
- Route53 wildcard DNS (A + AAAA records)
- ACM certificate for *.fleet.roboticks.io
- Device SDK: Add reverse tunnel manager (~4 hours)
- Testing: End-to-end testing (~4 hours)
- ✅ HTTP proxy through IoT Secure Tunnel implemented
- ✅ Device connection verification implemented
- ✅ Detailed error handling and troubleshooting implemented
- ✅ Celery Beat scheduled tasks implemented (tunnel cleanup + rotation)
- ✅ Celery Worker + Beat ECS services configured in CDK
- ✅ WebSocket support FULLY implemented (bidirectional real-time communication)
- ✅ API Gateway WebSocket API configured in CDK
- ✅ WebSocket Lambda proxy implemented with connection management
- ⏳ Device SDK implementation needed