
Reverse Tunnels - Implementation Guide

What’s Been Implemented (Backend Complete ✅)

1. Database Layer

  • reverse_tunnels table with all required fields
  • ✅ Migration: 20251114_2139_58c521e3f181_add_reverse_tunnels_for_public_urls.py
  • ReverseTunnel SQLAlchemy model with relationships

2. Service Layer

  • ✅ Extended IoTSecureTunnelService with reverse tunnel methods:
    • create_reverse_tunnel() - Creates AWS IoT Secure Tunnel
    • close_reverse_tunnel() - Closes tunnel
    • rotate_reverse_tunnel() - Rotates tunnel (close old, create new)
    • ensure_reverse_tunnel_active() - Ensures tunnel is active (auto-creates/rotates)
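The decision behind ensure_reverse_tunnel_active() can be sketched as a pure function. This is only an illustration under assumptions, not the service's actual code: the function name, the 45-minute rotation margin, and the idea that the record stores the AWS tunnel status ("OPEN"/"CLOSED") and an expiry timestamp are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical margin: rotate once less than this much lifetime remains.
ROTATE_MARGIN = timedelta(minutes=45)


def decide_tunnel_action(
    status: Optional[str],
    expires_at: Optional[datetime],
    now: datetime,
    margin: timedelta = ROTATE_MARGIN,
) -> str:
    """Return 'create', 'rotate', or 'keep' for one reverse tunnel record."""
    if status is None or expires_at is None:
        return "create"  # no AWS tunnel has been opened for this record yet
    if status != "OPEN" or expires_at <= now:
        return "create"  # tunnel was closed or has already expired
    if expires_at - now <= margin:
        return "rotate"  # still open, but close to expiry
    return "keep"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(decide_tunnel_action("OPEN", now + timedelta(minutes=30), now))
```

Keeping the decision separate from the AWS calls makes it easy to unit test; the real method would then call create_reverse_tunnel() or rotate_reverse_tunnel() based on the result.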

3. API Endpoints

  • GET /reverse-tunnels/devices/{dsn}/reverse-tunnels - List tunnels
  • POST /reverse-tunnels/devices/{dsn}/reverse-tunnels - Create tunnel (creates AWS tunnel immediately)
  • GET /reverse-tunnels/devices/{dsn}/reverse-tunnels/{id} - Get tunnel details
  • PUT /reverse-tunnels/devices/{dsn}/reverse-tunnels/{id} - Update tunnel
  • DELETE /reverse-tunnels/devices/{dsn}/reverse-tunnels/{id} - Delete tunnel (closes AWS tunnel)
  • GET /reverse-tunnels/devices/{dsn}/reverse-tunnels-config - Device config endpoint (ensures tunnels active)
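A small helper can keep the endpoint paths above in one place for clients and tests. This is a sketch under assumptions: the helper name is hypothetical, and the /api/v1 prefix is taken from the URL used in the device SDK example in this guide, so adjust it if the deployment mounts the router elsewhere.

```python
from typing import Optional
from urllib.parse import quote

API_PREFIX = "/api/v1"  # assumed prefix, matching the SDK example in this guide


def reverse_tunnel_path(
    dsn: str, tunnel_id: Optional[int] = None, config: bool = False
) -> str:
    """Build the path for the list, detail, or device-config endpoint."""
    # Percent-encode the DSN so unusual characters cannot alter the path.
    base = f"{API_PREFIX}/reverse-tunnels/devices/{quote(dsn, safe='')}"
    if config:
        return f"{base}/reverse-tunnels-config"
    if tunnel_id is None:
        return f"{base}/reverse-tunnels"
    return f"{base}/reverse-tunnels/{tunnel_id}"


if __name__ == "__main__":
    print(reverse_tunnel_path("abc123"))
    print(reverse_tunnel_path("abc123", tunnel_id=7))
    print(reverse_tunnel_path("abc123", config=True))
```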

4. Automatic Maintenance

  • ✅ Celery task: cleanup_expired_reverse_tunnels (hourly)
  • ✅ Celery task: rotate_expiring_reverse_tunnels (every 30 min)

5. Frontend UI

  • ✅ Reverse Tunnels page at /reverse-tunnels
  • ✅ Device selector dropdown
  • ✅ Create tunnel dialog (name, port, protocol, auth)
  • ✅ Tunnels table with enable/disable, copy URL, delete
  • ✅ Info banner about config applying on startup
  • ✅ Navigation menu item added

6. Infrastructure (AWS CDK)

  • ✅ DynamoDB table for tunnel configuration cache
  • ✅ Lambda proxy function (NodeJS) for routing traffic
  • ✅ Lambda cache warmer (Python) to sync PostgreSQL → DynamoDB
  • ✅ EventBridge rule to run cache warmer every 5 minutes
  • ✅ IAM permissions for backend to manage IoT Secure Tunnels
  • ✅ IAM permissions for Lambda proxy to access tunnels
  • ✅ Backend environment variable for tunnel cache table
  • ✅ API Gateway HTTP API with Lambda integration
  • ✅ ACM certificate for *.fleet.roboticks.io
  • ✅ CloudFront distribution for TLS termination
  • ✅ Route53 wildcard DNS records (A and AAAA)
CDK Stack Components Added (in infrastructure/lib/roboticks-stack.ts):
  • ReverseTunnelCache - DynamoDB table with GSI on tunnel_id
  • ReverseTunnelProxy - Lambda function for traffic proxying
  • TunnelCacheWarmer - Lambda function to sync configs
  • TunnelCacheWarmerRule - EventBridge schedule (5 min intervals)
  • ReverseTunnelApi - API Gateway HTTP API
  • FleetCertificate - ACM wildcard certificate for *.fleet.roboticks.io
  • TunnelDistribution - CloudFront distribution with custom domain
  • FleetWildcardARecord - Route53 A record for *.fleet.roboticks.io
  • FleetWildcardAAAARecord - Route53 AAAA record for IPv6
  • Backend IAM policy for IoT Secure Tunneling operations
  • Comprehensive outputs (API URL, CloudFront domain, certificate ARN)
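To illustrate what the cache warmer does on each run, the following sketch maps one PostgreSQL row to a DynamoDB item. The attribute names (dsn, local_port, expires_at, ttl) and the rule of skipping disabled tunnels are assumptions for illustration; the real TunnelCacheWarmer and table schema may differ.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def to_cache_item(row: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Map one reverse_tunnels row to a cache item, or None to skip it."""
    # Only tunnels that are enabled and have an AWS tunnel are worth caching.
    if not row.get("is_enabled") or not row.get("tunnel_id"):
        return None
    item: Dict[str, Any] = {
        "dsn": row["dsn"],
        "tunnel_id": row["tunnel_id"],  # the GSI in the stack is on this attribute
        "local_port": row["local_port"],
        "source_access_token": row["source_access_token"],
    }
    expires_at = row.get("expires_at")
    if expires_at is not None:
        # DynamoDB TTL attributes are expressed in epoch seconds.
        item["ttl"] = int(expires_at.timestamp())
    return item


if __name__ == "__main__":
    example = {
        "dsn": "abc123",
        "tunnel_id": "t-1",
        "local_port": 8080,
        "source_access_token": "src-token",
        "is_enabled": True,
        "expires_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
    }
    print(to_cache_item(example))
```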

What Needs To Be Implemented

1. Device Agent (Roboticks SDK)

Location: /Users/mujacic/roboticks-sdk (Python SDK for devices)
Required Changes:

A. Add Reverse Tunnel Manager Module

Create roboticks_sdk/reverse_tunnel_manager.py:
"""Reverse tunnel manager for device-side tunnel establishment."""
import subprocess
import logging
from typing import List, Dict, Optional
import requests

logger = logging.getLogger(__name__)

class ReverseTunnelManager:
    """
    Manages reverse tunnels on the device.

    Fetches tunnel configuration from backend and establishes
    AWS IoT Secure Tunnels using local proxy.
    """

    def __init__(self, device_dsn: str, api_url: str, cert_path: str, key_path: str):
        self.device_dsn = device_dsn
        self.api_url = api_url
        self.cert_path = cert_path
        self.key_path = key_path
        self.tunnels: List[Dict] = []
        self.tunnel_processes: Dict[int, subprocess.Popen] = {}

    def fetch_tunnel_config(self) -> List[Dict]:
        """Fetch reverse tunnel configuration from backend."""
        try:
            response = requests.get(
                f"{self.api_url}/api/v1/reverse-tunnels/devices/{self.device_dsn}/reverse-tunnels-config",
                cert=(self.cert_path, self.key_path),
                verify=True,
                timeout=30
            )
            response.raise_for_status()
            config = response.json()
            self.tunnels = config['tunnels']
            logger.info(f"Fetched {len(self.tunnels)} tunnel configurations")
            return self.tunnels
        except Exception as e:
            logger.error(f"Failed to fetch tunnel config: {e}")
            return []

    def establish_tunnels(self):
        """Establish all enabled tunnels."""
        for tunnel_config in self.tunnels:
            if not tunnel_config['is_enabled']:
                logger.info(f"Tunnel {tunnel_config['name']} is disabled, skipping")
                continue

            if not tunnel_config.get('tunnel_id') or not tunnel_config.get('destination_access_token'):
                logger.warning(f"Tunnel {tunnel_config['name']} has no AWS tunnel, skipping")
                continue

            self.start_tunnel_agent(tunnel_config)

    def start_tunnel_agent(self, config: Dict) -> Optional[subprocess.Popen]:
        """
        Start AWS IoT Secure Tunnel local proxy agent.

        Downloads and runs localproxy binary if not present.
        """
        try:
            # Ensure localproxy is installed
            self._ensure_localproxy_installed()

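            command = [
                'localproxy',
                '-r', 'us-east-1',  # AWS region; must match the region the tunnel was opened in
                # Destination mode: forward tunnel traffic to the local service.
                # (-s would start source mode and listen on the port instead.)
                '-d', f"localhost:{config['local_port']}",
                '-t', config['destination_access_token'],
            ]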

            logger.info(f"Starting tunnel: {config['name']} -> localhost:{config['local_port']}")

            process = subprocess.Popen(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                text=True
            )

            self.tunnel_processes[config['id']] = process
            logger.info(f"Tunnel started (PID: {process.pid}) for {config['public_url']}")
            return process

        except Exception as e:
            logger.error(f"Failed to start tunnel {config['name']}: {e}")
            return None

    def _ensure_localproxy_installed(self):
        """Download localproxy if not present."""
        import os
        import platform
        import shutil
        import urllib.request

        # Check if localproxy is already on PATH
        if shutil.which('localproxy'):
            return

        # Download localproxy
        logger.info("Downloading AWS IoT Secure Tunneling localproxy...")

        arch = platform.machine()
        if arch == 'x86_64':
            url = "https://github.com/aws-samples/aws-iot-securetunneling-localproxy/releases/latest/download/localproxy-linux-x86_64"
        elif arch == 'aarch64':
            url = "https://github.com/aws-samples/aws-iot-securetunneling-localproxy/releases/latest/download/localproxy-linux-arm64"
        else:
            raise RuntimeError(f"Unsupported architecture: {arch}")

        # Writing to /usr/local/bin typically requires root privileges
        urllib.request.urlretrieve(url, '/usr/local/bin/localproxy')
        os.chmod('/usr/local/bin/localproxy', 0o755)
        logger.info("localproxy installed to /usr/local/bin/localproxy")

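    def stop_all_tunnels(self):
        """Stop all running tunnel processes."""
        for tunnel_id, process in self.tunnel_processes.items():
            try:
                process.terminate()
                process.wait(timeout=5)
                logger.info(f"Stopped tunnel {tunnel_id}")
            except subprocess.TimeoutExpired:
                # Did not exit after SIGTERM; force-kill so no proxy is left running
                process.kill()
                process.wait()
                logger.warning(f"Killed unresponsive tunnel {tunnel_id}")
            except Exception as e:
                logger.error(f"Failed to stop tunnel {tunnel_id}: {e}")
        self.tunnel_processes.clear()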
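On the device, localproxy runs in destination mode (-d host:port), and the access token can be kept out of the process list by passing it through the environment instead of -t. The sketch below assumes the AWSIOT_TUNNEL_ACCESS_TOKEN variable described in the localproxy README; the helper name is hypothetical, and the behavior should be verified against the localproxy version you ship.

```python
import os
import subprocess
from typing import Dict, List, Tuple


def build_localproxy_invocation(
    config: Dict, region: str, binary: str = "localproxy"
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for running localproxy in destination mode."""
    # -d forwards traffic arriving through the tunnel to the local service.
    argv = [binary, "-r", region, "-d", f"localhost:{config['local_port']}"]
    # Copy the current environment and add the token, so it never appears
    # in argv (and therefore not in `ps` output).
    env = dict(os.environ)
    env["AWSIOT_TUNNEL_ACCESS_TOKEN"] = config["destination_access_token"]
    return argv, env


def start_localproxy(config: Dict, region: str) -> subprocess.Popen:
    """Start the proxy; output is discarded so unread pipes cannot fill up."""
    argv, env = build_localproxy_invocation(config, region)
    return subprocess.Popen(
        argv, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
```

Discarding stdout/stderr (or redirecting them to a log file) avoids a subtle problem with subprocess.PIPE: if nothing reads the pipes, a long-running child can block once the pipe buffer is full.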

B. Integrate into Device Startup

Modify roboticks_sdk/device.py (main device class):
from roboticks_sdk.reverse_tunnel_manager import ReverseTunnelManager

class RoboticksDevice:
    def __init__(self, ...):
        # ... existing init
        self.reverse_tunnel_manager = None

    def start(self):
        """Start device services."""
        # ... existing startup code

        # Initialize reverse tunnels
        self._setup_reverse_tunnels()

    def _setup_reverse_tunnels(self):
        """Setup reverse tunnels on device startup."""
        try:
            self.reverse_tunnel_manager = ReverseTunnelManager(
                device_dsn=self.dsn,
                api_url=self.config.api_url,
                cert_path=self.config.cert_path,
                key_path=self.config.key_path
            )

            # Fetch configuration
            tunnels = self.reverse_tunnel_manager.fetch_tunnel_config()

            # Establish tunnels
            if tunnels:
                self.reverse_tunnel_manager.establish_tunnels()
                logger.info(f"Reverse tunnels established: {len(tunnels)}")

        except Exception as e:
            logger.error(f"Failed to setup reverse tunnels: {e}")

    def stop(self):
        """Stop device services."""
        # Stop reverse tunnels
        if self.reverse_tunnel_manager:
            self.reverse_tunnel_manager.stop_all_tunnels()

        # ... existing shutdown code

C. Add Dependencies

Add to roboticks-sdk/requirements.txt:
requests>=2.31.0

2. DNS Configuration

Status: ✅ Route53 DNS Configured Automatically
The CDK stack automatically creates:
  • ✅ Wildcard A record: *.fleet.roboticks.io → CloudFront
  • ✅ Wildcard AAAA record (IPv6): *.fleet.roboticks.io → CloudFront
  • ✅ ACM certificate for TLS with DNS validation
No manual DNS configuration is required: Route53 handles all DNS for *.fleet.roboticks.io automatically.
Optional - Cloudflare (only if the domain's DNS is managed in Cloudflare): add a record that points at CloudFront. With the DNS-only proxy status recommended below, Cloudflare adds no DDoS protection of its own; CloudFront provides it.
  1. Log into Cloudflare Dashboard
  2. Add DNS Record:
    • Type: CNAME
    • Name: *.fleet
    • Target: CloudFront distribution domain (from CDK output)
    • Proxy status: DNS only (grey cloud) - CloudFront already provides DDoS protection
    • TTL: Auto
  3. SSL/TLS Settings:
    • Mode: Full (strict)
    • Always Use HTTPS: On
    • Minimum TLS Version: 1.2
Note: Cloudflare is optional since CloudFront already provides TLS termination and DDoS protection.

3. Proxy Service (Traffic Forwarder)

Status: ✅ Complete infrastructure, ⏳ WebSocket/HTTP proxy logic pending
What’s Complete:
  • ✅ Lambda function created (ReverseTunnelProxy in CDK stack)
  • ✅ DynamoDB cache for fast tunnel lookups
  • ✅ Cache warmer syncing PostgreSQL → DynamoDB every 5 minutes
  • ✅ IAM permissions for IoT Secure Tunneling access
  • ✅ API Gateway HTTP API with Lambda integration
  • ✅ CloudFront distribution for TLS termination
  • ✅ Route53 wildcard DNS (*.fleet.roboticks.io)
  • ✅ ACM certificate for HTTPS
  • ✅ Tunnel verification and status checking
What’s Pending:
  • ⏳ WebSocket-to-WebSocket bidirectional proxy in Lambda
  • ⏳ HTTP-to-IoT-Tunnel proxy in Lambda
  • ⏳ AWS IoT Secure Tunneling Local Proxy protocol implementation
Current Implementation (in infrastructure/lambda/reverse-tunnel-proxy/index.js):
  • Extracts DSN from subdomain (device-{dsn}.fleet.roboticks.io)
  • Looks up tunnel configuration in DynamoDB cache
  • Verifies tunnel is open in AWS IoT
  • Returns success response (proxy logic marked as TODO)
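The DSN extraction step can be sketched in Python as follows. This is an illustration rather than a port of index.js: the function name matches the extract_dsn placeholder used in the Option B example, and the allowed DSN characters (lowercase letters, digits, hyphens) are an assumption.

```python
import re
from typing import Optional

FLEET_DOMAIN = "fleet.roboticks.io"

# Assumed DSN alphabet: letters, digits and hyphens (compared lowercased).
_HOST_RE = re.compile(r"device-(?P<dsn>[a-z0-9-]+)\." + re.escape(FLEET_DOMAIN))


def extract_dsn(host: str) -> Optional[str]:
    """Return the DSN from a Host header, or None if the host is not a device URL."""
    # Drop an optional port, normalise case, and remove a trailing root dot.
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    match = _HOST_RE.fullmatch(hostname)
    return match.group("dsn") if match else None


if __name__ == "__main__":
    print(extract_dsn("device-abc123.fleet.roboticks.io"))
```

Matching the full hostname, rather than splitting on the first dot and stripping "device-", rejects requests for other domains and hosts without the device- prefix before any cache lookup happens.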
Next Steps for Full Proxy:

Option A: AWS Lambda + API Gateway (Serverless) ✅ Partially Implemented

Create Lambda function to proxy requests:
import boto3
import json
import base64

iotsecuretunneling = boto3.client('iotsecuretunneling')

def lambda_handler(event, context):
    """
    Proxy WebSocket connections to AWS IoT Secure Tunnels.

    Route: device-{dsn}.fleet.roboticks.io → AWS IoT Tunnel
    """
    # Extract device DSN from hostname
    host = event.get('headers', {}).get('host', '')
    dsn = host.split('.')[0].replace('device-', '')

    # Look up tunnel by device DSN
    # (requires DynamoDB cache or direct DB query; get_tunnel_by_dsn is a placeholder)
    tunnel = get_tunnel_by_dsn(dsn)

    # .get() avoids a KeyError when the record has no source token yet
    if not tunnel or not tunnel.get('source_access_token'):
        return {
            'statusCode': 404,
            'body': json.dumps({'error': 'Tunnel not found or not active'})
        }

    # Establish WebSocket connection to AWS IoT Secure Tunnel
    # using source_access_token
    # Forward request through tunnel

    # ... proxy logic ...

    return {
        'statusCode': 200,
        'body': 'Connected'
    }

Option B: EC2/ECS Service (Persistent)

Deploy persistent proxy service:
# proxy_service.py
from fastapi import FastAPI, WebSocket
import asyncio
import boto3

app = FastAPI()

@app.websocket("/")
async def websocket_proxy(websocket: WebSocket):
    """Proxy WebSocket to AWS IoT Secure Tunnel."""
    await websocket.accept()

    # Extract DSN from host header
    host = websocket.headers.get('host')
    dsn = extract_dsn(host)

    # Get tunnel config
    tunnel = await get_tunnel_config(dsn)

    # Connect to AWS IoT Secure Tunnel
    tunnel_ws = await connect_to_iot_tunnel(
        tunnel['tunnel_id'],
        tunnel['source_access_token']
    )

    # Bi-directional proxy
    await asyncio.gather(
        forward_client_to_tunnel(websocket, tunnel_ws),
        forward_tunnel_to_client(tunnel_ws, websocket)
    )
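The two forwarding coroutines are left undefined above. One possible shape for them is a generic "pump" that stops both directions as soon as either side closes, rather than waiting for both as a plain asyncio.gather would. This sketch is transport-agnostic and rests on assumptions: recv/send are hypothetical callables you would adapt from the client WebSocket and the tunnel connection, and it does not implement the AWS IoT Secure Tunneling message framing, which is still listed as pending.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Recv = Callable[[], Awaitable[Optional[bytes]]]  # returns None when the peer closes
Send = Callable[[bytes], Awaitable[None]]


async def pump(recv: Recv, send: Send) -> None:
    """Copy messages from recv to send until the source closes."""
    while True:
        data = await recv()
        if data is None:
            return
        await send(data)


async def proxy_both_ways(
    client_recv: Recv, client_send: Send, tunnel_recv: Recv, tunnel_send: Send
) -> None:
    """Forward in both directions; stop as soon as either direction ends."""
    tasks = [
        asyncio.create_task(pump(client_recv, tunnel_send)),
        asyncio.create_task(pump(tunnel_recv, client_send)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the other direction is no longer useful
    await asyncio.gather(*pending, return_exceptions=True)
    for task in done:
        task.result()  # re-raise any forwarding error
```

Cancelling the remaining direction matters for a long-lived proxy: otherwise a closed browser connection would leave the tunnel-side reader waiting indefinitely.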
Deploy:
docker build -t reverse-tunnel-proxy .
docker push ECR_URL/reverse-tunnel-proxy:latest

# Deploy to ECS or K8s

4. Celery Beat Configuration

File: backend/celeryconfig.py or backend/app/core/celery.py
Add to beat schedule:
from celery.schedules import crontab

beat_schedule = {
    'cleanup-expired-reverse-tunnels': {
        'task': 'cleanup_expired_reverse_tunnels',
        'schedule': crontab(minute=0),  # Every hour at :00
    },
    'rotate-expiring-reverse-tunnels': {
        'task': 'rotate_expiring_reverse_tunnels',
        'schedule': crontab(minute='*/30'),  # Every 30 minutes
    },
}
Start Celery Beat:
celery -A app.celery_app beat --loglevel=info

Testing Checklist

Backend Testing

  • Run migration: alembic upgrade head
  • Create reverse tunnel via API
  • Verify tunnel created in AWS IoT console
  • Update tunnel (disable/enable)
  • Delete tunnel (verify closed in AWS)
  • Device config endpoint returns tunnel config
  • Celery tasks run without errors

Device Testing

  • Device fetches tunnel config on startup
  • localproxy binary downloads/runs correctly
  • Tunnel establishes connection to AWS
  • Local service (e.g., web server on port 8080) accessible

End-to-End Testing

  • Access device via public URL: https://device-abc123.fleet.roboticks.io
  • Traffic routes correctly: Browser → CloudFront → API Gateway → Lambda proxy → IoT Tunnel → Device
  • Tunnel auto-rotates before expiration
  • Expired tunnels cleaned up by Celery task

Deployment Steps

  1. Deploy Backend:
    cd backend
    alembic upgrade head
    systemctl restart roboticks-api
    systemctl restart celery-beat
    
  2. Deploy Frontend:
    cd frontend
    npm run build
    aws s3 sync dist/ s3://roboticks-frontend
    aws cloudfront create-invalidation --distribution-id XXX --paths "/*"
    
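  3. Configure DNS: No action is needed with Route53 (the CDK stack creates the wildcard records). Only if DNS is managed in Cloudflare, add a wildcard CNAME for *.fleet.roboticks.io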
  4. Deploy Proxy Service: Deploy Lambda or EC2 proxy
  5. Update Device SDK: Push new version with tunnel manager
  6. Update Devices: Devices will fetch new SDK on next update

Monitoring

CloudWatch Metrics

  • Tunnel creation success/failure rate
  • Tunnel rotation count
  • Active tunnels count
  • Request count per tunnel

Logs

  • Backend: /var/log/roboticks-api.log
  • Celery: /var/log/celery-beat.log
  • Device: Check device logs for tunnel establishment

Alerts

  • Alert if tunnel creation fails > 5 times/hour
  • Alert if Celery tasks fail
  • Alert if > 10% of devices fail to establish tunnels
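The three alert rules above can be expressed as a small check over hourly counters. This is a sketch only: the HourlyStats fields and alert names are hypothetical, and in practice these rules would more likely be configured as CloudWatch alarms than evaluated in application code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HourlyStats:
    tunnel_creation_failures: int
    celery_task_failures: int
    devices_total: int
    devices_failed_to_connect: int


def evaluate_alerts(stats: HourlyStats) -> List[str]:
    """Return the names of the alert rules that fire for this hour."""
    alerts: List[str] = []
    if stats.tunnel_creation_failures > 5:
        alerts.append("tunnel_creation_failures")
    if stats.celery_task_failures > 0:
        alerts.append("celery_task_failures")
    # "> 10% of devices" written with integers to avoid float rounding.
    if stats.devices_failed_to_connect * 10 > stats.devices_total:
        alerts.append("device_tunnel_failures")
    return alerts


if __name__ == "__main__":
    print(evaluate_alerts(HourlyStats(6, 1, 100, 11)))
```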

Cost Estimation

AWS IoT Secure Tunneling

  • First 500 tunnels/month: Free
  • Additional tunnels: $0.01 per tunnel per month
  • Data transfer: $0.10 per GB
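Example: 100 devices, 1 tunnel each, 10 GB/month in total. At the rates listed above, the 100 tunnels fall within the free tier, so the cost is 10 GB × $0.10 = ~$1/month. Check these rates against the current AWS IoT Device Management pricing page before budgeting.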
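Working the listed rates through for a given fleet size can be done with a short calculation. The constants below are simply the figures from the list above (not verified AWS prices), and the function name is hypothetical; data volume is treated as the fleet-wide total.

```python
FREE_TUNNELS_PER_MONTH = 500   # from the list above
PRICE_PER_EXTRA_TUNNEL = 0.01  # USD per tunnel beyond the free allowance
PRICE_PER_GB = 0.10            # USD per GB transferred


def monthly_cost(tunnels: int, data_gb: float) -> float:
    """Estimated monthly cost in USD at the rates listed in this section."""
    billable_tunnels = max(0, tunnels - FREE_TUNNELS_PER_MONTH)
    total = billable_tunnels * PRICE_PER_EXTRA_TUNNEL + data_gb * PRICE_PER_GB
    return round(total, 2)


if __name__ == "__main__":
    # 100 tunnels are inside the free allowance, so only data transfer is billed.
    print(monthly_cost(tunnels=100, data_gb=10))
```

Note that rotated tunnels may count as newly opened tunnels, so the effective tunnel count per month can be higher than the device count.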

Cloudflare

  • Free plan supports unlimited DNS queries
  • Pro plan ($20/month) for advanced features

Summary

✅ Backend Implementation: 100% Complete
  • Database schema
  • API endpoints
  • AWS IoT integration
  • Automatic maintenance
  • Frontend UI
✅ Infrastructure (AWS CDK): 100% Complete
  • DynamoDB tunnel cache table
  • Lambda proxy function (infrastructure complete)
  • Cache warmer Lambda (syncs PostgreSQL → DynamoDB)
  • EventBridge schedules
  • IAM permissions
  • Backend environment variables
  • API Gateway HTTP API with Lambda integration
  • CloudFront distribution with custom domain
  • Route53 wildcard DNS (A + AAAA records)
  • ACM certificate for *.fleet.roboticks.io
⏳ Remaining Work:
  1. Device SDK: Add reverse tunnel manager (~4 hours)
  2. Testing: End-to-end testing (~4 hours)
Total Estimated Time: 1 day for full production deployment
Current Status:
  • ✅ HTTP proxy through IoT Secure Tunnel implemented
  • ✅ Device connection verification implemented
  • ✅ Detailed error handling and troubleshooting implemented
  • ✅ Celery Beat scheduled tasks implemented (tunnel cleanup + rotation)
  • ✅ Celery Worker + Beat ECS services configured in CDK
  • ✅ WebSocket support fully implemented (bidirectional real-time communication)
  • ✅ API Gateway WebSocket API configured in CDK
  • ✅ WebSocket Lambda proxy implemented with connection management
  • ⏳ Device SDK implementation needed