
Reverse Tunnel Feature Finalization

Summary

The reverse tunnel feature is finalized with full lifecycle management: expiration tracking, automatic expiration, cascade cleanup, and device reconnection.

Features Implemented

1. ✅ Tunnel Expiration Tracking

Backend:
  • expires_at field already exists in reverse_tunnels table
  • Automatically set to 12 hours after creation, when the AWS IoT Secure Tunnel is created
  • Location: backend/app/services/reverse_tunnel_service.py:724
Frontend:
  • Added expires_at field to TypeScript interface
  • Added tunnel_status field for status tracking
  • Location: frontend/src/pages/ReverseTunnels.tsx

2. ✅ Background Expiration Checker

Implementation:
  • Created backend/app/services/reverse_tunnel_expiration.py
  • Runs every 5 minutes to check for expired tunnels, so cleanup can lag expires_at by up to about 5 minutes
  • Queries tunnels where expires_at < now and tunnel_status = 'active'
  • Calls stop_reverse_tunnel() with reason="expired"
Integration:
  • Integrated into FastAPI startup via lifespan manager
  • Location: backend/app/main.py:95-128
  • Automatically starts when backend launches
  • Gracefully cancels on shutdown
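
The checker loop and its startup/shutdown wiring can be sketched roughly as follows. This is a minimal illustration, not the code in reverse_tunnel_expiration.py: the in-memory Tunnel record, find_expired, and expire_tunnel are hypothetical stand-ins for the database query and stop_reverse_tunnel(), and the loop takes the tunnel list as an extra argument so the sketch runs on its own. The lifespan function has the same shape as a FastAPI lifespan handler but uses only contextlib and asyncio.

```python
import asyncio
import contextlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Tunnel:
    """Hypothetical in-memory stand-in for a reverse_tunnels row."""
    id: int
    expires_at: datetime
    tunnel_status: str = "active"


def find_expired(tunnels, now):
    """Same predicate as the checker: expires_at < now AND tunnel_status = 'active'."""
    return [t for t in tunnels if t.expires_at < now and t.tunnel_status == "active"]


def expire_tunnel(tunnel):
    """Placeholder for stop_reverse_tunnel(..., reason="expired")."""
    tunnel.tunnel_status = "stopped"


async def run_expiration_checker_loop(tunnels, interval_seconds=300):
    """Check for expired tunnels, sleep, and repeat until cancelled."""
    while True:
        now = datetime.now(timezone.utc)
        for tunnel in find_expired(tunnels, now):
            try:
                expire_tunnel(tunnel)
            except Exception as exc:  # one failed cleanup must not stop the loop
                print(f"failed to expire tunnel {tunnel.id}: {exc}")
        await asyncio.sleep(interval_seconds)


@contextlib.asynccontextmanager
async def lifespan(tunnels, interval_seconds=300):
    """Start the checker on startup and cancel it on shutdown."""
    task = asyncio.create_task(run_expiration_checker_loop(tunnels, interval_seconds))
    try:
        yield task
    finally:
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
```

Catching exceptions per tunnel is a design assumption here: it keeps one failing cleanup from ending the background task for every other tunnel.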

3. ✅ Cascade Cleanup on Expiration

Flow: When a tunnel expires, the system automatically:
  1. Deregisters ECS task IP from ALB target group
  2. Deletes ALB listener rule
  3. Stops ECS Fargate task
  4. Closes AWS IoT Secure Tunnel
  5. Sends MQTT control message to device:
    • Topic: roboticks/fleet/{device_id}/reverse-tunnel/control
    • Payload: action: "reverse_tunnel_stopped", reason: "expired"
  6. Marks tunnel as disabled and stopped in database
Implementation:
  • Existing stop_reverse_tunnel() handles full cleanup
  • Location: backend/app/services/reverse_tunnel_service.py:177-249
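
As a rough sketch of the ordering above, the cleanup can be modelled as a fixed sequence of named steps. The step names and the handlers mapping are hypothetical, and the best-effort behaviour (record a failure, keep going) is an assumption for illustration; the source does not state how stop_reverse_tunnel() handles a failing step.

```python
# Step names are illustrative labels for the six steps listed above.
STEP_ORDER = (
    "deregister_alb_target",
    "delete_alb_listener_rule",
    "stop_ecs_task",
    "close_iot_tunnel",
    "publish_mqtt_stop",
    "mark_stopped_in_db",
)


def run_cascade_cleanup(handlers):
    """Run every cleanup step in order and return the names of steps that failed.

    `handlers` maps a step name to a zero-argument callable (hypothetical).
    Assumption: a failing step is recorded and later steps still run, so one
    AWS error does not leave the remaining resources orphaned.
    """
    failed = []
    for name in STEP_ORDER:
        try:
            handlers[name]()
        except Exception as exc:
            print(f"cleanup step '{name}' failed: {exc}")
            failed.append(name)
    return failed
```

A caller could log the returned list and retry only the failed steps on the next checker run.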

4. ✅ Device Reconnection on Heartbeat

Lambda Function:
  • Updated infrastructure/lambda/device-heartbeat-handler/index.py
  • Detects offline → online transitions
  • Queries for active reverse tunnels
  • Sends MQTT reconnection messages
How it works:
  1. Device sends heartbeat after being offline
  2. Lambda checks previous status: was_offline = (previous_status != 'ONLINE')
  3. If offline → online transition detected:
    • Query: SELECT * FROM reverse_tunnels WHERE device_id = ? AND is_enabled = true AND tunnel_status = 'active'
    • For each active tunnel, publish MQTT message with reconnect: true flag
    • Device reconnects to existing ECS proxy task and IoT tunnel
  4. ECS tasks and IoT tunnels remain running during device offline periods
MQTT Message Format:
{
  "action": "reverse_tunnel_created",
  "tunnel_type": "reverse_tunnel",
  "tunnel_id": "...",
  "aws_tunnel_id": "...",
  "destination_access_token": "...",
  "region": "us-west-2",
  "service": "HTTP-8080",
  "local_port": 8080,
  "protocol": "http",
  "public_url": "device-xxx.fleet.roboticks.io",
  "is_enabled": true,
  "reconnect": true,
  "timestamp": "2025-12-05T12:00:00Z"
}
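
The transition check and message construction described above might look roughly like this. It is a sketch under assumptions, not the Lambda's actual code: tunnels are plain dicts keyed by the payload field names, publish is a hypothetical callable standing in for the IoT Data publish call, and the topic is assumed to be the same control topic used for the stop message.

```python
import json
from datetime import datetime, timezone


def was_offline(previous_status):
    """Any previous status other than 'ONLINE' counts as offline."""
    return previous_status != "ONLINE"


def build_reconnect_message(tunnel, now=None):
    """Build the control payload for one active tunnel, with reconnect set to True."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "action": "reverse_tunnel_created",
        "tunnel_type": "reverse_tunnel",
        "tunnel_id": tunnel["tunnel_id"],
        "aws_tunnel_id": tunnel["aws_tunnel_id"],
        "destination_access_token": tunnel["destination_access_token"],
        "region": tunnel["region"],
        "service": tunnel["service"],
        "local_port": tunnel["local_port"],
        "protocol": tunnel["protocol"],
        "public_url": tunnel["public_url"],
        "is_enabled": True,
        "reconnect": True,
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def handle_heartbeat(device_id, previous_status, active_tunnels, publish):
    """Publish one reconnect message per active tunnel on an offline -> online transition.

    `publish(topic, payload)` is a hypothetical callable. Returns the number
    of messages published (0 when the device was already online).
    """
    if not was_offline(previous_status):
        return 0
    # Assumed topic: the same control topic the stop message uses.
    topic = f"roboticks/fleet/{device_id}/reverse-tunnel/control"
    for tunnel in active_tunnels:
        publish(topic, json.dumps(build_reconnect_message(tunnel)))
    return len(active_tunnels)
```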

5. ✅ UI Enhancements

Expiration Display:
  • Shows time remaining in status column
  • Color coding:
    • Red: less than 1 hour remaining, or expired
    • Orange: between 1 and 2 hours remaining
    • Gray: more than 2 hours remaining
  • Format: “Xh Ym left” or “Expired”
  • Location: frontend/src/pages/ReverseTunnels.tsx:247-267
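
The label and color rules above can be expressed as one small function. The real implementation lives in the TypeScript page component; this Python sketch only illustrates the logic, and the function name and the treatment of exactly 1 or 2 hours remaining are assumptions.

```python
from datetime import datetime, timedelta, timezone


def expiration_label(expires_at, now):
    """Return (text, color) for the status column.

    Assumption: exactly 1 hour remaining is orange and exactly 2 hours is gray.
    """
    remaining = expires_at - now
    if remaining <= timedelta(0):
        return "Expired", "red"
    total_minutes = int(remaining.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    text = f"{hours}h {minutes}m left"
    if remaining < timedelta(hours=1):
        return text, "red"
    if remaining < timedelta(hours=2):
        return text, "orange"
    return text, "gray"


now = datetime(2025, 12, 5, 12, 0, tzinfo=timezone.utc)
print(expiration_label(now + timedelta(minutes=90), now))  # ('1h 30m left', 'orange')
```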
Tunnel Status:
  • Shows tunnel status chip: pending, starting, active, stopping, stopped, error
  • Color coding:
    • Green: active
    • Orange: starting, stopping
    • Red: error
    • Gray: stopped, pending
Creation Warning:
  • Alert box at top of create dialog
  • Warning icon
  • Clear message about 12-hour expiration
  • Location: frontend/src/pages/ReverseTunnels.tsx:552-560

6. ✅ API Documentation

Endpoint Documentation:
  • Updated POST /devices/{dsn}/reverse-tunnels docstring
  • Added prominent warning about 12-hour expiration
  • Location: backend/app/api/v1/reverse_tunnels.py:112-114
Response Schema:
  • Added tunnel_status field to ReverseTunnelResponse
  • Added comment on expires_at: “Auto-expiration time (12 hours from creation)”
  • Location: backend/app/schemas/reverse_tunnel.py:72,81

Architecture Diagrams

Expiration Flow

User Creates Tunnel → expires_at set to now + 12 hours

         Background Task (every 5 min)

         Check: expires_at < now AND tunnel_status = 'active'

         stop_reverse_tunnel(reason="expired")

         Cascade Cleanup:
           1. Deregister ALB target
           2. Delete ALB listener rule
           3. Stop ECS Fargate task
           4. Close AWS IoT Secure Tunnel
           5. Publish MQTT stop message

         Update DB: tunnel_status = "stopped", is_enabled = false

Device Reconnection Flow

Device Offline → Network Restored → Device Sends Heartbeat

     Lambda: SELECT status FROM fleet_devices

     Check: was_offline = (prev_status != 'ONLINE')

     Query active reverse tunnels for device

     FOR EACH tunnel:
       Publish MQTT with reconnect=true

     Device receives MQTT control message

     Device reconnects localproxy to:
       - Existing ECS proxy task (still running)
       - Existing AWS IoT tunnel (still open)

     Traffic resumes flowing through tunnel

Testing Checklist

Backend

  • Verify background task starts on backend launch
  • Verify tunnels expire after 12 hours
  • Verify cascade cleanup runs (check CloudWatch logs)
  • Verify MQTT stop message sent on expiration
  • Verify device reconnection MQTT sent on heartbeat after offline

Frontend

  • Verify expiration time displays correctly
  • Verify color changes as expiration approaches
  • Verify “Expired” shows for expired tunnels
  • Verify tunnel status chip shows correct status
  • Verify warning alert shows in create dialog

End-to-End

  • Create tunnel → verify expires_at is set
  • Wait 12+ hours → verify tunnel auto-expires
  • Verify ECS task stopped
  • Verify ALB rule deleted
  • Verify IoT tunnel closed
  • Device offline → online → verify reconnection MQTT sent
  • Verify device reconnects and tunnel works

Configuration

Environment Variables

No new environment variables are needed. The feature uses the existing settings:
  • AWS_REGION - For AWS API calls
  • Database connection settings
  • IoT endpoint settings

Background Task Interval

Default: 300 seconds (5 minutes).
To change, edit backend/app/main.py:111:
expiration_task = asyncio.create_task(run_expiration_checker_loop(interval_seconds=300))

Tunnel Expiration Time

Default: 720 minutes (12 hours).
To change, edit backend/app/services/reverse_tunnel_service.py:114:
tunnel_info = self._create_iot_tunnel(
    reverse_tunnel=reverse_tunnel,
    device=device,
    timeout_minutes=720,  # Change this value
)

Deployment Notes

Backend Deployment

  1. Deploy backend with updated code
  2. Background task automatically starts
  3. Existing tunnels will be checked for expiration
  4. No database migration needed (expires_at already exists)

Lambda Deployment

  1. Update device-heartbeat-handler Lambda
  2. Confirm the Lambda execution role allows IoT Data publish (iot:Publish); this permission should already exist
  3. No environment variables needed

Frontend Deployment

  1. Build frontend with updated code
  2. Users will immediately see expiration times
  3. Create dialog will show warning

Monitoring

CloudWatch Logs

Backend logs to watch:
"Starting background tasks..."
"Found X expired reverse tunnels to clean up"
"Successfully expired reverse tunnel {id}"
"Stopped ECS task {arn}"
"Closed AWS IoT Secure Tunnel {tunnel_id}"
Lambda logs to watch:
"Device {dsn} came online from offline state"
"Found X active reverse tunnels for device {dsn}"
"Publishing reconnection message for tunnel {id}"

Metrics to Track

  • Number of tunnels expired per day
  • Average tunnel lifetime
  • Device reconnection success rate
  • ECS task cleanup success rate
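
Two of these metrics could be derived from stop records along the following lines. The record fields (created_at, stopped_at, reason) are hypothetical; the source does not say how or where such events are stored, so treat this as a sketch of the arithmetic only.

```python
from collections import Counter
from datetime import datetime, timedelta


def tunnels_expired_per_day(stop_events):
    """Count stops with reason 'expired', grouped by the calendar day of stopped_at."""
    return Counter(
        event["stopped_at"].date()
        for event in stop_events
        if event["reason"] == "expired"
    )


def average_lifetime(stop_events):
    """Mean of stopped_at - created_at, or None when there are no events."""
    if not stop_events:
        return None
    total = sum(
        (event["stopped_at"] - event["created_at"] for event in stop_events),
        timedelta(0),
    )
    return total / len(stop_events)
```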

Future Enhancements

  1. Configurable Expiration Time
    • Allow users to set expiration time (1-24 hours)
    • Enforce organization tier limits
  2. Expiration Notifications
    • Email warning 1 hour before expiration
    • Dashboard notification for expiring tunnels
  3. Auto-Renewal
    • Optional auto-renewal for active tunnels
    • Configurable maximum renewal count
  4. Usage Tracking
    • Track bandwidth usage per tunnel
    • Generate usage reports
  5. Connection Health
    • Monitor ECS task health
    • Auto-restart failed tunnels
    • Alert on connection issues

Files Modified

Backend

  • backend/app/main.py - Added lifespan manager
  • backend/app/services/reverse_tunnel_expiration.py - New file
  • backend/app/schemas/reverse_tunnel.py - Added tunnel_status
  • backend/app/api/v1/reverse_tunnels.py - Updated docs

Lambda

  • infrastructure/lambda/device-heartbeat-handler/index.py - Added reconnection logic

Frontend

  • frontend/src/pages/ReverseTunnels.tsx - Added expiration display and warning

Device SDK

  • roboticks-sdk/docker/roboticks/Dockerfile - Fixed nginx gzip issue

Rollback Plan

If issues occur:
  1. Disable Background Task:
    • Comment out lines 111-113 in backend/app/main.py
    • Restart backend
  2. Disable Reconnection:
    • Comment out lines 311-325 in Lambda heartbeat handler
    • Redeploy Lambda
  3. Revert Frontend:
    • Deploy previous frontend version
    • Users won't see expiration times, but backend expiration and cleanup continue to run

Support

For issues or questions:
  • Check CloudWatch logs for backend/Lambda errors
  • Verify ECS tasks are running
  • Check AWS IoT tunnel status in console
  • Review device logs for reconnection attempts