Lambda Bundling Solution for Stats Tracking

Problem

The backend app layer exceeded Lambda’s 250MB unzipped size limit when bundling all dependencies from requirements.txt.

Solution

Bundle backend code directly into each Lambda function with minimal dependencies, avoiding the need for a shared layer.

Key Optimizations

Minimal Dependencies: Each Lambda installs only sqlalchemy==2.0.23
- psycopg2-binary comes from the existing psycopg2Layer
- boto3 is provided by the Lambda runtime (no need to bundle)
- Removed FastAPI, uvicorn, redis, influxdb, and other backend-specific deps

Selective Code Copying: Only the necessary backend modules are copied
- app/models - Database models
- app/services - Business logic (StatsUpdaterService)
- app/db - Database connection
- app/core - Core utilities
- Excluded: API routes, middleware, schemas, CLI tools

CDK Bundling: Use Docker-based bundling at deploy time

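// Assumes `lambda` is the CDK aws-lambda module (aws-cdk-lib/aws-lambda in CDK v2)
code: lambda.Code.fromAsset('lambda/hourly-stats-updater', {
  bundling: {
    // Build inside the Python 3.11 bundling image so packages match the Lambda runtime
    image: lambda.Runtime.PYTHON_3_11.bundlingImage,
    command: ['bash', '-c', [
      // Install the pinned dependencies into the bundle output directory
      'pip install -r requirements.txt -t /asset-output',
      'cp index.py /asset-output/',
      'mkdir -p /asset-output/app',
      // Copy only the backend packages the Lambda needs.
      // Note: by default only the asset directory is mounted (at /asset-input), so these
      // relative paths assume the backend directory is also available inside the container.
      'cp -r ../../backend/app/models /asset-output/app/',
      'cp -r ../../backend/app/services /asset-output/app/',
      'cp -r ../../backend/app/db /asset-output/app/',
      'cp -r ../../backend/app/core /asset-output/app/',
      // Mark the copied directories as Python packages
      'touch /asset-output/app/__init__.py',
      // ... more __init__.py files
    ].join(' && ')],
  },
})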
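
To check the staging logic without running Docker, the copy steps above can be approximated locally. The following is a minimal Python sketch and not part of the CDK stack: the helper name stage_backend and its parameters are hypothetical, and the module list simply mirrors the cp commands in the bundling command.

```python
import shutil
from pathlib import Path

# Backend packages copied by the bundling command (taken from the cp steps above).
BACKEND_MODULES = ("models", "services", "db", "core")


def stage_backend(backend_app: Path, output_dir: Path, modules=BACKEND_MODULES) -> list:
    """Copy selected backend packages into output_dir/app and return the staged paths.

    Hypothetical helper: it mirrors the `cp -r` and `touch` steps, nothing more.
    """
    app_dir = output_dir / "app"
    app_dir.mkdir(parents=True, exist_ok=True)
    # Equivalent of `touch /asset-output/app/__init__.py`.
    (app_dir / "__init__.py").touch()

    staged = []
    for name in modules:
        src = backend_app / name
        if not src.is_dir():
            raise FileNotFoundError(f"backend module not found: {src}")
        dest = app_dir / name
        shutil.copytree(src, dest, dirs_exist_ok=True)
        # Make sure each copied directory is importable as a package.
        (dest / "__init__.py").touch()
        staged.append(dest)
    return staged
```

Calling stage_backend(Path("backend/app"), Path("build")) would leave build/app containing only the four listed packages, which makes it easy to inspect what the Lambda will actually receive.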

Lambda Functions Deployed

1. MQTT Counter Lambda

Trigger: IoT Rule on roboticks/# topics
Function: Increment MQTT message counters in real-time
Size: ~5MB (standalone, no backend code needed)

2. Hourly Stats Updater Lambda

Trigger: EventBridge every hour
Function: Query AWS APIs (ECR, S3) and refresh project_stats
Size: ~15MB (backend code + SQLAlchemy)
Timeout: 5 minutes
Memory: 512 MB

3. Daily Snapshot Lambda

Trigger: EventBridge daily at 00:00 UTC
Function: Create daily_stats snapshot for yesterday
Size: ~15MB
Timeout: 5 minutes
Memory: 512 MB
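
Because the daily trigger fires at 00:00 UTC and snapshots "yesterday", the date calculation is worth pinning to UTC explicitly. The sketch below shows one way to do that; the function name snapshot_date is an assumption for illustration and does not come from the backend code.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Optional


def snapshot_date(now: Optional[datetime] = None) -> date:
    """Return the UTC calendar day that a run at `now` should snapshot (the previous day).

    `now` is expected to be timezone-aware; it defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC first so a run at 00:00 UTC always maps to the day that just ended.
    return now.astimezone(timezone.utc).date() - timedelta(days=1)
```

Computing the date from the invocation time rather than the host's local clock keeps the snapshot stable across month and leap-day boundaries.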

4. Monthly Reset Lambda

Trigger: EventBridge on 1st of month at 00:00 UTC
Function: Aggregate daily_stats → monthly_stats, reset counters
Size: ~15MB
Timeout: 10 minutes
Memory: 1024 MB
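
The three EventBridge triggers above can be written as cron expressions. This is a sketch under the assumption that the stack uses cron-style schedules (an hourly rate expression would work equally well for the first one); the dictionary keys reuse the Lambda directory names, and cron_fields is a hypothetical validation helper.

```python
# EventBridge cron fields: minutes hours day-of-month month day-of-week year.
SCHEDULES = {
    "hourly-stats-updater": "cron(0 * * * ? *)",  # top of every hour
    "daily-snapshot": "cron(0 0 * * ? *)",        # every day at 00:00 UTC
    "monthly-reset": "cron(0 0 1 * ? *)",         # 1st of each month at 00:00 UTC
}


def cron_fields(expression):
    """Split an EventBridge cron expression into its six fields, or raise ValueError."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError(f"not a cron expression: {expression!r}")
    fields = expression[len("cron("):-1].split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    day_of_month, day_of_week = fields[2], fields[4]
    # This sketch requires exactly one of the two day fields to be "?".
    if (day_of_month == "?") == (day_of_week == "?"):
        raise ValueError("exactly one of day-of-month and day-of-week must be '?'")
    return fields
```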

File Structure

infrastructure/
└── lambda/
    ├── mqtt-counter/
    │   ├── index.py
    │   └── requirements.txt
    ├── hourly-stats-updater/
    │   ├── index.py
    │   └── requirements.txt (only sqlalchemy)
    ├── daily-snapshot/
    │   ├── index.py
    │   └── requirements.txt (only sqlalchemy)
    ├── monthly-reset/
    │   ├── index.py
    │   └── requirements.txt (only sqlalchemy)
    ├── device-heartbeat/
    │   └── index.py
    ├── device-logs/
    │   └── index.py
    └── psycopg2-layer/
        └── python/
            └── psycopg2/

Deployment

Build

cd infrastructure
npm run build

Deploy

cdk deploy

The bundling happens automatically during cdk deploy:

1. CDK starts a Docker container from the Python 3.11 bundling image
2. Installs sqlalchemy from requirements.txt
3. Copies the Lambda handler (index.py)
4. Copies the backend app code from ../../backend/app/
5. Creates __init__.py files so the copied directories are importable Python packages
6. Packages everything into a ZIP file
7. Uploads the package to Lambda

Benefits

✅ No Layer Size Issues: Each Lambda is self-contained, ~15MB total
✅ Fast Cold Starts: Minimal dependencies = faster Lambda init
✅ Independent Deployments: Can update Lambdas without affecting others
✅ Type Safety: Uses real backend models and services
✅ DRY Code: Reuses StatsUpdaterService logic

Alternative Approaches (Not Used)

❌ Lambda Layer

Problem: Exceeded 250MB limit with full backend deps
Would need: Separate layers for each dependency group

❌ ECS Scheduled Tasks

Pro: Can use full backend container
Con: More complex, requires ECS cluster, slower startup

❌ Lambda Container Images

Pro: Up to 10GB size limit
Con: Slower cold starts, more complex CI/CD

❌ Standalone Lambda Code

Pro: Minimal size
Con: Code duplication, no type safety, hard to maintain

Monitoring

Check Lambda execution logs:

aws logs tail /aws/lambda/RoboticksStack-HourlyStatsLambda-XXXXX --follow
aws logs tail /aws/lambda/RoboticksStack-DailySnapshotLambda-XXXXX --follow
aws logs tail /aws/lambda/RoboticksStack-MonthlyResetLambda-XXXXX --follow

Check CloudWatch metrics:

AWS/Lambda/Invocations
AWS/Lambda/Errors
AWS/Lambda/Duration
AWS/Lambda/ConcurrentExecutions
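
The metrics above can also be pulled programmatically. The sketch below only builds the request parameters for CloudWatch's GetMetricStatistics call, so it runs without AWS credentials; the helper name lambda_metric_query and its defaults are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def lambda_metric_query(function_name, metric_name="Errors", statistic="Sum",
                        hours=24, period_seconds=3600, now=None):
    """Build keyword arguments for a GetMetricStatistics request on one Lambda metric."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        # Lambda metrics are published per function under the FunctionName dimension.
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": [statistic],
    }
```

With boto3 available, the result could be passed as boto3.client("cloudwatch").get_metric_statistics(**query). Sum suits Invocations and Errors, while Average or Maximum is usually more informative for Duration and ConcurrentExecutions.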

Troubleshooting

“Unable to import module”

Check that app/__init__.py files exist
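Verify the backend code was copied correctly: aws lambda get-function --function-name XXX returns a Code.Location URL from which the deployed package can be downloaded and inspected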
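
Once the deployment package has been downloaded, the first check can be automated. This is a minimal sketch, assuming the package is a ZIP with the backend code under app/; the function name packages_missing_init is hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def packages_missing_init(zip_path, root="app"):
    """Return directories under `root` that contain .py files but no __init__.py."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip explicit directory entries; only file names matter here.
        names = [n for n in archive.namelist() if not n.endswith("/")]

    py_dirs = set()
    init_dirs = set()
    for name in names:
        path = PurePosixPath(name)
        if path.parts[0] != root or path.suffix != ".py":
            continue
        parent = str(path.parent)
        py_dirs.add(parent)
        if path.name == "__init__.py":
            init_dirs.add(parent)
    return sorted(py_dirs - init_dirs)
```

An empty list means every directory under app/ that holds Python files also has an __init__.py; any directory returned is a candidate cause of the import error.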

“No module named 'sqlalchemy'”

Check requirements.txt has sqlalchemy==2.0.23
Verify Docker bundling completed: look for “Bundling asset” in deploy logs

Size still too large

Remove unnecessary imports in StatsUpdaterService
Exclude unused backend modules
Move common dependencies into Lambda layers (note that layers still count toward the 250MB unzipped limit)
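
Before trimming anything, it helps to see where the bytes are going. The sketch below measures an unpacked bundle directory and lists its largest top-level entries; the function names are assumptions, and the limit constant reflects the 250MB unzipped limit mentioned in the Problem section.

```python
import os

UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024  # 250MB unzipped limit from the Problem section


def directory_size(path):
    """Return the total size in bytes of all regular files under `path`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_entries(path, top=5):
    """Return (name, size) for the top-level entries of `path`, largest first."""
    sizes = []
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            size = directory_size(entry.path)
        else:
            size = entry.stat(follow_symlinks=False).st_size
        sizes.append((entry.name, size))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Comparing directory_size(bundle_dir) against UNZIPPED_LIMIT_BYTES shows how much headroom remains, and largest_entries(bundle_dir) points at the packages worth excluding first.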

Future Improvements

Optimize Bundle Size: Only copy files actually used by StatsUpdaterService
Caching: Cache backend app layer between builds
Compression: Use custom Docker image with pre-installed deps
Monitoring: Add X-Ray tracing for performance insights

