Statistics Tracking Implementation
Overview
Implemented a comprehensive, production-ready statistics tracking system with hourly updates, daily snapshots, and monthly aggregations for billing and historical reporting.
Architecture
Data Flow
Database Schema
project_stats (Real-time Counters)
- Current month counters (logs, sessions, MQTT messages, API calls, etc.)
- Storage metrics (total_docker_storage_bytes, total_s3_storage_bytes, etc.)
- Resource counts (total_devices, active_devices, current_ecr_images)
- Lifetime aggregates (lifetime_logs, lifetime_sessions, etc.)
daily_stats (Daily Snapshots)
- Immutable daily records created at 00:00 UTC
- Captures previous day’s usage and storage
- Used for daily trend charts
- Aggregated into monthly stats
monthly_stats (Monthly Historical Records)
- Created on 1st of each month from daily_stats aggregation
- Peak storage values across the month
- Totals for usage (logs, sessions, MQTT, etc.)
- Billing metadata (tier, exceeded_limits, overage_charge)
- Used for invoicing and compliance
Lambda Functions
1. MQTT Counter Lambda (infrastructure/lambda/mqtt-counter/)
- Trigger: IoT Rule on all roboticks/# topics
- Frequency: Real-time (every MQTT message)
- Function: Increment MQTT inbound/outbound counters in database
- Timeout: 30 seconds
- Dependencies: psycopg2-binary, boto3
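A minimal sketch of how the counter Lambda could map a message to a project, assuming the topic layout roboticks/{org_id}/{project_id}/{device_id}/... used elsewhere in this document. The names parse_topic, TopicParts, and handler are hypothetical, and the sketch assumes the IoT Rule SQL exposes the topic in the event (for example via topic() AS topic); the database update is passed in as a callable so the parsing logic can be checked without a database.

```python
from typing import Callable, NamedTuple, Optional


class TopicParts(NamedTuple):
    org_id: str
    project_id: str
    device_id: str


def parse_topic(topic: str) -> Optional[TopicParts]:
    """Split roboticks/{org_id}/{project_id}/{device_id}/... into its IDs.

    Returns None for topics that do not match, so the caller can skip
    them instead of raising inside the Lambda.
    """
    parts = topic.split("/")
    if len(parts) < 4 or parts[0] != "roboticks":
        return None
    org_id, project_id, device_id = parts[1:4]
    if not (org_id and project_id and device_id):
        return None
    return TopicParts(org_id, project_id, device_id)


def handler(event: dict, context: object,
            increment: Optional[Callable[[str, int], None]] = None) -> dict:
    """Hypothetical entry point; assumes event["topic"] is set by the rule."""
    parsed = parse_topic(event.get("topic", ""))
    if parsed is None:
        return {"counted": False}
    if increment is not None:
        # In the deployed Lambda this would be the database counter update.
        increment(parsed.project_id, 1)
    return {"counted": True, "project_id": parsed.project_id}
```

Returning a result for unmatched topics, rather than raising, keeps malformed topics from showing up as Lambda errors.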
2. Hourly Stats Updater Lambda (infrastructure/lambda/hourly-stats-updater/)
- Trigger: EventBridge hourly schedule
- Frequency: Every hour
- Function: Query AWS APIs (ECR, S3, DB) and refresh project_stats
- Timeout: 5 minutes
- Memory: 512 MB
- Dependencies: psycopg2-binary, boto3, sqlalchemy, backend app layer
- ECR: list_images(), describe_images() for image counts and sizes
- S3: list_objects_v2() for storage calculation
- Database: Count devices, sessions, logs
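The S3 storage calculation can be sketched as follows. This is an illustration rather than the code in stats_updater.py: the function name s3_prefix_size_bytes is hypothetical, and the bucket and prefix values are whatever the project uses. It relies on the boto3 paginator for list_objects_v2, which matters because a single call returns at most 1,000 keys.

```python
def s3_prefix_size_bytes(s3_client, bucket: str, prefix: str) -> int:
    """Sum object sizes under a prefix, following pagination to the end.

    s3_client is expected to behave like boto3.client("s3").
    """
    total = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page (or the whole prefix) is empty.
        for obj in page.get("Contents", []):
            total += obj["Size"]
    return total
```

Passing the client in as an argument keeps the function easy to exercise with a stub; in the Lambda it would be called with boto3.client("s3").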
3. Daily Snapshot Lambda (infrastructure/lambda/daily-snapshot/)
- Trigger: EventBridge daily at 00:00 UTC
- Frequency: Daily
- Function: Create daily_stats snapshot for yesterday
- Timeout: 5 minutes
- Memory: 512 MB
- Dependencies: psycopg2-binary, boto3, sqlalchemy, backend app layer
- For each active project
- Read current project_stats
- Create daily_stats record with yesterday’s date
- Store usage counts, storage metrics, resource counts
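Because the snapshot runs at 00:00 UTC but is recorded under the previous day, the date arithmetic is worth spelling out. The sketch below is an assumption about how that step could look; snapshot_date and build_daily_record are hypothetical names, the date/year/month/day fields follow the daily_stats time period fields, and the stats dictionary stands in for whatever is read from project_stats.

```python
from datetime import date, datetime, timedelta, timezone


def snapshot_date(now: datetime) -> date:
    """Return the UTC calendar day a run at `now` is recorded under.

    `now` must be timezone-aware; a naive datetime would be interpreted
    as local time by astimezone().
    """
    return (now.astimezone(timezone.utc) - timedelta(days=1)).date()


def build_daily_record(project_id: int, stats: dict, now: datetime) -> dict:
    """Combine the period fields with the counters copied from project_stats."""
    day = snapshot_date(now)
    return {
        "project_id": project_id,
        "date": day,
        "year": day.year,
        "month": day.month,
        "day": day.day,
        **stats,
    }
```

A run on 1 January therefore produces a record dated 31 December of the previous year, which is the case most likely to go wrong if year and month are derived from the run time instead of the snapshot date.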
4. Monthly Reset Lambda (infrastructure/lambda/monthly-reset/)
- Trigger: EventBridge on 1st of month at 00:00 UTC
- Frequency: Monthly
- Function: Aggregate daily_stats → monthly_stats, reset counters
- Timeout: 10 minutes
- Memory: 1024 MB
- Dependencies: psycopg2-binary, boto3, sqlalchemy, backend app layer
- For each active project
- Query all daily_stats for previous month
- Sum usage counts (logs, sessions, MQTT, etc.)
- Find peak storage values
- Create monthly_stats record
- Archive to lifetime totals
- Reset monthly counters to 0
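The aggregation steps above can be sketched in plain Python. This assumes each daily_stats row holds that day's own usage counts (not month-to-date totals), so that summing is correct; the field names in USAGE_FIELDS are illustrative, and aggregate_month and previous_month are hypothetical helpers, not the service's actual methods.

```python
from typing import Iterable, Tuple

# Illustrative field names; the real column names may differ.
USAGE_FIELDS = ("logs", "sessions", "mqtt_inbound", "mqtt_outbound")
STORAGE_FIELDS = ("s3_storage_bytes", "docker_storage_bytes")


def previous_month(year: int, month: int) -> Tuple[int, int]:
    """Return (year, month) of the month before the given one."""
    return (year - 1, 12) if month == 1 else (year, month - 1)


def aggregate_month(daily_rows: Iterable[dict]) -> dict:
    """Sum usage counters and take the peak of each storage metric."""
    rows = list(daily_rows)
    result = {}
    for field in USAGE_FIELDS:
        result[field] = sum(row.get(field, 0) for row in rows)
    for field in STORAGE_FIELDS:
        # default=0 covers a month with no daily snapshots at all.
        result["peak_" + field] = max((row.get(field, 0) for row in rows), default=0)
    return result
```

Storage is a level rather than a flow, which is why it is aggregated with max instead of sum.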
Backend Services
StatsUpdaterService (backend/app/services/stats_updater.py)
Production-ready service with real AWS integration.
Key Methods:
- update_all_projects() - Refresh stats for all active projects
- update_project_stats(project) - Update one project from AWS APIs
- increment_mqtt_inbound(project_id, count) - Real-time counter
- increment_mqtt_outbound(project_id, count) - Real-time counter
- create_daily_snapshot(project) - Create daily stats record
- create_all_daily_snapshots() - Create for all projects
- reset_monthly_stats(project) - Aggregate and reset
- reset_all_monthly_stats() - Reset all projects
- CloudWatch: MQTT metrics validation
- ECR: Image counts and sizes via ecr.list_images(), ecr.describe_images()
- S3: Storage usage via s3.list_objects_v2() with project prefix
- Database: Resource counts via SQLAlchemy queries
LimitService (backend/app/services/limit_service.py)
Refactored to be read-only: it checks limits and no longer updates stats.
Methods:
- check_project_limit(org) - Verify project count
- check_user_limit(org, count) - Verify user count
- check_device_limit(org, project) - Verify device count
- check_storage_limit(org, project, bytes) - Verify storage
- check_mqtt_inbound_limit(org, project) - Verify MQTT inbound
- check_mqtt_outbound_limit(org, project) - Verify MQTT outbound
- check_ecr_image_limit(org, project) - Verify ECR images
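To illustrate what a read-only check means here, the sketch below compares a current count against a limit and returns a result without touching any counters. It is not the LimitService code: LimitResult and check_count_limit are hypothetical, and treating a limit of None as "unlimited" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LimitResult:
    allowed: bool
    current: int
    limit: Optional[int]


def check_count_limit(current: int, limit: Optional[int], adding: int = 1) -> LimitResult:
    """Decide whether `adding` more items fit under `limit`.

    Pure function: it reads the values it is given and writes nothing.
    A limit of None is treated as unlimited (an assumption of this sketch).
    """
    if limit is None:
        return LimitResult(True, current, None)
    return LimitResult(current + adding <= limit, current, limit)
```

Keeping the check free of side effects means it can be called from request handlers as often as needed, while all writes stay in StatsUpdaterService.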
Database Migrations
Migration: 20251108_create_daily_stats.py (Revision: j8901234567h)
Creates daily_stats table with:
- Foreign keys to projects and organizations
- Time period fields (date, year, month, day)
- Usage counters (logs, sessions, MQTT, API, deployments)
- Storage snapshots (storage_bytes, s3_storage_bytes, docker_storage_bytes, logs_storage_bytes)
- Network metrics (docker_upload_bytes, docker_download_bytes, data_transfer_bytes)
- Resource counts (device_count, active_device_count, ecr_image_count)
- Indexes for efficient querying by project, org, date, year/month
- Unique constraint on (project_id, date)
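The unique constraint on (project_id, date) is what makes a retried snapshot run safe. The sketch below demonstrates the idea with the standard-library sqlite3 module purely so that it runs anywhere; the production database is PostgreSQL, where the equivalent statement would be INSERT ... ON CONFLICT (project_id, date) DO NOTHING. The table is reduced to three columns and the helper names are hypothetical.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a cut-down daily_stats table with the same unique constraint."""
    conn.execute(
        """
        CREATE TABLE daily_stats (
            project_id INTEGER NOT NULL,
            date       TEXT    NOT NULL,
            logs       INTEGER NOT NULL DEFAULT 0,
            UNIQUE (project_id, date)
        )
        """
    )


def insert_snapshot_once(conn: sqlite3.Connection, project_id: int,
                         day: str, logs: int) -> bool:
    """Insert a snapshot; return False if one already exists for that day."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO daily_stats (project_id, date, logs) VALUES (?, ?, ?)",
        (project_id, day, logs),
    )
    # rowcount is 0 when the unique constraint caused the row to be skipped.
    return cur.rowcount == 1
```

With this pattern, a Lambda that is invoked twice for the same day leaves the first record untouched, which matches the "immutable daily records" intent.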
Migration: 20251108_create_monthly_stats.py (Revision: i7890123456g)
Creates monthly_stats table with:
- Foreign keys to projects and organizations
- Time period fields (year, month)
- Aggregated usage totals
- Peak storage values
- Billing metadata (tier, exceeded_limits, overage_charge)
- Indexes for efficient querying by project, org, year/month
- Unique constraint on (project_id, year, month)
CDK Infrastructure
Lambda Layers
psycopg2Layer (existing):
- Contains psycopg2-binary for PostgreSQL access
- Used by all Lambdas
- Bundles backend app code and dependencies
- Used by hourly-stats, daily-snapshot, monthly-reset Lambdas
- Enables importing app.services.stats_updater
Environment Variables
All stats Lambdas receive:
IAM Permissions
- Database: All Lambdas can connect to RDS PostgreSQL in VPC
- Secrets Manager: All Lambdas can read DB credentials
- S3: Hourly stats Lambda can read storage bucket
- ECR: Hourly stats Lambda can pull/describe images
- IoT: MQTT counter Lambda invoked by IoT Rules
EventBridge Schedules
- Hourly: Schedule.rate(cdk.Duration.hours(1))
- Daily: Schedule.cron({ hour: '0', minute: '0' })
- Monthly: Schedule.cron({ day: '1', hour: '0', minute: '0' })
Testing
Local Development
Lambda Testing
- Deploy stack: cd infrastructure && cdk deploy
- Monitor CloudWatch Logs for each Lambda
- Check database for daily_stats and monthly_stats records
- Verify project_stats counters increment correctly
Manual Triggers
Future Enhancements
High Priority
- Add retry logic with exponential backoff for AWS API throttling
- Implement dead-letter queues for failed Lambda invocations
- Add CloudWatch alarms for Lambda errors
- Create API endpoints to query daily_stats and monthly_stats
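For the retry item above, one common approach is exponential backoff with full jitter. The sketch below is a generic helper, not existing project code: retry_with_backoff and its parameters are hypothetical, and the caller supplies the predicate that decides which exceptions count as throttling.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable failures with capped, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors that are not transient.
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the current cap.
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

boto3 clients also accept a retry configuration through botocore.config.Config(retries={...}), which may be enough on its own for throttled AWS calls; a helper like this is more useful for wrapping multi-step operations.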
Medium Priority
- Add query-by-date-range logic in create_daily_snapshot() for more accurate counts
- Implement pagination for large S3 buckets and ECR repositories
- Add support for multiple regions
- Create admin dashboard for viewing aggregated stats
Low Priority
- Add support for custom retention policies (delete old daily_stats)
- Export monthly_stats to S3 for long-term archival
- Integrate with Stripe for automated billing
- Add anomaly detection for unusual usage patterns
Deployment Checklist
- Database migrations created and reviewed
- Lambda functions written and tested locally
- CDK stack updated with Lambda definitions
- EventBridge rules configured
- IoT Rule for MQTT counter created
- IAM permissions granted
- Environment variables configured
- Run database migrations: cd backend && alembic upgrade head
- Deploy CDK stack: cd infrastructure && cdk deploy
- Verify Lambda functions in AWS console
- Check CloudWatch Logs for errors
- Verify daily_stats records created after 00:00 UTC
- Monitor MQTT message counters in real-time
- Wait for 1st of month to verify monthly_stats creation
Monitoring
CloudWatch Metrics
Monitor Lambda invocations, errors, and duration:
- AWS/Lambda/Invocations
- AWS/Lambda/Errors
- AWS/Lambda/Duration
- AWS/Lambda/Throttles
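These metrics can also be read programmatically. The sketch below sums the Errors metric for one function over a recent window using the CloudWatch get_metric_statistics call; lambda_error_count is a hypothetical helper, and the function name passed in is whatever the deployed Lambda is called.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def lambda_error_count(cloudwatch_client, function_name: str,
                       hours: int = 24, now: Optional[datetime] = None) -> int:
    """Return the number of Lambda errors in the last `hours` hours.

    cloudwatch_client is expected to behave like boto3.client("cloudwatch").
    """
    end = now or datetime.now(timezone.utc)
    response = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="Errors",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=3600,  # one datapoint per hour
        Statistics=["Sum"],
    )
    return int(sum(point["Sum"] for point in response.get("Datapoints", [])))
```

Swapping MetricName for Invocations or Throttles gives the other counts; Duration would normally be read with the Average or Maximum statistic instead of Sum.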
Database Queries
Troubleshooting
MQTT counters not incrementing
- Check IoT Rule is active: aws iot get-topic-rule --rule-name MqttCounterRule
- Verify Lambda has IoT invoke permission
- Check CloudWatch Logs for MQTT counter Lambda
- Verify topic format: roboticks/{org_id}/{project_id}/{device_id}/...
Hourly stats not updating
- Check EventBridge rule is enabled
- Verify Lambda timeout is sufficient (5 min)
- Check AWS API rate limits
- Verify VPC and security group configuration
Daily snapshots missing
- Verify EventBridge cron schedule
- Check Lambda execution logs
- Ensure database has active projects
- Verify timezone is UTC
Monthly reset failed
- Check if daily_stats exist for previous month
- Verify Lambda memory is sufficient (1024 MB)
- Check database transaction timeout
- Review aggregation logic for edge cases
References
- Backend service: backend/app/services/stats_updater.py
- Limit checking: backend/app/services/limit_service.py
- Lambda functions: infrastructure/lambda/*/
- CDK stack: infrastructure/lib/roboticks-stack.ts
- Database models: backend/app/models/daily_stats.py, backend/app/models/monthly_stats.py
- Migrations: backend/alembic/versions/20251108_*.py