Stats Tracking Architecture
Overview
Roboticks tracks usage statistics at three granularities: hourly, daily, and monthly. This document explains how stats flow through the system.
Data Flow
Hourly lambda → project_stats (live counters) + hourly_stats → Daily lambda → daily_stats → Monthly lambda → monthly_stats (immutable) + project_stats reset
Database Tables
project_stats (1 record per project)
Purpose: Live counters for current state and current month
Updated by: Hourly lambda
Columns:
- current_month_*: Counters reset monthly (logs, sessions, MQTT, API, storage)
- total_*: Lifetime cumulative counters (devices, sessions, logs)
- lifetime_*: Archived totals from completed months
- last_calculated_at: When the hourly lambda last ran
hourly_stats (1 record per project per hour)
Purpose: Granular hourly snapshots for detailed charts
Created by: Hourly lambda (every hour)
Columns - Activity (DELTAS):
- logs, sessions: NEW activity this hour
- mqtt_inbound_messages, mqtt_outbound_messages: NEW messages this hour
- api_calls, deployments: NEW activity this hour
Columns - State (SNAPSHOTS):
- storage_bytes, s3_storage_bytes, docker_storage_bytes: Current state at end of hour
- device_count, active_device_count, ecr_image_count: Current counts at end of hour
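For reference, a minimal sketch of the table shape, assuming a PostgreSQL-style schema. Types, key columns (id, project_id, organization_id, timestamp), and constraints are assumptions, not the actual migration:

```sql
-- Sketch only: column names follow the descriptions above; types/keys are assumed.
CREATE TABLE hourly_stats (
    id                      BIGSERIAL PRIMARY KEY,
    project_id              BIGINT NOT NULL,        -- assumed FK to projects
    organization_id         BIGINT NOT NULL,        -- for org-level aggregation
    timestamp               TIMESTAMPTZ NOT NULL,   -- start of the hour
    -- Activity (deltas for this hour)
    logs                    BIGINT NOT NULL DEFAULT 0,
    sessions                BIGINT NOT NULL DEFAULT 0,
    mqtt_inbound_messages   BIGINT NOT NULL DEFAULT 0,
    mqtt_outbound_messages  BIGINT NOT NULL DEFAULT 0,
    api_calls               BIGINT NOT NULL DEFAULT 0,
    deployments             BIGINT NOT NULL DEFAULT 0,
    -- State (snapshots at end of hour)
    storage_bytes           BIGINT NOT NULL DEFAULT 0,
    s3_storage_bytes        BIGINT NOT NULL DEFAULT 0,
    docker_storage_bytes    BIGINT NOT NULL DEFAULT 0,
    device_count            INTEGER NOT NULL DEFAULT 0,
    active_device_count     INTEGER NOT NULL DEFAULT 0,
    ecr_image_count         INTEGER NOT NULL DEFAULT 0,
    UNIQUE (project_id, timestamp)
);
```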
daily_stats (1 record per project per day)
Purpose: Daily aggregations for daily/weekly charts
Created by: Daily lambda (00:00 UTC)
How it’s calculated:
- Activity: SUM(hourly_stats.logs) - adds up 24 hourly deltas
- Storage: MAX(hourly_stats.storage_bytes) - peak storage during the day
- Resources: SELECT ... ORDER BY timestamp DESC LIMIT 1 - last hour's value
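Illustrative sketch of that aggregation, assuming hourly_stats carries project_id and timestamp columns; the real lambda's SQL may differ:

```sql
-- Roll yesterday's 24 hourly rows into one daily row per project (sketch).
SELECT
    project_id,
    SUM(logs)          AS logs,            -- activity: add up the hourly deltas
    SUM(sessions)      AS sessions,
    MAX(storage_bytes) AS storage_bytes    -- storage: peak snapshot during the day
FROM hourly_stats
WHERE timestamp >= CURRENT_DATE - INTERVAL '1 day'   -- yesterday 00:00
  AND timestamp <  CURRENT_DATE                      -- today 00:00
GROUP BY project_id;
-- Resource counts (device_count, ecr_image_count, ...) come from the latest
-- hourly row instead (ORDER BY timestamp DESC LIMIT 1 per project).
```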
monthly_stats (1 record per project per month)
Purpose: Historical billing records, immutable after creation
Created by: Monthly lambda (1st of month, 00:00 UTC)
How it’s calculated:
- Activity: SUM(daily_stats.logs) - adds up all daily sums (28-31 days)
- Storage: MAX(daily_stats.storage_bytes) - peak storage during the month
- Resources: Last day's values
Additional columns:
- tier: Organization tier at time of snapshot (for billing audit)
- exceeded_limits: Boolean set if any usage exceeded limits
- overage_charge: Reserved for future billing integration
Lambda Functions
Hourly Stats Updater
Schedule: Every hour (via EventBridge)
Runtime: ~30-60 seconds per project
What it does:
- Queries the database for NEW logs/sessions created this hour (delta)
- Queries ECR for current image count and sizes (snapshot)
- Queries S3 for current storage usage (snapshot)
- Updates project_stats table (cumulative counters)
- Creates hourly_stats record with hourly deltas
Delta window for activity counts: created_at >= hour_start AND created_at < hour_end
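A sketch of the delta computation and counter update. It assumes a logs table with project_id/created_at columns and counter names following the current_month_*/total_* patterns; it is not the lambda's actual SQL:

```sql
-- 1) Count NEW log rows for the hour being processed (a delta, not a running total).
SELECT project_id, COUNT(*) AS new_logs
FROM logs
WHERE created_at >= :hour_start
  AND created_at <  :hour_end
GROUP BY project_id;

-- 2) Fold the delta into the live counters in project_stats.
UPDATE project_stats
SET current_month_logs = current_month_logs + :new_logs,
    total_logs         = total_logs + :new_logs,
    last_calculated_at = NOW()
WHERE project_id = :project_id;
```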
Daily Snapshot
Schedule: Daily at 00:00 UTC (via EventBridge)
Runtime: ~5-10 seconds per project
What it does:
- Queries hourly_stats for yesterday (24 records)
- Aggregates:
  - Activity: SUM(logs, sessions, mqtt, api)
  - Storage: MAX(storage_bytes)
  - Resources: ORDER BY timestamp DESC LIMIT 1 (last hour)
- Creates daily_stats record
Monthly Reset
Schedule: 1st of month at 00:00 UTC (via EventBridge)
Runtime: ~10-15 seconds per project
What it does:
- Queries daily_stats for the previous month (28-31 records)
- Aggregates the same way as daily (SUM activity, MAX storage)
- Archives to project_stats.lifetime_* counters
- Resets project_stats.current_month_* to 0
- Updates project_stats.month_start_date
- Creates monthly_stats record (immutable historical billing record)
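A sketch of the archive-and-reset step and the monthly insert. Column names follow the current_month_*/lifetime_* patterns above; the date/month columns and the :organization_tier parameter are assumptions:

```sql
-- Archive the finished month into lifetime_* and reset the live counters.
UPDATE project_stats
SET lifetime_logs          = lifetime_logs + current_month_logs,
    lifetime_sessions      = lifetime_sessions + current_month_sessions,
    current_month_logs     = 0,
    current_month_sessions = 0,
    month_start_date       = DATE_TRUNC('month', NOW())
WHERE project_id = :project_id;

-- Then write the immutable billing record from the month's daily rows.
INSERT INTO monthly_stats (project_id, month, logs, storage_bytes, tier)
SELECT :project_id,
       DATE_TRUNC('month', NOW() - INTERVAL '1 month'),
       SUM(logs),            -- sum of the daily sums
       MAX(storage_bytes),   -- peak storage for the month
       :organization_tier    -- tier snapshot for billing audit
FROM daily_stats
WHERE project_id = :project_id
  AND date >= DATE_TRUNC('month', NOW() - INTERVAL '1 month')
  AND date <  DATE_TRUNC('month', NOW());
```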
Data Retention
Recommended retention:
- hourly_stats: 7 days (for recent detailed charts)
- daily_stats: 90 days (for monthly/weekly charts)
- monthly_stats: Forever (immutable billing records)
- project_stats: Forever (live counters)
Querying Stats
Current month usage (for billing checks)
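A sketch; column names follow the current_month_* pattern described above and are assumptions:

```sql
-- Live counters for the month in progress (used for limit/billing checks).
SELECT current_month_logs,
       current_month_sessions,
       current_month_api_calls,
       current_month_storage_bytes,
       last_calculated_at
FROM project_stats
WHERE project_id = :project_id;
```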
Yesterday’s detailed stats
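A sketch, assuming daily_stats has a date column:

```sql
SELECT *
FROM daily_stats
WHERE project_id = :project_id
  AND date = CURRENT_DATE - 1;
```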
Last 24 hours (hourly breakdown)
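A sketch for the hourly chart, assuming a timestamp column on hourly_stats:

```sql
SELECT timestamp, logs, sessions, api_calls, storage_bytes, device_count
FROM hourly_stats
WHERE project_id = :project_id
  AND timestamp >= NOW() - INTERVAL '24 hours'
ORDER BY timestamp;
```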
Organization-level aggregation
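A sketch; it relies on the organization_id column that every stats table carries (see Architecture Benefits), with assumed counter names:

```sql
-- Current-month usage across all projects in an organization.
SELECT organization_id,
       SUM(current_month_logs)          AS logs,
       SUM(current_month_sessions)      AS sessions,
       SUM(current_month_storage_bytes) AS storage_bytes
FROM project_stats
WHERE organization_id = :organization_id
GROUP BY organization_id;
```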
TODOs / Future Implementation
1. MQTT Message Tracking
Current: Hardcoded to 0 in all lambdas
Planned:
- mqtt-counter lambda writes to DynamoDB table mqtt_message_counts
- Hourly lambda queries DynamoDB for last hour's counts
- Adds to hourly_stats.mqtt_inbound_messages and mqtt_outbound_messages
2. API Call Tracking
Current: Not tracked
Planned:
- Middleware in app/middleware/api_tracker.py
- Increments counter in Redis: api_calls:project:{project_id}:hour:{timestamp}
- Hourly lambda reads from Redis and resets the counter
3. Deployment Tracking
Current: Not tracked
Planned:
- ECR push webhook triggers Lambda
- Lambda increments deployments:project:{project_id}:hour:{timestamp} in Redis
- Hourly lambda aggregates
4. Network Transfer Tracking
Current: Hardcoded to 0
Planned:
- CloudWatch metrics from ALB (data transfer)
- ECR API metrics (docker upload/download)
- CloudFront metrics (if CDN used)
5. Hourly Stats Cleanup Job
Current: No cleanup
Planned:
- Weekly Lambda deletes hourly_stats older than 7 days
- Keeps daily_stats for 90 days
- Never deletes monthly_stats (billing records)
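A sketch of the planned retention deletes, using the table and column names assumed above (the weekly lambda does not exist yet):

```sql
-- Drop hourly detail older than 7 days; keep daily rows for 90 days.
-- monthly_stats is never deleted (immutable billing records).
DELETE FROM hourly_stats WHERE timestamp < NOW() - INTERVAL '7 days';
DELETE FROM daily_stats  WHERE date      < CURRENT_DATE - 90;
```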
Testing
Manual Testing
1. Trigger the hourly lambda manually (e.g., with a test invoke from the AWS console) and check that a new hourly_stats record appears for the current hour.
Validation Queries
Verify no double-counting:
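One way to check that each daily row equals the sum of its hourly deltas (a sketch, using the column names assumed above):

```sql
-- For each project/day, the daily activity sum should equal the sum of its 24 hourly deltas.
SELECT d.project_id,
       d.date,
       d.logs      AS daily_logs,
       SUM(h.logs) AS hourly_logs_sum
FROM daily_stats d
JOIN hourly_stats h
  ON h.project_id = d.project_id
 AND h.timestamp >= d.date
 AND h.timestamp <  d.date + INTERVAL '1 day'
GROUP BY d.project_id, d.date, d.logs
HAVING d.logs <> SUM(h.logs);   -- any rows returned indicate double-counting or gaps
```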
Architecture Benefits
✅ No double-counting: Activity counts are deltas at each level (hourly → daily → monthly)
✅ Accurate storage: Storage is always a snapshot; we take MAX to show peak usage
✅ Granular charts: Frontend can show hourly trends for the last 24 hours
✅ Fast queries: Aggregations are pre-computed, no real-time summing
✅ Billing accuracy: Monthly records are an immutable audit trail
✅ Efficient storage: Hourly records are cleaned up after 7 days
✅ Multi-tenant: All tables have organization_id for fast org-level aggregation