Stats Tracking Architecture

Overview

Roboticks tracks usage statistics at three granularities: hourly, daily, and monthly. This document explains how stats flow through the system.

Data Flow

┌─────────────────────────────────────────────────────────┐
│                  HOURLY (every hour)                     │
│  ┌─────────────────────────────────────────────────┐   │
│  │ Hourly Lambda (EventBridge)                      │   │
│  │  - Counts NEW logs/sessions this hour (delta)    │   │
│  │  - Queries AWS (ECR, S3) for snapshots          │   │
│  │  - Updates project_stats (cumulative counters)   │   │
│  │  - Creates hourly_stats record                   │   │
│  └─────────────────────────────────────────────────┘   │
│                        ↓                                  │
│             hourly_stats table (1 row/project/hour)      │
│              - logs: HOURLY DELTA (new this hour)        │
│              - storage_bytes: SNAPSHOT (current state)   │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│                 DAILY (00:00 UTC)                        │
│  ┌─────────────────────────────────────────────────┐   │
│  │ Daily Lambda (EventBridge)                       │   │
│  │  - Aggregates yesterday's 24 hourly_stats rows   │   │
│  │  - Activity: SUM(logs, sessions, MQTT, API)      │   │
│  │  - Storage: MAX(storage_bytes) from yesterday    │   │
│  │  - Resources: LAST(devices, images) from 23:00   │   │
│  │  - Creates daily_stats record                    │   │
│  └─────────────────────────────────────────────────┘   │
│                        ↓                                  │
│             daily_stats table (1 row/project/day)        │
│              - logs: DAILY SUM (24 hourly deltas)        │
│              - storage_bytes: PEAK (max from 24 hours)   │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│              MONTHLY (1st of month, 00:00 UTC)           │
│  ┌─────────────────────────────────────────────────┐   │
│  │ Monthly Lambda (EventBridge)                     │   │
│  │  - Aggregates previous month's daily_stats       │   │
│  │  - Activity: SUM(logs, sessions, MQTT, API)      │   │
│  │  - Storage: MAX(storage_bytes) from all days     │   │
│  │  - Resources: LAST(devices, images) from last day│   │
│  │  - Archives to lifetime_* in project_stats       │   │
│  │  - Resets current_month_* counters to 0          │   │
│  │  - Creates monthly_stats record                  │   │
│  └─────────────────────────────────────────────────┘   │
│                        ↓                                  │
│           monthly_stats table (1 row/project/month)      │
│              - logs: MONTHLY SUM (all daily sums)        │
│              - peak_storage_bytes: PEAK (max from days)  │
└─────────────────────────────────────────────────────────┘

Database Tables

project_stats (1 record per project)

Purpose: Live counters for current state and current month
Updated by: Hourly lambda
Columns:
  • current_month_*: Counters reset monthly (logs, sessions, MQTT, API, storage)
  • total_*: Lifetime cumulative counters (devices, sessions, logs)
  • lifetime_*: Archived totals from completed months
  • last_calculated_at: When hourly lambda last ran

hourly_stats (1 record per project per hour)

Purpose: Granular hourly snapshots for detailed charts
Created by: Hourly lambda (every hour)
Columns - Activity (DELTAS):
  • logs, sessions: NEW activity this hour
  • mqtt_inbound_messages, mqtt_outbound_messages: NEW messages this hour
  • api_calls, deployments: NEW activity this hour
Columns - Resources (SNAPSHOTS):
  • storage_bytes, s3_storage_bytes, docker_storage_bytes: Current state at end of hour
  • device_count, active_device_count, ecr_image_count: Current counts at end of hour
Key insight: Activity counts are DELTAS (incremental), storage/resources are SNAPSHOTS (absolute values)

daily_stats (1 record per project per day)

Purpose: Daily aggregations for daily/weekly charts
Created by: Daily lambda (00:00 UTC)
How it’s calculated:
  • Activity: SUM(hourly_stats.logs) - adds up 24 hourly deltas
  • Storage: MAX(hourly_stats.storage_bytes) - peak storage during the day
  • Resources: SELECT ... ORDER BY timestamp DESC LIMIT 1 - last hour’s value
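The three aggregation rules above can be sketched in plain Python (no database; field names mirror the hourly_stats columns, and the function name is illustrative, not the lambda's actual code):

```python
def aggregate_day(hourly_rows: list[dict]) -> dict:
    """Combine one day's hourly rows: SUM activity deltas, MAX storage
    snapshots, take the LAST row's resource counts.

    hourly_rows must be ordered by timestamp ascending."""
    last = hourly_rows[-1]  # resources come from the final hour of the day
    return {
        "logs": sum(r["logs"] for r in hourly_rows),                    # delta -> SUM
        "sessions": sum(r["sessions"] for r in hourly_rows),            # delta -> SUM
        "storage_bytes": max(r["storage_bytes"] for r in hourly_rows),  # snapshot -> MAX
        "device_count": last["device_count"],                           # snapshot -> LAST
    }

rows = [
    {"logs": 10, "sessions": 2, "storage_bytes": 500, "device_count": 3},
    {"logs": 5,  "sessions": 1, "storage_bytes": 450, "device_count": 4},
]
daily = aggregate_day(rows)
# daily == {"logs": 15, "sessions": 3, "storage_bytes": 500, "device_count": 4}
```

Note that summing the storage snapshots here would give 950 bytes for a project that never held more than 500, which is exactly the double-counting the MAX rule avoids.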

monthly_stats (1 record per project per month)

Purpose: Historical billing records, immutable after creation
Created by: Monthly lambda (1st of month, 00:00 UTC)
How it’s calculated:
  • Activity: SUM(daily_stats.logs) - adds up all daily sums (28-31 days)
  • Storage: MAX(daily_stats.storage_bytes) - peak storage during the month
  • Resources: Last day’s values
Additional fields:
  • tier: Organization tier at time of snapshot (for billing audit)
  • exceeded_limits: Boolean if any usage exceeded limits
  • overage_charge: Future billing integration

Lambda Functions

Hourly Stats Updater

Schedule: Every hour (via EventBridge)
Runtime: ~30-60 seconds per project
What it does:
  1. Queries database for NEW logs/sessions created this hour (delta)
  2. Queries ECR for current image count and sizes (snapshot)
  3. Queries S3 for current storage usage (snapshot)
  4. Updates project_stats table (cumulative counters)
  5. Creates hourly_stats record with hourly deltas
Key implementation detail: Calculates hourly deltas by counting records with created_at >= hour_start AND created_at < hour_end

Daily Snapshot

Schedule: Daily at 00:00 UTC (via EventBridge)
Runtime: ~5-10 seconds per project
What it does:
  1. Queries hourly_stats for yesterday (24 records)
  2. Aggregates:
    • Activity: SUM(logs, sessions, mqtt, api)
    • Storage: MAX(storage_bytes)
    • Resources: ORDER BY timestamp DESC LIMIT 1 (last hour)
  3. Creates daily_stats record
Why this is correct: We SUM activity deltas (logs) but MAX storage snapshots (bytes), preventing double-counting

Monthly Reset

Schedule: 1st of month at 00:00 UTC (via EventBridge)
Runtime: ~10-15 seconds per project
What it does:
  1. Queries daily_stats for previous month (28-31 records)
  2. Aggregates same way as daily (SUM activity, MAX storage)
  3. Archives to project_stats.lifetime_* counters
  4. Resets project_stats.current_month_* to 0
  5. Updates project_stats.month_start_date
  6. Creates monthly_stats record (immutable historical billing record)
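Steps 3 and 4 (archive, then reset) can be sketched in plain Python. Field names mirror the project_stats column prefixes; the function itself is illustrative, not the lambda's code:

```python
def monthly_rollover(stats: dict) -> dict:
    """Fold each current_month_* counter into its lifetime_* counterpart,
    then zero the monthly counter for the new month."""
    rolled = dict(stats)  # leave the input untouched
    for key in stats:
        if key.startswith("current_month_"):
            lifetime_key = key.replace("current_month_", "lifetime_")
            rolled[lifetime_key] = rolled.get(lifetime_key, 0) + stats[key]
            rolled[key] = 0  # reset for the new month
    return rolled

s = {"current_month_logs": 120, "lifetime_logs": 880}
s2 = monthly_rollover(s)
# s2 == {"current_month_logs": 0, "lifetime_logs": 1000}
```

In the real lambda this update and the monthly_stats insert would need to run in one transaction, so a failure can't archive the counters without also recording the month.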

Data Retention

Recommended retention:
  • hourly_stats: 7 days (for recent detailed charts)
  • daily_stats: 90 days (for monthly/weekly charts)
  • monthly_stats: Forever (immutable billing records)
  • project_stats: Forever (live counters)
Cleanup jobs (TODO):
-- Delete hourly stats older than 7 days
DELETE FROM hourly_stats WHERE created_at < NOW() - INTERVAL '7 days';

-- Delete daily stats older than 90 days
DELETE FROM daily_stats WHERE created_at < NOW() - INTERVAL '90 days';

Querying Stats

Current month usage (for billing checks)

project = db.query(Project).filter(Project.id == project_id).first()
stats = project.stats

current_month_logs = stats.current_month_logs
current_month_storage = stats.current_month_storage_bytes

Yesterday’s detailed stats

from datetime import UTC, datetime, timedelta

yesterday = datetime.now(UTC) - timedelta(days=1)
yesterday_start = yesterday.replace(hour=0, minute=0, second=0, microsecond=0)

daily_snapshot = db.query(DailyStats).filter(
    DailyStats.project_id == project_id,
    DailyStats.date == yesterday_start
).first()

print(f"Yesterday: {daily_snapshot.logs} logs, {daily_snapshot.sessions} sessions")

Last 24 hours (hourly breakdown)

now = datetime.now(UTC)
yesterday = now - timedelta(days=1)

hourly_data = db.query(HourlyStats).filter(
    HourlyStats.project_id == project_id,
    HourlyStats.timestamp >= yesterday
).order_by(HourlyStats.timestamp).all()

for hour in hourly_data:
    print(f"{hour.timestamp.hour}:00 - {hour.logs} logs")

Organization-level aggregation

from sqlalchemy import func

# Sum all projects in organization
org_daily_logs = db.query(func.sum(DailyStats.logs)).filter(
    DailyStats.organization_id == org_id,
    DailyStats.date >= month_start
).scalar()

TODOs / Future Implementation

1. MQTT Message Tracking

Current: Hardcoded to 0 in all lambdas
Planned:
  • mqtt-counter lambda writes to DynamoDB table mqtt_message_counts
  • Hourly lambda queries DynamoDB for last hour’s counts
  • Adds to hourly_stats.mqtt_inbound_messages and mqtt_outbound_messages
DynamoDB schema:
partition_key: project_id
sort_key: hour_timestamp
attributes: {inbound_count, outbound_count}
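A hedged sketch of the Query parameters the hourly lambda could pass to the low-level DynamoDB client against this table (nothing is deployed yet; table and attribute names come from the schema above, and the hour_timestamp encoding is an assumption):

```python
def mqtt_query_params(project_id: int, hour_timestamp: str) -> dict:
    """Build boto3 client.query() kwargs for one project-hour of MQTT counts."""
    return {
        "TableName": "mqtt_message_counts",
        "KeyConditionExpression": "project_id = :p AND hour_timestamp = :h",
        "ExpressionAttributeValues": {
            ":p": {"N": str(project_id)},   # partition key
            ":h": {"S": hour_timestamp},    # sort key
        },
    }

params = mqtt_query_params(42, "2025-11-03T13:00:00Z")
# Usage: boto3.client("dynamodb").query(**params), then read
# inbound_count / outbound_count from the returned item.
```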

2. API Call Tracking

Current: Not tracked
Planned:
  • Middleware in app/middleware/api_tracker.py
  • Increments counter in Redis: api_calls:project:{project_id}:hour:{timestamp}
  • Hourly lambda reads from Redis and resets counter

3. Deployment Tracking

Current: Not tracked
Planned:
  • ECR push webhook triggers Lambda
  • Lambda increments: deployments:project:{project_id}:hour:{timestamp} in Redis
  • Hourly lambda aggregates

4. Network Transfer Tracking

Current: Hardcoded to 0
Planned:
  • CloudWatch metrics from ALB (data transfer)
  • ECR API metrics (docker upload/download)
  • CloudFront metrics (if CDN used)

5. Hourly Stats Cleanup Job

Current: No cleanup
Planned:
  • Weekly Lambda deletes hourly_stats older than 7 days
  • Keeps daily_stats for 90 days
  • Never deletes monthly_stats (billing records)

Testing

Manual Testing

1. Trigger hourly lambda:
cd infrastructure
npm run invoke-hourly-stats
2. Trigger daily lambda:
npm run invoke-daily-snapshot
3. Check database:
-- Latest hourly stats
SELECT * FROM hourly_stats
WHERE project_id = 1
ORDER BY timestamp DESC
LIMIT 24;

-- Latest daily stats
SELECT * FROM daily_stats
WHERE project_id = 1
ORDER BY date DESC
LIMIT 7;

Validation Queries

Verify no double-counting (only meaningful while the hourly rows are still retained, i.e. the last 7 days):
-- Sum of daily logs should equal sum of hourly logs
SELECT
    (SELECT SUM(logs) FROM daily_stats WHERE project_id = 1 AND date >= '2025-11-01') as daily_total,
    (SELECT SUM(logs) FROM hourly_stats WHERE project_id = 1 AND timestamp >= '2025-11-01') as hourly_total;
Check for missing hours:
-- Should have 24 hourly records per day
SELECT DATE(timestamp), COUNT(*)
FROM hourly_stats
WHERE project_id = 1
GROUP BY DATE(timestamp)
HAVING COUNT(*) != 24;

Architecture Benefits

  • No double-counting: Activity counts are deltas at each level (hourly → daily → monthly)
  • Accurate storage: Storage is always a snapshot; we take MAX to show peak usage
  • Granular charts: Frontend can show hourly trends for the last 24 hours
  • Fast queries: Aggregations are pre-computed, no real-time summing
  • Billing accuracy: Monthly records are an immutable audit trail
  • Efficient storage: Hourly records are cleaned up after 7 days
  • Multi-tenant: All tables have organization_id for fast org-level aggregation