AI features

Roboticks runs Anthropic Claude on AWS Bedrock behind a small set of typed task definitions. Every AI surface in the product maps to one AITaskType with a fixed model, fixed input/output window, and a fixed token_cost — so plan gating is deterministic and the surface area is auditable.
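One way to picture a typed task definition is sketched below. Only AITaskType and token_cost are names taken from this page. The enum members, the TaskSpec class, its other field names, and every number are hypothetical placeholders, not the contents of the real backend.

```python
from dataclasses import dataclass
from enum import Enum


class AITaskType(Enum):
    # Member names are illustrative; the real enum lives in the backend.
    TEST_FAILURE_ANALYSIS = "test_failure_analysis"
    REQUIREMENT_QUALITY = "requirement_quality"


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical shape of one task definition."""
    model: str              # fixed per task, never chosen by the customer
    max_input_tokens: int   # fixed input window (placeholder value below)
    max_output_tokens: int  # fixed output window (placeholder value below)
    token_cost: int         # fixed cost used for plan gating (placeholder)


# Placeholder numbers, chosen only to make the sketch runnable.
TASK_SPECS = {
    AITaskType.TEST_FAILURE_ANALYSIS: TaskSpec("claude-sonnet-4-5", 20_000, 2_000, 60_000),
    AITaskType.REQUIREMENT_QUALITY: TaskSpec("claude-haiku-4-5", 4_000, 500, 10_000),
}


def spec_for(task: AITaskType) -> TaskSpec:
    """Look up the immutable spec for a task; unknown tasks raise KeyError."""
    return TASK_SPECS[task]
```

Freezing the dataclass mirrors the "fixed" wording above: a spec cannot be mutated at runtime, which is what would make gating decisions reproducible.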

The customer never picks the model: the backend picks Haiku, Sonnet, or Opus per task. Usage is billed against the org’s monthly ai_tokens grant, and requests stop at the plan ceiling unless the org has bought a top-up. See Pricing → AI tokens for the unit economics.

Where AI shows up

Surface	UI entry point	Doc page
Test failure triage	Run-detail page → AI Triage panel	Test debugging
Test-run analysis	Run-detail page → AI Analyze	Test debugging
Test flakiness	Test case page → Flakiness dialog	Test debugging
Sim-vs-real comparison	Run-detail page (sim run) → Compare Real	Test debugging
Inline log anomalies	Logs page → highlighted lines	Test debugging
Requirement quality	Requirement detail → Quality card	Requirements & traceability
Verification method	Requirement detail → Verification card	Requirements & traceability
Duplicate / contradiction	Requirement detail → Duplicates card	Requirements & traceability
Standards clause linkage	Requirement detail → Standards card	Requirements & traceability
Chat with requirements doc	Requirements upload page → chat dock	Requirements & traceability
Test suggestions for uncovered req	Traceability gaps → AI Suggest	Requirements & traceability
Gap explanation	Traceability matrix → uncovered row	Requirements & traceability
Evidence-pack narrative	Evidence pack detail → Generate narrative	Evidence & standards
Pre-audit Q&A	Evidence pack detail → Ask	Evidence & standards
Pack completeness gate	Evidence pack detail → Completeness	Evidence & standards
Standards coverage delta	Standard detail → Coverage AI	Evidence & standards
Standards clause summary	Standard clause → Plain English	Evidence & standards
Posture weekly digest	Posture dashboard → top banner	Evidence & standards
Natural-language search	Global search bar → “Ask” toggle	Search

Models and routing

Routing is fixed in app/core/ai_config.py and is not a tunable for customers. The mapping:

Tier	Model	Used for
Cheap / fast	`claude-haiku-4-5`	Single-requirement quality + verification-method + gap-explain, log summarisation, entity extraction, search query parse, standards clause summary
Default	`claude-sonnet-4-5`	Test-failure analysis, test-run analysis, conversation responses, flakiness, requirement generation, duplicate / standards-link, doc chat, evidence pack QA / completeness, standards coverage delta, posture digest
Heavy reasoning	`claude-opus-4-5`	Root-cause analysis, sim-vs-real comparison, evidence-pack narrative

Bedrock model IDs live alongside the routing table; cross-region inference profiles (us.anthropic...) are baked in.
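A fixed routing table of this kind could be sketched as follows. The model names come from the table above; the Tier enum, the TASK_TIER dictionary, and the task keys are hypothetical labels for a few of its rows, and the short model names stand in for the full Bedrock inference-profile IDs, which this page does not spell out.

```python
from enum import Enum


class Tier(Enum):
    CHEAP = "claude-haiku-4-5"
    DEFAULT = "claude-sonnet-4-5"
    HEAVY = "claude-opus-4-5"


# Task keys are hypothetical labels for a few rows of the routing table.
TASK_TIER = {
    "gap_explain": Tier.CHEAP,
    "search_query_parse": Tier.CHEAP,
    "test_failure_analysis": Tier.DEFAULT,
    "doc_chat": Tier.DEFAULT,
    "root_cause_analysis": Tier.HEAVY,
    "sim_vs_real": Tier.HEAVY,
}


def model_for(task: str) -> str:
    """Return the fixed model for a task; there is no per-customer override."""
    return TASK_TIER[task].value
```

In this sketch the lookup takes no customer or org argument, which is one simple way to guarantee that routing cannot vary per customer.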

How billing works

Each call is billed at the Bedrock-equivalent token count with a 2× margin, and output tokens are weighted by the model’s output/input price ratio (5 for Sonnet in this example). So a typical Sonnet request that consumes 5 000 input + 1 000 output tokens is charged roughly:

billable = round(2 × (5 000 + 5 × 1 000)) = 20 000 ai_tokens
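The same arithmetic as a small function, for readers who want to estimate other requests. The function name and parameter names are illustrative, not taken from the backend; the 2× margin and the ratio of 5 are the figures used in the worked example.

```python
def billable_tokens(input_tokens: int, output_tokens: int,
                    output_price_ratio: float, margin: float = 2.0) -> int:
    """Weight output tokens by the price ratio, then apply the margin."""
    weighted = input_tokens + output_price_ratio * output_tokens
    return round(margin * weighted)


# The Sonnet example: 5 000 input + 1 000 output tokens, ratio 5.
print(billable_tokens(5_000, 1_000, output_price_ratio=5))  # 20000
```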

The platform never charges in absolute dollars. It charges in ai_tokens, the unit the customer’s plan grants each month. Top-up packs add ai_tokens_prepaid, which never expire. For the worst-case ceiling per task (the figure the plan gate compares against), see max_billable_tokens() in ai_config.py. The Free plan’s 100 000-token grant covers Haiku-tier tasks but is below the worst-case requirement-generation gate, so the heavy AI features are implicitly Team+.
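A rough sketch of how a worst-case gate might behave. The real signature of max_billable_tokens() is not shown on this page, so the version below is a guess: it assumes the ceiling is the billing formula applied to a task's full input and output windows. The window sizes are invented for illustration, the ratio of 5 is borrowed from the Sonnet example, and gate_allows is a hypothetical helper.

```python
def max_billable_tokens(max_input_tokens: int, max_output_tokens: int,
                        output_price_ratio: float, margin: float = 2.0) -> int:
    """Assumed worst case: the task fills both of its fixed windows."""
    return round(margin * (max_input_tokens + output_price_ratio * max_output_tokens))


def gate_allows(available_tokens: int, worst_case: int) -> bool:
    """Hypothetical gate: run only if the worst case fits the balance."""
    return available_tokens >= worst_case


FREE_GRANT = 100_000  # Free plan monthly grant, from the plan table

# Invented windows: a small Haiku-tier task and a large generation task.
small_task = max_billable_tokens(8_000, 1_000, output_price_ratio=5)
large_task = max_billable_tokens(40_000, 8_000, output_price_ratio=5)

print(small_task, gate_allows(FREE_GRANT, small_task))  # 26000 True
print(large_task, gate_allows(FREE_GRANT, large_task))  # 160000 False
```

Gating on the worst case rather than the expected cost means a request that passes the gate cannot overdraw the balance, at the price of refusing some requests that would have fit.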

What the AI does NOT do

It never edits your code, your requirements, your tests, or your evidence packs. Every “AI assist” is a suggestion the engineer accepts or rejects.
It never reads your repos directly. AI prompts are built from data already in the platform (test results, requirements text, logs the runner shipped). The hosted MCP server can read more, but only when an LLM calls it through the platform’s billing-and-auth boundary.
It never sees a customer’s data outside the Bedrock invocation. Roboticks does not train models. Anthropic does not use Bedrock prompts for training under AWS’s contractual terms.
It is not a substitute for the certification auditor. AI assists draft and explain; the engineer signs.

Plan gating

Plan	Monthly `ai_tokens` grant	Effectively unlocks
Free	100 000	Haiku-tier surfaces: log summarisation, clause summaries, single-requirement quality / gap-explain
Team	5 000 000	Everything below worst-case requirement-generation; flakiness; sim-vs-real; evidence narratives at moderate cadence
Enterprise	∞	All surfaces, no rate cap beyond Bedrock-side limits

Customers can buy top-up packs at any tier; topped-up tokens are ai_tokens_prepaid and never reset.
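The interaction between the monthly grant and prepaid top-ups can be sketched as a small balance object. Only ai_tokens_prepaid and the grant sizes come from this page. The TokenBalance class, the order in which the two pools are drawn down (monthly first, then prepaid), and the ValueError on exhaustion are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TokenBalance:
    monthly_grant: int          # e.g. 100_000 on Free, 5_000_000 on Team
    monthly_used: int = 0
    ai_tokens_prepaid: int = 0  # top-up packs; never reset

    @property
    def available(self) -> int:
        return (self.monthly_grant - self.monthly_used) + self.ai_tokens_prepaid

    def charge(self, billable: int) -> None:
        """Draw from the monthly grant first, then prepaid (assumed order)."""
        if billable > self.available:
            raise ValueError("insufficient ai_tokens")
        from_monthly = min(billable, self.monthly_grant - self.monthly_used)
        self.monthly_used += from_monthly
        self.ai_tokens_prepaid -= billable - from_monthly

    def reset_month(self) -> None:
        """The monthly grant resets; prepaid tokens carry over untouched."""
        self.monthly_used = 0


balance = TokenBalance(monthly_grant=100_000, ai_tokens_prepaid=50_000)
balance.charge(120_000)           # uses the whole grant plus 20 000 prepaid
balance.reset_month()
print(balance.available)          # 130000
```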

Next

Test debugging

Triage, flakiness, sim-vs-real, log anomalies.

Requirements & traceability

INCOSE quality, verification, duplicates, gaps, doc chat.

Evidence & standards

Pack narrative, pre-audit Q&A, clause summaries.

Natural-language search

Ask questions; get answers grounded in the project’s data.
