AI for evidence & standards

When the work shifts from “the test failed” to “the auditor needs a story”, Roboticks layers six AI assists across evidence packs, standards, and the posture dashboard. The goal is the same as everywhere else in the product: keep the engineer in control, drop the manual prose tax.

Evidence-pack narrative

Surface: Evidence pack detail → “Generate narrative” action. For any sealed evidence pack, Opus synthesises an executive-summary / audit narrative the certification auditor reads first. It pulls together:

The pack’s requirement coverage matrix.
The standards subscriptions in effect at seal time.
The hash-chain head + verification status.
The release tag, branch, and commit lineage.
The set of failing or skipped tests and their excuses (deferred, accepted risk, etc.).

The output is a 2-3 page Markdown narrative the engineer can paste into the audit cover letter or hand back into the pack customisation workflow. The narrative is stored on the pack so re-generating costs nothing. Task type: EVIDENCE_PACK_NARRATIVE — Opus, 60k input budget, 24 token-cost.

Pre-audit Q&A

Surface: Evidence pack detail → Ask box. Free-form questions grounded in this specific pack’s contents. Useful before a customer call (“what’s our worst uncovered clause in IEC 61508 right now?”) or to prep an audit (“how did we close the gap on REQ-014 between v2.3.0 and v2.4.0?”). Each turn is a Sonnet call with the pack manifest in the context. The conversation persists per pack. Task type: EVIDENCE_PACK_QA — Sonnet, 40k input budget, 10 token-cost per turn.

Pack completeness gate

Surface: Evidence pack detail → Completeness card. Given a sealed pack + a target standard the org subscribes to, Sonnet calls out which clauses of the standard are not yet satisfied by the pack. Output is a table of (clause, gap, suggested action). Used in three ways:

As a pre-seal advisory: run completeness against the standard, fix gaps, then seal.
As a regression check: re-run on an already-sealed pack to know whether the next release needs to close the gap.
As an audit-prep pass: hand the auditor the report alongside the pack and beat them to the obvious questions.

Task type: EVIDENCE_PACK_COMPLETENESS — Sonnet, 50k input budget, 12 token-cost.

Standards coverage delta

Surface: Standard detail → Coverage AI tab. Across all of the project’s requirements, which clauses of this standard currently have zero linked requirements? The prompt embeds a slice of the standard’s clauses (typically 30-80 depending on size) and asks Sonnet to enumerate the gaps with one-line interpretation per clause. Output renders as a coverage delta table with a “draft requirements for these clauses” action that hands off to the AI requirement generator. Task type: STANDARDS_COVERAGE_DELTA — Sonnet, 20k input budget, 8 token-cost.

Plain-English clause summary

Surface: Standards detail → individual clause → hovers, expanded cards. Standards prose is dense by design. The clause summary is a Haiku-tier rephrase (“In plain English: …”) that sits inline next to the official text. It is not legal advice — see the standards disclaimer — but it makes triage of “which clauses do I even need to think about” possible at the engineer level. Task type: STANDARDS_CLAUSE_SUMMARY — Haiku, 2 token-cost. Cached per clause version so a project full of users opening the same clause pays once.

Posture weekly digest

Surface: Posture dashboard → top banner. A Sonnet call run weekly that summarises:

This week’s coverage movement (requirements newly covered or newly uncovered).
Gap-trend slope across subscribed standards.
Test flakiness movement.
Significant evidence-pack events (new seals, archival).

Customers can ask a follow-up via the banner’s chat link, which spawns a regular conversation against the digest. Task type: POSTURE_WEEKLY_DIGEST — Sonnet, 15k input budget, 8 token-cost. Scheduled by the platform — there is no user-facing trigger button beyond “regenerate now”.

What the AI here does NOT do

It does not sign anything. Every narrative, completeness report, or coverage delta is a draft; the engineer reviews and approves before the pack seals.
It does not edit standards prose. The summaries are always shown next to the official text, never replacing it.
It does not see content beyond the pack manifest + the standard’s clause catalogue + the conversation thread. Source code, MCAPs, and per-test logs are not in the AI context.

Plan gating

All six surfaces require non-zero ai_tokens_balance. Practical reality:

Free plan’s 100 000 grant covers a handful of clause summaries and a single pack completeness check.
Team (5 000 000 grant) covers the typical weekly cadence: one narrative, several QA conversations, completeness on every seal, the weekly digest.
Enterprise (∞) has no rate cap beyond Bedrock-side limits.

Evidence packs (no-AI)

The shape of the artifact AI works on top of.

Standards coverage

Subscribing to standards, pinning versions.

Requirements & traceability AI

The upstream surfaces that fill the evidence pack.

AI for Evidence & Standards

AI for evidence & standards

Evidence-pack narrative

Pre-audit Q&A

Pack completeness gate

Standards coverage delta

Plain-English clause summary

Posture weekly digest

What the AI here does NOT do

Plan gating

Next

Evidence packs (no-AI)

Standards coverage

Requirements & traceability AI

​AI for evidence & standards

​Evidence-pack narrative

​Pre-audit Q&A

​Pack completeness gate

​Standards coverage delta

​Plain-English clause summary

​Posture weekly digest

​What the AI here does NOT do

​Plan gating

​Next

Evidence packs (no-AI)

Standards coverage

Requirements & traceability AI

AI for evidence & standards

Evidence-pack narrative

Pre-audit Q&A

Pack completeness gate

Standards coverage delta

Plain-English clause summary

Posture weekly digest

What the AI here does NOT do

Plan gating

Next