// diagnostic
AI Systems Diagnostic
A senior read on why your AI system is unreliable, and exactly what it takes to fix it.
For a team with a shipped AI feature that works in the demo and breaks in front of real users.
// what you get
The deliverables, in writing.
A written technical assessment
A plain-language read of the system as it stands: retrieval, prompts, evaluation, data flow, and the infrastructure underneath. No jargon wall.
Failure modes, ranked
The three to five ways the system fails most, ordered by impact, each traced to its actual cause rather than its symptom.
A costed remediation plan
A sequenced plan: what to fix first, what each fix takes in time and money, and what you can safely leave for later.
A go or no-go recommendation
An honest call on whether this is repairable in place or whether a rebuild is the cheaper path. Backed by the assessment, not a sales pitch.
A 60-minute walkthrough
A live call to walk your team through the findings and answer the questions the document raises.
// what it costs
$750 – $2,000
Scoped by system size and how many surfaces need review.
What the fee covers
- Read-only review of the repo and the running deployment
- The written assessment and ranked failure modes
- The costed, sequenced remediation plan
- The 60-minute walkthrough call
What it does not
- Implementation of the fixes (that is the Prototype Hardening Sprint)
- Net-new feature work
- Ongoing support or retainer
// how it runs
3-5 business days.
- Day 1
Kickoff. You hand over repo access and a sample of the inputs and queries that fail. We agree on the surfaces in scope.
- Days 2-4
The review. We read the code, trace the failure paths, and reproduce the worst cases against your data.
- Day 5
Delivery. You get the written assessment and the costed plan, then the walkthrough call.
// what you bring
Four things, and we can start.
- Read access to the repository
- A sample of the inputs or queries that fail today
- Sixty minutes with an engineer who knows the system
- One person who can make the repair-or-rebuild decision
// questions
Before you brief us.
Read-only is enough. We review the repo and observe the running deployment. We do not need write access or customer data to find the failure modes.
// past delivery
The work this is built on.
// what's real
You leave the week knowing exactly what is wrong, what it costs to fix, and whether to repair in place or rebuild.
Ready when you are.
Brief us · 24h response
// we reply within 24 hours