A framework for verifiable agentic workflows. The AI reads documents, extracts structured data, proposes a plan, and waits for human sign-off — then gradually earns autonomy through measured performance.
Someone reads a document and types the important parts into a system. Every industry, every department, same story.
AI can read those documents — but it doesn't give you the same answer twice. That's not good enough when the output triggers real consequences.
A framework that treats AI like an apprentice — it works under supervision, proves competence through measurement, and gradually earns autonomy.
Read a 20-page policy doc, then fill out every field by hand — rules, points, thresholds, escalation levels, one by one.
1. INSERT INTO policies — name="HR-811-A"
2. INSERT INTO rules — tardy > 7min → 0.5pts
3. INSERT INTO rules — early_depart → 0.5pts
4. INSERT INTO rules — no_call → 2.0pts
5. INSERT INTO rules — unexcused → 1.0pts
6. INSERT INTO escalation_levels ×4
7. LINK policy → region US-East
```sql
INSERT INTO policies (name, org_id, region_id)
VALUES ('HR-811-A', 1, 1);
INSERT INTO rules (policy_id, name, points)
VALUES (1, 'tardy', 0.5);
-- 5 more statements...
```
Upload the same policy doc. AI extracts every rule, shows the plan, you rate each step, approve. Done.
The Shift
Traditional software is predictable. AI-powered software isn't. That demands a fundamentally different approach.
| | Traditional Software | AI-Powered Software |
|---|---|---|
| How it works | Same input = same output, every time | Same input can produce different results each time |
| Where effort goes | 80–90% building: writing logic, fixing bugs | 80–90% evaluating: checking outputs, tuning quality |
| When is it "done"? | When the bugs are fixed, you ship it | Never — monitoring and improvement are continuous |
| When things go wrong | Bugs are predictable and reproducible | Failures are subtle, inconsistent, and can drift |
| The human's role | Operates the system: fills forms, clicks buttons | Supervises the system: reviews output, approves or rejects |
Businesses need guarantees — precise compliance, exact calculations, auditable decisions. Apprentice delivers those guarantees using technology that, by its nature, doesn't offer them.
How It Works
Instead of a black box, the AI shows its work at every stage — and you rate, review, or reject before it moves on.
The AI reads your document and pulls out structured data. You skim the results and rate the extraction — Good, Partial, or Bad.
Before anything touches the database, the AI shows its plan in plain language. You say "yes" or "no, fix this."
Run in sandbox first to verify results. When satisfied, execute against production. Deterministic. Auditable.
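The full loop can be sketched in a few lines of Python. Everything here (the `Step` and `Plan` shapes, the `review` hook) is illustrative, not the shipped API:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    description: str          # plain-language action shown to the reviewer
    approved: bool = False    # set by the human's yes/no

@dataclass
class Plan:
    steps: list[Step] = field(default_factory=list)

def review(plan: Plan, decisions: list[bool]) -> bool:
    """Apply the human's per-step decisions; proceed only if every step passes."""
    for step, ok in zip(plan.steps, decisions):
        step.approved = ok
    return all(s.approved for s in plan.steps)

plan = Plan([
    Step("INSERT INTO policies (name) VALUES ('HR-811-A')"),
    Step("INSERT INTO rules (policy_id, name, points) VALUES (1, 'tardy', 0.5)"),
])
ready = review(plan, [True, True])  # True: run in sandbox, then production
```

One "no" on any step blocks the whole plan, which is the point: nothing reaches the database until a human has signed off on every proposed write.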
The Trajectory
Every approval and rejection trains the system. The human isn't removed from the process — they're promoted.
Phase 1
Review every output. Rate each step. The system builds a reliability baseline. The interface is your workspace.
1. INSERT INTO policies
2. INSERT INTO rules ×4
3. INSERT INTO escalation_levels ×4
Every step requires human review before the pipeline can proceed.
Phase 2
High-confidence results auto-approve. You focus on what the system flags as uncertain. The interface becomes your audit panel.
Low confidence on rule mapping — found 2 ambiguous escalation thresholds
Only the flagged step needs your attention. High-confidence steps flow through.
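Confidence-based routing like this can be sketched as a simple threshold check. The threshold value and step names are illustrative:

```python
# Hypothetical router: steps at or above the threshold auto-approve,
# anything below is flagged for human review.
AUTO_APPROVE_THRESHOLD = 0.90

def route(steps: list[tuple[str, float]]) -> dict[str, list[str]]:
    routed = {"auto_approved": [], "needs_review": []}
    for name, confidence in steps:
        if confidence >= AUTO_APPROVE_THRESHOLD:
            routed["auto_approved"].append(name)
        else:
            routed["needs_review"].append(name)
    return routed

result = route([
    ("insert_policy", 0.97),
    ("map_rules", 0.62),     # the ambiguous escalation thresholds
    ("link_region", 0.95),
])
```

Only `map_rules` lands in the review queue; the other two steps flow through without waiting on a human.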
Phase 3
The system processes documents overnight. You see a morning summary. Only genuine edge cases reach a human — unusual documents, ambiguous language, cross-jurisdictional conflicts. Everything routine is handled.
Overnight Processing Summary
Mar 6, 2026 · 2:00 AM – 6:00 AM
12 processed · 10 auto-approved · 2 flagged · 0 failed

Multi-State-Policy.pdf — Cross-jurisdictional conflict — CA vs. TX overtime rules
Contractor-Attendance.docx — Ambiguous — policy references external handbook
+ 8 more documents...
Morning summary. 10 policies handled overnight. Only 2 edge cases need you.
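The rollup behind a summary like this is a simple count over per-document statuses. The record shape and the generated filenames are illustrative; the totals and the two flagged documents mirror the mockup above:

```python
from collections import Counter

# Overnight results: 10 routine documents plus 2 genuine edge cases.
results = [{"doc": f"policy_{i}.pdf", "status": "auto_approved"} for i in range(10)]
results += [
    {"doc": "Multi-State-Policy.pdf", "status": "flagged",
     "reason": "Cross-jurisdictional conflict: CA vs. TX overtime rules"},
    {"doc": "Contractor-Attendance.docx", "status": "flagged",
     "reason": "Ambiguous: policy references external handbook"},
]

summary = Counter(r["status"] for r in results)
needs_human = [r["doc"] for r in results if r["status"] == "flagged"]
# 12 processed, 10 auto-approved, 2 waiting in the morning review queue
```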
First Apprentice
Workforce attendance compliance — complex HR policies, nested rules, real consequences. The first domain where Apprentice proves the Extract → Plan → Verify → Execute pattern works.
Upload HR policy docs (PDF, DOCX, or paste text). AI extracts rules, thresholds, and escalation levels.
Configurable rules with rolling windows. CSV attendance upload with automatic scoring against active policies.
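Rolling-window scoring reduces to a date-filtered sum. The 90-day window and function names are assumptions; the point values come from the example rules above:

```python
from datetime import date, timedelta

# Point values per violation type (from the HR-811-A example).
POINTS = {"tardy": 0.5, "early_depart": 0.5, "no_call": 2.0, "unexcused": 1.0}
WINDOW_DAYS = 90  # rolling window: only recent violations count

def score(events: list[tuple[date, str]], today: date) -> float:
    """Sum points for violations that fall inside the rolling window."""
    cutoff = today - timedelta(days=WINDOW_DAYS)
    return sum(POINTS[kind] for day, kind in events if day >= cutoff)

events = [
    (date(2026, 1, 5), "tardy"),
    (date(2026, 2, 20), "no_call"),
    (date(2025, 10, 1), "unexcused"),  # outside the window, ignored
]
total = score(events, today=date(2026, 3, 6))  # 0.5 + 2.0 = 2.5
```

Old violations age out automatically as the window slides forward, which is what lets a clean stretch reset an employee's standing.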
Regions (with labor law context) → Organizations → Policies → Rules → Employees.
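That hierarchy maps naturally onto nested records. Field names and the sample organization are illustrative, not the real schema:

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    name: str
    points: float

@dataclass
class Policy:
    name: str
    rules: list[Rule] = field(default_factory=list)

@dataclass
class Organization:
    name: str
    policies: list[Policy] = field(default_factory=list)
    employees: list[str] = field(default_factory=list)

@dataclass
class Region:
    name: str
    labor_law_context: str  # e.g. state overtime rules that scope the policies
    organizations: list[Organization] = field(default_factory=list)

# Hypothetical sample data following the HR-811-A example.
us_east = Region("US-East", "FLSA plus state overtime rules", [
    Organization("Acme Corp", [Policy("HR-811-A", [Rule("tardy", 0.5)])]),
])
```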
Every point change — violation or manual override — is recorded. Dashboard with KPIs, trends, and risk grid.
Human feedback on every step. Token usage, cost, and response time tracking per model per stage.
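Per-stage tracking amounts to one record per model call, pairing the telemetry with the human's rating. Field names and the sample numbers are illustrative:

```python
from dataclasses import dataclass

@dataclass
class StageMetrics:
    stage: str            # "extract", "plan", or "execute"
    model: str
    prompt_tokens: int
    completion_tokens: int
    cost_usd: float
    latency_ms: int
    human_rating: str     # "good", "partial", or "bad"

# One hypothetical extraction call and its review outcome.
m = StageMetrics("extract", "model-x", 4200, 850, 0.012, 3400, "good")
total_tokens = m.prompt_tokens + m.completion_tokens  # 5050
```

Aggregating these records per model per stage is what turns "the AI seems fine" into a measured reliability baseline.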
Safe sandbox with sample policies and seed data. Test the full pipeline without touching production.
Beyond Fair Play
Apprentice is domain-agnostic: change the document type, the extraction schema, and the reviewer — the framework stays the same. Fair Play is the first apprentice. These are the next ones.
HR Attendance (Fair Play)
Doc: HR attendance policies
Extract: Rules, point values, thresholds
Plan: "Create policy, add 5 rules, link to region"
Result: Policy & rules in compliance database

Insurance Claims
Doc: Claims, medical records
Extract: Diagnoses, procedures, coverage terms
Plan: "Approve claim, apply deductible, flag for review"
Result: Adjudication decision and payout

Vendor & Procurement
Doc: Vendor contracts, RFPs
Extract: Terms, SLAs, pricing, penalties
Plan: "Onboard vendor, set payment terms, flag SLA risk"
Result: Contract records and payment schedule

Regulatory Compliance
Doc: Legal filings, compliance docs
Extract: Requirements, deadlines, obligations
Plan: "Map to reporting schedule, flag gaps"
Result: Filing submissions and audit trail

Patient Intake
Doc: Patient forms, referrals
Extract: Demographics, conditions, medications
Plan: "Assign provider, schedule intake, flag allergies"
Result: EHR records and care plan
Same Architecture
Extract → Plan → Verify → Execute
Different document. Different domain. Same bones.
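One way to picture that: the pipeline takes an extraction schema and a reviewer as configuration, so a new domain is a new `Domain` value, not new code. All names here are illustrative:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Domain:
    name: str
    schema: dict[str, type]           # fields the extractor must produce
    reviewer: Callable[[dict], bool]  # human (or auto-approve) hook

def run(domain: Domain, extracted: dict) -> str:
    """Validate extracted data against the schema, then gate on the reviewer."""
    missing = [f for f in domain.schema if f not in extracted]
    if missing:
        return f"rejected: missing {missing}"
    return "executed" if domain.reviewer(extracted) else "rejected by reviewer"

# Hypothetical attendance domain: same run() would serve claims or contracts
# with a different schema and reviewer.
hr = Domain("attendance", {"rules": list, "thresholds": dict},
            reviewer=lambda data: len(data["rules"]) > 0)
status = run(hr, {"rules": [{"tardy": 0.5}], "thresholds": {}})  # "executed"
```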
The Fair Play Initiative — upload an HR policy document in the playground, watch the AI extract rules and build an ingestion plan, then approve or reject.
Launch the Dashboard