Deepfield: a modular assessment platform
An eleven-stage strategic assessment pipeline that turns a query into a defensible course of action with full evidence lineage.
We build drafters that read source materials, fill a typed template schema, and return each field as a draft with a confidence score and a citation, a flagged low-confidence guess, or an explicit gap. Reviewers accept, reject, or rewrite one field at a time. The system drafts. The human decides.
The same pattern recurs wherever a structured document has to be drafted from a stack of source materials. Research protocols. Grant applications. Compliance filings. Underwriting memos. RFP responses. People spend hours pulling information out of source documents to populate template fields. The process is slow. It's error-prone. Different drafters interpret the same source material differently and end up with different documents for the same underlying facts. Whoever reviews those documents has to chase inconsistencies that came from the input stage, not the substance.
Our client, a clinical research alliance preparing human subjects research protocols for IRB review, wanted a first draft a researcher could start from, not a finished document a researcher had to fight with. The underlying pattern generalizes well past IRBs.
The deliverable is a template auto-population API. The caller uploads a set of context documents and selects a template schema. The system parses the documents, runs LLM extraction with constrained structured outputs, and returns a draft with each field populated from the source material. Every field comes back as one of three things: a draft with a confidence score and a source citation, a flagged low-confidence guess, or an explicit gap.
The extraction uses constrained decoding into a typed Pydantic schema, so every field conforms to a declared contract. The model can't invent a field shape. When information is missing, the schema forces an explicit gap rather than letting the model paper over the hole with fluent prose.
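As a concrete illustration, here is a minimal sketch of what such a contract can look like in Pydantic. The class and field names are hypothetical; the real schemas are built per template library.

```python
from typing import Literal, Optional, Union

from pydantic import BaseModel


class SourceCitation(BaseModel):
    """Where in the uploaded sources a drafted value came from."""
    document: str
    page: int
    paragraph: int
    passage: str  # the quoted span the draft was derived from


class DraftedField(BaseModel):
    status: Literal["drafted"]
    value: str
    confidence: float  # 0.0 to 1.0, scored against the engagement's rubric
    citation: SourceCitation


class FlaggedGuess(BaseModel):
    status: Literal["flagged"]  # low-confidence guess, surfaced for extra scrutiny
    value: str
    confidence: float
    citation: Optional[SourceCitation]


class Gap(BaseModel):
    status: Literal["gap"]  # not enough source material to draft this field
    reason: str


# Every template field resolves to exactly one of the three outcomes.
FieldResult = Union[DraftedField, FlaggedGuess, Gap]


class ProtocolDraft(BaseModel):
    """One entry per template field; the model cannot return anything outside this shape."""
    inclusion_criteria: FieldResult
    exclusion_criteria: FieldResult
    recruitment_procedure: FieldResult
    data_and_safety_monitoring: FieldResult
```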
Three things carry the defensibility load.
First, structured outputs with constrained decoding. Every extracted field is typed. Every field is a draft with a confidence score and a source passage, a flagged low-confidence guess, or an explicit gap. There's no unstructured middle ground where the model hides an unsupported claim.
Second, source attribution on every field. When a reviewer reads a drafted "inclusion criteria" field, they see the specific page and paragraph in the source documents that the draft came from. Accepting the draft is a judgment call with all the evidence in front of them. Rejecting it is a one-click action, not an investigation.
Third, explicit gap surfacing. The system is designed so that "I don't have enough information to draft this field" is a normal output, not a failure. A reviewer would much rather see ten fields drafted and five marked as gaps than see fifteen fields drafted with uniform confidence, because the second case hides which fields they need to scrutinize.
The point of all three is that the human reviewer stays in charge. The system drafts. The human decides. A first draft a reviewer can trust beats a finished draft they can't.
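To show how the pieces fit, here is a hedged sketch of the extraction call, reusing the ProtocolDraft schema sketched above and assuming an OpenAI-style structured-outputs client; any backend that can constrain decoding to a JSON schema fills the same role. The model name, prompt wording, and placeholder input are illustrative, not the production configuration.

```python
from openai import OpenAI  # assumption: OpenAI Python SDK with structured-outputs support

client = OpenAI()


def draft_protocol(source_text: str) -> ProtocolDraft:
    """Draft every template field from the parsed sources, under the ProtocolDraft contract."""
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",  # illustrative; any model with structured-outputs support
        messages=[
            {
                "role": "system",
                "content": (
                    "Draft each template field only from the provided sources. "
                    "Cite the document, page, and paragraph each draft came from. "
                    "If the sources do not support a field, return a gap, not a guess."
                ),
            },
            {"role": "user", "content": source_text},
        ],
        response_format=ProtocolDraft,  # constrained decoding into the typed schema
    )
    return completion.choices[0].message.parsed


# Hypothetical usage: the string stands in for the parsed text of the uploaded documents.
draft = draft_protocol("...parsed text of the uploaded source documents...")

for name in ProtocolDraft.model_fields:
    field = getattr(draft, name)
    if field.status == "gap":
        print(f"{name}: GAP ({field.reason})")
    else:
        print(f"{name}: {field.value!r}  confidence={field.confidence:.2f}  source={field.citation}")
```

The loop at the end is only a stand-in for the review surface: each field is accepted, rejected, or rewritten by a human before anything lands in the protocol.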
It replaces manual extraction and reformatting, a process that took a researcher hours per document and produced inconsistent outputs across drafters and projects.
An engagement runs 8 to 12 weeks to build a drafter for a new template library. We need the templates, a reference set of completed documents for schema design, and access to typical source document shapes. You get the deployed API, the template schemas, the extraction prompts tuned for the domain, and the confidence-scoring rubric.
It's a fit for IRBs, grant offices, compliance teams, regulatory affairs, underwriting, audit, due diligence, and any setting where a structured document has to be drafted from a set of source materials and a human reviewer needs to see exactly where every piece of the draft came from.
We've written a two-page business case for this engagement shape. Executive summary, problem statement, deliverables, risks, success metrics, investment range. Read it in the browser or print it to PDF and forward it.
Read the business case
Per-program analytical sites where every quantitative claim is backed by a reproducible query and a confidence level.
A knowledge-graph research console that opens Andrew Marshall's Office of Net Assessment tradition to a new generation of strategists.
Tell us about the decision you're trying to improve. We'll schedule a briefing with our principals to understand your environment and explore a potential fit.
Schedule a Briefing