Clinical judgement, sharpened by AI.
DOQSynth turns a bundle of medical records into a structured patient story, with every fact traced back to the page it came from. It is built to sharpen clinical judgement, not to replace it.
The problem
A medico-legal instruction rarely arrives as a coherent story. It arrives as a bundle of records accumulated over decades, and most of the work lies in piecing together the clinical chronology before an opinion can be formed.
The facts that change a decision are the ones buried deepest: a penicillin allergy noted in 2009, a statin quietly discontinued, an earlier admission for chest pain. Plain keyword search misses them. Plain vector search invents links that are not there. Neither is safe enough to trust on its own.
"Has this patient ever taken penicillin?" is a question a chart should be able to answer in seconds, with the page to prove it. Until now it has taken an afternoon, and the answer has been only as good as the reader's stamina.
What it does
Scanned or text-layer documents are read page by page. A confidence score is recorded for every page, with a vision model as a fallback for the poorest scans. Uploads are accepted up to several gigabytes.
Medications, diagnoses, lab results, procedures, allergies, vitals, encounters and immunisations are pulled into proper database tables. UK drug naming is kept as written, not Americanised.
Every fact is dated and placed on a single timeline, filterable by event type. The chronology is derived from the extracted facts, so each entry is verifiable rather than invented.
Ask for a cardiac history, a timeline of admissions, or whether a drug was ever prescribed. Every answer arrives with the source document and page attached.
In practice
Ask a question in plain language; the answer comes back cited to the exact document and page, so verification is a single click.
8 clinical fact types extracted into typed tables
3-tier reading: text layer, then page image, then vision model
100% of answers carry a source page citation
0 patient details sent in outbound email
How it works
On its own, plain vector retrieval is unsafe for medical questions about whether something ever happened, because a single long-tail mention can fall outside the top results. DOQSynth reads, structures and verifies in stages, so the answer rests on evidence rather than similarity.
Each page is read through its native text layer first. Pages with too little text are read from the page image, and the poorest scans go to a vision model. The source of every page is recorded.
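The tiered reading described above can be sketched roughly as follows. This is an illustration only, not DOQSynth's code: the names `read_page` and `PageRead`, the two thresholds, and the choice to carry the low image-reading score forward on vision-model pages are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical thresholds: the text says only "too little text" and
# "the poorest scans", so these numbers are placeholders.
MIN_TEXT_CHARS = 50
MIN_IMAGE_CONFIDENCE = 0.6


@dataclass
class PageRead:
    text: str
    source: str        # "text_layer", "page_image" or "vision_model"
    confidence: float


def read_page(
    text_layer: str,
    read_image: Callable[[], tuple[str, float]],
    read_with_vision_model: Callable[[], str],
) -> PageRead:
    """Try the cheapest reader first and record which one produced the text."""
    stripped = text_layer.strip()
    if len(stripped) >= MIN_TEXT_CHARS:
        return PageRead(stripped, "text_layer", 1.0)

    # Too little native text: fall back to reading the rendered page image.
    image_text, image_conf = read_image()
    if image_conf >= MIN_IMAGE_CONFIDENCE:
        return PageRead(image_text, "page_image", image_conf)

    # Poorest scans go to the vision model. Keeping the low image score
    # (an assumption) leaves the page flagged for a human spot check.
    return PageRead(read_with_vision_model(), "vision_model", image_conf)
```

The readers are passed in as callables so the more expensive tiers run only when the cheaper one falls short.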
Page text is chunked and embedded for search, then read by a language model that returns typed clinical facts as strict records. Each fact keeps the exact quote, page and chunk it came from, plus a confidence score.
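As a rough illustration of what a "strict record" might look like, the sketch below validates one model-returned fact before it is stored. The field names, the `parse_fact` helper and the check that the quote appears verbatim in the page text are assumptions for the example; the actual schema is not published here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClinicalFact:
    fact_type: str      # e.g. "medication", "allergy" (hypothetical labels)
    name: str
    quote: str          # exact source quote
    page: int
    chunk_id: int
    confidence: float


def parse_fact(raw: dict, page_text: str) -> ClinicalFact:
    """Build a typed fact from raw model output, rejecting anything malformed."""
    fact = ClinicalFact(
        fact_type=str(raw["fact_type"]),
        name=str(raw["name"]),
        quote=str(raw["quote"]),
        page=int(raw["page"]),
        chunk_id=int(raw["chunk_id"]),
        confidence=float(raw["confidence"]),
    )
    if not 0.0 <= fact.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Assumed safeguard: a quote that is not on the page cannot be cited.
    if fact.quote not in page_text:
        raise ValueError("quote not found verbatim in page text")
    return fact
```

A missing field raises `KeyError` and an unsupported quote raises `ValueError`, so a bad record fails loudly and never becomes a silent gap.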
Each question is classified. Factual questions query the typed facts directly, widened by a clinical ontology so that penicillin reaches amoxicillin, flucloxacillin and co-amoxiclav. Narrative questions use hybrid keyword and vector search with reranking. Time questions read the chronology.
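The routing and ontology expansion might look something like the sketch below. It is a toy: the cue-word classifier stands in for whatever classifier the product really uses, and `DRUG_CLASSES` holds only the penicillin example from the text above, not a real clinical ontology. All names are hypothetical.

```python
import re

# Only the example given in the text; a real ontology would be far larger.
DRUG_CLASSES = {
    "penicillin": {"amoxicillin", "flucloxacillin", "co-amoxiclav"},
}

# Hypothetical cue words for the toy classifier.
TIME_CUES = {"when", "timeline", "before", "after", "chronology"}
FACT_CUES = {"ever", "prescribed", "allergy", "allergic", "taken"}


def classify(question: str) -> str:
    """Return "time", "factual" or "narrative" from whole-word cues."""
    words = set(re.findall(r"[a-z\-]+", question.lower()))
    if words & TIME_CUES:
        return "time"
    if words & FACT_CUES:
        return "factual"
    return "narrative"


def expand_term(term: str) -> set[str]:
    """Widen a drug class to its members before querying the typed facts."""
    key = term.lower()
    return {key} | DRUG_CLASSES.get(key, set())


def find_facts(term: str, facts: list[dict]) -> list[dict]:
    """Match typed facts by name against the expanded term set."""
    names = expand_term(term)
    return [f for f in facts if f["name"].lower() in names]
```

With this expansion, a question about penicillin also matches a stored fact named "Amoxicillin", which plain string matching on the word "penicillin" would miss.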
Before an answer is shown, a verifier removes mentions that are negated or denied, and mentions that belong to a different person, such as a relative in a family history. What remains is returned with the source document, page and exact quoted text.
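A deliberately simple sketch of the filtering step is shown here. Cue-phrase matching stands in for the real verifier, whose method is not described; the cue lists and the `verify` function are assumptions for illustration and would be far too crude for clinical use.

```python
import re

# Hypothetical cue phrases; a production verifier would need much more.
NEGATION_CUES = ("no known", "denies", "not taking", "never taken", "no history of")
OTHER_PERSON_CUES = ("mother", "father", "brother", "sister", "family history")


def _has_cue(quote: str, cues: tuple[str, ...]) -> bool:
    """True if any cue phrase appears as whole words in the quote."""
    text = quote.lower()
    return any(re.search(r"\b" + re.escape(cue) + r"\b", text) for cue in cues)


def verify(mentions: list[dict]) -> list[dict]:
    """Keep only mentions that are neither negated nor about someone else."""
    return [
        m for m in mentions
        if not _has_cue(m["quote"], NEGATION_CUES)
        and not _has_cue(m["quote"], OTHER_PERSON_CUES)
    ]
```

Each surviving mention still carries its quote and page, so the answer can cite exactly what was kept.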
Clinical safety by design
Each known failure mode of a naive system has a specific countermeasure in the pipeline. The clinician sees evidence; the clinician decides.
| The risk | What DOQSynth does |
|---|---|
| Top-K retrieval misses a long-tail mention in a 1000-page chart | Typed-fact SQL runs first, with keyword and vector search as a safety net underneath it. |
| Embeddings miss the link between a generic, a brand and a drug class | A UK clinical ontology expands synonyms and classes before the search runs. |
| A negation reads the same as a confirmation to a model | A verifier filters out negations and denials before any answer is shown. |
| A family-history mention is attributed to the wrong patient | The same verifier removes mentions that belong to another person. |
| An extraction fails quietly and a fact goes missing | Every fact carries an explicit confidence field, batch validity is tracked, and parsing retries on a smaller batch. |
| A model states a fact that is not in the source | Every fact stores its verbatim source quote, page and chunk, so a spot check is one click. |
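The "retry on a smaller batch" countermeasure in the table can be sketched as a recursive split. This is one plausible reading, not the product's implementation: `extract_with_retry` and `call_model` are hypothetical names, and halving the batch is an assumed strategy.

```python
import json
from typing import Callable


def extract_with_retry(
    chunks: list[str],
    call_model: Callable[[list[str]], str],
) -> tuple[list[dict], list[str]]:
    """Return (facts, failed_chunks), halving the batch when parsing fails."""
    try:
        parsed = json.loads(call_model(chunks))
        if not isinstance(parsed, list):
            raise ValueError("expected a JSON array of facts")
        return parsed, []
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        if len(chunks) <= 1:
            # Cannot split further: report the chunk instead of dropping it.
            return [], list(chunks)
        mid = len(chunks) // 2
        left_facts, left_failed = extract_with_retry(chunks[:mid], call_model)
        right_facts, right_failed = extract_with_retry(chunks[mid:], call_model)
        return left_facts + right_facts, left_failed + right_failed
```

Returning the failed chunks explicitly is what makes batch validity trackable: a chunk that could not be parsed is recorded, not lost.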
Information governance
NHS number, MRN, date of birth, name, address and contact details are encrypted in the database. Uploaded PDFs are encrypted on disk with envelope encryption, and page images are deleted once reading is complete.
The language model endpoint is configurable and can run against self-hosted models on hardware you control, so records are read by models you choose rather than a public AI provider.
Signup is invite-only with an admin role for organisation management. Optional SMS two-factor protects sign-in, and single sign-on is supported through a signed token.
Each clinician acknowledges a written AI-safety statement before first use, timestamped on their record. Every chat exchange is kept per user and per patient. No patient detail ever leaves the system by email.
Under the hood
| Layer | What it provides |
|---|---|
| Runtime | A fault-tolerant, highly concurrent runtime engineered for steady throughput under heavy document loads |
| Data | A relational store combining vector and keyword search for fast, precise retrieval across large records |
| Processing | Resumable, checkpointed background processing across reading, extraction and chronology, so long jobs recover cleanly |
| Language models | Pluggable language-model endpoints with load balancing and per-host concurrency control |
| Reading | Layered document reading, from native text to page image to a vision model, chosen per page by confidence |
| Interface | A responsive single-page interface with an embedded document viewer, live progress and an interactive timeline |
| Billing | Predictable per-page billing with signed, verified payment events |
Deployment and pricing
DOQSynth by Alldoq
DOQSynth surfaces the evidence and points at the page it came from. The judgement stays with the clinician, which is exactly where it belongs.
Get in touch