AskDoc

Ask anything, get cited answers right on the source page.

AskDoc - Landing

Purpose

Document Q&A tools tend to hand you a plausible-sounding answer and trust you to take it at face value. That's fine for casual reading and dangerous for anything that matters: contracts, research papers, policy documents, technical specs.

AskDoc is built around a single principle: every answer should point you to exactly where it came from. Upload a PDF or DOCX, ask a question, and the answer arrives with inline citations that highlight the supporting passages directly on the source page. No alt-tab verification, no trust-me vibes.

Cited by default

Every answer links back to the exact span in the source document.

PDF & DOCX

Handles both formats through one pipeline: DOCX files are converted to PDF with headless LibreOffice at upload time.

Multi-model

Swap between Anthropic, OpenAI, and Google models via Pydantic AI.

AskDoc - Chat with citations

How citation highlighting works

The usual approach is to re-extract PDF text in the browser via PDF.js and fuzzy-match the citation against that. It's fragile: backend and browser extractors disagree, scanned PDFs have no text layer, multi-column and rotated text break coordinate math, and DOCX has no page coordinates at all.

AskDoc builds the span index server-side with pdfplumber: every text span on every page is stored as (page, x0, y0, x1, y1, text). The frontend loads the index once, fuzzy-matches the LLM citation in sub-millisecond time, then overlays bounding boxes on the rendered page using the react-pdf viewport. DOCX goes through LibreOffice → PDF at upload time so both formats share the exact same highlighting path.
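The index-then-match idea can be sketched in a few lines. This is a minimal illustration, not AskDoc's actual code: it is written in Python for brevity even though AskDoc does the matching in the browser, the `Span`, `build_index`, and `find_citation` names are hypothetical, the input is assumed to be word dicts shaped like pdfplumber's `page.extract_words()` output, and `difflib` stands in for whatever fuzzy matcher the frontend really uses.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Span:
    """One text span with its bounding box, in PDF points (hypothetical record)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def build_index(pages):
    """Flatten per-page word dicts into a list of Span records.

    `pages` is a list (one entry per page) of word-dict lists with the keys
    text, x0, x1, top, bottom, as pdfplumber's extract_words() produces.
    """
    index = []
    for page_no, words in enumerate(pages, start=1):
        for w in words:
            index.append(
                Span(page_no, w["x0"], w["top"], w["x1"], w["bottom"], w["text"])
            )
    return index


def normalize(s):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(s.lower().split())


def find_citation(index, quote, min_ratio=0.8):
    """Return the run of spans that best matches `quote`, or [] if none is close.

    Slides a window as many spans wide as the quote has words and scores each
    window with difflib's similarity ratio. The 0.8 threshold is an assumption.
    """
    target = normalize(quote)
    n = len(target.split())
    if n == 0:
        return []
    best, best_ratio = [], 0.0
    for i in range(len(index) - n + 1):
        window = index[i:i + n]
        if window[0].page != window[-1].page:
            continue  # a highlight should not straddle two pages
        candidate = normalize(" ".join(s.text for s in window))
        ratio = SequenceMatcher(None, candidate, target).ratio()
        if ratio > best_ratio:
            best, best_ratio = window, ratio
    return best if best_ratio >= min_ratio else []


# Usage: one page of seven words, and a citation containing a typo.
words = "The term is five years unless renewed".split()
page = [
    {"text": t, "x0": 10.0 + 40 * i, "x1": 45.0 + 40 * i, "top": 100.0, "bottom": 112.0}
    for i, t in enumerate(words)
]
index = build_index([page])
hit = find_citation(index, "term is fve years")
```

Because each matched span carries its own box, the result of a match is already a list of rectangles to draw; no text needs to be re-extracted at highlight time.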

AskDoc - Chat with citation overlays

Tech Stack

FastAPI, Pydantic AI, Anthropic, OpenAI, pdfplumber, Next.js, React 19, react-pdf, PDF.js, mark.js, Tailwind CSS, Railway, Vercel

Backend

FastAPI serves the API. Pydantic AI wraps Anthropic, OpenAI, and Google so the Q&A step is model-agnostic. pdfplumber (built on pdfminer.six) extracts character-accurate spans with bounding boxes; python-docx and LibreOffice handle DOCX ingestion. The backend is deployed on Railway.
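The DOCX-to-PDF step might look roughly like the sketch below. It assumes a `soffice` binary is on the PATH and uses LibreOffice's standard `--headless --convert-to pdf --outdir` flags; the function names, the timeout, and the error handling are illustrative choices, not taken from AskDoc.

```python
import subprocess
from pathlib import Path


def soffice_command(docx_path, out_dir, binary="soffice"):
    """Build the LibreOffice command line for a headless DOCX -> PDF conversion."""
    return [
        binary,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_docx_to_pdf(docx_path, out_dir, timeout=120):
    """Run the conversion and return the path of the PDF LibreOffice wrote.

    Assumes `soffice` is installed; the 120 s timeout is an arbitrary guess.
    """
    docx_path, out_dir = Path(docx_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(soffice_command(docx_path, out_dir), check=True, timeout=timeout)
    # LibreOffice names the output after the input file's stem.
    pdf_path = out_dir / (docx_path.stem + ".pdf")
    if not pdf_path.exists():
        raise RuntimeError(f"LibreOffice produced no PDF for {docx_path}")
    return pdf_path


cmd = soffice_command("uploads/contract.docx", "converted")
```

Converting once at upload time means everything downstream only ever sees a PDF, which is what lets both formats share a single span index and highlighting path.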

Frontend

Next.js 16 with React 19. react-pdf renders PDF pages and exposes the viewport geometry used to project span coordinates into screen pixels. mark.js handles DOCX fallback paths. The UI is built on Base UI and Tailwind CSS. Deployed on Vercel.
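The projection itself is simple arithmetic. The sketch below shows it in Python for illustration only (AskDoc does this in the browser using the react-pdf viewport); it assumes span boxes are in PDF points with a top-left origin, as pdfplumber's `top`/`bottom` values are, and that the page is not rotated. The function name and the returned keys are hypothetical.

```python
def to_screen_box(span_box, page_width_pt, rendered_width_px):
    """Scale an (x0, y0, x1, y1) box from PDF points to on-screen pixels.

    Assumes a top-left origin and an unrotated page, so a single uniform
    scale factor is enough.
    """
    scale = rendered_width_px / page_width_pt
    x0, y0, x1, y1 = span_box
    return {
        "left": x0 * scale,
        "top": y0 * scale,
        "width": (x1 - x0) * scale,
        "height": (y1 - y0) * scale,
    }


# A US Letter page (612 pt wide) rendered 1224 px wide gives a scale of 2.
overlay = to_screen_box((72.0, 100.0, 144.0, 112.0), 612.0, 1224.0)
```

The resulting values map directly onto an absolutely positioned overlay element inside the page container. Rotated pages would need a full viewport transform rather than a single scale factor.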

Built to be trusted. A cited answer is the only kind that matters.