Core Architecture

How the system is structured internally — from how your files are stored to how PDFs stay pixel-perfect through the extraction pipeline.

6 min read·Technical

Zero-loss PDF extraction

Most AI PDF tools convert your file into plain text before analysing it. In doing so they destroy fonts, tables, diagrams, and layout. The extracted output is a degraded approximation of the original.

This system takes a different approach. The AI reads your document to make a decision — which pages to keep. Once that decision is made, the AI is done. The engine goes back to the original raw PDF file, slices the approved page data byte-for-byte, and writes it into a new file. No rendering, no conversion, no re-encoding.

PDF mode pipeline: Original PDF → AI reads and decides → Raw pages sliced → New PDF stitched

Font preservation. Because pages are copied raw, every custom font, mathematical symbol, coloured table cell, and annotated diagram is preserved exactly as in the original. Each kept page is indistinguishable from the same page in the source file.
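
The split between deciding and copying can be sketched in a few lines. This is a toy model under stated assumptions, not the engine's actual code: each page is treated as an opaque byte payload, whereas real PDF pages are object graphs that need a PDF library to locate. The names `RawPage`, `Decider`, and `extractPages` are hypothetical.

```typescript
// Toy model: a page is an opaque byte payload (an assumption for illustration).
type RawPage = Uint8Array;

// Hypothetical AI step: it only returns the indices of the pages to keep.
type Decider = (pageCount: number) => number[];

function extractPages(source: RawPage[], decide: Decider): RawPage[] {
  const keep = decide(source.length);
  const output: RawPage[] = [];
  for (const index of keep) {
    if (!Number.isInteger(index) || index < 0 || index >= source.length) {
      throw new RangeError(`page index ${index} is out of range`);
    }
    // slice() copies the bytes unchanged; nothing is decoded or re-encoded.
    output.push(source[index].slice());
  }
  return output;
}

// Usage with a stand-in decider that keeps the first and third pages.
const sourcePages: RawPage[] = [
  new Uint8Array([1, 2]),
  new Uint8Array([3, 4]),
  new Uint8Array([5, 6]),
];
const keptPages = extractPages(sourcePages, () => [0, 2]);
```

The decider never sees or returns page content, only indices, which is what keeps the copy step independent of anything the AI does.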

Multi-file merge architecture

When multiple files are uploaded in a single job, each file is extracted independently in parallel. The AI analyses all files simultaneously, and the resulting keep/discard verdicts are merged into a unified ordered list before the output file is assembled.

Example: 3-file merge

Source file       Pages in file   Pages kept
Paper-2022.pdf    22              6
Paper-2023.pdf    24              8
Paper-2024.pdf    20              5

Merged output: 19-page combined PDF

Pages from different source files are interleaved in a logical order determined by the AI, not just concatenated. If Paper-2023.pdf has a better introduction to a topic, its pages may appear before Paper-2022.pdf material in the final output.
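
One way to picture the merge step is as a filter followed by a sort. The sketch below assumes each verdict carries a numeric `rank` that stands in for the AI's ordering decision; the `Verdict` shape and the `mergeVerdicts` function are hypothetical and only illustrate the idea.

```typescript
// Hypothetical verdict shape; `rank` stands in for the AI's ordering decision.
interface Verdict {
  file: string;
  page: number; // 1-based page number within its source file
  keep: boolean;
  rank: number; // lower rank appears earlier in the merged output
}

interface OutputPage {
  file: string;
  page: number;
}

function mergeVerdicts(perFile: Verdict[][]): OutputPage[] {
  // Flatten the per-file lists into one list of verdicts.
  const all = ([] as Verdict[]).concat(...perFile);
  return all
    .filter((v) => v.keep)
    // Sort by rank, then by file name and page so ties are deterministic.
    .sort((a, b) => a.rank - b.rank || a.file.localeCompare(b.file) || a.page - b.page)
    .map(({ file, page }) => ({ file, page }));
}

const merged = mergeVerdicts([
  [
    { file: "Paper-2022.pdf", page: 1, keep: true, rank: 2 },
    { file: "Paper-2022.pdf", page: 2, keep: false, rank: 0 },
  ],
  [{ file: "Paper-2023.pdf", page: 3, keep: true, rank: 1 }],
]);
// merged lists Paper-2023.pdf page 3 first, then Paper-2022.pdf page 1.
```

Because ordering depends only on `rank`, a page from a later file can land ahead of pages from an earlier one, which is the interleaving described above.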

Client-side storage

The Study Editor stores files locally using IndexedDB — a low-level key-value store built into every modern browser. Files persist between browser sessions and survive page refreshes.

Storage type     What's stored                        Cleared by
IndexedDB        PDF & Markdown files, annotations    Clearing site data or explicit delete
Session memory   Active selection, toolbar state      Page refresh
Server (none)    Nothing                              N/A

Because files live in your browser, they are device-specific. Files uploaded on your laptop won't appear on your phone. Clearing your browser's site data for the Study Editor will permanently remove all stored documents.
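
For readers unfamiliar with IndexedDB, the following is a minimal sketch of saving a file with the standard browser API. The database name, store name, and record shape are assumptions made for illustration and are not taken from the Study Editor. The code is meant to run in a browser, since `indexedDB` is not available in plain Node.js.

```typescript
// Hypothetical names; the Study Editor's real schema is not documented here.
const DB_NAME = "study-editor-demo";
const STORE_NAME = "files";

interface StoredFile {
  name: string; // used as the primary key
  mimeType: string;
  bytes: ArrayBuffer;
  savedAt: number; // epoch milliseconds
}

// Pure helper that builds the record to be stored.
function toRecord(name: string, mimeType: string, bytes: ArrayBuffer, now: number = Date.now()): StoredFile {
  return { name, mimeType, bytes, savedAt: now };
}

function openDatabase(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(DB_NAME, 1);
    // Runs on first open or on a version bump: create the object store.
    request.onupgradeneeded = () => {
      request.result.createObjectStore(STORE_NAME, { keyPath: "name" });
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

async function saveFile(record: StoredFile): Promise<void> {
  const db = await openDatabase();
  try {
    await new Promise<void>((resolve, reject) => {
      const tx = db.transaction(STORE_NAME, "readwrite");
      tx.objectStore(STORE_NAME).put(record);
      // Resolve only once the transaction has committed.
      tx.oncomplete = () => resolve();
      tx.onerror = () => reject(tx.error);
      tx.onabort = () => reject(tx.error);
    });
  } finally {
    db.close();
  }
}
```

In a browser, `await saveFile(toRecord("notes.md", "text/markdown", buffer))` would write one record. The data stays in that browser profile until the site data is cleared or the record is deleted.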