How Aelira Actually Fixes Your PDFs (Not Just Flags Them)
Most accessibility tools scan your PDFs and hand you a list of problems. Aelira fixes them. Here's what happens under the hood when you upload a document.
You upload a PDF. Your accessibility tool scans it. The result: "47 issues found."
Now what?
Most tools stop there. You get a report full of problems — missing structure tags, incorrect reading order, unlabelled table headers — and the implicit message is: go fix these yourself. Manually. One by one. At 30-60 minutes per file.
If you have 500 PDFs to remediate before a compliance deadline, that maths doesn't work: at 30-60 minutes each, it comes to 250-500 hours of manual tagging.
Aelira takes a different approach. When you upload a PDF, you get back a fixed PDF — not just a report. Here's what actually happens.
Step 1: Structure Analysis
The first thing Aelira does is analyse your PDF's internal structure. Most PDFs created by "Save as PDF" or scanning look fine visually, but internally they're a mess. Screen readers don't see what you see — they see the raw data layer, which often has:
- No heading hierarchy — everything is flat text
- No reading order — content might be read in the wrong sequence
- Tables without headers — data grids with no context for what each column means
- Images without alt text — invisible to anyone using a screen reader
Aelira maps out all of these structural gaps before applying any fixes.
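To make the gap-mapping idea concrete, here is a minimal sketch that checks a simplified document summary for the four gaps listed above. `PageElement`, its fields, and `find_structural_gaps` are hypothetical names invented for illustration, not Aelira's actual code; a real pipeline would build this summary from the PDF's structure tree.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageElement:
    """Hypothetical, simplified view of one element found in a PDF."""
    kind: str                        # e.g. "text", "heading", "table", "image"
    has_order_index: bool = False    # True if an explicit reading-order position exists
    has_header_cells: bool = False   # only meaningful for tables
    alt_text: Optional[str] = None   # only meaningful for images


def find_structural_gaps(elements: List[PageElement]) -> List[str]:
    """Return the structural gaps found, in the order the article lists them."""
    gaps = []
    if not any(e.kind == "heading" for e in elements):
        gaps.append("no heading hierarchy")
    if any(not e.has_order_index for e in elements):
        gaps.append("no reading order")
    if any(e.kind == "table" and not e.has_header_cells for e in elements):
        gaps.append("tables without headers")
    if any(e.kind == "image" and not e.alt_text for e in elements):
        gaps.append("images without alt text")
    return gaps


# A typical "Save as PDF" export: flat text, an untagged table, an unlabelled image.
untagged = [PageElement("text"), PageElement("table"), PageElement("image")]
print(find_structural_gaps(untagged))
```

Collecting every gap first, before changing anything, is what lets later steps decide which fixes to attempt and how confident to be in each.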
Step 2: Reading Order
This is one of the hardest problems in PDF accessibility, and where most tools give up entirely.
A two-column academic paper looks obvious to a sighted reader: left column first, then right column. But PDFs don't store content that way. The internal data might have the right column's first paragraph immediately after the left column's title, because that's how the authoring tool happened to write it.
Aelira uses a dual strategy to get reading order right:
For standard layouts (single column, two-column papers, slide handouts), a heuristic engine analyses the visual layout. It detects columns by clustering content blocks by their horizontal position, identifies headers and footers that repeat across pages, and establishes the correct top-to-bottom, left-to-right reading sequence. Headers and footers get marked as artifacts so screen readers skip them automatically.
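The heuristic described above can be sketched in a few lines. This is an illustration of the general technique, not Aelira's engine: the `Block` type, the 50-point `gap` threshold, and the 80% repetition share are all assumptions chosen for the example.

```python
from collections import Counter
from typing import List, NamedTuple, Set


class Block(NamedTuple):
    x0: float   # left edge of the block
    y0: float   # top edge (smaller means higher on the page)
    text: str


def cluster_columns(blocks: List[Block], gap: float = 50.0) -> List[List[Block]]:
    """Group blocks into columns: a new column starts when the left edge
    jumps by more than `gap` from the previous block's left edge."""
    columns: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: b.x0):
        if columns and block.x0 - columns[-1][-1].x0 <= gap:
            columns[-1].append(block)
        else:
            columns.append([block])
    return columns


def reading_order(blocks: List[Block], gap: float = 50.0) -> List[str]:
    """Read columns left to right, and each column top to bottom."""
    ordered = []
    for column in cluster_columns(blocks, gap):
        for block in sorted(column, key=lambda b: b.y0):
            ordered.append(block.text)
    return ordered


def repeated_lines(pages: List[List[str]], min_share: float = 0.8) -> Set[str]:
    """Text that appears on at least `min_share` of pages is a likely
    running header or footer, and a candidate for artifact marking."""
    counts = Counter(line for page in pages for line in set(page))
    return {line for line, n in counts.items() if n / len(pages) >= min_share}


# Blocks stored in the interleaved order an authoring tool might write them.
page = [Block(72, 100, "L1"), Block(320, 100, "R1"),
        Block(72, 300, "L2"), Block(322, 300, "R2")]
print(reading_order(page))
```

Note that an exact-match repetition check like this one would miss page numbers, since they change on every page; a fuller version would need to compare position as well as text.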
For complex layouts (mixed columns, sidebars, pull quotes, unusual designs), Aelira uses AI vision. It renders the page as an image and asks an AI model to determine the correct reading sequence based on the visual layout — the same way a human reader would naturally scan the page.
The heuristic approach handles the majority of documents with high confidence. AI vision kicks in only when the layout is too complex for rule-based analysis.
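One way to picture the hand-off between the two strategies is a simple fallback. In this sketch the heuristic reports its own confidence and AI vision is called only when that confidence is too low. The function names and the 0.8 cut-off are hypothetical; the article does not state what actually triggers the switch.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures: the heuristic returns (reading order, confidence),
# the vision fallback returns a reading order only.
Heuristic = Callable[[object], Tuple[List[str], float]]
Vision = Callable[[object], List[str]]


def order_with_fallback(page: object, heuristic: Heuristic, vision: Vision,
                        min_confidence: float = 0.8) -> Tuple[List[str], str]:
    """Try the cheap rule-based pass first; use AI vision only if it is unsure."""
    order, confidence = heuristic(page)
    if confidence >= min_confidence:
        return order, "heuristic"
    return vision(page), "ai_vision"
```

Because the vision callable is never invoked for confident pages, simple documents incur no model call at all.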
Step 3: Table Remediation
Tables are everywhere in academic documents — grade rubrics, data tables, comparison charts, lab results. An untagged table is almost useless to a screen reader. The reader sees a stream of disconnected values with no way to know which column or row they belong to.
Aelira detects tables in your PDF, then:
- Extracts the table structure — rows, columns, cell boundaries, merged cells
- Identifies headers — analyses the first row and column for short, distinct text that looks like labels
- Optionally confirms with AI vision — for ambiguous tables, sends a snapshot to AI for a second opinion on which cells are headers
- Applies proper tags: creates the full semantic structure of THead, TBody, TR, TH, and TD elements with Scope attributes, so screen readers can announce "Column: Grade, Row: Assignment 3"
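The header-detection and tagging steps above can be sketched as follows. This is a simplified illustration, not Aelira's implementation: `looks_like_label`, `distinct_labels`, `tag_table`, and the 30-character limit are assumptions, the grid is assumed to be rectangular with no merged cells, and the output is a plain nested tuple rather than real PDF structure elements.

```python
from typing import List


def looks_like_label(cell: str, max_len: int = 30) -> bool:
    """Short, non-empty, non-numeric text is treated as a candidate label."""
    text = cell.strip()
    if not text or len(text) > max_len:
        return False
    try:
        float(text.replace(",", "").replace("%", ""))
        return False   # purely numeric, so more likely data than a label
    except ValueError:
        return True


def distinct_labels(cells: List[str]) -> bool:
    """True if every cell looks like a label and no label repeats."""
    stripped = [c.strip() for c in cells]
    return (bool(stripped)
            and all(looks_like_label(c) for c in stripped)
            and len(set(stripped)) == len(stripped))


def tag_table(grid: List[List[str]]):
    """Return a nested (tag, attributes, children) tree for a simple grid."""
    if not grid:
        return ("Table", {}, [])
    children = []
    body = grid
    if distinct_labels(grid[0]):
        head = [("TH", {"Scope": "Column"}, c) for c in grid[0]]
        children.append(("THead", {}, [("TR", {}, head)]))
        body = grid[1:]
    # The first column becomes row headers only if it also looks like labels.
    row_headers = distinct_labels([row[0] for row in body])
    rows = []
    for row in body:
        first = ("TH", {"Scope": "Row"}, row[0]) if row_headers else ("TD", {}, row[0])
        rows.append(("TR", {}, [first] + [("TD", {}, c) for c in row[1:]]))
    children.append(("TBody", {}, rows))
    return ("Table", {}, children)


rubric = [["Assignment", "Grade", "Weight"],
          ["Assignment 1", "85", "20%"],
          ["Assignment 2", "90", "30%"]]
table = tag_table(rubric)
```

With column and row scopes in place, a screen reader has what it needs to pair the value "90" with both "Grade" and "Assignment 2".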
Merged cells, irregular grids, and nested tables all get handled. The more complex the table, the lower the confidence score — which brings us to the next step.
Step 4: Confidence Scoring
Here's where Aelira diverges most from other tools. Automated fixes aren't all created equal. Adding a missing document title is a near-certain fix. Generating alt text for a complex educational diagram is a judgment call.
Aelira scores every fix on a confidence scale:
| Fix Type | Confidence | What Happens |
|---|---|---|
| Rule-based fixes (title, language, bookmarks) | ~0.95 | Applied automatically |
| Heuristic fixes (heading hierarchy, reading order) | ~0.70 | Applied, flagged if complex |
| AI text fixes (alt text from context) | ~0.60 | Flagged for review |
| AI vision fixes (alt text from image analysis) | ~0.55 | Flagged for review |
Fixes scoring above 0.85 are applied automatically. You don't need to review them — they're structural, deterministic, and well-understood.
Fixes scoring below 0.85 are flagged for your review. You'll see exactly what Aelira changed and why, and you can accept, modify, or reject each one.
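The routing rule is simple enough to show directly. In this sketch the `Fix` type and `route_fixes` are hypothetical names; the 0.85 threshold and the example confidences come from the table above. A score of exactly 0.85 is not covered by the text, so the sketch assumes the cautious choice and sends it to review.

```python
from dataclasses import dataclass
from typing import List, Tuple

AUTO_APPLY_THRESHOLD = 0.85   # threshold quoted in the article


@dataclass
class Fix:
    description: str
    confidence: float


def route_fixes(fixes: List[Fix]) -> Tuple[List[Fix], List[Fix]]:
    """Split fixes into (applied automatically, flagged for review)."""
    auto, review = [], []
    for fix in fixes:
        # Assumption: exactly 0.85 is treated as "needs review".
        if fix.confidence > AUTO_APPLY_THRESHOLD:
            auto.append(fix)
        else:
            review.append(fix)
    return auto, review


fixes = [
    Fix("Add document title", 0.95),
    Fix("Rebuild heading hierarchy", 0.70),
    Fix("Alt text from surrounding text", 0.60),
    Fix("Alt text from image analysis", 0.55),
]
auto, review = route_fixes(fixes)
print([f.description for f in auto])
print([f.description for f in review])
```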
This means you spend your time on the 10% that needs human judgment — educational images, discipline-specific diagrams, context-dependent decisions — instead of manually tagging hundreds of headings.
Step 5: Validation
After applying fixes, Aelira doesn't just assume the PDF is now accessible. It validates.
- Matterhorn Protocol: Aelira's native validator checks 15 machine-checkable conditions drawn from the Matterhorn Protocol, the test model for PDF/UA conformance. These cover the structure tree, language tags, alt text, heading hierarchy, table structure, and role mappings.
- veraPDF (optional, 108 rules): a deeper PDF/UA validation pass that covers edge cases simpler validators miss.
You get a compliance report showing exactly which checks passed and which still need attention. Not "we think it's fixed" — proof it's fixed.
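As a rough picture of what such a report contains, the sketch below folds a list of individual check results into a pass/fail summary. `CheckResult`, `compliance_report`, and the check IDs are hypothetical and do not reflect Aelira's report format or real Matterhorn checkpoint numbers.

```python
from typing import Dict, List, NamedTuple


class CheckResult(NamedTuple):
    check_id: str      # hypothetical identifier, not a real checkpoint number
    passed: bool
    detail: str = ""


def compliance_report(results: List[CheckResult]) -> Dict[str, object]:
    """Summarise which checks passed and which still need attention."""
    failed = [r for r in results if not r.passed]
    return {
        "total": len(results),
        "passed": len(results) - len(failed),
        # An empty result list proves nothing, so it is not reported as compliant.
        "compliant": bool(results) and not failed,
        "needs_attention": [f"{r.check_id}: {r.detail}" for r in failed],
    }


results = [
    CheckResult("document-language", True),
    CheckResult("figure-alt-text", False, "2 figures have no alternate text"),
]
print(compliance_report(results))
```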
The AI Layer: Your Choice
For free and demo accounts, Aelira uses Google Gemini for AI-powered features like vision-based reading order, table header confirmation, and alt text generation.
For department and institutional plans, you can connect your own models — open-source options like Llama, Qwen, or Mistral running on your infrastructure via Ollama. Your documents never leave your servers.
And since Aelira's core is open source (MIT + AGPL), self-hosted users can run any model they choose. No vendor lock-in on the AI layer. Universities with data sovereignty requirements keep everything on-premises.
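A pluggable AI layer usually comes down to a small interface that every backend implements. The sketch below shows that pattern with Python's `typing.Protocol`. The `AltTextProvider` interface, its `describe_image` method, and `StubProvider` are hypothetical; a real backend would wrap a hosted model or one served locally through Ollama, and no actual Gemini or Ollama calls are shown here.

```python
from typing import Protocol


class AltTextProvider(Protocol):
    """Hypothetical interface that any AI backend would implement."""

    def describe_image(self, image_bytes: bytes, context: str) -> str:
        ...


class StubProvider:
    """Stand-in backend for demonstration; it performs no model call."""

    def __init__(self, name: str) -> None:
        self.name = name

    def describe_image(self, image_bytes: bytes, context: str) -> str:
        return f"[{self.name}] image near: {context}"


def suggest_alt_text(provider: AltTextProvider, image_bytes: bytes, context: str) -> str:
    """The pipeline depends only on the interface, never on a specific vendor."""
    return provider.describe_image(image_bytes, context).strip()


print(suggest_alt_text(StubProvider("local"), b"", "Figure 2"))
```

Swapping providers then means passing a different object to `suggest_alt_text`; nothing else in the pipeline has to change.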
What This Looks Like in Practice
Before Aelira: You have a 20-page PDF lecture handout. An accessibility checker tells you it has 34 issues. You open Adobe Acrobat, start manually adding structure tags, fixing reading order, adding alt text. An hour later, you're on page 6.
After Aelira: You upload the PDF. 30 seconds later, you get it back with 31 issues fixed automatically and 3 flagged for your review — two images that need discipline-specific alt text and one complex table where AI wasn't sure about the header row. You spend 5 minutes reviewing those three items. Done.
That's the difference between a tool that finds problems and a tool that fixes them.
Under the Hood (For the Technically Curious)
Aelira's PDF pipeline is built on:
- pikepdf for low-level PDF structure manipulation (structure trees, tag insertion, reading order rewriting)
- PyMuPDF for content extraction (text blocks, table detection, bounding boxes)
- Tesseract 5 for OCR on scanned documents
- LuaLaTeX + tagpdf for producing PDF/UA-1 compliant output from LaTeX source
- Matterhorn Protocol validator — a native implementation checking 15 PDF/UA conditions
- veraPDF REST API — optional integration for 108-rule deep validation
- Gemini / Ollama — pluggable AI provider for vision and text generation
The entire remediation pipeline is open source. You can read exactly what it does to your documents, audit the logic, or contribute improvements.
PDF/UA-1 and PDF/UA-2 are both supported, with automatic version detection from document metadata.
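PDF/UA files declare their conformance level in the XMP metadata through the `pdfuaid:part` property, so version detection can be sketched with the standard library alone. This is an illustration under assumptions, not Aelira's code: `detect_pdfua_part` is a hypothetical helper, and it expects the XMP packet as a string already read from the document's metadata stream.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# XMP namespace used for the PDF/UA identification schema.
PDFUAID_NS = "http://www.aiim.org/pdfua/ns/id/"


def detect_pdfua_part(xmp_packet: str) -> Optional[int]:
    """Return the declared PDF/UA part (1 or 2), or None if none is declared.

    XMP allows a property to be written either as a child element or as an
    attribute on rdf:Description, so both forms are checked.
    """
    part_tag = f"{{{PDFUAID_NS}}}part"
    root = ET.fromstring(xmp_packet)
    for elem in root.iter():
        if elem.tag == part_tag and elem.text and elem.text.strip().isdigit():
            return int(elem.text.strip())
        value = elem.attrib.get(part_tag)
        if value and value.strip().isdigit():
            return int(value.strip())
    return None


SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
      <pdfuaid:part>1</pdfuaid:part>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(detect_pdfua_part(SAMPLE_XMP))
```

A file with no `pdfuaid:part` entry makes no PDF/UA claim, which is why the function returns `None` rather than guessing a default.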
Try It
Upload a PDF to the demo and see the pipeline in action. No signup required for the first scan.
If you're evaluating tools for your department, request a pilot — we'll process a batch of your real documents so you can see exactly what the output looks like.
Aelira is an open-core accessibility platform built for higher education. Learn more or view pricing.

Aelira Team, Accessibility Engineers
The Aelira team is building AI-powered accessibility tools for higher education. We're on a mission to help universities meet WCAG 2.1 compliance before the April 2026 deadline.