AI-native receipt parsing at fintech scale.
A production receipt parsing system processing millions of transactions monthly. Built to replace brittle OCR pipelines with a model-first approach that handles real-world messiness.
Legacy OCR pipelines broke on crumpled receipts, faded ink, and non-standard formats. Error rates were high enough to require manual review queues staffed around the clock.
Designed a multi-stage pipeline combining vision models for layout understanding with LLMs for structured extraction. Built confidence scoring and human-in-the-loop escalation for edge cases.
Reduced manual review volume by over 70%. Accuracy improved to 96%+ on production traffic. System now handles millions of receipts monthly with no significant increase in ops headcount.