Sample Case Study

Healthcare Document Classification and Text Extraction


Partner: A market access provider bringing transparency and guidance to pharmacy and medical benefit information. They provide daily updates on available coverage and potential restrictions across more than 6,500 healthcare provider policy plans.

Collecting healthcare provider policies manually was time-consuming and often resulted in incomplete information.

The process of parsing documents for relevant information was labor-intensive.

It was often difficult to tell whether a provider policy was a completely new document or an updated version of a previous document.


Language detection and document classification pipeline to identify relevant policies.

Text classification process to ingest nuanced language data and select relevant portions from within healthcare provider documents.

Document lineage and difference tracking system to alert analysts to changes in document information.
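The classification and text selection steps can be illustrated with a minimal sketch. It is not the partner's actual pipeline: the vocabulary in `POLICY_TERMS`, the function names, and the 0.15 threshold are all hypothetical, and a production system would more likely learn its features from labeled documents than rely on a fixed word list. The sketch scores each paragraph by the share of its words that belong to a policy vocabulary and keeps the paragraphs that score high enough.

```python
import re

# Hypothetical vocabulary for illustration only; a real system would
# derive its lexical features from labeled policy documents.
POLICY_TERMS = {
    "coverage", "prior", "authorization", "formulary",
    "benefit", "restriction", "therapy",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relevance_score(text):
    """Return the fraction of tokens that appear in POLICY_TERMS."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in POLICY_TERMS)
    return hits / len(tokens)


def select_relevant_passages(document, threshold=0.15):
    """Split on blank lines and keep paragraphs at or above the threshold.

    The threshold is an assumed value chosen for this example.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    return [p for p in paragraphs if relevance_score(p) >= threshold]
```

Calling `select_relevant_passages` on a document that mixes policy language with contact details or other filler returns only the paragraphs dense in policy terms, which is the kind of output an analyst would then review.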

Workload Productivity

A two-orders-of-magnitude (roughly 100x) increase in workload productivity.

Documents Processed

A two-orders-of-magnitude (roughly 100x) increase in documents processed per day.

Labor Resources

An order-of-magnitude (roughly 10x) reduction in required labor resources.

Service Highlights

Automate the process of extracting relevant and nuanced information from documents, both digital and scanned, into a structured, organized dataset.

Classify documents by use case or topic using visual cues and lexical features.

Associate and link updated versions of previous documents to create a lineage of version history and identify specific differences between versions.
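Version linking and difference detection can be sketched with Python's standard `difflib` module. This is an illustration under stated assumptions, not the system described in this case study: the function names, the document identifiers, and the 0.6 similarity cutoff are hypothetical, and character-level similarity is only one of several ways to decide whether a document is a new policy or a revision of an existing one.

```python
import difflib


def find_predecessor(new_text, known_versions, min_ratio=0.6):
    """Return the id of the most similar known document, or None.

    known_versions maps a document id to its text. A result of None
    means the incoming document is treated as completely new. The
    min_ratio cutoff is an assumed value for this example.
    """
    best_id, best_ratio = None, 0.0
    for doc_id, old_text in known_versions.items():
        ratio = difflib.SequenceMatcher(None, old_text, new_text).ratio()
        if ratio > best_ratio:
            best_id, best_ratio = doc_id, ratio
    return best_id if best_ratio >= min_ratio else None


def changed_lines(old_text, new_text):
    """Return removed ("- ") and added ("+ ") lines between two versions."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

An incoming document would first go through `find_predecessor`. If a match comes back, `changed_lines` produces the specific differences to surface to an analyst, and the new text is stored as the next entry in that document's version history.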