Unlocking the power of interpreting unstructured health data
- Monsana Team
- 6 days ago
The data dilemma in healthcare
Around 80% of healthcare data is unstructured: physician notes, discharge letters, and narrative reports that are rich in clinical insights but difficult to use for research or trial recruitment. Only about 20% is structured in formats like checkboxes and codes that machines can easily analyze. (1)
Clinicians can interpret unstructured data effectively, but for secondary uses such as research or trial matching, this data is often dismissed as “low quality” because it doesn’t fit predefined categories. As a result, valuable information remains untapped.
The limits of structured data
Structured data suits simple queries like age or diagnosis codes. However, clinical research often requires nuanced details, such as disease progression or treatment response, not captured by checkboxes. For example, trial eligibility criteria like “severe but stable asthma exacerbations in the past 6 months” are typically buried in narrative notes.
Forcing all clinical data into structured fields risks losing context and precision, which compromises quality.
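To make the limitation concrete, here is a minimal sketch of a structured-only eligibility filter. The `PatientRecord` fields, the example patients, and the helper name are hypothetical; the only outside fact used is that J45 is the ICD-10 category for asthma.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Hypothetical record layout, for illustration only.
    patient_id: str
    age: int
    diagnosis_codes: list[str]
    note: str  # free-text narrative that the structured filter never reads


def matches_structured_criteria(patient: PatientRecord, min_age: int, code_prefix: str) -> bool:
    """Return True if the patient meets the age and diagnosis-code criteria."""
    has_code = any(code.startswith(code_prefix) for code in patient.diagnosis_codes)
    return patient.age >= min_age and has_code


patients = [
    PatientRecord("p1", 42, ["J45.50"], "Severe asthma, two exacerbations this spring, stable since."),
    PatientRecord("p2", 57, ["J45.50"], "Asthma well controlled, no exacerbations in over a year."),
]

# Both patients pass: age and code alone cannot tell them apart.
candidates = [p.patient_id for p in patients if matches_structured_criteria(p, 18, "J45")]
print(candidates)
```

Both records pass the filter, although only the narrative notes show which patient had recent exacerbations. That distinction lives outside the structured fields.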
Traditional handling of unstructured data

Existing technologies typically extract key concepts and convert them into structured fields using entity recognition and rule-based methods. While effective for straightforward facts like diagnoses, they often miss subtle or contextual details, such as symptom changes after treatment, which are vital for complex trial eligibility.
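A deliberately simplified sketch of the rule-based idea shows where context gets lost. The keyword rules and example notes below are made up for illustration and do not represent any particular product or library.

```python
import re

# Hypothetical keyword rules: one regular expression per concept.
RULES = {
    "asthma": re.compile(r"\basthma\b", re.IGNORECASE),
    "exacerbation": re.compile(r"\bexacerbations?\b", re.IGNORECASE),
}


def extract_concepts(note: str) -> dict[str, bool]:
    """Flag a concept as present if its pattern matches anywhere in the note."""
    return {name: bool(pattern.search(note)) for name, pattern in RULES.items()}


note_a = "Severe asthma with two exacerbations in March; stable since starting new therapy."
note_b = "Asthma well controlled. No exacerbations in the past 6 months."

flags_a = extract_concepts(note_a)
flags_b = extract_concepts(note_b)

# Both notes yield identical flags, even though note_b negates the exacerbations.
print(flags_a)
print(flags_b)
```

The two notes describe clinically different situations, yet the extracted fields are the same. Negation, timing, and response to treatment are exactly the details that simple keyword matching drops.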
Instead of converting all data into rigid templates, we need tools that interpret unstructured data like a physician would. Generative AI and natural language processing (NLP) can read, interpret, and summarize complex medical narratives while preserving nuance and extracting relevant information for specific questions.
Monsana’s approach: starting with the question, not the data

At Monsana, we begin with the question: Is this patient eligible for this clinical trial? Our AI-powered platform reads full medical records, including unstructured narratives, and interprets them with physician-level nuance. It understands complex eligibility rules, identifies relevant indicators (structured or not), and provides clear, explainable outputs.
This preserves clinical richness, reduces missed matches, and accelerates trial recruitment, enabling rapid, accurate, and nuanced patient-trial matching that would be impossible with structured data alone.
Conclusion: let’s unlock the full potential of health data
Healthcare is complex, and so are its data. Some data fits structured queries; other data needs advanced interpretation. The future lies in combining all available tools to make the most of every piece of information.
Matching patients to trials is not just a technical challenge but a human one: every missed match is a missed opportunity. By embracing unstructured data’s nuance with advanced technologies, we can transform connections among patients, clinicians, and researchers.
If you want to learn more or collaborate, contact us at valerie.vandeweerd@monsana.ai or connect via LinkedIn.
(1) Harnessing Unstructured Data and Hospital Interoperability, Applied Clinical Trials, November 15, 2024. https://www.appliedclinicaltrialsonline.com/view/harnessing-unstructured-data-and-hospital-interoperability