
Adapting AI pre-screening to clinical workflows: lessons from AZ Klina Cardiology

AI-driven pre-screening technology has the potential to streamline patient recruitment for clinical trials, but only if it adapts to how clinical teams actually work. Expecting teams to change their workflow for the technology is rarely effective. This principle became clear during our recent pilot with the cardiology department at AZ Klina.


Understanding the clinical context

Clinical workflows for identifying eligible patients can vary widely between departments. In some centers, patient identification is opportunistic: physicians identify eligible patients during their consultations. In other departments, systematic pre-screening is standard, often performed by study coordinators who review large numbers of patient records to flag potential candidates.


These differences create distinct needs for AI pre-screening. With systematic screening, a broader AI approach is acceptable: the main goal is to efficiently exclude truly ineligible patients, and reviewing some false positives still reduces overall workload compared to fully manual screening. In opportunistic settings, however, each additional flagged patient adds directly to the physician's workload, so high precision is critical.
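As a minimal sketch of this idea, the same eligibility model can serve both workflows simply by changing the confidence cutoff at which a patient is flagged. The function, scores, and threshold values below are illustrative assumptions, not our production configuration:

```python
# Illustrative sketch: one eligibility model, two workflow-dependent cutoffs.
# All names, scores, and thresholds here are hypothetical.

def flag_patients(scores: dict[str, float], threshold: float) -> list[str]:
    """Return patient IDs whose eligibility score meets the threshold."""
    return [pid for pid, score in scores.items() if score >= threshold]

scores = {"patient_001": 0.92, "patient_002": 0.55, "patient_003": 0.74}

# Systematic screening by a study coordinator: a broad cutoff is acceptable,
# since reviewing some false positives still beats reading every record.
print(flag_patients(scores, threshold=0.5))  # all three patients

# Opportunistic screening by a physician: only high-confidence flags,
# to keep the added workload per consultation minimal.
print(flag_patients(scores, threshold=0.9))  # ['patient_001']
```

Lowering the cutoff trades precision for recall; the rest of this post is about where to set that trade-off.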


The AZ Klina pilot: initial results

This need for tailoring became clear in our pilot at AZ Klina, where the cardiologist, who was also the principal investigator, personally reviewed the flagged patients. The AI automatically screened 1,200 patient records and flagged 81 as highly likely eligible; 24 of these were confirmed eligible. At 2–3 minutes per review, the total review time came to roughly 4 hours.


Departments accustomed to manual pre-screening would see this as a major efficiency gain, since 1,119 patients were automatically excluded from manual review. For the cardiologist personally conducting the reviews, however, the workload was still significant. This illustrates that AI pre-screening must be adapted to the specific clinical workflow.
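These workload figures follow directly from the pilot numbers, as a quick back-of-the-envelope check shows:

```python
# Back-of-the-envelope workload check using the pilot figures.
screened = 1200  # patient records screened automatically
flagged = 81     # flagged as highly likely eligible

print(f"Automatically excluded from manual review: {screened - flagged}")  # 1119

# Reported review time: 2-3 minutes per flagged patient.
low, high = flagged * 2 / 60, flagged * 3 / 60
print(f"Total review time: {low:.1f} to {high:.1f} hours")  # 2.7 to 4.0 hours
```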


From broad to strict pre-screening: the Jardiance example

To support this kind of adaptation, the AI pre-screening can be tuned to be broader or stricter, which directly affects the number of false positives. To make this concrete, consider one of the study's eligibility criteria: patients needed to be on antidiabetic medication. Some medications, like Jardiance, can be prescribed for multiple conditions: diabetes, heart failure, or kidney disease.


In the first version of our AI pre-screening, a broad approach was used: if a patient was on Jardiance, the system assumed they met the “on antidiabetic medication” criterion, even if it was prescribed for another condition. The clinician would then manually review each flagged patient to confirm whether the medication was truly for diabetes.


To reduce the workload, the AI was refined to flag patients on Jardiance only if it was clearly prescribed for diabetes, directly matching the eligibility criterion. This made the system more precise and time-efficient while still capturing most eligible patients.
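A simplified sketch of this refinement, assuming the system can extract each medication together with its documented indication (the field names and labels below are hypothetical placeholders, not the actual data model):

```python
# Simplified sketch of the broad vs. strict "on antidiabetic medication" rule.
# Field names and indication labels are hypothetical placeholders.

def meets_criterion_broad(record: dict) -> bool:
    """Broad rule: any Jardiance prescription counts, whatever the indication."""
    return any(med.lower() == "jardiance" for med, _ in record["prescriptions"])

def meets_criterion_strict(record: dict) -> bool:
    """Strict rule: Jardiance counts only if clearly prescribed for diabetes."""
    return any(
        med.lower() == "jardiance" and indication == "diabetes"
        for med, indication in record["prescriptions"]
    )

# A patient on Jardiance for heart failure, not diabetes.
patient = {"prescriptions": [("Jardiance", "heart failure")]}

print(meets_criterion_broad(patient))   # True  -> flagged, clinician must verify
print(meets_criterion_strict(patient))  # False -> correctly excluded up front
```

The strict rule matches the eligibility criterion exactly, at the cost of missing patients whose indication is ambiguous or poorly documented in the record.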


From broad to strict pre-screening: results before and after refinement

With the initial broad approach, the workload was still too high for the cardiologist. To address this, the AI criteria were made stricter: in addition to the Jardiance adjustment described above, similar rules were applied to other eligibility criteria. The impact was clear:

  • Before refinement: 81 patients flagged → 24 truly eligible (~30% true eligibility rate)

  • After refinement: 29 patients flagged → 17 truly eligible (~60% true eligibility rate)


By refining the algorithm, the number of patients the cardiologist had to review dropped by nearly two-thirds, while the true eligibility rate nearly doubled, improving the workload-to-benefit ratio. However, 7 truly eligible patients were missed under the stricter criteria, highlighting the trade-off between precision (reducing false positives and workload) and recall (ensuring eligible patients are not missed).
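In precision and recall terms, treating the 24 patients confirmed under the broad pass as the full set of eligible patients in this cohort, the pilot numbers work out as follows:

```python
# Precision/recall trade-off computed from the pilot numbers.
# Assumes the 24 broad-pass confirmations are all eligible patients.
eligible_total = 24

for label, flagged, confirmed in [("before", 81, 24), ("after", 29, 17)]:
    precision = confirmed / flagged      # true eligibility rate among flagged
    recall = confirmed / eligible_total  # share of eligible patients captured
    print(f"{label}: precision = {precision:.0%}, recall = {recall:.0%}")

# before: precision = 30%, recall = 100%
# after:  precision = 59%, recall = 71%
```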


Balancing precision (PPV) and NPV

At Monsana, we continuously work on improving both positive predictive value (PPV, also known as precision) and negative predictive value (NPV). A higher PPV reduces the number of false positives and avoids unnecessary work for clinicians, while a high NPV ensures that truly eligible patients are not slipping through among those the AI excludes.
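For reference, both metrics can be read off the confusion matrix implied by the post-refinement pilot results, again under the assumption that the 24 broad-pass confirmations represent all eligible patients among the 1,200 records:

```python
# PPV and NPV implied by the post-refinement pilot numbers.
tp = 17              # flagged and truly eligible
fp = 29 - 17         # flagged but not eligible
fn = 24 - 17         # truly eligible but not flagged
tn = 1200 - 29 - fn  # correctly excluded from review

ppv = tp / (tp + fp)  # how trustworthy a "flagged" result is
npv = tn / (tn + fn)  # how trustworthy an exclusion is

print(f"PPV = {ppv:.1%}")  # 58.6%
print(f"NPV = {npv:.1%}")  # 99.4%
```

Because eligible patients are rare relative to the full record set, NPV stays very high even as the screen is tightened.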


What is “optimal” depends on the clinical workflow: some teams can handle a broader set of flagged patients, others need very high precision to keep the workload reasonable.


The key takeaway: tailoring AI to workflows

The AZ Klina pilot illustrates that AI pre-screening cannot be one-size-fits-all. Depending on the clinical workflow, algorithm thresholds and screening criteria must be tailored to balance workload, benefit, and patient capture.


At Monsana, we don’t just implement AI; we adapt it to the workflow and capacity of each study team, continuously refining precision, NPV, and PPV to maximize efficiency and impact while respecting the realities of clinical practice.
