Living Data 5.1 - Webinars, Blogs, and Big News
It's been a fun couple of weeks at Nested Knowledge. We've had some awesome webinars, intriguing and informative blog posts, and a huge update coming your way.
We had two superuser events. The first was with Medlior Health Outcomes Research Ltd.'s Jody Filkowski, where we discussed an emerging use case for Nested Knowledge: AI Reviews. These rapid evidence generation exercises are useful for a variety of reasons, dramatically speeding up decision making and providing a head start on more comprehensive work.
The next was with Mobility HEOR's Evelyn Rizzo, CPO, MSc, where we discussed all the ways Nested Knowledge can be adapted to a variety of synthesis workflows, as well as how to use the software to create cross-functional alignment throughout the product life cycle. Watch to learn how Nested Knowledge can be used to create a "Living Evidence Library" that keeps evidence up to date and enables collaboration across and within HEOR, Market Access, Medical Affairs, Medical Communications, and Regulatory Affairs.
We also published two blog posts. The first, How to Evaluate the Performance of an AI Model, is a simple guide to understanding AI Screening Models and Cross-Validation in Systematic Literature Reviews (SLRs).
Key Insights:
- Recall as the Primary Metric: In the context of SLRs, recall is the most important metric because it measures how comprehensive the review is. High recall means that fewer studies are missed, which is critical for thoroughness.
- Trade-off with Precision: While high recall is beneficial, it comes with a trade-off. Lower precision means more false positives, which translates to more work for reviewers to sift through and exclude irrelevant studies. However, this trade-off is often acceptable because the cost of missing relevant studies is higher than the cost of adjudicating additional irrelevant studies.
- AI's Role in Screening: The AI can significantly reduce the initial workload of reviewer screening while also ensuring that almost all potentially relevant studies are included. Reviewers then focus on verifying these studies, which, while still labor-intensive, ensures a thorough review process. Therefore, AI can save time in Dual Screening, bolster comprehensiveness, and allow experts to focus on assessing borderline or includable studies!
- Human-Level Performance? The goal is not to achieve perfect performance but to approach human-level performance, particularly in Recall. In fact, because the model is trained on expert adjudications, expecting AI to exceed human accuracy is unrealistic. The AI's higher Recall, at the cost of relying on the adjudicator for Precision, demonstrates its effectiveness in not missing relevant studies, making it a valuable tool in the SLR process (see the sketch after this list).
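To make the Recall/Precision trade-off concrete, here is a minimal sketch of how the two metrics are computed from screening counts. The counts are hypothetical, chosen for illustration rather than drawn from our validation data.

```python
# Minimal sketch: Recall vs. Precision for an AI screening model.
# All counts below are hypothetical, for illustration only.

true_positives = 95    # includable studies the model correctly flagged
false_negatives = 5    # includable studies the model missed
false_positives = 60   # irrelevant studies the model flagged for review

# Recall: of all truly includable studies, how many did the model catch?
recall = true_positives / (true_positives + false_negatives)     # 0.95

# Precision: of all studies the model flagged, how many were includable?
precision = true_positives / (true_positives + false_positives)  # ~0.61

print(f"Recall:    {recall:.2f}  (comprehensiveness -- the priority in SLRs)")
print(f"Precision: {precision:.2f}  (extra reviewer effort when lower)")
```

Here, a reviewer adjudicates 60 extra irrelevant records, but only 5 includable studies slip through: the asymmetry the blog argues is worth accepting.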
When we compare the AI's performance to humans (see Internal and External Validation), we see some trade-offs:
- Robot Recall: The AI had significantly higher Recall than humans in the Internal Validation, meaning it misses fewer includable studies. This is crucial in SLRs because missing relevant studies can compromise the review's comprehensiveness. By having a higher recall, the AI ensures that almost all relevant studies are caught, which maximizes the most important metric in Screening.
- Robot Precision: The AI's precision is lower than that of humans, indicating that it includes more false positives. This means that while the AI is good at not missing studies, it does include more irrelevant studies that need to be manually reviewed and excluded.
While comprehensiveness may be the most important contribution, time savings should also be considered. It has been demonstrated that, when used in Dual Screening, Robot Screener can save roughly 45% of screening time!
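As a rough illustration of where that saving can come from, suppose the AI serves as the second reviewer in Dual Screening, leaving humans one screening pass plus adjudication of the model's extra false positives. All figures below are assumptions for illustration, not measured values.

```python
# Rough, hypothetical arithmetic for Dual Screening time savings.
# Assumed figures for illustration; actual savings vary by project.

records = 5000
seconds_per_record = 30                        # assumed human screening pace
human_only = 2 * records * seconds_per_record  # two independent human passes

# AI replaces one human pass; reviewers still adjudicate the extra
# false positives the model surfaces (assumed 10% overhead here).
ai_assisted = records * seconds_per_record * 1.10

savings = 1 - ai_assisted / human_only
print(f"Estimated screening time saved: {savings:.0%}")  # 45%
```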
Key Terms:
- Cross-Validation: Helps estimate model performance on new data.
- Accuracy: Shows agreement between the model and human decisions.
- Recall: Measures how well the model includes relevant records, thus assessing comprehensiveness.
- Precision: Indicates the correctness of the model’s include decisions.
- Performance Standards and Priorities: The goal is to approach human-level performance while prioritizing model Recall, so that AI-assisted screening offers greater comprehensiveness.
- Continuous Improvement: AI models should be continuously evaluated and improved based on their performance metrics. Regular updates and retraining with new data can help enhance their accuracy and reliability, and both heuristics and an understanding of Cross-Validation are vital to supporting decisions about when to employ AI models (a minimal cross-validation sketch follows this list).
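For readers curious what Cross-Validation looks like in code, here is a minimal, generic sketch using scikit-learn. It is not Nested Knowledge's actual screening model or pipeline; the data, model, and fold count are all stand-ins. It scores held-out folds on Recall, mirroring the priority described above.

```python
# Minimal cross-validation sketch (generic scikit-learn, illustrative only;
# this is not Nested Knowledge's actual screening model or pipeline).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in data: feature vectors per abstract, label 1 = includable study.
# Class weights mimic the typical imbalance of screening (few includes).
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation, scoring on Recall because missing an
# includable study is costlier than reviewing an extra irrelevant one.
recall_scores = cross_val_score(model, X, y, cv=5, scoring="recall")
print(f"Mean held-out recall: {recall_scores.mean():.2f}")
```

Because each fold is scored on data the model never saw during fitting, the mean held-out Recall estimates how the model will perform on new records, which is exactly the question a review team needs answered before trusting AI-assisted screening.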
Our next blog post looked ahead at the Joint Clinical Assessment (JCA). Starting 12 January 2025, all novel oncology medicines and advanced therapy medicinal products must undergo joint clinical assessment under the EU Health Technology Assessment Regulation.
Nested Knowledge can help tremendously with this process, and the blog goes a long way in outlining exactly how.
Buckle up.
The future of how you use Nested Knowledge is about to change for the better. While we've made some small feature updates and bug fixes over the past few months, there's a massive update coming your way.
Make sure you're subscribed to the Newsletter and follow us on LinkedIn so you don't miss this huge release!
As always, if you ever have any questions, comments, or concerns, please reach out to us at contact@nested-knowledge.com
Happy Building,
The Nested Knowledge Team