CHALLENGE COMPLETED
Smart Data Query Challenge
Region: U.S.
Introduction
Use machine learning to learn from historical queries and their associated patterns, together with the clinical data from the studies in which those queries were raised.
Challenge Statement
Clinical data managers must merge data across several reports, manually reviewing the study’s audit trail, to determine whether a data validation error (a query) must be raised. Historically, the industry has been limited to manual review processes to validate clinical trial data, and these processes are time-consuming and laborious. With a large training data set available from several historical studies as reference, the ideal machine learning solution should accurately automate the identification of data discrepancies, improving operational efficiency and reducing cycle time for database locks.
Results
Several selected companies participated in this challenge, each applying innovative methods ranging from rules-based techniques to clustering algorithms to deep learning with neural networks. The winning solution used a deep learning model that supported a human-in-the-loop feedback mechanism and prediction interpretability. It empowers data managers to focus on more difficult data reconciliation challenges while the machine assists in predicting repeatable cross-panel data reconciliation patterns.