M. Bergeron1, S. Goswami1, D. Tong1, M. Shahin2, R. Singh3, B. Corrigan4, R. Keizer1; 1InsightRX, San Francisco, CA, USA, 2Pfizer Research & Development, Groton, CT, USA, 3Pfizer, Cambridge, MA, USA, 4Pfizer, Groton, CT, USA.
Background: Analyzing clinical trial data is labor-intensive, time-consuming, and requires programming skills. While generative AI has shown promise in many fields, concerns remain about its reliability and reproducibility. We present Apollo-AI, a platform that bridges this gap by integrating generative AI into clinical software through robust agentic workflows, augmenting analysis and reporting capabilities via natural-language chat.

Methods: We developed a modular agentic architecture designed to reliably perform analysis tasks while providing insight to users in real time. The system comprises three components: 1) a Conversational Agent that collaborates with users to co-create tasks; 2) Task Agents, specialized LLMs endowed with deep domain knowledge; and 3) an Agent-Computer Interface that allows AI agents to efficiently search, view, and edit files and to execute code. The platform is planned to support ad-hoc analysis, non-compartmental analysis (NCA), population pharmacokinetic (PopPK) modeling, and report generation. Importantly, its modular design, with checkpointed handoffs between agents, produces a clear audit trail. The platform was tested in an automated fashion across two realistic datasets for its ability to parse and prepare data for plotting and summarization in response to variable prompts. The visual correctness of the resulting plots and summary tables was then scored manually by four human reviewers.

Results: Apollo-AI empowers users by surfacing relevant knowledge (e.g., standard operating procedures, dataset features) at the right time, catching errors, and streamlining tasks in clinical trial analysis and reporting. Experiments to quantify the improvement in efficiency and reproducibility are currently underway. For a set of common PK data exploration tasks, the platform currently generates the correct data in 97% (88%-100%) of cases. For one-shot requests, 54% (47%-62%) of prompts resulted in a visually correct plot, while 6% (5%-7%) of prompts produced no output or an incorrect one. In 36% (33%-48%) of outputs, a minor aesthetic adjustment was deemed necessary.

Conclusion: By leveraging agent-based workflows, Apollo-AI enhances analysis and reporting capabilities, reduces the need for advanced programming skills, and accelerates clinical trial data analysis.
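To illustrate the checkpointed-handoff pattern described in the Methods, the sketch below shows one way such an architecture could be wired together. It is a minimal, hypothetical Python rendering under stated assumptions: class names such as ConversationalAgent, TaskAgent, AgentComputerInterface, and AuditTrail are illustrative only and are not the Apollo-AI API. The key idea it demonstrates is that every handoff between agents is recorded as a checkpoint, which is what yields the audit trail.

```python
# Minimal sketch of checkpointed agent handoffs (hypothetical names, not the Apollo-AI API).
from dataclasses import dataclass, field
from datetime import datetime, timezone
import json


@dataclass
class Checkpoint:
    """One record of a handoff between agents; the sequence forms the audit trail."""
    timestamp: str
    source: str
    target: str
    payload: dict


@dataclass
class AuditTrail:
    checkpoints: list = field(default_factory=list)

    def record(self, source: str, target: str, payload: dict) -> None:
        self.checkpoints.append(Checkpoint(
            timestamp=datetime.now(timezone.utc).isoformat(),
            source=source,
            target=target,
            payload=payload,
        ))

    def dump(self) -> str:
        return json.dumps([c.__dict__ for c in self.checkpoints], indent=2)


class AgentComputerInterface:
    """Stand-in for file search/view/edit and code execution available to agents."""
    def run_code(self, code: str) -> str:
        # Placeholder: a real implementation would execute the code in a sandbox.
        return f"<executed {len(code)} characters of generated code>"


class TaskAgent:
    """Specialized agent (e.g., an NCA or PopPK agent) that turns a task spec into output."""
    def __init__(self, name: str, aci: AgentComputerInterface):
        self.name = name
        self.aci = aci

    def execute(self, task_spec: dict) -> dict:
        # Placeholder for LLM-driven code generation; here we only echo the request.
        result = self.aci.run_code(f"# plot {task_spec['analyte']} versus time")
        return {"task": task_spec, "result": result}


class ConversationalAgent:
    """Co-creates a task specification with the user, then hands it off to a Task Agent."""
    def __init__(self, trail: AuditTrail):
        self.trail = trail

    def handoff(self, task_spec: dict, task_agent: TaskAgent) -> dict:
        # Checkpoint the outbound handoff, run the task, checkpoint the returned output.
        self.trail.record("conversational_agent", task_agent.name, task_spec)
        output = task_agent.execute(task_spec)
        self.trail.record(task_agent.name, "conversational_agent", output)
        return output


if __name__ == "__main__":
    trail = AuditTrail()
    aci = AgentComputerInterface()
    nca_agent = TaskAgent("nca_agent", aci)
    convo = ConversationalAgent(trail)
    convo.handoff({"analyte": "drug concentration", "plot": "concentration-time"}, nca_agent)
    print(trail.dump())  # full audit trail of checkpointed handoffs
```

In this sketch the audit trail is populated as a side effect of every handoff rather than by the agents themselves, which is one way a modular design can guarantee traceability regardless of which specialized agent handles a given task.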