Abstract

Background: Elicit AI aims to simplify and accelerate the systematic review process without compromising accuracy. However, research on Elicit's performance is limited.

Objectives: To determine whether Elicit AI is a viable tool for systematic literature searches.

Methods: We compared the studies included in four systematic reviews with those identified by searching in Elicit. We calculated sensitivity and precision, and we observed patterns in Elicit's performance.

Results: Elicit had an average precision of 39.6% (range 26.7% - 46.2%), higher than the 7.55% average (range 0.65% - 14.7%) of the original reviews. However, Elicit's sensitivity was poor, averaging 37.9% (range 25.5% - 69.2%) compared with 93.5% (range 87.2% - 98.0%) for the original reviews. Elicit also identified some included studies that the original searches had missed.

Discussion: At the time of this evaluation, Elicit did not search with high enough sensitivity to replace traditional literature searching. However, the high precision of Elicit's searches could prove useful for preliminary searches, and the unique studies it identified mean that researchers can use Elicit as a useful adjunct.

Conclusion: Whilst Elicit searches are currently not sensitive enough to replace traditional searching, Elicit is continually improving, and further evaluations should be undertaken as new developments take place.

Key Messages

- AI tools, such as Elicit, have been developed to improve the efficiency of systematic review processes, including the identification of studies.
- Across four case-study systematic reviews, Elicit searches had a sensitivity between 25.5% and 69.2% (37.9% average) and a precision between 26.7% and 46.2% (39.6% average).
- Elicit identified some unique studies that met the inclusion criteria of each case-study systematic review.
- Elicit is constantly improving and developing its systems, so independent researchers should continue to evaluate its performance.
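The sensitivity and precision figures reported above follow the standard information-retrieval definitions: sensitivity (recall) is the share of a review's included studies that a search retrieves, and precision is the share of retrieved records that turn out to be included studies. A minimal sketch of these calculations, with hypothetical numbers rather than the paper's data:

```python
def sensitivity(relevant_retrieved: int, relevant_total: int) -> float:
    """Fraction of the review's included studies that the search found."""
    return relevant_retrieved / relevant_total


def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Fraction of retrieved records that were included studies."""
    return relevant_retrieved / total_retrieved


# Hypothetical example: a search returns 100 records, 40 of which are
# among a review's 105 included studies.
print(round(sensitivity(40, 105), 3))  # 0.381
print(round(precision(40, 100), 3))    # 0.4
```

The trade-off the abstract describes follows directly from these definitions: a narrow, high-precision search (like Elicit's) retrieves few irrelevant records but misses many included studies, while a broad traditional search achieves high sensitivity at the cost of screening many more records.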