Artificial intelligence is expected to play an increasingly significant role in both medicine and law. The performance of generative artificial intelligence (GAI) models observed since 2022 may encourage consideration of their use in forensic expertise. We evaluated the ability of such models, in 2025, to conduct medical liability assessments. Nine fictional clinical cases describing the medical management of patients treated by a healthcare professional or institution were submitted to three GAI models (ChatGPT-4 Turbo, Gemini, and Mistral AI). The models were prompted to determine whether medical malpractice had occurred. Each model was queried five times for each case, and the responses were compared to the conclusions of a panel of three experts. Out of 135 requests (9 cases × 3 models × 5 repetitions), the conclusions of the GAI models aligned with those of the expert panel in 86 instances (63.7%). The discrepancies were partly due to false negatives, in which the models missed genuine shortcomings, and to false positives, in which the models identified deficiencies that the experts did not recognize. In other cases, they reflected differences of the kind that might also arise between human experts. Every model gave contradictory answers to the same case at least once. Taken together, these risks of false negatives, false positives, and internal inconsistency currently preclude the use of GAI models for forensic medical assessments. Beyond these concerns, we doubt that the current functioning of GAI models, which relies on probabilistic content generation, is suited to forensic expertise and its demand for rigorous reasoning. While GAI tools may eventually complement human expertise, the final assessment and conclusions should remain the prerogative of trained forensic medical experts.
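For readers who want the counting made explicit, here is a minimal sketch of the tallying arithmetic behind the figures above. Only the constants (9 cases, 3 models, 5 repetitions, 86 agreements) come from the abstract; the variable names and the script itself are illustrative and do not reproduce the study's actual prompts or scoring procedure.

```python
# Illustrative tally of the evaluation design described in the abstract.
# Constants are taken from the abstract; everything else is hypothetical.
CASES = 9                                        # fictional clinical cases
MODELS = ("ChatGPT-4 Turbo", "Gemini", "Mistral AI")
REPETITIONS = 5                                  # queries per model per case

total_requests = CASES * len(MODELS) * REPETITIONS
assert total_requests == 135                     # matches the reported total

agreements = 86  # responses matching the expert panel's conclusion
print(f"Agreement: {agreements}/{total_requests} "
      f"= {agreements / total_requests:.1%}")    # -> Agreement: 86/135 = 63.7%
```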