The study by Major et al. tackles an issue in an ill-defined area of modern technological communication: something that is being called “artificial intelligence” or “AI” but actually consists of a host of overlapping technologies, including inference engines, neural networks, machine learning, query scripts, and hardware load-balancing, among others. Because there are financial incentives to hype this technology stack, which can generate plausible simulacra of creative outputs, the result has been elevated into the realm of “intelligence.” These systems are not intelligent; rather, they represent a technology that has, for decades, fooled humans because we are easily fooled. Science was invented to address this vulnerability.

These technologies work by “tokenizing” every input, whether a word or an image, and then recursively approximating the likelihoods of specific tokens appearing next in the sequence of tokens used to generate text or an image (see the illustrative sketch below). Referring to errors by AI as “hallucinations” presumes that there is a state in which these systems know reality and are simply deviating from it, when in fact “all they do is hallucinate.”1

As Major et al. describe, AI-use disclosure guidelines for authors vary across journals in many respects. However, the publishers’ own use of large language models (LLMs) and other systems that they call “AI” is also not typically disclosed to authors, reviewers, or readers. Did an AI system screen the manuscript checklist, references, and initial submission before processing it into the editorial queue? Were the statistical models first checked by an AI system? Was the rejection or acceptance letter generated by AI? So-called “AI screening tools” are used increasingly, and without disclosure, across scientific journal publishers. Since open access (OA) and the author-pays model (Gold OA) caused massive consolidation in the scientific publishing industry, a few major for-profit companies now control most of the manuscript systems, have financial incentives to appear to be embracing AI technologies, and disclose very little about how these tools were developed or tested, or even whether, and how, they are being used.

AI tools at any level in the research supply chain introduce unique risks to science and medical research. Science was invented to allow humans to better understand the universe in which they live, and to understand themselves. It is a process built to separate causation from coincidence, so having AI systems, which are themselves predicated on coincidence, involved on either side of making or evaluating scientific claims is risky: it introduces errors that these systems may accept or simply fail to detect and, worse, because these systems later ingest their own outputs, it creates a vicious cycle of information pollution.

The study by Major et al. tackles 1 aspect of the AI hype era. On the basis of a broader view of the landscape, it is my strong opinion that journals should have an absolute prohibition on the use of AI tools in their current state, or any likely future state, for authors, for screening, or for communication of any kind. This is, after all, science: a human endeavor created so that we are not fooled by correlation that is not causation. LLMs and AI stacks are all built on correlation without causation, that is, hallucinations.
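To make the token-by-token mechanism described above concrete, the toy sketch below (in Python) generates a short word sequence purely by sampling from an invented table of next-token probabilities. The vocabulary, probabilities, and function names are illustrative assumptions for this commentary, not a description of any vendor’s actual system.

    # Toy sketch only: next-token prediction over a tiny, invented vocabulary.
    # The probability table and all values are assumptions made for illustration.
    import random

    # Hypothetical conditional probabilities P(next token | previous token).
    NEXT_TOKEN_PROBS = {
        "the": {"fracture": 0.4, "patient": 0.35, "study": 0.25},
        "fracture": {"was": 0.7, "healed": 0.3},
        "patient": {"was": 0.6, "underwent": 0.4},
        "study": {"found": 0.5, "was": 0.5},
        "was": {"treated": 0.5, "significant": 0.5},
    }

    def generate(start, length=6):
        """Extend a sequence token by token, sampling each continuation
        from the table of likelihoods for the most recent token."""
        sequence = [start]
        for _ in range(length):
            options = NEXT_TOKEN_PROBS.get(sequence[-1])
            if not options:
                break  # no statistics for this token; stop generating
            tokens, weights = zip(*options.items())
            sequence.append(random.choices(tokens, weights=weights)[0])
        return " ".join(sequence)

    print(generate("the"))  # e.g., "the patient was treated"

Run repeatedly, the sketch produces fluent-sounding fragments that differ each time; nothing in the table encodes whether any statement is true, which is the sense in which such output is correlation without causation.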
It behooves every scientist and practitioner using scientific information to understand the overall risks, to hold all of the players in the research supply chain to high standards of disclosure and excellence, and to demand that the culture of science be elevated over the worship of unproven technologies.
Published in: Journal of Bone and Joint Surgery
Volume 108, Issue 4, pp. 253-254