Detecting Machine-Written Content in Scientific Articles

The recent surge in popularity of AI tools such as ChatGPT is forcing the scientific community to reckon with the place of AI in scientific literature. Prestigious journals such as Science and Nature have attempted to restrict or prohibit AI use in submissions, but are finding these policies difficult to enforce because machine-generated language is becoming increasingly hard to detect.

Because AI is becoming better at mimicking human language, researchers at the University of Chicago wanted to learn how frequently authors are using AI and how well it can produce convincing scientific articles. In a study published Saturday, June 1, in the Journal of Clinical Oncology Clinical Cancer Informatics, Frederick Howard, MD, and colleagues used several commercial AI content detectors to evaluate text from more than 15,000 abstracts from the American Society of Clinical Oncology (ASCO) Annual Meeting from 2021 to 2023. They found approximately twice as many abstracts characterized as containing AI content in 2023 as in 2021 and 2022, a clear signal that researchers are using AI tools in scientific writing. Interestingly, the content detectors were much better at distinguishing text generated by older versions of AI chatbots from human-written text, and less accurate at identifying text from the newer, more accurate AI models or mixtures of human-written and AI-generated text.

Because the use of AI in scientific writing will likely increase as more effective AI language models are developed in the coming years, Howard and colleagues warn that safeguards are needed to ensure only factually accurate information is included in scientific work, given the propensity of AI models to write plausible but incorrect statements. They also concluded that although AI content detectors will never reach perfect accuracy, they could serve as a screening tool to indicate that content requires additional scrutiny from reviewers, but should not be the sole means of assessing AI content in scientific writing.

 
