ChatGPT-generated scientific papers could be picked up by new AI-detection tool, say researchers

  sophia sophia   7122   10 Jun, 2023 

Description:

Earlier this year, a US radiologist published a paper written with the help of ChatGPT in a peer-reviewed journal. Som Biswas of the University of Tennessee Health Science Center in Memphis wrote the article "ChatGPT and the Future of Medical Writing" for the journal Radiology. He said he produced the article, which he edited, to help raise awareness about the usefulness of the technology.

"I am a researcher and I publish articles on a regular basis," Dr Biswas told The Daily Beast. "If ChatGPT can be used to write stories and jokes, why not use it for research or publication for serious articles?"

The Daily Beast reported he's since gone on to publish 16 more journal articles in four months using the chatbot. It also cited one journal editor who said they had experienced a "dramatic uptick" in articles.

Heather Desaire, a professor of chemistry at the University of Kansas, can relate. "The story rings true with my personal experience," she told the ABC. "I worry about journals being overwhelmed with paper submissions and, as a reviewer for those journals, being asked to do 10 times more reviews than I normally would."

While Professor Desaire is no enemy of ChatGPT, she does think it's important to keep an eye on unintended impacts, and she hopes her latest research might help.

A new AI-detector for scientific texts?

In today's issue of the journal Cell Reports Physical Science, Professor Desaire and colleagues reported they have developed a highly accurate method of detecting ChatGPT-generated writing in scientific texts. Professor Desaire said it could help journal editors who find themselves deluged by material written with the help of the chatbot. A detector might help editors prioritise what articles they send out for review, she said.

To develop their tool, the researchers first identified a set of "telltale signs" that differentiate AI-generated text from that written by human scientists.
They did this by carefully analysing 64 "perspective" articles from the journal Science. These are review papers that comment on current research and put it in context. Then they analysed 128 ChatGPT-generated articles on the same research topics. From comparing the two, they identified 20 characteristics that could help decide the authorship of scientific texts.

The characteristics covered paragraph complexity, diversity in sentence length, and use of punctuation and vocabulary, and some of them appeared unique to scientists, the researchers found. For example, while they found humans writing on Twitter might use punctuation like double exclamation points to express emotion, Professor Desaire and her team found scientists have different linguistic penchants.

"Scientists are special people," she said. "They aren't using double exclamation points, but they are using more parentheses and dashes than ChatGPT."

And her research found scientists did like to go on a bit by comparison. "The difference in paragraph length really jumps out at you," said Professor Desaire. Humans were more likely to write very short sentences and very long sentences, she added.

Another characteristic of human-generated scientific text was the use of "equivocal language", that is, words like "however", "although" and "but". And scientists used more question marks, semicolons and twice as many capital letters as ChatGPT.

Training the AI-detector

The researchers then used these 20 features to train an off-the-shelf machine-learning algorithm known as XGBoost. Known in the business as a "classifier", the algorithm provides a mathematical way of deciding between two options. Professor Desaire and her team use it in their daily work to identify biomarkers for diseases such as Alzheimer's.

They checked to see how well their AI-detector performed on a test sample of 180 articles and found it was very good at working out if a scientific article was written by ChatGPT or an actual scientist.
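
As a rough illustration of what such "telltale signs" might look like in practice, the sketch below counts a handful of the features mentioned above for a single paragraph. It is not the researchers' code: the function names, the naive sentence splitter and the choice of nine features are assumptions made for illustration, and the study's full list of 20 features is not reproduced here. In the approach described, numbers like these would then be passed to a classifier such as XGBoost.

```python
import re
import statistics

# Equivocal words named in the article; the study's own word list may differ.
EQUIVOCAL_WORDS = ("however", "although", "but")


def split_sentences(paragraph):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def extract_features(paragraph):
    """Return a few illustrative stylometric counts for one paragraph."""
    sentences = split_sentences(paragraph)
    lengths = [len(s.split()) for s in sentences]
    words = re.findall(r"[a-z']+", paragraph.lower())
    return {
        "sentence_count": len(sentences),
        "word_count": len(paragraph.split()),
        # Spread of sentence lengths: a stand-in for "diversity in sentence length".
        "sentence_length_stdev": statistics.pstdev(lengths) if lengths else 0.0,
        "parentheses": paragraph.count("(") + paragraph.count(")"),
        # Count spaced hyphens and en/em dash characters, not hyphens inside words.
        "dashes": len(re.findall(r"\s-\s|[\u2013\u2014]", paragraph)),
        "question_marks": paragraph.count("?"),
        "semicolons": paragraph.count(";"),
        "capital_letters": sum(1 for ch in paragraph if ch.isupper()),
        "equivocal_words": sum(words.count(w) for w in EQUIVOCAL_WORDS),
    }


if __name__ == "__main__":
    sample = (
        "Results were mixed (see below); however, the effect held. "
        "Why? It is unclear - but replication may help."
    )
    for name, value in extract_features(sample).items():
        print(f"{name}: {value}")
```

Each paragraph becomes one row of numbers, which is the kind of input a two-option classifier can be trained on once every row is labelled as human-written or ChatGPT-written.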
"The method is more than 99 per cent accurate," Professor Desaire said, adding it was better than existing tools, which were trained on a wider range of texts, beyond scientific writing. It could be adapted for other purposes such as detecting student plagiarism, as long as it was trained on the right language used by the group in question, she said. "You could morph it into whatever domain you wanted to, by thinking about what features would be useful." Will it work in the real world? Researchers not involved in the study said the comparison of texts that were 100 per cent AI-generated versus 100 per cent human generated was not realistic. "It's an artificial distinction," said Vitomir Kovanovi? who builds machine learning and AI models at the University of South Australia's Centre for Change and Complexity in Learning (C3L). He said when scientists use ChatGPT there tends to be more of a collaboration between human and machine, so for example a scientist may edit the AI-generated text. This is just as well because ChatGPT can get things wrong, with one study even finding it can generate fictitious references. But because the researchers compared 100 per cent AI with 100 per cent human-generated text instead of using collaborative texts, their success rate was boosted, Dr Kovanovi? said. Lingqiao Liu from the Australian Institute for Machine Learning at the University of Adelaide agreed the real-world accuracy may be lower, leading to more wrong classifications than expected. "Methodologically it's OK, but there is a risk of using this," said Dr Liu, who develops algorithms to detect AI-generated images. Professor Desaire and colleagues said studies with larger samples would be needed to show how broadly the approach could be applied. But follow-up studies so far had shown the tool was still useful when applied to human/ChatGPT collaborations, Professor Desaire said. "We can still predict with very high accuracy the difference." 
AI 'arms race'

But as Dr Liu noted, it was possible to instruct ChatGPT to write in a specific way that could get a 100 per cent AI-written text past a detector. And the publication of features distinguishing human from machine-written text would only make this easier. In fact, some commentators talk about an "arms race" between those attempting to get machines to be more human-like and those trying to catch out people who would use this to nefarious ends.

Dr Kovanović believes this is a "pointless race to have", given the momentum of the technology and its potential positives. He says AI detection "misses the point". "I think it's much better to sink our effort into how we can use AI productively."

He also argued the practice of using anti-plagiarism software to score university students on how likely it was their work was written by AI was causing unnecessary stress. "It's hard to trust that score," he said.

Kane Murdoch, who investigates misconduct at Macquarie University, said the operation of anti-ChatGPT software can be like a "black box". Unlike Professor Desaire's research, he said the detail behind some AI-detection systems can be sketchy. "It's very unclear how these numbers are arrived at," he said. "We could just be looking at improving assessment."

Mr Murdoch also wonders whether AI detection in fields like science might scare people away from the "ethical" use of AI, which could help important communication of science. "Someone may not be a particularly strong writer, but they might be a very good scientist."

Regardless of the challenges, Dr Liu said it was important to continue research into AI detection, and the research by Professor Desaire and colleagues was "a good starting point" for assessing scientific writing.

A spokesperson for the journal Science said the publication had recently updated its editorial policies to specify that text generated by any AI tool cannot be used in a scientific paper.
They said while there may eventually be "acceptable uses" of AI-generated tools in scientific papers, the journal was waiting for more clarity on what uses the scientific community saw as "permissible." "A tool that could distinguish with accuracy whether ChatGPT produced submissions could potentially be a useful addition to our rigorous peer review processes if it had a proven track record of accurate detection," they added.

Comments

  • james john

    Many computer scientists and AI researchers do research and work at OpenAI.

    Reply | 01 Jul, 2023
  • james john

    ChatGPT is a product of OpenAI, which is a research lab in America.

    Reply | 01 Jul, 2023
