AI Detection as a Weapon Against “Paper Mills”

30 September, 2025
Publishers today face numerous challenges related to verifying the authenticity of submitted texts. One of the most significant threats in recent years has been the mass use of generative artificial intelligence models, such as ChatGPT, to create or "enhance" scientific papers. While these tools can support creativity, they are increasingly being used unethically—to generate content without disclosing the use of AI.

By Ewelina Kochanowska
Biblioteka Analiz, No. 645 (13/2025)

The phenomenon of AI-generated content in academic publishing is not marginal. According to a study published in Applied Intelligence, ChatGPT was referenced in roughly 10 percent of the articles in each of the journals analyzed, showing that authors across disciplines have quickly adopted AI tools.

A particularly troubling phenomenon in the academic world is the rise of so-called paper mills—informal or even criminal organizations that prepare scientific papers on demand, often on a massive scale. These content factories offer clients ready-made publications, authorship attribution opportunities, or the sale of articles to specific journals. Increasingly, they rely on generative AI to quickly create seemingly original and formally correct texts.

Scientific publishers are one of the main targets of this practice—especially those that publish open-access articles and do not enforce strict publication standards. Paper mills exploit the fact that many editorial boards rely solely on similarity analysis to detect plagiarism. In this scenario, AI-generated articles often pass traditional originality checks because standard plagiarism detection systems focus on identifying matches between the submitted text and sources available in databases (publications, internet resources).
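To make that limitation concrete, the toy sketch below implements the kind of exact-overlap check that similarity analysis relies on: Jaccard similarity over word five-grams ("shingles"). The function names and the five-gram choice are illustrative assumptions, not any vendor's actual algorithm.

```python
def shingles(text: str, n: int = 5) -> set[str]:
    """Word n-grams ("shingles") of the text, lowercased."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_score(submission: str, source: str) -> float:
    """Jaccard similarity between the shingle sets of two texts."""
    a, b = shingles(submission), shingles(source)
    return len(a & b) / len(a | b) if a | b else 0.0
```

Because a freshly generated text shares almost no exact five-grams with any indexed source, a check of this kind scores it near zero and waves it through.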

However, AI-generated text may be original in the sense of containing no direct borrowings, since it is produced through statistical language prediction. Only stylometric and linguistic analysis—that is, AI detection—can flag such a text and provide evidence that it was not the result of individual authorial effort but was created by a language model.
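As a rough illustration of what "stylometric" means here, the sketch below computes two classic signals. Real detectors combine far richer features, such as perplexity under a reference language model; these two are illustrative assumptions only.

```python
import statistics

def stylometric_features(text: str) -> dict[str, float]:
    """Two toy signals of the kind stylometric detectors build on."""
    sentences = [s.split() for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    lengths = [len(s) for s in sentences]
    words = text.lower().split()
    return {
        # Human prose tends to vary sentence length more ("burstiness").
        "sentence_length_stdev": statistics.stdev(lengths) if len(lengths) > 1 else 0.0,
        # Lexical diversity: distinct words over total words (type-token ratio).
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
    }
```

Unusually uniform sentence lengths and flat lexical diversity are, on their own, weak evidence; detectors aggregate many such signals before reporting a likelihood.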

For academic editors responsible for the quality and authenticity of publications, such information is crucial, both ethically and legally. According to a 2023 Nature report, more than 10,000 research papers were retracted in 2023 alone—a record—and many of those published in recent years in Asian and Eastern European journals are suspected of originating from paper mills. A large share included falsified data, random graphs and charts, and texts generated or edited by AI.

In response, AI detection—especially when combined with metadata analysis (e.g., repeated authorship, affiliations, rapid submission of multiple texts)—is becoming a key element in combating this practice. A Stanford University study showed that AI detection models correctly identified generated sections in 9 out of 10 cases, provided the analysis involved longer academic texts.
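A minimal sketch of the metadata side of such screening might look as follows. The record fields ("author", "submitted") and the threshold of five submissions within thirty days are hypothetical assumptions for illustration, not a published editorial policy.

```python
from collections import defaultdict
from datetime import date

def flag_rapid_submitters(submissions: list[dict], limit: int = 5, window_days: int = 30) -> set[str]:
    """Authors with `limit` or more submissions inside any `window_days` span."""
    by_author: dict[str, list[date]] = defaultdict(list)
    for sub in submissions:
        by_author[sub["author"]].append(sub["submitted"])
    flagged: set[str] = set()
    for author, dates in by_author.items():
        dates.sort()
        # Slide a window of `limit` consecutive submissions and check its span.
        for i in range(len(dates) - limit + 1):
            if (dates[i + limit - 1] - dates[i]).days <= window_days:
                flagged.add(author)
                break
    return flagged
```

Cross-referencing such flags with AI-likelihood scores is what gives the combined approach its power: neither signal alone is conclusive, but together they narrow the pool for manual review.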

Thus, AI detection is becoming a breakthrough tool because, by analyzing linguistic style and structure, it can indicate that a text—despite showing no similarities—exhibits traits characteristic of machine-generated content. Moreover, many AI detection reports can highlight which parts of a text may have been created by language models and which are likely human-written, which is essential when evaluating individual authors' contributions.
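Segment-level reporting of this kind could be sketched as below, assuming some detector exposed as an `ai_likelihood` callable returning a 0-to-1 score. Both that callable and the 0.8 cut-off are assumptions for illustration.

```python
from typing import Callable

def segment_report(text: str, ai_likelihood: Callable[[str], float], cutoff: float = 0.8) -> list[dict]:
    """Score each paragraph with a detector and flag the suspicious ones."""
    report = []
    for i, para in enumerate(p.strip() for p in text.split("\n\n") if p.strip()):
        score = ai_likelihood(para)  # assumed 0..1 likelihood from the detector
        report.append({"paragraph": i, "score": round(score, 3), "flagged": score >= cutoff})
    return report
```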

One example comes from a renowned European publisher cooperating with an international indexing database, which, thanks to AI detection, discovered that several authors from the same institution submitted a dozen articles in a short period with nearly identical structure, style, and graphic layout. Although each article was “original” in the sense of showing no textual overlaps, reports revealed that the bulk of the content in each case bore typical features of texts generated by ChatGPT or Gemini. The publisher notified the journal’s editorial board, and after an internal audit, all articles were withdrawn, and the institution was placed under additional monitoring.

In summary, implementing AI detection in anti-plagiarism systems brings numerous benefits. Most importantly, it enhances the credibility of the publisher by assuring reviewers and readers that published content is authentic. It also supports the peer-review process by giving editors and reviewers an additional tool to assess the quality of submissions. It helps prevent abuse—simply knowing that texts are checked for AI generation deters undisclosed use—and ensures compliance with international database requirements. For instance, Scopus and Web of Science are beginning to mandate transparency regarding AI usage.

At a time when AI technology is rapidly transforming the landscape of academic communication, publishers must respond just as dynamically to new challenges. AI detection in anti-plagiarism systems is not a luxury but a necessity if we care about maintaining ethical standards, publication credibility, and the protection of scientific output.

Sources:

  1. Picazo-Sanchez, P., Ortiz-Martin, L. (2024), “Analysing the impact of ChatGPT in research”, Applied Intelligence, https://doi.org/10.1007/s10489-024-05298-0
     
  2. Van Noorden, R. (2023), “More than 10,000 research papers were retracted in 2023 – a new record”, Nature, www.nature.com/articles/d41586-023-03974-8
     
  3. Cell Press (2023), “AI-generated academic science writing can be identified with over 99% accuracy”, ScienceDaily, www.sciencedaily.com/releases/2023/06/230607124132.htm