Sell schools the AI detector, sell students the workaround. Inside a troubling double game.

As students face expulsion and workers lose jobs, critics say flawed detection tools are ruining lives – while their makers profit from both sides.

When Haishan Yang was reportedly expelled from the University of Minnesota over allegations that he used AI to complete an exam, he compared the outcome to being hit with “a death penalty.” The trouble began when his written preliminary exam submission was flagged as suspiciously similar to AI-generated work. Passing this exam was mandatory for the 33-year-old doctoral student to begin work on his dissertation.

Instead of moving a step closer to his Ph.D., Yang found himself before a student conduct review panel, which eventually determined that he had essentially cheated by using ChatGPT or a similar AI tool for parts of his work and found him guilty of academic dishonesty. According to MPR News, Yang filed a federal lawsuit against the university. While the case was ultimately dismissed without prejudice, it still raises questions about the reliability of the methods used to determine AI usage, the biases baked into them, and whether the resulting punishments are truly justified.

Image courtesy of Haishan Yang

According to Yang, the panel that ruled against him relied in part on AI detection software to reach its decision. As pervasive as AI usage has become, there are still spaces where it is not only frowned upon but can carry severe consequences. Students face anything from a zero on a paper to expulsion; employees could find themselves out of work.

With outcomes this severe, one would want the methodology behind them to be both accurate and fair. Yet even with advanced AI detection software, that may be an unrealistic expectation. As more people face consequences based on its verdicts, controversy is inevitable.

While some argue that detection software is necessary to ensure work is completed honestly, others worry about the legal and ethical consequences of false accusations. With evidence emerging that AI detectors have blind spots, unfair and harmful outcomes are highly likely. If such issues aren’t properly addressed, AI detectors could create nearly as many problems as they are meant to solve.

AI detectors cause frustration and may lead to lower-quality work

Even as some insist that AI detection software is both inevitable and essential, Bob Hutchins, PhD, suspects the tools amount to little more than “security theater.” Hutchins, an AI strategist and consultant, told IWAI that he has “been on both sides” of the detection experience, having relied on the software to check student work and having had his own writing tested for AI usage.

“I have seen students reduced to tears in order to try to explain their writing process to administrators who trusted an online percentage more than the student in front of them.”

—Bob Hutchins, PhD

Hutchins said his own work has been flagged by AI detection software, not because it was created with the help of AI, but because the detector was trained to treat a particular kind of writing as suspect. “Clear writing, consistent voice, good grammar. Apparently, that is now suspicious.”

This sentiment was echoed by Pilar Lewis, a public relations associate who admits she “felt frustrated” by the kinds of writing AI detectors end up falsely labeling as AI-generated.

“Now, if your writing is flagged as being AI-generated, you have to defend your tone and clarity,” said Lewis. She noted that the more polished the final product, the more likely a detector is to conclude that it was created with an AI tool. “Good professional writing is supposed to be clear, organized, and efficient.”

In what travel and sustainability writer Mariana Zapata Herrera dubbed “a Black Mirror moment,” a detector not only mislabeled her writing as completely AI-generated but also offered to help her rewrite it “to sound more human.”

Some worry that perpetual reliance on and unquestioned trust in AI detection software will lead to a dramatic drop in content quality: by making one’s work rougher and simpler, it may be possible to avoid being accused of improper AI use.
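To see why polish itself can start to look suspicious, consider a deliberately crude sketch of the kind of uniformity heuristic detectors are often described as rewarding. This is a toy illustration only, not the method of any actual product; the scoring rule, function names, and example sentences are invented for the sake of the example.

```python
# Toy "uniformity" heuristic -- an invented illustration, not any real
# detector's algorithm. The idea: prose in which every sentence is about
# the same length looks statistically smooth, and smoothness is one of
# the signals detectors are commonly said to treat as "AI-like".
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Crudely split on periods and count the words in each sentence."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [len(s.split()) for s in sentences]


def ai_likeness_score(text: str) -> float:
    """Return 0..1, where 1 means maximally uniform (most 'AI-like')."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    # Coefficient of variation: low variation -> high "AI-like" score.
    spread = statistics.pstdev(lengths) / statistics.mean(lengths)
    return max(0.0, 1.0 - spread)


polished = ("The report is clear. The findings are well supported. "
            "The recommendations are actionable.")
uneven = ("Honestly, I went back and forth on this. Twice. "
          "Then the third read changed my mind completely about the draft.")

print(ai_likeness_score(polished))  # close to 1.0: flagged as "AI-like"
print(ai_likeness_score(uneven))    # lower: reads as more "human"
```

By that invented metric, deliberately uneven writing scores as more “human,” which is precisely the perverse incentive critics describe: the cleaner the prose, the more suspicious it looks.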

Treating AI as both poison and cure causes an ethical (and legal) conundrum

If you’re afraid that your completely human work will be mislabeled as AI-generated, many detection services offer a humanizer tool. These apps promise that running your work through them will eliminate all the supposed tells that would otherwise get your content flagged.

In an email comment, ZeroGPT explained that its humanizer was designed to “refine content” to the point that it could subsequently fool any detector, including its own. “The goal is to make the text appear more natural and human-like while maintaining the original meaning.”

While it may be possible to find such options for free, these services are often sold as part of paid subscription plans. That creates not only a massive conflict of interest but also a serious ethical conundrum: should you trust a service that treats AI as a threat and then sells that very same technology as the solution?

Rashad Matthews, an AI expert and compliance professional, strongly advises against relying on AI services that provide both checkers and humanizers. “It’s extremely high risk if a [business is] trying to sell both,” said Matthews, pointing out that such providers could soon find themselves on the wrong side of AI-related legislation.

Matthews pointed to the EU Artificial Intelligence Act as an example. The law is designed to provide a framework for AI development across Europe, part of which involves reducing risk and ensuring that AI developers build and deploy the technology without causing harm. Matthews insists that playing both sides by offering a detector and a humanizer could well be deemed a harmful practice.

“A company sells a detector to a [college] institution,” said Matthews, “and then [that] same company sells a humanizer to the students.” It’s a conflict of interest that may seem profitable in the short term, but Matthews believes it could lead to lawsuits and penalties as legislative bodies worldwide work to establish crucial regulatory guidelines.

Avoiding these potential consequences requires an honest conversation about AI detection software, its limitations, and how to get the most out of the technology.

AI detection requires transparency—and acknowledgment of inherent ableism

AI detection services should be held to the same accountability standards already applied to AI content generation software. Developers must be transparent about what their detectors can and cannot do, and they might also reconsider the wisdom of building detectors that are designed to fail against their own humanizer software.

A major selling point of AI humanizers is that they rework the writing of people who are neurodivergent, have learning disabilities, or are non-native English speakers, the kind of writing detectors are prone to flag falsely. Laurence Minsky, a professor at Columbia College Chicago, notes that he has submitted his own work to checkers multiple times and wasn’t surprised when it was flagged.

“I know the inclination of AI checkers to falsely flag the content written by people with dyslexia.” Despite such flaws, Minsky doesn’t write off detectors completely, instead combining the software with his own instincts and understanding of his students.

“It’s the student’s knowledge of the topic and their ability to talk about it after handing in their piece that set off my alarms,” said Minsky. “When this happens, and I then check their writing, I have not had a student object when their submission comes up as being AI-generated.”

Rather than escalating the situation the way the Haishan Yang controversy did, Minsky opts for a teachable moment, explaining to his students that AI-generated material could contain plagiarized work or be undermined by hallucinations. If a student was flagged for a disability-related reason, Minsky works with them to find ways to avoid a similar false positive in the future.

Minsky’s workarounds might not be necessary if detection software makers moved to address these issues and reduce the risks on their own. For these reasons, Matthews says, “there will always need to be a human in the loop”: someone who fully understands the technology, can explain it, and can ensure the detection software meets regulatory expectations.

Rather than stopping at transparency, Bob Hutchins wonders whether the use of AI detectors suggests we’re approaching AI use in educational and professional spaces the wrong way. “We’re trying to detect cheating instead of asking how we should be reimagining assignments in an age where generative AI is a thing. If something can be done 100% by ChatGPT, then maybe we should be changing the task.”
