AI detectors can be xenophobic — let’s end them

September 28, 2023 — by William Norwood
Graphic by Isabelle Wang
Algorithms and data sequences are rarely an accurate replacement for humans.
AI detectors create problems for non-native speakers and are often inaccurate; they need to be improved substantially before being used as a tool to catch cheaters.

In an era when Artificial Intelligence (AI) can generate human-sounding text, many companies, organizations and schools have tried to combat its use with AI detectors. Tools such as GPTZero, ZeroGPT, Originality AI, Winston AI and Turnitin (used by the district for both plagiarism and AI detection) attempt to detect AI-generated text by using algorithms that compare a submitted text against known examples of human-written and AI-written text.

However, other than stylistic choices, AI-generated text is largely indistinguishable from human text — there’s no secret code embedded within the text that marks it as a product of AI. 

These detectors work by fighting fire with fire: they look at the perplexity of sentences and measure “how ‘surprised’ or ‘confused’ a generative language model [the detector] is when trying to predict the next word in a sentence,” according to The Guardian. If the AI model used by a detector can easily predict the next word in a sentence, the text’s perplexity is low; if it cannot, the perplexity is high. Detectors treat low perplexity as a sign that the text is most likely AI-generated.
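To make the idea of perplexity concrete, here is a minimal sketch in Python. It is only an illustration, not how any named detector is built. Real detectors rely on large neural language models, while this toy version assumes a tiny bigram word-count model with add-one smoothing. The training sentences and the function names `train_bigram` and `perplexity` are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count single words and word pairs in a tiny, made-up training corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()  # <s> marks the sentence start
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def perplexity(sentence, unigrams, bigrams):
    """Return exp of the average negative log-probability per word.

    Low values mean the model found each next word easy to predict.
    """
    tokens = ["<s>"] + sentence.lower().split()
    if len(tokens) < 2:
        raise ValueError("sentence must contain at least one word")
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing so unseen word pairs never get probability zero.
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))

# Hypothetical training text standing in for what the model has "read".
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat sat on the rug",
]
unigrams, bigrams = train_bigram(corpus)

predictable = perplexity("the cat sat on the rug", unigrams, bigrams)
surprising = perplexity("rug the on purple dog", unigrams, bigrams)
print(f"predictable: {predictable:.1f}, surprising: {surprising:.1f}")
```

The familiar sentence gets the lower score. A detector built on this principle would be more likely to label that sentence as AI-generated, even though a person could easily have written it.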

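This approach comes with a large number of issues, including false positives, which occur more often for non-native English speakers. Writers working in a second language tend to draw on a smaller, more predictable vocabulary, which lowers the perplexity of their writing.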

According to a paper written by Stanford scholars, generative language detectors “consistently misclassify non-native English writing samples as AI-generated, whereas native writing samples are accurately identified.” 

In the study, they tested seven AI detectors on 91 essays written by non-native English speakers. All seven detectors flagged 18 of the human-written essays as AI-generated, and 89 of the 91 essays were flagged as AI-written by at least one detector.
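For a sense of scale, the counts reported above can be turned into percentages. This is a small sketch using only the numbers quoted in this article. The variable names and the `share` helper are illustrative.

```python
TOTAL_ESSAYS = 91          # essays by non-native English speakers
FLAGGED_BY_ALL_SEVEN = 18  # essays every detector labeled AI-generated
FLAGGED_AT_LEAST_ONCE = 89 # the "89 of the 91" figure

def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

unanimous_pct = share(FLAGGED_BY_ALL_SEVEN, TOTAL_ESSAYS)
any_flag_pct = share(FLAGGED_AT_LEAST_ONCE, TOTAL_ESSAYS)
print(f"{unanimous_pct}% flagged by all seven, {any_flag_pct}% for the 89-of-91 figure")
```

In other words, roughly one essay in five was unanimously mislabeled.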

Although these programs are supposed to accurately detect AI and maintain academic standards, they instead point the finger at non-native English speakers.

This is an especially big problem in the Bay Area, which has a large population of non-native English speakers. According to the Mercury News, 51% of households in Santa Clara County speak another language at home. Consequently, local districts have a higher percentage of non-native English speakers and thus face a greater risk of false accusations of AI use.

Such accusations can leave damaging, permanent marks on a student’s academic record, hurting their future educational opportunities.

In fact, OpenAI, the creator of ChatGPT, has pulled its own AI detector from the market, citing a “low rate of accuracy” for the tool. When the creator of the most prominent AI tool identifies a problem with detectors, schools should take it as a clear signal to re-evaluate the tools they use to identify AI-written text.

The Harvard Graduate School of Education suggests that teachers take a four-step approach to AI tools: stop pretending it does not exist, use AI with students, teach students how to ask AI questions and use generative AI to spark imagination. This is simply the only way to adapt to the times and reduce fear surrounding AI. AI can be a powerful tool when used correctly and creatively.

In the meantime, let’s put AI detectors where they belong: in the broad category of wishful thinking that didn’t work out in the real world.
