AI Essentials: Do AI Writing Detectors Work?

Daniel A. Lopez
3 min read · Jan 22, 2024

Generated via DALL-E

Do AI writing detectors work?

So much of the mainstream conversation about artificial intelligence in education over the last twelve months has orbited academic integrity and the emergence of AI writing detectors.

In the absence of concrete strategies and practices for adapting to a world with AI, many educators have turned to AI writing detectors as a response to AI chatbots, which allow students to generate an essay or writing sample in seconds.

In today’s AI Essentials, we explore the world of AI writing detection so you have some context on the experience and impact of detectors as you continue on your artificial intelligence journey.

I will take us on this journey by testing two of my personal writing samples, two AI-generated samples, and one mixed AI/human-written piece in two popular AI detectors: Undetectable AI and GPTZero.

As a TL;DR for my non-podcast friends, here are four reactions that came up for me after the experiment:

  • Overall, the AI detectors were not totally inaccurate, but each showed some uncertainty in places. Undetectable AI seemed to really struggle with AI-written personal narratives, whereas GPTZero performed well across all samples except the academic blog piece, for which it gave a highly contradictory assessment.
  • I only generated samples using ChatGPT, and I know GPTZero was originally built with the intention of detecting ChatGPT-generated text. It makes me wonder what the results would have been if I had also tried samples from Bard and Claude.
  • I created anomalies with each tool in a six-minute experiment. Imagine if I had actually spent an hour doing this with the intention of making the text undetectable. You will also notice that the Undetectable AI interface I was showing claims that, in addition to detecting AI-written text, it can make AI-generated content undetectable by other AI detectors; in other words, it can make the text sound human.
  • I am not taking any of these detections at face value. If you have a strong suspicion that a student is submitting an AI-generated piece, it might be helpful to test a few detectors alongside thoughtful conversation (or you could just go here first), but I do think you need to weigh this against the trade-off: you could destroy any trust or relationship you have built with the student in the process.

For my AI enthusiasts who listened, what do you think about AI detectors? What advice could you share with educators who suspect a student of using AI to generate an essay?

Check out the full episode here. You can also find the study on detectors I referenced in the episode here. Join the conversation at TheAiEducationConversation.com

Written by Daniel A. Lopez

AI Education Practitioner | Host of The AI Education Conversation | College Access Leader
