AI Content Detection | Using AutoDetect in 2023

Was this blog post written by a real human or by AI? In this article, we’ll show you how to detect AI-generated content, and we’ll provide some best practices to consider when you’re creating content for your website. We’ll also run this article through AutoDetect, a toolset for detecting AI-generated content.

April 23, 2023
Written by
Nate Matherson
Reviewed by
Charles Purdy


In 2023, it has become increasingly difficult to detect AI-generated content and to know whether a piece of content is original.

Due to the rapid rise in popularity of tools like ChatGPT, on February 8, 2023, Google released updated guidelines on AI-generated content. In those guidelines, Google reminds content creators that AI shouldn’t be used as a way to manipulate search rankings or to spam but acknowledges that AI might be helpful during the content creation process for things like titles, outlines, and meta descriptions.

Whether you’re hiring freelance writers to build out a portfolio of content for your content marketing strategy or you’re using AI as a starting point in writing an article, it’s important to know whether a final piece of content is original (or original enough). The old-school content marketer in me has lingering paranoia, and the last thing I want to worry about is triggering the Google Webspam team to take action on one of my websites.

In this article, we’ll provide some best practices to consider when creating content, and we’ll provide a quick demo of AutoDetect, a toolset that can be used to check the originality of content. With AutoDetect, you can quickly see whether a piece of content was written by a human or generated by AI.

Using AutoDetect to Spot AI-Generated Content

At Positional, we’re building a modern toolset for content marketing and SEO teams. We’ve introduced a number of features so far, including AutoDetect. We built AutoDetect to meet a need we had. As content marketers ourselves, we’ve always worked with a large number of freelancers and agencies to help us create content. We wanted to be sure that the content we were paying for and publishing on our website was, in fact, original content. 

However, when we tried running AI-generated content through existing plagiarism detection tools, like Copyscape and Grammarly, we realized that these tools couldn’t actually tell us whether the content was created by an AI tool like ChatGPT.

AutoDetect by Positional automatically detects AI-generated content. Think Copyscape but for 2023. With AutoDetect, you can quickly find plagiarism issues, verify the authenticity of original work, and avoid Google penalties. We’ve trained our model against a number of popular AI models, including those from OpenAI. To be upfront: Our toolset is certainly not 100% accurate, although we are constantly improving it. False positives and false negatives will regularly occur.

Using AutoDetect is very easy. Simply copy and paste your article into our toolset:

When you click on Validate Content, AutoDetect’s proprietary technology checks your content and estimates the percentage likelihood that it was Human Created and the percentage likelihood that it was AI-Generated:

In the example above, we copied a 600-word article directly from ChatGPT into AutoDetect. And as you would expect, AutoDetect determined that the article was AI-generated. 

Our toolset is most accurate with pieces of content longer than 500 words. 
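If you check content in bulk, it can help to flag short pieces before you read too much into their scores. Here is a minimal sketch of that check. The 500-word threshold comes from this article; counting words by splitting on whitespace and the function names are assumptions made for illustration, not part of AutoDetect.

```python
MIN_WORDS = 500  # length guideline from the article; everything else is illustrative


def word_count(text: str) -> int:
    """Count whitespace-separated tokens as words (an assumed definition)."""
    return len(text.split())


def long_enough(text: str, min_words: int = MIN_WORDS) -> bool:
    """Return True when the text is longer than the length guideline."""
    return word_count(text) > min_words
```

A piece that fails this check can still be scored, but its result deserves extra caution.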

In addition to running our own detection model, we run the content through Copyscape’s API to see whether plagiarism is an issue as well. On the right-hand side of the interface, you’ll see the Copyscape report and flags if there are any issues. You can click on the flags to see the Copyscape report for each.

Our tool attempts to determine whether the content was created by a human or by AI; technically, this is a binary choice. But our tool is also very effective at spotting where human and AI-generated content may be combined. For example, this article about the Miami real estate market was written by a freelance writer:

In the above example, AutoDetect returns a mixed result. According to our testing, our tool is fairly accurate at telling you roughly how much of a piece of content could be considered AI-generated. In this example, the result suggests that a large share of the text was AI-generated. And when we checked the “Frequently Asked Questions” section of the text, we found that it was almost certainly created by AI:

In the next iteration of AutoDetect (in alpha testing now), we’ll allow you to get more granular. For example, AutoDetect will be able to show you separate scores at the sentence and paragraph level, so you can see which sections of text are highly likely to be AI-created and which sections aren’t a concern:
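To make the idea of paragraph-level scores concrete, here is a rough sketch of how per-paragraph results could be rolled up into a document-level share. It is not AutoDetect’s implementation. The `scorer` argument stands for any function that returns an AI likelihood between 0 and 1, and `placeholder_scorer` is a deliberately fake stand-in so the example runs; both are hypothetical.

```python
import re
from typing import Callable, List, Tuple


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def score_paragraphs(
    text: str, scorer: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Pair each paragraph with the score returned by the supplied scorer."""
    return [(p, scorer(p)) for p in split_paragraphs(text)]


def weighted_ai_share(scored: List[Tuple[str, float]]) -> float:
    """Average the paragraph scores, weighting each paragraph by its word count."""
    total_words = sum(len(p.split()) for p, _ in scored)
    if total_words == 0:
        return 0.0
    return sum(len(p.split()) * s for p, s in scored) / total_words


def placeholder_scorer(paragraph: str) -> float:
    """Stand-in only, NOT a real detector: treats 'Q:' paragraphs as suspicious."""
    return 0.9 if paragraph.startswith("Q:") else 0.1
```

Weighting by word count keeps one short, high-scoring paragraph from dominating the overall figure, which is one reasonable choice among several.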

In addition, we’re about to release a Chrome extension. With our Chrome extension, you’ll be able to check for AI-generated content quickly from wherever you’re currently working:

We have an API available for a select number of customers, too. If you’re interested in API access, we’d be happy to touch base with you on that.

We’re accepting a number of new customers to receive our private beta ahead of Positional’s public launch in late May or early June. You can join the waitlist by inputting your email address on our homepage.

How We Built AutoDetect (Accuracy)

At Positional, our engineering team has been working hard to build the most accurate toolset for detecting AI-generated content. As noted earlier, our toolset is most accurate for pieces of content longer than 500 words, and it is certainly not 100% accurate: False positives and false negatives will regularly occur.

Today, the two major types of language models are masked language models (MLMs) and generative pre-trained transformers (GPTs). MLMs attempt to predict what word is missing from a given piece of text, like filling in a Mad Lib story, so they can use the words on both sides of the gap. GPTs attempt to guess what word comes next in a phrase or string of words, using only the words that came before. This means that MLMs are well suited to understanding existing text but less suited to writing new text one word at a time. MLMs may not be the best writers, but they are excellent grammarians and translators.
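The difference between the two objectives is easier to see with a toy example. The sketch below builds the prediction tasks each kind of model would face for one sentence. It trains nothing, and the function names and the `[MASK]` placeholder are chosen here purely for illustration.

```python
from typing import List, Tuple

MASK = "[MASK]"  # illustrative placeholder for the hidden word


def masked_example(words: List[str], index: int) -> Tuple[List[str], str]:
    """MLM-style task: hide one word; context on both sides stays visible."""
    visible = words[:index] + [MASK] + words[index + 1:]
    return visible, words[index]


def next_word_examples(words: List[str]) -> List[Tuple[List[str], str]]:
    """GPT-style tasks: predict each word from only the words before it."""
    return [(words[:i], words[i]) for i in range(1, len(words))]


sentence = "the cat sat on the mat".split()

# MLM: (['the', 'cat', '[MASK]', 'on', 'the', 'mat'], 'sat')
print(masked_example(sentence, 2))

# GPT: (['the', 'cat'], 'sat')
print(next_word_examples(sentence)[1])
```

In the masked task, the words after the gap ("on the mat") are available as clues. In the next-word task, they are not.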

We are now working on an updated approach built on an MLM that was trained on more than 100 languages. We took about 500,000 samples written by humans and by GPT-2 through GPT-4 in 11 different languages and asked: Did a human or an AI write this?

We then gave the AI a real brain teaser. For example: If you were a human or an AI writing this, what missing words would you put in a sentence like “We have already stated that 2015 ______ will be one of the strongest vintages”?

The more accurately AutoDetect can guess what an AI would write and what a human would write, the more accurately we can predict whether a human or an AI wrote the text.
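One simple way to turn fill-in-the-blank guesses into a verdict is a likelihood comparison. The sketch below is a generic illustration, not a description of AutoDetect’s internals. It assumes you already have, for each blank, the probability that a human-style predictor and an AI-style predictor assigned to the word that actually appeared. Those probabilities, the equal prior odds, and the function name are all assumptions.

```python
import math
from typing import List, Tuple

EPSILON = 1e-12  # floor that keeps log() defined when a probability is zero


def ai_likelihood(blank_probs: List[Tuple[float, float]]) -> float:
    """Combine per-blank (p_human, p_ai) pairs into an overall AI likelihood.

    Sums the log-likelihood ratio over all blanks, then maps it to a value
    between 0 and 1 with a logistic function (assumes equal prior odds).
    """
    log_ratio = sum(
        math.log(max(p_ai, EPSILON)) - math.log(max(p_human, EPSILON))
        for p_human, p_ai in blank_probs
    )
    # Numerically stable logistic function.
    if log_ratio >= 0:
        return 1.0 / (1.0 + math.exp(-log_ratio))
    e = math.exp(log_ratio)
    return e / (1.0 + e)
```

When the AI-style predictor explains the observed words better than the human-style one, the result moves toward 1. When both explain them equally well, it stays at 0.5.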

As new AI models emerge, like GPT-5, we will continue working to improve AutoDetect’s support for new models. And over the long term, our goal is to build the most accurate toolset for AI detection, with support for all of the popular models you, your team, or your freelancers may have used to generate content.

4 Other Tools to Consider Using

While we are biased and feel that AutoDetect is the best toolset for detecting AI-generated content, there are a number of other tools that also do this. In case you’re curious, here are a few of them:

  • Writer provides a toolset for detecting AI-generated content. The tool is free to use, and you can check pieces of content up to 1,500 characters long.
  • Copyleaks provides a number of different AI-detection toolsets (for instance, ChatGPT and GPT-4). Copyleaks has a free offering as well as an enterprise offering. 
  • Crossplag offers a free toolset for detecting the originality of content. To use the tool, provide a minimum of 200 characters.
  • Originality.ai offers a number of tools for detecting AI-generated content, including models for ChatGPT, GPT-3, and GPT-3.5. Pricing is based on usage, and they charge $0.01 per 100 words of text. 

We will continue to update this list as new tools for content detection emerge.
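For usage-based pricing like the Originality.ai rate quoted above, a quick calculation shows what a batch of content would cost. This is a back-of-the-envelope sketch: the $0.01 per 100 words rate comes from this article, while the strictly proportional billing (no rounding up to the next 100 words) and the function name are assumptions.

```python
from decimal import Decimal

PRICE_PER_100_WORDS = Decimal("0.01")  # rate quoted in the article


def estimated_cost(word_count: int) -> Decimal:
    """Estimate the cost in dollars, assuming strictly proportional billing."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return (Decimal(word_count) / Decimal(100)) * PRICE_PER_100_WORDS


# A 1,500-word article works out to $0.15 under these assumptions.
print(estimated_cost(1500))
```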

Final Thoughts

As a content marketer, I believe that generative AI will be an important tool. In my own content creation process, I’ve used AI to help me write more engaging titles, as well as meta descriptions, and to help in the outlining process.

However, personally, I’ve been very hesitant to rely on AI to create entire articles. As Google has stated, you shouldn’t be using AI-generated content as a way to manipulate search rankings. In my view, Google is effectively telling you not to copy and paste entire pieces of AI-generated content onto your website. Google’s updated E-E-A-T guidelines ask you, the author or site owner, to make your content authoritative and to base it on your experience as an expert on your content’s topic.

With AutoDetect, you can assess whether the content you’re developing was written by humans or by AI. As a final example, I’ll run my draft of this blog post through AutoDetect. Here are the results:

As you may have guessed, I wrote this article entirely myself without the use of AI tools. AutoDetect returned a prediction that this article was Human Created, with a 99.43% likelihood. 

If you’re interested in using AutoDetect, we’re accepting a number of new customers for our private beta ahead of our public launch. To sign up for our waitlist, you can input your email address on our homepage.

You’re also more than welcome to email me directly with any questions or ideas for future improvements to our toolset. My email address is nate@positional.com!

Nate Matherson
Co-founder & CEO of Positional

Nate Matherson is the Co-founder & CEO of Positional. An experienced entrepreneur and technologist, he has founded multiple venture-backed companies and is a two-time Y Combinator alum. Throughout his career, Nate has built and scaled content marketing channels to hundreds of thousands of visitors per month for companies in both B2C (e.g., financial products and insurance) and B2B SaaS. Nate is also an active angel investor with investments in 45+ companies.
