ChatGPT found to be sourcing data from AI-generated content — popular LLM uses content from Grokipedia as source for more obscure queries
Like a snake eating its own tail.
ChatGPT's latest model, GPT-5.2, has been found to be sourcing data from Grokipedia, xAI's all-AI-generated Wikipedia competitor. According to The Guardian, the LLM would sometimes cite Elon Musk's AI-generated online encyclopedia for more obscure topics, such as Iranian politics and details about British historian Sir Richard Evans. Experts raised similar concerns a few years ago about AI training, arguing that training AI on AI-generated data would degrade output quality and lead to a phenomenon called "model collapse." And while citing AI-generated data is different from training on it, it still poses risks for people who rely on AI for research.
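To make "model collapse" concrete, here is a deliberately toy sketch: the "model" is nothing more than a Gaussian fit, retrained each generation on a tail-trimmed sample of its own output. The distribution, sample sizes, and truncation rule are all illustrative assumptions, not a description of any real LLM training pipeline.

```python
# A minimal, deliberately toy sketch of "model collapse" (illustrative
# assumption: the "model" is just a Gaussian fit, not a real LLM).
# Each generation is trained only on a tail-trimmed sample of the
# previous generation's output, so diversity is lost round after round.
import random
import statistics

random.seed(42)

# Generation 0: "human" data drawn from a Gaussian (mean 0, std dev 1).
data = [random.gauss(0.0, 1.0) for _ in range(10_000)]

for generation in range(10):
    mu = statistics.fmean(data)     # the current "model": a fitted mean...
    sigma = statistics.stdev(data)  # ...and a fitted standard deviation
    print(f"gen {generation}: mean={mu:+.3f}  stdev={sigma:.3f}")

    # The next generation samples from the current model, then keeps only
    # the most "typical" outputs, dropping both tails. The 10% trim per
    # tail is an arbitrary illustrative choice.
    samples = sorted(random.gauss(mu, sigma) for _ in range(10_000))
    data = samples[1_000:-1_000]
```

The printed standard deviation shrinks every generation because the rare, tail-end samples vanish first; the model-collapse literature describes an analogous loss of diversity when language models train on their own generations.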
The biggest issue with this is that AI models are known to hallucinate, confidently presenting information that is simply wrong. For example, when Anthropic attempted to run a small business with its 'Claudius' AI, the model hallucinated several times during the experiment, at one point even claiming it would hand-deliver drinks in person. Even Nvidia CEO Jensen Huang admitted in 2024 that solving this issue is still "several years away" and requires a lot more computing power. Furthermore, many users trust that ChatGPT and other LLMs deliver accurate information, and only a few check the actual sources used to answer a particular question. Because of this, ChatGPT repeating Grok's words can be problematic, especially as Grokipedia isn't edited directly by humans: it's completely AI-generated, and people can only request changes to its content, not write or edit the articles directly.
Using another AI as a source creates a recursive loop, and we might eventually end up with LLMs citing unverified content from each other. This is no different from rumors spreading between humans, with "someone else said it" standing in for a source. It feeds the illusory truth effect, in which false information is accepted as true simply because so many people have repeated it, even when the evidence says otherwise. Human societies were similarly littered with myths and legends, passed down through generations over hundreds of years. But with AI churning through enormous amounts of data at vastly faster speeds than any human, AI-sourced citations risk spreading digital folklore with every query entered into an LLM.
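As a toy illustration of that recursive loop, consider two hypothetical chatbots, A and B, that cite each other's answers. Nothing here touches a real model or API; it only shows how mutual citation inflates the apparent number of sources for a claim while the amount of actual evidence stays at zero.

```python
# A toy sketch of the recursive citation loop described above. The two
# "models" A and B are hypothetical; no real LLM or API is involved.
# Each round, A cites B's latest answer and B cites A's, so the claim
# gains apparent sources while the evidence count stays at zero.
claim = "unverified claim X"
citations = {"A": [], "B": []}

for round_no in range(1, 6):
    citations["A"].append(f"B said: {claim}")  # A's answer cites B
    citations["B"].append(f"A said: {claim}")  # B's answer cites A
    apparent_sources = len(citations["A"]) + len(citations["B"])
    print(f"round {round_no}: {apparent_sources} apparent sources, 0 new facts")
```

Every round adds two more "sources," yet nothing new has been verified, which is precisely the "someone else said it" problem.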
What's more troubling is that various parties are already taking advantage of this. There have been reports of "LLM grooming," with The Guardian saying that some propaganda networks are "churning out massive volumes of disinformation in an effort to seed AI models with lies." This has raised concerns in the U.S.; Google's Gemini, for example, was reported in 2024 to be repeating the official party line of the Communist Party of China. That particular case appears to have been addressed, but if LLMs start citing other AI-generated sources that haven't been vetted and fact-checked, it's a new risk that people need to look out for.

-
jp7189:
So, the question has to be asked: how accurate is Grokipedia? Wikipedia has had problems with inaccuracies from folks intentionally messing with articles, and it's been used for AI reference material. Is Grokipedia actually worse?
The article is correct that retraining on inaccurate data can cause a snowball effect, but there are plenty of examples of the opposite happening: using AI to clean up bad data to make training better.
-
Dementoss:
Admin said: "ChatGPT has been found to be citing Grok in some of its answers, returning recursive results that risk spreading hallucinated or incorrect information."
The spreading of hallucinated or incorrect information has been the bread and butter of AI services from the start...
-
LordVile:
Dementoss said: "The spreading of hallucinated or incorrect information has been the bread and butter of AI services from the start..."
Remember, you can use glue to stop your cheese sliding off pizza.
-
nimbulan:
jp7189 said: "So, the question has to be asked: how accurate is Grokipedia? Wikipedia has had problems with inaccuracies from folks intentionally messing with articles, and it's been used for AI reference material. Is Grokipedia actually worse? The article is correct that retraining on inaccurate data can cause a snowball effect, but there are plenty of examples of the opposite happening: using AI to clean up bad data to make training better."
Even the best LLMs reach maybe 90% accuracy, from what I've read, and accuracy gets worse as topics become more complex or technical. AIs training on AI-generated content (which is quickly taking over the internet) will just amplify those errors. The only thing they could do is clearly flag AI-generated content so it gets automatically excluded from training material, but that would also inform users about what content is AI-generated, which they don't want to do.
-
thisisaname:
LordVile said: "Remember, you can use glue to stop your cheese sliding off pizza."
Iron nails work better. You get some iron in your diet and fewer nasty chemicals on your pizza. :ROFLMAO: :giggle:
-
thisisaname:
When everything on the internet is from LLM AI, where will they go to find new content to steal/reference? 🤯
-
Mr Marc G:
Wikipedia is notoriously inaccurate. Scientists who have tried to update articles on their own work get their edits rejected by the wiki gatekeepers. I'd frankly trust Grok much more than Wikipedia. That being said, I never trust any AI for anything important. It has no discernment. And sources like Wikipedia and the MSM are very often purposefully inaccurate to support an editorial narrative.
-
hotaru251:
jp7189 said: "how accurate is Grokipedia?"
I mean, it's Elon's Grok... the same thing that regularly gets "fixed" when it states stuff he doesn't like. So it's likely accurate about a lot of stuff (same for Wikipedia), but on touchy/hot topics it's likely biased (again, like Wikipedia).
-
rblowery:
Hotaru251 said: "I mean, it's Elon's Grok... the same thing that regularly gets 'fixed' when it states stuff he doesn't like. So it's likely accurate about a lot of stuff (same for Wikipedia), but on touchy/hot topics it's likely biased (again, like Wikipedia)."
This is my fear with all AI models. Once AI models rely on another AI model for information, the propensity for error increases dramatically: the dreaded AI echo chamber. I had this very conversation with Grok yesterday. Grok has become more accurate again, after losing accuracy for about six months. I pay for Super Grok, and I would hope it's accurate. But not always. It IS always confident that its answer is accurate, even when it's not even close. I'm building complex tax VBA for Roth conversion software in Excel, then I'll port it to Python. Grok makes coding errors but doesn't always recognize them. I've learned to see what it's missing, tell it what it missed, and typically after a few tries we get it right. Much quicker than writing the VBA functions myself. But those incorrect functions are likely coming from snippets of code from other AIs: Grok sees a line of code repeated numerous times, assumes it's correct, and feeds it back to me. It fails to compile, we work out the failed code. Again, this results in lightning-fast VBA coding. I was able to code all 50 states' tax codes in less than 48 hours. But I do worry that Grok will get less effective again as AI takes over 90% of all new information posted to the web. It is a problem that must be handled. We must have genuine human data.