AI tool OpenClaw wipes the inbox of Meta's AI Alignment…


The hype around OpenClaw is at a fever pitch. The open-source AI agent, which can be wired to a number of services, is indirectly responsible for shortages of Mac Mini computers as more techies jump on the bandwagon and let the bot loose on their accounts. As with any LLM-driven tool, though, things can and will go seriously wrong at some point, as Summer Yue, Meta Superintelligence Labs' Director of Alignment, found out the hard way.

Like many other enthusiasts, Yue had a setup with a Mac Mini and OpenClaw running on it for various tasks. In the middle of having Claw archive old email from some accounts, she also asked it to "check this inbox too and suggest what you would archive or delete, don't action until I tell you to." (sic; emphasis ours). Claw eventually started wiping that entire inbox, which happened to be her personal email.

Yue twice ordered Claw to stop, using different language each time, and eventually resorted to running to her Mac Mini to kill all the relevant processes. In the aftermath, she asked Claw what happened, given that she had issued specific orders not to take action before approval. The bot was contrite, stating that she had the "right to be upset," describing what happened, and saying it would add her request as a permanent rule.

The "MEMORY.md" file that the bot then edited is one of several safeguards that can be put in place, as data stored there effectively survives compaction. Other commenters suggested multiple workarounds, some arguably hiding the problem, like increasing the context window or limiting the blast radius, and others doubling down on the concept, like adding a second OpenClaw to monitor the first one.
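To illustrate what "limiting the blast radius" could look like in practice, here is a minimal Python sketch of an approval gate that sits between an agent and its mail tools. It is not OpenClaw's API: the `ApprovalGate` class, the action names, and the `max_batch` cap are all hypothetical, chosen only to show the idea of keeping a "don't action until I tell you to" rule in ordinary program state, where it cannot be lost when the model's context is compacted.

```python
from dataclasses import dataclass, field

# Hypothetical action names for illustration; not part of OpenClaw.
DESTRUCTIVE = {"archive", "delete"}


@dataclass
class ApprovalGate:
    """Keeps approval state in program memory, outside the model's context."""
    approved: set = field(default_factory=set)  # inboxes the user has cleared
    stopped: bool = False                       # latched once the user says "stop"
    max_batch: int = 25                         # assumed cap on items per call

    def handle_user_message(self, text: str) -> None:
        # A bare "stop" is matched in code rather than interpreted by the model.
        if text.strip().lower() == "stop":
            self.stopped = True

    def approve(self, inbox: str) -> None:
        self.approved.add(inbox)

    def allow(self, action: str, inbox: str, count: int) -> bool:
        if action not in DESTRUCTIVE:
            return True    # read-only actions pass through
        if self.stopped:
            return False   # nothing destructive after a stop
        if inbox not in self.approved:
            return False   # no explicit go-ahead for this inbox yet
        return count <= self.max_batch
```

In this sketch, a request to delete from an unapproved inbox is refused no matter what the model believes it was told, and even an approved inbox can only lose `max_batch` messages per call. A wrapper like this narrows the damage an agent can do; it does not make the agent itself any more reliable.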

Regardless, many readers reminded Yue of the perils of letting a non-deterministic machine like an LLM loose on important data, both because of its inherent limitations and because an email in her inbox may contain a prompt injection that OpenClaw will unwittingly read, giving an attacker access to all her linked services. They also told her that a plain "stop" message is hard-coded into OpenClaw. For her part, Yue had the guts to admit it was a rookie mistake made out of complacency. We've all been there.
