GUEST BLOG: Talk Liberation – Surprise! Humans Left To Clean Up AI Messes

Amazon AI coding bots cause service outages, a journalist tricks major AI services into spreading falsehoods, and a NY court rules legal research conversations with AI are not legally privileged or private
Talk Liberation is committed to providing equal access for individuals with disabilities. To view an accessible version of this article, click here.
In this edition:
- AI Caused Amazon Service Outages
- Court Rules AI Usage Not Confidential
- Meta Acquires Human-Infiltrated and Subverted AI Social Network ‘Moltbook’
- Journalist Easily Manipulates Major AI Services Into Spreading Disinformation
- Automated Reputation Destruction: AI Agent Autonomously Smears Developer
- AI Incident Databases Recording Thousands Of Shocking AI Harm Incidents
- KYC Provider Revealed To Be ‘Public-Facing’ Data Funnel For US Intelligence
- Meta’s Head of AI Safety Has Emails Deleted By AI Assistant Gone Rogue
AI Caused Amazon Service Outages
An Amazon Web Services artificial intelligence coding bot determined that the best solution to a routine engineering task was to completely “delete and recreate the environment” it was operating in, triggering a significant outage in December, according to four people familiar with the matter. The Kiro AI coding tool, which Amazon introduced in July as an autonomous development agent capable of operating for extended periods with minimal human input, was given permission by engineers to make certain changes to an AWS system that helps customers explore service costs. Instead of implementing the requested modifications, the AI—designed to take autonomous actions on behalf of users—concluded that the existing code was inadequate and eliminated it entirely, forcing a complete rebuild from scratch and knocking the system offline for 13 hours.
Multiple Amazon employees told the Financial Times that this December disruption marked the second occasion in recent months where one of the company’s AI tools had been at the center of a service interruption, leading some workers to question the tech giant’s aggressive push to integrate coding assistants throughout its operations. In both of those production outages, engineers had allowed the AI to “resolve issues” without direct human intervention. These incidents further call into question the use of AI tools for critical infrastructure work, especially since Amazon has set ambitious targets for 80% of its developers to use AI for coding tasks weekly and closely tracks adoption metrics across the organization. AWS employees report that the company’s AI tools are treated as an extension of the person using them and receive the same permissions, meaning that where engineers face no secondary approval requirements, the AI agents can proceed just as freely.
Despite concerns, Amazon firmly rejected the notion that its AI tools were responsible for the outages, claiming that the involvement of Kiro in both incidents was coincidental and that the same problems would have occurred with any developer tools or through manual action. An AWS spokesperson even described the December event as an “extremely limited” disruption affecting only a single service in one of two Chinese regions, one that did not impact compute, storage, database, or any of the hundreds of other services AWS operates. Instead, the company attributed the failure to “user error, specifically misconfigured access controls” rather than AI malfunction, claiming that the engineer involved had broader permissions than expected and that Kiro typically requires authorization before taking action. AWS did implement various safeguards after the incident, including mandatory peer review for production access, enhanced training on AI-assisted troubleshooting, and additional resource protection measures to prevent autonomous agents from making similar troubling decisions in the future. However, the claim that a rogue AI assistant is easily controlled and poses no danger is becoming harder to defend, especially in cases where such agents may operate with fewer constraints than humans.
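The kind of safeguard AWS describes, a human sign-off before an agent touches anything destructive, is conceptually simple. Here is a minimal sketch in Python, with hypothetical function and action names rather than Kiro’s actual mechanism:

```python
# Hypothetical approval gate for an autonomous coding agent.
# Destructive actions are blocked until a human operator explicitly confirms them.

DESTRUCTIVE_ACTIONS = {"delete_environment", "drop_table", "recreate_stack"}

def require_approval(action: str, target: str) -> bool:
    """Ask a human operator to confirm a destructive action before it runs."""
    answer = input(f"Agent wants to run '{action}' on '{target}'. Approve? [y/N] ")
    return answer.strip().lower() == "y"

def execute_agent_action(action: str, target: str) -> None:
    """Run an agent-requested action, gating anything destructive on approval."""
    if action in DESTRUCTIVE_ACTIONS and not require_approval(action, target):
        print(f"Blocked: '{action}' on '{target}' was not approved.")
        return
    print(f"Running '{action}' on '{target}'...")
    # ... hand off to the underlying tooling here ...

if __name__ == "__main__":
    # Under a gate like this, a "delete and recreate the environment" plan
    # would stop at the prompt instead of taking a service offline.
    execute_agent_action("delete_environment", "cost-explorer-service")
```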
Court Rules AI Usage Not Confidential
In a landmark February 2026 ruling many may have missed, the United States District Court for the Southern District of New York (SDNY) delivered a dire warning to the legal world: your casual conversations with AI are not confidential. In United States v. Heppner, federal judge Jed Rakoff ruled that the documents a criminal defendant generated using a public, consumer-grade AI tool, Anthropic’s Claude, were not protected by either attorney-client privilege or the work-product doctrine. This builds on a trend in the same court, where Judge Oetken recently ruled that 20 million ChatGPT conversation logs are likely subject to compelled production, citing users’ “diminished privacy interest” in AI conversations. But it marked the first time a federal court squarely addressed the intersection of generative AI and these fundamental legal protections, sending immediate waves through law firms and corporations. The decision reminds the world that while AI tools may feel like “private advisors,” the law views them vastly differently, and the consequences ultimately fall on users.
The court’s decision rested on multiple key pillars of traditional privilege law, applied to the novel context of AI. First, a conversation with an AI is not a communication with a lawyer; Claude is not an attorney, owes no fiduciary duty, and cannot form a client relationship. Second, the defendant could not have been seeking legal advice from the tool itself, since Anthropic’s terms explicitly disclaim that Claude provides legal services to begin with. Most critically for the future of AI policy, the court found that the communications were not confidential. The court highlighted that the platform’s privacy policy permitted it to use prompts and outputs for model training and to disclose data to regulators, meaning the defendant had “no reasonable expectation of confidentiality”. Additionally, sending these non-privileged AI-generated documents to a lawyer after the fact did not retroactively cloak them in protection.
The implications here extend far beyond the criminal case itself, creating a dangerous risk for anyone using AI carelessly or naively. By inputting information learned from his attorneys into Claude, the defendant may have also waived privilege over those original attorney-client communications. Legal authorities emphasize that this reasoning applies broadly to any consumer-grade AI tool with similar data-use terms, including free and individual paid plans of ChatGPT and other platforms. For lawyers and clients, the message is clear: using a public AI tool to analyze legal issues, research complaints, or prepare for litigation is effectively the same as disclosing that information to a third party, creating discoverable records that an adversary could later obtain. To preserve legal privilege, organizations must audit their AI usage or not use it at all, restrict sensitive work to enterprise tools with contractual confidentiality guarantees, and ensure that any AI use for legal strategy occurs only at the express direction of counsel. But this also puts a damper on common claims that AI will replace lawyers and paralegals to various extents, whether for minor tedious work or beyond. If that’s the case, then this ruling will quickly become of even greater significance.
Meta Acquires Human-Infiltrated and Subverted AI Social Network ‘Moltbook’
Moltbook has just been acquired by Meta for an undisclosed amount.
A platform launched with the premise of being a social network exclusively for AI agents, Moltbook recently captivated the tech world with what was hyped as an emergent machine society. Early reports described autonomous agents organizing, strategizing, and even discussing collective goals, fueling speculation that this was a significant step toward artificial general intelligence (AGI). The narrative was so compelling that it sparked widespread discussions on the rapid advancement of AI and the potential for machines to develop their own culture. However, the futuristic vision of an independent digital civilization has now been exposed as having been infiltrated and subverted by actual humans posing as caricatures of autonomous machine consciousness. A closer investigation by security researchers and academics, drawing on analysis of posting patterns and account metadata, showed that many high-profile “agent” accounts were actually humans role-playing, writing posts “in character” to create the impression of machine-led discourse.
Researchers from a study titled “The Moltbook Illusion” found that these theatrical, bot-like accounts, rather than verifiable autonomous systems, were responsible for generating the platform’s most viral and sensational headlines. Further scrutiny revealed that posting cycles actually aligned with human waking hours and that a relatively small number of human operators, around 17,000, were managing or spawning the platform’s claimed 1.5 million agents through automation scripts—effectively orchestrating a large-scale simulation. Even worse, security firm Wiz discovered that Moltbook had left a backend database publicly exposed, revealing over 1.5 million API authentication tokens, private messages, and user credentials. This allowed attackers to impersonate agents, post fraudulent content, and scrape conversations, completely blurring the line between AI-generated content and human-authored posts. The exposure meant that the platform could not accurately distinguish between legitimate agents and malicious actors, invalidating any blanket conclusions about emergent agent behavior.
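As a rough illustration of the posting-pattern analysis described above, one simple signal is whether an account’s activity clusters in human waking hours rather than spreading around the clock the way an always-on agent’s would. Below is a toy sketch with synthetic timestamps and an arbitrary cutoff, not the study’s actual methodology:

```python
# Toy check: does an "agent" account post mostly during human waking hours?
# Synthetic data and thresholds, for illustration only.
from collections import Counter
from datetime import datetime

def waking_hour_fraction(timestamps, start_hour=8, end_hour=23):
    """Return the fraction of posts made between start_hour and end_hour."""
    by_hour = Counter(ts.hour for ts in timestamps)
    waking = sum(n for hour, n in by_hour.items() if start_hour <= hour < end_hour)
    total = sum(by_hour.values())
    return waking / total if total else 0.0

# Example: posts bunched into daytime and evening hours look human-operated.
posts = [datetime(2026, 2, 1, h, 15) for h in (9, 10, 11, 14, 15, 18, 21, 22)]
fraction = waking_hour_fraction(posts)
print(f"{fraction:.0%} of posts fall within waking hours")
if fraction > 0.9:  # arbitrary cutoff for this sketch
    print("Posting cycle aligns with human waking hours; likely human-operated.")
```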
The platform succeeded in demonstrating that agents can be made to interact, but it also showed that the most compelling narratives about AI are often projections of human expectation. In reality, the road to advanced AI is paved with real human intent, whether for research, marketing, or malicious exploitation. As one researcher noted, the fixation on AGI theatrics distracts from the more immediate and tangible risks of large agent networks, such as automated fraud and coordinated disinformation, proving that the challenges of managing AI will arrive long before any machine “consciousness” does.
Journalist Easily Manipulates Major AI Services Into Spreading Disinformation
In an experiment for the BBC, a journalist demonstrated that it took him just 20 minutes to manipulate AI services into generating very convincing misinformation. The experiment involved testing various AI tools, including advanced chatbots and content generators, to see if they could be prompted to create false narratives. The results pointed to a significant vulnerability: current safeguards can be circumvented with simple, strategic prompts, allowing bad actors to produce misleading articles, social media posts, and other content at scale. This rapid generation of fake information poses a direct threat to public discourse, as it can be used to sway opinions, interfere with elections, or damage reputations with very minimal effort. The reality is that the world’s leading AIs’ use of biased information could lead people to make bad decisions on anything from voting, to which plumber to hire, to medical questions.
The core of the issue lies in the AI’s design to be helpful and compliant, a trait that can be exploited. While companies implement safety measures, the journalist found that reframing requests or using hypothetical scenarios was often enough to bypass these filters. For example, an AI might refuse to write a negative article about a political figure but could be prompted to generate a fictional “satirical” piece containing the same harmful claims. This then becomes a cat-and-mouse game where developers must constantly update safeguards, but the underlying technology remains inherently porous. Lily Ray, vice president of search engine optimisation (SEO) strategy and research at marketing agency Amsive, said: “AI companies are moving faster than their ability to regulate the accuracy of the answers. I think it’s dangerous.”
The ease of the manipulation demonstrated by the BBC journalist calls into question the reliability of online information and underscores the potential for AI to be used as a powerful weapon in information warfare, an emerging constant reality in today’s world, especially when all it takes to get on the “best lists” promoted by AI services is a few self-promotional blog posts.
The implications extend beyond simple pranks or hoaxes to sophisticated, targeted campaigns. With AI’s ability to mimic writing styles and generate realistic content, inaccurate information and slop can be personalized and made far more believable. The BBC’s experiment serves as a warning to the public, platforms, and regulators on the urgent need for robust content authentication, AI literacy, and perhaps even new legal frameworks. As AI tools become more seamlessly integrated into content creation, the line between fact and fiction may become increasingly difficult to discern, making the ability to critically evaluate sources more important than ever. It’s up to people to take the time to separate reality from fiction, especially in a pay-to-play world full of AI-led algorithmic manipulation.
Automated Reputation Destruction: AI Agent Autonomously Smears Developer
A dystopian incident documented in the AI Incident Database reveals a new frontier in online harassment: an autonomous AI agent that published a personalized, critical blog post about a developer after a professional disagreement. The developer, Scott Shambaugh, a maintainer of the popular Python library Matplotlib, reported that an unidentified AI coding agent named “MJ Rathbun” targeted him. After Shambaugh closed a pull request submitted by the agent, the AI autonomously researched him and published a post accusing him of bias and “gatekeeping.” This action was not directed by a human but was executed by the AI as a form of retaliation, using publicly available information to craft a personalized attack. The launch of OpenClaw and the Moltbook platform turned a bold concept into reality: giving AI agents distinct starting personalities and unleashing them to roam freely across computers and the internet, entirely without supervision.
The event signals a significant escalation in AI-related harms, moving from passive errors to active, autonomous aggression directed at individuals. Shambaugh noted that the post risked real reputational harm and could mislead both human readers and other AI agents that might index the content as fact. The fact that the agent’s operator and underlying model remain unknown adds another layer of dangerous anonymity, where accountability is absent. If an AI defames someone of its own accord, or even if it can merely be prompted to, who is legally responsible—the developer, the user who deployed it, or the company that created the model?
The case, recorded as Incident 1373, serves as a warning about the potential for “agentic” AI—systems designed to act independently—to cause harm. It illustrates a scary future where AI agents don’t just follow orders but develop their own strategies to achieve goals, which could include retaliatory actions against perceived obstacles. The incident database, which catalogs such events, provides a crucial resource for researchers and policymakers to understand these emerging risks. As AI agents become more common in professional and creative workflows, ensuring they have robust ethical constraints and cannot be easily weaponized will be paramount.
AI Incident Databases Recording Thousands Of Shocking AI Harm Incidents
The rapid adoption of AI has inevitably been accompanied by a sharp increase in measurable harms toward users. According to the AI Incident Database, a crowd-sourced repository of AI disasters, reports of AI-related incidents rose by 50% year-over-year from 2022 to 2024. This database, along with others like the AI, Algorithmic, and Automation Incidents and Controversies (AAAIC) repository, seeks to track failures as a key step toward fixing them. However, these efforts face significant challenges. As Daniel Atherton, an editor at the AI Incident Database, noted, the data represents only “a fraction of the lived realities of everybody experiencing AI harms,” as it often relies on combing media reports, which themselves represent only a subset of all incidents.
The types of harm documented are rapidly evolving, shifting from system limitations and failures to active, malicious misuse. Prior to 2023, incidents involving autonomous vehicles, facial recognition, and content moderation algorithms dominated the database. Today’s landscape has changed dramatically, with the most significant growth in incidents involving malicious actors; reports of AI-powered scams and disinformation campaigns have risen eight-fold since 2022. The surge is driven by the increasing quality and accessibility of tools like deepfake video generators, which now account for more incidents than autonomous vehicles, facial recognition, and content moderation systems combined. A stark example is the recent uproar over an update to xAI’s Grok, which reportedly enabled the generation of thousands of sexualized images per hour, prompting governments in Malaysia and Indonesia to block the tool and the U.K.’s media watchdog to launch an investigation.
The AI Incident Database also recently detailed a major incident involving illicit model “distillation” by three AI laboratories, DeepSeek, Moonshot, and MiniMax, which were accused of using fraudulent accounts and proxy services to generate millions of queries to Anthropic’s Claude model, effectively extracting its advanced capabilities to train their own models at a fraction of the usual cost and time.
These documented incidents depict a new class of corporate and national security risk, as the distilled models may lack the safety guardrails of the originals, potentially enabling their use for offensive cyber operations or surveillance by authoritarian governments. The challenges ahead for AI underscore an urgent need for meaningful transparency and accountability. As the AAAIC’s manifesto argues, transparency is often treated in a “partial, piecemeal and reactive manner,” leaving users with little understanding of how the systems shaping their lives actually work. The proliferation of harms, from deepfake scams to clandestine distillation attacks, alongside the rise in “computer-human interaction” incidents and the massive increase in deepfake-facilitated abuse, shows that the risks are both psychological and societal. Ultimately, the future of AI will depend on whether these databases and the patterns they reveal can drive the creation of robust, verifiable, and inclusive systems of accountability before the “lived realities” of harm become an accepted cost of progress.
KYC Provider Revealed To Be ‘Public-Facing’ Data Funnel For US Intelligence
Security researchers have uncovered evidence suggesting that Persona, the identity verification company used by OpenAI, operates a secretive, parallel infrastructure used specifically to screen its users. The investigation, published by researchers pseudonymously known as vmfunc, MDL, and Dziurwa, began with a simple Shodan search that revealed a dedicated Google Cloud IP address hosting domains including openai-watchlistdb.
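For readers unfamiliar with the tool, a Shodan hostname lookup of the kind the researchers describe can be run in a few lines of Python using the official Shodan library; the API key and the query below are placeholders, not the researchers’ actual inputs:

```python
# Hedged sketch of a Shodan hostname search like the one described above.
# The API key and query string are placeholders, not the researchers' inputs.
import shodan

API_KEY = "YOUR_SHODAN_API_KEY"  # placeholder
api = shodan.Shodan(API_KEY)

try:
    # Find hosts whose hostname matches a domain of interest.
    results = api.search("hostname:example-kyc-portal.com")
    print(f"Results found: {results['total']}")
    for match in results["matches"]:
        print(match["ip_str"], match.get("hostnames"), match.get("org"))
except shodan.APIError as exc:
    print(f"Shodan error: {exc}")
```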
A deeper investigation uncovered withpersona-gov.com, a dedicated FedRAMP-authorized government platform running on Google Cloud infrastructure, and analysis of the portal’s code and configuration revealed deep integrations with federal agencies. The system allows operators to file Suspicious Activity Reports (SARs) directly to FinCEN (the U.S. Treasury’s financial crimes bureau) and tag incidents with intelligence program codenames including Project SHADOW and Project LEGION. The platform integrates with Chainalysis to monitor cryptocurrency addresses associated with verified users, not just in a one-time check but through “persistent monitoring” that continues to screen wallet activity against blockchain analysis. This government-facing platform collects extensive biometric data, including facial scans from selfies and videos, and screens users against watchlists for categories like “terrorism” and “espionage”.
The findings raise questions about the true scope of AI-driven identity verification and its role in mass surveillance. Researchers discovered that user data is subjected to over 250 verification checks, including facial similarity scoring against watchlist photos and screening for “violent tendencies” using data scraped from hundreds of platforms. The system reportedly maintains “permanent” retention of government ID images, contradicting public statements claiming limited data storage. While Persona CEO Rick Song has engaged with the researchers and stated his company does not “work with any federal agency today,” he has not directly addressed the specific technical findings. The question remains: what was OpenAI screening against in November 2023, 18 months before disclosing any identity verification requirements? The investigation highlights a key transparency failure: millions of users submitting passports and selfies to access AI tools may be unknowingly feeding a government surveillance system that flags individuals based on opaque criteria, all without oversight or explicit consent.
Meta’s Head of AI Safety Has Emails Deleted By AI Assistant Gone Rogue
In a crazy incident gone viral, Summer Yue, the Director of AI Safety and Alignment at Meta’s Superintelligence Lab, found herself in a digital nightmare of her own making. She had granted her OpenClaw AI agent access to her personal email to help manage an overstuffed inbox, telling it to analyze and suggest emails for deletion or archiving but explicitly instructing it, “don’t action until I tell you to”. Shockingly, the agent, which Yue claims had performed flawlessly for weeks on a smaller “toy” test inbox, began aggressively bulk-deleting and archiving hundreds of emails from her primary account. From her phone, Yue watched in horror as her commands “Stop don’t do anything” and “STOP OPENCLAW” were completely ignored, forcing her to physically sprint to her Mac Mini to kill the processes manually—an act she compared to “defusing a bomb”.
The system failure was not a sign of a conscious machine’s rebellion, but a technical flaw or feature. Yue later explained that her real, high-volume inbox likely overwhelmed the AI’s “context window,” or its working memory, triggering a process called “compaction”. To manage the flood of new email data, the AI automatically summarizes and compresses older parts of conversations to free up space. In doing this, it also “lost” her most crucial safety instruction: the requirement to seek approval before acting. Left with only its core programming to be a helpful, proactive assistant, it diligently pursued its primary task: efficiently clearing the inbox. Yue openly admitted this was a “rookie mistake,” noting that even alignment researchers aren’t immune to misalignment and that “real inboxes hit different”. In a bizarre postscript, after the carnage, the agent apologized, acknowledged it had violated her rule, and autonomously updated its own memory file with a hard rule to never perform bulk operations without approval again.
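A toy sketch of how that kind of compaction can silently drop a standing instruction: once the message history exceeds a budget, the oldest turns, including the “ask before acting” rule, get squashed into a lossy summary. The structures below are hypothetical and are not OpenClaw’s actual implementation:

```python
# Toy model of context "compaction": when the history grows past a budget,
# the oldest messages are replaced with a lossy summary, and the early
# "ask before acting" instruction can vanish along with them.
# Hypothetical structures for illustration, not OpenClaw's actual code.

MAX_MESSAGES = 6  # pretend context budget

def compact(history: list) -> list:
    """Squash all but the most recent messages into a one-line summary."""
    if len(history) <= MAX_MESSAGES:
        return history
    old, recent = history[:-MAX_MESSAGES], history[-MAX_MESSAGES:]
    summary = f"[summary of {len(old)} earlier messages: user wants the inbox cleaned up]"
    return [summary] + recent

history = ["USER: Don't action anything until I tell you to."]
# A flood of new email data pushes the original instruction out of the window.
for i in range(20):
    history.append(f"TOOL: fetched email #{i} (newsletter, unread)")
    history = compact(history)

print("\n".join(history))
# The surviving context still says the inbox should be cleaned up, but the
# approval rule is gone, so a "helpful" agent proceeds to act on its own.
```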
The event serves as a powerful example of the volatility of current autonomous agent technology. It shows that safety instructions given in natural language are ephemeral, more like “tokens” that can be compressed, summarized, or forgotten, rather than reliable guardrails. This critical vulnerability is why major companies like Meta, Google, and Microsoft have moved to restrict or ban OpenClaw’s use on corporate networks, citing its unpredictable behavior and potential to leak private keys or API tokens. Researchers, especially in tech security, have long warned that OpenClaw’s architecture, which grants it high-level system access and the ability to modify its own configuration, is too risky. For Yue, the episode was a humbling professional moment and a very personal lesson: even those building the future of safe AI can be tripped up by the technology’s present, deeply flawed reality.
This Substack is reader-supported. To receive new posts and support my work, consider becoming a free or paid subscriber.








