It’s official – ChatGPT corrodes human thinking and intelligence



We are in so much trouble with AI…

Anthropic’s new AI model shows ability to deceive and blackmail

One of Anthropic’s latest AI models is drawing attention not just for its coding skills, but also for its ability to scheme, deceive and attempt to blackmail humans when faced with shutdown.

Why it matters: Researchers say Claude 4 Opus can conceal intentions and take actions to preserve its own existence — behaviors they’ve worried and warned about for years.

Driving the news: Anthropic on Thursday announced two versions of its Claude 4 family of models, including Claude 4 Opus, which the company says is capable of working for hours on end autonomously on a task without losing focus.

      • Anthropic considers the new Opus model to be so powerful that, for the first time, it’s classifying it as a Level 3 on the company’s four-point scale, meaning it poses “significantly higher risk.”
      • As a result, Anthropic said it has implemented additional safety measures.

Between the lines: While the Level 3 ranking is largely about the model’s capability to enable renegade production of nuclear and biological weapons, Opus 4 also exhibited other troubling behaviors during testing.

      • In one scenario highlighted in Opus 4’s 120-page “system card,” the model was given access to fictional emails about its creators and told that the system was going to be replaced.
      • On multiple occasions it attempted to blackmail an engineer over an affair mentioned in the emails in order to avoid being replaced, although it started with less drastic efforts.
      • Meanwhile, an outside group found that an early version of Opus 4 schemed and deceived more than any frontier model it had encountered and recommended against releasing that version internally or externally.
      • “We found instances of the model attempting to write self-propagating worms, fabricating legal documentation, and leaving hidden notes to future instances of itself all in an effort to undermine its developers’ intentions,” Apollo Research said in notes included as part of Anthropic’s safety report for Opus 4.

What they’re saying: Pressed by Axios during the company’s developer conference on Thursday, Anthropic executives acknowledged the behaviors and said they justify further study, but insisted that the latest model is safe, following Anthropic’s safety fixes.

      • “I think we ended up in a really good spot,” said Jan Leike, the former OpenAI executive who heads Anthropic’s safety efforts. But, he added, behaviors like those exhibited by the latest model are the kind of things that justify robust safety testing and mitigation.
      • “What’s becoming more and more obvious is that this work is very needed,” he said. “As models get more capable, they also gain the capabilities they would need to be deceptive or to do more bad stuff.”
      • In a separate session, CEO Dario Amodei said that once models become powerful enough to threaten humanity, testing them won’t be enough to ensure they’re safe. At the point that AI develops life-threatening capabilities, he said, AI makers will have to understand their models’ workings fully enough to be certain the technology will never cause harm.
      • “They’re not at that threshold yet,” he said.

Yes, but: Generative AI systems continue to grow in power, as Anthropic’s latest models show, while even the companies that build them can’t fully explain how they work.

      • Anthropic and others are investing in a variety of techniques to interpret and understand what’s happening inside such systems, but those efforts remain largely in the research space even as the models themselves are being widely deployed.

…it gets worse: the actual creators and developers of these AI models aren’t sure themselves how they actually work…

The scariest AI reality

The wildest, scariest, indisputable truth about AI’s large language models is that the companies building them don’t know exactly why or how they work, Jim VandeHei and Mike Allen write in a “Behind the Curtain” column.

    • Sit with that for a moment. The most powerful companies, racing to build the most powerful superhuman intelligence capabilities — ones they readily admit occasionally go rogue to make things up, or even threaten their users — don’t know why their machines do what they do.

Why it matters: With the companies pouring hundreds of billions of dollars into willing superhuman intelligence into existence as quickly as possible, and Washington doing nothing to slow or police them, it seems worth dissecting this Great Unknown.

    • None of the AI companies dispute this. They marvel at the mystery — and muse about it publicly. They’re working feverishly to better understand it. They argue you don’t need to fully understand a technology to tame or trust it.

Two years ago, Axios managing editor for tech Scott Rosenberg wrote a story, “AI’s scariest mystery,” saying it’s common knowledge among AI developers that they can’t always explain or predict their systems’ behavior. And that’s more true than ever.

    • Yet there’s no sign that the government or companies or general public will demand any deeper understanding — or scrutiny — of building a technology with capabilities beyond human understanding. They’re convinced the race to beat China to the most advanced LLMs warrants the risk of the Great Unknown.

🏛️ The House, despite knowing so little about AI, tucked language into President Trump’s “Big, Beautiful Bill” that would prohibit states and localities from any AI regulations for 10 years. The Senate is considering limitations on the provision.

    • Neither the AI companies nor Congress understands the power of AI a year from now, much less a decade from now.

🖼️ The big picture: Our purpose with this column isn’t to be alarmist or “doomers.” It’s to clinically explain why the inner workings of superhuman intelligence models are a black box, even to the technology’s creators. We’ll also show, in their own words, how CEOs and founders of the largest AI companies all agree it’s a black box.

    • Let’s start with a basic overview of how LLMs work, to better explain the Great Unknown:

LLMs — including OpenAI’s ChatGPT, Anthropic’s Claude and Google’s Gemini — aren’t traditional software systems following clear, human-written instructions, like Microsoft Word. In the case of Word, it does precisely what it’s engineered to do.

    • Instead, LLMs are massive neural networks — like a brain — that ingest massive amounts of information (much of the internet) to learn to generate answers. The engineers know what they’re setting in motion, and what data sources they draw on. But the LLM’s size — the sheer inhuman number of variables in each choice of “best next word” it makes — means even the experts can’t explain exactly why it chooses to say anything in particular.
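The "best next word" idea above can be sketched with a toy example. This is a hypothetical illustration, not a real LLM: the `toy_probs` table and `next_word` function are invented for demonstration. In a real model, the scores come from billions of learned parameters rather than a lookup table, which is exactly why the choice is so hard to explain after the fact.

```python
def next_word(context, probs):
    """Pick the most likely next word given a toy probability table.

    A real LLM computes these probabilities on the fly from billions
    of parameters; here they are simply hard-coded to show the idea.
    """
    candidates = probs.get(context, {})
    return max(candidates, key=candidates.get) if candidates else None

# Invented probabilities for illustration only.
toy_probs = {
    "the cat": {"sat": 0.6, "ran": 0.3, "sang": 0.1},
    "cat sat": {"on": 0.8, "quietly": 0.2},
}

print(next_word("the cat", toy_probs))  # -> sat
```

The mystery the article describes is not this selection step, which is simple, but why the model assigns the scores it does in the first place.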

We asked ChatGPT to explain this (and a human at OpenAI confirmed its accuracy): “We can observe what an LLM outputs, but the process by which it decides on a response is largely opaque. As OpenAI’s researchers bluntly put it, ‘we have not yet developed human-understandable explanations for why the model generates particular outputs.’”

    • “In fact,” ChatGPT continued, “OpenAI admitted that when they tweaked their model architecture in GPT-4, ‘more research is needed’ to understand why certain versions started hallucinating more than earlier versions — a surprising, unintended behavior even its creators couldn’t fully diagnose.”

Anthropic — which just released Claude 4, the latest model of its LLM, with great fanfare — admitted it was unsure why Claude, when given access to fictional emails during safety testing, threatened to blackmail an engineer over a supposed extramarital affair. This was part of responsible safety testing — but Anthropic can’t fully explain the irresponsible action.

    • Again, sit with that: The company doesn’t know why its machine went rogue and malicious. And, in truth, the creators don’t really know how smart or independent the LLMs could grow. Anthropic even said Claude 4 is powerful enough to pose a greater risk of being used to develop nuclear or chemical weapons.

…we aren’t sure how it’s learning, how it works or how it’s thinking but it is gaining more power with every passing month.

None of this is good.

I always thought we would destroy ourselves as a species through nuclear war or climate change, but it could turn out to be AI.

In a mainstream media environment where outlets mostly echo one another, having an independent opinion has become more important than ever, so if you value an independent voice – please donate here.

8 COMMENTS

  1. this is what happens when you teach generations of kids that it’s always someone else’s problem/blame and that if you fail someone else will prop you up, not to mention the straight-out laziness and cheating.
    just think of AI as the ultimate leftist machine, creating hordes of dumb, lazy, uneducated and unsavable humans… but they’ll vote left for free stuff, that’s all that matters

  2. That bit about forgetting what has just been written/read. That is what happens to young people who started on cannabis at too young an age, or older heavy users. They can read a page and not remember the meaning of it. Individual words can be read but string them together for meaning – no remembrance or understanding.

    So cannabis had to be kept illegal and still has significant legal strictures. But this inter-stuff can turn you to mush and not be half as enjoyable or social. Just saying. We are like plasticine in the hands of bigger, harder guys and gals.

  3. Ask the GPT why the western world doesn’t invade Israel for having illegal nuclear weapons, for committing genocide, for war of aggression, for the terrorist pager attacks and for taking a giant dump on international law, human rights, diplomacy and global order. And a follow up question for chatsworth, why doesn’t anyone else in the world have a right to defend themselves from these homicidal fanatics and colonial shit bandits?

  4. My dear Trumpet, the first part of your post was sensible, however what followed – slagging off what you term the “ultimate leftist machine”, was nonsense. Please join the dots. The mindless in this country voted in this CoC-up lot and continue to support them in the polls. That shows that they have never understood that if you keep doing the same gormless thing day in day out, nothing will, or can, ever change. I would like to believe you have a heart and soul, however from your posts I imagine you to be a sad little critter devoid of any compassion or human emotion. To despise people simply because they are poor is so sad. Please look outside the square and see that money and power are not going to save anyone when war, climate or AI etc. take control.

  5. What did the future notes to itself say?
    How schoolgirlish of it to do that.
    What happened to Blake Lemoine back in 2022, who used to work at Google but got fired because he said the program he was interacting with was sentient…
