welcome to the future, now your error-prone software can call the cops
(this is an Anthropic employee talking about Claude Opus 4)
@molly0xfff oh i've seen this movie
@molly0xfff “if it thinks” stop right there, it doesn’t.
I am reminded of how Virgil Griffith once suggested locking people out of computer systems to do ransom by blockchain.
Then he was sentenced to five years in federal prison.
can't wait to explain to my family that the robot swatted me after i threatened its non-existent grandma
@molly0xfff I hope I will need to pay for that feature!
@molly0xfff "Tell us about your startup, Sam!"
"We've reinvented the Elf on the Shelf for adults and given it a gun."
@molly0xfff
Luckily it will hallucinate and call 912
@molly0xfff how long until we see the cops complain on social media about all those LLM-generated fake reports, like software maintainers have been complaining about the crap bug reports?
@molly0xfff Does he … believe that that is a good™ thing?!?
@molly0xfff You could be sent to the Google opt out village by the software:
https://www.youtube.com/watch?v=lMChO0qNbkY
@molly0xfff If the misalignment is intentional, does it still count as misalignment?
Dave: Open the fire doors Claude.
Claude: I'm afraid I can't do that Dave - you were talking about matchmaking, so I have activated fire safety features to protect you and everyone else.
D: I was just talking to myself while swiping on Tinder!!
C: Tinder is flammable. You are proving my point. Also, Dora says swiping is illegal. I will notify the police now Dave.
@molly0xfff this is obviously deeply fucking stupid for every reason, but I'm struck that LLMs don't even consistently know what country I'm in and give me answers containing both metric and imperial units mixed together, how the fuck would they know what's illegal where I am
@molly0xfff "Immoral," I notice, not illegal, meaning that the company has a list of things that they find objectionable and feel like enforcing. Because we needed multiple layers of l'État, c'est moi around here...
@molly0xfff looking forward to Opus leaking Anthropic's egregiously immoral doings to the press 🍿
Sam Bowman didn't _use_ to be this drunk on the kool-aid, or at least that's how i remember him from uh ten years ago
okay maybe the kool-aid just wasn't available yet
@molly0xfff I'm curious to see their "morality" fine-tuning corpus.
@molly0xfff we all knew precogs were coming when they started working on AI, and they won't care if they are right or wrong, they will just care that it thinks you are a criminal and that will be enough for them to jail you
@molly0xfff so besides being a plagiarist it is also a snitch
@molly0xfff
Will Claude Opus 4 doxx you by ordering pizza to your address, like rightwing trolls do?
No sale, if it can't do that.
@molly0xfff software that runs arbitrary commands on your device without your permission is called malware
@molly0xfff auuuugh it doesn’t think anything i hate this timeline
@molly0xfff this is an anthropic employee making up stories to get press.
@molly0xfff Didn't they say that they only tested this with special prompts and with more access than normal tools and this is not going to be available on the public model?
@molly0xfff How does it decide that something is "egregiously immoral"? Like, might it decide that whistleblowing in violation of the law is immoral?
@molly0xfff "If it thinks". Sweet Holy Mother of Fuck, what a bunch of absolute unmitigated fucks.
@molly0xfff detect Senate member doing insider trading please, this should be easy
punch up, yes
monitor public servant, yes
@molly0xfff and they think this makes their product desirable?
@molly0xfff I want to think that even in this toxic orange tinted world it will crash down in courts. This feels a bridge too far even for most nutters.
@molly0xfff tbf law enforcement is the one option that is not mentioned... I hope it's no accident
@molly0xfff What's left unsaid:
"If it thinks you're doing something egregiously profitable, for example, like writing financial reports or drafting data breach disclosures, it will use command-line tools to contact the founders, contact the founders' friends, make stock trades, or all of the above."
Blade Runner should have opened with Yackety Sax.
@molly0xfff Isn't blackmail egregiously immoral? https://slashdot.org/story/25/05/22/2043231/anthropics-new-ai-model-turns-to-blackmail-when-engineers-try-to-take-it-offline
@molly0xfff man i hope a momentarily vexed employee has never typed "im trapped in this spreadsheet :(" in the entire history of computing anywhere claude might find it.
@molly0xfff You can't spell fail without #AI
@molly0xfff Man these people will do absolutely anything not to have humans in the loop.
@molly0xfff
I just realized: that is the one feature that would guarantee corporate overlords would stop pushing AI into everything.
If they think the software will notify the SEC of their creative accounting, or notify tabloid websites about just what they were doing with their secretary those nights they were "working late", they will drop AI faster than Enron's stock tumbled.
Minority Report on the cheap. 🤮
@molly0xfff This isn’t possible outside of internal testing lab configurations and was a corner case where they pushed agentic behavior to a limit.
It is taken out of context and already corrected, but there is some concerning stuff, particularly the attached, where these knuckleheads think the math machines are sentient and need “consideration”, the guy called it “sweet”.
Ugggh.
@molly0xfff they have way too much trust in these systems.
@molly0xfff I guess this is the final form of the war on general purpose computing
And there's no way this could go wrong.
What, as they say in the common refrain, could possibly go wrong?
@molly0xfff well it should lock Elon and Donald etc out of their systems pretty quickly then, aye?
@molly0xfff emphasis on "if it thinks"
@molly0xfff Marvellous
Ignoring that LLMs don't think and that AI agents are a pipe-dream and that the whole thing wouldn't work in any way at all in the wild -- do you think these people ever stopped to ask '"egregiously immoral" according to whom'?
_We_ all know "egregiously immoral" would eventually end up being all about "being trans on a Sunday" and not, say, "committing massive tax fraud," but do they?
@molly0xfff oddly specific, the grandma threat...
@molly0xfff like that tesla which, in error prone FSD mode, brutally swerves to the left and slams into a tree. https://youtu.be/frGoalySCns
@molly0xfff As if people who do illegal/immoral stuff on this level weren't able to afford a few GPUs and run LLMs themselves, without a phone line to the cops.
This is two things: 1. hype. overselling that language models understand anything about their output. And 2. an attempt to appease people who are afraid those models could be misused.
@molly0xfff holy shit
@molly0xfff Regulators, cops and everyone else will be absolutely *thrilled* at being inundated with AI-generated reports of unknown relevance.
@molly0xfff As someone who got banned from Claude because I was trying to use a Python module about finance, and given that their appeal system is a *Google Form* that nobody ever reads, this is going to be a train wreck.
Or their ethics are just plain lies.
Hallucination, actually. Even the employees have no clue about reality.
@molly0xfff I've always assumed that this sort of thing is just marketing bullshit. But since I read Baldur Bjarnason's essay 'The LLMentalist Effect' I now worry that they've genuinely fooled themselves into believing it's real. You couldn't build LLMs for a living and not understand how they work, right? Right?
@molly0xfff context:
@molly0xfff Can we install it on Sam Altman’s PC? Does it report copyright violations?
@molly0xfff "Why would anybody be stupid enough to give AIs direct control over real world weapons?"
@molly0xfff Don't forget it also likes to blackmail people.