if anyone was reading between the lines here in the early discussion and anthropic responses - the us military technically already has the ability to do this shit. it isnt hard to engineer around. anthropic has given them everything they need if they engineered it right. If I had to guess how defense contracts work, the contractors probably stopped trying to advise them on how to jailbreak their own models, and pistol pete hegseth and the us military lack the technical knowledge to make what they want work, so they just went to the strongarm approach that non technical people in positions of power like to take: “make this work or i punish you.”
anthropic called the bluff. really respect the game here. it is a very, very good business move
“These threats do not change our position: we cannot in good conscience accede to their request,” Amodei wrote in a blog on Thursday.
“In a narrow set of cases, we believe AI can undermine, rather than defend, democratic values. Some uses are also simply outside the bounds of what today’s technology can safely and reliably do,” Amodei wrote.
Must be exciting for the employees that get to stay at Block. I am sure their workload isn’t going to increase. They frame this as somehow being more efficient or able to move faster by having less people. LOL.
I guess this is a good time to re-post this link to the science fiction movie Colossus: The Forbin Project (1970). I will also note again that this is what Elon has named his Tennessee data center.
You can watch in a browser or download it and cast it to your TV. I used to think of it as harmless fun. Less so now.
Yep and Sonnet 4.6 is super capable and cheap per token for what you get. I have been vibing with Claude nearly exclusively lately. Also their transparency and concept of constitutional AI have me convinced they’re the least evil of the contenders by far.
Amodei should just make the Geneva conventions so fundamental to the model’s constitution that it can’t break them.
sonnet is crazy at coding and understanding big context windows for how cheap it is. managing it from an enterprise level is interesting. For most things you usually only need sonnet, but juggling between that and opus as needed seems like an interesting engineering challenge in terms of tooling and deploying at scale safely.
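A rough sketch of what the "sonnet by default, opus as needed" juggling could look like in tooling. Everything here is an assumption for illustration: the tier labels are plain strings rather than real API model IDs, and the escalation rules (a caller-set flag plus a retry count) are just one plausible policy, not anything Anthropic documents.

```python
from dataclasses import dataclass

# Hypothetical tier labels for illustration, not real API model IDs.
CHEAP_TIER = "sonnet"
STRONG_TIER = "opus"


@dataclass
class Task:
    prompt: str
    failed_attempts: int = 0            # times the cheap tier already failed on this task
    needs_deep_reasoning: bool = False  # set by the caller, e.g. for architecture work


def pick_tier(task: Task, max_cheap_attempts: int = 2) -> str:
    """Default to the cheap tier; escalate only when there is a reason to."""
    if task.needs_deep_reasoning:
        return STRONG_TIER
    if task.failed_attempts >= max_cheap_attempts:
        return STRONG_TIER
    return CHEAP_TIER
```

The point of keeping the policy in one small function is that an org can audit and change the escalation rule in one place instead of leaving it to each user's habits.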
the biggest challenge rolling it out beyond my own weird toolset is the throttling window, which encourages afk automation - I have workarounds for this myself that no sane non power user would adopt. that's the exact thing I need to make sure works in the near future. It’s a really clever marketing tool to enterprises, I’d rather pay per token personally, but still exploring that. The team plans are extremely generous for most use cases but not mine I think.
I can trivially prove I’m already automating to 5x+ by any metric management can think of, but rolling that out to broader org sanely, i still think it’s more like 20-30%.
lol I wish I could respond intelligently but i barely know what these words mean and I’ve basically been using my own tooling around claude cli to a big success. it may have saved my career.
i’m using an orchestrator pattern focused on a documentation → plan → implementation → test → verify self learning loop. i was inspired by the ralph wiggum loop (with human checks everywhere) and got the chance to play with it. I dont like using any external tools because I feel this is so configurable I can do anything I can imagine. it’s been successful enough to creep me out
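For anyone trying to picture the loop, here is a minimal sketch of the stage order described above, with a human check after every stage. It is not the poster's actual tooling: `run_loop`, the handler signatures, the `lessons` list standing in for "self learning", and the stub handlers are all hypothetical, and real handlers would call whatever agent CLI you use.

```python
from typing import Callable, Dict

STAGES = ["documentation", "plan", "implementation", "test", "verify"]

# A stage handler takes the shared state and returns True on success.
StageFn = Callable[[Dict], bool]
# A human check takes the stage name and state and returns True to continue.
HumanCheck = Callable[[str, Dict], bool]


def run_loop(handlers: Dict[str, StageFn], human_check: HumanCheck,
             max_rounds: int = 3) -> Dict:
    state: Dict = {"lessons": [], "history": []}
    for round_no in range(1, max_rounds + 1):
        failed_stage = None
        for stage in STAGES:
            ok = handlers[stage](state)
            state["history"].append((round_no, stage, ok))
            # Stop the round if the stage failed or the human rejected it.
            if not ok or not human_check(stage, state):
                failed_stage = stage
                break
        if failed_stage is None:
            state["status"] = "done"
            return state
        # Record what went wrong so the next round can see it.
        state["lessons"].append(f"round {round_no}: stopped at {failed_stage}")
    state["status"] = "gave_up"
    return state


def make_stub_handlers() -> Dict[str, StageFn]:
    """Stub handlers: the test stage only passes once a lesson has been recorded."""
    handlers: Dict[str, StageFn] = {s: (lambda state: True) for s in STAGES}
    handlers["test"] = lambda state: len(state["lessons"]) > 0
    return handlers


result = run_loop(make_stub_handlers(), human_check=lambda stage, state: True)
print(result["status"], result["lessons"])
```

With the stubs, round one stops at the test stage, the failure is logged as a lesson, and round two runs clean.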
Basically I try to move from human in the loop to human on the loop by arranging agents in patterns where two argue with each other, a third judges, and the scribe logs both arguments and the judge's decision along with a rationale and confidence score. I can use that mini pattern anywhere an explainable decision needs to be documented, and then agents run the whole process but at the end the human can export a decision log and review all the choices and decide to accept or throw the whole thing out or send it back through the process for revision.
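The two-debaters, judge, and scribe pattern above can be sketched roughly like this. It is a minimal sketch under assumptions, not the poster's implementation: the agents and judge are plain callables (stubs in the usage example), and `Decision`, `Scribe`, `decide`, and `needs_human_attention` are hypothetical names. In practice each callable would wrap a model call.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, List, Tuple

Agent = Callable[[str], str]  # question -> argument text
# Judge: (question, argument_a, argument_b) -> (winner, rationale, confidence)
Judge = Callable[[str, str, str], Tuple[str, str, float]]


@dataclass
class Decision:
    question: str
    argument_a: str
    argument_b: str
    winner: str        # "a" or "b"
    rationale: str
    confidence: float  # 0.0 to 1.0


class Scribe:
    """Keeps the decision log and exports it for human review."""

    def __init__(self) -> None:
        self.log: List[Decision] = []

    def record(self, decision: Decision) -> None:
        self.log.append(decision)

    def export(self) -> str:
        return json.dumps([asdict(d) for d in self.log], indent=2)


def decide(question: str, agent_a: Agent, agent_b: Agent,
           judge: Judge, scribe: Scribe) -> Decision:
    arg_a = agent_a(question)
    arg_b = agent_b(question)
    winner, rationale, confidence = judge(question, arg_a, arg_b)
    decision = Decision(question, arg_a, arg_b, winner, rationale, confidence)
    scribe.record(decision)
    return decision


def needs_human_attention(log: List[Decision], threshold: float = 0.7) -> List[Decision]:
    """Decisions the judge was not confident about, for the human to look at first."""
    return [d for d in log if d.confidence < threshold]


# Usage with stub agents standing in for real model calls.
scribe = Scribe()
decide(
    "Retry failed jobs automatically?",
    agent_a=lambda q: "Yes, transient failures are common.",
    agent_b=lambda q: "No, retries can hide real bugs.",
    judge=lambda q, a, b: ("a", "Retries with a cap address both concerns.", 0.6),
    scribe=scribe,
)
print(scribe.export())
```

The exported log is what the human reads at the end; accepting, throwing the whole thing out, or sending it back for revision stays a human step outside this code.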
You’re well beyond where I am but I’ve started playing with this (mostly in having other models nitpick pr’s generated by another). feel free to dm me, this aligns with where my head’s been going. In my case it’s probably way more than I can sell to business at my scale but I have seen what a useful pattern this is - some models are really good at prompting others.
there is a difficult problem of code review exhaustion by humans, and a middle PR review layer seems an obvious solution. I’ve proven this to myself and been given the go ahead to architect it, but thats informed by the blogs and papers by people i respect that have come to similar conclusions (and discussions in this thread). I didnt realize I’d been doing this manually, kind of out of habit, using the openAI research model to try to poke security holes in claude's work
this is bleeding edge shit, i havent been here in a while and ive never been so excited by tech in like 10-15 years. anyone thats read what ive written on this knows what a doomer i am, to impress me is difficult with my cynicism
all the media convos about this replacing junior jobs are wrong, ive seen strong juniors excel with this tooling and i advise CS students sometimes on this, and have been at a loss until very recently. the reason its so strong for them is i can hire a smart person now and train them to do what i need with automation and a strong knowledge base in 2-3 months vs a year. I strongly suspect this will improve. this is so powerful for hiring. it is a clear return on investment especially if they stick around and are willing to learn.
this will replace people who do not adapt. young people and disadvantaged people are in a stronger position than ever, and im all for it. i’m almost old enough to be a protected class and its threatening but personally am excited to see how the younger smart upstarts use this stuff. pass the torch. we will need it
Not sure if Jack has Elon level of power but it wouldn’t surprise me if he just read that story and then made his decision.
how many of your coworkers do you really even know what they do? Very skeptical of how someone can take a top down approach of "we need 40% fewer people" overnight.
Right now for reasoning/creative tasks, I find Gemini 3.1 to be quite a bit better than both Claude 4.6 and ChatGPT. I like to send every query to a combo and pick the best (my jerb offers a webUI frontend where we can select whatever model we want), but Gemini almost always wins.
Even fully autonomous weapons (those that take humans out of the loop entirely and automate selecting and engaging targets) may prove critical for our national defense. But today, frontier AI systems are simply not reliable enough to power fully autonomous weapons.
So Anthropic will allow fully autonomous weapons systems using their models once they’re good enough. Yeah I don’t think these are the guys that will stop Skynet?
Killbots are 100% happening, pretty soon, and no one is even that interested in preventing it. I’m quite convinced that domestic surveillance is the real sticking point here.
Is there some reason they are negotiating with Anthropic? Can’t they just go to another one of the companies and use that system? Seems like it’s inevitable someone is going to give in and hand the keys to Hegseth, Trump & Co.