ChatGPT Thread - Politics (AI Welcome)

I don’t really have an objective, I’m just curious about learning a bit more about what AI can do with regard to pictures and photos.

The reason I asked the question about Photoshop is that I know people use ChatGPT to help rewrite or rephrase things. So I thought there might be an AI that could do the same with graphics.

DALL-E has a better chance, but I'm not sure it can do exactly what you want. The original demonstration showed a generated image, followed by certain parts of that image being removed by prompt. You'll realize the problem, though, as soon as you start trying to write prompts. Think of it like a game where your prompt is the question. Then the bot says "did you want a picture of THIS?" and you're like fuck, that's not even close. So then you ask a different question, or ask it in a different way. Sometimes it takes dozens if not hundreds of iterations of this, depending on how specific you need the final result to be.
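
If it helps to see what that loop looks like in code, here's a minimal sketch using the OpenAI Python SDK. The model name, file names, and prompts are placeholders I made up; the masked edit call is the "remove parts of the image by prompt" trick from the demo, and it only works on square PNGs with a transparency mask.

```python
# Rough sketch of the generate / inspect / retry loop, not a polished tool.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# First attempt: generate an image from a prompt.
result = client.images.generate(
    model="dall-e-2",
    prompt="a living room at night, lamp on, dark street visible through the window",
    n=1,
    size="1024x1024",
)
print(result.data[0].url)  # look at it, decide it's not even close, rewrite the prompt...

# Once you have an image you mostly like, regenerate only a masked region of it.
edited = client.images.edit(
    model="dall-e-2",
    image=open("living_room.png", "rb"),   # the image to edit (square PNG)
    mask=open("window_mask.png", "rb"),    # transparent pixels mark what may be repainted
    prompt="the same living room, but with daylight coming through the window",
    n=1,
    size="1024x1024",
)
print(edited.data[0].url)
```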

Given the limited amount of information we have about how Sora works or what its non-cherry-picked performance is like, it seems crazy to have a strong opinion about what exactly it’s doing. Some things suggest an abstract model going on behind the scenes:

  1. Realistic reflections that become much clearer against a dark background. Hard to see how you know where reflections go without a sense of objects and space. Also, having the confidence to sharply change pixels from frame to frame suggests the model knew it was the right thing to do.
  2. Objects sometimes go out of frame and reappear unchanged later. Not enough info to say what this means. In particular, if it’s working on the whole video at once, it doesn’t need to “remember” objects to do this. But if it’s working frame-by-frame…
  3. In some failure cases, it's the interaction between objects that breaks down, while the objects themselves behave normally. The plastic chair example is obviously a misfire, but it's notable that one of the failure modes is a chair kind of gliding along magically, rather than extending unnaturally or doing some horrible thing where it blurs continuously into other objects.

On the contra side, sometimes we do see objects appear out of nowhere, or the pirate ship example where the ship does Escher-like locally-plausible globally-impossible things. So, at best it loses track of things sometimes or it only uses world models sporadically.

If you start from the opinion that neural networks can learn how to model the world, then the real question is how you get them to learn that skill during training, rather than falling back on memorization and pixel smoothing. If OAI has figured out a technique that makes the model learn world-models sometimes, that gets you onto a ladder where moar layers and moar data and some engineering tweaks will very likely make next-gen models do a much better job.

2 Likes

There’s a fancy photoshop version that has gen AI features that might do this, but I’m sure it’s expensive. For learning, if you have a GPU and you’re willing to invest some time, running Stable Diffusion locally is hard to beat. You can do stuff like mask part of an image and only regenerate the rest, or limit how much leeway the model has to change the image. For changing night to day, you could probably get decent results using a combination of techniques like that. I agree that you’re unlikely to get good results with MJ or OAI.

1 Like

https://twitter.com/seanw_m/status/1760115732333941148

Discussion on HackerNews:

Reminds me of this …

… in the end the machine begins to spit out garbled responses until our heroes realize the temperature settings are off.
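
For anyone who hasn't played with the knob: temperature scales how much randomness goes into picking each next token, and cranking it too high really does produce word salad. A quick illustrative sketch with the OpenAI Python SDK (the model name and values are just examples):

```python
# Compare a sane temperature with an excessive one; the API caps it at 2.0.
from openai import OpenAI

client = OpenAI()

for temp in (0.7, 1.9):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Describe a quiet morning in one paragraph."}],
        temperature=temp,
    )
    print(f"temperature={temp}:\n{resp.choices[0].message.content}\n")
```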

1 Like

This is crazy but obviously not intentional.

Yes, that's the reason.

1 Like

https://twitter.com/rao2z/status/1718714731052384262?t=3WzDEMm5Q21fxxVSuU-_-g&s=19

https://twitter.com/karpathy/status/1733299213503787018?t=ZfhrYE-VPAugDDrzhy9drw&s=19

I have a hard time taking seriously someone who uses the term “cognitive orthotics.”

That’s kinda new school, but he’s pretty old school.
https://rakaposhi.eas.asu.edu/html/bio.html

Go Devils

This is an awesome pair of tweets to read together, but I fear you don’t understand them. :frowning:

They’re both saying that LLMs can’t implement the folk model of (human) intelligence. Kambhampati proceeds from the assumption that the folk model is true, so LLMs are “just a next-token predictor,” a distraction from the hunt for True Intelligence (which is almost always, coincidentally, their day job). Karpathy doesn’t make the same assumption, so he’s able to think more clearly about how LLMs are distinct from other computer systems and how their unique properties can/should be integrated into larger applications.

I fear you don’t understand my understanding of the tweets. I am not up on Kambhampati’s work, but he pushes the right buttons for me, because I want AI intelligence, and I want AI to replace human intelligence in many areas. Intelligence is a useful thing and humans are not good at it.

My hope that this will eventually happen is one reason I stopped doing philosophy. What's the point of iterating on 500- or 2,500-year-old disputes when we can develop systems that will actually outperform our quite limited cognitive abilities within, say, one lifetime of the roughly 1,000 lifetimes humans have existed in their current form? If I were to start a career over, I would probably focus 100% on trying to implement real AI intelligence (despite this being a Sisyphean project at the individual level), not more traditional philosophy, and certainly not trying to squeeze current AI methods for whatever commercial purpose one can sell to people. Like Deep Blue, current AI methods are a reductio ad absurdum of their own approach: they use way too much processing power to produce their output, indicating the methods they use are crude.

I like Karpathy’s comment, however. Current AI methods do plenty of interesting and useful things, and we are learning to use them, but what they do has significant but limited relation to “intelligence”, which is strongly related to truth.

As far as anthropomorphism goes, for reasons I've mentioned before (Dennett's Intentional Stance; Davidson's Radical Interpretation; Quine's Indeterminacy of Translation), humans are strong anthropomorphisers, and people are pretty much going to proclaim "intelligent" any system that matches the behavioral abilities of a slime mold, to say nothing of whether it can recapitulate, say, Milton in a novel context because it has processed Paradise Lost and other works.

My point is that current AI is doing a lot of interesting things, and is using a bedrock tool (distributed representation of processed inputs) of human thought. But, engines alone do not make an airplane–you need wings, guidance systems, structural integrity, etc. There’s a lot going on with human-level intelligence, not just a big ass engine, but big ass engines are interesting and cool.

It’s telling to me that like 90% of the human output about AI is people theorizing about the implications of AI (especially Yudkowsky, Bayesians, “rationalists”, and anyone trying to make money), because such idle speculation is relatively easy and not testable. If humans were smarter (which we will be in a few generations, even without general AI, thanks to progress in biology), there would be a lot less jibber jabber about “AI” and a lot more theory/effort to actualize AI. This is why people like Kambhampati and Gary Marcus are important. They’re doing the real work.

2 Likes

This actually makes a ton of sense to me. But in this analogy I kind of see Marcus as the guy dismissing all the work on engines as premature when you haven’t even figured out how to make the plane’s wings flap.

Evidence that Sora is building an internal model of the world to create its videos: you can extract the model from the videos!

https://x.com/emollick/status/1762844368195527162?s=46&t=9xanL2tZoKj22erGoTuL4A

Chartreuse is good?

https://twitter.com/GaryMarcus/status/1762858429851615335?t=psqtXesL8_9sy_SQzDL7eQ&s=19