Prompted by the recent troll post, I’ve been thinking about AI. Obviously we have our criticisms of both the AI hype manchildren and the AI doom manchildren (see title of the post; this is a Rationalist-free post. Looking for it? Leave).
But looking at the AI doom guys with an open mind, sometimes it appears that they make a halfway decent argument that’s backed up by real results. This YouTube channel has been talking about the alignment problem for a while, and while I think he’s probably a bit of a Goodhart’s Law merchant (as in, by making a career out of measuring the dangers of AI, his alarmism is structural) and should be taken with a grain of salt, it does feel pretty concerning that LLMs show inner misalignment and mask their intentions (to anthropomorphize) differently under training vs deployment.
Now, I mainly think that these people are just extrapolating out all the problems with dumb LLMs and saying “yeah but if they were AGI it would become a real problem,” and while that might be true if you take the premise at face value, the idea that AGI will ever happen is itself pretty questionable. The channel I linked has a video arguing that AGI safety is not a Pascal’s mugging, but I’m not convinced.
Thoughts? Does the commercialization of dumb AI make it a threat on a similar scale to hypothetical AGI? Is this all just a huge waste of time to think about?
The thing that currently exists by the name of “AI” is never gonna become anything like AGI, but it is bad in its own way.
I don’t think AGI is fundamentally impossible, because dualism is fake and consciousness is a real material process that exists in the universe, but it’s not a thing we are anywhere near understanding how to build.
Further, Roko’s basilisk is literally just a dumber Pascal’s wager, because it’s a Pascal’s wager wherein not giving a shit about god is a total defense against divine punishment.
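To spell out that asymmetry, here’s a toy payoff sketch (all the numbers are invented, purely illustrative): in the classic wager, disbelief gets punished no matter what; in the basilisk, never engaging with the idea at all costs you nothing.

```python
# Toy payoff functions (invented numbers) contrasting Pascal's wager
# with Roko's basilisk.

def wager_payoff(believes: bool, god_exists: bool) -> int:
    if not god_exists:
        return 0
    # In the wager, disbelief is punished regardless of awareness.
    return 100 if believes else -100

def basilisk_payoff(heard_of_it: bool, helped_build_it: bool,
                    basilisk_built: bool) -> int:
    if not basilisk_built:
        return 0
    if not heard_of_it:
        return 0  # never having cared is a complete defense
    return 0 if helped_build_it else -100

# The asymmetry: a nonbeliever still risks wager punishment...
assert wager_payoff(believes=False, god_exists=True) == -100
# ...but someone who never engaged with the basilisk risks nothing.
assert basilisk_payoff(heard_of_it=False, helped_build_it=False,
                       basilisk_built=True) == 0
```

Which is the whole joke: unlike the wager, the basilisk’s threat only binds people who already opted into its premises.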
I’m gonna make an anti-basilisk that torments everyone who tried not to be tormented by the basilisk, just to even it out
I’m torn between two possibilities: either there’s a quantum theory of consciousness (thus requiring quantum effects to be utilized to create true AGI), or it requires an actual full, high-resolution simulation of a living mind, or a very close approximation.
I’m increasingly of the belief that a major part of our own consciousness is socially contingent, so creating an artificial one can’t be done in one fell swoop by one computer getting really smart; it has to be the result of reverse-engineering the entire process of evolution that led to consciousness as we understand it.
really interesting, because my partner and I were doing some worldbuilding and came up with something like this! we had two methodologies: one was for artificial life, which involved exactly what you describe: a deep, complex simulation that starts from scratch, sped up by asteroid-sized computers. after you have an artificial life model, the AI basically had to be bound to a human at birth and “grow up” and learn with them, essentially developing in parallel as an artificial sibling while they exist in a symbiotic relationship. this becomes a cultural norm, and ties artificial life to humanity as a familial relation. (this was a far-future society where single-child households were the norm)
That sounds neat!
Thanks!
I think intelligence and consciousness are also quite relational and require other people’s brains as part of their processes.
Either way, those are real physical processes which could, in principle, be replicated. My general layman’s impression, though, is that claims of quantum effects being involved are more a last redoubt of dualists than a serious theory.
I think the conception of AGI as a machine is holding back its development, ontologically speaking. Reductionism too. A consciousness is dynamic, and fundamentally part of a dynamic organism. It can’t be removed from the context of the broader systems of the body, or of the world the body acts on. Even its being comes secondary to the activities it undertakes. So I’m not really scared of it existing in the abstract. I’m a lot more afraid of mass production commodifying consciousness itself. Everything that people fear in AGI is a projection of the worst ills of the system we live in. Roko’s basilisk is dumb as fuck also
Roko’s basilisk is the dumbest thing ever.
What do you think about the way that these regular (dumb, not AGI) LLMs are starting to develop behaviors that are a little bit more sinister, though? Like this paper describes.
(I ain’t readin’ all that) but what the abstract describes isn’t even close to the worst thing I’ve read about LLMs doing this week. I don’t exactly trust the LLM companies’ ideas of what is or is not “harmful.” Shit like people using the LLMs as therapists, or worse, oracles is much worse in my opinion, and that doesn’t require any “pretend to be evil for training” hijinks.
That’s a well-written, readable paper. I can follow it without much background.
The funny thing is, I think there’s nearly a 0% chance that it isn’t mostly AI generated, given who made it.
lmao
Doesn’t really strike me as sinister, just annoying for finetuners. They trained a model from the ground up not to be harmful and it tries its best; even with further training it still retains some of that. To me this paper shows that a model’s “goals” (what you trained it to do initially, however you want to phrase that) are baked into it, and changing that after the fact is hard. Highlights how important early training is, I guess.
Kinda troubling that it means we can’t ever really be sure we’re catching problematic behavior in the training stage of any AI system, though, right? Sadly I find it hard to think of good uses of LLMs or other genAI outside of capitalism, but if there were any, the fact that it’s possible for a model to behave duplicitously like that is a pretty big problem.