The fact that AI is “not perfect” is a HUGE FUCKING PROBLEM. Idiots across the world, and people who we’d expect to know better, are making monumental decisions based on AI that isn’t perfect, and routinely “hallucinates”. We all know this.
Every time I think I’ve seen the lowest depths of mass stupidity, humanity goes lower.
Think of the dumbest person you know. Not that one. Dumber. Dumber. Yeah, that one. Now realize that ChatGPT has said “you’re absolutely right” to them no less than a half dozen times today alone.
If LLMs weren’t so damn sycophantic, I think we’d have a lot fewer problems with them. If they could be like “this could be the right answer, but I wasn’t able to verify” and “no, I don’t think what you said is right, and here are reasons why”, people would cling to them less.
Has anyone made a nonsycophantic chat bot? I would actually love a chatbot that would tell me to go fuck myself if I asked it to do something inane.
Me: “What’s 9x5?”
Chatbot: “I don’t know. Try using your fingers or something?”
Edit: Wait, this is just glados.
I am not a chatbot, but I can do daily “go fuck yourself’s” if your interested for only 9,99 a week.
14,95 for premium, which involves me stalking your onlyfans and tailor fitting my insults to your worthless meat self.
Citation needed
Ah, no, that’s a human error. Not a bot.
LowKey sprinkling my comments with error’s to make sure I’m talking with a member of the resistance instead of with a proxy of our AI overlords. Totally intended ;)
Wgat does the error do with a to?
Honestly Claude is not that sycophantic. It often tells me I’m flat out wrong, and it generally challenges a lot of my decisions on projects. One thing I’ve also noticed on 4.6 is how often it will tell me “I don’t have the answer in my training data” and offer to do a web search rather than hallucinating an answer.
There is a benchmark that kinda tests that. It’s called the bullshit benchmark. Basically, LLMs are given questions that don’t make sense in different ways, and their answers are judged based on how much they pushed back or bought in. Claude is in a league of its own when it comes to pushing back on nonsense questions.
https://petergpt.github.io/bullshit-benchmark/viewer/index.html
Yes, I saw that benchmark and was honestly not surprised by the results. It seems that Anthropic really focused on those issues above and beyond what other labs did.
If you thought people were dumb before LLMs… just know that now those people have offloaded what little critical thinking they were capable of to these models.
The dumbest people you know are getting their opinions validated by automated sycophants.