• Hotzilla@sopuli.xyz
    15 days ago

    GPT-5 without “thinking” mode got the answer wrong.

    GPT-5 with thinking answered:

    Here are the 21 US states with the letter “R” in their name:

    Arizona, Arkansas, California, Colorado, Delaware, Florida, Georgia, Maryland, Missouri, Nebraska, New Hampshire, New Jersey, New York, North Carolina, North Dakota, Oregon, Rhode Island, South Carolina, Vermont, Virginia, West Virginia.

    It wrote a script that verified the list while doing the “thinking” (feeding the hallucinations back to the LLM).
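
    For anyone who wants to check the list by hand, here is a minimal sketch of that kind of verification script. It is not the script GPT-5 actually wrote (that was not shown); the names `US_STATES` and `states_containing` are made up for illustration, and the state list is hard-coded.

```python
# Hypothetical re-creation of a "verify the list" script; not the model's own code.
# All 50 US state names, hard-coded so the check needs no network access.
US_STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]


def states_containing(letter, states=US_STATES):
    """Return the states whose name contains `letter`, ignoring case."""
    needle = letter.lower()
    return [name for name in states if needle in name.lower()]


with_r = states_containing("r")
print(len(with_r))          # 21
print(", ".join(with_r))
```

    Running it prints 21 and the same list as above, so the “thinking” answer checks out.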

      • jj4211@lemmy.world
        15 days ago

        Well, not quite, because they don’t have criteria for ‘right’.

        They basically say ‘generate 10x more content than usual, then dispose of 90% of it’, and that, surprisingly, seems to largely improve results, but at no point is it ‘grading’ the result.

        Some people have bothered to provide ‘chain of thought’ examples, and even when the output is largely ‘correct’, you may see a middle step utterly flubbed in a way that should have fouled the whole thing. Yet the error is oddly isolated and doesn’t carry forward into the subsequent content, as it would in actual ‘reasoning’.