Stop telling me AI is the future [Still Vreni]

cannedtuna@lemmy.world · 7 days ago

pfried@reddthat.com · 21 hours ago

I’m saying there is no “big leap” necessary. As the paper that introduced the transformer said, attention is all you need.

AnyOldName3@lemmy.world · 21 hours ago

If we’re going to pull up other people’s pithy phrases that aren’t intended to be taken entirely literally, then the relevant one here is “machine learning is the second-best solution to any problem.” In the century or so (depending on how you define it) that people have been thinking about computers, we’ve already found better solutions to lots of problems. If a transformer-based neural network can get 99% accuracy in sixty seconds on 92 billion transistors of GPU, plus billions more for its VRAM, that’s pretty useless if we can also get 100% accuracy in sixty microseconds on a $1 microcontroller, or even faster on a less constrained device.

The phrase “attention is all you need” is specifically about sequence transduction models, and refers to the discovery that they don’t need a combination of attention, recurrence and convolution, but only need attention if it’s used in the novel way the paper introduced. If you don’t need to transduce any sequences, it isn’t necessarily relevant, and, critically, it’s not a claim that you can do everything by transducing sequences. It was a surprise that applying it to generating new text, rather than just converting existing text, worked as well as it did; a surprise that it kept getting better with larger models instead of plateauing around the GPT-1 and GPT-2 era; and a surprise that the text generation could be used to do other things, even ones as basic as addition. None of these things were predicted by the Attention Is All You Need paper.
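
For anyone who hasn’t looked at what “attention” means here, this is a rough sketch of the scaled dot-product attention the paper is built around, softmax(QKᵀ / √d_k)V, in plain Python. It is illustrative only: the function names and list-of-lists layout are my own, and it leaves out the learned projections, multiple heads, masking and everything else a real transformer needs.

```python
import math


def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    queries, keys and values are lists of equal-length float vectors
    (a hypothetical layout chosen for this sketch).
    """
    d_k = len(keys[0])
    output = []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output row is a weighted average of the value vectors.
        row = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        output.append(row)
    return output
```

Note that the whole thing is a weighted average over a sequence: there is no recurrence and no convolution in it, which is the point the paper’s title is making, and nothing more general than that.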