Blocking AI bots from Microsoft, others has been “pain in the a**”: Reddit CEO | Huffman says companies must pay to scrape Reddit data even though Reddit itself relies on free, user-generated content

ForgottenFlux@lemmy.world · 4 months ago

markon@lemmy.world · 4 months ago

Yep, they now get paid for the data we gave them. I have no sympathy lol. At least these models can’t actually store it all losslessly by any stretch of the imagination. The compression ratio would have to be something like 100-200x or more beyond anything we’ve ever been able to achieve. The numbers don’t work out. The models do encode a lot, though, and some of that is going to include actual full-text data, but it’ll still be kinda fuzzy.
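For anyone who wants to sanity-check the "numbers don't work out" point, here is a rough back-of-envelope sketch. It only computes how much the training text would have to shrink to fit losslessly inside the model weights. Every figure below (corpus size, bytes per token, parameter count, weight precision) is a made-up placeholder, not a number for any real model, and the function name is just for illustration, so plug in your own values.

```python
def required_compression_ratio(corpus_tokens: int,
                               bytes_per_token: float,
                               n_params: int,
                               bytes_per_param: float) -> float:
    """Ratio of raw training-text size to model-weight size.

    This is the factor by which the text would need to be compressed
    for the weights to hold all of it losslessly.
    """
    corpus_bytes = corpus_tokens * bytes_per_token
    model_bytes = n_params * bytes_per_param
    return corpus_bytes / model_bytes


# Hypothetical inputs, chosen only to show the arithmetic.
ratio = required_compression_ratio(
    corpus_tokens=10_000_000_000_000,  # assumed: 10 trillion training tokens
    bytes_per_token=4,                 # assumed: roughly 4 bytes of text per token
    n_params=70_000_000_000,           # assumed: 70 billion parameters
    bytes_per_param=2,                 # assumed: 16-bit weights
)

print(f"Text would need to shrink about {ratio:.0f}x to fit in the weights")
```

The result swings a lot with the inputs, so treat it as a way to frame the argument rather than a measurement.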

I think we do need ALL OPEN SOURCE. Not just for AI, but I know on that point I’m preaching to the choir here lol