reddit can go fuck itself.
That’s the kind of talk that can get you banned from Reddit. 😜
I imagine almost my entire Post history can get me banned on Reddit.
People who posted on Reddit ( speaking in the past tense, because who would continue to do so now that we have better things? ) never intended for it to be of limited access. Reddit was a publicly accessible place, and people shared their thoughts and comments on it because it was the frontpage of the internet, so the place of choice to share things with the world. That being scraped should not be a problem. But clearly Reddit didn’t want to give you a platform to share your thoughts with the world, they wanted you to donate your thoughts and take it as their property so that they can capitalize on it.
I don’t know… I mean, I agree. But I’m seeing a lot of demands that instances should prevent scraping. Ok, it could be astroturf; a campaign by Reddit/data brokers to neutralize the free competition. But you have seen all those deleted posts on Reddit. Those are some special little minds.
You’re right, there are probably some anti-AI/anti-scraping folks on there as well as here. Personally, I most definitely hate intellectual property more than I do generative AI. But you’re right, different people on there will feel differently. The point still stands that for those who thought they shared their thoughts with the world, the ideas they donated were taken from them.
That place is becoming more and more of a shithole. Bots, Ads, trolls, garbage mods… deleted the app last month.
I quit reddit, cold turkey, the day they shut off free API access for 3rd parties. Except for a couple of fairly niche subs I haven’t missed it at all.
Same here. I’ve been better off ever since.
This is a huge blow to archivism, thanks to corporate greed and the enshittification of Reddit. Worst MBA-filled POS.
Fuck Reddit
Fuck Spez
So reddit will become even less valuable
Not that reddit isn’t hot garbage right now, and has been for a while actually, but there’s a lot of people here who have glazed over the reason why reddit instituted this policy.
AI companies are scraping the Wayback Machine. This is something that should concern all of us.
Why?
Circumventing sites with ‘no ai scraping’ rules
And what do I care about Reddit getting paid?
If the IA doesn’t complain about being used, then it’s fine for me. The ideal outcome would be, if the archive can make some arrangement where they scrape the data and provide it to everyone. That way, sites only get scraped once and not constantly hammered.
There are plenty of sites out there not owned by major conglomerates that have no-robots and no-scrape tags, and AI companies can use the Wayback Machine as a way to circumvent those policies.
This isn’t about reddit, it’s about AI companies stealing everything on the internet and then selling it back to you while taking your job away.
This is why we can’t have nice things. Tell you what. I will have as much support for you, as you have for blue collar workers. Sound fair?
Since I’m a union worker, sounds good.
Ahh, the next Ronald Reagan.
Reddit warned my account (first warning in 10 years) and deleted the comment when I told an American that they can strike peacefully to show the government they are against it.
I got a warning for recommending violence from an AI; the human who checked it agreed and didn’t remove the warning, haha.
Reddit is just afraid that their censorship will go public.
I was on Reddit for like 15 years, then got all my warnings and a ban in like a month or two earlier this year. Oh well, lol.
I just replied “Liar, or fucking liar.” To every republican lie I saw. Only took 2 days for a permaban. I feel if they can lie we should be able to call them out on it at least.
I was on reddit for 11 years before getting banned due to zionists. I have a throwaway reddit account now for porn and other shit, but I don’t post.
Nice of them to protect their (users’) content from AI scraping. So that they can charge AI companies for it instead.
They aren’t doing that. They are protecting content from being scraped for free. Reddit is perfectly happy to charge for AI access to user-generated content.
No, that’s not what’s happening. They’re preventing scrapers from accessing the content at no charge. They’re totally willing to make deals for access to their content in exchange for money.
Almost, but they are really making it so they can charge AI companies for user data and not allow scrapers to get the data for free.
They can keep their shit for themselves, stopped caring a long time ago.
In the lieu of an IPO u/spez has actively destroyed everything that made Reddit good! Gate keeping the API thinking it’ll help with making some bigshot LLM some day lol
Lol every platform seems to live long enough to shoot themselves in the foot.
Phpbb/mybb/smf haven’t seemed to do that.
Once reddit has mutated a few more times, they’ll start erasing stuff themselves. It will be lost to time, and that fills me with hope.
If you can’t archive something, did it ever really exist?
In a causal sense, yes. In a ‘the average person is fucking stupid’ sense, no.
Is that even possible?
Technologically no. Reddit sends out the data to 10s of millions of users as part of their normal operations. They need to try to block those who collect that data for the IA. Reddit has the very short end of the stick.
The problem is that evading such counter-measures may be criminal in the US. Obviously, EU laws are much harsher.
Not to mention all of Asia, South America, Africa…
Slightly related, can you explain how (a few times for me) an archived page I tried to revisit got erased?
I don’t know their take-down policy. Could be privacy, could be copyright.
I think they are shielded by Section 230 under US law. That means, if they don’t do take-downs when requested, they become liable just like the original uploader. So it depends on whether they think they can defend something as fair use. IDK what they do with requests under non-US laws.
Thanks for your detailed explanation.
When I look that up it’s specifically about ‘defamatory, illegal, or harmful content’.
That would be understandable to take down.
Never encountered that myself, the cases I’m referring to were totally legal content AFAIK.
Only very damaging or proof of something.
As a hypothetical example, let’s say an organisation posts it’s associated with Epstein in 1999 which now obviously is very inconvenient.
They understandably remove it from their website, but it should still be on the archive if captured before.
However, in similar controversial real cases it wasn’t.
So it appears certain forces have more influence to get them to remove content beyond what’s legally required.
Since then I always screenshot the archive page.
Hmm. There are many things that could cause legal trouble for the Wayback Machine. I wouldn’t jump to conclusions.
You can see on Lemmy that many people would prefer to outlaw scraping, fair use, and all that. Well, not for the “good guys” obviously, but the law doesn’t work on vibes. The IA would be legally impossible in most countries. In the EU, it would be a major crime because of copyright and GDPR. It’s only the traditional US commitment to free speech and fair use that makes it possible at all.
The IA exists in a legally precarious position. That’s not because of any shady backroom dealing. If the crowd in this community had its way, it would be gone.
I know the EU has different (stricter) laws and that they vary between states. (Germany being particularly awful)
There is however some complicated form of fair use policy.
If the IA hosts music and books that might be problematic.
But I’m talking about archived webpages and information previously available to the public with zero commercial value that has been removed.
And this includes American sites.
“But I’m talking about archived webpages and information previously available to the public with zero commercial value that has been removed.”
It is still “intellectual property”. Maybe the policy is to just oblige removal requests if the content doesn’t seem to be of public interest. Cause why not, right? Look at all the people here on Lemmy angry that their worthless posts are scraped or deleting them on Reddit. Obliging takedown requests is certainly the path of least resistance.