@TwilightKiddy @prousername bro really said hashing is a privacy violation??
@zbyte64 where am I wrong? The process is effectively the same: you get a set of training data (a textbook) and a set of validation data (a test) and voila, I’m trained
To learn how to draw an image of a thing, you look at the thing a lot (training data) and try sketching it out (validation data) until it’s right
How the data is acquired is irrelevant, I can pirate the textbook or trespass to find a particular flower, that doesn’t mean I’m learning differently than someone who paid for it
@zbyte64 data quality, again, was out of the scope of what I was talking about originally
Which, again, was that legal precedent would suggest that the *how* is largely irrelevant in copyright cases, they’re mostly focused on *why* and the *scale of the operation*
I’m not getting sued for copyright infringement by the NYT because I used inspect element to delete content to read behind their paywall, OpenAI is
@zbyte64 1) In no way is quality a part of that equation and 2) In what other contexts is quality ever a part of the equation? I mean I can go look at some Monets and paint some shitty water lilies, is that somehow problematic?
@zbyte64 from what I understand, you’re referring to the process at scale—the amount of information the AI can take in is inhuman—which I’m not disagreeing with
None of which is relevant to my original point: the scale of their operations, which has already been used countless times in copyright law
My point is that the scale at which they operate and their intention to profit are the basis for their infringement; how they’re doing it would be largely irrelevant in a copyright case
@zbyte64 we’re saying the same thing
It’s a matter of scale, not process
@zbyte64 you’re getting away from the original conversation
@zbyte64 with everything you see you are scraping data from your environment whether you want to or not
How does a child learn what pain is? How does a teenager learn what heartbreak is? It’s certainly not because they made the decision to find that out themselves
@Subverb that is, quite impressively, the opposite of what I said
Is a person infringing on copyright by producing content? No. It’s about intent and scale. Humans don’t just sit on this knowledge, they do something with it
There is nothing illegal about WHAT it’s doing, there is everything illegal about HOW and WHY
I very clearly stated that OpenAI’s intent and the scale at which they operate amount to blatant copyright infringement, and that this is backed up by decades of precedent
@Pika @flop_leash_973 This is largely my thoughts on the whole thing, the process of actually training the AI is no different from a human learning
The thing is, there’s likely enough precedent in copyright law to actually handle that. Most copyright law is all about intent and scale, and I think that’s likely where this will all go
Here the intent is to replace and the scale is astronomical, whereas an individual’s intent is to add and the scale is minimal
@Navigator @vzq That should probably be the first question then
@neme I like how the devs were like “eh, it’s not even that good of an app, whatever”
@return2ozma committing to a number of years of software updates is…odd, not necessarily in the sense that nobody else is doing it, but in the sense that there are so many variables that go into whether or not a device will be supported on an update that it’s actually kind of hard to set that kind of deadline and truthfully stick to it
The same with the claims from Google and Samsung: I’ll believe it when I see it (after all, remember PixelPass?)
@TwilightKiddy I can see how you can get there, but the MITM would need to know the hashing algo, you can’t *really* just un-hash something, at least not reliably
But your original statement was that the hashing was the privacy violation, and that’s the part I took issue with, hashing is a generally accepted security measure, it is not inherently a privacy violation
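To make the hashing point concrete, here’s a rough sketch in Python. The email address and candidate list are made up for illustration, and SHA-256 is just an assumed choice of algorithm. It shows both sides: a digest can’t be inverted directly, but if the input comes from a small, guessable set, anyone who knows the algorithm can hash candidates and compare.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as a hex string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def guess_preimage(target_digest, candidates):
    """Hash each candidate and return the one whose digest matches, else None."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_digest:
            return candidate
    return None


# Hypothetical data, for illustration only.
observed = sha256_hex("alice@example.com")
candidates = ["bob@example.com", "alice@example.com", "carol@example.com"]

print(len(observed))  # 64 hex characters, regardless of input length
print(guess_preimage(observed, candidates))            # matches only because the input was guessable
print(guess_preimage(observed, ["dave@example.com"]))  # no match: the digest alone reveals nothing
```

So hashing on its own isn’t the problem; how guessable the input is (and whether a salt or keyed hash is used) decides how much an observer can recover.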