• davel@lemmy.ml
    English
    arrow-up
    6
    arrow-down
    1
    ·
    2 days ago

    Also, I figure (if it hasn’t happened already) some federated instances out there are nefarious, set up to harvest data.

    [Citations needed] or it didn’t happen. There’s precious little extra information that a “nefarious” instance can harvest that any basic web scraper can’t.

    • grey_maniac@lemmy.ca
      arrow-up
      2
      arrow-down
      1
      ·
      1 day ago

      [Citations needed] or it didn’t happen.

      This is such a bullshit challenge. I often see it used to essentially bully someone into a side issue about citations. It’s a great way to avoid discussing the original issue.

      I have knowledge (that I rarely share) that I am absolutely not going to cite, because I’m not jeopardising sources or clearances, or violating my obligations under the Official Secrets Act, just to play someone’s status games.

      If someone makes a claim, I am perfectly able to go find the relevant citations myself, if there are any. I am more interested in the structure and content of what they’re adding to the discussion.

      • davel@lemmy.ml
        arrow-up
        2
        ·
        1 day ago

        I often see it used to essentially bully someone into a side issue about citations. It’s a great way to avoid discussing the original issue.

        You may well have, but that’s not what I’m doing. I’m familiar with ActivityPub’s & Lemmy’s APIs, and I’m calling bullshit on OP’s hyperbolic claim without evidence or elaboration.

        • grey_maniac@lemmy.ca
          arrow-up
          1
          ·
          23 hours ago

          So, from your knowledge of those APIs, this isn’t possible? I don’t need to develop a defensive protocol for it? I like to be comprehensive, especially with a potential (ideological and propaganda, if not literal) invasion from the new fascist state to my south, but if this is a low-level probability, I can put it way down my priority list.

          • davel@lemmy.ml
            English
            arrow-up
            2
            ·
            23 hours ago

            If privacy is what you’re looking for, ActivityPub is never going to provide it, because the protocol wasn’t designed for it and privacy can’t be retrofitted onto it. You should log off and use (or create) something else altogether.

    • mox@lemmy.sdf.org
      arrow-up
      9
      arrow-down
      2
      ·
      edit-2
      2 days ago

      [Citations needed] or it didn’t happen.

      I think this mindset is naïve and unrealistic.

      People were saying the same thing for decades in response to a small minority warning about government surveillance, often dismissing them with labels like “paranoid”. Eventually, Snowden came along and produced the citations, at extreme risk to himself and his loved ones. It’s an anomaly that they were ever revealed at all.

      History is replete with examples of bad stuff going on for ages before irrefutable evidence of it became widely known. In general, if something can be abused to someone’s advantage, it will be, and likely already is.

      There’s precious little extra information that a “nefarious” instance can harvest that any basic web scraper can’t.

      You have a point there, but consider also that effective web scraping uses significantly more resources than having the data you want handed to you. Monitoring Lemmy through federation would be much more efficient.

    • transitinoir@slrpnk.net
      English
      arrow-up
      5
      ·
      2 days ago

      Can’t an instance also collect IP addresses and device info if its owner adds some scripts to its web version?

      • davel@lemmy.ml
        English
        arrow-up
        10
        ·
        2 days ago

        An instance owner can only collect the IP addresses and browser fingerprints of users logged in to their instance. In other words, only slrpnk.net could collect that information about you, because you are only directly connecting to slrpnk.net.
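
        To illustrate, here is a minimal Python sketch of what a remote instance might see when a comment federates to it. The payload is a simplified, hypothetical ActivityPub `Create` activity: the user URL, the comment ID, and the helper names are made up for illustration, and real Lemmy activities carry more fields. Assuming this general shape, the receiving server gets the public handle and content, but nothing about the author's network connection or browser.

```python
import json

# Hypothetical, simplified "Create" activity as a remote instance might receive it.
# The actor URL and object ID are invented for this example.
SAMPLE_ACTIVITY = json.loads("""
{
  "type": "Create",
  "actor": "https://slrpnk.net/u/example_user",
  "object": {
    "type": "Note",
    "id": "https://slrpnk.net/comment/1",
    "content": "Can an instance collect IP addresses?",
    "published": "2025-01-01T00:00:00Z",
    "to": ["https://www.w3.org/ns/activitystreams#Public"]
  }
}
""")

# Connection-level details that only the server you connect to directly can observe.
# These key names are illustrative, not part of any specification.
CLIENT_ONLY_FIELDS = ("ip_address", "user_agent", "cookies")


def federated_fields(activity: dict) -> dict:
    """Return the author-related data a receiving instance can read from the activity."""
    obj = activity.get("object", {})
    return {
        "actor": activity.get("actor"),
        "content": obj.get("content"),
        "published": obj.get("published"),
        "audience": obj.get("to", []),
    }


def leaked_client_fields(activity: dict) -> list:
    """List any connection-level fields present in the activity or its object."""
    obj = activity.get("object", {})
    return [f for f in CLIENT_ONLY_FIELDS if f in activity or f in obj]


if __name__ == "__main__":
    print(federated_fields(SAMPLE_ACTIVITY))
    print(leaked_client_fields(SAMPLE_ACTIVITY))
```

        Under these assumptions, `leaked_client_fields(SAMPLE_ACTIVITY)` comes back empty: IP addresses and user agents belong to the HTTP connection between you and your home instance, not to the payload that gets federated onward.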

    • Ebby@lemmy.ssba.com
      arrow-up
      2
      ·
      2 days ago

      Credit where due, it is just my best guess. I have no evidence.

      I simply think that if you have custom code on a machine to ingest data, creating a federation interface may be more suitable and stable in the long run than a scraper. A scraper’s extra server load may draw attention or run afoul of security policies designed to block scrapers.

      But that is certainly an option.