The Nexus of Discussions

LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)

12 Posts 5 Posters 14 Views
    fedipact@cyberpunk.lol wrote:

    LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)

    "The tech giant is sidestepping guardrails that websites use to prevent being scraped, data show, in a move whistleblowers say is unethical and potentially illegal."

    ARTICLE: https://www.dropsitenews.com/p/meta-facebook-tech-copyright-privacy-whistleblower

    FULL PDF: https://www.dropsitenews.com/api/v1/file/b3555944-e204-4f5e-9a64-e44281b19a82.pdf

    #FediPact #meta #threads #AI

    fedipact@cyberpunk.lol wrote (#2):

    INSTANCES KNOWN TO HAVE BEEN SCRAPED BY META INCLUDE:

    • mastodon.social

    • mastodon.online

    • tech.lgbt

    • hackers.town

    • chaos.social

    • mastodon.org.uk

    • mastodont.cat

    • mastodon.de

    • mastodon.xyz

    • mastodon.coffee

    • mastodon.cloud

    • mastodon.scot

    • mastodonapp.uk

    • mastodon.green

    • mastodon.ml

    • mastodon.au

    • mastodon.eus

    • mastodonczech.cz

    • mastodon.sdf.org

    • mstdn.social

    • troet.cafe

    • techhub.social

    • tchncs.de

    • kolektiva.social

    • mamot.fr

    • defcon.social

    • meow.social

    • social.linux.pizza

    • ioc.exchange

    • eldritch.cafe

    • yiff.life

    • furry.engineer

    • infosec.exchange

    • blahaj.zone

    • woof.group

    • union.place

    • queer.party

    • sakurajima.moe

    • pawb.social

    • digipres.club

    • journa.host

    • corteximplant.net

    • corteximplant.com

    • octodon.social

    • bitbang.social

    • jorts.horse

    • tenforward.social

    • pnw.zone

    • spore.social

    • hear-me.social

    • neuromatch.social

    • vt.social

    • cosocial.ca

    • chitter.xyz

    • tooter.social

    • cloudisland.nz

    • social.seattle.wa.us

    • masto.es

    • nobigtech.es

    • mastodon.gal

    • masto.host

    • toot.community

    • pony.social

    • climatejustice.global

    • pleroma.envs.net

    • indiepocalypse.social

    • anarchism.space

    • disroot.org

    • dragonscave.space

    • toot.bike

    • fuzzies.wtf

    • norden.social

    • beige.party

    • ohai.social

    • freeradical.zone

    • metalhead.club

    • treehouse.systems

    • icosahedron.website

    • sunbeam.city

    • sunny.garden

    • zeroes.ca

    • ursal.zone

    • chaosfem.tw

    • mas.to

    • mathstodon.xyz

    • rubber.social

    • todon.nl

    • cupoftea.social

    • nerdculture.de

    • toad.social

    there're definitely more, i just did ctrl+f when i thought of an instance name so i definitely missed some. will be editing this list to add them as i think of them

    #FediPact #meta #threads

      fedipact@cyberpunk.lol wrote (#3):

      i'm gonna be editing that list as i think of more so be sure to view it directly on cyberpunk.lol to make sure you get the whole thingy

      #FediPact #meta #threads

        essjayjay@tech.lgbt wrote (#4):

        @mods

        Is this true WRT tech.lgbt?

        @FediPact

          bluestarultor@tech.lgbt wrote (#5):

          @essjayjay @mods @FediPact This is the first I'm personally hearing of it, but you do have to understand that scraping does not have to be a consensual process and scrapers have been doing all sorts of shady stuff to hide themselves. I can't personally speak more on the topic. However, I have raised it to the team to draft a proper response.

            victimofsimony@infosec.exchange wrote (#6):

            @bluestarultor
            @essjayjay @mods
            @FediPact

            You said scraping was legal. Presuming we're talking about the U.S.A. here, can you explain how that can be in a country that presumes everything I write defaults to being subject to my personal copyright?

              thenexusofprivacy@infosec.exchange wrote (#7, last edited by thenexusofprivacy@infosec.exchange):

              At least so far, individuals haven't succeeded in copyright claims against web scrapers. Here's a good article on the US legal landscape as of a couple of years ago (with the caveat that it's by somebody who sees scraping as generally a good thing): https://blog.ericgoldman.org/archives/2023/08/web-scraping-for-me-but-not-for-thee-guest-blog-post.htm

              From a privacy perspective, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4884485 looks at the challenges.

              @VictimOfSimony @bluestarultor @essjayjay @FediPact

                bluestarultor@tech.lgbt wrote (#8):

                @thenexusofprivacy @VictimOfSimony @essjayjay @FediPact Also literally no one in this thread said it was legal. XD

                Even the original article notes that it's illegal to be slurping up copyrighted works, but that they failed to convince the judge of meaningful damages meriting restitution.

                I said scraping is "not necessarily consensual" and that's because various sites have entered partnerships to sell off their users' creations with some half-assed nod to getting their consent.

                  thenexusofprivacy@infosec.exchange wrote (#9):

                  Fair enough, I was just responding to @VictimOfSimony's question about scraping and copyright.

                  @bluestarultor @essjayjay @FediPact

                    victimofsimony@infosec.exchange wrote (#10, replying to #7):

                    @thenexusofprivacy
                    @bluestarultor
                    @essjayjay
                    @FediPact

                    This article seems to think the problem is that a third party is asserting the copyright. The fact that these class actions are becoming more popular with first parties seems to suggest you're mistaken. Also, the trespass issue I mentioned remains since there is no implied right of access to chattels for an illegal purpose. There's a tort here.

                      victimofsimony@infosec.exchange wrote (#11, replying to #9):

                      @thenexusofprivacy
                      @bluestarultor
                      @essjayjay
                      @FediPact

                      We do appreciate the response.

                        thenexusofprivacy@infosec.exchange wrote (#12, replying to #10):

                        There are quite a few class actions in progress and it'll be interesting to see how things play out. And even though the plaintiffs in the Meta case didn't succeed, the court certainly left the door open to other attempts -- and arguably even encouraged them. https://www.technologyreview.com/2025/07/01/1119486/ai-copyright-meta-anthropic/ is a good overview of the Meta and Anthropic cases, and as they point out, the wins for the tech companies are less cut-and-dried than they seem at first.

                        Still, even though the answer may be different at some point, right now I think it's still true that so far individuals haven't succeeded in copyright claims against scrapers.

                        @VictimOfSimony @bluestarultor @essjayjay @FediPact
