Content Classification System Post Mortem
The IFTAS CCS project was a pilot project to provide CSAM detection and reporting for Mastodon servers. The bulk of the project ran for 26 weeks, and while we cannot afford to maintain the service any longer, the findings below can inform future projects. All numbers are rounded for readability.
Pilot Activity
CCS received posts from eight services hosting roughly 450,000 user accounts, of which about 30,000 were active monthly.
Our participants represented a range of service sizes from <10 to >100,000 accounts, and a range of registration options (open registration, open subject to approval, invitation only).
During the pilot period, CCS received 3.9 million posts via webhook, roughly 23,000 per day. These posts represent messages that active users on the participating services either authored or interacted with, causing media to be stored on the host service.
Just under 40% (1.55 million) of all posts received included one or more media attachments to classify, leading to 1.86 million media files to hash and match. Posts with no media were discarded by the service.
Of the 1.86 million files, small numbers were either unsupported formats (~2,000) or no longer available when CCS attempted to retrieve the media for classification (~1,600). An additional ~3,100 media files failed to download.
In total, of the 1.86 million media files sent to IFTAS for classification, 99.665% were hashed and matched.
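The intake pipeline described above (discard media-less posts, fetch and hash each attachment, match against known databases) can be sketched roughly as follows. The post structure, function name, and use of an exact-hash set are illustrative assumptions, not the CCS implementation:

```python
import hashlib

def process_post(post: dict, known_hashes: set[str]) -> list[str]:
    """Illustrative pipeline step: skip posts without media, hash each
    attachment, and return any hashes matching a known database.
    (Hypothetical structure, not the actual CCS code.)"""
    matches = []
    for attachment in post.get("media_attachments", []):
        data = attachment.get("bytes")
        if data is None:
            continue  # media unavailable or failed to download
        digest = hashlib.sha256(data).hexdigest()
        if digest in known_hashes:
            matches.append(digest)
    return matches
```

In practice the retrieval step is where the small failure counts above come from: unsupported formats, media deleted before fetch, and download errors.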
The hash matcher flags media for human review when it finds an exact or near match, and after review IFTAS filed 53 reports covering 80 media files with NCMEC. This works out to 4.29 matches per 100,000 media files. A further set of matched media files were beyond our human review expertise to classify adequately, so we elected not to report them.
All of the matched media and subsequent reports were of real human victims, none were fictional, drawn, or AI generated. We did not receive matches for “lolicon”.
We elected to match against a broad array of databases to ascertain their effectiveness, and we found that databases maintained by child hotline NGOs (e.g. NCMEC, Arachnid) were far more effective than databases available from commercial service providers. We saw a handful of false positives, and the vast majority of them came from commercial providers. If we had continued, we would have narrowed down the databases in use.
All matched media generated a notification to the affected service provider, and IFTAS performed any necessary media retention for law enforcement.
Context
4.29 matches per 100,000 may not sound like a large number. However, to be clear, this is a higher number than many services would expect to see, and it includes a broad range of media, from “barely legal” minors posted publicly, to intimate imagery shared without consent, to the very, very worst media imaginable. In some cases, it was apparent that users were creating accounts on host services to transact or pre-sell media before moving to an encrypted platform, under the belief that Mastodon would not be able to detect the activity.
There are 1.6 billion posts on the ActivityPub network today, and if our match rate holds across the network, there are currently many tens of thousands of copies of known CSAM on it. The true figure is likely significantly higher: our service adopters, by definition, exclude providers disinclined to mitigate this issue, and criminals looking for anonymous accounts are likely to target less-moderated services.
If IFTAS found it happening so brazenly on the first servers we happened to look at, no doubt this activity is still occurring on servers that have no such protections. Mastodon is – at its simplest – a form of free, anonymous web hosting. The direct messaging feature precludes moderators and administrators from being aware of illegal content (it will never be reported by potential customers), and only a hash and match system is able to find these media and flag them.
Not only does inadvertently hosting CSAM revictimise the children involved, it also serves as an attack vector for the service to be targeted by law enforcement. We are aware of several instances of CSAM being uploaded for the express purpose of causing moderator trauma or triggering an immediate report to law enforcement, creating significant legal exposure. This is essentially a form of swatting: simply upload CSAM, report it to the authorities, then sit back and watch the server get taken down, with possible criminal charges for the administrator.
Responsible Shutdown
We ensured that all webhooks were disabled by the host services, and once all review and reporting was completed, we hard-deleted all remaining data on the service, excepting the metadata and media required to be held for one year for possible law enforcement action. The AWS environment was then dismantled, deleted, and removed from service.
All associated staff and consultants were removed from the relevant IT services, and IFTAS retains no data nor metadata from any of the activity other than the bare minimum required by law pertaining to the encrypted media stored for law enforcement.
Some services we observed were so clearly unmoderated, or so willing to host this content, that federating with them would create legal concerns; these were added to the IFTAS DNI denylist.
Next Steps
Moderation Workflow
We hope that Mastodon, Pixelfed, Lemmy and other platform developers will quickly implement safeguards within moderation dashboards to minimise moderator trauma.
Content moderators commonly experience trauma similar to that suffered by first responders. Even if the development team has never reviewed traumatic content, the app or service will at some point deliver such content to users of the moderation workflow. When presenting reported content to a service provider or moderator:
- Always show the report classification clearly, so the moderator is aware of the type of content they are about to review,
- Blur all media until the moderator hovers to view a greyscale version (re-blur on mouseleave or when hover is no longer detected),
- Greyscale all media until the moderator clicks to toggle full colour (allow toggle state back to greyscale),
- Mute all audio until the moderator requests audio, and
- Allow the moderator to reclassify the report.
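Taken together, these safeguards amount to a small reveal state machine per media item: obscured by default, with each step toward full exposure requiring an explicit moderator action, and hover loss re-blurring automatically. A minimal sketch of that logic (names and structure hypothetical, independent of any particular platform's dashboard):

```python
from dataclasses import dataclass

@dataclass
class MediaReveal:
    """Tracks how much of a reported media item is revealed.
    The default is maximally obscured; each reveal step requires
    an explicit moderator action."""
    hovered: bool = False  # hover shows greyscale; leaving re-blurs
    colour: bool = False   # explicit toggle between greyscale and colour
    audio: bool = False    # muted until the moderator requests audio

    def state(self) -> str:
        if not self.hovered:
            return "blurred"
        return "full-colour" if self.colour else "greyscale"

    def mouse_enter(self) -> None:
        self.hovered = True

    def mouse_leave(self) -> None:
        self.hovered = False  # re-blur on mouseleave

    def toggle_colour(self) -> None:
        self.colour = not self.colour  # allow toggling back to greyscale

    def request_audio(self) -> None:
        self.audio = True
```

The key design choice is that no state ever advances toward exposure without a moderator action, while loss of hover always retreats to the blurred state.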
CSAM Detection
If you are a service provider, lobby your web host or CDN provider to perform this service for you, and ask them if they have resources you can use.
Cloudflare offers a free service worldwide; if you are a Cloudflare customer, consider enabling this option.
If you are a web host that hosts a large number of Fediverse providers, consider adding this safeguard at the network level.
Free Support from Tech Coalition
Tech Coalition has a program aimed at small and medium services called “Pathways”, and they are very interested in hearing from Mastodon and other Fediverse service providers. While this does not offer detection, it does offer background, guidance, and access to experts. Sign up to explore these options and to demonstrate a good-faith effort to address this issue. The more providers they hear from, the more likely we are to get better options.
Ongoing Work
We are aware of noteworthy efforts to continue this work. @thisismissem is working on a prototype implementation of HMA (Hasher-Matcher-Actioner), and Roost is exploring an open source solution for small and medium size services.
Consider following and monitoring https://mastodon.iftas.org/@sw_isac to receive alerts when services are confirmed to be sources of this content.
A range of services and resources that can help mitigate this issue are available on our CSAM Primer page in the IFTAS Connect Library. We will continue to research and share resources that can help mitigate this issue for service providers. Please let us know if you are aware of additional resources we can add to this guide.
IFTAS intends to continue its relationships with INHOPE, NCMEC, Project Arachnid, Internet Watch Foundation and other organisations to advocate for the Fediverse, and to ensure these entities understand the network and have someone to talk to if they have questions.
To everyone who participated, asked to participate, or supported this project, thank you! We are extremely sad to have to end this project, but we have safeguarded the underlying codebase and – should the opportunity arise – we will restart with this or another resource to provide this service to any who need it.