Sep 7 2023

Hashing and Matching is Core to Proactive CSAM Detection

Post By: Safer / 3 min read


The volume of child sexual abuse material (CSAM) found on the internet, and the rate at which it is found, are escalating at unprecedented speed and scale and cannot be stopped with human intervention alone. In 2004, the National Center for Missing & Exploited Children (NCMEC) reviewed roughly 450,000 CSAM files. Fast forward to 2022: NCMEC’s CyberTipline received over 88.3 million files of CSAM from electronic service providers alone. That’s an average of 1.7 million CSAM files reported per week.

Hosting this content is a potential risk for every platform that hosts user-generated content—be it a profile picture or expansive cloud storage space. In September 2019, The New York Times declared that “The Internet is Overrun With Images of Child Sexual Abuse.” Since then, the number of image and video files reported annually to NCMEC has grown, and new threats have emerged.

Thorn is committed to empowering the tech industry with tools and resources to disrupt CSAM at scale. Hashing and matching is one of the most important pieces of technology that you can deploy to help keep your users and your platform protected from the risks of hosting this content, while also helping to disrupt the viral spread of CSAM and the cycles of revictimization.

What is hashing and matching?

Hashing is one of Safer’s foundational technologies. Safer uses perceptual and cryptographic hashing to convert a file into a string of numbers called a hash value. A cryptographic hash is an exact fingerprint of a file’s bytes, while a perceptual hash fingerprints the visual content, so visually similar files produce similar hashes. Either way, the hash value acts like a digital fingerprint for each piece of content.
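As a rough illustration of the cryptographic side, here is a minimal sketch using Python’s standard `hashlib` (this is not Safer’s actual pipeline, and perceptual hashing algorithms work very differently):

```python
import hashlib

def cryptographic_hash(data: bytes) -> str:
    """Return a SHA-256 digest: an exact digital fingerprint of the bytes."""
    return hashlib.sha256(data).hexdigest()

# The same bytes always produce the same fingerprint...
assert cryptographic_hash(b"example image bytes") == cryptographic_hash(b"example image bytes")

# ...while changing even one byte produces a completely different one.
assert cryptographic_hash(b"example image bytes") != cryptographic_hash(b"Example image bytes")
```

Because the fingerprint, not the file, is what gets compared downstream, two platforms can recognize the same file without ever exchanging its content.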

Hashes are compared against Safer’s hash list that contains 29M+ hash values of previously reported and verified CSAM. The system looks for a match of the hash without ever seeing users’ content.
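Conceptually, matching behaves like a set-membership check on hash values: only fingerprints are compared, never the underlying content. A minimal sketch, using a hypothetical hash list in place of Safer’s 29M+ verified entries:

```python
import hashlib

# Hypothetical hash list of previously reported and verified material
# (illustrative values only -- real lists are maintained by NGOs).
known_hashes = {
    hashlib.sha256(b"previously-verified-file").hexdigest(),
}

def is_match(file_bytes: bytes) -> bool:
    """Compare the file's hash against the list; the content itself is never inspected."""
    return hashlib.sha256(file_bytes).hexdigest() in known_hashes

# A re-uploaded copy of a known file matches; new content does not.
assert is_match(b"previously-verified-file")
assert not is_match(b"some new upload")
```

A set lookup like this is constant-time on average, which is what makes matching feasible at the scale of billions of files.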

When a match is found, the file is queued for review by your team, which can then report it to authorized entities that refer it to law enforcement in the proper jurisdiction. Safer offers a Reporting API to send reports directly to NCMEC or the Royal Canadian Mounted Police (RCMP).

Only technology can tackle the scale of this issue

Millions of CSAM files are shared online every year. A large portion of these files is previously reported and verified CSAM that has already been added to an NGO hash list. Hashing and matching is a programmatic way to disrupt the spread of child sexual abuse material.

Additionally, investigators and Trust and Safety teams can spend less time reviewing repeat content. This frees them up to prioritize high-risk content, where a child may be suffering ongoing abuse. Learn more about how our CSAM classifier helps find new CSAM.

By using this privacy-forward technology to constrain CSAM at scale, we can protect individual privacy and advance the fight against CSAM.

Safer is helping our customers protect their platforms

Our all-in-one solution for proactive CSAM detection uses hashing and matching as a core part of its detection technology. With the largest database of verified CSAM hash values (29+ million hashes) to match against, Safer can cast a wide net to detect known CSAM.

In 2022, we hashed more than 42.1 billion images and videos for our customers. That empowered our customers to find 520,000 files of known CSAM on their platforms. To date, Safer has found 2.2M pieces of potential CSAM.

2.2M files of potential CSAM identified since Safer launched in 2019

Hashing and matching is crucial to protecting your users and your platform from the risks of hosting sexual abuse content. The more platforms that utilize this technology, the closer we will get to our goal of eliminating CSAM from the internet.
