Age Verification: Yoti Gets Wrapped Up in Massive Privacy Scandal

A recent report suggests that age verification company Yoti is harvesting far more information than necessary and sharing it with invisible fourth parties.

Experts and observers have long warned about the dangers of age verification. One warning is that age verification is a massive intrusion on people’s privacy, going far beyond just checking the age of its users. Another is that today’s age verification systems are ineffective at what they are tasked to do. In multiple jurisdictions, those concerns have largely gone ignored and, as a result, they have gone from theoretical worries to playing out in real life.

Age verification systems are failing in spectacular fashion. Kids are defeating them with sharpies, pictures of golden retrievers, and VPNs. On the privacy front, people who try to be honest and fork over their personal information are getting, well, screwed over. Examples include the Discord data breach, the AgeGO scandal, and additional leaks and breaches.

Recently, I learned of another massive scandal on the age verification front. This time, it involves another major age verification vendor: Yoti. Yoti boasts of being used by 60% of the internet. It is used by PlayStation, Meta, and TikTok among others. According to researchers, Yoti is apparently harvesting far more information than is necessary to verify people’s ages. That’s bad in and of itself, but it gets worse. It also shares that information with ‘invisible fourth parties’. From Kotaku:

As spotted by Futurity, the report, titled “Papers, Please: A First Look at Age Verification on the Web,” was recently presented at the IEEE Symposium on Security and Privacy conference on May 18, states that Yoti’s data collection methods “paint a concerning picture of privacy and effectiveness of age verification.”

According to the report, Yoti’s age verification software “collects a significant amount of high-resolution data about the user’s device” during its checks, even though said information does not appear to be “necessary in estimating the age of a user.” This specifically includes information gathered from the device during the age verification process, such as “OS version strings, available RAM, connection type, and CPU architecture.” The report also states that the “uniquely identifiable” information could be used to allow for “unpermissioned tracking of the user’s device.”

However, the most worrying discovery is “that Yoti relies on sharing sensitive user information with several less user-visible fourth parties,” including the payment processor Stripe. The paper notes that Stripe “collects significant telemetry that could likely be used to uniquely identify a device,” which includes information scraped from the first-party website used to verify user’s age via Yoti’s software: “We find that the service collects significant private information beyond what is strictly necessary to verify age, including high-entropy browser and device metadata, and other granular telemetry.”

The report goes on to say that when Yoti was contacted about this, the company said that this whole thing was just a “bug” and that it has since fixed the issue, something the researchers had no way of confirming. That sounds more like someone who got caught than anything else. How do you “accidentally” collect vast troves of telemetry information and share it with hidden parties afterwards? It seems pretty suspect to me anyway.

Anyway, if you want to read the original paper, you can check it out here (PDF). Here are some fun little snippets from it:

5.5.1. High-entropy data collection. As noted in Section 5.3.1, Yoti collects a significant amount of high resolution data about the user’s device. It is unclear what the use of this data is, and we note that little information collected here appears to be necessary in estimating the age of a user, assuming that one is doing so purely from the image captured or the user’s ID. We further note that much of what is collected (OS version strings, available RAM, connection type, and CPU architecture) is also gathered by well-known fingerprinting libraries (e.g., [62]). Along with the user’s IP address, it is likely that this data is uniquely identifiable, allowing for unpermissioned tracking of the user’s device.
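To make the fingerprinting concern a little more concrete, here is a rough Python sketch of why a handful of harmless-looking device attributes can add up to something identifying. It is not taken from the paper or from Yoti’s code. The attribute names mirror the ones quoted above, but the population shares in `ASSUMED_SHARE` are invented for illustration, the attributes are assumed to be independent, and the `surprisal_bits`, `total_bits`, and `fingerprint` helpers are hypothetical names.

```python
import hashlib
import math

# Hypothetical share of devices reporting each observed value.
# These numbers are made up for illustration; they are not measurements.
ASSUMED_SHARE = {
    "os_version": 0.02,
    "ram_gb": 0.25,
    "connection_type": 0.5,
    "cpu_arch": 0.5,
}


def surprisal_bits(share: float) -> float:
    """Bits of identifying information in a value seen on `share` of devices."""
    return -math.log2(share)


def total_bits(shares: dict) -> float:
    """Sum the bits across attributes, assuming they are independent."""
    return sum(surprisal_bits(p) for p in shares.values())


def fingerprint(attributes: dict) -> str:
    """Hash the attributes in a fixed key order into a short, stable identifier."""
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# A made-up device report using the attribute types named in the paper.
device = {
    "os_version": "14.4.1",
    "ram_gb": "8",
    "connection_type": "wifi",
    "cpu_arch": "arm64",
}

bits = total_bits(ASSUMED_SHARE)
print(f"{bits:.2f} bits, or about 1 in {2 ** bits:,.0f} devices")
print("stable identifier:", fingerprint(device))
```

With these made-up shares, four attributes alone narrow a device down to roughly 1 in 800. Each extra attribute shrinks that group further, and adding the IP address shrinks it further still, which is the point the paper is making about unpermissioned tracking.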

5.5.6. Additional Data Sharing. Table 3 lays out Yoti’s data sharing declarations per its privacy policies [64], [65]. During ID verification for a US driver’s license or state ID, Yoti queries the American Association of Motor Vehicle Administrators (AAMVA) to check the validity of the data extracted from the user’s ID. In states that do not allow this through the AAMVA, and for other checks of US documents, Yoti relies on Aristotle, a DC-based data broker [66]. It also relies on Veratad, a US-based provider of identity verification, age verification, and fraud prevention [67].

6. Concluding Discussion & Recommendations. Our observations paint a concerning picture of privacy and effectiveness of age verification. Compliance is low: only roughly 14% of sites self-labeling as adult content perform age verification in states with mandates. Worse, sites that do comply via the dominant provider subject users to significant privacy risks. These results have important implications for future technical and policy designs, which we outline below.

This is obviously not good. Is it surprising? Not really. The threat of these companies selling personal information to third parties was always something that people like us warned about. In this case, the information is getting “shared” with other parties instead. Even if no money is changing hands for said information, this isn’t really much of an improvement. The concern about where that information ends up is still there.

At any rate, people sometimes ask why people like me are so dead set against age verification in the first place. This is an excellent example of why that is. Contrary to some accusations, it’s not because we don’t care about children accessing porn. It’s the negative ramifications of the proposed systems that are the problem. The solution is substantially worse than the “problem”.

Drew Wilson on Mastodon, Bluesky and Facebook.