Day 13 of Article 13 Passage: Filter Developers Admit No System's Perfect

Article 13 would effectively require platforms to use upload filters. One reporter canvassed different developers for answers. Those answers aren’t good.

For quite some time now, one of the criticisms of the now-passed copyright directive has been that upload filters are, at best, prone to mistakes. At worst, critics say, upload filters are a broken technology that doesn’t exist in any reasonable capacity.

Already, investment money and innovation have been heading for the exits now that Europe has become hostile to small businesses and startups. It’s as if business knows that Europe has become an unworkable location thanks to the censorship machines. This comes on top of the free speech consequences such technology brings to the table. Still, common sense didn’t stop lobbyists and politicians from pushing this law through anyway.

So, one German reporter decided to canvass different content detection software companies to get, pardon the pun, an unfiltered reality check on the situation. In a report on Spiegel Online (German), the reporter seemingly set out to answer a few simple questions. One question: how much does such censorship technology cost? Another question: how accurate is the identification software out there anyway?

The first place the reporter started was YouTube. That platform is known for its heavily criticized ContentID system. While some say that the system has improved, users have long criticized it for false positives. So, it’s by no means perfect. Still, YouTube is in a unique position in that it developed its technology in-house. By some estimates, this cost the company around $100 million to build. Obviously, for small and medium-sized businesses online, the term “sticker shock” seems like a gross understatement.

From there, the reporter went on to US-based Audible Magic. That developer does offer technology used to identify copyrighted works. However, the company admits, it’s used more for information purposes than for censorship purposes. The company does boast of no false positives, but of course, the technology is only good at finding possible matches and leaves the rights holders or platforms to act on what it finds. So, even if the technology is used, it will still require manpower to fully implement such a system. The company also offered prices. The costs start at $1,000 USD a month. That’s for 10,000 files. From there, the prices gradually increase as more information is put into the database. 30,000 files means the system costs $2,000 per month.

Of course, we’re not talking about 10,000 files. We’re talking about somewhere in the neighbourhood of tens of millions of works at the very least. If you have, say, 30 million works, that’s 1,000 times the optimistic value of 30,000 files. Multiply the price by 1,000 and you get $2 million per month. Even if you get some kind of 50% discount, you’re still looking at $1 million a month, or $12 million per year. How many small startups have an extra $12 million per year in their budget just burning a hole in their pocket? You might get lucky and have some sort of Sears-killing venture capital to get you off the ground for the first year, but after that, what are the odds of making that kind of money to begin with?

Now, factor in the manpower needed to go in and confirm whether an upload is legitimate or straight-up piracy. Even if you could get away with 100 people at $1,200 per month (which is, optimistically, part time and minimum wage), you’re looking at another $120,000 per month on top of the licensing fees. All this, of course, is just to bring the business into compliance with European copyright law. Supporters of censorship might say that it’s just the cost of doing business, but when the cost of doing business is this astronomical, it’s by no means a small, solvable problem.
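
For anyone who wants to check the back-of-the-envelope math, here is a minimal Python sketch. It assumes, purely for illustration, that the quoted rate of $2,000 per month for 30,000 files scales linearly up to 30 million works (a factor of 1,000), that a 50% discount applies, and that 100 reviewers cost $1,200 each per month. The function names are made up for this example, and the report does not say what Audible Magic would actually charge at that scale.

```python
# Hypothetical cost model. Linear scaling of the quoted tier price is an
# assumption for illustration, not Audible Magic's actual price list.

def monthly_license_cost(works, files_per_tier=30_000,
                         price_per_tier=2_000, discount=0.5):
    """Scale the quoted tier price linearly to `works`, then apply a discount."""
    tiers = works / files_per_tier          # 30 million / 30,000 = 1,000 tiers
    return tiers * price_per_tier * (1 - discount)


def monthly_review_cost(reviewers=100, wage_per_month=1_200):
    """Cost of human reviewers who confirm or reject the filter's matches."""
    return reviewers * wage_per_month


license_cost = monthly_license_cost(30_000_000)
review_cost = monthly_review_cost()
total_per_month = license_cost + review_cost

print(f"Licensing: ${license_cost:,.0f} per month")
print(f"Reviewers: ${review_cost:,.0f} per month")
print(f"Total:     ${total_per_month:,.0f} per month, "
      f"${total_per_month * 12:,.0f} per year")
```

Swapping in a different catalogue size, discount, or headcount shows how quickly the bill moves, but under these assumptions the licensing fee alone dwarfs anything a typical startup could absorb.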

After that, the reporter spoke about a German-based company and a Chinese-based company. While they boast of different methods of identifying works, both said that their systems are not really meant for such a massive online filter.

This leaves a very important technological dilemma: what do you do in this situation? The reporter spoke to another IT company and came to the conclusion that there is a choice to be made here: either employ armies of content “controllers” or accept mistakes. The former is a huge cost (as we pointed out with the Audible Magic example). Meanwhile, the latter is a question of tolerance. How many mistakes are both the multinational corporations and the user base willing to tolerate?

For small businesses and would-be startups, there really is no upside to any of this. Either your service is going to be terrible or you’re probably going bankrupt. It’s little wonder that so much innovation and investment money is fleeing the continent right now thanks to the law.

Drew Wilson on Twitter: @icecube85 and Google+.