Earlier in the year, Google filed an application to patent new methods, systems, and media for what the giant calls “identifying videos containing objectionable content” that are uploaded to a social site or video service – YouTube, for example, though the filing doesn’t explicitly name the platform.
The patent application, which was published just this month, is somewhat different from the other automated “methods and systems” Google and other giants, notably Microsoft, already have in place to power their censorship apparatus; with this one, the focus is more on how AI can be added to the mix.
More and more countries are introducing censorship laws that make the speed at which content is removed or accounts are blocked a central requirement for social media companies. Google could have this in mind, given that the patent’s stated purpose is to improve on detecting objectionable content quickly, “for potential removal.”
No surprise here, but what should be the key question – namely, what is considered “objectionable content” – is answered less with a definition than with a list that can be further expanded, variously interpreted, and so on. The list includes violence, pornography, objectionable language, and animal abuse, with the cherry on top being “and/or any other type of objectionable content.”
The filing details how Google’s new system works, and we learn, equally unsurprisingly, that AI here means machine learning (ML) and neural networks. This technology is supposed to mimic the human brain but comes down to a series of equations, differentiated from ordinary algorithms by “learning” what an image (or, in this case, a video) is, pixel by pixel.
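To make that idea concrete, here is a minimal sketch of a small convolutional network that maps a frame’s raw pixels to an “objectionable or not” score. It assumes PyTorch, and the layer sizes and labels are purely illustrative; nothing in this snippet comes from the filing itself.

```python
import torch
import torch.nn as nn

class FrameClassifier(nn.Module):
    """Toy classifier: raw RGB pixels in, per-class logits out."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # look at pixels in small patches
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                     # summarize the whole frame
        )
        self.head = nn.Linear(32, num_classes)           # e.g. "objectionable" vs "fine"

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 3, height, width) tensor of pixel values
        return self.head(self.features(frames).flatten(1))

model = FrameClassifier()
dummy_frames = torch.rand(4, 3, 224, 224)   # four fake RGB frames
print(model(dummy_frames).shape)            # torch.Size([4, 2])
```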
From the description of the process, it seems we are looking at a type of generative adversarial network (GAN) system: two neural networks “competing” while “learning.”
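For readers unfamiliar with the term, the sketch below shows the generic GAN idea in miniature: a “generator” and a “discriminator” trained against each other, each improving because the other does. It again assumes PyTorch and made-up dimensions, and illustrates adversarial training in general, not the patent’s actual design.

```python
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 16, 64, 32

# Two small networks set against each other: the generator fabricates samples,
# the discriminator tries to tell them apart from real ones.
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

real = torch.randn(batch, data_dim)          # stand-in for real training data

# Discriminator step: label real samples 1, generated samples 0.
fake = generator(torch.randn(batch, latent_dim)).detach()
d_loss = loss_fn(discriminator(real), torch.ones(batch, 1)) + \
         loss_fn(discriminator(fake), torch.zeros(batch, 1))
d_opt.zero_grad(); d_loss.backward(); d_opt.step()

# Generator step: try to make the discriminator call its output "real".
fake = generator(torch.randn(batch, latent_dim))
g_loss = loss_fn(discriminator(fake), torch.ones(batch, 1))
g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```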
Google says it creates “embeddings” from the content, or even its metadata, and compares these to embeddings of other known objectionable content (both “float” in two separate “multi-dimensional spaces” that contain other “embeddings”).
That’s one way of saying, “huge data sets are needed here.”
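In practice, that comparison can be as simple as measuring how close the new upload’s embedding sits to embeddings of already-flagged material. The sketch below uses NumPy, random vectors, and an assumed cosine-similarity threshold purely for illustration; the filing does not spell out the actual similarity measure or cutoff.

```python
import numpy as np

rng = np.random.default_rng(0)

# Embeddings of previously flagged videos (in reality, a very large data set).
known_objectionable = rng.random((10_000, 256))

# Embedding of a newly uploaded video, produced by some upstream model.
new_video = rng.random(256)

# Cosine similarity between the new embedding and every known-bad embedding.
similarities = known_objectionable @ new_video / (
    np.linalg.norm(known_objectionable, axis=1) * np.linalg.norm(new_video)
)

THRESHOLD = 0.9   # assumed cutoff, purely for illustration
flagged = bool(similarities.max() >= THRESHOLD)
print(f"closest match: {similarities.max():.3f}, flagged for review: {flagged}")
```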
Some observers see this and other similar efforts to produce “AI-based” censorship technology as logical, given the volume of content the platforms deal with.
But then there are all sorts of problems, including both false positives and negatives in the never-ending hunt for “disinformation” and “harms.”