Content moderation using AI

Everything in moderation, including moderation.

Abhi Avasthi
May 27, 2022
Content moderation on GitHub. Source: Wiki

In this digital era, billions of images, posts, tweets, blogs, reviews, testimonials, comments and videos are created and shared every day on social media sites and a variety of other communication channels. Much of the content generated by users of platforms like Twitter, Facebook, YouTube and TikTok is unregulated and requires continuous monitoring. It can contain potentially harmful material such as abuse, pornographic images, nudity, racial slurs or other unwanted content, and it needs to be moderated, filtered and removed to protect users and preserve fundamental rights.

Moderation generally refers to the practice of monitoring submissions and applying a set of rules that define what is acceptable and what is not.
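To make that definition concrete, here is a minimal sketch of rule-based moderation in Python. The rules, data class and example submission are invented purely for illustration and don't reflect any real platform's rule set.

```python
# Minimal sketch of rule-based moderation: each rule is a function that
# returns a reason string if a submission breaks it, else None.
# The rules and example data below are invented for illustration only.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Submission:
    author: str
    text: str

BANNED_TERMS = {"spamlink.example", "buy followers"}  # placeholder terms

def contains_banned_term(sub: Submission) -> Optional[str]:
    hits = [t for t in BANNED_TERMS if t in sub.text.lower()]
    return f"banned term(s): {hits}" if hits else None

def too_long(sub: Submission) -> Optional[str]:
    return "exceeds 5000 characters" if len(sub.text) > 5000 else None

RULES: List[Callable[[Submission], Optional[str]]] = [contains_banned_term, too_long]

def moderate(sub: Submission):
    """Return (accepted, reasons) after applying every rule."""
    reasons = [reason for rule in RULES if (reason := rule(sub)) is not None]
    return (len(reasons) == 0, reasons)

print(moderate(Submission("alice", "Check out spamlink.example!")))
# (False, ["banned term(s): ['spamlink.example']"])
```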

There is a lot of debate about whether content should be moderated at all, and both sides have come up with strong points. There are opinion pieces arguing for hate speech laws; on the other hand, there are equally good arguments that hate speech falls under free speech and is therefore a right.

Content moderation isn’t just limited to social platforms. Online retailers also use content moderation tools to display only quality, business-friendly content to consumers. A hotel booking website, for example, may leverage AI to scan all hotel room images and remove any that violate site rules (e.g., no people can be visible in a photo).
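As a rough sketch of that hotel-photo example, the filtering step might look something like the following. The person_confidence function, threshold and file layout are hypothetical placeholders, not any real site's pipeline.

```python
# Sketch of the hotel-photo rule above: keep only images that the model
# believes contain no visible people. `person_confidence` is a hypothetical
# stand-in for whatever detection model a real platform would use.
from pathlib import Path
from typing import List

def person_confidence(image_path: Path) -> float:
    """Placeholder: model's confidence (0.0-1.0) that a person is visible."""
    raise NotImplementedError("plug in a real person-detection model here")

NO_PEOPLE_THRESHOLD = 0.5  # example cut-off; a real system would tune this

def filter_listing_photos(photo_dir: Path) -> List[Path]:
    """Return the listing photos that pass the no-people rule."""
    kept = []
    for photo in sorted(photo_dir.glob("*.jpg")):
        if person_confidence(photo) < NO_PEOPLE_THRESHOLD:
            kept.append(photo)
    return kept
```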

Content moderation can also be a useful tool in fighting misinformation, propaganda and user targeting. At the same time, it shouldn't suppress free speech, dissent or alternative opinions, so platforms must tread very carefully.

Now where does AI fit into all of this? Around 2.5 billion gigabytes of data are created on the internet every day; for reference, a 300-page book is anywhere from 400–800 kilobytes. The amount of data created daily is truly mind-boggling, so manual moderation of websites like Facebook, Twitter and other social media is an unreasonable ask. AI content moderation, or at least AI-assisted content moderation, therefore seems like the obvious alternative.
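A quick back-of-the-envelope calculation, using the mid-range of that book estimate, shows the scale:

```python
# Back-of-the-envelope arithmetic for the figures above (decimal units).
daily_bytes = 2.5e9 * 1e9      # 2.5 billion gigabytes ~= 2.5 exabytes per day
book_bytes = 600e3             # a 300-page book, mid-range of 400-800 KB
books_per_day = daily_bytes / book_bytes
print(f"~{books_per_day:.1e} book-equivalents created per day")  # ~4.2e+12
```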

Even though social media companies have deployed AI content moderation algorithms, these have not been particularly effective, and questions have been raised about the algorithms being biased. Twitter's algorithms, however, have been found by research to be unbiased.

There are a few reasons why AI is not very effective at content moderation:

The context of a post often determines whether it violates the law or a content guideline, and machine learning systems tend to miss that context. Some of it, such as the speaker's identity or the sender and receiver of a message, can be included in a machine learning tool's analysis, but doing so has significant privacy consequences. Other sorts of context, such as historical, political and cultural context, are far harder to capture with technology.

Let's also consider the fact that social media is designed to maximize engagement. An accidental byproduct of that design is a polarizing environment, so there may be a conflict of interest between the algorithms that drive engagement and the ones that moderate content, and looking at the current scenario we can't be sure which one takes precedence.

Lack of representative and well-annotated training datasets
Machine learning systems develop their ability to detect and categorise different content based on the datasets they are trained on. Many systems are trained on publicly available labelled datasets; if these datasets do not include examples of speech in a variety of languages and from a variety of groups or communities, the resulting algorithms cannot comprehend how those groups communicate. Even if an algorithm works well for one language, it may not be effective at all for another, and sarcasm, nuance and context are really tough for AI to understand.
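To illustrate, here is a hedged sketch of the usual training setup, using scikit-learn and a tiny invented dataset. The point is simply that the classifier can only learn patterns present in its labelled examples.

```python
# Sketch of a supervised text classifier fit on a labelled dataset.
# The toy examples below are invented; a model trained only on English
# examples has learned nothing about other languages or communities.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "you are an idiot",            # labelled abusive (toy example)
    "have a great day everyone",   # labelled benign
    "nobody wants you here",       # abusive
    "thanks for sharing this",     # benign
]
train_labels = [1, 0, 1, 0]        # 1 = abusive, 0 = benign

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)

# A message in a language absent from the training set: the model has no
# basis for a meaningful prediction, whatever score it outputs.
print(clf.predict_proba(["tu es vraiment stupide"]))  # French, unseen language
```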

Facebook marked legitimate news articles about the coronavirus as spam at the outset of the pandemic. It mistakenly banned a Republican Party Facebook page for more than two months. And it flagged posts and comments about Plymouth Hoe, a public landmark in England, as offensive.

Things become much trickier when the content itself can’t be easily classified even by humans. A 2017 study by researchers in New York and Doha on hate speech detection found human coders reached a unanimous verdict in only 1.3% of cases.
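For a sense of what that 1.3% figure measures, here is a small illustrative snippet (labels invented) that computes the share of items on which every annotator gives the same label:

```python
# Share of items on which all annotators agree. The toy labels are invented;
# the cited study found unanimity in only about 1.3% of cases.
from typing import List

def unanimous_rate(annotations: List[List[str]]) -> float:
    """annotations[i] holds every annotator's label for item i."""
    unanimous = sum(1 for labels in annotations if len(set(labels)) == 1)
    return unanimous / len(annotations)

toy = [
    ["hate", "hate", "hate"],            # unanimous
    ["hate", "offensive", "neither"],    # split three ways
    ["offensive", "offensive", "hate"],  # majority, but not unanimous
]
print(f"{unanimous_rate(toy):.1%}")      # 33.3% on this toy sample
```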

“So long as people disagree about what crosses the line, no AI will be able to come up with a decision that all people view as legitimate and correct,” said Mitchell Gordon, a computer science PhD at Stanford.

The problem cuts both ways, though: failing to flag content can have even more dangerous effects. The shooters in both the El Paso and Gilroy shootings published their violent intentions on 8chan and Instagram before going on their rampages. Robert Bowers, the accused perpetrator of the massacre at a synagogue in Pittsburgh, was active on Gab, a Twitter-esque site used by white supremacists. Misinformation about the war in Ukraine has received millions of views and likes across Facebook, Twitter, YouTube and TikTok.

Another issue is that many AI-based moderation systems exhibit racial biases that need to be addressed in order to create a safe and usable environment for everyone.

It ultimately boils down to this: AI-based content moderation is essentially asking AI to understand human culture, a phenomenon too fluid and subtle to be described in simple, machine-readable rules.

Facebook reported in 2019 that its AI moderation systems successfully spotted 99.9% of spam, 99.3% of terrorist propaganda, 99.2% of child nudity and sexual exploitation, 98.9% of violent and graphic content, and 96.8% of adult nudity and sexual activity.

However, when it came to content involving drugs, firearms, hate speech, and bullying and harassment, Facebook’s AI performed far worse (83.3%, 69.9%, 65.4%, and 14.1%, respectively).

Content moderation in moderation
Content moderation, particularly when driven by government regulation, can also affect privacy. Content moderation by social media companies in conjunction with governments could help fight racism, incitement to violence and the targeting of certain communities, but it could also be used to curb protected speech.

This can be seen with the Online Safety Bill introduced in the UK. UN human rights experts have also expressed concerns about new and draft laws in other countries, including Australia, Brazil, Bangladesh, France, Singapore and Tanzania.

Unlike many areas where AI may end up replacing humans, content moderation is one where AI and humans will need to work in conjunction, with humans at the helm.
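A rough, hypothetical sketch of that division of labour might look like the following; the thresholds and scoring interface are invented, not any platform's real values.

```python
# Sketch of a human-in-the-loop split: the model handles clear-cut cases at
# scale, and anything it is unsure about goes to a human reviewer who makes
# the final call. Thresholds and the scoring function are placeholders.
from dataclasses import dataclass

AUTO_REMOVE = 0.95   # model is almost certain the post violates policy
AUTO_ALLOW = 0.05    # model is almost certain it does not

@dataclass
class Decision:
    action: str          # "remove", "allow", or "human_review"
    model_score: float

def route(post_text: str, violation_score: float) -> Decision:
    """violation_score: model's probability that the post violates policy."""
    if violation_score >= AUTO_REMOVE:
        return Decision("remove", violation_score)
    if violation_score <= AUTO_ALLOW:
        return Decision("allow", violation_score)
    # The ambiguous middle band (context, sarcasm, borderline speech) is
    # exactly where humans need to stay at the helm.
    return Decision("human_review", violation_score)
```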

A World Economic Forum report estimated that by 2025, approximately 463 exabytes of data will be created every day. That quantity of content makes it intractable for human moderators to keep pace, no matter how big and highly skilled the team. Even if we had 10 billion people working on content moderation (which is ridiculous), a rough calculation puts that at about 46 gigabytes of data per person per day, or roughly 23,000 images, which is a lot. With the ever-increasing prowess of AI systems, a switch to a setup where humans handle the high-level judgement calls and AI takes care of the rest is imperative. No matter how good AI systems become, deciding what is okay to say and what isn't will always be a matter of opinion.
