Lentis/Content Moderation



This chapter examines how content is moderated. Content moderation is the practice of monitoring user-generated submissions and applying a predetermined set of guidelines to determine whether the content (a post, in particular) is permissible.[1]

Categorization by Purpose

Laws and Morality

Content moderation can serve to maintain a clean network environment. Web search engines like Google and Bing implicitly conduct content moderation: websites with illegal content such as slave auctions, smuggling, and drug trading are removed from search results. Such content is associated with the "Dark Web", the part of the internet that standard search engines do not index.

In addition, many techniques are used to moderate explicit content. One example is language filtering. Many chat rooms include a "chat filter" feature that replaces socially offensive words with asterisks or other symbols. Although it cannot completely stop verbal abuse, it helps maintain a clean environment. Another example is video censorship. Besides age restriction, video products are often modified to hide certain content from audiences. For example, in Japanese anime, scenes that contain blood or nudity may be covered by mosaic tiling or dots.
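The chat filter described above can be illustrated with a minimal Python sketch. The blocklist, the name `mask_message`, and the whole-word matching rule are assumptions made for illustration; they do not describe any particular platform's filter.

```python
import re

# Hypothetical blocklist for illustration only; real filters rely on
# much larger, curated word lists.
BLOCKED_WORDS = {"darn", "heck"}

# Match blocked words as whole words, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in sorted(BLOCKED_WORDS)) + r")\b",
    re.IGNORECASE,
)


def mask_message(message: str) -> str:
    """Replace each blocked word with asterisks of the same length."""
    return _PATTERN.sub(lambda match: "*" * len(match.group(0)), message)


if __name__ == "__main__":
    print(mask_message("Well, DARN it, what the heck!"))
```

Whole-word matching leaves words that merely contain a blocked word untouched, but it is also easy to evade with deliberate misspellings, which is one reason such filters cannot fully stop verbal abuse.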

National Security

Classified military information is prohibited from being exposed to the public. If a sensitive picture of a US military base is made public, the FBI may quickly remove it and arrest the person responsible. Facetious posts might be exempted but will still be watched. For example, there were rumors that Area 51 contained alien technology, but there was never any proof. Detailed discussion of sensitive technologies such as quantum encryption, gene-targeting viruses, and nuclear reaction control is also monitored.

Political Purpose

Content moderation can be regulated by the government. By controlling the information the public receives, along with self-efficacy campaigns, it is possible to direct public opinion.

Categorization by Method

Pre-Moderation

Pre-moderation is a style of content moderation employed by companies that care about their image above all else. Every piece of content is curated and reviewed before release to make sure that it doesn't hurt the brand or cause legal issues.[2] Although pre-moderation isn't feasible for platforms that experience a large influx of data, such as social media platforms, it can be helpful for company blogs or similar sites.


Post-Moderation

Post-moderation refers to a type of content moderation in which content, once submitted to a platform, can be reviewed and taken down at any time if it is found to violate a site policy.[2] Post-moderation is a form of blanket policy that applies to most platforms currently in use. Most companies reserve the right to remove content from their platforms if they find that it violates any of their terms or conditions.

Reactive Moderation

Reactive moderation is a type of moderation in which a platform relies on its community to review and screen posts. The individuals viewing the content become responsible for determining whether or not it is appropriate. If the content isn't appropriate, they are tasked with reporting it so that a moderator can review it and delete it if necessary.[2] This type of moderation is used on most social media sites, as it allows a site to leverage its large community as a solution to the influx of content.
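As a rough sketch of this reporting flow, the Python example below collects user reports and places a post in a moderator review queue once enough distinct users have reported it. The class name `ReportTracker` and the report threshold are hypothetical; the source does not say how many reports a platform requires before a moderator looks at a post.

```python
from collections import defaultdict

# Hypothetical threshold: how many distinct users must report a post
# before it is queued for a human moderator.
REVIEW_THRESHOLD = 3


class ReportTracker:
    """Collects user reports and queues posts for moderator review."""

    def __init__(self, threshold: int = REVIEW_THRESHOLD) -> None:
        self.threshold = threshold
        self._reporters = defaultdict(set)  # post_id -> set of user_ids
        self.review_queue = []  # post_ids awaiting a moderator

    def report(self, post_id: str, user_id: str) -> bool:
        """Record a report; return True if it newly queued the post."""
        reporters = self._reporters[post_id]
        reporters.add(user_id)  # repeat reports from one user count once
        if len(reporters) >= self.threshold and post_id not in self.review_queue:
            self.review_queue.append(post_id)
            return True
        return False


if __name__ == "__main__":
    tracker = ReportTracker(threshold=2)
    tracker.report("post-1", "alice")
    tracker.report("post-1", "bob")
    print(tracker.review_queue)
```

Counting distinct reporters rather than raw reports keeps a single user from forcing a review alone, while the final decision to delete still rests with a moderator.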

Distributed Moderation

Distributed moderation is similar to reactive moderation in that it entrusts the community with moderating content, but rather than reporting only inappropriate content, users vote on every piece of content submitted.[2] This often leads to a form of group-think, in which the majority determines which content is permissible.
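A vote-based scheme of this kind might be sketched as follows in Python. The `Post` class, the scoring rule (upvotes minus downvotes), and the hiding threshold are all assumptions for illustration, not any platform's actual ranking method.

```python
from dataclasses import dataclass

# Hypothetical cutoff: posts scoring below this value are hidden.
HIDE_BELOW = -2


@dataclass
class Post:
    text: str
    upvotes: int = 0
    downvotes: int = 0

    @property
    def score(self) -> int:
        """Net community score: upvotes minus downvotes."""
        return self.upvotes - self.downvotes


def visible_posts(posts, hide_below=HIDE_BELOW):
    """Return posts at or above the cutoff, highest score first."""
    kept = [post for post in posts if post.score >= hide_below]
    return sorted(kept, key=lambda post: post.score, reverse=True)


if __name__ == "__main__":
    posts = [Post("a", 5, 1), Post("b", 0, 4), Post("c", 1, 2)]
    for post in visible_posts(posts):
        print(post.text, post.score)
```

Because visibility depends only on the net vote count, an unpopular but permissible post can be hidden just as easily as an abusive one, which is the group-think risk noted above.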

Automated Moderation

Automated moderation is a type of moderation that relies on automated tools to filter content.[2] These may include word filters, algorithms using text and word analysis, and more. Many believe that this form of moderation will become the future of content moderation. Most sites currently use some form of automated moderation in their suite of content moderation tools, although in some cases the field hasn't progressed enough for it to be sufficient on its own.

Status Quo

Three major companies are similar in size and scope but employ different forms of content moderation to manage their expansive communities: Facebook, Reddit, and YouTube.


Facebook

Facebook mainly employs a type of reactive moderation, in which the community is responsible for flagging and reporting explicit content. Facebook also relies heavily on automated moderation, though less for removing content than for detecting duplicate accounts.[3] Facebook is also the company that invests the most in content moderation, and as a result it is arguably the most successful of the three platforms at removing explicit content. However, the moderators tasked with cleaning up the posts end up suffering. Every day, just by coming to work, they are exposed to the "worst of humanity". Many end up developing PTSD or depression and can't continue working as a result.[4]


Reddit

Reddit uses a style of content moderation it calls "layered moderation". At its core, this is a combination of distributed moderation and reactive moderation. Users are responsible for "up-voting" and "down-voting" posts, acting as a form of moderator by curating high-quality information for other users to see. They can also report posts for "subreddit" moderators to manually review and escalate or remove if necessary.[5] Besides this, Reddit employs a few tools for automated moderation, including the "AutoModerator", a bot that helps automate many of the manual tasks that "subreddit" moderators must perform.[6]


YouTube

YouTube is unique in that it employs the most automated tools of any of the platforms mentioned. Its algorithms are used not only for recommending videos but also for content moderation.[7] YouTube is also the one company mentioned where people can actually make a living by uploading content. As such, one of YouTube's main forms of moderation is "demonetization".[8] For offending accounts, YouTube also has a "three-strike" system in place. After a first warning, an offending account faces a series of progressively harsher punishments until, if nothing changes, the account is banned.[9]
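The escalation logic of a warning followed by three strikes can be sketched in Python as below. The penalty labels and the function `penalty_for` are hypothetical; the source only states that a warning comes first and that punishments escalate until the account is banned, so the intermediate steps here are placeholders rather than YouTube's actual policy.

```python
# Hypothetical escalation ladder: one warning, then three strikes,
# with the third strike resulting in a ban.
PENALTIES = [
    "warning",
    "first strike",
    "second strike",
    "third strike: account banned",
]


def penalty_for(violation_count: int) -> str:
    """Return the penalty that applies after the given number of violations."""
    if violation_count < 1:
        return "none"
    # Any violation beyond the end of the ladder stays at the final penalty.
    index = min(violation_count, len(PENALTIES)) - 1
    return PENALTIES[index]


if __name__ == "__main__":
    for count in range(0, 6):
        print(count, penalty_for(count))
```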

Case Study: Hong Kong

Initially, the 2019 Hong Kong protests were peaceful marches by citizens against an extradition bill. They have since turned violent. The protests were reported and interpreted very differently in different places, leading to different reactions to the events. Content moderation has been reported to play a significant role in this case.

In mainland China, the protests were reported as a "rebellion" and an "insurgence with conspiracy"[10][11], while in the United States, ABC referred to them as "pro-democracy" protests[12] and a fight for freedom. CNN reported that some NBA fans were also supporting the protests[13], which resembles a social norm campaign. There were also reports of police abuse in Hong Kong[14]. Some people in America have called for action to help the protesters.[15]

However, certain viewpoints were found to be hidden from the United States public. Facebook and Twitter were reported to be manipulating the story through content moderation, deleting nearly a thousand Chinese accounts.[16] The removed accounts had simply stated anti-protest opinions, but the sites claimed that the accounts were associated with the Chinese government.[17] Although content moderation is not the primary reason some Americans strongly favor the protests, it still affects public opinion.


Freedom of Speech

The use of content moderation by social media platforms has led to concerns about the implications for freedom of speech on these platforms. One reason for these concerns is the lack of transparency regarding the rules governing content moderation. David Kaye, UN Special Rapporteur on freedom of opinion and expression, called the murkiness of these rules "One of the greatest threats to online free speech today", adding that "companies impose rules they have developed without public input and enforced with little clarity"[18]. Differing expectations among users about what content should be removed have only increased these concerns. An example is the reaction to Facebook's decision not to remove a doctored video of Nancy Pelosi, slowed down to make Pelosi appear inebriated. While some were frustrated by Facebook's failure to contain the spread of misinformation, others applauded the company for protecting freedom of speech on the platform[19].

Human Moderation

Contract Labor

Tech companies predominantly use outsourced contract labor for moderation work. This allows companies to scale their operations globally at the expense of the workers, who are paid much less than salaried employees. At Cognizant, a contractor in Arizona supplying content moderation for Facebook, moderators made $15 an hour, which is dwarfed by the median Facebook employee salary of $240,000 annually[4].

Psychological Toll

Moderators manually review the most disturbing content on the internet, often without proper resilience training and other services necessary to prepare them[20]. Moderators are also held to high standards, with Facebook setting a target of 95% accuracy on moderator decisions[4]. This creates a chaotic environment with high turnover, as many moderators are unable to maintain this accuracy. Companies try to help moderators cope with "wellness time", meant to allow traumatized workers to take a break. At Cognizant, employees were allotted only nine minutes of wellness time per day, and this time was monitored to make sure workers were using it correctly[4]. The long-term effects of exposure to disturbing content have led to former moderators developing PTSD-like symptoms. One example is Selena Scola, a former moderator for Facebook, who is suing the company after developing PTSD, arguing that the company does not have proper mental health services and monitoring in place for its content moderators[21].


The future of content moderation will include an increased focus on using AI and machine learning to automate moderation processes. Artificial neural networks and deep-learning technology have already helped automate tasks such as speech recognition, image classification, and natural language processing, lessening the burden on human moderators[22]. These applications of AI can make more precise moderation decisions than human moderators, but they are only as effective as their training is extensive. Currently there are too few examples of content with which to train AI models[22]. This lack of data leads to AI models being easily confused when content is presented differently than it was in training. Current AI solutions are also unable to comprehend context and intent, which may be crucial in determining whether to remove a post. This can be seen in the discrepancy between how well Facebook's automated tools detect nudity and hate speech, which are accurately detected 96% and 38% of the time respectively[23]. Because of these limitations, a mix of automated and human moderation will likely be the norm for some time.
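One way such a mix of automated and human moderation could work is sketched below in Python: an automated model scores each post, clear cases are handled automatically, and uncertain cases are sent to a human moderator. The score, the two thresholds, and the function `triage` are assumptions for illustration and do not reflect any platform's actual pipeline.

```python
# Hypothetical thresholds on a model's estimated probability (0.0 to 1.0)
# that a post violates policy.
REMOVE_AT = 0.95  # at or above this, remove automatically
ALLOW_AT = 0.05   # at or below this, allow automatically


def triage(violation_score: float) -> str:
    """Route a post based on an automated violation score."""
    if not 0.0 <= violation_score <= 1.0:
        raise ValueError("violation_score must be between 0.0 and 1.0")
    if violation_score >= REMOVE_AT:
        return "remove"
    if violation_score <= ALLOW_AT:
        return "allow"
    # The model is unsure, for example because context or intent matters,
    # so a human moderator makes the decision.
    return "human_review"


if __name__ == "__main__":
    for score in (0.99, 0.50, 0.01):
        print(score, triage(score))
```

Under this design, a category the model detects reliably would yield mostly confident scores and few human reviews, while a category that depends on context would push far more posts into the human queue.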


Some generalizable lessons can be taken from the case of content moderation. One is how transparency can affect user trust. The lack of transparency in moderation guidelines and enforcement is frustrating for users and leads them to reach their own conclusions about why their posts were taken down, such as bias or a false positive. Transparency would alleviate this problem, which is why many are calling on tech companies to adopt guidelines such as The Santa Clara Principles to make the moderation process more transparent. Others can also learn from tech companies' use of contract labor. For a dangerous job such as content moderation, low wages and insufficient benefits put a large financial burden on workers who develop mental health conditions from their time as moderators.

Chapter Extension

Extensions to the casebook chapter could explore in more detail the AI and machine learning technologies in use today, the presence of bias in the moderation process, and how the phenomenon of fake news will change the moderation process.


References

  1. Content Moderation
  2. Six Types of Content Moderation You Need to Know About
  3. How does Facebook moderate its extreme content
  4. The Secret Lives of Facebook Moderators in America
  5. Reddit Security Report -- October 30, 2019
  6. Full AutoModerator Documentation
  7. YouTube Doesn't Know Where Its Own Line Is
  8. The Yellow $: a comprehensive history of demonetization and YouTube’s war with creators
  9. Community Guidelines Strike Basics
  10. Truth about US behind HK Protest
  11. Reinforcement Has Arrived in HK against the Rebellion
  12. Hong Kong pro-democracy protests continue
  13. NBA fans protest China with pro-Hong Kong T-shirt giveaway in Los Angeles
  14. Hong Kong Police Crack Down on Student Protesters
  15. Protect the rights of people in Hong Kong
  16. Twitter and Facebook bans Chinese accounts amidst Hong Kong protests
  17. Hong Kong protests: Twitter and Facebook remove Chinese accounts
  18. UN Expert: Content moderation should not trample free speech
  19. The Thorny Problem of Content Moderation and Bias
  20. Underpaid and overburdened: the life of a Facebook moderator
  21. Content Moderator Sues Facebook, Says Job Gave Her PTSD
  22. Human Help Wanted: Why AI Is Terrible at Content Moderation
  23. The Impossible Job: Inside Facebook’s Struggle to Moderate Two Billion People