Facebook’s algorithms for detecting hate speech are working harder than ever. If only we knew how good they are at their jobs.
Tuesday the social network reported a big jump in the number of items removed for breaching its rules on hate speech. The increase stemmed from better detection by the automated hate-speech sniffers developed by Facebook’s artificial intelligence experts.
The accuracy of those systems remains a mystery. Facebook doesn’t release, and says it can’t estimate, the total volume of hate speech posted by its 1.7 billion daily active users.
Facebook has released quarterly reports on how it is enforcing its standards for acceptable discourse since May 2018. The latest says the company removed 9.6 million pieces of content it deemed hate speech in the first quarter of 2020, up from 5.7 million in the fourth quarter of 2019. The total was a record, topping the 7 million removed in the third quarter of 2019.
Of the 9.6 million posts removed in the first quarter, Facebook said its software detected 88.8 percent before users reported them. That indicates algorithms flagged 8.5 million posts for hate speech in the quarter, up 86 percent from the previous quarter’s total of 4.6 million.
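The arithmetic behind those two figures can be checked with a minimal sketch, using only the rounded numbers quoted above. The variable names are illustrative and not taken from Facebook's report. With rounded inputs the increase comes out near 85 percent, slightly below the 86 percent cited, a gap that is plausibly due to the report's unrounded figures.

```python
# Rounded figures quoted in the article, in millions of posts.
removed_q1_2020 = 9.6            # hate-speech posts removed, Q1 2020
proactive_rate_q1_2020 = 0.888   # share detected by software before any user report
flagged_q4_2019 = 4.6            # posts flagged by software in Q4 2019, as stated

# Posts flagged by software = total removed x proactive detection rate.
flagged_q1_2020 = removed_q1_2020 * proactive_rate_q1_2020

# Quarter-over-quarter growth in software-flagged posts.
increase = (flagged_q1_2020 - flagged_q4_2019) / flagged_q4_2019

print(f"Flagged by software in Q1 2020: {flagged_q1_2020:.1f} million")
print(f"Increase over previous quarter: {increase:.0%}")
```

Note that this only measures how much of the removed content was caught by software first. It says nothing about how much hate speech was never removed at all.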
In a call with reporters, Facebook chief technology officer Mike Schroepfer touted advances in the company’s machine learning technology that parses language. “Our language models have gotten bigger and more accurate and nuanced,” he said. “They’re able to catch things that are less obvious.”
Schroepfer wouldn’t specify how accurate those systems now are, saying only that Facebook tests systems extensively before they are deployed, in part so that they do not incorrectly penalize innocent content.
He cited figures in the new report showing that users appealed decisions to take down content for hate speech more often in the most recent quarter, 1.3 million times, yet fewer posts were subsequently restored. Facebook also said Tuesday it had altered its appeals process in late March, reducing the number of appeals logged, because Covid-19 restrictions shut some moderation offices.
Facebook’s figures do not indicate how much hate speech slips through its algorithmic net. The company’s quarterly reports estimate the incidence of some types of content banned under Facebook’s rules, but not hate speech. Tuesday’s release shows violent posts declining since last summer. The hate speech section says Facebook is “still developing a global metric.”
The missing numbers shroud the true size of the social network’s hate speech problem. Caitlin Carlson, an associate professor at Seattle University, says the 9.6 million posts removed for hate speech look suspiciously small compared with Facebook’s huge network of users, and users’ observations of troubling content. “It’s not hard to find,” Carlson says.