Two new fact-checking browser extensions are trying something really challenging: automating the fact-checking process. By generating algorithmic scores for news online, these extensions are predicting whether particular web pages are likely to be true or false. We wondered if these products could really provide such a critical service, so we ran an analysis. Our finding? They are ambitious, but they are not quite ready for prime time.
Over several weeks, we ran 219 stories from 73 different media organizations through these extensions — NewsCracker and FactoidL — and tracked the algorithmic scores assigned to each story. The stories ranged from hard news and long-form features to sports and entertainment.
NewsCracker, founded and developed in 2017 by three 18-year-old college students, is available for download on the Chrome Web Store. According to its website, NewsCracker uses machine learning technology and statistical analysis “to contribute to the movement against ‘fake news’ by helping everyday Internet users think more critically about the articles they read.”
NewsCracker does not promise the truth, but it does “come pretty close.” Web pages receive ratings on a one to 10 scale for headline strength, neutrality and accuracy, which are then averaged into one overall score. NewsCracker trusts the article when the overall score is above 8.0, and it does not trust the article when the score is below 6.0. Articles scoring between 6.0 and 8.0 trigger a cautionary warning.
According to NewsCracker’s website, ratings are generated according to several criteria, including preliminary scores assigned to specific websites, the number of news outlets reporting on the same story, the number and sourcing of quotations, the number of biased words or phrases and the sentence length and structure. To assess the validity of a story’s factual claims, NewsCracker identifies “the five most important factual claims” and checks for their repetition in related news coverage.
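The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the averaging and thresholds NewsCracker's site describes, with hypothetical function names; the actual model and its subscore calculations are not public.

```python
# Hypothetical sketch of NewsCracker-style scoring: three 1-to-10
# subscores are averaged, then thresholds at 6.0 and 8.0 decide
# the verdict, per the description on NewsCracker's website.

def overall_score(headline: float, neutrality: float, accuracy: float) -> float:
    """Average the three 1-to-10 subscores into one overall score."""
    return (headline + neutrality + accuracy) / 3

def verdict(score: float) -> str:
    """Map an overall score onto NewsCracker's three bands."""
    if score > 8.0:
        return "trusted"
    if score < 6.0:
        return "not trusted"
    return "caution"

print(verdict(overall_score(9.0, 8.5, 9.2)))  # prints "trusted"
```

How each subscore is actually computed — from source-site priors, quotation counts, biased-word counts and so on — is the part the developers have not disclosed.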
Of the 219 stories we tested, 145 received ratings above 8.0, 65 received ratings between 6.0 and 8.0 and seven received ratings below 6.0 — meaning 66 percent of stories were dubbed trustworthy while only 3 percent were labeled “fake news.” NewsCracker “could not detect any news to score” from the final two stories we tested, both of which came from The Chronicle at Duke University.
The Washington Post had the highest average overall score, at 9.4, with Reuters finishing not far behind. InfoWars, Twitchy and American Thinker recorded the lowest average overall scores.
Significantly, local and campus news organizations — including The Durham Herald-Sun, The Boston Globe and The Chronicle at Duke University — had average overall scores below known fake news producer YourNewsWire.com as well as several other hyperpartisan outlets, such as Breitbart News. This may be because local news coverage is not often repeated elsewhere.
Additionally, the methodology, through which five facts are cross-checked against other coverage, may have the effect of penalizing outlets for original reporting. One BuzzFeed News story — which cites several sources by name, directly references related coverage and was eventually picked up by The Washington Post — received a 5.6 accuracy rating on the grounds that “many claims could not be verified.”
FactoidL — a project from Rochester Institute of Technology student Alexander Kidd also available for download on the Chrome Web Store — does not promise much from its algorithm, which it calls “Anaxagoras.” In fact, the extension’s online description warns that it is “currently very hit-or-miss.”
According to its description, FactoidL “is meant to be a quick, automated fact-checking tool that compares sentences you read to another source.”
FactoidL’s formula is simple: it identifies the number of fact-checkable statements — which it calls “factoids” — in any given story, and then Anaxagoras cleans each “factoid” by removing all “unimportant words” and queries Wikipedia for matches to the remaining words or phrases. For any web page, users can see the number and list of “factoids” as well as an accuracy percentage for the page.
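The pipeline just described — strip unimportant words, check what remains against a reference, report a hit percentage — might look roughly like this sketch. The stop-word list and the lookup stub are our own stand-ins; the real extension queries Wikipedia, and its actual code is on GitHub.

```python
# Rough, hypothetical sketch of a FactoidL-style check. Everything
# here is a stand-in: the real "Anaxagoras" algorithm and its word
# filtering are defined in FactoidL's own source code.

STOP_WORDS = {"the", "a", "an", "is", "was", "of", "to", "in", "and"}

def clean_factoid(sentence: str) -> list[str]:
    """Drop 'unimportant' words, keeping the terms worth checking."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]

def accuracy(factoids: list[str], lookup) -> float:
    """Percentage of factoids whose key terms the lookup confirms."""
    if not factoids:
        return 0.0
    hits = sum(1 for f in factoids if lookup(clean_factoid(f)))
    return 100 * hits / len(factoids)

# Stand-in lookup; a real implementation would query Wikipedia.
known = {"paris", "capital", "france"}
demo = ["Paris is the capital of France", "The moon is made of cheese"]
print(accuracy(demo, lambda terms: all(t in known for t in terms)))  # prints 50.0
```

Even in this toy form, the weakness is visible: a true statement scores zero whenever the reference source happens not to contain its terms, which is consistent with the near-zero accuracy ratings we observed.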
This process is currently defective — most likely because only statements that align with Wikipedia descriptions are identified as true or accurate. The 219 stories tested yielded an average of approximately 60 factoids and an average accuracy percentage of approximately 0.9 percent. Of these 219 stories, 154 were rated as 0 percent accurate, while 12 were rated as 5 percent accurate or higher and only one was rated as high as 10 percent accurate.
The story with the highest number of “factoids” — from YourNewsWire.com — registered 2,645 “factoids,” but many could be discounted as claims that were not factual. FactoidL has a tendency, for example, to mark the dateline, byline and headline of a story as “factoids.” It often counts opinion statements, as well.
If NewsCracker is not yet ready for prime time, FactoidL has even further to go. Very few news articles from reputable journalistic outlets are actually less than 10 percent accurate. The fact that FactoidL rated all stories tested by the Lab as less than 10 percent accurate implies that the extension is not just “hit-or-miss” with its algorithm; it is missing every time.
The code powering FactoidL is available on GitHub, and interested parties can provide feedback or even volunteer to contribute.
The future is bright
Any new technology is going to hit some bumps along the way, with bugs and breakdowns to be expected. These young developers are trying something really ambitious in a way that is both innovative and exciting. We admire the spirit of their extensions and hope to see them developed further.