Squash report card: Improvements during State of the Union … and how humans will make our AI smarter
We've had some encouraging improvements in the AI powering our experimental fact-checking technology. But to make Squash smarter, we're calling in a human.
By Bill Adair – February 23, 2020
Squash, the experimental pop-up fact-checking product of the Reporters’ Lab, is getting better.
Our live test during the State of the Union address on Feb. 4 showed significant improvement over our inaugural test last year. Squash popped up 14 relevant fact-checks on the screen, up from just six last year.
That improvement matches a general trend we’ve seen in our testing. We’ve had a higher rate of relevant matches when we use Squash on videos of debates and speeches.
But we still have a long way to go. This month’s State of the Union speech also had 20 non-relevant matches, which means Squash displayed fact-checks that weren’t related to what the president said. If you’d been watching at that moment, you probably would have thought, “What is Squash thinking?”
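For a rough sense of scale, the two counts from this month's test can be turned into a share of relevant pop-ups. This is only a back-of-the-envelope sketch: the function name is ours, not part of Squash, and it counts only what Squash displayed, not fact-checks it might have missed.

```python
def relevant_share(relevant: int, non_relevant: int) -> float:
    """Fraction of displayed fact-checks that were relevant."""
    shown = relevant + non_relevant
    if shown == 0:
        raise ValueError("no fact-checks were displayed")
    return relevant / shown


# Counts reported for the Feb. 4, 2020 State of the Union test.
share_2020 = relevant_share(relevant=14, non_relevant=20)
print(f"{share_2020:.0%} of pop-ups were relevant")  # 14 of 34, about 41%
```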
We’re now going to try two ways to make Squash smarter. The first is a new subject tagging system based on a wonderfully addictive game developed by our lead technologist, Chris Guess. The second is a new interface that will bring humans into the live decision-making: Squash will recommend fact-checks to display, but an editor will make the final judgment.
Some background in case you’re new to our project: Squash, part of the Lab’s Tech & Check Cooperative, is a revolutionary new product that displays fact-checks on a video screen during a debate or political speech. Squash “hears” what politicians say, converts their speech to text and then searches a database of previously published fact-checks for one that’s related. When Squash finds one, it displays a summary on the screen.
For our latest tests, we’ve been using Elasticsearch, a tool for building search engines that we’ve made smarter with two filters: ClaimBuster, an algorithm that identifies factual claims, and a large set of common synonyms. ClaimBuster helps Squash avoid wasting time and effort on sentences that aren’t factual claims, and the synonyms help it make better matches.
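To make the two filters concrete, here is a minimal in-memory sketch of the idea. It is not our Elasticsearch setup and it does not call ClaimBuster: `looks_like_claim` is a crude stand-in that treats any sentence containing a number as a factual claim, and the synonym groups, stopwords, sample fact-checks and `min_overlap` threshold are all made up for illustration.

```python
import re

# Made-up synonym groups for illustration; a real synonym set is far larger.
SYNONYM_GROUPS = [{"jobs", "employment"}, {"tax", "taxes"}]
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "by", "we", "have", "it"}

# Invented sample texts, not real published fact-checks.
FACT_CHECKS = [
    "Employment grew by 7 million since the election",
    "It takes 10 years to get a permit to build a road",
]


def tokens(text):
    """Lowercase words and numbers, minus a few stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def expand(words):
    """Add the whole synonym group for any word that belongs to one."""
    expanded = set(words)
    for group in SYNONYM_GROUPS:
        if words & group:
            expanded |= group
    return expanded


def looks_like_claim(sentence):
    """Crude stand-in for a claim detector: a number hints at a checkable claim."""
    return bool(re.search(r"\d", sentence))


def best_match(sentence, fact_checks, min_overlap=2):
    """Return the fact-check sharing the most expanded words, or None."""
    if not looks_like_claim(sentence):
        return None  # skip sentences that are not factual claims
    query = expand(tokens(sentence))
    scored = [(len(query & expand(tokens(fc))), fc) for fc in fact_checks]
    score, match = max(scored, key=lambda pair: pair[0], default=(0, None))
    return match if score >= min_overlap else None


print(best_match("We have created 7 million new jobs", FACT_CHECKS))
print(best_match("Thank you all very much", FACT_CHECKS))
```

In this toy version, the first sentence matches the employment fact-check only because "jobs" and "employment" are treated as synonyms alongside the shared "7 million"; the second sentence is skipped before any search happens.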
Guess, assisted by project manager Erica Ryan and student developers Jack Proudfoot and Sanha Lim, will soon be testing a new way of matching that uses natural language processing based on the subject of the fact-check. We believe that we’ll get more relevant matches if the matching is based on subjects rather than just the words in the politicians’ claims.
But to make that possible, we have to put subject tags on thousands of fact-checks in our ClaimReview database. So Guess has created a game called Caucus that displays a fact-check on your phone and then asks you to assign subject tags to it. The game is oddly addictive. Every time you submit one, you want to do another…and another. Guess has a leaderboard so we can keep track of who is tagging the most fact-checks. We’re testing the game with our students and staff, but we hope to make it public soon.
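Here is one way matching on subjects rather than words could look once fact-checks carry tags. This is a sketch under our own assumptions, not the approach the team is testing: the keyword lexicon stands in for a real natural language processing model, the tags and fact-check titles are placeholders, and the ranking uses plain Jaccard overlap between tag sets.

```python
import re

# Hypothetical keyword-to-subject lexicon; a real system would use an NLP model.
SUBJECT_KEYWORDS = {
    "economy": {"jobs", "unemployment", "wages", "economy"},
    "health care": {"insurance", "medicare", "premiums", "hospital"},
    "immigration": {"border", "immigrants", "asylum", "visa"},
}


def subjects_of(text):
    """Guess the subjects of a sentence from the keywords it contains."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {subject for subject, kws in SUBJECT_KEYWORDS.items() if words & kws}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_subject(claim, tagged_fact_checks):
    """Order fact-checks by how closely their tags match the claim's subjects."""
    claim_subjects = subjects_of(claim)
    scored = [
        (jaccard(claim_subjects, set(tags)), title)
        for title, tags in tagged_fact_checks.items()
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [title for score, title in ranked if score > 0]


# Placeholder titles and tags, as a tagging game might produce them.
tagged = {
    "Fact-check A": ["economy"],
    "Fact-check B": ["health care", "economy"],
    "Fact-check C": ["immigration"],
}
print(rank_by_subject("Wages are rising and unemployment is falling", tagged))
```

The claim shares no words with any title here, yet the economy-tagged fact-checks still surface, which is the appeal of matching on subjects.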
We’ve also decided that Squash needs a little human help. Guess, working with our student developer Matt O’Boyle, is building an interface for human editors to control which matches actually pop up on users’ screens.
The new interface will let editors review each fact-check that Squash recommends and decide whether to let it pop up on the screen, which should help us filter out most of the unrelated matches.
That should spare us the slightly embarrassing moments when Squash makes a comically bad match. (My favorite: one from last year’s State of the Union, when Squash matched the president’s line about men walking on the moon with a fact-check on how long it takes to get a permit to build a road.)
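The editor review step can be pictured as a simple approval queue. The sketch below is hypothetical: the class and method names are ours, the sample text is invented, and the real interface is still being built. It only illustrates the rule that nothing reaches viewers until an editor approves it.

```python
from dataclasses import dataclass, field


@dataclass
class Recommendation:
    claim: str
    fact_check: str
    status: str = "pending"  # "pending", "approved", or "rejected"


@dataclass
class ReviewQueue:
    items: list = field(default_factory=list)

    def recommend(self, claim, fact_check):
        """Record a match suggested by the software; it starts out pending."""
        rec = Recommendation(claim, fact_check)
        self.items.append(rec)
        return rec

    def decide(self, rec, approve):
        """Store the editor's final judgment on one recommendation."""
        rec.status = "approved" if approve else "rejected"

    def on_screen(self):
        """Only approved fact-checks are shown to viewers."""
        return [r.fact_check for r in self.items if r.status == "approved"]


queue = ReviewQueue()
good = queue.recommend("a claim about jobs", "a fact-check about jobs")
bad = queue.recommend("a line about the moon", "a fact-check about road permits")
queue.decide(good, approve=True)
queue.decide(bad, approve=False)
print(queue.on_screen())
```

Because pending items are never displayed, a recommendation the editor has not yet looked at stays off the screen rather than slipping through.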
Assuming the new interface works relatively well, we’ll try to do a public demo of Squash this summer.
Slowly but steadily, we are making progress. Watch for more improvements soon.