“Squash”

Squash report card: Improvements during State of the Union … and how humans will make our AI smarter

We've had some encouraging improvements in the AI powering our experimental fact-checking technology. But to make Squash smarter, we're calling in a human.

By Bill Adair – February 23, 2020

Squash, the experimental pop-up fact-checking product of the Reporters’ Lab, is getting better.

Our live test during the State of the Union address on Feb. 4 showed significant improvement over our inaugural test last year. Squash popped up 14 relevant fact-checks on the screen, up from just six last year.

That improvement matches a general trend we’ve seen in our testing. We’ve had a higher rate of relevant matches when we use Squash on videos of debates and speeches.

But we still have a long way to go. This month’s State of the Union speech also had 20 non-relevant matches, which means Squash displayed fact-checks that weren’t related to what the president said. If you’d been watching at that moment, you probably would have thought, “What is Squash thinking?”
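To put those two numbers side by side, here is a quick back-of-the-envelope calculation. It assumes the 14 relevant and 20 non-relevant matches together account for every fact-check Squash displayed during the speech, which the counts above imply but do not state outright. The variable names are ours, for illustration only.

```python
# Counts reported for the Feb. 4, 2020 State of the Union test.
relevant = 14
non_relevant = 20

# Assumption: these two groups cover every pop-up Squash displayed.
displayed = relevant + non_relevant
share_relevant = relevant / displayed

print(f"{relevant} of {displayed} pop-ups were relevant ({share_relevant:.0%})")
# prints "14 of 34 pop-ups were relevant (41%)"
```

In other words, under that assumption, roughly two out of every five pop-ups were on target.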

We’re now going to try two ways to make Squash smarter: a new subject tagging system that will be based on a wonderfully addictive game developed by our lead technologist Chris Guess; and a new interface that will bring humans into the live decision-making. Squash will recommend fact-checks to display, but an editor will make the final judgment.

Some background in case you’re new to our project: Squash, part of the Lab’s Tech & Check Cooperative, is a revolutionary new product that displays fact-checks on a video screen during a debate or political speech. Squash “hears” what politicians say, converts their speech to text and then searches a database of previously published fact-checks for one that’s related. When Squash finds one, it displays a summary on the screen.

For our latest tests, we’ve been using Elasticsearch, a tool for building search engines that we’ve made smarter with two filters: ClaimBuster, an algorithm that identifies factual claims, and a large set of common synonyms. ClaimBuster helps Squash avoid wasting time and effort on sentences that aren’t factual claims, and the synonyms help it make better matches.
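For readers who want a feel for how those pieces fit together, here is a minimal sketch in Python. It is not our production code: it does not call Elasticsearch or ClaimBuster, and every name in it (`SYNONYMS`, `looks_factual`, `best_match`, the sample fact-checks, the 0.3 threshold) is a hypothetical stand-in. It only illustrates the idea of skipping sentences that are not factual claims, normalizing synonyms, and then picking the closest previously published fact-check by word overlap.

```python
import re

# Hypothetical synonym table; the Lab's real synonym list is not shown here.
SYNONYMS = {"jobless": "unemployment", "joblessness": "unemployment"}

# Small illustrative stopword list.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "of", "in", "to",
             "and", "at", "on", "for"}


def tokens(text):
    """Lowercase the text, drop stopwords and map synonyms to one form."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {SYNONYMS.get(w, w) for w in words if w not in STOPWORDS}


def looks_factual(sentence):
    """Toy stand-in for ClaimBuster: flag sentences that contain a digit
    or a few comparison words. The real algorithm is far more capable."""
    return bool(re.search(r"\d|percent|lowest|highest|record", sentence.lower()))


def best_match(sentence, fact_checks, min_score=0.3):
    """Return the fact-check whose claim overlaps most with the sentence
    (Jaccard similarity), or None if nothing clears min_score."""
    if not looks_factual(sentence):
        return None  # skip sentences that are not factual claims
    claim = tokens(sentence)
    best, best_score = None, 0.0
    for fact_check in fact_checks:
        other = tokens(fact_check["claim"])
        union = claim | other
        score = len(claim & other) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = fact_check, score
    return best if best_score >= min_score else None


# Made-up sample data for illustration only.
fact_checks = [
    {"claim": "Unemployment is at a record low", "rating": "Mostly True"},
    {"claim": "It takes 10 years to get a permit to build a road",
     "rating": "Half True"},
]

match = best_match("The jobless rate is at a record low", fact_checks)
print(match["claim"] if match else "no match")
```

In this toy version, the synonym table is what lets "jobless" line up with "unemployment," and the minimum score is what keeps a sentence with no overlapping words from triggering a pop-up.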

Guess, assisted by project manager Erica Ryan and student developers Jack Proudfoot and Sanha Lim, will soon be testing a new way of matching that uses natural language processing based on the subject of the fact-check. We believe that we’ll get more relevant matches if the matching is based on subjects rather than just the words in the politicians’ claims.

But to make that possible, we have to put subject tags on thousands of fact-checks in our ClaimReview database. So Guess has created a game called Caucus that displays a fact-check on your phone and then asks you to assign subject tags to it. The game is oddly addictive. Every time you submit one, you want to do another…and another. Guess has a leaderboard so we can keep track of who is tagging the most fact-checks. We’re testing the game with our students and staff, but hope to make it public soon.

We’ve also decided that Squash needs a little human help. Guess, working with our student developer Matt O’Boyle, is building an interface for human editors to control which matches actually pop up on users’ screens.

The new interface would let them review the fact-check that Squash recommends and decide whether to let it pop up on the screen, which should help us filter out most of the unrelated matches.

That should eliminate the slightly embarrassing moments when Squash makes a comically bad match. (My favorite: one from last year’s State of the Union, when Squash matched the president’s line about men walking on the moon with a fact-check on how long it takes to get a permit to build a road.)
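The editor-in-the-loop idea described above can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the interface that is being built: the `ReviewQueue` class, its method names and the sample recommendations are all hypothetical. The point is simply that Squash only recommends, and nothing reaches the screen until an editor approves it.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewQueue:
    """Holds recommended fact-checks until a human editor rules on them."""
    pending: list = field(default_factory=list)
    displayed: list = field(default_factory=list)

    def recommend(self, fact_check):
        """Called by the matcher: queue a fact-check for human review."""
        self.pending.append(fact_check)

    def review(self, approve):
        """Take the oldest recommendation and display it only if the
        editor's decision function returns True."""
        fact_check = self.pending.pop(0)
        if approve(fact_check):
            self.displayed.append(fact_check)
        return fact_check


queue = ReviewQueue()
queue.recommend({"summary": "Fact-check on the unemployment rate", "related": True})
queue.recommend({"summary": "Fact-check on road permits", "related": False})

# Stand-in for the editor's judgment; a real editor would decide by eye.
editor_decision = lambda fact_check: fact_check["related"]

while queue.pending:
    queue.review(editor_decision)

print([fc["summary"] for fc in queue.displayed])
```

Here the road-permit recommendation is reviewed and quietly dropped, so viewers would never see it.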

Assuming the new interface works relatively well, we’ll try to do a public demo of Squash this summer. 

Slowly but steadily, we are making progress. Watch for more improvements soon.


Beyond the Red Couch: Bringing UX Testing to Squash

As automated fact-checking gains ground, it's time to learn how to make pop-up content crystal clear on video screens.

By Andrew Donohue – October 28, 2019

Fact-checkers have a problem.

They want to use technology to hold politicians accountable by getting fact-checks in front of the public as quickly as possible. But they don’t yet know the best ways to make their content understood. At the Duke Reporters’ Lab, that’s where Jessica Mahone comes in.

Jessica Mahone is designing tests to help Duke Reporters’ Lab researchers figure out how to clearly share fact-checks live during broadcasts. Photo by Andrew Donohue

The Lab is developing Squash, a tool built to bring live fact-checking of politicians to TV. Mahone, a social scientist, was brought on board to design experiments and conduct user experience (UX) tests for Squash. 

UX design is the discipline focused on making new products easy to use. A clear UX design means that a product is intuitive and new users get it without a steep learning curve. 

“If people can’t understand your product or find it hard to use, then you are doomed from the start. With Squash, this means that we want people to comprehend the information and be able to quickly determine whether a claim is true or not,” Mahone said.

For Squash, fact-check content that pops up on screens needs to be instantly understood since it will only be visible for a few seconds. So what’s the best way?

Bill Adair, the director of the Duke Tech & Check Cooperative, organized some preliminary testing last year that he dubbed the red couch experiments. The tests revealed more research was needed to understand the best way to inform viewers. 

“I originally thought that all it would take is a Truth-O-Meter popping up on screen,” Adair said. “Turns out it’s much more complicated than that.”

Sixteen people watched videos of Barack Obama and Donald Trump delivering State of the Union speeches while fact-checks of some of what they said appeared on the screen. Ratings were true, false or something in between. Blink, a company specializing in UX testing, found that participants loved the concept of real-time fact-checking and would welcome it on TV broadcasts. But the design of the pop-up fact-checks often confused them.

It’s not just the quality of content that counts. Viewers must understand what they see very quickly. Squash may one day share fact-checks during live events, including State of the Union addresses.

Some viewers didn’t understand the fact-check ratings such as true or false when they were displayed. Others assumed the presidents’ statements must be true if no fact-check was shown. That’s a problem because Squash doesn’t fact-check all claims in speeches. It displays previously published fact-checks only for the claims that Squash’s finicky search algorithm manages to match.

The red couch experiments were “a very basic test of the concept,” Mahone said. “What they found mainly is that there was a need to do more diving in and digging into some questions about how people respond to this. Because it’s actually quite complex.”

Mahone has developed a new round of tests scheduled to begin this week. These tests will use Amazon Mechanical Turk, an online platform that relies on people who sign up to be paid research subjects.

“One thing that came out of the initial testing was that people don’t like to see a rating of a fact-check,” Mahone said. “I was a little skeptical of that. Most of the social science research says that people do prefer things like that because it makes it a lot easier for them to make decisions.”

In this next phase, Mahone will recruit about 500 subjects. A third will see a summary of a fact-check with a PolitiFact TRUE icon. Another third will see a summary with just the label TRUE. The rest will see just the summary text of a fact-check.

Each viewer will rank how interested they are in using an automated fact-checking tool after viewing the different displays. Mahone will compare the results.
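The three-way split described above can be illustrated with a short Python sketch. It assumes subjects are assigned to the three displays at random, which is standard practice for this kind of experiment but is not spelled out in the study plan. The condition names, the `assign_conditions` function and the seed value are hypothetical.

```python
import random
from collections import Counter

# Hypothetical labels for the three displays described in the article.
CONDITIONS = ["summary_with_icon", "summary_with_label", "summary_only"]


def assign_conditions(subject_ids, seed=None):
    """Shuffle the subjects, then deal them out to the conditions in turn
    so the three groups end up as close to equal in size as possible."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: CONDITIONS[i % len(CONDITIONS)]
            for i, sid in enumerate(shuffled)}


# About 500 subjects, as planned; the seed only makes the example repeatable.
assignment = assign_conditions(range(500), seed=2019)
print(Counter(assignment.values()))
```

With 500 subjects, two groups get 167 people and one gets 166. The interest scores from each group could then be averaged and compared.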

After finding out if including ratings works, Mahone and three undergraduate students, Dora Pekec, Javan Jiang and Jia Dua, will look at the bigger picture of Squash’s user experience. They will use a company to find about 20 people to talk to, ideally individuals who consistently watch TV news and are familiar with fact-checking.

Participants will be asked what features they would want in real-time fact-checking.

“The whole idea is to ask people ‘Hey, if you had access to a tool that could tell you if what someone on TV is saying is true or false, what would you want to see in that tool?’ ” Mahone said. “We want to figure out what people want and need out of Squash.”

Figuring out how to make Squash intuitive is critical to its success, according to Chris Guess, the Lab’s lead technologist. Part of the challenge is that Squash is something new and viewers have no experience with similar products.

“These days, people do a lot more than just watch a debate. They’re cooking dinner, playing on their phone, watching over the kids,” Guess said. “We want people to be able to tune in, see what’s going on, check out the automated fact-checks and then be able to tune out without missing anything.”

Reporters’ Lab researchers hope to have Squash up and running for the homestretch of the 2020 presidential campaign. Adair, Knight Professor of the Practice of Journalism and Public Policy at Duke, has begun reaching out to television executives to gauge their interest in an automated fact-checking tool. 

“TV networks are interested, but they want to wait and see a product that is more developed,” Adair said.

 
