The lessons of Squash, our groundbreaking automated fact-checking platform

Squash began as a crazy dream.

Soon after I started PolitiFact in 2007, readers began suggesting a cool but far-fetched idea. They wanted to see our fact checks pop up on live TV.

That kind of automated fact-checking wasn’t possible with the technology available back then, but I liked the idea so much that I hacked together a PowerPoint of how it might look. It showed a guy watching a campaign ad when PolitiFact’s Truth-O-Meter suddenly popped up to indicate the ad was false.

Bill Adair’s original depiction of pop-up fact-checking.

It took 12 years, but our team in the Duke University Reporters’ Lab managed to make the dream come true. Today, Squash (our code name for the project, chosen because it is a nutritious vegetable and a good metaphor for stopping falsehoods) has been a remarkable success. It displays fact checks seconds after politicians utter a claim and it largely does what those readers wanted in 2007.

But Squash also makes lots of mistakes. It converts politicians’ speech to the wrong text (often with funny results) and it frequently stays idle because there simply aren’t enough claims that have been checked by the nation’s fact-checking organizations. It isn’t quite ready for prime time.

As we wrap up four years on the project, I wanted to share some of our lessons to help developers and journalists who want to continue our work. There is great potential in automated fact-checking and I’m hopeful that others will build on our success.

When I first came to Duke in 2013 and began exploring the idea, it went nowhere. That’s partly because the technology wasn’t ready and partly because I was focused on the old way that campaign ads were delivered — through conventional TV. That made it difficult to isolate ads the way we needed to.

But the technology changed. Political speeches and ads migrated to the web and my Duke team partnered with Google, Jigsaw and Schema.org to create ClaimReview, a tagging system for fact-check articles. Suddenly we had the key elements that made instant fact-checking possible: accessible video and a big database of fact checks.

I wasn’t smart enough to realize that, but my colleague Mark Stencel, the co-director of the Reporters’ Lab, was. He came into my office one day and said ClaimReview was a game changer. “You realize what you’ve done, right? You’ve created the magic ingredient for your dream of live fact-checking.” Um … yes! That had been my master plan all along!

Fact-checkers use the ClaimReview tagging system to indicate the person and claim being checked, which not only helps Google highlight the articles in search results, it also makes a big database of checks that Squash can tap.

It would be difficult to overstate the technical challenge we were facing. No one had attempted this kind of work beyond doing a demo, so there was no template to follow. Fortunately we had a smart technical team and some generous support from the Knight Foundation, Craig Newmark and Facebook.

Christopher Guess, our wicked-smart lead technologist, had to invent new ways to do just about everything, combining open-source tools with software that he built himself. He designed a system to ingest live TV and process the audio for instant fact-checking. It worked so fast that we had to slow down the video.

To reduce the massive amount of computer processing, a team of students led by Duke computer science professor Jun Yang came up with a creative way to filter out sentences that did not contain factual claims. They used ClaimBuster, an algorithm developed at the University of Texas at Arlington, to act like a colander that kept only good factual claims and let the others drain away.

Squash works by converting audio to text and then matching the claim against a database of fact-checks.

Today, this is how Squash works: It “listens” to a speech or debate, sending audio clips to Google Cloud that are converted to text. That text is then run through ClaimBuster, which identifies sentences the algorithm believes are good claims to check. They are compared against the database of published fact checks to look for matches. When one is found, a summary of that fact check pops up on the screen.
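The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Squash's actual code: the speech-to-text step is assumed to have already produced sentences, `toy_claim_filter` is a hypothetical stand-in for ClaimBuster, the word-overlap score in `best_match` is a placeholder for whatever matching Squash really uses, and the sample fact check is invented.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional, Set, Tuple


@dataclass
class FactCheck:
    claim: str    # the statement that was checked
    speaker: str  # who said it
    rating: str   # the fact-checker's ruling


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its words with edge punctuation removed."""
    words = (word.strip(".,!?\"'").lower() for word in text.split())
    return {word for word in words if word}


def best_match(sentence: str, database: List[FactCheck],
               min_overlap: float = 0.5) -> Optional[FactCheck]:
    """Return the fact check whose claim overlaps most with the sentence.

    Jaccard word overlap is an assumption made for this sketch.
    """
    sentence_words = tokenize(sentence)
    best, best_score = None, 0.0
    for check in database:
        claim_words = tokenize(check.claim)
        union = sentence_words | claim_words
        score = len(sentence_words & claim_words) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = check, score
    return best if best_score >= min_overlap else None


def run_pipeline(sentences: Iterable[str],
                 is_checkable: Callable[[str], bool],
                 database: List[FactCheck]) -> Iterator[Tuple[str, FactCheck]]:
    """Yield (sentence, fact check) pairs for checkable sentences with a match."""
    for sentence in sentences:
        if not is_checkable(sentence):
            continue  # the claim filter drains away non-factual sentences
        match = best_match(sentence, database)
        if match is not None:
            yield sentence, match


def toy_claim_filter(sentence: str) -> bool:
    """Hypothetical stand-in for ClaimBuster: keep sentences containing a number."""
    return any(ch.isdigit() for ch in sentence)


# Invented sample data, for illustration only.
database = [FactCheck("Unemployment fell to 4 percent last year",
                      "Example Candidate", "Half True")]
transcript = ["Thank you all for coming tonight.",
              "Unemployment fell to 4 percent last year."]

for sentence, check in run_pipeline(transcript, toy_claim_filter, database):
    print(f"Related fact-check: {check.rating} ({check.speaker}: {check.claim})")
```

Keeping the claim filter as a separate function mirrors the colander idea: most sentences never reach the more expensive matching step.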

The first few times you see the related fact check appear on the screen, it’s amazing. I got chills. I felt I was getting a glimpse of the future. The dream of those PolitiFact readers from 2007 had come true.

But …

Look a little closer and you will quickly realize that Squash isn’t perfect. If you watch in our web mode, which shows Squash’s AI “brain” at work, you will see plenty of mistakes as it converts voice to text. Some are real doozies.

Last summer during the Democratic convention, former Iowa Gov. Tom Vilsack said this: “The powerful storm that swept through Iowa last week has taken a terrible toll on our farmers …”

But Squash (it was really Google Cloud) translated it as “Armpit sweat through the last week is taking a terrible toll on our farmers.”

Squash’s matching algorithm also makes too many mistakes finding the right fact check. Sometimes it is right on the money. It often correctly matched then-President Donald Trump’s statements on China, the economy and the border wall.

But other times it comes up with bizarre matches. Guess and our project manager Erica Ryan, who spends hours analyzing the results of our tests, believe this often happens because Squash mistakenly thinks an individual word or number is important. (Our all-time favorite was in our first test, when it matched a sentence by President Trump about men walking on the moon with a Washington Post fact-check about the bureaucracy for getting a road permit. The match occurred because both included the word years.)

Squash works by detecting politicians’ claims and matching them with related fact-checks. (Screengrab from Democratic debate)

To reduce the problem, Guess built a human editing tool called Gardener that enables us to weed out the bad matches. That helps a lot because the editor can choose the best fact check or reject them all.

The most frustrating problem is that a lot of the time, Squash just sits there, idle, even when politicians are spewing sentences packed with factual claims. Squash is working properly, Guess assures us; it just isn’t finding any fact checks that are even close. This happened in our latest test, a news conference by President Joe Biden, when Squash could muster only two matches in more than an hour.

That problem is a simple one: There simply are not enough published fact checks to power Squash (or any other automated app).

We need more fact checks – As I noted in the previous section, this is a major shortcoming that will hinder anyone who wants to draw from the existing corpus of fact checks. Despite the steady growth of fact-checking in the United States and around the world, and despite the boom that occurred in the Trump years, there simply are not enough fact checks of enough politicians to provide enough matches for Squash and similar apps.

We had our greatest success during debates and party conventions, events when Squash could draw from a relatively large database of checks on the candidates from PolitiFact, FactCheck.org and The Washington Post. But we could not use Squash on state and local events because there simply were not enough fact-checks for possible matches.

Ryan and Guess believe we need dozens of fact checks on a single candidate, across a broad range of topics, to have enough to make Squash work.

More armpit sweat is needed to improve voice to text – We all know the limitations of Siri, which still translates a lot of things wrong despite years of tweaks and improvements by Apple. That’s a reminder that improving voice-to-text technology remains a difficult challenge. It’s especially hard in political events when audio can be inconsistent and when candidates sometimes shout at each other. (Identifying speakers in debates is yet another problem.)

As we currently envision Squash and this type of automated fact-checking, we are reliant on voice-to-text translations, but given the difficulty of automated “hearing,” we’ll have to accept a certain error level for the foreseeable future.

Matching algorithms can be improved – This is one area that we’re optimistic about. Most of our tests relied on off-the-shelf search engines to do the matching, until Guess began to experiment with a new approach to improve the matching. That approach relies on subject tags (which unfortunately are not included in ClaimReview) to help the algorithm make smarter choices and avoid irrelevant choices.

The idea is that if Squash knows the claim is about guns, it would find the best matches from published fact checks that have been tagged under the same subject. Guess found this approach promising but did not get a chance to try the approach at scale.
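Here is a rough Python sketch of that idea, under loud assumptions: the keyword lists in `SUBJECT_KEYWORDS`, the `guess_subject` helper and the sample fact checks are all invented for illustration, and a real system would need a proper subject classifier and real subject tags on each fact check. The point is only that filtering by subject first keeps a shared word like "years" from producing a match on its own.

```python
from typing import Dict, List, Optional, Set

# Hypothetical keyword lists; a stand-in for a real subject classifier.
SUBJECT_KEYWORDS: Dict[str, Set[str]] = {
    "guns": {"gun", "guns", "firearm", "firearms", "rifle"},
    "space": {"moon", "nasa", "astronauts"},
}


def words(text: str) -> Set[str]:
    """Lowercase the text and return its words with edge punctuation removed."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def guess_subject(sentence: str) -> Optional[str]:
    """Return the first subject whose keywords appear in the sentence, if any."""
    tokens = words(sentence)
    for subject, keywords in SUBJECT_KEYWORDS.items():
        if tokens & keywords:
            return subject
    return None


def match_within_subject(sentence: str, fact_checks: List[dict]) -> Optional[dict]:
    """Pick the best word-overlap match, but only among same-subject fact checks."""
    subject = guess_subject(sentence)
    if subject is None:
        return None
    candidates = [fc for fc in fact_checks if subject in fc["subjects"]]
    tokens = words(sentence)
    return max(candidates,
               key=lambda fc: len(tokens & words(fc["claim"])),
               default=None)


# Invented fact checks, for illustration only.
fact_checks = [
    {"claim": "It takes 10 years to get a road permit", "subjects": {"infrastructure"}},
    {"claim": "Gun sales doubled in two years", "subjects": {"guns"}},
]

gun_sentence = "Background checks on gun purchases have been required for years."
moon_sentence = "Men walked on the moon 50 years ago."

print(match_within_subject(gun_sentence, fact_checks))
print(match_within_subject(moon_sentence, fact_checks))
```

In this toy data, the moon sentence shares the word "years" with both fact checks, but neither is tagged with its subject, so nothing is shown.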

Until the matching improves, we’ve found humans are still needed to monitor and manage anything that gets displayed — as we did with our Gardener tool.

Ugh, UX – The simplest part of my vision, the Truth-O-Meter popping up on the screen, ended up being one of our most complex challenges. Yes, Guess was able to make the meter or the Washington Post Pinocchios pop up, but what were they referring to? This question of user experience was tricky in several ways.

First, we were not providing an instant fact check of the statement that was just said. We were popping up a summary of a related fact check that was previously published. Because politicians repeat the same talking points, the statements were generally similar and in some cases, even identical. But we couldn’t guarantee that, so we labeled the pop-up “Related fact-check.”

Second, the fact check appeared during a live, fast-moving event. So we realized it could be unclear to viewers which previous statement the pop-up referred to. This was especially tricky in a debate when candidates traded competing factual claims. The pop-up could be helpful with either of them. But the visual design that seemed so simple for my PowerPoint a decade earlier didn’t work in real life. Was that “False” Truth-O-Meter for the immigration statement Biden said? Or the one that Trump said?

Another UX problem: To give people time to read all the text (the related fact checks sometimes had lengthy statements), Guess had them linger on the screen for 15 seconds. And our designer Justin Reese made them attractive and readable. But by the end of that time the candidates might have made two more factual claims, further confusing viewers who saw the “False” meter.

So UX wasn’t just a problem, it was a tangle of many problems involving limited space on the screen (What should we display and where? Will readers understand the concept that the previous fact check is only related to what was just said?), time (How long should we display it in relation to when the politician spoke?) and user interaction (Should our web version allow users to pause the speech or debate to read a related fact check?). It’s an enormously complicated challenge.

* * *

Looking back at my PowerPoint vision of how automated fact-checking would work, we came pretty close. We succeeded in using technology to detect political speech and make relevant fact checks automatically pop up on a video screen. That’s a remarkable achievement, a testament to groundbreaking work by Guess and an incredible team.

But there are plenty of barriers that make it difficult for us to realize the dream and will challenge anyone who tries to tackle this in the future. I hope others can build on our successes, learn from our mistakes, and develop better versions in years to come.

A better ClaimReview to grow a global fact-check database

It’s now much easier for fact-checkers to use ClaimReview, a tagging tool that logs fact-checks published around the world into one database. The tool helps search engines — and readers — find non-partisan fact-checks published globally. It also organizes fact-check content into structured data that automated fact-checking will require.

Currently, only half of the roughly 160 fact-checking organizations that the Duke Reporters’ Lab tracks globally use ClaimReview. In response, Google and the Duke Reporters’ Lab have developed an easier method of labelling the articles to help both recruit more users and expand a vital fact-check data set.

The locations of only some fact-checkers tracked by the Reporters’ Lab are visible here. A revised ClaimReview may help more log their fact-checks into a growing, global database.

ClaimReview was created in 2015 after a conversation between staff at Google and Glenn Kessler, the Washington Post fact-checker. Kessler wanted Google to highlight fact-checks in its search results. Bill Adair, director of the Duke Reporters’ Lab, was soon brought in to help.

Dan Brickley from Schema.org, Justin Kosslyn from Google and Adair developed a tagging system based on the schemas maintained by Schema.org, an organization that develops structured ways of organizing information. They created a universal system for fact-checkers to label their articles to include the claim checked, who said it and a ruling on its accuracy. “It’s the infrastructure that provides the atomic unit of fact-checking to search engines,” Adair said.
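For readers who have not seen the markup, the Python sketch below builds a minimal ClaimReview record as JSON-LD. The property names (`claimReviewed`, `itemReviewed`, `reviewRating`, `alternateName`, `author`) come from the Schema.org vocabulary, but the `build_claim_review` helper and every value passed to it are invented for illustration, and real records typically carry more fields than the three elements named above.

```python
import json


def build_claim_review(claim: str, speaker: str, rating_label: str,
                       article_url: str, publisher: str) -> dict:
    """Assemble a minimal ClaimReview record: the claim, who said it, and the ruling."""
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": article_url,                      # the fact-check article
        "claimReviewed": claim,                  # the claim that was checked
        "itemReviewed": {
            "@type": "Claim",
            "author": {"@type": "Person", "name": speaker},  # who said it
        },
        "reviewRating": {
            "@type": "Rating",
            "alternateName": rating_label,       # the ruling, in the checker's words
        },
        "author": {"@type": "Organization", "name": publisher},
    }


# Invented example values, for illustration only.
markup = build_claim_review(
    claim="Unemployment fell to 4 percent last year",
    speaker="Example Candidate",
    rating_label="Half True",
    article_url="https://example.org/fact-checks/unemployment",
    publisher="Example Fact-Checker",
)

print(json.dumps(markup, indent=2))
```

Because every record shares the same shape, an application can read the claim, speaker and ruling from any publisher without custom parsing.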

Initially, ClaimReview produced a piece of code that fact-checkers copied and pasted into their online content management system. Google and other search engines look for the code when crawling content. Next, Chris Guess of Adair’s team developed a ClaimReview widget called Share the Facts, a content box summarizing fact-checks that PolitiFact, FactCheck.org and the Washington Post can publish online and share on social media.

The latest version of ClaimReview no longer requires users to copy and paste the code, which can behave inconsistently on different content management systems. Instead, fact-checkers only have to fill out Google form fields similar to what they used previously to produce the code.

While the concept of ClaimReview is simple, it opens the door to more innovation in fact-checking. It organizes data in ways that can be reused. By “structuring journalism, we can present content in more valuable ways to people,” said Adair.

By labeling fact-checks, the creators effectively created a searchable database of fact-checks, numbering about 24,000 today. The main products under development at the Reporters’ Lab, from FactStream to Squash, rely on fact-check databases. Automated fact-checking especially requires a robust database to quickly match untrue claims to previously published fact-checks.

Bill Adair presenting at Tech & Check 2019

The database ClaimReview builds offers even more possibilities. Adair hopes to tweak the fields fact-checkers fill in to provide better summaries of the fact-checks and provide more information to readers. In addition, Adair envisions ClaimReview being used to tag types of misinformation, as well as authors and publishers of false content. It could also tag websites that have a history of publishing false or misleading articles.

The tagging is already benefiting some fact-check publishers. “ClaimReview helps to highlight and surface our fact-checks on Google, more than the best SEO skills or organic search would be able to achieve,” said Laura Kapelari, a journalist with Africa Check. ClaimReview has increased traffic on Africa Check’s website and helped the smaller Africa Check compete with larger media houses, she said. It also helps fact-checkers know which facts have already been investigated, which reduces redundant checks.

Joel Luther, the ClaimReview project manager in the Reporters’ Lab, expects this new ClaimReview format will save fact-checkers time and decrease errors when labeling fact-checks. However, there is still room to grow. Kapelari wishes there was a way for the tool to automatically grab key fields such as names in order to save time.

The Reporters’ Lab has a plan to promote ClaimReview globally. Adair is already busy on that front. Early this month, a group of international fact-checkers and technologists met in Durham for Tech & Check 2019, an annual conference where people on this quest share progress on automated fact-checking projects intended to fight misinformation. Adair, an organizer of Tech & Check, emphasized new developments with ClaimReview, as well as its promise for automating fact-checking.

Not much would be possible without this tool, he stressed. “It’s the secret sauce.”

Reporters’ Lab students are fact-checking North Carolina politicians

Duke Reporters’ Lab students expanded vital political journalism during a historic midterm campaign season this fall with the North Carolina Fact-Checking Project.

Five student journalists reviewed thousands of statements that hundreds of North Carolina candidates vying for state and federal offices made online and during public appearances. They collected newsy and checkable claims from what amounted to a firehose of political claims presented as fact.

Duke computer science undergraduates with the Duke Tech & Check Cooperative applied custom-made bots and the ClaimBuster algorithm to scrape and sort checkable political claims from hundreds of political Twitter feeds.

Editors and reporters then selected claims the students had logged for most of the project’s 30-plus fact-checks and six summary articles that the News and Observer and PolitiFact North Carolina published between August and November.

Duke senior Bill McCarthy

Duke senior Bill McCarthy was part of the four-reporter team on the project, which the North Carolina Local News Lab Fund supported to expand local fact-checking during the 2018 midterms and beyond in a large, politically divided and politically active state.

“Publishing content in any which way is exciting when you know it has some value to voters, to democracy,” said McCarthy, who interned at PolitiFact in Washington, D.C. last summer. “It was especially exciting to get so many fact-checks published in so little time.”

Reporters found politicians and political groups often did not stick with the facts during a campaign season that fielded an unusually large number of candidates statewide and saw a surge in voter turnout.

The N.C. Fact-Checking Project produces nonpartisan journalism

“NC GOP falsely ties dozens of Democrats to single-payer health care plan,” read one project fact-check headline. “Democrat falsely links newly-appointed Republican to health care bill,” noted another. The fact-check “Ad misleads about NC governors opposing constitutional amendments” set the record straight about some Democratic-leaning claims about six proposed amendments to the state constitution.

And on and on.

Digging for the Truth

Work in the lab was painstaking. Five sophomores filled weekday shifts to scour hundreds of campaign websites, social media feeds, Facebook and Google political ads, televised debates, campaign mailers and whatever else they could put their eyes on. Often they recorded one politician’s attacks on an opponent that might, or might not, be true.

Students scanned political chatter from all over the state, tracking competitive state and congressional races most closely. The resulting journalism was news that people could use as they were assessing candidates for the General Assembly and U.S. Congress as well as six proposed amendments to the state constitution.

The Reporters’ Lab launched a mini news service to share each fact-checking article with hundreds of newsrooms across the state for free.

One of more than 30 N.C. Fact-Checking Project articles

The Charlotte Observer, a McClatchy newspaper like the N&O, published several checks. So did smaller publications such as Asheville’s Citizen-Times and the Greensboro News and Record. Newsweek cited a fact-check report by the N&O’s Rashaan Ayesh and Andy Specht about a fake photo of Justice Kavanaugh’s accuser, Christine Blasey Ford, shared by the chairman of the Cabarrus County GOP, which WRAL referenced in a roundup.

Project fact-checks influenced political discourse directly too. Candidates referred to project fact-checks in campaign messaging on social media and even in campaign ads. For instance, Democrat Dan McCready, who lost a close race against Republican Mark Harris in District 9, used project fact-checks in two campaign ads promoted on Facebook and in multiple posts on his Facebook campaign page.

While N&O reporter Andy Specht was reporting a deceptive ad from the Stop Deceptive Amendments political committee, the group announced plans to change it.

The fact-checking project will restart in January, when North Carolina’s reconfigured General Assembly opens its first 2019 session.


Lessons learned from fact-checking 2018 midterm campaigns

Five Duke undergraduates monitored thousands of political claims this semester during a heated midterm campaign season for the N.C. Fact-Checking Project.

That work helped expand nonpartisan political coverage in a politically divided state with lots of contested races for state and federal seats this fall. The effort resumes in January when the project turns its attention to a newly configured North Carolina General Assembly.

Three student journalists who tackled this work with fellow sophomores Alex Johnson and Sydney McKinney reflect on what they’ve learned so far.

Lizzie Bond

Lizzie Bond: After spending the summer working in two congressional offices on Capitol Hill, I began my work in the Reporters’ Lab and on the N.C. Fact-Checking Project with first-hand knowledge of how carefully elected officials and their staff craft statements in press releases and on social media. This practice derives from a fear of distorting the meaning or connotation of their words. And in this social media age where so many outlets are available for sharing information and for people to consume it, this fear runs deep.

Yet, it took me discovering one candidate for my perspective to shift on the value of our work with the N.C. Fact-Checking Project. That candidate, Peter Boykin, proved to be a much more complicated figure than any other politician whose social media we monitored. The Republican running to represent Greensboro’s District 58 in the General Assembly, Boykin is the founder of “Gays for Trump,” a former online pornography actor, a pro-Trump radio show host, and an already controversial, far-right online figure with tens of thousands of followers. Poring over nearly a dozen of Boykin’s social media accounts, I came across everything from innocuous self-recorded music video covers to contentious content, such as hostile characterizations of liberals and advocacy of conspiracy theories, including one regarding the Las Vegas mass shooting that he pushed with little to no corroborating evidence.

When contrasting Boykin’s posts on both his personal and campaign social media accounts with the more cautious and mild statements from other North Carolina candidates, I realized that catching untruthful claims has a more ambitious goal than simply detecting and reporting falsehoods. By reminding politicians that they should be accountable to the facts in the first place, fact-checking strives to improve their commitment to truth-telling. The push away from truth and decency in our politics and toward sharp antagonism and even alternate realities becomes normalized when Republican leaders support candidates like Boykin as simply another GOP candidate. The N.C. Fact-Checking Project is helping to revive truth and decency in North Carolina’s politics and to challenge the conspiracy theories and pants-on-fire campaign claims that threaten the self-regulating, healthy political society we seek.

Ryan Williams

Ryan Williams: I came into the Reporters’ Lab with relatively little journalism experience. I spent the past summer working on social media outreach and strategy at a non-profit where I drafted tweets and wrote the occasional blog post. But I’d never tuned into writing with the immense brevity of political messages during an election season. The N.C. Fact-Checking Project showed me the importance of people who not only find out what the facts are but who report them in a nonpartisan, objective manner that is accessible to an average person.

Following the 2016 election, some people blamed journalists and pollsters for creating false expectations about who would win the presidency. I was one of those critics. In the two and a half months I spent fact-checking North Carolina’s midterm races, I learned how hard fact-checkers and reporters work. My fellow fact-checkers and I compiled a litany of checkable claims made by politicians this midterm cycle. Those claims, along with claims found by the automated claim-finding algorithm ClaimBuster, were raw material for many fact-checks of some of North Carolina’s hottest races. Those checks were made available for voters ahead of polling.

Now that election day has come and gone, I am more than grateful for this experience in fact-finding and truth-reporting. Not only was I able to hone research skills, I gained a deeper understanding of the intricacies of political journalism. I can’t wait to see what claims come out of the next two years leading up to what could be the presidential race of my lifetime.

Jake Sheridan

Jake Sheridan: I’m a Carolina boy who has grown up on the state’s politics. I’ve worked on campaigns, attended the 2012 Democratic National Convention in my hometown of Charlotte and am the son of a long-time news reporter. I thought I knew North Carolina politics before working in the Reporters’ Lab. I was wrong.

While trying to wrap my head around the 300-plus N.C. races, I came to better understand the politics of this state. What matters in the foothills of the Piedmont, I found out, is different than what matters on the Outer Banks and in Asheville. I discovered that campaigns publicly release b-roll so that PACs can create ads for them and saw just how brutal attack ads can be. I got familiar with flooding and hog farms, strange politicians and bold campaign claims.

There was no shortage of checkable claims. That was good for me. But it’s bad for us. I trust politicians less now. The ease with which some N.C. politicians make up facts troubles me. Throughout this campaign season in North Carolina, many politicians lied, misled and told half truths. If we want democracy to work — if we want people to vote based on what is real so that they can pursue what is best for themselves and our country — we must give them truth. Fact-checking is essential to creating that truth. It has the potential to place an expectation of explanation upon politicians making claims. That’s critical for America if we want to live in a country in which our government represents our true best interests and not our best interests in an alternate reality.


Duke students tackle big challenges in automated fact-checking

Three Duke computer science majors advanced the quest for what some computer scientists say is the Holy Grail in fact-checking this summer.

Caroline Wang, Ethan Holland and Lucas Fagan tackled major challenges to creating an automated system that can both detect factual claims while politicians speak and instantly provide fact-checks.

That required finding and customizing state-of-the-art computing tools that most journalists would not recognize. A collective fondness for that sort of challenge helped a lot.

Duke junior Caroline Wang

“We had a lot of fun discussing all the different algorithms out there, and just learning what machine learning techniques had been applied to natural language processing,” said Wang, a junior also majoring in math.

Wang and her partners took on the assignment for a Data+ research project. Part of the Information Initiative at Duke, Data+ invites students and faculty to find data-driven solutions to research challenges confronting scholars on campus.

The fact-checking team convened in a Gross Hall conference room from 9 a.m. to 4 p.m. every weekday for 10 weeks to help each other figure out how to achieve live fact-checking, a goal of Knight journalism professor Bill Adair and other practitioners of accountability journalism.

Their goal was to do something of a “rough cut” of end-to-end automated fact-checking: to convert a political speech to text, identify the most “checkable” sentences in the speech and then match them with previously published fact-checks.

The students concluded that Google Cloud Speech-to-Text API was the best available tool to automate audio transcriptions. They then submitted the sentences to ClaimBuster, a project at the University of Texas at Arlington that the Duke Tech & Check Cooperative uses to identify statements that merit fact-checking. ClaimBuster acted as a helpful filter that reduced the number of claims submitted to the database, which in turn reduced processing time.

They chose Google Cloud speech-to-text because it can infer where punctuation belongs, Holland said. That yields text divided into complete thoughts. Google speech-to-text also shares transcription results while it processes the audio, rather than waiting until transcription is done. That speeds up how fast the new text can be moved to the next steps along a fact-checking pipeline.

Duke junior Ethan Holland

“Google will say: This is my current take and this is my current confidence that take is right. That lets you cut down on the lag,” said Holland, a junior whose second major is statistics.

Their next step was finding ways to match the claims from that speech with the database of fact-checks that came from the Lab’s Share the Facts project. (The database contains thousands of articles published by the Washington Post, FactCheck.org and PolitiFact, each checking an individual claim.)

To do that, the students adapted an algorithm that the open-source research outfit OpenAI released in June, after the students started working together. The algorithm builds on The Transformer, a new neural network computing architecture that Google researchers published just six months prior.

Duke sophomore Lucas Fagan

The architecture alters how computers organize trying to understand written language. Instead of translating a sentence word by word, The Transformer weighs the importance of each word to the meaning of every other word. Over time that system helps machines discern meaning in more and more sentences more quickly.
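A stripped-down Python sketch can show what "weighing each word against every other word" means in practice. It implements scaled dot-product self-attention in its simplest form, with one large assumption: the queries, keys and values are the raw input vectors themselves, whereas a real Transformer learns separate projections for each and stacks many such layers. The three toy word vectors are invented.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors: List[List[float]]) -> List[List[float]]:
    """Scaled dot-product self-attention with no learned projections.

    Each output vector is a weighted average of all input vectors, where the
    weights reflect how strongly that word relates to every other word.
    """
    dim = len(vectors[0])
    output = []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        output.append(mixed)
    return output


# Invented two-dimensional "word vectors" for a three-word sentence.
sentence_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(sentence_vectors):
    print([round(x, 3) for x in row])
```

Every word is scored against the whole sentence at once, rather than one position after another, which is the contrast the paragraph above draws with word-by-word processing.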

“It’s a lot more like learning English. You grow up hearing it and you learn it,” said Fagan, a sophomore also majoring in math.

Work by Wang, Holland and Fagan is expected to help jumpstart a Bass Connections fact-checking team that started this fall. Students on that team will continue the hunt for better strategies to find statements that are good fact-check candidates, produce pop-up fact-checks and create apps to deliver this accountability journalism to more people.

Tech & Check has $1.2 million in funding from the John S. and James L. Knight Foundation, the Facebook Journalism Project and the Craig Newmark Foundation to tackle that job.

At Tech & Check, some new ideas to automate fact-checking

Last week, journalists and technologists gathered at Duke to dream up new ways that automation could help fact-checking.

The first Tech & Check conference, sponsored by the Duke Reporters’ Lab and Poynter’s International Fact-Checking Network, brought together about 50 journalists, students and computer scientists. The goal was to showcase existing projects and inspire new ones.

At Tech & Check, groups of students, journalists and technologists dreamed up new ideas to automate fact-checking.

The participants included representatives of Google, IBM, NBC News, PolitiFact, Full Fact and WRAL-TV. From the academic side, we had faculty and Ph.D. students from Duke, the University of North Carolina, University of Texas-Arlington, Indiana University and the University of Michigan.

The first day featured presentations about existing projects that automate some aspect of fact-checking; the second day, attendees formed groups to conceive new projects.

The presentations showcased a wide variety of tools and research projects. Will Moy of the British site Full Fact demonstrated his claim monitoring tool, which tracks the frequency of talking points and shows how often politicians have repeated a given phrase over time. Naeemul Hassan of the University of Texas at Arlington showed ClaimBuster, a project I’ve worked on, that can ingest huge amounts of text and identify factual claims that journalists might want to check.

IBM’s Ben Fletcher showed one of the company’s new projects known as Watson Angles, a tool that extracts information from Web articles and distills it into a summary that includes key players and a timeline of events. Giovanni Luca Ciampaglia, a researcher at Indiana University, showed a project that uses Wikipedia to fact-check claims.

On the second day, we focused on the future. The attendees broke into groups to come up with new ideas for research. The groups had 75 minutes to create three ideas for tools or further research. The projects showed the many ways that automation can help fact-checking.

One promising idea was dubbed “Parrot Score,” a website that could build on the approach that Full Fact is exploring for claim monitoring. It would track the frequency of claims and then calculate a score for politicians who use canned phrases more often. Tyler Dukes, a data journalist from WRAL-TV in Raleigh, N.C., said Parrot Score could be a browser extension that showed the origin of a claim and then tracked it through the political ecosystem.
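To make the Parrot Score idea concrete, here is a rough sketch of one way such a score could be calculated. Everything in it is an assumption for illustration: the group did not define a formula, so this version simply reports the share of a speaker's statements that repeat a phrase they have already used, and the speakers and statements are invented.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase and strip punctuation so near-identical phrases match."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def parrot_scores(statements):
    """Return {speaker: share of statements that repeat an earlier phrase}.

    `statements` is a list of (speaker, text) pairs. The scoring rule is
    a hypothetical choice, not one defined at the conference.
    """
    counts = {}
    for speaker, text in statements:
        counts.setdefault(speaker, Counter())[normalize(text)] += 1

    scores = {}
    for speaker, phrases in counts.items():
        total = sum(phrases.values())
        repeats = total - len(phrases)  # everything beyond the first use
        scores[speaker] = repeats / total
    return scores


# Invented example data.
statements = [
    ("Sen. A", "We created millions of jobs."),
    ("Sen. A", "We created millions of jobs!"),
    ("Sen. A", "Crime is down."),
    ("Rep. B", "Taxes are too high."),
    ("Rep. B", "Schools need funding."),
]
scores = parrot_scores(statements)
print(scores)
```

A production version would need fuzzier matching than exact normalized text, since politicians rarely repeat a talking point word for word.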

Despite the focus on the digital future of journalism, we used Sharpies and a lot of Post-It notes.

Two teams proposed variations of a “Check This First” button that would allow people to verify the accuracy of a URL before they post it on Facebook or in a chat. One team dubbed it “ChatBot.” Clicking it would bring up information that would help users determine if the article was reliable.

Another team was assigned to focus on ways to improve public trust in fact-checkers. The team came up with several interesting ideas, including more transparency about the collective ratings for individual writers and editors as well as a game app that would simulate the process that journalists use to fact-check a claim. The app could improve trust by giving people an opportunity to form their own conclusions as well as demonstrating the difficult work that fact-checkers do.

Another team, which focused on tools for fact-checkers, came up with some interesting ideas. One tool would automatically detect when journalists were examining a claim they had checked before. Another would be something of a “sentence finisher”: when a journalist began typing something such as “The unemployment rate last month…”, it would finish the sentence with the correct number.
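The first of those tools, spotting a claim that has been checked before, could start out as simple word-overlap matching. The sketch below is one possible approach, not something the team built: the archive entries, the ratings and the 0.6 threshold are all hypothetical, and it uses Jaccard similarity (shared words divided by total distinct words) as a stand-in for more sophisticated language matching.

```python
import re


def tokens(text):
    """Split text into a set of lowercase words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all distinct words across both texts."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def find_previous_check(claim, archive, threshold=0.6):
    """Return the closest archived fact check, or None if nothing is close.

    The threshold is an arbitrary starting point that would need tuning.
    """
    best = max(
        archive,
        key=lambda entry: jaccard(claim, entry["claim"]),
        default=None,
    )
    if best is None or jaccard(claim, best["claim"]) < threshold:
        return None
    return best


# Invented archive of earlier fact checks.
archive = [
    {"claim": "The unemployment rate fell to 5 percent last month", "rating": "True"},
    {"claim": "Crime has doubled in the last decade", "rating": "False"},
]

match = find_previous_check("Unemployment rate fell to 5 percent last month", archive)
miss = find_previous_check("The senator voted against the farm bill", archive)
print(match)
print(miss)
```

Word overlap will miss paraphrases and can be fooled by a changed number, so a journalist would still need to confirm any match before reusing a rating.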

The conference left me quite optimistic about the potential for more collaboration between computer scientists and fact-checkers. Things that never seemed possible, such as checking claims against the massive Wikipedia database, are increasingly doable. And many technologists are interested in doing research and creating products to help fact-checking.

Reporters’ Lab, IFCN to host conference about automated fact-checking

The Reporters’ Lab and Poynter’s International Fact-Checking Network will host “Tech & Check”, the first conference to explore the promise and challenges of automated fact-checking.

Tech & Check, to be held March 31-April 1 at Duke University, will bring together experts from academia, journalism and the tech industry. The conference will include:

  1. Demos and presentations of current research that automates fact-checking
  2. Discussions about the institutional challenges of expanding the automated work
  3. Discussions on new areas for exploration, such as live fact-checking and automated annotation

Research in computational fact-checking has been underway for several years, but has picked up momentum with a flurry of new projects.

While automating fact-checking entirely is still the stuff of science fiction, parts of the fact-checking process such as gathering fact-checkable claims or matching them with articles already published seem ripe for automation. As natural language processing (NLP) and other artificial intelligence tools become more sophisticated, the potential applications for fact-checking will increase.

Indeed, around the world several projects are exploring ways to make fact-checking faster and smarter through the use of technology. For example, at Duke University, an NSF-funded project uses computational power to help fact-checkers verify common claims about the voting records of members of Congress. The University of Texas-Arlington has developed a tool called ClaimBuster that can analyze long transcripts of debates and suggest sentences that could be fact-checked. At Indiana University, researchers have experimented with a tool that uses Wikipedia and knowledge networks to verify simple statements. Fact-checkers in France, Argentina, the U.K. and Italy are also doing work in this field.

The conference is made possible with support from, among others, the Park Foundation. More details will be published in the coming weeks.

Researchers and journalists interested in attending the conference should contact the International Fact-Checking Network at

Reporters’ Lab projects featured at Computation + Journalism conference

Two projects from the Duke Reporters’ Lab were featured at the 2015 Computation + Journalism Symposium, which was held over the weekend at Columbia University in New York.

The two-day conference included presentations about Structured Stories NYC, an experiment that involved three Duke students covering events in New York, and a separate project that is exploring new ways to automate fact-checking.

Structured Stories, which uses a unique structured journalism approach to local news, was the topic of a presentation by David Caswell, a fellow at the Reynolds Journalism Institute.

Caswell explained Structured Stories in a presentation titled “The Editorial Aspects of Reporting into Structured Narratives.”

Structured Stories NYC is one of the boldest experiments of structured journalism because it dices the news into short events that can be reassembled in different ways by readers. The site is designed to put readers in charge by allowing them to adjust the depth of story coverage.

On the second day of the conference, Reporters’ Lab Director Bill Adair and Naeemul Hassan, a Ph.D. student in computer science at the University of Texas-Arlington, made a presentation that Adair said was “a call to arms” to automate fact-checking. It was based on a paper called “The Quest to Automate Fact-Checking” that they co-authored with Chengkai Li and Mark Tremayne of the University of Texas-Arlington, Jun Yang of Duke, James Hamilton of Stanford University and Cong Yu of Google.

At the conference, Naeemul Hassan explained how the UT-Arlington computer scientists used machine learning to determine the attributes of a factual claim.

Adair spoke about the need for more research to achieve the “holy grail” of fully automated, instant fact-checking. Hassan gave a presentation about ClaimBuster, a tool that analyzes text and predicts which sentences are factual claims that fact-checkers might want to examine.

The Reporters’ Lab is working with computer scientists and researchers from UT-Arlington, Stanford and Google on the multi-year project to explore how computational power can assist fact-checkers.

