Reporters’ Lab, IFCN to host conference about automated fact-checking

The March 31-April 1 conference will showcase new research to use computational power to help fact-checkers.

By Bill Adair – January 21, 2016

The Reporters’ Lab and Poynter’s International Fact-Checking Network will host “Tech & Check,” the first conference to explore the promise and challenges of automated fact-checking.

Tech & Check, to be held March 31-April 1 at Duke University, will bring together experts from academia, journalism and the tech industry. The conference will include:

  1. Demos and presentations of current research that automates fact-checking
  2. Discussions about the institutional challenges of expanding the automated work
  3. Discussions on new areas for exploration, such as live fact-checking and automated annotation

Research in computational fact-checking has been underway for several years, but has picked up momentum with a flurry of new projects.

While automating fact-checking entirely is still the stuff of science fiction, parts of the fact-checking process, such as gathering fact-checkable claims or matching them with articles already published, seem ripe for automation. As natural language processing (NLP) and other artificial intelligence tools become more sophisticated, the potential applications for fact-checking will increase.

Indeed, around the world several projects are exploring ways to make fact-checking faster and smarter through the use of technology. For example, at Duke University, an NSF-funded project uses computational power to help fact-checkers verify common claims about the voting records of members of Congress. The University of Texas-Arlington has developed a tool called ClaimBuster that can analyze long transcripts of debates and suggest sentences that could be fact-checked. At Indiana University, researchers have experimented with a tool that uses Wikipedia and knowledge networks to verify simple statements. Fact-checkers in France, Argentina, the U.K. and Italy are also doing work in this field.

The conference is made possible with support from, among others, the Park Foundation. More details will be published in the coming weeks.

Researchers and journalists interested in attending the conference should contact the International Fact-Checking Network at factchecknet@poynter.org.


Week 6 of Structured Stories: Could we do this from a warehouse in Durham?

Students on the team explore whether they could actually work from a remote location far from the city they're covering.

By Natalie Ritchie & Rachel Chason – July 14, 2015

Field notes by the Structured Stories NYC team: Ishan Thakore, Natalie Ritchie and Rachel Chason.

When Bill visited our New York office last week, we talked about how the project was going and, more specifically, the utility of original reporting. The lesson from last week’s blog post was that attending meetings isn’t really critical for Structured Stories. At one point, Bill asked, “Could we operate Structured Stories NYC from a warehouse in Durham?”

Our quick reply — probably so.

As we mulled it over, we all agreed. We could have done this anywhere.

Because so many resources are available online, from court documents to live videos of committee hearings, remote reporting is both feasible and efficient.

Traditional reporters still need the immediate access to sources, the details of a scene and the off-hand remarks that can only be caught in person. But for us, the situation is different.

While most news organizations focus more on breaking news, we have preferred in-depth, historical research that provides background and context to recent events. And the archived news articles, historical records and statistics that we need to describe those events and stories can all be found online.

Granted, if we weren’t in New York, Ishan might not have developed his relationships with WNYC reporters, Natalie wouldn’t have talked to Josh Mohrer and Rachel wouldn’t have met police brutality protesters in Union Square.

At the end of the day, however, we all would’ve been able to create the same number of events whether in New York or in a warehouse in Durham. Remote reporting is uniquely feasible in this Structured Stories project.

But being disconnected from the stories we’re covering has been something of a downside to the project.

For three budding journalists who enjoy getting out and talking to people, Structured Stories NYC has not been quite what we expected. Inputting events has at times felt tedious, and we’re largely cloistered in our office all day. While some people might find this work rewarding, we doubt traditional journalists would if they had to do it full-time.

But we think there might be a good balance in this scenario: a beat reporter who spends most of the day covering the news in a traditional way and concludes with an hour or two structuring stories.

That would give the reporter a more well-rounded job experience and provide Structured Stories with the expertise of a skilled journalist.


At the end of each year, we struggle to give a 12-month period of our lives an identity. Time Magazine picks a Person of the Year, Barbara Walters whittles everything down into an hour’s worth of interviews—always, a winner must be named.

Last year, several stories in different states formed a common thread about the relationship between race and crime. The summer’s heat boiled over with weeks of protest in Ferguson, Mo., following the shooting death of 18-year-old African American Michael Brown by a police officer. And then the unrest spread after grand juries failed to indict officers involved in the deaths of Brown and Eric Garner, whose fatal strangulation by a New York City police officer was captured in a viral YouTube video.

Just as we got set to turn the calendar toward a new year, New York was again struck by tragedy, this time in its police department when police officers Rafael Ramos and Wenjian Liu were ambushed and executed in what is being considered an act of retribution for Garner’s death.

These developments, both tragic and captivating, were my primary motivation for digging into the Durham Police Department in an exercise of web scraping, an automated process that copies content from websites, allowing you to analyze or republish it.

Every web page you visit on the Internet is nothing more than a series of tables and lists. And although some pages are more complicated than others and contain many moving parts, there are always ways you can pull data directly out of pages with short lines of code and easy-to-use widgets. The best part: as long as your code remains active, your data will continue to update in real time.

Some background on my project: During the fall semester of my senior year, I encountered a late-college crisis of sorts as a humanities major with an extensive journalism background but few quantitative skills. Tasked with a research project to help complete my Public Policy degree, I decided to use the opportunity as an excuse to learn new computer skills, which is how I settled on web scraping.

My journey began at square one — the absolute basics of coding. After a few weeks spent learning the ins and outs of HTML, CSS and Python, I was ready to learn how basic scrapers worked. The next couple weeks were spent learning about scrapers and doing scraping exercises from a textbook before compiling a list of dozens of Durham city and county organizations I could potentially scrape. This ultimately landed me on the Durham Police Department, which publishes an intriguing list of unsolved homicides on its website for all to see.

A map of Durham’s unsolved homicides indicates that the majority of cold cases are located on the city’s eastern half, farther away from Duke University.

After painstaking trial and error (and mostly error), I developed a scraper using a Google Doc that pulled all of Durham County’s unsolved homicide victims, dates and locations into a spreadsheet: 25 years’ worth of cold cases. Using Python and the web app ScraperWiki, I wrote a loop that pulled every sub-URL present on the page into a long list, extracted the links to the victims’ individual pages and inserted them into the sheet. This allowed me to write a second scraper that pulled the full description of the homicide out of each victim’s page. I then plotted my data on an interactive map.
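For readers curious what the first step might look like, here is a minimal sketch of the "pull every sub-URL into a list" idea, using only Python's standard library. It is not the scraper I built on ScraperWiki. The sample HTML, the `example.org` address and the `/cases/` path are all hypothetical stand-ins, since the police department's actual markup isn't reproduced in this post.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Turn relative links into absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def collect_links(html, base_url):
    """Return every link found in the given HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical index page; a real run would download the HTML first.
SAMPLE_INDEX = """
<ul>
  <li><a href="/cases/victim-a.html">Victim A</a></li>
  <li><a href="/cases/victim-b.html">Victim B</a></li>
  <li><a href="/contact.html">Contact</a></li>
</ul>
"""
BASE_URL = "https://example.org/unsolved/"  # assumed address, for illustration only

all_links = collect_links(SAMPLE_INDEX, BASE_URL)

# Keep only the links that look like individual victim pages.
victim_links = [url for url in all_links if "/cases/" in url]

for url in victim_links:
    print(url)
```

A second scraper would then visit each URL in `victim_links` and pull the description out of that page. How it does so depends entirely on how the real pages are laid out.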

Some context and reactions to my analysis of Durham’s unsolved homicides:

  • Currently, there are just 28 unsolved homicides from the last 25 years. To put that in perspective, Durham had 30 homicides in 2013 and has since solved all but one of them.
  • Of the 24 unsolved homicides that took place within Durham city limits, 20 of them took place on the city’s East side. East Durham is less developed and more poverty-stricken than the West side, home to Duke University, the city’s downtown area and most of its urban gentrification.
  • Of the 28 unsolved homicides, five of the victims were white (17.9 percent), 17 were African American (60.7 percent) and six were Hispanic (21.4 percent). There has also been only one unsolved homicide with a white victim in the past 18 years. For reference, Durham County’s 2013 Census statistics indicate that 42.1 percent of residents were white, 13.5 percent Hispanic and 38.7 percent African American.
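
The percentages above can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the counts and Census figures quoted in the list. The variable names are my own, and the last column is a simple ratio of victim share to population share, not a formal statistical test.

```python
# Counts of unsolved-homicide victims, as quoted in the list above.
victims = {"white": 5, "African American": 17, "Hispanic": 6}

# Durham County 2013 Census shares, as quoted in the list above.
census_pct = {"white": 42.1, "African American": 38.7, "Hispanic": 13.5}

total = sum(victims.values())

# Each group's share of unsolved homicides, rounded to one decimal place.
shares = {group: round(100 * count / total, 1) for group, count in victims.items()}

for group, share in shares.items():
    ratio = share / census_pct[group]
    print(f"{group}: {share}% of victims vs. {census_pct[group]}% of residents "
          f"(ratio {ratio:.2f})")
```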