Three Duke computer science majors spent this summer advancing the quest for what some computer scientists call the Holy Grail of fact-checking.
Caroline Wang, Ethan Holland and Lucas Fagan tackled major challenges to creating an automated system that can both detect factual claims while politicians speak and instantly provide fact-checks.
That required finding and customizing state-of-the-art computing tools that most journalists would not recognize. A collective fondness for that sort of challenge helped a lot.
Duke junior Caroline Wang
“We had a lot of fun discussing all the different algorithms out there, and just learning what machine learning techniques had been applied to natural language processing,” said Wang, a junior also majoring in math.
Wang and her partners took on the assignment for a Data+ research project. Part of the Information Initiative at Duke, Data+ invites students and faculty to find data-driven solutions to research challenges confronting scholars on campus.
The fact-checking team convened in a Gross Hall conference room from 9 a.m. to 4 p.m. every weekday for 10 weeks to work out how to achieve live fact-checking, a goal of Knight journalism professor Bill Adair and other practitioners of accountability journalism.
Their goal was to do something of a “rough cut” of end-to-end automated fact-checking: to convert a political speech to text, identify the most “checkable” sentences in the speech and then match them with previously published fact-checks.
The students concluded that the Google Cloud Speech-to-Text API was the best available tool to automate audio transcriptions. They then submitted the transcribed sentences to ClaimBuster, a project at the University of Texas at Arlington that the Duke Tech & Check Cooperative uses to identify statements that merit fact-checking. ClaimBuster acted as a helpful filter that reduced the number of claims submitted to the fact-check database, which in turn reduced processing time.
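The filtering step can be pictured with a small sketch. It does not call ClaimBuster; the scorer `toy_checkworthiness`, its word list and the 0.5 threshold are hypothetical stand-ins chosen only to show how a score-and-threshold filter shrinks the list of sentences sent on for matching.

```python
def toy_checkworthiness(sentence):
    """Hypothetical stand-in for a claim scorer; NOT ClaimBuster's model.

    Counts numbers and a few quantity words, then maps the count to 0-1.
    """
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    signals = {"percent", "million", "billion", "more", "less", "increased", "cut"}
    hits = sum(1 for w in words if w in signals or any(c.isdigit() for c in w))
    return min(1.0, hits / 3)


def filter_claims(sentences, scorer, threshold=0.5):
    """Keep only sentences whose score meets the (assumed) threshold."""
    return [s for s in sentences if scorer(s) >= threshold]


# Invented example sentences, not taken from any real speech.
transcript = [
    "Thank you all for coming tonight.",
    "Unemployment fell to 4 percent last year.",
    "We cut taxes by 2 billion dollars.",
    "God bless you.",
]
candidates = filter_claims(transcript, toy_checkworthiness)
print(candidates)
```

Only the two sentences containing figures pass the filter, so only those would be compared against the fact-check database.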
They chose Google Cloud Speech-to-Text because it can infer where punctuation belongs, Holland said. That yields text divided into complete thoughts. The service also shares transcription results while it processes the audio, rather than waiting until the transcription is done. That speeds up how quickly new text can move to the next steps in a fact-checking pipeline.
Duke junior Ethan Holland
“Google will say: This is my current take and this is my current confidence that take is right. That lets you cut down on the lag,” said Holland, a junior whose second major is statistics.
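A rough sketch of how interim results can cut lag follows. The `Hypothesis` class and the 0.9 cutoff are assumptions made for illustration; they are not the Google API's types or defaults. The idea is simply to pass a sentence down the pipeline as soon as it is final or confident enough, and to pass it only once.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical shape of one streaming update; not a Google API type."""
    text: str
    confidence: float
    is_final: bool


def forward_early(updates, min_confidence=0.9):
    """Yield each sentence once, as soon as it is final or confident enough."""
    sent = set()
    for update in updates:
        ready = update.is_final or update.confidence >= min_confidence
        if ready and update.text not in sent:
            sent.add(update.text)
            yield update.text


# Invented stream of updates for a single spoken sentence.
stream = [
    Hypothesis("We cut", 0.40, False),
    Hypothesis("We cut taxes.", 0.93, False),
    Hypothesis("We cut taxes.", 0.97, True),
]
forwarded = list(forward_early(stream))
print(forwarded)
```

In this sketch the sentence is forwarded at the second update rather than the third. The trade-off is that an early guess may later be revised, which a real pipeline would need to handle.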
Their next step was finding ways to match the claims from that speech with the database of fact-checks that came from the Lab’s Share the Facts project. (The database contains thousands of articles published by the Washington Post, FactCheck.org and PolitiFact, each checking an individual claim.)
To do that, the students adapted an algorithm that the open-source research outfit OpenAI released in June, after the students started working together. The algorithm builds on The Transformer, a new neural network computing architecture that Google researchers published just six months prior.
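The students' adapted model is not reproduced here. As a far simpler stand-in, the sketch below matches a claim to the most similar fact-check using word-count cosine similarity, which shows the shape of the matching step without any neural network. The claim and the two fact-check texts are invented placeholders, not entries from the Share the Facts database.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(claim, fact_checks):
    """Return (score, text) for the fact-check most similar to the claim."""
    claim_vec = bag_of_words(claim)
    scored = [(cosine(claim_vec, bag_of_words(fc)), fc) for fc in fact_checks]
    return max(scored, key=lambda pair: pair[0])


# Invented placeholder texts.
fact_checks = [
    "The unemployment rate fell to 4 percent last year",
    "The new law cut taxes for most families",
]
score, match = best_match("Unemployment fell to 4 percent", fact_checks)
print(round(score, 2), match)
```

Word overlap misses paraphrases such as "joblessness dropped," which is one reason a model that learns meaning, like the one the students adapted, is attractive for this step.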
Duke sophomore Lucas Fagan
The architecture changes how computers go about understanding written language. Instead of translating a sentence word by word, The Transformer weighs the importance of each word to the meaning of every other word. Over time that system helps machines discern meaning in more and more sentences more quickly.
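The weighing of every word against every other word can be sketched in a few lines. This is a toy version of scaled dot-product self-attention: the two-number word vectors are made up, and the learned projections that a real Transformer applies before scoring are left out to keep the idea visible.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """For each word, weigh every word in the sentence and blend their vectors."""
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs


words = ["taxes", "went", "up"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up embeddings
weights, outputs = self_attention(vectors)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much one word draws on every word in the sentence, itself included. In a trained model those weights come from learned parameters rather than hand-picked numbers.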
“It’s a lot more like learning English. You grow up hearing it and you learn it,” said Fagan, a sophomore also majoring in math.
Work by Wang, Holland and Fagan is expected to help jumpstart a Bass Connections fact-checking team that started this fall. Students on that team will continue the hunt for better strategies to find statements that are good fact-check candidates, produce pop-up fact-checks and create apps to deliver this accountability journalism to more people.
Tech & Check has $1.2 million in funding from the John S. and James L. Knight Foundation, the Facebook Journalism Project and the Craig Newmark Foundation to tackle that job.
The Reporters’ Lab has been awarded an ONA Challenge Grant for a project that will develop new forms of journalism to cover local government in New York City.
Structured Stories NYC will use a structured journalism approach to cover major stories in New York this summer. It will be a new form of storytelling, a networked account of local news that accumulates over time and enables the local community to quickly access, query, and contribute to sprawling and complex local government stories.
The project will be run by the Duke Reporters’ Lab in conjunction with Structured Stories, a news platform being developed by former Yahoo! product director David Caswell, and WNYC Radio, New York’s flagship public radio station.
The Duke team will be headed by Bill Adair, the Knight Professor of the Practice of Journalism and Public Policy. The students are Ishan Thakore, Natalie Ritchie and Rachel Chason.
The students will spend the summer covering local government in New York and will be publishing on structuredstoriesnyc.com. They will meet periodically with journalists from WNYC’s newsroom who will help the students select topics to follow.
The Reporters’ Lab will receive $35,000 for the project. For more details about it, see our entry.
Structured journalism is a new way to present the news. Instead of the traditional news article, it dices the news into smaller fields that readers can sort, tally and combine in different ways. Examples of structured journalism include Homicide Watch, which tracks homicides in several cities, and PolitiFact’s Truth-O-Meter fact-checking.
ONA is the world’s largest association of digital journalists. The ONA Challenge Fund was created in 2014 to encourage journalism programs to experiment with new ways of providing news and information. This year’s winning projects cover issues ranging from poverty to juvenile justice, and food truck lines to logging.
The capital of Iran’s fact-checking movement is not in Tehran, but Toronto.
When Farhad Souzanchi wanted to promote government accountability in his home country of Iran and track the campaign promises of President Hassan Rouhani, his only choice was to open an office in Canada, more than 6,000 miles away. For the last 18 months, the Rouhani Meter — a unique fact-checking website because it is run remotely from another country — has broken new ground in fact-checking journalism.
Since Hassan Rouhani was sworn in as Iran’s seventh president Aug. 3, 2013, Souzanchi and his team have been tracking and updating a list of promises made during Rouhani’s campaign and the first 100 days of his presidency. The project, a collaborative effort between ASL19, a research organization that helps Iranians circumvent Iran’s internet censorship, and the University of Toronto’s Munk School of Global Affairs, has researched 73 promises and rated them as Achieved, In Progress, Not Achieved or Inactive.
“When Rouhani came, he campaigned on hope and presented himself as a moderate. He said he would fix the image of Iran on the international stage, and with that came a lot of exciting promises,” Souzanchi said. “Our main goal was to promote conversation over these issues — government accountability and government transparency.”
There is virtually no transparency in Iran, which ranks 173rd out of 180 countries in Reporters Without Borders’ 2014 World Press Freedom Index. The Rouhani Meter is currently the only active fact-checking project in the entire Middle East.
In a world with a 24-hour news cycle and a growing global fact-checking movement, politicians in countries with a free press are growing accustomed to having their words scrutinized. In the United States, White House aides and members of Congress often cite fact-checking websites. But you won’t find Iranian officials citing the Rouhani Meter—they won’t even acknowledge the site’s existence.
“President Rouhani once said that people are monitoring us through the Internet. It was an indirect mention of it,” Souzanchi said. “But they haven’t addressed Rouhani Meter directly. They don’t want to legitimize it.”
To date, the Rouhani Meter has followed 73 promises made by Iranian President Hassan Rouhani during his campaign and first 100 days in office.
For a staff working from across the Atlantic Ocean, access to reliable information is the biggest challenge in day-to-day reporting. Iran’s government maintains tight control over public information. ASL19 policies dictate that its reporting cannot involve collaboration with sources inside Iran, which would pose a risk to the sources’ safety.
The Rouhani Meter is forced to follow the Iranian press and collaborate with journalists working outside the country to check the president’s promises, a tactic that has impressed researchers who study the global fact-checking movement.
“It’s hard to imagine how you go about that without having access to data from the government or groups within the country,” said Lucas Graves, assistant professor at University of Wisconsin-Madison’s School of Journalism and Mass Communication. “With how complicated and nuanced these questions very often get, even a seemingly-straight-forward fact-check sometimes takes several days to research. Having seen these processes up close, I can’t imagine the difficulties of having to do this from halfway around the world.”
The site’s painstakingly meticulous process has paid dividends: it has gone without an error to date.
Of the 73 registered promises on the Rouhani Meter, 11 percent are considered “Achieved” and 36 percent are designated “In Progress.” Five percent of promises are labeled “Not Achieved,” with the remaining 48 percent inactive. Promises on the site are broken down into four categories—socio-cultural, domestic policy, economy and foreign policy, which were the pillars of Rouhani’s campaign.
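As a quick check on those figures, the percentages can be turned back into approximate promise counts. The counts below are inferred from the rounded percentages and are not reported by the Rouhani Meter itself, so they should be read as estimates.

```python
TOTAL_PROMISES = 73

# Percentages as stated in the article.
shares = {"Achieved": 11, "In Progress": 36, "Not Achieved": 5, "Inactive": 48}

# Estimated counts: each share of 73, rounded to the nearest whole promise.
counts = {
    label: round(TOTAL_PROMISES * pct / 100) for label, pct in shares.items()
}

for label, count in counts.items():
    print(f"{label}: about {count} promises")
print("Total:", sum(counts.values()))
```

The rounded estimates add back up to 73, which suggests the published percentages are consistent with one another.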
A sample of some of the socio-cultural promises the Rouhani Meter is currently tracking.
Some promises are easy to check. For example, Rouhani’s promise to re-open Iran’s House of Cinema was easily verified when the theater was opened Sept. 12 by deputy culture minister Hojjatollah Ayoubi. Rouhani’s plan to establish a Ministry of Women is yet to come to fruition, so the promise is designated as “Not Achieved.” Other promises are much more difficult to track, particularly those involving the economy. With little economic data available (and healthy doses of skepticism about that data’s validity), tracking Rouhani’s pledge to increase Iran’s economic growth poses a major challenge. The promise is currently designated by the Rouhani Meter as “In Progress.”
Since its launch on the day of Rouhani’s inauguration, the site has been visited more than 20 million times by 3.6 million unique visitors across the world. The Rouhani Meter is available in English, but the site’s Farsi version makes up more than 95 percent of its traffic. Reports on the site are often written in Farsi before being translated to English, but Souzanchi said that process varies.
Viewing the site from inside Iran presents a challenge all its own. A month after the site launched, it was blocked by the Iranian government. It can still be accessed with Internet circumvention tools and virtual private networks.
Souzanchi indicated that a lack of mainstream accessibility does not affect readership. Internet circumvention is a way of life in the tech-savvy nation of Iran, where nearly three-fourths of the country’s population is under the age of 40.
Iranians are accustomed to using circumvention tools to access popular websites such as Facebook and Twitter, so they can easily use them to see the Rouhani Meter.
“It hasn’t been a problem reaching people,” Souzanchi said.
Despite the Rouhani Meter’s goal to give Iranian citizens access to information, the project has some opponents inside the country’s borders. Much of this is because Souzanchi was inspired to start the site after seeing the Morsi Meter in Egypt, which tracked promises made by President Mohamed Morsi until he was overthrown in a coup.
Because Morsi was ultimately overthrown, conservative Iranians have attacked the Rouhani Meter, fearing that the website conspires to carry out a similar plot in Iran. Souzanchi says that claim is not true.
“My answer to those who accuse Rouhani Meter of overthrowing President Rouhani is that our project is not about that,” Souzanchi said. “It is about encouraging political accountability in government. We, and I believe all healthy promise tracking platforms, are focused on accurate reporting based on strong research. Our reports on promises, which may be sometimes positive or negative, are always backed by the best data we have access to.
“In order to be a reliable and transparent source of information, promise trackers cannot and will not side with or against political leadership. Meters and fact-checking websites are ultimately there to help citizens to make informed, evidence-based decisions in a democratic process—and if we did our job, encourage healthy discussion.”
As the site continues to grow, the Rouhani Meter team has launched the Majlis Monitor, a new website that tracks activities in the Iranian parliament. Souzanchi also is looking for ways to expand its coverage to Iranians around the world.
A more challenging long-term goal is the expansion from promise-checking into fact-checking, which Graves said would be an even tougher task for an organization that works remotely. But the organization, which refuses to let an ocean, opaque government activities and censored internet access stand in its way, thinks it is up to the challenge.
“Through close collaborations with experts, activists, Iran-focused institutions and of course crowdsourcing hopefully we can overcome the challenges of limited access to information as much as possible,” Souzanchi said. “As ASL19’s motto goes, ‘There is always a way!’”
Update, March 17: We clarified our description of the unusual remote approach of the Rouhani Meter to say that it is “a unique fact-checking website because it is run remotely from another country.” As far as we know, it is the only one run entirely from another country, but there are some sites in which fact-checkers in one nation also fact-check claims from another nation.
There’s been lots of harrumphing about the decline in local coverage of Congress. Many Washington bureaus have been closed and there are fewer reporters covering congressional delegations.
But is the coverage as weak as the critics suspect?
To find out, students in my Washington in a New Media Age class examined how the local media covered their representatives in Congress last year. Using the Nexis and America’s News databases, the students tallied stories about their lawmakers and analyzed the content.
The results justify the harrumphing. With few exceptions, local coverage of lawmakers is skimpy and superficial. The students found that coverage is particularly anemic for incumbents who are heavily favored, a group that has grown as more districts have been gerrymandered.
The student findings reveal an unexpected side effect of gerrymandering. It hasn’t just skewed the composition of congressional districts; it has become a justification for less news coverage. When a race is likely to be lopsided, editors often conclude they don’t need to cover the race or provide even the most basic coverage of an incumbent. So once House members have a safe seat, they are likely to receive less scrutiny from the news media.
The average House member was mentioned in 160 news stories in print, online and television outlets, according to the data the students collected. That number sounds pretty respectable at first. But the number varied widely depending on whether the seat was considered up for grabs. It was high for a closely contested seat such as Colorado’s 6th District (310 mentions) and low for the least competitive seats, such as the heavily Democratic 11th District in Virginia (51).
The students found little coverage by television stations, although it’s difficult to draw conclusions for all markets because of wide variations in how coverage is archived.
Even when the overall number is high, it doesn’t tell the full story. When the students examined the articles, they found a large portion had little or no discussion of policy or issues. And even when the coverage dealt with issues, it often provided little substance, the students found.
Student Thamina Stoll spent several hours reviewing the coverage of Rep. Loretta Sanchez, D-Calif., but came away with only a vague idea about what kind of lawmaker she is. “I still have no clue other than that she enjoys taking pictures for Christmas Cards, isn’t as involved in the immigration debate as she should be and that she appears to stress the importance of education,” Stoll wrote. “How should a voter feel comfortable voting for her again?”
Jordan DeLoatch, a student from the Raleigh-Durham area, found 171 mentions of his representative, Republican George Holding. But much of the coverage was shallow. “There was no fact-checking, no following up and no real attempt to dig deeper into the race,” DeLoatch wrote.
There were a few notable exceptions. The Denver Post and other news organizations in Colorado provided some good enterprise coverage of GOP Rep. Mike Coffman. And despite its national and international focus, the Washington Post did some good coverage of lawmakers in the Washington area.
But more often, the students found shallow reporting and a lack of questioning. News organizations, shrunken by the disruption of the digital age, have scaled back their accountability journalism. Many are more willing to publish a lawmaker’s op-ed than to assign a reporter who will ask critical questions.
Student Allie Eisen, writing about the 11th District in North Carolina around Asheville, found the coverage to be fawning and uncritical. She summed it up by saying that incumbent Republican Mark Meadows “is in the business of writing his own local headlines, and is wildly successful at doing so.”