Facebook’s terms of service obstruct important journalistic research

Understanding how misinformation spreads on Facebook is an issue of enormous importance, and it’s exactly the type of public interest journalistic research that Cameron Hickey set out to do.

Hickey, who worked for Miles O’Brien Productions and created content for PBS NewsHour, developed an approach to identify previously unknown sources of misinformation on Facebook.

“If you have a set of sources of misinformation, and you want to identify new such sources, a key place to look is the people who already engage with the known, existing sources,” Hickey said. “When I looked at those pages, for example, my grandmother showed up as having liked them. By looking at the other pages she liked, I found hundreds of others that were not previously identified as junk news sources.”
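The approach Hickey describes can be sketched in a few lines of Python. This is a minimal illustration on made-up, in-memory data, not Hickey's actual tool, and it does not collect anything from Facebook. The function name `candidate_pages`, the `likes_by_user` mapping, and the `min_shared_users` threshold are all hypothetical names introduced here for illustration.

```python
from collections import Counter


def candidate_pages(known_sources, likes_by_user, min_shared_users=2):
    """Rank pages not yet flagged by how many engaged users like them.

    known_sources: iterable of page names already identified as misinformation.
    likes_by_user: dict mapping a user to the list of pages that user likes
                   (hypothetical toy data, assumed to be already available).
    """
    known = set(known_sources)

    # Step 1: find users who like at least one known source.
    engaged = [user for user, pages in likes_by_user.items() if known & set(pages)]

    # Step 2: count the other pages those users like.
    counts = Counter()
    for user in engaged:
        for page in set(likes_by_user[user]) - known:
            counts[page] += 1

    # Step 3: keep pages shared by enough engaged users, most shared first.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(page, n) for page, n in ranked if n >= min_shared_users]


# Toy example with invented names.
likes = {
    "user_a": ["known_junk_1", "page_x", "page_y"],
    "user_b": ["known_junk_1", "page_x"],
    "user_c": ["page_y", "page_z"],
}
print(candidate_pages(["known_junk_1"], likes))
```

The output is only a list of candidates for a person to review. A page that many of the same users like is not necessarily a source of misinformation.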

Although the social media platform offers several application programming interfaces (APIs), these do not enable the type of work Hickey wants to pursue. Facebook’s API makes it possible to retrieve the content of pages, but not the list of pages that a user has “liked.”

Hickey could go through these pages one by one. Or he could build a tool to extract this information automatically by “scraping.”

“This would have necessitated directly extracting content from Facebook pages,” he said. But Hickey was advised that the potential legal risk of moving forward with the project was prohibitively high.

Both the Department of Justice and Facebook have interpreted violations of a website’s terms of service to be violations of the Computer Fraud and Abuse Act (CFAA), a vague and overly broad computer crime law that has been used to indict and imprison freedom of information activists and security researchers. Because of this, some legal experts are concerned that researchers whose projects involve automated data collection could theoretically be prosecuted under the CFAA.

Hickey’s proposed project isn’t the only type of journalistic research that could violate Facebook’s terms of service. Kashmir Hill, a journalist at Gizmodo Media, has been investigating how Facebook maps users’ social relationships and suggests connections with new people that users may know. The algorithms by which Facebook generates the “People You May Know” (PYMK) list are opaque, and even after Facebook suggested Hill connect with a relative she did not know existed, the platform would not tell her how it made that recommendation.

(Disclosure: Hill is married to Freedom of the Press Foundation Executive Director Trevor Timm.)

In January, Hill and her colleague Surya Mattu at Gizmodo’s Special Projects Desk released an application that allows people to study the friend recommendations Facebook generates. Soon after, Facebook got in touch.

“I discussed the general concept of the PYMK inspector with the team with respect to whether it is possible to build the inspector in a policy compliant manner and our engineers confirmed that our Platform does not support this,” Facebook’s head of policy Allison Hendrix wrote in an email in February.

Hill said that Facebook communicated that the tool violated the platform’s terms, in part because it collects data using automated means that Facebook does not make available through its API, and asked that the “People You May Know Inspector” be taken down. (After changes were made to address Facebook’s security concerns, the platform still objected to the inspector. The tool remains live.)

By unilaterally prohibiting automated data collection, Facebook’s terms of service obstruct public interest journalism and academic research, including critically important projects like Hickey’s and Hill’s.

The Knight First Amendment Institute at Columbia University has sent an open letter to Facebook’s executives proposing that the platform modify its terms of service to “create a safe harbor for certain journalism and research on its platform.” Freedom of the Press Foundation supports this effort, which is critical for the future of digital investigative journalism and accountability for social media giants.

Journalists and researchers could face serious consequences for using digital investigative tools, the Knight First Amendment Institute wrote in its letter to Facebook. “Their accounts may be suspended or disabled. They risk legal liability for breach of contract. The Department of Justice and Facebook have both at times interpreted the Computer Fraud and Abuse Act to prohibit violations of a website’s terms of service.”

The mere possibility of legal action can have a significant chilling effect, forcing journalists and researchers to modify proposed projects to avoid this risk, even when doing so may make their work less valuable to the public.

The “safe harbor” the letter proposes would provide that journalists and researchers do not violate Facebook’s terms of service by collecting publicly available data through automated means, so long as such projects are in the public interest and do not compromise the privacy of Facebook users.

The Knight First Amendment Institute asks for a response from Facebook about the suggested terms of service modifications by September 7.

Understanding Facebook’s influence on discourse and society requires innovative, investigative digital research methods like Hickey’s and Hill’s. As they stand now, Facebook’s terms of service prohibit the use of basic tools necessary to study the platform’s impacts, making algorithmic accountability legally risky if not impossible.