Archiving the alternative press threatened by wealthy buyers

Parker Higgins

Director of Special Projects

An archivist at work in the stacks.
The U.S. National Archives

Freedom of the Press Foundation is launching an online archives collection in partnership with Archive-It, a service developed by the Internet Archive to help organizations preserve online content. Our collection, focusing on news outlets we deem to be especially vulnerable to the "billionaire problem," aims to preserve sites in their entirety before their archives can be taken down or manipulated.

Archive-It collections grab snapshots of specified Web sites at a moment in time. Some institutions use Archive-It to capture collections of sites connected to particular social movements or historical events. UCLA, for example, maintains a collection documenting sites pertaining to the Occupy Wall Street protests. Another collection consists of snapshots of news and primary source Web documents from the Ukraine conflict.

To start our collection, we used Archive-It to crawl the entirety of Gawker.com amid speculation that its archives might be purchased by a hostile party. Reported suitors have included Peter Thiel, who bankrolled the legal campaign that ultimately crushed the site, and more recently Mike Cernovich, whom the site once described as a “D-list right-winger.”

We also captured a copy of L.A. Weekly shortly after its new owners, whose identities were initially concealed even from its employees, restructured the operation and eliminated most of the writing jobs. At the time, one former employee published a short article titled “Who Owns L.A. Weekly,” which has since been removed from the site, though you can still view the version we captured. Since our crawl of the site, former employees have reported that stories are being "republished," validating our concerns about the integrity of the archive.

In these cases, and with all future sites added to this collection, the crawls we initiate through Archive-It will not just appear on our collection page, but will also be fed into the Internet Archive's Wayback Machine. The Wayback Machine is often the first stop for researchers seeking content that is no longer available online, so ensuring these sites are available there is an important way to reinforce the notion that this material is not irretrievably gone.
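For readers who want to check whether a given page has made it into the Wayback Machine, the lookup can be sketched in a few lines of Python. This is a minimal illustration, not part of our tooling: it assumes the Internet Archive's public availability endpoint at archive.org/wayback/available and the JSON shape that endpoint documents, and the function names (`build_availability_url`, `closest_snapshot`, `lookup`) are hypothetical names chosen for this example.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Internet Archive's public Wayback availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build the query URL for a page, optionally near a YYYYMMDD[hhmmss] timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the closest snapshot's URL from a parsed response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timestamp: Optional[str] = None) -> Optional[str]:
    """Query the endpoint over the network and return a snapshot URL if one exists."""
    with urlopen(build_availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("gawker.com")` would return the address of the nearest capture, or `None` if the Wayback Machine reports no snapshot for that page.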

There are larger structural issues that render news outlets vulnerable to the billionaire problem. Those issues may be beyond the scope of any single organization to address. Our earlier work in this area includes gotham-grabber, which aims to limit the professional harm a vindictive media owner could do to the careers of individual journalists. We continue to extend that tool to work with additional outlets; this weekend we extended it to cover The Toast, after its former editor reported that its archives will be shuttered. If you are a journalist who needs PDF backups of your work from archives that may not stick around, please get in touch.

Those efforts help individual journalists. But another important thing we can do to reduce the effectiveness of this kind of attack on press freedom is to commit ourselves to the wholesale preservation of threatened sites.

In this case, we seek to reduce the "upside" for wealthy individuals and organizations who would eliminate embarrassing or unflattering coverage by purchasing outlets outright. In other words, we hope that sites that can't simply be made to disappear will show some immunity to the billionaire problem.

