Preserving threatened archives for Splinter and Deadspin writers

Parker Higgins

Director of Special Projects

US National Archives


The political news outlet Splinter shuttered and its sister site Deadspin imploded last month under management decisions made by their new owners. The sudden collapse of these sites came as a shock, but it sadly wasn’t unprecedented: the hazards of private equity ownership of media outlets are a well-documented and growing problem.

In response to earlier examples of this trend, we’ve developed archiving software and partnered with the Internet Archive over the past two years to save journalists’ work that may be at risk of deletion. The situation at Splinter and Deadspin certainly qualified, as the conflict heated up and management started to tamper with site content it found personally embarrassing.

And so, responding to direct requests from dozens of former Splinter and Deadspin writers and their colleagues, we’ve created and delivered over 50,000 articles as PDFs—nearly 200,000 pages—in the month since Splinter’s shutdown was announced.

To generate these documents, we used a tool we originally created in 2018, dubbed ‘gotham-grabber’ after the alternative media site Gothamist that was shut down suddenly by its owners. Gotham-grabber addresses the problem of vanishing portfolios by automating the process of collecting links to every article by a particular writer and creating clean, well-formatted, searchable PDFs for each.
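The link-collection half of that process can be illustrated with a short Python sketch. This is not gotham-grabber's actual code: the function names, the `example.com` URLs, and the assumption that article URLs end in a numeric ID are all hypothetical, chosen only to show the general idea of pulling a writer's article links from an author page and assigning each one a PDF filename. Rendering each URL to PDF, which in practice requires a headless browser, is deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import re


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_article_links(author_page_html, base_url, article_pattern):
    """Return de-duplicated absolute article URLs, in page order.

    article_pattern is a regex matched against the URL path; which
    pattern identifies an article is site-specific (an assumption here).
    """
    parser = LinkCollector()
    parser.feed(author_page_html)
    seen, links = set(), []
    for href in parser.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if re.search(article_pattern, urlparse(url).path) and url not in seen:
            seen.add(url)
            links.append(url)
    return links


def pdf_filename(url):
    """Derive a filesystem-safe PDF name from the last path segment."""
    slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1] or "index"
    return re.sub(r"[^A-Za-z0-9._-]+", "-", slug) + ".pdf"


# Hypothetical author page: one navigation link, one article linked twice,
# and a second article with a trailing slash.
sample_html = (
    '<a href="/about">About</a>'
    '<a href="/a-story-123">A story</a>'
    '<a href="https://example.com/a-story-123">A story (again)</a>'
    '<a href="/b-story-456/">B story</a>'
)
links = collect_article_links(
    sample_html, "https://example.com/author/x", r"-\d+/?$"
)
filenames = [pdf_filename(u) for u in links]
```

A real run would also need to page through a writer's full author archive and adapt the article pattern to each site, which is part of why per-site customization is unavoidable.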

These steps are necessary because working journalists depend on portfolios of their work when applying for jobs or freelance assignments, and unstable archives can jeopardize their livelihoods. A precarious press is not a free one, and our work to empower journalists entails making it harder for monied interests to throw their careers into disarray. (We continue to generate more of these PDF portfolio back-ups as we hear from affected reporters, so please reach out if you would like our help.)

We’ve also released gotham-grabber as free software, and have heard reports from writers who have used it to back up their own pages on Kinja, the content management system shared by Splinter, Deadspin and others.

In an article in the Columbia Journalism Review this week, journalist Tiffany Stevens spoke with several writers whose work had been taken offline, and the need for offline backups was a recurring theme. For example, journalist Jessica Wakeman said, “This is a new problem, but the answer might be that we as writers have to save every single thing we write as a PDF or that we have to print it out and put it in a binder and go the analog route, which seems crazy.”

Saving individual PDFs of hundreds or thousands of articles from a career in a fast-paced blog newsroom can be prohibitively time-consuming, and in many cases, articles designed for a screen can’t be seamlessly converted into PDFs. Reporter Julia Sklar told Stevens, “There aren’t that many foolproof tools out there for helping digital journalists keep an archive of their work. All the ways I tried, the PDFs were broken or missing whole paragraphs or weirdly formatted because they didn’t have the original embedded ads or pictures and tweets and stuff that were in the dynamic webpage.”

Our gotham-grabber software is designed to overcome those hurdles, but doing so increases its complexity, makes the process considerably more resource-intensive and requires customization for each site it is run on.

The loss of digital archives doesn’t just harm a newsroom’s journalists: in these cases the public suffers too. To combat this, we’ve also used our partnership with the Internet Archive’s Archive-It project—which we originally launched to preserve other elements of what former Splinter editor Alex Pareene has called “the rude press”—to create full online backups of these sites, to ensure their record is preserved for the public in places like the Internet Archive’s Wayback Machine.

The obvious gold standard is for news archives to remain available and in place online, with their original URL, layout and context intact. Unfortunately, that kind of preservation in amber isn’t easy even with well-intentioned stewards. Where the keepers of the archive are negligent, or even hostile, it’s impossible.
