Web Archiving Roundup: March 19, 2018

Here’s your Web Archiving Roundup for March 19, 2018:


Web Archiving Roundup: March 5, 2018

Here’s your Web Archiving Roundup for March 5, 2018:

Web Archiving Roundup: February 19, 2018

Here’s your Web Archiving Roundup for February 19, 2018:

  • Archives Unleashed at the British Library: working with web archive data from the International Internet Preservation Consortium’s ‘National Olympic and Paralympic Committees’ collection, a group of researchers asked: ‘What is the gender distribution of National Olympic Committees?’ (Archived link.)
  • The Really Old Website Resurrector: faced with defunct copies of departmental websites stored on CDs and DVDs, the University of North Carolina at Chapel Hill University Archives tested their ability to temporarily host the websites again using Archive-It and what’s now called the Really Old Website Resurrector. Code for the Resurrector is up on GitHub.
  • Link Rot for Lawyers: a blog post in which Perma.cc asks: ‘Is link rot a problem for law firms?’ Hint: yes. In fact, after reviewing court filings made in the last five years by three of the largest law firms in the United States, Perma.cc found that ‘over 80% had at least one broken link’ and that, ‘on average, these briefs contained around six broken links each, and one brief contained seventeen broken links.’ (Archived link.)
  • Erasing history: from the Columbia Journalism Review — ‘When an online news outlet goes out of business, its archives can disappear as well. The new battle over journalism’s digital legacy.’ (Archived link.)
  • Watch Improving the Robustness of the Arquivo.pt Web Archive, which ‘provides an overview of the architecture and functioning of the system that supports the Arquivo.pt web archive.’
  • Ethics & Archiving the Web’s full conference schedule is out, and tickets are now available for purchase. Ethics & Archiving the Web will be held March 22-24 at the New Museum in New York. (And for those who cannot attend: conference sessions and some workshop sessions will be livestreamed.)
  • You still have time to submit proposals to Web Archiving Histories and Futures, to be held November 13-15 in Wellington, New Zealand. Proposals are due February 28, and may cover: building web archives, maintaining web archive content and operations, using and researching web archives, and / or web archive histories and futures.

Web Archiving Roundup: February 5, 2018

Happy February, Roundtablers! Here’s your Web Archiving Roundup for February 5, 2018:

  • Archiving the alternative presses threatened by wealthy buyers: in partnership with Archive-It, Freedom of the Press Foundation is launching an online archives collection, focused ‘on news outlets we deem to be especially vulnerable to the billionaire problem,’ and ‘aims to preserve sites in their entirety before their archives can be taken down or manipulated.’ (Archived link.)
  • A New Playback Tool for the UK Web Archive: the UK Web Archive will be working with Rhizome to build a version of pywb (Python Wayback) that they ‘hope will greatly improve the quality of playback for access to our archived content.’ (Archived link.)
  • And, speaking of: Webrecorder has released an updated version of pywb, ‘a major refactoring and improvement’ of the ‘core engine’ that powers Webrecorder. (Archived link.)
  • The International Internet Preservation Consortium Content Development Group would like your help to archive websites from around the world related to the 2018 Winter Olympic and Paralympic Games! Submit seeds via this Google Form.

Web Archiving Roundup: January 22, 2018

Here’s your Web Archiving Roundup for January 22, 2018:

  • A Case for Digital Squirrels: in First Monday, authors Lindsay Kistler Mattock, Colleen Theisen, and Jennifer Burek Pierce look at ‘the myth of YouTube as an archive’ and discuss their ‘recommendations for developing new practices for archiving YouTube content to support scholarly research.’ (Archived link.)
  • An update from Cobweb: from the University of California Los Angeles, Harvard University, and California Digital Library, and with a production launch in 2018, Cobweb seeks to empower ‘specialists, digital curators, and researchers’ by allowing them to ‘establish thematic web archiving collecting projects; nominate web resources for capture; claim nominated web resources with an intention to capture them; and contribute descriptions of those web resources that have been captured.’ (Archived link.)
  • Rhizome receives $1 million from the Andrew W. Mellon Foundation: the money ‘will support Webrecorder’s implementation in institutional contexts, while upgrading capture and usability for all users.’ (Archived link.)
  • We’re all Bona Fide: on On Archivy, Bergis Jules argues that preserving cultural heritage on the web should be an inclusive and community-centered effort. ‘Archiving social media content,’ he writes, ‘should be a shared professional and community responsibility because it not only stretches our resources further, but it can also help to ensure that the records we end up creating are more representative of marginalized people.’ (Archived link.)
  • You still have time to let the International Internet Preservation Consortium know what you need when it comes to web archiving training: fill out this survey, and help the Consortium in its quest to develop materials for all types of training, be it technical, curatorial, or training for practitioners and researchers.

Web Archiving Roundup: January 8, 2018

Happy New Year, Roundtablers! Here’s your Web Archiving Roundup for January 8, 2018:

  • Meet the Librarians Saving the Internet: at Science Friday, Lauren J. Young profiles a few of the digital librarians who ‘continue to preserve our history’ by navigating ‘through a labyrinth of dispersed personal accounts on the web that have come and gone through time.’ (Archived link.)
  • Read .supDigital’s interview with Dragan Espenschied, Preservation Director for Rhizome (and Webrecorder.io). Then, read more from Jasmine Mulliken and .supDigital on web archiving. (Archived link.)
  • On their blog, Old Dominion University urges you to Link to Web Archives, not Search Engine Caches. Why? Because ‘Search Engine caches are useful for covering transient errors in the live web, but they are not archives and thus not suitable for long-term access.’ (Archived link.)
  • The Library of Congress will no longer archive every public tweet. Read the Library’s ‘Update on the Twitter Archive at the Library of Congress’ here. (Archived link.)
  • The International Internet Preservation Consortium wants to know what you need when it comes to web archiving training. Fill out this survey and help the Consortium in its quest to develop materials for all types of training: be it technical, curatorial, or training for practitioners and researchers.
  • Dust off your résumé and sharpen your Python skills: Rhizome seeks a Senior Backend Developer to work on Webrecorder’s backend infrastructure. Applications are due by January 16, 2018, and can be sent to webrecorderjobs@rhizome.org.

Web Archiving Roundup: December 11, 2017

Happy December, Roundtablers! Here’s your Web Archiving Roundup for December 11, 2017:

  • Sustaining the Software that Preserves Access to Web Archives: on Digital Preservation Day, Andrew Jackson took a look at open source tools that enable access to web archives, and asked us to think about what comes next.
  • Speaking of moving forward — how do you move a web archive? On their blog, the National Archives details what went into moving 120 terabytes of data, on seventy drives, from Internet Memory Research’s data centre in Paris, to the Archives site in Kew, and, finally, to the Cloud. (Archived link.)
  • For the Digital Preservation Coalition, David S. H. Rosenthal writes about how we might be Losing the Battle to Archive the Web.
  • And, at the Atlantic, Alexis C. Madrigal writes that Future Historians Probably Won’t Understand Our Internet, and That’s Okay. Today, he notes: ‘there is more data about more people than ever before, however, the cultural institutions dedicated to preserving the memory of what it was to be alive in our time, including our hours on the internet, may actually be capturing less usable information than in previous eras.’ Still, as Nick Seaver says, ‘Is it terrible that not everything that happens right now will be remembered forever? Yeah, that’s crappy, but it’s historically quite the norm.’ (Archived link.)
  • Web Archiving Histories and Futures: the International Internet Preservation Consortium has announced its Call for Papers for its annual conference, to be held at the National Library of New Zealand in Wellington from November 13-15, 2018. Abstracts should be 300 to 500 words in length, and may touch upon topics related to: building web archives, maintaining web archive content and operations, using and researching web archives, web archive histories and futures, and more. Proposals are due February 28, 2018. 

Web Archiving Roundup: November 27, 2017

Here’s your post-Thanksgiving Web Archiving Roundup for November 27, 2017:

Web Archiving Roundup — Gothamist edition: November 13, 2017

Gothamist shutdown:
On Thursday, November 2, it was announced that the online-only, city-centric news outlets Gothamist and DNAinfo had been abruptly shuttered, archives and all, by owner Joe Ricketts in response to the organization’s vote to unionize. Both online newspapers, Gothamist (along with LAist, DCist, Chicagoist, and SFist) and DNAinfo, were updated numerous times each day, with a focus on local news, events, food, and culture.

This special edition of the Web Archiving Roundup takes a look at what others are saying about Gothamist and DNAinfo — and online news — in the wake of their sudden shutdown.

  • Archive, archive, archive: NiemanLab links to several external efforts to archive both Gothamist and DNAinfo, and reminds us of the risks of ‘billionaire-funded media.’ (Archived link.)
  • What We Lose in the Disappearing Digital Archive: on Splinter, David Uberti writes: ‘It’s likely that additional existing [online] publications will close in the face of economic upheaval, leaving their sites vulnerable to technical failure without consistent upkeep.’ Uberti also speaks with Abbie Grotke, web archiving team lead at the Library of Congress, who discusses the difficulties of capturing online news. (Archived link.)
  • When your server crashes, you could lose decades of digital news content — forever: in 2014, the Columbia Missourian suffered a server crash and ‘in less than a second, the newspaper’s digital archive of fifteen years of stories and seven years of photojournalism were gone forever.’ What’s worse, as Edward McCain writes, is that ‘very little is known about the policies and practices of news organizations when it comes to born-digital content.’ (Archived link.)
  • If a Pulitzer-finalist 34-part series of investigative journalism can vanish from the web, anything can: written in 2015, ‘Raiders of the Lost Web’ argues that ‘the web, as it appears at any one moment, is a phantasmagoria. It’s not a place in any reliable sense of the word. It is not a repository. It is not a library. It is a constantly changing patchwork of perpetual nowness. You can’t count on the web, okay? It’s unstable. You have to know this.’ (Archived link.)

Tools and additional links:

Conference alert: on November 15 and 16, follow along with Dodging the Memory Hole, a conference dedicated to the issue of preserving born-digital news content.

Web Archiving Roundup: November 6, 2017

Here are a few quick links on recent web archiving topics:

  • Remembering October 1. Multiple Las Vegas institutions are joining forces to document last month’s horrific mass shooting, its aftermath, and the community’s response using a multi-tech approach to web archiving. The project is actively accepting contributions from the general public. Live link
  • History of Syria’s war at risk as YouTube reins in content. Excerpt: “Syrian activists fear all that history could be erased as YouTube moves to rein in violent content. In the past few months, the online video giant has implemented new policies to remove material considered graphic or supporting terrorism, and hundreds of thousands of videos from the conflict suddenly disappeared without notice. Activists say crucial evidence of human rights violations risks being lost — as well as an outlet to the world that is crucial for them.” Live link
  • Archiving the Belgian web. The Royal Library of Belgium launched Preserving Online Multiple Information: towards a Belgian strategy (PROMISE) on 1 June 2017. The project aims to develop a federal strategy for the preservation of the Belgian web. Live link
  • Visualizing the changing web. With support from the National Endowment for the Humanities and the Institute of Museum and Library Services, the Web Science and Digital Libraries Research Group at Old Dominion University aims to visualize webpage changes over time. Live link
  • Web archiving labor. Jessica Ogden explores digital labor in relation to web archiving in “Web Archiving as Maintenance and Repair.” Live link
  • Evaluating a web archiving program. The Dutch National Library asks, “How can we improve our web collecting?” Live link
  • Open call. Rhizome announces its open call for participation in its National Forum on Ethics and Archiving the Web. Proposals are due November 14, 2017: Live link