Call for Web Archiving Section Steering Committee Members (deadline extended)

The Web Archiving Section is excited to accept nominations for the following Steering Committee positions for the 2022-2023 year!

  • Vice-Chair/Chair-Elect
  • Secretary
  • Communications Manager
  • Education Coordinator
  • Student Member

If you or someone you know would like to run for a position on the Web Archiving Section Steering Committee, please fill out this form by August 2022 with the following:

  • Candidate Name
  • Job Title and Institution, if applicable
  • Bio and Candidate Statement (1-2 paragraphs)
  • Title of Steering Committee position sought

Position descriptions can be found below. Please keep in mind that membership in the Web Archiving Section is required in order to participate in elections through candidacy or casting a ballot. You may only run for one position. To learn more about the Web Archiving Section, check out the Web Archiving Section microsite and the Web Archiving Section blog.

We look forward to hearing from you!

Position Descriptions:

Vice-Chair/Chair-Elect: The Vice-Chair serves for two years, the first year as Chair-Elect and the second year as Chair.

  • Supports duties and responsibilities of the Chair as assigned.
  • Operates as acting Chair in the absence of the Chair.
  • Serves as member of the Steering Committee.
  • Fulfills all responsibilities specified in Section IX: Sections of the SAA Governance Manual.

Secretary (two-year term)

  • In consultation with the Chair and Vice-Chair, establishes all Steering Committee meetings.
  • Calls for and distributes agenda items for Steering Committee meetings. 
  • Records meeting minutes and distributes them to the Steering Committee. 
  • Serves as member of the Steering Committee.

Education Coordinator (two-year term)

  • Serves as the section’s liaison to the SAA Education Committee.
  • Arranges informal online meet-ups for members.
  • Prepares educational experiences, such as guest speakers.
  • Serves as member of the Steering Committee.

Communications Manager (one-year term)

  • Maintains and updates the section’s microsite, blog, and Twitter feed.
  • Keeps the section’s email list recipients informed on section news, events, and regular activities.
  • Serves as a member of the Steering Committee.

Student Member (one-year term)

  • Serves as a liaison to SAA student chapters and groups. 
  • Serves as a member of the Steering Committee. 
  • Must be an actively enrolled student and student member of SAA at the time of election.

Lessons from pandemic web archiving

This post was written by Web Archiving Section chair Tori Maches, Digital Archivist at UC San Diego.

When my supervisor first spoke with me about COVID-19 web archiving, it was late February 2020, COVID-19 hadn’t been declared a pandemic yet, and there was no confirmed community spread in San Diego. A few weeks later, the UC San Diego campus had been evacuated, the governor had issued a stay-at-home order, and I was adapting to a new normal while documenting the campus and county COVID-19 response.

Over the last 17 months, I’ve captured web content related to the campus and San Diego County COVID-19 response from campus, local government, and local news sites. Although I’ve been responsible for UC San Diego’s web archiving work since I started working here three years ago, the COVID-19 collection has felt like a crash course. With that in mind, this blog will cover challenges I’ve run into during this roller coaster of a time, as well as lessons I’ve learned and continue to learn.

Documenting a pandemic meant dealing with both logistical and emotional challenges. Once pages stop explicitly mentioning COVID-19 (e.g. talking about measures or guidelines, but assuming a reader will know why they exist), it can be harder to identify relevant material. To avoid missing material, I kept an eye on local news to see if anything new was happening, and paid attention to new linked pages and resources when I did QA for recent crawls. Coworkers also gave me updates about initiatives, policy changes, and available information, all of which informed my collecting. This helped me balance my web archiving work with other job duties, while still collecting new material as needed. Since I was the only person working on the collection for the most part, this was crucial.

Scope creep was an even greater challenge. The usual “digital FOMO” multiplies tenfold when you’re documenting a historic, rapidly changing event – it’s hard to argue with the impulse to capture everything when the world feels like it’s on fire. I had to step back, ask myself “should I be capturing this,” and recognize that sometimes the answer would be no. Some content was duplicative, was available in “good enough” form elsewhere, or would require a lot of care and thought to capture the right way. I kept my scope to the campus and county response partly because that was my original charge and I didn’t have the bandwidth to expand, and partly because I wasn’t sure I could capture social media during an ongoing traumatic event in a sensitive way.

Self-care was also a pretty significant challenge. Working on this collection was incredibly rewarding, and I’m proud of how it’s turned out. At the same time, documenting my hometown’s pandemic response is one of the hardest things I’ve ever done. It took me a while to build breaks into my QA workflows, or ask for help when I needed it, but it was absolutely worth it. Some of my coworkers were able to help with QA for a while, which was invaluable. Talking with friends and colleagues who have worked with trauma records helped as well. It was a good reminder that many of us have done and are doing this kind of work, and we’re not doing it alone.

I’m proud of the work I’ve done over the past year and a half. I’ve documented policy changes, testing initiatives, calls for volunteers, messages of hope and encouragement, and people working together to survive and protect each other in an almost unprecedented time. I’ve learned once again that archival work is about managing and mitigating loss, especially when working with high volumes of material. Keeping my collection scope in mind, remembering I wasn’t the only person documenting this time, and thinking about the logistical and ethical implications of what I collected all helped. I also learned that sometimes, the most productive thing to do is take a break. Most of all, pandemic web archiving was a good reminder that archival work isn’t done in a vacuum, even if you’re the only person working on a collection day to day.

What do you get when you hand a Master’s student a web archiving program?: Part Two

The following is a guest post by Grace Moran, Graduate Web Archiving Assistant – Library Administration, University of Illinois Library

In the previous installment of this series, I explored the complementary issues of metadata and access for web archives. In this second and final installment, the more human aspects of this year’s endeavor come to the fore: policy and personnel. I will also briefly describe how the current pandemic has informed web-archiving efforts at the University of Illinois.

If you did not read the previous blog post, let me revisit my background. I am the Graduate Web Archiving Assistant, working for Dr. Christopher Prom, the Associate Dean for Digital Strategies at the University of Illinois. This year, I have been charged with participating in day-to-day activities related to web archiving such as running crawls and doing quality assurance. I also engage in high-level organizational thinking about the future stewardship of the library’s burgeoning web archiving program. The program began in 2015, and since then the University of Illinois has captured 5 TB worth of data using our Archive-It subscription. Multiple units take part in the curatorial side of this endeavor: the University Archives, the American Library Association Archives, Faculty Papers, and the International & Area Studies Library. We are hoping to start running crawls for the Illinois History & Lincoln Collections in the near future.

As I noted, this post is focused on policy and personnel. These are two areas that currently present a challenge for my institution. We do not have centralized documentation, a web archiving-specific collection development policy, or a position other than my own dedicated to web archiving. The following is what I envision the program looking like in the future and will be highlighted in my final report to my supervisor at the end of the academic year.

What does policy mean for web archiving? I believe institutional web archiving policies and procedures should be composed of the following:

  • A collection development policy unique to the institution’s web archiving program (a general organization-wide collection development policy does not suffice given the unique nature of the content being collected)
  • A clear, centralized workflow outlining how crawls are to be run, troubleshooting documentation, and chain-of-command for web archiving
  • A statement on copyright and ethics in web archiving (Niu in “An Overview of Web Archiving,” cited below, touches on copyright)

Though it may be extremely obvious to some readers, it is worth saying: policy should be public. As someone who works for a public university, I am painfully aware of the importance of policy accessibility for our stakeholders.

What about personnel? Who should be running a web-archiving program? How many people should be involved? Of course, this is something that varies from institution to institution; however, my experience has made clear the need for a dedicated point person. This person could be:

  • A graduate web archiving position, like my own, working 20 hours a week to coordinate crawls across units, run Quality Assurance, and populate metadata fields.
  • A civil service or academic professional position with at least a 50% appointment to web archiving. If an institution is looking to grow their web archiving program, they should consider making this a 100% appointment for the first couple years and then having the point person slowly transition towards additional activities related to digital strategies of the library.

I should note here that a graduate web archiving assistant is a great way to support your web archiving program (yes, I am biased), but there are some drawbacks to placing this responsibility on the shoulders of a temporary employee. If you are just beginning your program, you may find that a part-time position does not fulfill the needs of your program. Additionally, there are advantages to long-term employees who have institutional knowledge and memory and, therefore, understand the administrative history of digital programs within your organization. Time is lost when it is necessary to re-train someone for a position every year or two.

Side Note: I want to make clear that graduate employees are so important. They bring a fresh set of eyes to problems, and the opportunity to learn from a graduate position can be absolutely priceless for someone like me. Please consider funding your graduate students; there are a great number out there pursuing an unfunded MLIS and paying off student debt for years to come.

Finally, I want to highlight a unique opportunity given to web archiving programs this past year. COVID-19 has devastated lives the world over; it has also provided inspiration for innovation and creativity. This has been true at the University of Illinois at Urbana-Champaign. From a novel saliva-based PCR test, to rigorous testing protocols, to creating a ventilator in 12 days, the institution has tackled this problem head-on. Bethany Anderson, the Natural & Applied Sciences Archivist, and I have collaborated over the past year to run crawls to document COVID-19 at the university and celebrate what we have accomplished in one of the darkest years we have seen. To check out pages we have documented, you can visit https://archive-it.org/collections/13880. This is a great example of how web archiving allows us to document important moments now and preserve the historical record (which is increasingly electronic) for the benefit of future researchers.

I hope that you have identified with some part of this blog series; my hope is that if we create a dialogue about our triumphs and struggles, we can all learn something.

Sources Mentioned

Niu, Jinfang, “An Overview of Web Archiving” (2012). School of Information Faculty Publications. 308.
https://scholarcommons.usf.edu/si_facpub/308

“COVID-19 Response at the University of Illinois” (2021). The Board of Trustees of the University of Illinois Urbana-Champaign. https://archive-it.org/collections/13880

For further questions, I can be reached at gmoran6@illinois.edu

Web Archiving Roundup: November 2020

Welcome to a new year with the Web Archiving Section! 

The new section Steering Committee met for the first time earlier this month and we’re all very excited about the coming year! One of our goals is to make the blog more active. We’d like to invite you to participate by submitting news, announcements, and other topics of interest. If you have a topic that you’d like to expand on, we’re also looking for guest contributors. Please send items and suggestions to April Feldman. Hope to hear from you soon.

Since we’ve all shared our “brief introductions” with the section via our candidate statements, we’ve decided not to repeat ourselves. Instead, in this post, we’d like to introduce all of you to our favorite web collections. Our only requirement was that it be a collection we’ve personally worked on.

Tori Maches, Digital Archivist, UC San Diego

My favorite UCSD web archive collection is our San Diego Local Governments collection. It’s one of our oldest collections, dating back to 2007, and covers local municipal and government agency websites, as well as websites related to local government activity. I grew up in San Diego, and remember interacting with some of these websites when they looked the way they did in the earliest crawls. Recontextualizing websites you used in high school as historical objects is a weird experience, and it’s a good reminder of how quickly things change online.

Melissa Wertheimer, Music Reference Specialist, Library of Congress

I’m excited to share the LC Commissioned Composers Web Archive. This was my first venture with web archives, and I’ve been curating it since 2018. I’m a flutist who specializes in contemporary repertoire, so the topic felt natural! I think web archives are perfect for a digital record of the Music Division’s active commissioning of living jazz and classical composers. This collection is a resource in itself, but also leads users to our unique collection materials through abstracts with links to finding aids for composers’ papers and online catalog records for commissions’ manuscripts and electronic files.

Ryder Kouba, Collections Archivist, University of Hong Kong (Pok Fu Lam)

The Egyptian Politics and Revolution Collection (started by Stephen Urgola and Carolyn Runyon) documented Egyptian politics from 2011 to the present day (though politics have been dead since 2014 or so). As many archivists have written about collecting in the US, concerns over privacy were paramount, though difficult in a (for a time) fluid situation. Website blocking and censorship also made our work more difficult in finding content, but more important in providing access to blocked websites through the Wayback Machine (which was temporarily blocked in Egypt; logically, we had concerns over exactly why the government decided to reverse that decision).

April Feldman, University Archivist, California State University, Northridge

We don’t have an active web-archiving program per se, so I haven’t worked on a favorite collection yet. We’re trying to implement a starter program, something small and scalable. In the meantime, we’ve just started using the Wayback Machine to capture limited Cal State Northridge websites and Wakelet for websites and social media posts related to CSUN’s COVID-19 response. I’m a project team of one right now, trying to figure it out as I go (love that OJT!). I’m completely open to suggestions or tales of woe if anyone wants to share.

Kiera Sullivan, University Records Processor, UC San Diego

The collection that I would like to highlight is the UC San Diego Web Archives collection. This is perhaps unsurprising, considering my role in the University Archives! This large collection captures a wealth of information from websites across the campus administration and community. Part of what I love about this collection is that although it is already extensive, there is a ton of room to grow and refine. This year, I will be working on identifying gaps in our collecting of student and community content, scoping crawls to address those gaps, and also improving the existing metadata across the collection.

Allison Fischbach, Research and Archives Associate, Towson University / MLIS student in Archives and Digital Curation, University of Maryland iSchool

The collection I am most proud of is our COVID-19 University Response Collection. This is the first web collection I have curated, and it is especially prescient as the pandemic continues. What I like about this collection is that it includes both valuable information about how institutional operations responded to COVID-19, as well as how the community continues to enact positive change. It is a collection that continues to grow, and one I know will have monumental value to future users. I feel fortunate to be able to work with it as a student archivist.

Web Archiving Roundup: August 2020

To start: Join us August 25 at 4:30 PM EST for the Web Archiving Section Meeting! Since March, archivists and information professionals have been focused on documenting COVID-19 and its effects on their communities. Web archiving is at the forefront of the documentation effort. The meeting will consist of a guided discussion focusing on the ways in which archivists go about creating spontaneous and event-based collections, considering all aspects of the web archiving lifecycle, from collection development to scoping to description and access efforts. We are most interested in hearing about collecting frameworks that your institution is working on, so come discuss with us! We will also do some light section business and talk about elections. Registration link here: zoom.us/meeting/register/…

On to other news:

Publications from Stanford University Press’s digital initiative now have nearly complete web-archive versions thanks to a 2020 partnership with Webrecorder. The blog post goes into detail on technical specs, specifically ReplayWeb.page and WACZ. Webrecorder also launched a forum for discussions and announcements about the software.

Rhizome’s Conifer introduces Periphery, a tool for collection owners to define how missing resources are expressed during the replay of archived content.

The “Archiving the Black Web” project was started at the African-American Research Library and Cultural Center in Fort Lauderdale.

The Joint Conference on Digital Libraries took place August 1-5 and all of their papers are now available online. Talks related to web archiving include Making Recommendations from Web Archives for, The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives, Identifying Documents In-Scope of a Collection from Web Archives, and The Case For Alternative Web Archival Formats To Expedite The Data-To-Insight Cycle.

From IIPC: A description of the new Robustify link service from Memento and an overview of the Danish coronavirus web collection.

Archives Unleashed was also busy! They released their 2017-2020 community report and announced their partnership with Archive-It to collaborate and integrate services to provide easy-to-use, scalable tools to researchers using web archives.

See ya on Tuesday!

Web Archiving Roundup: July 2020

Time for our monthly round-up! Hope everyone is staying well and safe!

From IIPC: The Content Development Group reports on their 2019-2020 work, including the COVID-19 and Climate Change collections; Library and Archives Canada reports on their COVID-19 collecting; and what’s new from the Croatian Web Archive. IIPC is also hosting two upcoming webinars on visualizing web archives and Jupyter Notebooks.

Twitter got harder to capture :/

Archive-It announced their virtual partner meeting on October 7, along with new staff and other updates. Their next open call is July 29.

It’s all available online…until it’s not!

Video and article exploring base memes (the original meme a derivative meme is based off of) in web archives.

Only one article this month from Internet Histories: Forensic approaches to evaluating primary sources in internet history research: reconstructing early Web-based archival work (1989–1996) by James A. Hodges.

And a summary of a study reviewing WARC validation tools.

SAA is meeting virtually August 3-7!

Web Archiving Roundup: June 2020

Documenting the Now published its Archivists Supporting Activists list, a call to action to archivists and memory workers to help support activists documenting violence by police toward black people.

The IIPC blog was busy this month: National Széchényi Library reports back on another year of web archiving; summaries of the National Library of Luxembourg, National Library of New Zealand, Bibliothèque et Archives nationales du Québec, and the Bibliothèque nationale de France’s coronavirus collecting; a blog post on the future of playback, specifically discussing pywb and OpenWayback; announcement of the IIPC training program; discussion of GLAM Workbench, which provides researchers with examples, tools, and documentation to help them explore and use the online collections of libraries, archives, and museums; exploration of previous annual meetings since this year’s meeting was cancelled; and reflections on training new web archivists.

IIPC also opened their discretionary funding program; proposals are due September 15.

The UK Web Archive blog discussed their 2019 UK General Election web archives; a case study on using Webrecorder for UK politicians’ social media pages; and WARCnet, a network that aims “to promote high-quality national and transnational research” on web domains and events on the web.

Call for contributions for the 4th RESAW Conference (June 17-18, 2021), on mainstream vs marginal content in Web history and Web archives; contributions are due October 15, 2020.

Webrecorder changed its name to Conifer!

Internet Histories has articles on international internet governance in the 1990s; selective disembodiment in the early Wired magazine; history of cyberactivism in Brazil; efficacy of civil society in global internet governance; and Weibo.

https://replayweb.page/ launched, which allows for replay of archived websites in your browser.

Announcement of version 0.80.0 of the Archives Unleashed Toolkit and revamped user documentation.

Archive-it shared the results of their 2020 State of the WARC survey.

Web Archiving Roundup: May 2020

Thank you to everyone who sent me their COVID-19 web archiving projects. You can view the list in our last blog post. I am still collecting submissions, so feel free to shoot me an email at nmg266 at nyu dot edu.

On to the roundup:

From the IIPC blog, the folks at Bibliotheca Alexandrina (BA) and the National Library of New Zealand (NLNZ) share an update on their tool for scalable web archive visualization: LinkGate.

Archives Unleashed released their Spring newsletter! The newsletter features information about the Archives Unleashed Toolkit, Cloud, and Notebooks; a summary of the remote New York datathon; web archiving articles and resources for use during the COVID-19 pandemic; and other presentations from the team.

Archive-It and NYARC share the national forum report on Advancing Art Libraries and Curated Web Archives, including the recording of the 2020 ARLIS/NA Web Archiving SIG Meeting.

Archive-It has also written a number of blog posts and updates, given the uptick in web archiving due to the pandemic. These updates give information on special pricing and cost sharing, resources, community news on COVID-19 web archiving projects, as well as introductory programming for people new to web archiving.

The British Library released its Brexit collection and described its work on COVID-19 collecting and how to use the collections while the reading room is closed.

There are a number of new articles on using web archives as primary resources, including Helena Byrne on using web archives to review football history, Will Mari on the racial, gendered, and class-based origins of the early internet, and Michael Stevenson & Anne Helmond on legacy systems.

The Digital Preservation Coalition has a number of posts related to web archiving and the coronavirus pandemic, including If These WARCs Could Talk: Learning from Archived Web & Social Media Covid-19 Collections and Capturing the UK Government Response to the Coronavirus (COVID-19) Pandemic at The National Archives UK. There is also a nice write up on how researchers use the archived web.

I presented at METRO with Mark Graham (Internet Archive), Gary Price (INFOdocket), and Alexander Thurman (Columbia University) on Documenting the Present Moment.

Is the WARC the best web archiving format?

The Library of Congress web archiving team celebrated its 20th anniversary! They also posted an interview with a retiring web archivist, Gina Jones. The LoC web archiving team also was featured in the New York Times.

Presentations from the WARCnet kickoff meeting can be found on the WARCnet website. The aim of the WARCnet network is to promote high-quality national and transnational research that will help us to understand the history of (trans)national web domains and of transnational events on the web, drawing on the increasingly important digital cultural heritage held in national web archives.

Version 2.3 of Social Feed Manager was released!

The International Journal of Digital Humanities is putting out a call for a special issue on digital humanities and web archives. Abstracts are due June 1, 2020.

Web Archiving and Digital Libraries workshop as part of JCDL 2020 is looking for submissions. Submissions are due on June 6, 2020.

The race to save the first draft of coronavirus history from internet oblivion in the MIT Technology Review.

Wow, that was a lot. Thanks to my colleague Amy for helping me with this roundup. Stay healthy everyone!

Web Archiving Round-up: COVID-19 Edition

As part of an initiative of the section, I have worked to compile a list of projects started by archivists and librarians across the country in response to the COVID-19 pandemic. As of this writing, over 40 institutions have submitted their work to the spreadsheet. It ranges from international efforts to capture websites associated with COVID-19 to localized efforts from institutions across the United States and Canada. Archivists and librarians have used many different tools to do this work, including Archive-It, Webrecorder, wget, twarc, and other custom tools. The work that these folx have done is both important and emotionally challenging and I want to acknowledge this labor. People are powering this effort: as much as we think this work is automatic, it involves a significant amount of human labor to select, appraise, assess the websites for quality assurance, and adequately describe these materials. I also want to acknowledge all essential workers during this time of crisis; without them, I would not be able to continue to do this work from the safety of my home in New York.

Click here to view the COVID-19 Web Archives Spreadsheet.

As of right now, the spreadsheet is only open to comment. Do you have a collecting project that you want to be included in the spreadsheet? Shoot me an email at nicole dot greenhouse at nyu dot edu.

Stay safe, stay healthy.

-Nicole

Web Archiving Roundup: March 2020

In addition to the IIPC’s collaborative collecting project on the coronavirus, send me your institutionally curated COVID-19 web archiving projects. I will compile them and post them to the section website. My email address is nmg266 at nyu dot edu.

We are also looking for proposals for participation in the Web Archiving Section’s annual section meeting on the theme of “web archiving and rights.”

Multiple web archiving events have been postponed due to the coronavirus. Archives Unleashed has changed to a remote-only event. The IIPC Web Archiving Conference and General Assembly has been postponed until September 28-30, 2020. Engaging with Web Archives is also postponed.

Archive-it is hosting a webinar on April 1 on the Wayback API. And the Community Webs project received additional funding! Other Archive-it news can be found on their blog.

In addition to the establishment of the coronavirus web archive, the IIPC has multiple announcements, including the newly elected steering committee member organizations and a published address from the IIPC Chair on IIPC funded projects, consortium agreement renewals, collaboration with CLIR, training materials, and collaborative collecting. The IIPC blog also featured a guest post from the National Library of Australia on their migration to Python Wayback.

A new version of MemGator was released!

Happy 15th birthday to the UK Web Archive!

And lastly, here is a GitHub repository archiving individual stories from Wuhan during the coronavirus pandemic.

Stay home, wash your hands, stay healthy, and archive websites!