COVID Tracking Project records

Collection

Identifier: MSS-2022-74

MSS 2022-74 COVID Tracking Project records on Calisphere

Scope and Contents

The COVID Tracking Project records consist of COVID-19 data products, data creation and quality records, organizational records, correspondence, and code repositories.

The Project existed entirely online, with no physical headquarters, as reflected by the nature of the records. Files primarily came from Google Drives created under the covidtracking.com account, which were used to create, manage, and share documents, spreadsheets, presentations, and other files. Other sources of records included GitHub for code, Amazon S3 for screenshot storage, Airtable for project management, and Front for correspondence.

The Teams series documents the activities of the Project's teams, which managed tasks such as data entry, data quality, data and web infrastructure, and community development. As all of the Project's work was done by a team, these records represent the organization's internal operations and encompass the full breadth of its activities. When possible, team lead files are present as Google Takeout exports, which include the lead's Google Drive and Gmail records.

The Archive Files series contains files selected by Project contractors and volunteers for preservation in the collection. This series includes data spreadsheets, training materials, community records, data quality notes, and personnel records.

The Datasets represent the final published data products created by the Project. These were the Project's main output and its primary function, with the Testing and Outcomes (TACO) dataset being the best known. Included are the final COVID Racial Data Tracker (CRDT) and Long-Term Care (LTC) datasets, and annotations, which contain "per-state, per-metric structured notes on state reporting practices".

Screenshots contain records of the data sources used by Project teams. The majority of items in this series are screenshots of government websites; however, other data formats, such as Excel spreadsheets and PDFs, are included. Each primary data product had its own collection of screenshots; in addition, unpublished and secondary research, such as variants and vaccinations, had their sources captured. The screenshots were designed to provide a record of provenance for the Project's data and a stable backup for potentially unstable data.

The Slack series includes an export of internally public discussions from covidtracking.slack.com, which was the Project's primary hub for communication and internal activity. The archivists generated access copies of the discussions, which include the files and images associated with Slack messages.

Social Media contains records from the Project's Twitter and Instagram accounts (including Twitter Direct Messages), along with newsletters emailed to a mailing list.

The GitHub Repositories series includes public code and data repositories published by the Project. GitHub was used to maintain active code for the website, API, and other technical infrastructure for the Project; in addition, certain research data was published on GitHub. One repository, issues, used GitHub's issues feature as a public forum to discuss errors, questions, and features.

Dates

Creation: 2020-2022

Creator

COVID Tracking Project (Organization)

Language of Materials

English

Conditions Governing Access

The UCSF Archives and Special Collections policy places access restrictions on material with privacy issues for a specific time period from the date of creation. Restrictions are noted at the series level. This collection will be reviewed for sensitive content upon request. Contact the UCSF Archivist for information on access to restricted files.

Conditions Governing Use

Copyright has not been assigned to the Library and Center for Knowledge Management. All requests for permission to publish or quote from material must be submitted in writing to the UCSF Archivist. Permission for publication is given on behalf of the Library and Center for Knowledge Management as the owner of the physical items and is not intended to include or imply permission of the copyright holder, which must also be obtained by the researcher.

Biographical / Historical

The COVID Tracking Project was a network of volunteers that compiled, managed, and published state- and territory-level COVID-19 testing, hospitalization, and death data from March 7, 2020 through 2022. The Project began when two informal COVID-19 tracking initiatives - Alexis Madrigal and Robinson Meyer's research for The Atlantic, and Jeff Hammerbacher's independent data collection - combined their efforts; Erin Kissane joined soon after as co-founder. The Project grew to include hundreds of volunteers, who contributed via data entry, data quality review, web and infrastructure development, community development, and science/government communication. In addition, the Project employed up to 30 contractors, divided into teams, to manage operations.

Due to the difficulty of automatically aggregating and maintaining data from dozens of health departments, the Project enlisted volunteers to monitor, enter, and review COVID-19 tracking data. It also worked with government agencies to advocate for data transparency and standardization. In addition, the Project collected and maintained data on COVID-19's racial impact (in partnership with Boston University's Center for Antiracist Research) and on long-term care centers. As a result of its focus on data quality and transparency, the Project was one of the leading sources for COVID-19 data in the United States.

The COVID Tracking Project received administrative support from The Atlantic magazine, but was otherwise an independent organization. The Project received funding through donations from foundations.

The final daily data publication was on March 7, 2021. The Project continued through the summer of 2021, during which time it produced reports, updated existing data, and wound down the Project's activities.

Extent

439.74 Gigabytes (640,295 digital files)

Additional Description

Abstract

The COVID Tracking Project was a volunteer organization launched from The Atlantic and dedicated to collecting and publishing the data required to understand the COVID-19 outbreak in the United States. Its records include data products and sources, blog and social media posts, correspondence, internal communications, and team documents. Technical infrastructure, such as data entry and quality tools, public code repositories, and internal databases, is also present.

Arrangement

The collection has been subdivided into seven series: Teams, Archive Files, Datasets, Screenshots, Slack, Social Media, and GitHub Repositories.

Immediate Source of Acquisition

This collection was donated to the UCSF Archives and Special Collections by The Atlantic magazine in 2022.

Accruals

No future additions are expected.

Existence and Location of Copies

Data Explorer is available at: link

Oral histories are available on Calisphere: link

Processing Information

Processed by Alexander Duryee and Kevin Miller in 2022-2023.

Subjects

Topical

Digital Material

MSS 2022-74 COVID Tracking Project records on Calisphere

Finding Aid & Administrative Information

Title: COVID Tracking Project records MSS 2022-74
Author: Alexander Duryee and Kevin Miller
Date: 2023-05-05
Description rules: Describing Archives: A Content Standard
Language of description: English
Script of description: Latin
Language of description note: Description is written in: English, Latin script.
Sponsor: Funding for The COVID Tracking Project Archive was provided by the Alfred P. Sloan Foundation (grant G-2022-17133).

Repository Details

Part of the UCSF Archives and Special Collections Repository

https://www.library.ucsf.edu/archives/

Contact:
UCSF Kalmanovitz Library
530 Parnassus Avenue
San Francisco CA 94143-0840 USA
https://www.library.ucsf.edu/archives/ask-an-archivist/
