COVID Tracking Project records
Scope and Contents
The COVID Tracking Project records consist of COVID-19 data products, data creation and quality records, organizational records, correspondence, and code repositories.
The Project existed entirely online, with no physical headquarters, as reflected by the nature of the records. Files primarily came from Google Drives created under the covidtracking.com account, which were used to create, manage, and share documents, spreadsheets, presentations, and other files. Other sources of records included GitHub for code, Amazon S3 for screenshot storage, Airtable for project management, and Front for correspondence.
The Teams series documents the activities of the Project's teams, which managed tasks such as data entry, data quality, data and web infrastructure, and community development. As all of the Project's work was done by a team, these records represent the organization's internal operations and encompass the full breadth of its activities. When possible, team lead files are present as Google Takeout exports, which include the lead's Google Drive and Gmail records.
The Archive Files series contains files selected by Project contractors and volunteers for preservation in the collection. This series includes data spreadsheets, training materials, community records, data quality notes, and personnel records.
The Datasets represent the final published data products created by the Project. These were the Project's main output and its primary function, with the Testing and Outcomes (TACO) dataset being the best known. Also included are the final COVID Racial Data Tracker (CRDT) and Long-Term Care (LTC) datasets, as well as annotations, which contain "per-state, per-metric structured notes on state reporting practices".
Screenshots contain records of the data sources used by Project teams. The majority of items in this series are screenshots of government websites; however, other data formats, such as Excel spreadsheets and PDFs, are included. Each primary data product had its own collection of screenshots; in addition, unpublished and secondary research, such as variants and vaccinations, had their sources captured. The screenshots were designed to provide a record of provenance for the Project's data and a stable backup for potentially unstable data.
The Slack series includes an export of internally public discussions from covidtracking.slack.com, which was the Project's primary hub for communication and internal activity. The archivists generated access copies of the discussions, which include the files and images associated with Slack messages.
Social Media contains records from the Project's Twitter and Instagram accounts (including Twitter Direct Messages), along with newsletters emailed to a mailing list.
The GitHub Repositories series includes public code and data repositories published by the Project. GitHub was used to maintain active code for the website, API, and other technical infrastructure for the Project; in addition, certain research data was published on GitHub. One repository, issues, used GitHub's issues feature as a public forum to discuss errors, questions, and features.
Dates
- Creation: 2020-2022
Creator
- COVID Tracking Project (Organization)
Language of Materials
English.
Conditions Governing Access
The UCSF Archives and Special Collections policy places access restrictions on material with privacy issues for a specific time period from the date of creation. Restrictions are noted at the series level. This collection will be reviewed for sensitive content upon request. Contact the UCSF Archivist for information on access to restricted files.
Conditions Governing Use
Copyright has not been assigned to the Library and Center for Knowledge Management. All requests for permission to publish or quote from material must be submitted in writing to the UCSF Archivist. Permission for publication is given on behalf of the Library and Center for Knowledge Management as the owner of the physical items and is not intended to include or imply permission of the copyright holder, which must also be obtained by the researcher.
Biographical / Historical
The COVID Tracking Project was a network of volunteers that compiled, managed, and published state- and territory-level COVID-19 testing, hospitalization, and death data from March 7, 2020 through 2022. The Project began when two informal COVID-19 tracking initiatives - Alexis Madrigal and Robinson Meyer's research for The Atlantic, and Jeff Hammerbacher's independent data collection - combined their efforts; Erin Kissane joined soon after as co-founder. The Project grew to include hundreds of volunteers, who contributed via data entry, data quality review, web and infrastructure development, community development, and science/government communication. In addition, the Project employed up to 30 contractors to manage operations (divided into teams).
Due to the difficulty of automatically aggregating and maintaining data from dozens of health departments, the Project enlisted volunteers to monitor, enter, and review COVID-19 tracking data. It worked with government agencies to advocate for data transparency and standardization. The Project also collected and maintained data on COVID-19's racial impact (in partnership with Boston University's Center for Antiracist Research) and on long-term care centers. As a result of its focus on data quality and transparency, the Project was one of the leading sources for COVID-19 data in the United States.
The COVID Tracking Project received administrative support from The Atlantic magazine, but was otherwise an independent organization. The Project received funding through donations from foundations.
The final daily data publication was on March 7, 2021. The Project continued through the summer of 2021, during which time it produced reports, updated existing data, and wound down the Project's activities.
Extent
439.74 Gigabytes (640,295 digital files)
Abstract
The COVID Tracking Project was a volunteer organization launched from The Atlantic and dedicated to collecting and publishing the data required to understand the COVID-19 outbreak in the United States. Its records include data products and sources, blog and social media posts, correspondence, internal communications, and team documents. Technical infrastructure, such as data entry and quality tools, public code repositories, and internal databases, is also present.
Arrangement
The collection has been subdivided into seven series: Teams, Archive Files, Datasets, Screenshots, Slack, Social Media, and GitHub Repositories.
Immediate Source of Acquisition
This collection was donated to the UCSF Archives and Special Collections by The Atlantic magazine in 2022.
Accruals
No future additions are expected.
Processing Information
Processed by Alexander Duryee and Kevin Miller in 2022-2023.
- Title: COVID Tracking Project records MSS 2022-74
- Author: Alexander Duryee and Kevin Miller
- Date: 2023-05-05
- Description rules: Describing Archives: A Content Standard
- Language of description: English
- Script of description: Latin
- Language of description note: Description is written in: English, Latin script.
- Sponsor: Funding for The COVID Tracking Project Archive was provided by Sloan grant G-2022-17133.
Repository Details
Part of the UCSF Archives and Special Collections Repository
UCSF Kalmanovitz Library
530 Parnassus Avenue
San Francisco CA 94143-0840 USA
https://www.library.ucsf.edu/archives/ask-an-archivist/