We sourced our data from an open-source GitHub repository featuring California Department of Corrections and Rehabilitation (CDCR) data on COVID-19 in California state prisons. The raw data was in CSV format and reported COVID-19 metrics for each prison at daily granularity, so it contained over 10,000 rows. Since we wanted to aggregate cases and deaths for each prison for some of our visualizations, we used the command-line tool ‘grep’ to filter and aggregate the data automatically as needed. Because the repository is continually updated, our project uses data compiled from March 10, 2020 through November 9, 2020.
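As an illustration of the per-prison aggregation described above, here is a minimal Python sketch. It is not the grep-based workflow we actually used. The column names (`date`, `prison`, `new_cases`, `new_deaths`), the sample rows, and the function name `aggregate_by_prison` are all hypothetical, and the sketch assumes the file stores daily new counts rather than cumulative totals.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; the real repository's column names may differ.
SAMPLE_CSV = """date,prison,new_cases,new_deaths
2020-03-09,Prison A,1,0
2020-03-10,Prison A,2,0
2020-03-11,Prison B,5,1
2020-11-09,Prison A,3,1
2020-11-10,Prison B,4,0
"""

# Date window used in the project (inclusive on both ends).
START = date(2020, 3, 10)
END = date(2020, 11, 9)


def aggregate_by_prison(csv_file, start=START, end=END):
    """Sum daily cases and deaths per prison for rows inside the date window."""
    totals = defaultdict(lambda: {"cases": 0, "deaths": 0})
    for row in csv.DictReader(csv_file):
        day = date.fromisoformat(row["date"])
        if not (start <= day <= end):
            continue  # skip rows outside the window
        totals[row["prison"]]["cases"] += int(row["new_cases"])
        totals[row["prison"]]["deaths"] += int(row["new_deaths"])
    return dict(totals)


totals = aggregate_by_prison(io.StringIO(SAMPLE_CSV))
for prison, counts in sorted(totals.items()):
    print(prison, counts["cases"], counts["deaths"])
```

To run this against a downloaded file instead of the inline sample, an open file handle can be passed in place of the `io.StringIO` object. If the source data reports cumulative counts, the summation step would need to be replaced with the last value in the window for each prison.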