Tomi Chen

covid tracker social image

CVUSD COVID-19 Tracker

August 2021 – Ongoing Source

Miscellaneous

The CVUSD COVID-19 Tracker is a project that uses git scraping to track COVID-19 positive case numbers in my school district over time. New data is also logged to a Google Sheet to generate pretty graphs.

If you’re interested in seeing how I built this, I wrote a blog post about how it works. I walk through data fetching, logging to a spreadsheet, and doing analysis and visualization.

Why?

In the fall of 2021, school reopened after 1.5 years of remote learning. We weren’t sure how long it would last, considering how packed classrooms were. Helpfully, our school district had a public COVID-19 dashboard showing the number of positive cases at each school. However, it just had the total count, so there was no way to determine when cases were reported. Maybe we had 1 case a day for 30 days, or 30 cases all in one day!

I decided to create a GitHub Action running on a cron job that would scrape the dashboard 4 times every school day, committing the data back to the repository. This was hard to visualize at a glance so I setup a spreadsheet to analyze and visualize the data.

Challenges

The hardest part of this project was wrangling the data to gain insights using the spreadsheet. Google Sheets is pretty powerful, but I ran into issues since the district dashboard format was not ideal. It only reported cumulative cases, but only in the current month, so I had to sum up previous months to get the full data. I also had to figure out how to compute the case change for each day.

All these calculations ended up spanning 25 columns in the spreadsheet, with rows appended for each day.

a screenshot of the spreadsheet

But after doing all that work, I didn’t need to touch it again for the rest of the school year. The graphs would update automatically and be reflected in the repo README.

graph of estimated active cases

Cron Sidenote

Another interesting thing about this project was properly scheduling the GitHub Action cron job. Since GitHub Actions runs in UTC and I’m in Pacific Time, I had to figure out the correct cron schedule to align with my timezone.

Working it out by hand, I did figure out that it would require two separate entries. However, this sparked an idea of building a tool to shift cron schedules for me, that I eventually built out in my tools site.

Lessons Learned

I definitely got a lot better at spreadsheet scripting after this project, and got more comfortable using Google Sheets for data wrangling. I also learned how powerful GitHub Actions truly are, especially since it’s free for public repositories. I’ll be using it more in the future.