Streamlining Broadband Insights with cori.data.fcc
Olivier Leroy
Center On Rural Innovation
2024-10-20
How are you getting your data? Are you downloading it? Is it hard?
Overview
Introduction
Research using the latest broadband service data
Challenges to using the latest broadband service data, i.e., more data, more problems
Our broadband data package: cori.data.fcc, i.e., how we accelerate innovation by making this complex, ever-changing broadband data more accessible and usable for research
Hi, I'm Olivier, a Senior Data Engineer at CORI
Meet the people who can coax treasure out of messy, unstructured data [1]
Working with source data (often referred to as "big", "messy", or "unstructured" data) is a growing challenge
We believe that small towns are home to big ideas, and combining new models of economic development with strategic investments in new infrastructure can empower rural communities across the U.S. to participate in and benefit from the nation's growing tech economy.
Broadband knowledge is an important part of our work
Out of necessity we had to become experts in broadband data
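library(cori.data.fcc)

dir <- "data_swamp/nbm/"

# List the NBM releases and the files available for download
get_nbm_release()
nbm_data <- get_nbm_available()

# Create the download directory if it does not exist yet
dir.create(dir, recursive = TRUE, showWarnings = FALSE)

dl_nbm(
  path_to_dl = "data_swamp/nbm",
  release_date = "June 30, 2023",
  data_type = "Fixed Broadband",
  data_category = "Nationwide"
)

# Check that the download was successful: the number of zip files
# on disk should match the number of files listed for this release
num_files <- get_nbm_available() |>
  dplyr::filter(release == "June 30, 2023" &
                  data_type == "Fixed Broadband" &
                  data_category == "Nationwide") |>
  nrow()

# list.files() expects a regular expression, not a glob
files_dl <- length(list.files(dir, pattern = "\\.zip$"))

identical(num_files, files_dl)
# TRUE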
Created quality checks to reduce errors
Complexity is handled in our upstream process and abstracted so that users can focus on what brings value!
Added DuckDB
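To illustrate why DuckDB helps with a folder of many CSV files, here is a minimal sketch that queries several files at once without loading them into R first. It is not a description of how cori.data.fcc uses DuckDB internally: the demo folder, the file names, and the two columns are hypothetical stand-ins created only for this example, and it assumes the DBI and duckdb packages are installed.

```r
library(DBI)

# Hypothetical stand-in for a folder of extracted CSV files
csv_dir <- file.path(tempdir(), "nbm_csv_demo")
dir.create(csv_dir, showWarnings = FALSE)

write.csv(
  data.frame(block_geoid = c("010010201001000", "010010201001001"),
             max_advertised_download_speed = c(100, 25)),
  file.path(csv_dir, "part_1.csv"), row.names = FALSE
)
write.csv(
  data.frame(block_geoid = "010010201001002",
             max_advertised_download_speed = 1000),
  file.path(csv_dir, "part_2.csv"), row.names = FALSE
)

# In-memory DuckDB database
con <- dbConnect(duckdb::duckdb())

# A glob lets DuckDB scan every CSV in the folder as a single table
csv_glob <- file.path(normalizePath(csv_dir, winslash = "/"), "*.csv")

query <- sprintf(
  "SELECT count(*) AS n_rows,
          max(max_advertised_download_speed) AS top_speed
   FROM read_csv_auto('%s')",
  csv_glob
)
res <- dbGetQuery(con, query)

dbDisconnect(con, shutdown = TRUE)

print(res)
```

The same pattern scales from two small files to hundreds of large ones, because the aggregation runs inside DuckDB and only the summary comes back to R.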
How to use the package > Choose your own adventure!
Broadband data at the census block (or tract, county, etc.) level is perfect for my research: Download the transformed data for NBM from CORI (ISP / County)
I need source data, but working with hundreds of CSV files is not for me: Download raw data as tables from CORI (NBM / Form 477)