Title: “Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data”
Abstract: Conventional macroeconomic analyses are based on monthly or quarterly variables (such as per capita consumption) released by government agencies. The Nielsen scanner data record the transaction price and quantity of thousands of products sold in major cities on a weekly basis. The high-frequency, spatial, and detailed product-level information that makes the data interesting is also what makes the data difficult to analyze. Not only do we have to overcome the bottleneck created by the volume of data, we also have to address the periodic and idiosyncratic features in the data that were previously handled by official agencies. I use seven terabytes of scanner data collected between 2006 and 2014 as a case study to better understand the challenges that this type of big data can pose for economic modeling and the insights that it may provide.
Bio: Serena Ng is a Professor of Economics at Columbia University and a member of the National Bureau of Economic Research. She received her B.A. from the University of Western Ontario and her Ph.D. from Princeton University. Her primary research is in methods for analyzing economic time series. She has written extensively on estimation and inference when data are non-stationary, on model selection, and on regressions with principal components as predictors in data-rich environments. Her recent work focuses on the opportunities and challenges that big data might pose for economic research. She uses terabytes of scanner data on consumer purchases to see what micro-level data can reveal about aggregate economic conditions.