- Date:
- 2020-02-28
- Main contributors:
- Byungkyu (BK) Lee
- Summary:
In the era of the “big data” revolution, social scientists face challenges that are more technical than theoretical. While analyzing terabyte-scale data is certainly difficult, the analysis of big data is not just a matter of solving computational problems. Big data offers a unique opportunity to address society’s big problems, but only if it is analyzed through careful research designs and strong theoretical frameworks. This talk introduces two practical strategies for social scientists, parallel aggregation and matching, that make big data smaller so that we can overcome technical difficulties while still drawing robust statistical inferences. I will illustrate them with my own trial and error in analyzing large-scale medical claims data in the context of the US opioid epidemic. The talk also presents several tips for managing big data effectively.
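- As a rough illustration of the parallel-aggregation idea (a minimal sketch, not the talk’s actual pipeline), one might split a large claims file into chunks, aggregate each chunk in a worker process, and then combine the partial results into a small person-level table. The file name, column names, and chunk size below are hypothetical.

  ```python
  # Minimal sketch of "parallel aggregation": reduce a huge claims file to a
  # small patient-level summary by aggregating chunks in parallel, then
  # combining the partial results. File name, columns, and chunk size are
  # hypothetical; the talk's actual workflow may differ.
  from multiprocessing import Pool

  import pandas as pd


  def aggregate_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
      # Per-chunk partial aggregate: opioid prescription counts per patient.
      return chunk.groupby("patient_id")["opioid_rx"].sum().reset_index()


  def parallel_aggregate(path: str, chunksize: int = 1_000_000) -> pd.DataFrame:
      # Read the file lazily in chunks and farm each chunk out to a worker.
      chunks = pd.read_csv(path, chunksize=chunksize)
      with Pool(processes=4) as pool:
          partials = pool.map(aggregate_chunk, chunks)
      # Re-aggregate the partial results: sums of per-chunk sums are still
      # correct patient-level totals.
      combined = pd.concat(partials)
      return combined.groupby("patient_id")["opioid_rx"].sum().reset_index()


  if __name__ == "__main__":
      summary = parallel_aggregate("claims_2010_2018.csv")  # hypothetical file
      print(summary.head())
  ```

  The key design point is that the aggregation is associative, so partial aggregates computed on chunks can be combined without ever loading the full data into memory; the resulting small summary table is what then feeds into matching and statistical modeling.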