Textual data are central to the social sciences. However, they often require several pre-processing steps before they can be used in statistical analyses. This workshop introduces a range of Python tools for cleaning, organizing, and analyzing textual data. It is intended for researchers who are new to working with textual data but who are familiar with Python or have completed the Introduction to Python workshop. Computers with Python pre-loaded are available in the SSRC on a first-come, first-served basis.
Helge-Johannes Marahrens is a third-year doctoral student in the Department of Sociology at Indiana University, working toward a PhD in Sociology and an MS in Applied Statistics. His research interests include cultural consumption, stratification, and computational social science, with a particular focus on Natural Language Processing (NLP).