Introduction to Web Scraping with Python

2016-02-05 (Creation date: 2016-02-05)
Main contributor: Brodnax, NaLette
Web scraping is a method of extracting and restructuring information from web pages. This workshop will introduce basic techniques for web scraping using popular open-source tools. The first part of the workshop will provide an overview of basic HTML elements and Python tools for developing a custom web scraper. The second part will enable participants to practice accessing websites, parsing information, and storing data in a CSV file. This workshop is intended for social scientists who are new to web scraping. No programming experience is required, but basic familiarity with HTML and Python is helpful.
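The parse-and-store steps described above can be illustrated with a minimal sketch. This is not taken from the workshop materials, which do not name their tools in this record. It assumes only the Python standard library (html.parser and csv), and the sample page, the TableParser class, and the scrape_table and rows_to_csv helpers are all hypothetical names introduced for illustration. The page is inlined as a string so the example runs without network access; a real scraper would first download the HTML from a website.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page, inlined so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
<table>
<tr><th>Name</th><th>Role</th></tr>
<tr><td>Ada</td><td>Analyst</td></tr>
<tr><td>Grace</td><td>Engineer</td></tr>
</table>
</body></html>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Parse an HTML string and return its table rows as lists of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows to CSV text, one line per row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(scrape_table(SAMPLE_HTML)), end="")
```

To store the result in a CSV file rather than a string, the same csv.writer call can be pointed at a file opened with open(path, "w", newline="").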

NaLette Brodnax is a data scientist and fourth-year doctoral student in the Joint Public Policy program administered by the School of Public and Environmental Affairs and the Department of Political Science at Indiana University. Her research interests include education policy, policy analysis and program evaluation, and quantitative research methodology. As a graduate assistant for the Center of Excellence for Women in Technology, she is working on a number of projects intended to expose women to technology and to support women using technology in their studies and careers. Prior to entering the doctoral program, NaLette spent nine years in corporate finance roles, managing large data sets and developing financial models for large companies such as Abbott Laboratories and Nokia. She holds a BSBA from The Ohio State University with a concentration in Finance and a Master's in Public Policy from Loyola University Chicago.
Indiana University Workshop in Methods
web scraping; Python; Workshop in Methods
Workshop in Methods
Social Science Research Commons
Related Item
Accompanying presentation materials on IUScholarWorks 

Access Restrictions

This item is accessible by: the public.