The videos associated with Day 2 of SICSS 2021 covered a broad range of topics, including screen-scraping, working with APIs, and building apps and bots for social science research. Some of you already have experience with some of these techniques, but for others, this may have been your first time learning about digital trace data. This group exercise is designed to strike a balance between practicing rudimentary skills (for those of you with little or no experience in this area) and cutting-edge techniques (for those of you with extensive expertise in this area). As an added bonus, this exercise challenges you not only to practice your coding skills, but also to think about how to ask questions that contribute new knowledge to social science theory (and thus motivate possible group research projects during the second week of SICSS).

  1. Within your randomly assigned groups, work together (on Zoom) to identify a research question that you believe can be answered using digital trace data. You will not have time to answer this question in full today, but the conversation you have today could eventually lead to a group project during the second week of SICSS. One of the reasons this step is so open-ended is that we want to encourage you to get to know each other, learn from each other, and practice asking questions together.
  2. Identify a sampling frame to help you answer this research question. For example, if your question is about politics, your sampling frame might be a list of elected officials on Twitter.
  3. Use legal screen-scraping techniques to collect the names of individual accounts, keywords, or topic areas to populate your sample.
  4. Write code to collect data from each unit of analysis in your sample (e.g., tweets of elected officials).
  5. (if you have time) Evaluate the strengths and weaknesses of the data you have collected.
  6. (if you have time) Outline a hybrid research design (e.g. an app or a bot, a supplemental online survey, or other qualitative methods) that could be used to address the weaknesses of the data you collected, or otherwise improve your ability to answer the research question.
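
As a starting point for steps 2 through 4, here is a minimal Python sketch of one way a group might go from a scraped web page to a sample of accounts and then to per-account data. Everything specific in it is an assumption made for illustration: the inlined `ROBOTS_TXT` and `PAGE_HTML` strings stand in for a real site, the account handles are invented, and `fake_fetch_posts` is a hypothetical placeholder for whatever API client your group actually has access to. Checking `robots.txt` is only one part of scraping responsibly; you should also read the site's terms of service.

```python
import re
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt and page content, inlined so the sketch runs offline.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

PAGE_HTML = """
<ul>
  <li><a href="https://twitter.com/SenatorExample">Sen. Example</a></li>
  <li><a href="https://twitter.com/rep_sample">Rep. Sample</a></li>
  <li><a href="/about">About this list</a></li>
  <li><a href="https://twitter.com/SenatorExample">Sen. Example (duplicate)</a></li>
</ul>
"""

# Matches a profile URL and captures the handle (assumed handle format).
HANDLE_RE = re.compile(r"^https?://(?:www\.)?twitter\.com/([A-Za-z0-9_]{1,15})/?$")


class HandleParser(HTMLParser):
    """Collects unique account handles from the links on a page."""

    def __init__(self):
        super().__init__()
        self.handles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = HANDLE_RE.match(href)
        if match and match.group(1) not in self.handles:
            self.handles.append(match.group(1))


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def build_sampling_frame(html):
    """Steps 2 and 3: turn a scraped page into a de-duplicated list of handles."""
    parser = HandleParser()
    parser.feed(html)
    return parser.handles


def collect(sample, fetch_posts):
    """Step 4: call fetch_posts once per unit of analysis in the sample."""
    return {unit: fetch_posts(unit) for unit in sample}


def fake_fetch_posts(handle):
    """Hypothetical stand-in for a real API call; returns placeholder text."""
    return [f"placeholder post from @{handle}"]


if __name__ == "__main__":
    url = "https://example.org/officials"  # hypothetical source page
    if allowed_by_robots(ROBOTS_TXT, url):
        frame = build_sampling_frame(PAGE_HTML)
        data = collect(frame, fake_fetch_posts)
        print(frame)
        print(data)
```

Passing the fetch function into `collect` as an argument keeps the sampling logic separate from the data source, so a group can swap in a real API client later without changing how the sample is built.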