July 18 to 29, 2022 | University of Rome Tor Vergata
SICSS - Roma Tor Vergata will be held fully in person this year. Each day will roughly follow the same schedule, shown below.
Morning
An in-class session with the organizers introducing the day’s content, followed by lectures and demonstrations.
Afternoon
Exercises to be completed in smaller groups of participants, followed by a plenary session with the organizers to review the exercises for the day.
About the Exercises
Work through each chapter completely so that you understand all of the important concepts, both those covered in the session and those covered only in the chapter. We have extracted a number of relevant exercises from each chapter for you to work through afterwards. To get started, you can use the provided notebooks and data here (download the full repo as a zip file). Although solutions to the exercises can be found on the Internet, we encourage you to attempt them with the chapter first, to get used to solving problems with R.
Accompanying Notebooks and Data: View on GitHub.
09:00-10:00 Welcome and Introduction
Introduction to R and RStudio
Watch the Video on YouTube.
This session is based on chapters 2, 4, 6, and 8 of R for Data Science. Work through these chapters completely so that you understand the basics of working in R before we move on to more advanced concepts. First, as shown in the video, download and install R and RStudio. There are no exercises for this session; your task is to get set up with R and RStudio and work through the relevant chapters before we start working with the tidyverse tomorrow.
Relevant Chapters: Chapter 2, Chapter 4, Chapter 6, Chapter 8.
Data Transformation in R
Watch the Video on YouTube.
Relevant Chapters: Chapter 5.
Exercises available here.
Working with Data in R
Watch the Video on YouTube.
Relevant Chapters: Chapter 11, Chapter 12, Chapter 13.
Exercises available here.
Data Types in R
Watch the Video on YouTube.
Relevant Chapters: Chapter 14, Chapter 15, Chapter 16, Chapter 20.
Exercises available here.
Data Visualisation
Watch the Video on YouTube.
Relevant Chapters: Chapter 3, Chapter 28.
Exercises available here.
Introduction to Computational Social Science. Video. Slides.
Ethics. Video Part 1. Slides Part 1. Video Part 2. Slides Part 2. Video Additions and Extensions. Slides Additions and Extensions.
Exercises available here.
Guest Speaker: Casey Fiesler
Data Is People: Ethical Considerations in Computational Social Science
Everyone’s tweets, blog posts, photos, reviews, and dating profiles are all potentially being used for science. Though much of this research stems from social science and purposefully engages with the human aspects of online content, in many cases this human-created content simply becomes “data”—particularly for the creation of training datasets for machine learning algorithms. In these kinds of contexts—from algorithms trained on dating profile photos to recognize gender to algorithms that can predict mental health conditions from your tweets—traditional ethical oversight such as university Institutional Review Boards often does not apply. But what is the line between “data” and human subjects research? In this talk, I draw from empirical work to argue that the current ethical metrics that many researchers use to determine whether it is okay to collect or use online content are all wrong, particularly when it comes to the “publicness” of data or whether collection is allowed by Terms of Service agreements. I discuss findings from studies of user perceptions of researchers’ use of tweets, analysis of social media TOS, interviews with members of vulnerable online communities, and a literature review of papers that use Reddit data, all to consider the broader landscape of research ethics when it comes to computational social science.
Strengths and Weaknesses of Digital Trace Data. Video. Slides. Annotated Code.
Application Programming Interfaces. Video. Slides. Annotated Code.
Screen Scraping. Video. Slides. Annotated Code.
Building Apps and Bots for Social Science Research. Video. Slides. Annotated Code.
Exercises available here.
An Introduction to Text Analysis. Video. Slides. Annotated Code.
Text Analysis Basics. Video. Slides. Annotated Code.
Dictionary-Based Text Analysis. Video. Slides. Annotated Code.
Topic Models. Video. Slides. Annotated Code.
Text Networks. Video. Slides. Annotated Code.
Exercises available here.
Timing TBC Lightning Talks