Post-Mortems

A collection of post-mortems

Summer Institutes in Computational Social Science 2021 Post-mortem

Published on: Sep 5, 2021

We’ve just completed the 2021 Summer Institutes in Computational Social Science. The purpose of the Summer Institutes is to bring together graduate students, postdoctoral researchers, and beginning faculty interested in computational social science. The Summer Institutes are for both social scientists (broadly conceived) and data scientists (broadly conceived). This summer all of our Institutes were virtual because of COVID-19, but we will still refer to them by their planned physical location. In addition to SICSS-Princeton, which was organized by Chris Bail and Matthew Salganik, there were 20 partner locations run by SICSS alumni.

  • SICSS-Beijing organized by Yan Leng, Tian Yang and Yuan Yuan
  • SICSS-Bologna organized by Filippo Andreatta, Giampiero Giacomello, Marco Albertini, and Matthew Loveless
  • SICSS-Chicago organized by Kat Albrecht, Carrie Stallings and Yian Yin
  • SICSS-Brazil organized by Marco Aurélio Ruediger, Tiago Ventura, Amaro Grassi, and Danilo Carvalho
  • SICSS-HSE University organized by Elizaveta Sivak, Sofia Dokuka, and Ivan Smirnov
  • SICSS-Helsinki organized by Matti Nelimarkka
  • SICSS-Hong Kong organized by Han Zhang, Jaemin Lee, and Haohan Chen
  • SICSS-Howard/Mathematica organized by Naniette H. Coleman
  • SICSS-Istanbul organized by Akin Ünver, Yunus Emre Tapan, and Ahmet Kurnaz
  • SICSS-Law organized by Rūta Liepiņa, Monika Leszczyńska, and Catalina Goanta
  • SICSS-Lisbon organized by Qiwei Han and Filipa Reis
  • SICSS-London organized by Andrea Baronchelli, Joshua Becker, Nicola Perra, Milena Tsvetkova, and Mike Yeomans
  • SICSS-Los Angeles organized by Alina Arseniev-Koehler, Jennie Brand, Pablo Geraldo Bastias, and Bernard Koch
  • SICSS-Montreal organized by Vissého Adjiwanou
  • SICSS-Oxford organized by Chris Barrie, Charles Rahal, Francesco Rampazzo, and Tobias Rüttenauer
  • SICSS-Rutgers organized by Michael Kenwick, Katie McCabe, Katya Ognyanova, and Andrey Tomashevskiy
  • SICSS-Stellenbosch organized by Richard Barnett, Douglas Parry, and Petrus Schoonwinkel
  • SICSS-Taipei organized by Feng-Yi Liu and Robin Lee
  • SICSS-Tokyo organized by Hirokazu Shirado and Makiko Nakamuro
  • SICSS-Zurich organized by Elliott Ash, Malka Guillot, and Philine Widmer

The purpose of this document is to describe a) what we did, b) what we think worked well, and c) what we will do differently next time. We hope that this will be useful to other people organizing similar Summer Institutes. If you are interested in hosting a partner location of SICSS 2022 at your university, company, NGO, or governmental organization, please read our information for potential partner locations.

This document includes post-mortem reports from all of our locations in order to facilitate comparisons. As you will see, different sites did things differently, and we think that this kind of customization was an important part of how we were successful.

SICSS-Princeton

We’ve divided the post-mortem into 6 main sections: 1) outreach and application process; 2) pre-arrival and onboarding; 3) first week; 4) second week (group projects); 5) second week (SICSS Festival); 6) post-departure.

1. Outreach and application process

We continue to think that the best way to have a great Summer Institute is to have great participants. As in previous years, we advertised our event to a large, diverse group. Our major outreach effort began in January—once almost all of the partner locations had been finalized. We emailed former participants and former speakers. We also advertised through professional societies and asked our funders to help spread the word. Finally, we tried to reach potentially interested participants through social media, email lists, and emails to faculty that we thought might know interested participants. We made a special effort to reach out to faculty that we thought might know people from groups that are under-represented in our applicant pool.

Managing the application process continues to get smoother each year. In 2019, at the request of our funder, the Russell Sage Foundation (RSF), we switched to Fluxx (partner locations are not required to use Fluxx and are not allowed to use the RSF instance of Fluxx). Based on what we learned in 2019 and 2020, we improved the process and it went pretty smoothly. One issue that we have considered off and on over the years is the role of letters of recommendation, which are currently required for grad student applicants.

This year while reviewing the applications, we weighed whether the letters of recommendation were worth the additional burden that they impose on our community. Ultimately, we decided that the letters were not critical for our evaluation, given the other information we have. We do not plan to request them in the future. This was our first year accepting applications for an event that everyone knew would be virtual, and we continued to receive many high quality applications. Our overall number of applications was down slightly, and we think this is because of the huge number of excellent partner locations (there were 20 this year) and the difficulty of people participating virtually, especially for people who are not in our time zone.

2. Pre-arrival and onboarding

After participants are accepted we begin to onboard them into the program and provide them with pre-arrival materials. The goal is to have them arrive at SICSS ready to learn. The onboarding process went smoothly and we have no major changes to recommend.

Onboarding emails

We added all participants and staff to a Google group and sent out an email to the group with the following pre-arrival logistics:

  • A request that participants fill out a Google form (the “onboarding survey”) to provide bio information for our website. In prior years, the onboarding survey we used was only for the SICSS-Princeton or SICSS-Duke location, and other locations created their own surveys if desired. This year, we had a SICSS-wide onboarding survey used by all locations. This new SICSS-wide onboarding survey automatically fills a new part of the website: a directory of all past SICSS participants. However, the survey does not automatically fill the “people” page for each location (e.g., https://sicss.io/2021/princeton/people). We manually copied the survey responses to post bios on the SICSS-Princeton “people” page.

  • Information about joining the Slack workspace. We added people to the workspace and told them they should have received an email inviting them to join. In case they couldn’t find that email, we also included a link via which they could join the workspace. We set up the workspace so that anyone who joined (from any SICSS location) would be automatically added to the channel #sicss-all-office-hours, which contained information about office hours that TAs held prior to arrival. For SICSS-Princeton participants, we also had a channel called #sicss-princeton.
  • Information about pre-arrival activities they should complete. These included reading, optional coding activities for participants who wanted additional practice, and watching our lectures. We provided a link to the pre-arrival section of our location’s webpage for more details.

After a couple of email reminders, everyone completed the Google form.

There were a few subsequent onboarding emails: (1) an announcement about the virtual meet-and-greet scheduled for Sunday night of the first week, (2) instructions for applying for data access for the Fragile Families Challenge activity, and (3) a final reminder about the contents of the earlier emails (pre-arrival materials to complete, meet-and-greet time, and Fragile Families data application instructions).

Office hours

Another component of the pre-arrival support is office hours run by our TAs. These office hours are open to participants at all SICSS locations. We provided 5 weeks of office hours from 5 TAs. We spread the sessions across different times so that people in different time zones could attend at least one if needed. Few participants attended. For those who did, TAs helped troubleshoot technical issues and answered any other questions they had.

T-shirts or other SWAG

In prior years, we asked participants for their t-shirt size and a shipping address to which to send the shirt as part of the onboarding process. This year, we used a new system to distribute t-shirts or other SICSS-logo SWAG (“Stuff We All Get”). Due to the time it took to work out the details of this system, SWAG distribution was part of the post-departure logistics this year rather than pre-arrival. See details below in the post-departure section.

Week 1

The first week of SICSS is traditionally a mix of lectures and group activities. This year, we basically stuck to the format we developed in 2020 for our first virtual SICSS. To reduce Zoom fatigue we used a “flipped classroom” model where we pre-recorded our lectures and asked participants to watch them before arriving. We also reduced the length of the day (e.g., starting at 10am rather than 9am), added more breaks, reduced the number of guest speakers, and made more events optional. Overall, we think these changes were necessary and reasonably effective.

We began the first week with a virtual meet-and-greet on Sunday evening. We think it is nice for people to meet at a social event, and it provided us a chance to answer some questions and ensure that everyone was ready to go on Monday morning.

We used several different models during the 5 days of instruction, and we received different feedback on different days. It is hard to know how much of the difference in feedback can be attributed to our instructional choices, as opposed to the content of the day, the fact that participants were getting more familiar with the format and each other, or the fact that participants were generally getting more exhausted. All days followed a structure where participants were split into smaller groups to work together on activities (which are all available from our website). What differed across days was how open-ended the activities were, how multi-dimensional the activities were (i.e., did they require a mix of skills or just a single skill), the sizes of the groups (between 3 and 6), how the groups were formed (e.g., random or designed to be mixed skills), and whether we came back together at the end of the day for participants to share and discuss what they did in their smaller groups. It is not clear if there are right answers to any of these decisions, but we think that each should be made explicitly based on the learning objectives for that day.

As expected, collaboration over Zoom proved tricky, especially for activities that required coding collaboratively. Different groups experimented with different ways to navigate this, including screen sharing, working individually, and passing files back and forth. Based on what we saw, we encouraged them to take turns “in the driver’s seat” and also be sure to spend a few minutes at the beginning and end of each activity with their hands off the keyboard (i.e., reflecting on the activity, rather than rushing to debug something). Another challenge to shared participation was the choice of programming language. Most participants preferred R, and some preferred Python. We chose not to separate people based on their preferred languages. One reason is that collaboration in computational social science may involve people working together using tools they’re not familiar with.

At an in-person SICSS a lot of learning and community building happens over lunch. Last year, we tried to mimic that online by creating a few breakout rooms from 12-1 on each day, based on topics suggested by participants. This year we switched to a format where participants could give “flash talks” about their work or other projects of interest to them. The flash talks were optional to give and optional to attend, but we think they helped participants get to know each other better in week 1.

Week 2 (Group Projects)

A major part of SICSS is participant-led group research projects during the second week. This year, because of the online nature of the event and the challenging nature of the times, we decided to make the group projects optional.

In our research group matching process, we used a Google spreadsheet for people to add research interests and then clustered them into maximally similar groups and maximally dissimilar groups. Many of the conversations in these groups turned out to be generative of project ideas.
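The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the organizers actually used: it assumes each participant's interests are a list of keywords, measures overlap with Jaccard similarity, and builds groups with a greedy heuristic that is not guaranteed to be optimal. The function names and the sample participants are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of keywords."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def greedy_groups(interests, group_size, similar=True):
    """Greedily form groups from a dict of name -> list of keywords.

    With similar=True, each group is grown by adding the remaining person
    whose mean similarity to the current members is highest; with
    similar=False, the person with the lowest mean similarity is added.
    This is a heuristic, not an exact optimization.
    """
    remaining = sorted(interests)
    groups = []
    while remaining:
        group = [remaining.pop(0)]  # seed each group with the next person
        while remaining and len(group) < group_size:
            def score(name):
                sims = [jaccard(interests[name], interests[m]) for m in group]
                return sum(sims) / len(sims)

            pick = max(remaining, key=score) if similar else min(remaining, key=score)
            remaining.remove(pick)
            group.append(pick)
        groups.append(group)
    return groups


# Hypothetical spreadsheet rows: participant -> research-interest keywords.
interests = {
    "A": ["networks", "text"],
    "B": ["networks", "text"],
    "C": ["surveys", "experiments"],
    "D": ["surveys", "experiments"],
}

similar_groups = greedy_groups(interests, group_size=2, similar=True)
dissimilar_groups = greedy_groups(interests, group_size=2, similar=False)
print(similar_groups)
print(dissimilar_groups)
```

In practice the keywords would need cleaning first (for example, lowercasing and merging synonyms), since free-text spreadsheet entries rarely match exactly.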

It is much more difficult to keep track of group projects in a virtual setting. Though we had a list of project ideas and very tentative group assignments from early in the week, this changed through the week in ways that were difficult to track. By the end of the week, we were no longer sure how many participants were participating in group projects, and we were much more aware of some projects than others.

One improvement over last year is that we had more concrete information about the post-SICSS research grants available to participants during week 2. For those who have projects that require grants, we hope this information was helpful.

Chris and Matt also hosted sign-up office hours during week 2, and many participants took advantage of this opportunity. Many of the discussions were not about their group research projects but about computational social science more generally.

On Friday there were four groups presenting projects, and these groups involved a bit more than half of the participants. The presentations and discussions were lively, and we think this is a great way to wrap up the week.

Week 2 (SICSS Festival)

This year we hosted the second SICSS Festival. During the Festival, alumni from all SICSS locations hosted events such as tutorials and panel discussions. With the Festival we were hoping to provide learning opportunities to a larger and more diverse set of people than those who can commit to attending a two week program. We also wanted to provide an opportunity to showcase the contributions and expertise of our amazing alumni.

This year the Festival had seven events, with one organized and run entirely by SICSS-Law. Building on what we learned in 2020, this year we had a mix of events that were targeted to people with various levels of expertise in computational social science. This enabled the Festival to serve as an on-ramp to our community.

Registration and attendance: Attendance at the events was quite good, and we had about 400 total attendees. We also found that about 1/3 of registered participants actually attended.

  • Using images and video data for social science: Challenges and opportunities (webinar), 262 registered, 100 attended
  • Introduction to Text Analysis in Python: A Hands-on Tutorial (meeting), 84 registered (capped), 48 attended
  • Panel discussion on the non-academic job market in computational social science (webinar), 230 registered, 97 attended
  • Tutorial on deep learning for causal inference (meeting), 144 registered, 62 attended
  • Taking Quantitative Description Seriously (meeting), 154 registered, 38 attended
  • Creating your own virtual lab experiment with Empirica (pre-recorded videos + slack office hours), 144 registered
  • Panel discussion on the (many) paths to computational social science research in law, registration through SICSS-Law
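For organizers who want to plan capacity for future events, the per-event attendance rate can be computed directly from the figures in the list above. The short sketch below uses only the five events that report both a registration and an attendance count; the two events without attendance counts are left out, so its totals will not match the overall figures quoted in the text. The shortened event labels and variable names are ours.

```python
# (registered, attended) for the events that report both numbers above.
events = {
    "Images and video data (webinar)": (262, 100),
    "Text analysis in Python (meeting)": (84, 48),
    "Non-academic job market panel (webinar)": (230, 97),
    "Deep learning for causal inference (meeting)": (144, 62),
    "Quantitative description (meeting)": (154, 38),
}


def attendance_rate(registered, attended):
    """Share of registrants who actually attended."""
    return attended / registered


for name, (registered, attended) in events.items():
    print(f"{name}: {attendance_rate(registered, attended):.0%}")

total_registered = sum(registered for registered, _ in events.values())
total_attended = sum(attended for _, attended in events.values())
overall_rate = attendance_rate(total_registered, total_attended)
print(f"Overall: {total_attended}/{total_registered} = {overall_rate:.0%}")
```

The same calculation can help decide how far to over-subscribe a capped event such as the hands-on tutorial.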

Creating the festival started with outreach to SICSS alumni. This year we sent out the email to SICSS-Princeton and SICSS-Duke alumni about one month before SICSS began. We asked interested alumni to propose event titles, descriptions, and potential speakers, and we provided feedback to help make their description crisp and clear for the festival webpage. One thing we heard from alumni was that for some alumni one month was not enough time to coordinate a full event, and this outreach to alumni should be done earlier next year.

Once the events were finalized, or nearly finalized, we posted the information to the festival web page with signup forms. This year we used a Google form to get information about the registered audience, and then emailed them the Zoom link about 24 hours before the event. If you take this approach in the future, we recommend that you directly include the Zoom link in the Google form. An alternative would be to run the registration through Zoom. Once the overall events list was finalized, we promoted the festival through Twitter, Slack, listservs, etc.

For each event, we let the panelists know ahead of time that we hoped to record the event. We also asked panelists to come to the Zoom room 15 minutes ahead of time to go over logistics.

At the beginning of the very first event, we forgot to turn on the recording. So, we moved to a system where in his introduction & conclusion to the event, Matt said “we’re going to start the video recording now” and “we’re going to end the video recording now.” That worked well. We also kept the zoom room open after the recording stopped to enable some casual chatting.

After each event was over, we posted about each event on Twitter. We also sent a thank you email to the panelists and moderators. Finally, we processed the videos and posted them on the Festival website and the SICSS YouTube page.

Post-departure

After SICSS was over, we worked to get participants their SWAG through Swagpack. We also encouraged them to apply for the post-SICSS grants, and we encouraged them to stay in touch with the SICSS community through our Slack and Facebook groups.


SICSS-Beijing

We’ve divided the post-mortem into 6 main sections: 1) outreach and application process; 2) pre-arrival and onboarding; 3) first week; 4) second week; 5) post-departure; 6) challenges of the online format.

1. Outreach and application process

We first decided that the main theme of our summer institute would be network science and that the first week would take the form of reading groups on four topics: introduction and models; machine learning; social contagion/collective intelligence; and misinformation/polarization. Given this format, it was especially important to attract great and diverse participants. We advertised our event to a large and diverse group in the US, China, Singapore, and India. In the US, Singapore, and India, we reached out to faculty who we thought might know interested participants. In China, we advertised through mailing lists, WeChat groups, Sina Weibo, and WeChat public accounts; the latter three are especially effective channels for attracting applicants in China. We made a special effort to reach out to faculty in departments other than those of the three organizers.

For the application process, we used a Google Form. To ease the application review process, we required 200 words about achievements, potential contributions, and research keywords. On reflection, we think this made the review process especially efficient in balancing research interests. We also required a CV, a statement, and writing samples. We required only the contact information of referees, not an actual recommendation letter. We think this reduced the burden on applicants, and the information collected was already sufficient for selection and evaluation.

We received 150 applications from 113 organizations. We received 24% more female than male applicants. Of our applicants, 50% were PhD students, 24% were master’s students, 8% were assistant professors, 4% were postdoctoral researchers, and 3% were associate professors; the remainder were research staff. Figure 1 shows the background of our applicants.

In the application review process, we used a scoring system. Two organizers reviewed each applicant. We then sorted applicants by their average scores. During the review process, we accounted for diversity of research interests and whether applicants could create synergy in collaborations.
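The scoring step can be illustrated with a small Python sketch: each applicant receives scores from two reviewers, the scores are averaged, and applicants are sorted from highest to lowest average. The score scale, the applicant identifiers, and the tie-breaking rule (alphabetical by identifier) are assumptions made for the example; the source does not describe them. The diversity and synergy considerations mentioned above were a matter of judgment and are not modeled here.

```python
from statistics import mean


def rank_applicants(scores):
    """Rank applicants by their average reviewer score, highest first.

    scores: dict mapping an applicant id to a list of reviewer scores.
    Ties are broken alphabetically by id (an assumption for this sketch).
    """
    averaged = {applicant: mean(values) for applicant, values in scores.items()}
    return sorted(averaged.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from two reviewers per applicant.
scores = {
    "app_01": [4, 5],
    "app_02": [3, 3],
    "app_03": [5, 5],
}

ranking = rank_applicants(scores)
for applicant, average in ranking:
    print(applicant, average)
```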

Figure 1. Background of applications.

2. Pre-arrival and onboarding

On April 21, we sent our decision letter to the admitted applicants. To confirm their participation, we asked them whether they could fully attend our event. In that email, we also provided a rough schedule of our event, which contained the plan of invited speakers and meeting opportunities with speakers, and the format of participant-led activities (which included reading groups, participant presentations, and group projects). We believe that a sketch of the event is essential for securing participants’ full commitment at this stage. Out of 30 admitted participants, about 25 accepted our offer.

Besides obtaining their confirmation, we also set up a Slack channel. Once participants committed to full participation, they were invited to the channel. We tried to use this channel to send quick messages and facilitate informal interaction. However, it did not work well: the software is not widely adopted in East Asia, and participants seemed to prefer one-to-one private online messages. Because of the low rate of adoption, announcements were still primarily communicated by email, since we were not fully confident that messages in our Slack channel reached all participants.

The major pre-arrival task for our participants was to design their reading groups. Our three organizers, following the site theme (“network science”), came up with several topics that participants could freely select from when they received the admission letters. We put participants into four groups that matched their interests, took their backgrounds into account, and balanced the disciplines in each group. We then asked them to set the format of their reading groups and compile a reading list for each group in May. Only half of our groups finished the task and sent the list back on time. In hindsight, it is not realistic to ask participants to do much work before the event, especially group tasks, given that they did not know each other and that our participants were mostly busy senior PhD students or junior professors. We should lower our expectations for pre-arrival tasks.

Meanwhile, we asked our participants for their preferences on private meetings with invited speakers and their willingness to present their own work in the second week. We asked each participant to list, in order of preference, at most four speakers he/she wanted to meet. This way of assigning private meetings worked pretty well; no one complained about the fairness of our assignment.

Finally, to help our participants easily find helpful information, we compiled a “handbook” (a Google Doc) containing the schedule, the Zoom link, the reading list, and all arrangements. We sent it to our participants 3 days before the event. An advantage of this kind of handbook is that it is editable: we could easily change the content even after we had sent the read-only link to our participants.

3. First week

Our first week was composed of four main components: speed dating, reading groups, guest lectures, and group project formation. From Monday to Thursday, we hosted a speed dating session, a reading group session, and a guest lecture session every day. Friday was mainly for group project formation.

Speed dating was designed to encourage interactions among participants. In the online format, we miss many naturally occurring interactions that would help participants get to know each other’s research interests and establish academic connections. We therefore designed this speed dating session. We shuffled participants into Zoom breakout rooms, each containing 3-4 participants. They were encouraged to discuss either research or personal interests. After a few sessions of speed dating, participants got to know and make friends with each other.
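For sites that want to prepare breakout-room assignments ahead of time, the shuffling step can be sketched as follows. This is a minimal illustration, not a description of the Zoom feature or of the script (if any) the organizers used: it shuffles a list of names and deals them round-robin into the smallest number of rooms that keeps every room at 4 people or fewer. With 6 or more participants this yields rooms of 3 or 4; with fewer, rooms can be smaller. The function name, the participant labels, and the seed are hypothetical.

```python
import math
import random


def shuffle_into_rooms(participants, max_size=4, seed=None):
    """Randomly split participants into rooms of at most max_size people.

    Room sizes differ by at most one because people are dealt round-robin
    after shuffling. Pass a seed to make the assignment reproducible.
    """
    rng = random.Random(seed)
    people = list(participants)
    rng.shuffle(people)
    n_rooms = math.ceil(len(people) / max_size)
    return [people[i::n_rooms] for i in range(n_rooms)]


# Hypothetical roster of 25 participants.
roster = [f"participant_{i:02d}" for i in range(1, 26)]
rooms = shuffle_into_rooms(roster, max_size=4, seed=2021)
for number, room in enumerate(rooms, start=1):
    print(f"Room {number}: {', '.join(room)}")
```

Calling the function again with a different seed gives a fresh assignment for each speed dating round.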

Reading groups were lectures led by each pre-assigned group. We had four reading groups, so every group had about 7 people. Each group was asked to present a topic in network science: introduction and models; machine learning; social contagion/collective intelligence; or misinformation/polarization. Each reading group session lasted for 90 minutes. The reading group materials turned out to be of high quality and provided guidance for participants as they explored their group project topics.

On Friday of the first week, participants had two 90-minute speed-dating-like sessions to discuss research ideas for the group project. We used a Google spreadsheet where those who had concrete research ideas could write them down. Each research idea corresponded to a breakout room. Other participants joined the breakout rooms to explore the research ideas that interested them the most. By the end of Friday, most participants had found their group and started working on the project. We had six groups in total. Most groups had diverse participant backgrounds, which was what we hoped to see.

We had 10 guest lectures throughout our event. The topics covered agent-based modeling for networks, statistical models for networks, machine learning for networks, diversity and networks, misinformation, networks for scientific collaboration, social influence and social media. These topics were highly relevant to network science and participants’ research interests.

We arranged 1:1 sessions following each guest lecture. Each session lasted for 1 hour and contained 4 slots. In each time slot, 1-3 participants met with the speaker in one breakout room, asking questions about the talk or other general questions. At the same time, participants who did not sign up for the 1:1 entered other breakout rooms. We asked them to brainstorm research ideas in the first week and to work with team members on projects in the second week.

4. Second week

Our second week was composed of three main components: group projects, guest lectures, and participant presentations. From Monday to Thursday, we hosted a group project meeting session, a participant presentation session, and a guest lecture session. Friday was mainly for the group project final presentations.

The group project meeting session was mainly designed to help participants meet every day and make progress on their projects. Each group discussed its project in a separate breakout room.

About 20 participants signed up to present their own research projects. Each talk lasted for 15 minutes including Q&A. Both the talks and the participant feedback were generally of high quality.

On the last day, we asked all groups to present their projects. Each group presentation lasted for 30 minutes, and we encouraged 15 minutes of presentation plus 15 minutes of Q&A. Most group projects made good progress: groups presented either very detailed research project proposals or preliminary results.

At the end of the last day, we hosted a small social event. We took photos on zoom and officially said good-bye.

5. Post-departure

After SICSS was over, we encouraged participants to apply for the post-SICSS grants, and we encouraged them to stay in touch with the SICSS community through our Slack and Facebook groups. Our participants also plan to organize an offline meetup in Beijing.

6. Challenge of online format

Here are some challenges we encountered in organizing our site:

  • How to attract people to a two-week online event (intriguing topic; high-quality speakers)
  • How to let people stay for two weeks (community-building; some rules to regulate attendance)
  • How to inform participants of all relevant information (an online dynamic handbook; email notifications)
  • How to build community virtually (abundant social events; high-quality participants)
  • How to ask participants to work before the event (it is hard to do; lower the expectation)
  • How to avoid zoom hacking and online talk accidents (prepare plan B; pretest the software)
  • How to ask participants to organize their group projects (put one hour in each day of the second week as compulsory group project time to let them work together so that they were mindful of other groups’ progress; free to attend/quit a group)

SICSS-Bologna

SICSS-Bologna was hosted at the Department of Social and Political Sciences (DSPS) at the University of Bologna (UNIBO). This was the University of Bologna’s first Summer Institute partnership as well as the first Italian SICSS partnership. We were a fully self-funded, virtual partner with an asynchronous two-week program (21 June 2021 until 2 July 2021: our first week was, for example, Princeton’s second week).

The post-mortem follows this (basically temporal) order: application process, developing the program, pre-arrival/onboarding, first week, second week, and debrief/departure.

SICSS-Bologna: APPLICATION PROCESS

Following the lead of the SICSS-Princeton team and others, we were quick to advertise. We advertised SICSS internally in Political Science, Economics, and Sociology (among PhD students, post-docs, and faculty). We targeted other top Italian universities and placed advertisements in the newsletters of the International Sociological Association, the European Consortium for Sociological Research, the Italian Political Science Association, and the European Consortium for Political Research. We also used the department’s website and social media presence.

For the applications, although some other partner locations used Fluxx, we opted for the more basic but sufficient Google Forms. We had initial worries about the online nature of this year’s Summer Institute. We thought the number of applications would be low because of the fluidity of competition among partner sites and the lower attractiveness of not actually going to study for 2 weeks in Italy. However, we were fortunate to attract a very competitive pool of high-quality participants. We would make a few small tweaks to the application itself and, like Princeton, we found the letter of recommendation a less effective separator of students than other attributes/skills/projects.

SICSS-Bologna: DEVELOPING THE PROGRAM

In designing and developing the program, we relied on the input of previous participants, speakers, and the SICSS community. We planned to have interactive training based on the Princeton material mixed with speakers from outside our organization and opportunities for participants to present their work and get feedback, network, and even collaborate.

This was the intent in January 2021. However, northern Italy, like many places, returned to a severe lockdown, with virtual classes and administrative meetings and, crucially for nearly all of the SICSS-Bologna organizers, a closure of primary and secondary schools for 2 months. The situation was difficult for everyone. However, we managed to assemble much of what we wanted.

Our original idea was ambitious and, as we would discover, might have made a bit of difference in the end (yes, a more fully developed program would likely have been more successful, surprise); yet the ultimate organization worked out 90-95% as well. To minimize our exposure to problems, and because SICSS would be virtual, we aimed to ‘brand’ SICSS-Bologna by following the Princeton-led Week 1 and hosting a series of events in the second week aimed at highlighting ‘homegrown’/local experts and focusing on participants’ research projects. The goal of the second week was to offer students an opportunity to interact with a range of practitioners.

We began contacting speakers (and understanding their availabilities) immediately (January). Many of us were new to both GitHub and Slack but managed to get up to speed fairly quickly. We also had to switch from Zoom to MS Teams at the last minute because of a late administrative email that short-circuited our purchase of the necessary level of Zoom. This change received mixed reviews, although Teams worked fine for us (UNIBO) and a sizeable majority had no problem or preferred it.

We devised four main activities:

  1. A morning discussion about key concepts and issues raised in the videos.
  2. An afternoon of activities that mixed the existing and available material from [https://sicss.io/curriculum] with material borrowed from others’ activities as well as developed by faculty in the department.
  3. A Brown Bag Seminar series in which 1 or 2 participants each day discuss their projects and get feedback (starting mid-week, week 1)
  4. Invited speaker series (we had 10 outside speakers overall)

SICSS-Bologna: PRE-ARRIVAL/ONBOARDING

For the participants that accepted (12), we circulated the Boot Camp tutorial (https://sicss.io/boot_camp) approximately a month before SICSS-Bologna with an explicit agreement to work through the material (if needed). Nearly all reported this as being helpful, if only to manage their own expectations of the Summer Institute. In addition, we asked participants to indicate their level of R and Python. We could have made more proactive use of this information in designing the small group activities.

At the direction of Princeton, we forwarded the onboarding documents/survey to the SICSS-Bologna participants so that they would be automatically hosted on the SICSS main webpage. We uploaded our faculty and organizers’ details as well as the speakers’ details as they agreed to participate.

We also circulated the Slack workspace to all participants and organizers. Despite initial hesitation, this turned out to be a very useful tool, particularly for exposing the SICSS-Bologna participants to the activities and opportunities of the partner locations. It was also actively used as the main means of communication (meeting links, webpages, papers, further discussions). We recommend doing the same even in the context of an in-person event.

We did not participate in the initial social event and maybe we should have. All of the organizers were unfamiliar with social websites such as gather.town (which was suggested, but pretty late for first-time users) and spatial.chat. We opted not to use these, and we maybe wish that we had tried to organize at least one event (e.g., at the end of week 1?).

We have 4 in-house professors and added a Teaching Assistant (who is a post-doc at UNIBO). I think we would benefit from no fewer than 2-3 TAs next year.

SICSS-Bologna: FIRST WEEK

The first week was very similar to the schedules of many other partners. We used the mornings to discuss more deeply some of the key issues of Ethics (Monday); Trace Data (Tuesday); Text Analysis (Wednesday); Surveys and Mass Collaboration (Thursday), and Experiments (Friday). We allocated these based on our distribution of skill sets and interest (many of the leaders of these days bringing their own research into participants’ hands).

We started every day at 10 am and ended at 4 pm. We had circulated a list of videos that the participants should watch beforehand and, crucially, with us. Each morning, we watched one of the SICSS videos together (usually 20 mins) as a means to kick off the conversation and ensure that everyone had watched at least one. The usual selection was the most general (again, in order to speak to the middle of the topic). This was useful for many participants, although some others lamented losing the 20 minutes. Next year, we will use the ‘flipped classroom’ approach of assigning all of the videos to watch beforehand.

The afternoons were organized around the activities, which uniformly followed the same pattern: work in small groups, then return to the large group to debrief. This was largely successful, although by the end of the week it was clear that mixing up the pedagogy might have been appreciated.

We did not participate in the Fragile Families Activities. For Europeans, while the project is certainly interesting, most of our participants were in the field of political science and focused on Europe.

We also started a Brown Bag Seminar series of participants presenting their work for feedback. We specifically asked participants to present parts of their projects on which they were seeking the most feedback. Most participants were happy with the feedback and help; some wanted something a bit more interactive. As other partners have noted, the online nature of this first week was ok. Most participants had been doing online training/school for the past year and were familiar with the ins and outs of online learning. Still, it is a long time to be sitting in front of the computer (the Brown Bags started strong but by the end of week two were clearly a struggle for everyone). As mentioned above, this is where Slack carried some of the weight. That became the area of activity.

There was some discrepancy among R and Python users. While most fell into one group or the other, some did try to coordinate and share code and tools.

SICSS-Bologna: SECOND WEEK

We did not go with group projects in the second week. One, we were short-handed (an organizational misunderstanding of the demands of doing so) and two, we were eager to ‘brand’ SICSS-Bologna with UNIBO. We organized a lineup of 10 speakers for the week and Brown Bag Seminars each day.

The speakers ranged from very good to excellent. We felt very fortunate to be able to corral so many to come speak to our participants – who made it clear how much they enjoyed it.

There were two drawbacks. One, in the second week, there was a European heat wave. Two days in Bologna that week were 39/40C (102-104F) and most people, certainly graduate students, do not have air conditioning (one student in Sweden was also hit with a heat wave in the first week and complained at how hard it was to concentrate). Two, the schedule of the speakers was difficult to distribute evenly. Some presented from time zones in Oxford, California, Massachusetts, France, and others. This forced two days, in particular, to be quite intense (10-6:30). We were unsure of how to untangle this while planning and hoped for the best. The speakers were well-received but by the end of the second day of this, the participants were exhausted.

This was most evident in the Brown Bag Seminars – tired participants make for tired presenters and commenters. Next year, we will keep this in mind.

SICSS-Bologna: DEBRIEF/DEPARTURE

On the final day, we had one speaker and a departure debriefing. We had expected a short send off but instead had a very productive 1 hour discussion about the events. We have included some of the participants’ feedback above but some things are worth mentioning quickly again. In an effort to make SICSS-Bologna as good as possible, we may have instead made it as busy as possible. These are not necessarily the same thing.

We note that the overall rating was quite high. The feedback was intended to improve SICSS-Bologna not simply to complain.

What was clearest is that the online nature was ultimately the most limiting dimension of this year’s SICSS-Bologna. Comments included limiting the time of speakers, splitting speakers by topic (simultaneous/optional), and possibly having fewer speakers overall. Staying attentive to online presentations was too demanding, despite the high quality of invited speakers. Secondly, the limitation of virtuality was costly to networking and interaction. This was most clearly evident in the requests for more interactive or collaborative opportunities, whether small groups taking their own (spontaneous) initiatives, participants composing their own groups, or doing more actual coding/programming with like-minded participants. Clearly these are considerations for next year’s Summer Institute.

We did make a t-shirt for all participants, speakers, and organizers. They are being completed by the University and will be distributed by post when ready (free to all). Princeton also offered Swagpack to all SICSS-Bologna participants. Very nice!

We forwarded the post-SICSS feedback survey and the announcements about grant opportunities. We plan to reach out to the participants in 6 months to ask for an update on their projects.


SICSS-Chicago

Application Process

We solicited applications from potential participants with several intentional directives. One, the institute would be virtual. Two, we wanted as much representation from varying types of universities, whether they be local liberal arts colleges or big state research universities. Three, we wanted a diverse cohort, both in terms of skill sets and interests (so, searching for participants with backgrounds in a variety of fields with varying coding/computational methods experience) and identity (racial/ethnic, gender, sexuality, etc.).

Given our goals, we sent our call for applications to institutions across not only Illinois but the Midwest more broadly. We emailed specific people first, including experts in the field, colleagues we knew, and SICSS alumni, asking them to disperse our call within their networks. Additionally, we got onto several listservs for social science, computer science, data analytics/science and complex systems departments. We also pushed our call for applicants through Twitter channels held by the organizers and various units of colleges in Chicago.

We opted to use the full SICSS application (letters of rec, research statements, writing samples, and CVs) in order to discourage half-hearted applications. We received over 100 completed applications from scholars (and industry professionals!) from a variety of backgrounds, including sociology, archaeology, anthropology, computer science, technological sciences, physics, and more. These applicants represented a diverse set of identities.

In selecting, we considered several points. We knew we wanted cohesion, i.e., a group of participants who had a variety of common interests and a diverse set of skills. We also desired to have most of our participants from Illinois, but felt it important to have a portion be from other parts of the Midwest given the unique opportunity of attendance that many had because of the virtual format. We were also very conscious that there was a dearth of SICSS sites in the central United States this year, so that for many applicants we were in fact the closest SICSS site. Furthermore, we wanted a diverse set of backgrounds among the participants to ease the process of group work, i.e., those with lots of coding experience, those with experience formulating social research questions/projects, those interested in teaching computational science, and those who wanted to learn computational methods for the first time. Lastly, we were excited to receive applications from (and accept some) applicants who strive to introduce computational social science into their fields (e.g., archaeology and anthropology).

With these points in mind, we selected 30 applicants. We also opened additional spots for two undergraduate participants, a result of a collaboration with the Data Analysis and Social Inquiry Lab at Grinnell College. In doing so, we wanted to broaden the intergenerational nature of the institute. The quality of applicants was significantly different than previous years (especially the first year of the institute). A vast majority of applicants had some computational social science experience or programming experience, which was not the case in the first year of the institute. We also saw more applications from faculty and postdoctoral candidates than in years 1 or 2 of the institute. We believe that ~40-50 of the rejected candidates would also have been successful at the institute. Ultimately, we were very happy with our cohort and received only one declination (someone who had already accepted a SICSS site that notified them earlier). Our attrition was incredibly low, even into the group project week (which we were concerned about given the virtual format). The participants worked well together, and many reported learning from one another; the TA’s ability to visit the varying groups in the Zoom breakout rooms aided greatly in our capacity to measure group progress/functioning. Many fantastic project ideas were produced and piloted within the group. Additionally, the undergraduate participants excelled, participating, learning, and often even taking leadership roles within their groups.

Onboarding

Our onboarding process included several steps: gathering participant information, creating and populating the Slack channel, and drafting and distributing the institute’s schedule and other materials. We communicated with the participants primarily via email until right before the institute, which was a marked improvement over previous years where the Slack channel was started too early and fizzled out before the start of the institute.

Participants did not access TA/computational help resources very often before the start of the institute, which was similar to previous years. There were only one or two requests for materials/primers that were referred to the SICSS website successfully. We hypothesize that this occurred for a couple of reasons. First, the time commitment to self-teach before the institute was likely more nebulous and difficult in the virtual-pandemic environment. Second, our quality of applicants improved significantly this year such that a vast majority of participants were proficient or at least familiar with coding basics. Because of patterns in previous years, the organizers handled on-boarding, the website, and the very small number of logistical questions at the start of the institute rather than requiring the TA to do so.

Lecture/Workshop Week

As computational social science develops rapidly, we have also updated our lecture materials to reflect the latest trends in this field. For example, we have seen an increasing trend of Python usage among our participants and the broader community[1]. As a result, we have created two versions of coding materials (Python and R) for the lectures on web scraping. We have also leveraged the Google Colab platform as a standard way to test and run code, which saved the participants a great amount of time and effort on version/system issues on their local machines. We have made these materials publicly available and hope other sites find them useful. With regard to specific lecture content, we decided to follow along with some of the material available from the main site. However, we played only a few of the pre-recorded videos from the main site and turned the rest of the lecture content into live lectures given by the organizers. This way, we had more interactive content, which encouraged participant engagement. We were very worried that an all-virtual institute would lack a cohort-bonding component if we used too much pre-recorded content. Furthermore, we hosted Northwestern University’s Research and Computing Services Team on our text-analysis day to boost the interactivity of the institute; a data scientist from their team held a 2.5-hour workshop on an introduction to text analysis in R which included live coding activities.

Participants really enjoyed this aspect of the lecture week and several reported that they felt they learned a lot in the workshop format. In the future, we may consider more workshop style lecture activities, especially in a virtual environment.

We included site specific lecture content as well, playing to the unique research strengths of the organizing team. Specifically, we had lectures on surveying hard to reach populations and the concept of data sovereignty. Given the engagement with these lecture contents, we would encourage other sites to offer site/organizer specific content.

Apart from our regular lecture content, we also assembled a diverse set of invited speakers from the Chicago area, broadly construed. The one-hour speaker series covered a wide range of topics in computational social science, including innovation studies, social media, network analysis, machine learning and bias in computational social science. When scheduling the talks, we also tried to match the topics of talks and lectures. For example, we invited Kristina Lerman to talk about the common pitfalls in analyzing social media data after the lectures on Twitter API usage. This combination not only allowed us to construct a more cohesive schedule, but also enabled the participants to think more broadly about how to apply the tools they had just learned in group work and final projects. The list of speakers at SICSS Chicago 2021 is as follows:

  • James Evans, University of Chicago
  • Dashun Wang, Northwestern University
  • Abigail Jacobs, University of Michigan
  • SCALES OKN (Systematic Content Analysis of Litigation EventS Open Knowledge Network)
  • Kristina Lerman, University of Southern California

We used the exercises provided by the main site for each day of lecture week with one exception. We wrote a new survey activity that still used the Prolific platform, but instead had groups come up with their own original research questions and pilot their own original surveys. We felt the activity provided by the main site would be too easily accomplished by our participants (a goodly number had digital survey experience already) and wanted to build in more opportunities for organic group creation through project work. All groups successfully developed research questions, launched the surveys, and produced visualizations of these results. We would recommend that other groups do this in the future as well.

Teaching Assistants

The SICSS Chicago organizing team is traditionally 3 people (Joshua Becker, Kat Albrecht, Jeremy Foote; Tina Law, Natalie Ghallager, Kat Albrecht; Andrew Thompson, Kat Albrecht, Joshua Becker; Kat Albrecht, Carrie Stallings, Yian Yin), so we only hire one TA. This year we were able to hire Erin Ochoa, who was able to assist in multiple languages and attended the entire institute. It is our hope that Erin will continue next year as a member of the organizing team, perhaps starting a precedent for using the TA role as a precursor to organizing. We believe this would have significant advantages in ensuring the continuity of the institute. During the institute, SICSS Chicago participants conferred with the teaching assistant using two main methods, Slack and individual team Zoom breakout rooms. The TA was available on Slack while SICSS was in session, and some students contacted her there, unprompted, with a variety of questions, for example, about course materials. Additionally, the Slack channel proved an accessible way to share and update code snippets with individual participants or even with a whole team. Some students also messaged the TA to request assistance in their team’s Zoom breakout room.

The breakout rooms facilitated high-level conceptual and theoretical conversations, as well as acting as a platform to discuss implementation approaches and to troubleshoot low-level technical details. The TA rotated through the breakout rooms during the teams’ working sessions and participants engaged with each other and the TA to discuss topics surrounding their project, such as the theoretical basis, their intended research questions and empirical approach, and their proposed implementation, as well as with questions regarding their code. Overall, most participants enthusiastically communicated their progress with the TA.

Though interactions with the TA were positive, there were fewer instances than expected of participants reaching out directly for assistance or with questions. We believe this is related to the online nature of this year’s Institute: some participants may be more hesitant to hail a remote TA than one occupying the same physical space. While we do not find this to be a failure, per se, of the online delivery, we surmise that a return to an in-person format will increase participant engagement with the TA.

Group Project Week

We included several activities and tasks throughout the first week to encourage group formation. First, we randomized groups for the afternoon group activities with the intention that participants would work with a variety of people throughout the week. Additionally, we scheduled time for lightning talks at the end of the first week. We advertised the signup sheet for the lightning talks, five-minute presentations about research and methods, from the first day of the institute onward. We had 14 individuals sign up to discuss their work. Through the lightning talks, the participants became familiar with one another’s work, facilitating final group formation.
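Randomizing the afternoon groups each day can be scripted in a few lines. The following is only a minimal sketch, not the procedure the organizers actually used: the function names, the group size, and the per-day seeding are all illustrative assumptions.

```python
import random


def randomize_groups(participants, group_size, seed=None):
    """Shuffle participants and deal them into groups of near-equal size.

    `group_size` is a target, not a guarantee: group sizes differ by at most one.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Ceiling division: enough groups that none exceeds the target size.
    n_groups = max(1, -(-len(shuffled) // group_size))
    # Deal round-robin, like dealing cards, to keep the sizes balanced.
    return [shuffled[i::n_groups] for i in range(n_groups)]


def weekly_groups(participants, group_size, days, seed=0):
    """Draw an independent random grouping for each day (hypothetical helper)."""
    return {
        day: randomize_groups(participants, group_size, seed=f"{seed}-{day}")
        for day in days
    }
```

Calling `weekly_groups(names, 4, ["Mon", "Tue", "Wed", "Thu", "Fri"])` would give a fresh draw per day. Independent draws do not guarantee that everyone meets everyone else; they only make repeated pairings unlikely.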

We used the project group formation recommendation from the main site. We set aside time at the end of week one to have breakout rooms in which the participants could discuss potential topics. To create the groups for these rooms, we had all participants describe their potential project topics/ themes in a collective spreadsheet. Once done, we had each participant indicate what topics they found the most interesting, and then grouped the participants based on their interests. After the lightning talks and breakout sessions, we had participants sign up for the potential research topics alongside their other potential project team members, resulting in the final groups for the project week. This process was slightly hectic in the online form, since it required participants to ‘trust the process’ despite not being together. It was more difficult to arrange than it had been in person in previous years.
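The interest-based grouping described above was done by hand in a shared spreadsheet, but the same idea can be sketched in code. The sketch below is an assumption-laden illustration rather than the main site's recommended procedure: `group_by_interest`, the ranked-preference input, and the `min_size` rule are all hypothetical. Each participant goes to their highest-ranked topic that is still viable, and topics that attract too few people are dropped one at a time so their members fall through to their next choice.

```python
from collections import defaultdict


def group_by_interest(preferences, min_size=2):
    """Group participants by ranked topic interest.

    preferences: dict mapping participant -> list of topics, most preferred first.
    Returns (groups, unplaced): a dict topic -> members, and a list of
    participants none of whose topics survived.
    """
    # Every topic anyone listed starts out viable.
    viable = {topic for prefs in preferences.values() for topic in prefs}
    while True:
        groups = defaultdict(list)
        unplaced = []
        for person, prefs in preferences.items():
            # Highest-ranked topic that has not been dropped yet.
            choice = next((t for t in prefs if t in viable), None)
            if choice is None:
                unplaced.append(person)
            else:
                groups[choice].append(person)
        too_small = [t for t, members in groups.items() if len(members) < min_size]
        if not too_small:
            return dict(groups), unplaced
        # Drop the smallest undersized topic (ties broken by name) and retry.
        smallest = min(too_small, key=lambda t: (len(groups[t]), t))
        viable.discard(smallest)
```

In practice the output would be a starting point for discussion in the breakout rooms, not a final assignment, since participants may still want to move between groups.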

Several groups continued projects which were created during week one’s activities. Other groups devised entirely new projects, whether related to some participants’ current research or an entirely new project which catered to the varied interests within the group. We believe that each of the activities played a role in group formation, and that some activities (such as week one activities vs lightning talks) benefited specific groups most.

We strongly encouraged participants to meet with each other daily throughout project week. We also requested that each group meet with the organizers at least once during the week at our daily office hours (which each group did). We also had a happy hour in the middle of the week, which was not well attended. We speculate that this was likely due to the intensely virtual nature of the institute and Zoom burnout, particularly given that it was at the end of a group work day.

Project teams presented their research at the end of the second week. A variety of project teams used newly-learned methods in their work and others investigated social science questions which they had not previously considered.

Grants

At the conclusion of the institute, we made participants aware of several grant opportunities which they could pursue for future research support. Those opportunities included the main site grant as well as the Chicago site grant. SICSS Chicago was able to raise $4,000 for grant funding across project groups. Additionally, in collaboration with Grinnell College’s Data Analysis and Social Inquiry Lab, we were able to offer a $1,000 grant for the project which had the most demonstrable impact on the social good.

We received applications for the Chicago site grant and the Grinnell College grant, and expect that several groups will apply for the main site grant. Four project groups (including the ~10 person combined project group) from the institute applied for additional funding from SICSS Chicago. We were able to fund all requests that came under the purview of the institute funds.

Lessons Learned

Our biggest concerns going into the 3rd SICSS Chicago were how to adapt to a virtual format and what to do if we had exhausted local interest in the institute. In some ways, the digital format solved the second problem. Because the institute was virtual, we were able to admit participants from areas of the Midwest that are Chicago-adjacent but not entirely local (e.g., St. Louis, Urbana-Champaign, Normal, etc.). These participants often came from less-resourced institutions that could not have self-funded for the two weeks necessary to attend the in-person institute. We note that a return to in-person will revive this issue, which becomes particularly acute at partner sites that do not have funding to house participants. It would be worth considering whether there should be an intentionally virtual partner site in the future (i.e., one that is not tied to any particular location).

We made several intentional and time consuming changes to our curriculum to accommodate the online format. We significantly lessened our use of pre-recorded lectures because we worried that an online and pre-recorded experience would not promote cohort bonding and group formation. We also worried that a more static experience would lead to more attrition. This gave the organizing team substantially more work to do as we had to write hours of additional live-lecture content. We also slightly shortened the run-time each day to account for zoom fatigue, which meant we had to move briskly through the program. We also built in time to share our own expertise and do intentional ice breakers (zoom-annotated ‘This or That’ was a crowd pleaser) which were both very well-received by participants. It was important to us to present a uniquely SICSS Chicago experience in addition to the regular programming, so our extra content and local speakers gave the institute some specific Chicago flair.

Another lesson from this year’s institute concerned how to transition the leadership team. Kat Albrecht, an attendee of SICSS Princeton in 2017, had been on site to help organize SICSS Chicago since its inception, but finished her doctorate this year. SICSS Chicago is somewhat unique in that it has never had a faculty member on the organizing team (though we have the requisite faculty sponsor). A big push at this year’s institute, then, was to help the new organizing team learn the systems and networks (for grant funding, room reservations, and admin) needed to continue the institute. It is the intention of SICSS Chicago to apply for funding to continue next year. If possible, it would be useful for the main site to encourage organizing teams to be very intentional and conscious of their transition teams.


SICSS-Brazil

The SICSS FGV-DAPP Brazil was held virtually from June 14 to 26, 2021. We invited 30 participants out of a list of more than 100 applicants, ending with 27 participants. Our first week followed the main SICSS curriculum. The second week focused on roundtable discussions about Computational Social Science in Brazil and on launching collaborative research projects with our participants. This post-mortem is divided into six sections: 1) outreach and application process; 2) pre-arrival and onboarding; 3) first week; 4) second week; 5) post-SICSS; and 6) final remarks.

1. Outreach and Application Process

Since this was the first time we were hosting a SICSS edition in Brazil, our initial plan was to cast a wide net for applicants from distinct academic backgrounds, universities, academic programs, and regions across the country. Our main target audience was participants researching or working in CSS in or about Brazil. This was important for two main reasons. We planned to conduct most of our activities in Portuguese, since English fluency is not strong in Brazil, and we have always seen SICSS as an opportunity to build a network of CSS scholars in Brazil.

We used three main strategies to advertise our edition. First, we widely advertised our call for applicants using the organizers’ personal network, particularly on social media platforms such as Twitter and LinkedIn. We also made a list of scholars working on CSS in Brazil, to whom we sent personalized emails asking them to share the details of our event in their departments. Finally, we used distinct institutional newsletters from Fundação Getulio Vargas to reach out to an even broader audience than the organizers’ direct network.

Overall, we believe our advertising for SICSS Brazil was a success. The event received great attention and traction on social media, and we received more than one hundred applicants. It is important to note that, considering the organizers’ roots in political science, our recruitment was higher in that area. Another characteristic of our recruitment, in this case related to regional inequality in Brazil, was that the majority of our applicants were affiliated to institutions in the country’s South and Southeast regions. We tried to balance both of these characteristics when deciding which applicants we would invite to participate in our event.

We selected twenty-seven applicants, and only two declined our invitation to join the SICSS. The majority (about three-fourths) of our participants were Ph.D. students, with some junior faculty members, postdoc scholars, one master’s student, and two data scientists working in the private sector. We also achieved a 50-50 gender split, and had participants coming from all the regions across the country.

Although the field of political science was overrepresented with seven participants, we had a wide range of academic backgrounds, such as sociology, anthropology, psychology, history, law, data science, information technology, environmental science, communications, and defense strategy. We also recruited Brazilian students doing their Ph.D. abroad, in institutions such as the University of California Los Angeles, Bocconi University, University of Toronto, Münster University, and University College Dublin.

We believe that it is possible to expand the number of students accepted to join us in this type of virtual setting. Attendance declined over the duration of the event, and two students participated very little in the second week. Therefore, even though we opted for a smaller edition since this was the first time our team was organizing a SICSS, it is likely that we could expand to something like 40 accepted students in the future.

A final thought about the selection of participants: we asked them to submit a cover letter, a CV, and a writing sample, but we barely had the time to read the writing sample. Therefore, we believe that the cover letter and the CV are probably enough for us to reach a decision about which participants to invite.

2. Pre-Arrival and Onboarding

After participants were accepted, we sent them an email asking them to confirm their participation, using the same template provided by the SICSS-Princeton Team. In that same email, we introduced the structure of our SICSS and asked them to start working on the pre-arrival materials. All but one participant accepted our invitation to join the SICSS FGV DAPP.

A week later, we sent a second email adding all of them to our Slack, and asking for a photo and bio to include on the website. Both of our emails stressed the importance of reviewing the pre-recorded lectures once they were posted because the institute would be a flipped classroom format.

Overall, we believe most of our participants watched the pre-recorded lectures, and some read a few chapters of Bit by Bit. We are less confident that the students took the time to work on the coding notebooks provided by Chris Bail in his lectures on Digital Trace Data and Text Analysis. Our feeling was that most of the students watched the lectures, but did not take the time to work through the code. Some did, but not the majority. Maybe in the future we should think about doing some exercises together with the more coding-focused lectures, allowing the students to practice a bit more before the beginning of the SICSS. Another suggestion we received is organizing some type of live coding activity in R before the beginning of the SICSS, or simply emphasizing the importance of “coding along” as one watches the pre-recorded lectures.

We were worried at the beginning that the students would have a hard time following the lectures, since English fluency is not too high among Ph.D. students in Brazil. However, we were positively impressed by the feedback we received from them. The feedback about the lectures was extremely positive, we did not receive any complaint about the topics covered, and none of the students said that they could not follow the lectures due to any type of language barrier.

We organized two pre-SICSS meetings with the students. The idea was to answer questions and to help them prepare for the event. The attendance was excellent. Most of the students participated in these meetings. They were really helpful for them to understand how to go through the pre-arrival materials and to get a better understanding about flipped classroom plans. In addition, these meetings allowed us to get to know each other a bit, share our Twitter handles, and do some ice breaking before the beginning of the SICSS.

One important thing for us was that our organizing team met once a week before the SICSS to discuss each topic covered in the first week. This was extremely important for us to get a better sense of the material and to organize our schedules and activities. This decision was crucial for the preparation of our event, particularly because only one of our organizers is a SICSS alumnus.

3. Week 1

Schedule and Activities

Since this was our first time organizing a SICSS edition, we decided to follow the SICSS main curriculum closely.

For the first three days, we deviated very little from the main curriculum. We started with a smaller group discussion about CSS and expectations for our SICSS edition during the morning, and covered the Pantheon exercises during the afternoon. On the second day, we did the suggested scraping exercise, and on the third day, we covered text analysis.

The activities we changed on these three days were the following. On the second day, we encouraged our participants to work with anything other than the Twitter API. Our feeling was that most of our participants had some experience grabbing data from Twitter, and that the field has an overrepresentation of Twitter as a data source. The students followed the recommendation and worked with different data sources, such as Wikipedia edits, YouTube, fact-checking agencies, and official data from the Brazilian government through APIs. We got positive feedback on this activity, although some students complained that it was too broad and that they spent too much time thinking about what to do, instead of practicing how to collect digital data.

On the third day, we decided to start with a text analysis workshop in Python. Some of our participants were more fluent in Python, so we decided to start the day with a short one-hour workshop showing some NLP techniques. In addition, in the group exercises we used a dataset about electoral integrity in Brazilian elections, collected from YouTube and provided by FGV DAPP. We chose this data to encourage our students to think about and implement text analysis methods in Portuguese. This is important because the availability of dictionaries and sentiment lexicons (for example) is more limited in Portuguese than in English, so we wanted our participants to discuss these issues. From the feedback we received, we believe using text data in Portuguese was well received by the students; however, the workshop in Python was too demanding and more complex than our students were prepared for. Particularly because we decided to cover topics not discussed in the text analysis lectures, such as word embeddings and neural nets, our feeling was that students did not get much out of the workshop.

On day four, we deviated substantially from the main curriculum. Because MTurk is not widely available in Brazil, we decided to focus on the use of Facebook Ads as an alternative way to recruit survey respondents. We started the day with a workshop about how to use Facebook Ads, how to connect your surveys with the Ads, and which decisions (such as quotas, geolocation, and number of ads) Facebook allows you to control. After the workshop, students created their own survey and made it available through a Facebook Ad. Overall, the participants appreciated the learning experience about using the Ads for social science research. However, because the time to complete the activity was too short, most of the groups were not able to collect as many responses as we would expect from MTurk. One option for the future is to split this activity across two days, where the participants put the Ad up on the morning of day 4 and let it run until the afternoon of day 5. To do this, we would need to replace the Fragile Families Challenge on day 5 with a different activity.

On day 5, we finished our first week with the Fragile Families Challenge. Overall, our participants had a hard time with this activity. The reason in our view is simple: very few social scientists in Brazil have a background in predictive modelling and machine learning. Therefore, the learning curve for this activity was too steep; the participants did not have a good sense of what the splits in the data were, or what else they could do in the modelling part. Our plan for a future edition is to replace the Fragile Families Challenge on Day 5 with the classes on Experiments in Social Science. First, this change would allow us to combine the activities from Days 4 and 5. Second, we felt students would benefit more from learning about experimental designs. Third, our sense is that the use of experiments in social science research is well established in the Brazilian context, so we could push more on this front.

Guest-Speakers

We invited three guest-speakers for the first week:

  • Patricia Rossini (University of Liverpool, UK. Communication) - “Rethinking Online Toxicity: Conceptual and Methodological Advantages of Disentangling Uncivil and Intolerant Discourse”
  • Cesar Zucco (FGV-EBAPE, Political Science) - “Conducting Digital and Survey Experiments in Brazil”
  • Thiago Marzagao (CGU-Brazilian Government) - “Putting a price on real estate: Data Science and Machine Learning in the Brazilian Government”

We also invited Daniel Trielle (Northwestern University and SICSS Alumnus) to a workshop on Algorithmic Accountability, and Fernanda Scovino (Base dos Dados) to present the Brazilian Open Data Initiative Base dos Dados.

We believe participants enjoyed the guest-speakers and workshops. Our sense was that these workshops were a nice break from the group activities structure, and an opportunity for participants to see some research projects in practice. One thing we decided not to do, but we might consider for future editions, is to allow participants to meet with the speaker before the talk, creating a space for networking and collaboration.

Other Issues

Time Issues: we felt after a few days that we could have been stricter in controlling the timing of the activities, particularly the workshops and guest speakers. On a few days, we ended half an hour later than planned, and we could feel that participants were no longer following.

We had no issues with logistics in the virtual setting. For small-group activities, we used the breakout room feature, breaking our participants into rooms with 4-6 people. We divided all our groups randomly.
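For organizers who want to prepare the random rooms ahead of time rather than relying on the meeting software, the assignment can be sketched in a few lines of Python. This is only a minimal illustration under our own assumptions: the function name `random_rooms`, the participant labels, and the seed are hypothetical, and the 4-6 bounds simply mirror the room sizes mentioned above.

```python
import random


def random_rooms(participants, min_size=4, max_size=6, seed=None):
    """Shuffle participants and split them into rooms of min_size..max_size people."""
    n = len(participants)
    # Fewest rooms that keeps every room at or below max_size (ceiling division).
    n_rooms = -(-n // max_size)
    if n_rooms == 0 or n < n_rooms * min_size:
        raise ValueError(f"cannot split {n} people into rooms of {min_size}-{max_size}")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal people out round-robin so room sizes differ by at most one.
    return [shuffled[i::n_rooms] for i in range(n_rooms)]


if __name__ == "__main__":
    people = [f"participant_{i}" for i in range(23)]  # hypothetical roster
    for number, room in enumerate(random_rooms(people, seed=2021), start=1):
        print(number, room)
```

Dealing people out round-robin after a shuffle keeps the room sizes within one of each other, which avoids ending up with one very small leftover room.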

In addition, we allowed participants and organizers to interact during the lunch break, and we think this more informal space worked really well. We usually ended up in nice discussions with the participants about academic and non-academic issues. All the participants seemed to enjoy the lunch breaks, and we think this was a kind of bonding moment for all of us. Finally, as suggested, we conducted a series of daily surveys (Keep-Start-Stop) during the first week; the turnout for the first two surveys was really good, but we saw an expected decline by the end of the week.

4. Week 2

Schedule

Our schedule for week 2 departed substantially from the main curriculum. In addition to the Research Speed Dating and the time for the participants to launch their group projects, we organized two new activities.

On Monday, June 21, we spent the entire day working on an activity called DAPP Day. During this day, we did a workshop and a group activity where the participants could learn from our organizers at FGV DAPP about some of the methodologies developed at DAPP in their analysis of social media data. We started the day with a discussion about using complex linguistic queries to collect data on social media and then went over a coding exercise together, implementing some of these queries and covering basics of network analysis with Twitter data. We developed two different notebooks in R and Python to cover these materials.

On Tuesday, we focused on the Research Speed Dating. We followed this activity exactly as suggested in the main curriculum, first splitting the participants into maximally similar groups, then into maximally diverse groups. By the end of the week, the participants had worked on four different projects, which we expect them to develop further and submit for the SICSS grant proposal.

Our attendance declined a bit in the second week. We see a few reasons that might explain this decline. First, we believe most of the students decided to participate in the SICSS to learn more Computational Social Science skills, rather than to start a new research project. Second, because of the virtual environment, we believe that, although some students were able to clear their schedules in the first week, they unfortunately had some appointments in the second week. Therefore, participation was more unstable. Third, experiences like the SICSS, where students work together to think about a research project and submit a grant to get some funding, are quite uncommon in Brazil. Therefore, we felt that sometimes students were not sure how to move on with a project that was not directly related to their dissertations, and for which they could receive some financial support.

Overall, we believe we could have also let students have more time to work on their projects. In particular, we don’t think students benefited too much from the maximally diverse group discussion. We also think that maybe we could have started a spreadsheet during the first week where students could add some of their research ideas and start thinking about them for their second week research projects. Finally, our three roundtables, which we discuss below, took some time that the participants could have used towards their group projects.

Roundtables

We also organized three roundtables to cover the topics of applications of Computational Social Sciences in academia, the tech industry and also in civil society organizations with public impact.

The guest speakers were:

Computational Social Science in Academic Research

  • Ernesto Calvo - iLCSS, University of Maryland
  • Rochelle Terman - Assistant Professor of Political Science, University of Chicago

Opportunities for Computational Social Scientists in the Industry.

  • Daniel Mariani - Folha de São Paulo
  • Henrique Lorea - Data Analytics Manager, PicPay

Computational Social Science and Civil Society Organizations

  • Cristina Tardáguila, International Fact Checking Network
  • Cecília Olliveira, Fogo Cruzado App (gun fire monitoring in Rio de Janeiro and Recife)
  • Natália Leão, Gênero e Número (datajournalism focused on gender and race issues)

All three open discussions were very productive for the students, with high participation and several insights for career strategies for a researcher in CSS. In a next edition of SICSS Brazil, we intend to expand this experience exchange with professionals from different backgrounds and trajectories.

5. Post-SICSS

One of our major goals in organizing the first SICSS edition in Brazil was to help a growing community of Computational Social Scientists to do research in or about Brazil. We believe that the SICSS was an important step for this, but there is much more to be done. Therefore, we have a few plans for the future:

  • Organize a second edition of the SICSS in Brazil, and try to invite this year’s participants to help us in the process.
  • Organize a small Zoom session (40 mins) every month to keep some level of connection with the students, or even to talk about their research projects.
  • Try to keep our Slack rolling as a platform for students to get information on on-line courses, workshops, and other events in CSS.
  • Encourage joint research within the network created in the SICSS, contributing to strengthening and growing the CSS field in Brazil.
  • Encourage other universities in Brazil and in South America to host SICSS partner editions in future years.

6. Final Remarks

  • The virtual edition worked really well for us. Academic funding is very scarce in Brazil right now. Therefore, we are not sure we would have the resources to organize an in-person edition. Most importantly, Brazil is a huge country, so it is likely that the virtual edition was more diverse and reached a broader audience than an in-person edition would have.
  • However, the virtual format was not without weaknesses. Particularly for the final group projects, we believe in-person interaction is a better space to let research projects emerge. It is also possible that engaging in non-research, more informal community building was harder in a virtual setting. However, we believe that our optional lunch breaks worked well as a space where participants could get to know each other outside of the “classroom”.
  • We are happy with our decision to conduct our edition mostly in Portuguese. Language barriers in English are still huge in Brazil, so we think the students could participate and engage more with discussion and materials mostly in Portuguese.

SICSS-HSE

Organized by Elizaveta Sivak, Sofia Dokuka, and Ivan Smirnov

Teaching assistants: Alex Knorre, Saydash Miftakhov

The SICSS HSE University partner site (SICSS - HSE) was held remotely from June 13th to June 19th, 2021. This year we organized SICSS for the first time. We decided to make a smaller event: to keep the number of participants lower than usual (around 15) and to limit the school to one week instead of two. We had one week of group activities, discussions, guest lectures, and lectures from the faculty, and skipped the second week (group projects). We’ve divided this post into three main sections: 1) outreach, the application process, and selection of participants; 2) participants’ preparation for the school and onboarding; 3) the program.

1. Outreach, the application process, and selection of participants

1.1 Outreach, application process

We started the first outreach wave in January, once we posted basic information about the school on our website: the dates, bios of the organizers, details about the application process, and a short description of the program. In the first outreach wave, we mostly tried to reach potentially interested participants through social media. After that, we decided whom to invite as guest speakers and wrote to them in March 2021. The guest speakers were supposed to give their talks online, so the timing was fine (in the case of an offline school, the invitations to guest speakers should go out earlier – at least in January). All three speakers agreed to participate. We updated the school’s website (added the preliminary schedule and the guest speakers’ bios) and started the second outreach wave in mid-March. In this second wave, we emailed our former students at HSE University and faculty who might know interested participants. We also tried to reach potentially interested participants through social media (again) and email lists.

We collected applications via a Google Form. Application materials required a curriculum vitae, a statement of interest describing both any current research and interests in computational social science, and one writing sample. We decided not to request recommendation letters because this can be a problem for some groups of scholars (e.g. younger scholars), and thus some competent applicants would be filtered out. We didn’t regret this decision. It turned out that the statement of interest and CV were sufficient to select motivated participants, who were able to benefit from the school and also contribute to the educational experience of other participants.

The results of the application process were somewhat unexpected. We were surprised and happy to see strong applications from different countries in East and Southeast Asia, Europe, and North and South America. At the same time, our Russian sample included only Moscow and Saint Petersburg, and we received almost no applications from Eastern Europe and Central Asia. This may be because our networks (and networks of people we know) don’t reach the people in these regions who may be interested in CSS. Also, we didn’t find email lists or professional online communities related to CSS in these regions. One of the aims of our institute is to promote computational social science in post-Soviet countries and develop the CSS community in this region, where CSS is still underdeveloped. So, in the future, we should put more effort into reaching out to the people in these regions who are interested in CSS and do some work related to CSS – for instance, by contacting universities or universities’ social media accounts directly.

1.2 Selection & confirmation

We evaluated applicants along several dimensions: 1) likelihood to benefit from participation, 2) likelihood to contribute to the educational experience of other participants, research and teaching in the area of computational social science, 3) potential to spread computational social science to new intellectual communities and areas of research, 4) contributions to public goods, such as creating open source software, curating public datasets, and creating educational opportunities for others

We selected 22 participants (50% female) and wrote a letter with details about the school: 1) that we require preparation (studying pre-arrival materials, listed on our website), 2) which activities the school will include, 3) challenges related to the online-format, mostly the problem with different time zones - all our events were based on Central European Summer Time (CEST), and for some participants, CEST is inconvenient either in the morning part or in the evening part of the day.

We then asked them to confirm their participation after considering all these conditions. Almost all participants (20) confirmed. However, for those participants who were in North America or East/Southeast Asia, time zones turned out to be a barrier. They missed some of the morning and evening events. This was not very convenient logistically, mainly during group activities. Someone from the faculty or TAs always had to stay in the main Zoom room to explain what was going on to people who joined later and to assign them to groups. Also, the participants in the groups had to update the newly joined participants on what they were working on. However, there seems to be no way around this problem (other than restricting participation to certain time zones).

Also, a note to our future selves: in case of a virtual event we should stress more that the school will take a big part of the day (and that it will be intense and tiresome), and ask the participants straight away to block out half of the day and not plan much for the remaining part.

2. Preparation and onboarding

After participants were accepted, we began to onboard them into the program. We sent out an email to the participants requesting them to:

  1. send a short bio and photo for the site via the across-locations Google Form (note: stress in the letter again that the photo should be 200×200 pixels, preferably a portrait)

  2. join the Slack workspace via a link, and then join our school’s channels there. Some participants had to be reminded to join us on Slack. After that, we started to communicate with them solely via Slack

  3. (again) study the pre-arrival materials. We also reminded the participants about the opportunity to join all-SICSS office hours held by Princeton TAs, and probably some participants used this opportunity. Also, we encouraged the participants to join the #sicss-hse-help channel supervised by our TAs (we thought that some people might find it easier to write their questions on Slack than to join Zoom Q&A sessions). But we didn’t receive any questions.

We noticed that not all participants studied pre-school materials, especially materials about R. However, we were happy to learn that those who did (even though they had no previous programming experience with R) were able to participate proactively in group work, after only one or two months of learning R. This can be used for motivating people to pay more attention to R tutorials before the school.

Also, we followed the SICSS tradition of starting the school informally and getting to know each other a bit during a meeting on the evening before the main program of the school. At first, we met in Zoom for a short welcome. For the rest of the meeting, we followed the Zoom-party scheme, but instead of Zoom, we met in Gather town. Participants suggested some very good topics (just in case, we also visited the rooms while people were discussing). We wrote the topics in a Google Doc and indicated a private room in the Gather town space for each topic. In general, the impression of Gather town was positive, but general instructions are needed beforehand about what to do when people get to the space, to avoid chaotic movements and people blocking the entrances/exits to private rooms. We also noted that participants enjoyed discussions where they could suggest topics themselves, so we had another free discussion on Monday. However, communication is less spontaneous online, so it’s better to have a moderator and decide on initial topics beforehand.

Another good thing about the meeting before the main events was that all the participants figured out the time zones (when exactly events start in their local time).

3. The program

  1. With some exceptions, we followed the topics suggested by Matt Salganik and Chris Bail and relied on their pre-recorded videos. We added some talks from the faculty (see Materials), guest lectures, and also adjusted some activities (see below). All activities were held in Zoom. We used two Zoom links – one for all internal meetings and one for guest lectures that were open to non-SICSS participants. We used Zoom breakout rooms for group work. We asked participants not to leave Zoom calls during breaks and also took screenshots, because Zoom doesn’t remember group membership if someone leaves and then comes back. Initially, we had 15-30 minute breaks between the events and during group work, but some people asked for longer breaks, so we restructured the schedule a little to have at least one longer break (45-60 minutes) and got positive feedback.

  2. Group activities were scheduled for around 3 hours each day. Participants were assigned to groups of 3-5 people either randomly (if an activity didn’t require special skills) or non-randomly in case an activity involved coding (we tried to assign a person with coding experience to each group). We did not assign people to groups based on their preferred coding languages, except for the Fragile Families Challenge, when we made one Python group. We encouraged people to write in Slack or ping us in Zoom in case of questions or problems, and we also visited breakout rooms from time to time to check how things were going. After each group activity, we came back together to discuss what participants did in groups.

  3. In the activities that required coding collaboratively the groups mostly used screen sharing to code together, and sometimes R cloud. Unfortunately, we were not able to find a collab environment that works with R. Usually, one person shared her screen, coded, and explained the code.

  4. We adjusted the Day 2 activity (Collecting Digital Trace Data), which is based on Twitter data, because we were not sure whether all participants would be able to get a Twitter developer account. Applications from two SICSS-HSE faculty members were rejected, and we learned later that the application of at least one of our participants was rejected too (and Twitter doesn’t currently allow one to appeal). Instead of Twitter, we based this day on Reddit data, and everything went alright.

  5. In the Day 4 activity (Surveys in the digital age) we used Prolific instead of MTurk because it is not possible at the moment to create an employer’s account from Russia. We created one Prolific account and added money to it; all groups used the same credentials (we gave instructions not to exceed the sum allocated to each group and also visited each breakout room to remind them about that). This worked without any issues.

  6. On Day 6 (a day off in the SICSS-Princeton/Duke program) we scheduled one guest lecture. Also, we had plenty of time for flash talks (around 15 minutes for the talk and 15 minutes for discussion), which went well. We shared a Google spreadsheet to sign up for a talk at the beginning of the school and reminded participants several times about the opportunity to give a short talk and its possible formats (results of a study, a study proposal, a small tutorial, a review of some research field).

  7. At the end of each day, we asked participants to provide anonymous feedback using the keep-start-stop survey in a Google Form. At the beginning of the next day, we summed up all suggestions and critiques and explained what we would adjust based on the feedback. We didn’t receive many suggestions (and the number decreased by the end of the school), but all that we received were very useful (for example, to make one longer break).
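The non-random assignment mentioned in item 2 of the list above (one person with coding experience per group) could be sketched as follows. This is a hedged illustration, not the procedure the organizers actually used: the function name `groups_with_coders`, the `(name, has_coding_experience)` input format, and the number of groups are all assumptions made for the example.

```python
import random


def groups_with_coders(participants, n_groups, seed=None):
    """Split (name, has_coding_experience) pairs into balanced groups,
    guaranteeing that every group gets at least one experienced coder."""
    rng = random.Random(seed)
    coders = [name for name, codes in participants if codes]
    others = [name for name, codes in participants if not codes]
    if len(coders) < n_groups:
        raise ValueError("not enough experienced coders to seed every group")
    rng.shuffle(coders)
    rng.shuffle(others)
    # Coders go first, so round-robin dealing hands each group one coder
    # before anyone else is placed; remaining people fill the groups evenly.
    ordered = coders + others
    return [ordered[i::n_groups] for i in range(n_groups)]


if __name__ == "__main__":
    # Hypothetical roster: 20 participants, the first 6 with coding experience.
    roster = [(f"participant_{i}", i < 6) for i in range(20)]
    for group in groups_with_coders(roster, n_groups=5, seed=13):
        print(group)
```

With 20 participants and 5 groups this yields groups of 4, inside the 3-5 range described above; extra coders beyond the first five are simply distributed like everyone else.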

In general, despite all the challenges of the online format (lack of naturally occurring communication, time differences, the difficulty of involving shyer people in communication, Zoom fatigue, etc.) we had a very positive experience with the school and we are very grateful to the participants, who made a great contribution to each other’s educational experience. We also didn’t regret our decision to limit the school to only one week for the first time, because running the school is quite difficult, albeit very rewarding.


SICSS-Helsinki

The University of Helsinki hosted a Summer Institute in Computational Social Science, following the format of one week of instruction and one week of student project work. During the instruction week we followed Coding Social Science, the textbook developed by Matti Nelimarkka, and discussed research ethics, validity and reliability questions, data science, network analysis, simulation models and interactive systems for social science research. We ended the second week already on Thursday to accommodate Midsummer – a Nordic holiday on Friday.

Pre-activities

We conducted targeted advertisements for Finnish and Nordic audiences via list servers. Overall, we received a modest number of applications across disciplines: computer science, physics, communication and media studies and political sciences. We accepted a total of 20 participants, about half from Finland and others from mostly Europe.

In the spirit of the flipped classroom, students were expected to read seven chapters from Coding Social Science and self-study the basics of R using Coding Social Science, SICSS online materials or other online assets. We tried to communicate the expected skill level via a set of exercises students should finish before SICSS starts. However, based on feedback survey responses, it seems that about half of the respondents did not feel sufficiently prepared for the instruction and project weeks, citing lack of time or noting that they should have read more and practiced programming more.

Action point: This year the pre-activities were laissez-faire: we did not provide detailed scheduling, nor did we require, for example, one-minute essays or the return of coding exercises for evaluation and assessment. However, this approach puts a significant burden on participants’ self-regulation skills to ensure they prepare sufficiently. Partly this could be aided with smaller interventions, such as a mandatory diagnostic test for all participants, both communicating the expected skill level and allowing participants to self-assess the amount of work they ought to do. Alternatively, the learning activities could be spread over about one month before SICSS and include more instructor-driven assessments.

Action point: I also observed that participants did not use the global SICSS TA channel to ask any questions beforehand, nor did they contact me regarding programming problems or conceptual or theoretical issues. I am not sure if this relates more to the usually high barrier to contacting people – especially for novices (at least in the software industry, Begel & Simon, 2008) – or to poor communication about the availability of these resources.

Beyond knowledge and skill improvement, from previous years we know SICSS is also about building a community of like-minded scholars. To support this, we organised three voluntary one-hour meetings before SICSS Helsinki and invited participants to share their photo and a short bio in the SICSS Slack. About one half of the participants joined in these activities. All participants who answered the final survey indicated that they felt welcomed to SICSS Helsinki, which is a critical part of an online activity like this.

First week: instruction

The instruction activities assumed participants had already familiarised themselves with the corresponding chapter and mostly focused on putting the theoretical ideas to use. We used group activities to think further about how various concepts materialise in research (for example, for network analysis we examined what could be seen as a network, in data science we identified potential data sources for empirical questions etc.). Students were also expected to modify and expand brief tutorial code snippets demonstrating the implementation of the methods in R. Originally we attempted a more group-work-based approach to the programming exercises (in hopes that peer support would be better organised), but following students’ feedback we modified the process to allow students to work either solo or together, with specific Zoom breakout rooms for each part of the exercise and a separate space for solo workers (although participants rarely moved between these spaces; i.e., people went to the solo workspace or worked in the room for Exercise 1).

Participants’ feedback highlighted occasional unclarity about the learning activities and their goals, as well as the level of difficulty increasing too quickly for their technical skills. Similarly, it was highlighted that I did not teach that much during these activities, but rather facilitated and provided materials and comments.

Action point: I will revise the exercises both in the book and additionally the programming snippets to ensure exercises are clearer and have a better indication on their difficulty (and that there are even more beginner friendly exercises). Programming snippets could be supplemented with voice-over tutorials to make them more accessible and references to the book when possible.

Reflection: I think the root cause of students’ observations on the level of difficulty relates to the heterogeneity of the student population: some have extensive programming experience beforehand while others are still beginners. This is a difficult challenge to tackle. The heterogeneity can be addressed via more formal pre-activities, bringing the minimal skills to a certain level. However, even then there will always be people with prior experience in programming who will handle programming activities more easily. I do not think the problem can be solved by catering to these different audiences with different exercise sets, as this would break the shared experience of SICSS. I think I need to revise the coding exercises once again to also ensure they communicate the intent: gaining some hands-on experience working with these methods with toy examples.

Second week: group projects

We used Thursday and Friday afternoons in Week 1 to discuss student projects, allowing participants to elaborate their research interests and discuss them in groups. We also conducted speed-dating activities where pairs or small groups discussed what kind of projects they were interested in – and got to know each other better. Based on participant feedback, participants enjoyed their groups (comments from feedback survey “My team worked very well together and the collaboration was a pleasant experience.”, “My group worked perfectly as we were on the same level and had the same aims for the project. Our collaboration was wonderful and it made the project a lot of fun to do despite problems.”).

Each day we had two checkups with the teams (about ten minutes per team), covering what had been done, what was expected to be done next, and any issues, prompted by questions such as “what would you do differently tomorrow” and “does the team work efficiently in your opinion?” The overall aim of these meetings was to ensure that group members communicated with each other, to give the teaching team an overall perspective on how the remote teams were functioning (and an opportunity to mitigate any conflicts early on), and to spot major showstopper issues where intervention was needed.

Student feedback indicated that this strategy was not appreciated by all participants, as some felt the checkups were useless and took time away from solving actual problems. However, for me they provided an opportunity to follow the group projects, and they were intended as a venue to support reflection during the intensive work. I said on several occasions (and to each group whenever issues emerged) that I was available to help teams during the day if needed, and I followed Slack for any issues, but I was rarely contacted for support.

Action point: Formalize more clearly the structure and purpose of checkup meetings as an avenue to support reflection, for example with small pre-assignments focused on group dynamics, project process and key learnings and insights.

Students also highlighted that, for the final presentations, ten minutes for presenting and five minutes for discussion was insufficient, as there were more learnings to be shared. As always, the timing was a balancing act between the number of groups and the amount of time we can expect to keep people on Zoom. However, there could be an opportunity to share somewhat wider documentation (such as an extended slide set and any supplementary material) to allow offline engagement with this extended material, keeping the ten-minute slots as quick introductions to it. On the other hand, summarizing key ideas in a short period of time is a key skill in academia, so the time limit is also a beneficial learning experience.


SICSS-Hong Kong

SICSS-Hong Kong was held virtually from June 14 to 25, 2021, at the Hong Kong University of Science and Technology. This was the first time SICSS came to Hong Kong. From over 80 applications, we invited 18 participants; 17 remained engaged in events until the end. Of the 17 final participants, 15 were graduate students and 2 were assistant professors in their first or second year on the job.

Our first three days followed the main SICSS curriculum. The fourth, fifth, and sixth days covered site-specific topics closer to each organizer’s research expertise. The rest of the second week focused on group projects. This post-mortem is divided into six sections: 1) outreach and application process; 2) pre-arrival and onboarding; 3) first week; 4) second week; 5) guest speakers; and 6) final remarks.

1. Outreach and Application Process

Compared with other sites, SICSS-Hong Kong started relatively late, because two of the organizers (Han Zhang and Haohan Chen) came together and decided to organize the event only at the beginning of 2021. Jaemin Lee later joined as an organizer. We therefore set the application deadline to April 15, 2021.

During planning, our main target audience was research students and faculty in Hong Kong and the Greater China Region. The rationale was that there is a growing body of faculty in Hong Kong doing computational social science, and clearly growing interest among students. However, there is a lack of intercollegiate conversation about computational social science among local universities and researchers. Fostering a working group on computational social science was therefore one of our goals.

We advertised our event to all universities in Hong Kong, as well as major research universities in mainland China, Singapore, and South Korea. The advertisement was mainly spread through emails to university departments.

2. Pre-Arrival and Onboarding

Two weeks before SICSS, we sent participants reminders to watch the videos of the lectures for the first three days (especially the first days). Participants were also expected to read Bit by Bit.

We tried to communicate the expected skill level via a set of exercises students should finish before SICSS started. However, based on feedback survey responses, it seems that about half of the respondents did not feel sufficiently prepared for the instruction and project weeks, citing lack of time or noting that they should have read more and practiced programming more.

Action point:

We could ask people to read the lecture materials and watch the videos earlier (say, 4 weeks before the event), and send a reminder one or two weeks before the event.

Technical backgrounds were mixed; some people were quite good at coding while others were not. We could send targeted reminders to participants with less coding experience to watch the coding bootcamp material and practice.

3. First Week

To ensure that participants could get started with group projects as soon as possible during the program, we asked them to post their research interests on Slack before the institute began. We also hosted flash talks regularly throughout the first week (15 minutes per participant) to make sure that they knew what their fellow participants were interested in. Most flash talks were held in the evenings (due to time zone considerations).

Since this was the first time we hosted SICSS in Hong Kong, we largely followed the main site’s curriculum for the first three days, including discussing lecture materials and exercises.

For Days 4 and 5, we had our own site-specific lectures: automated image analysis on Day 4 and online experiments on Day 5. The lectures took around 1 hour. For Day 4, we also prepared some exercises for hands-on experience with image analysis.

Reflection points:

To randomize or not. For the first three days’ exercises, we decided to randomize the groups. Some students thought that this prevented them from forming stronger ties with their classmates. Our rationale was that in an online setting it is hard to socialize with other participants, and randomizing groups each time might give them more chances to get to know each other. We may keep this approach if next year’s SICSS is still online.

The main site’s exercises received mixed feedback. Some people thought they were too rudimentary, while others (those with little experience of computational social science) thought they were okay. We have not figured out the best workaround. Maybe it would be more useful to go directly into projects and let students help each other with coding while doing their own projects, instead of working on toy examples?

4. Second Week

The first day of the second week also consisted of site-specific lectures on how to combine surveys and big data. After the lectures, we had an afternoon session for research speed dating. We believed that the flash talks and exercises in the first week had already given participants a good basis for forming groups. Still, we suggested possible areas of overlapping interest to them. We hosted afternoon Slack sessions to hear their preliminary ideas and make suggestions.

We stated that group projects were optional but highly recommended. Participants who chose not to take part could instead join the SICSS festival as much as possible.

Four groups ended up doing group projects. They gave a preliminary presentation on the first day of the second week, a short progress presentation on the third day of the second week, and a final presentation at the end of the event. The quality of the presentations was far beyond our expectations. All groups came up with well-documented, well-thought-out, and executable plans for their projects.

5. Guest Speakers

We invited four guest speakers, with disciplines ranging from communications and sociology to computer science/data science. Two speakers were from local Hong Kong universities and the other two were from the US.

Reflection points:

Students might find it more helpful to have office hours. We did not offer office hours this year, but this is worth considering for next year.

6. Final Remarks

Outreach:

Do more social media advertising. Currently our outreach is mostly confined to universities through email chains.

Plan and Time:

If we are going to have a flipped-classroom model next year, one potential idea is to wait two weeks and start around the middle of July, when most Chinese universities finish their semester. Of course, this depends on whether we want mainland China to be a major source of applicants.

Time of SICSS Festivals:

Currently the timing makes it very hard for people in Asian time zones to participate. The events are all after midnight; some of them are around 2 AM. Of course people can watch the recordings later, but the recordings are not available until the following week. As a result, students cannot discuss the events together, because very few can stay up that late.


SICSS-Howard/Mathematica

“The Summer Institutes in Computational Social Science (SICSS) were created to provide free training to the next generation of researchers at the intersection of social science and data science—and to incubate cutting-edge research across disciplinary boundaries.” As the first SICSS partner site at a Historically Black College or University (HBCU), Howard University, SICSS-Howard/Mathematica is additionally focused on seeding and growing a depth of knowledge and appreciation for computational social science in underrepresented communities while serving those communities’ needs directly and unapologetically.

SICSS-Howard/Mathematica welcomed 24 participants to a two-week institute focused on the alleviation of anti-black racism and inequity. SICSS-Howard/Mathematica was convened virtually and synchronously by Howard University from June 14th to June 25th. The SICSS-Howard/Mathematica experience also included “Praxis to Power,” a unique community- and confidence-building pre-institute convened from June 12th to 13th.

SICSS-Howard/Mathematica consciously and consistently put the needs of emerging scholars of color first throughout the planning process and within the entirety of the institute. Several SICSS-Howard/Mathematica site innovations were introduced to build community, build confidence, break down the inequity that is the norm for our participants in academe, and create something better.

A quote from a SICSS-Howard/Mathematica 2021 alum:

“I have not been this excited about research in a long time, and that’s thanks to the truly diverse and multidisciplinary participants you admitted, the incredible Naniette and the event staff, and the insightful and brilliant speakers that were thoughtfully chosen to share their knowledge and experiences with us…as a person of color, there are no words to truly describe the impact of being validated every day and to be in a space where people embrace conversations about race and ethnicity, and are passionate about making a change.”

Since this was my first time organizing a SICSS partner location, I decided to closely follow the SICSS main curriculum for the first week. Along the way I also adhered to the SICSS promise of creating open-source materials by documenting all our work thoughtfully, posting links to videos and slides on our website to make the learning experience accessible to those present in our virtual Howard classroom and those following along around the world. In addition to those SICSS aligned choices, I also opted to introduce several Howard/Mathematica site specific innovations. In the interest of space and time I will share the overall structure of that plan. I welcome and appreciate requests to discuss, write about, and instruct others on this work in greater detail as well. I can be reached via email at nhcoleman@berkeley.edu or through my website naniettecoleman.com.

The SICSS-Howard/Mathematica partner site was made possible by the exceptional generosity of Paul Decker and our friends at Mathematica. This program was also made possible by the dedication of many exceptional people at Howard University and Mathematica including Calvin Hadley and Akira Bell.

This post-mortem will be divided into six sections: 1) outreach and application process; 2) pre-arrival and onboarding; 3) first week; 4) second week; 5) post-SICSS, and 6) final remarks.

Outreach and application process

Outreach: The intention of our outreach was to attract applicants, build a long-term pipeline for current and future SICSS-Howard/Mathematica site promotion, and inform interested parties about the day-to-day activities at our site. Our messaging prioritized discussions of the fact that we are the first SICSS partner site at a Historically Black College or University, noted our topical focus on anti-black racism and inequity, and included images of black students. Outreach to potential participants was broken up into three categories in order to spread responsibility across all three organizations and capitalize on existing expertise. The SICSS organizer took the lead on contacting academic associations, centers, and thought leaders in the social sciences, former SICSS faculty, former SICSS participants, and current faculty, administrators, and students at predominantly white institutions. Howard University took the lead on contacting HBCU presidents, deans, chairs, faculty, administrators, and students. Finally, Mathematica took the lead on contacting professional associations and organizations (e.g., National Society of Black Engineers) and professionals in data science and computational social science. Following the lead of the Princeton site we also “tried to reach potentially interested participants through social media, email lists, and emails to faculty that we thought might know interested participants.” Once we identified individuals who were interested in our SICSS, we created a Google Group and added them all to it.

Our site maintains Twitter and Facebook accounts and creates original content. We posted daily in the days leading up to the institute and daily during the institute, and made sure to retweet those who tweeted at us. To our knowledge, the only other SICSS site with a dedicated account was SICSS-Istanbul. We had a variety of content, including previews, recaps, quotes, pointers to our sites and videos, and spotlights on speakers and participants. Having this account also allowed those who were unable to participate this year to easily follow along and stay updated. Finally, our partner site maintains a YouTube depot of 59 public videos (total length: 43 hours, 8 minutes, and 32 seconds) for repeat viewing by participants or interested parties, as well as several embargoed videos for use in promotional efforts in the future.

Application Process: Applicants had to submit the following documents through the SICSS-Howard/Mathematica application form by 11:59pm EST on Wednesday, March 31st for full consideration: (i) a personal statement (maximum 1000 words) including any economic, cultural, and/or social experiences, obstacles you have overcome, and/or community service that shaped your interests in computational social science and your desire to attend the SICSS at Howard University; (ii) a research statement (maximum three pages) describing any current research, your general interest in computational social science, and your interest in the expressed topical focus of the SICSS-Howard/Mathematica site; and (iii) a curriculum vitae. We had originally planned to notify candidates by e-mail by Wednesday, April 14th; however, we were unable to do so until Monday, April 26th. We asked participants to confirm their participation very soon thereafter. All applicants were notified regardless of outcome.

This was our second year accepting applications. We initially accepted applications in 2020 for an in-person institute, which was eventually cancelled due to COVID-19. This was our first year accepting applications for a purely virtual event. We more than doubled the number of applicants from the previous year. Ultimately, we admitted 31 applicants, 26 of whom accepted admission; 24 attended.

What we learned from this process is there is a hunger for this type of skill-based methodological training for members of the global majority, but several myths, challenges, and constraints exist that prevent more widespread participation. I believe we have developed an approach that can be duplicated. I welcome the chance to share what we have learned with others should there be interest.

Pre-arrival and onboarding

Onboarding: Like SICSS-Princeton, we added all participants and staff to a Google group and sent out an email to the group with similar pre-arrival logistics: SICSS-Howard/Mathematica onboarding survey, SICSS-wide onboarding survey used by all locations, information about joining the Slack workspace, and information about pre-arrival activities they should complete. These included reading, optional coding activities for participants who wanted additional practice, and watching lectures by Matt and Chris. We provided a link to the pre-arrival section of our location’s webpage for more details and slowly added links for all the pre-arrival information in our daily agenda for ease of access since we found the two locations of information confusing.

Office hours: Like other sites, we initially planned to host site-specific office hours run by teaching assistants. In the end we suspended office hours and rethought the role of student TAs for the SICSS-Howard/Mathematica site for two reasons. First, we read in several post-mortems that office hours often went unused, and we found participants at our site were also not using the office hours consistently. Second, the pay prescribed for teaching assistants by SICSS made it very difficult for us to command potential TAs’ undivided attention. In the end, those realizations and the unexpected changes that followed produced a very positive outcome. I rethought initial work allocations and considered ways to better leverage the expertise and relationships already within my network and available to me through our co-sponsors. After that I just needed to pivot thoughtfully. The willingness of both co-sponsoring organizations and SICSS-Umbrella to trust my instincts and share their resources led to my feeling more confident in a space that could have been fraught. I ultimately opted to hire a larger team of long-term, committed members of the Berkeley-based Interdisciplinary Research Group on Privacy to serve as logistics-focused event staff instead, and then recruited professional data scientists working for our co-sponsor Mathematica to serve as TAs for our HBCU-based SICSS. Most of the event staff were either undergraduate or recent graduate women of color who had read Bit by Bit and were excited to jump in and do whatever was needed to make the event memorable. When starting a new site, that kind of energy and willingness goes a long way. Going forward, we hope SICSS-Princeton and SICSS-Duke will permanently staff centralized office hours and the #sicss-all-office-hours Slack channel. Centralized office hours could offer more precise support for foundational curriculum activities and create a space for participants from different sites to meet each other.

T-shirts or other SWAG: Several SICSS-Howard/Mathematica participants said that the SWAG bag they received from us was the best they had ever received. The organizing team conceptualized a Howard/Mathematica branded SWAG bag with accompanying items to send to participants to welcome them. After pricing out the various parts of the endeavor, Mathematica purchased all the items and then assembled and shipped the bags to all the participants. Speaker and event staff bags will go out shortly. The package contained a SICSS-Howard/Mathematica branded shopping bag, Mathematica branded spiral notebook, notepad, and M&Ms, a $25 DoorDash gift card for food delivery, four books by site affiliated authors (Bit by Bit by Matt Salganik, Mothering from the Field by Bahiyyah Muhammad, Policing in Natural Disasters by Terri Adams, and Race After Technology by Ruha Benjamin), one additional book with general applicability to our attendees (The Black Academic’s Guide to Winning Tenure Without Losing Your Soul by Kerry Ann Rockquemore), and a welcome letter by lead organizer Naniette Coleman. Looking ahead to future iterations of this virtual program (and considering ways to alleviate technological challenges), we will also consider including an ethernet cord, wireless headset, camera, and flash drive with SICSS-Howard/Mathematica site-specific datasets on it (e.g., Black Twitter data as opposed to data on former president Donald Trump).

Participants also received SWAG from SICSS-Umbrella through Swagpack which our participants greatly appreciated.

“Praxis to Power” Pre-Institute

“Praxis to Power” was a pre-institute exclusive to SICSS-Howard/Mathematica that was focused on combating imposter syndrome and fostering an inclusive environment for our cohort of participants, which comprised mostly scholars of color.

The first day (June 12th) started with an extended welcome and logistics session in which participants and the organizing team did personal introductions and began to build community. Other main events included a lunchtime “real talk” roundtable discussion on “Personal Experiences of African Americans in the Academy,” which included a panel of four Black professors (including three currently at Howard and one Howard alumna), followed by a workshop by Meckell Milburn entitled “Choosing Self-Care in a Hostile World,” and day 1 of a 3-hour, hands-on workshop on R entitled “Making Computational Methods Accessible” by SICSS-Princeton ‘19 alum and recent Berkeley PhD Jae Yeon Kim.

The second day (June 13th) also began with a longer logistics session, followed by three hours of self-driven learning (bootcamp and lecture materials from Matt and Chris, watching interview videos to get to know the organizing team, and a “Changing the Game” section with videos from partners on diversity in the field and reshaping higher education). The second iteration of Dr. Jae Yeon Kim’s hands-on 3-hour workshop on R entitled “Making Computational Methods Accessible” was also offered on day two.

We chose to conduct a skills survey during “Praxis to Power” in order to help our event team put individuals into groups daily during the first week. Rather than randomly assigning participants to groups, we wanted more intentional grouping with a balance of expertise in each day’s topic, while also ensuring individuals would meet new people each day. Questions included:

Day 1 (Monday): What is your experience with the R programming language? Briefly describe how you have used R? What is your experience with the Python programming language? Briefly describe how you have used Python? What is your exposure to conversations surrounding research ethics?

Day 2 (Tuesday): How would you describe your experience with APIs (Application Programming Interface), such as Twitter, Facebook? From 1-10, how would you rate your experience with extracting/web scraping digital trace data?

Day 3 (Wednesday): What is your experience with automated text analysis [techniques for analyzing large amounts of text in R]? (If applicable) Briefly describe what techniques you have used with automated text analysis (e.g., text preprocessing, dictionary-based text analysis, topic models, text networks, etc.)

Day 4 (Thursday): How much experience do you have with survey methods? How familiar are you with Amazon’s Mechanical Turk (MTurk)? If you have an MTurk account, please share the email address/username?

Day 5 (Friday): What experience do you have with turning raw data into usable data (wrangling data)? What background do you have, if any, in statistics? What experience do you have with building a simple linear regression model? For example, one variable is a predictor (independent), and one variable is the outcome (dependent). What experience, if any, do you have with building a more complex predictive model? This refers to a model more complex than a simple linear regression model (see previous question) and includes multiple predictor variables. A complex model might be like the model used in the Netflix Prize open call (as discussed in Bit by Bit on page 246). What is your familiarity with the Fragile Families data set?
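One way the intentional grouping described above could be automated is sketched here in Python. It is an illustration under stated assumptions, not a description of the procedure actually used: it assumes each participant's survey answers for a given day have been reduced to a single numeric skill score (a hypothetical simplification, since most questions were free-text), and the names `balanced_groups` and `pairs_in` are made up for this example. Participants are ranked by score and placed one tier at a time, so each group gets one of the most experienced people before any group gets two; ties are broken in favor of the group containing the fewest people the participant has already worked with.

```python
from itertools import combinations


def balanced_groups(scores, n_groups, met=frozenset()):
    """Split participants into n_groups with a mix of skill levels.

    scores: dict mapping participant name -> numeric skill score
            (hypothetical; assumed to be derived from the skills survey).
    met:    set of frozenset({a, b}) pairs who have already shared a group.
    """
    groups = [[] for _ in range(n_groups)]
    # Highest scores first (names break ties so the result is deterministic).
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    for person in ranked:
        # Only the currently smallest groups are eligible, keeping sizes even.
        min_size = min(len(g) for g in groups)
        candidates = [g for g in groups if len(g) == min_size]
        # Prefer the eligible group with the fewest previously met members.
        best = min(
            candidates,
            key=lambda g: sum(frozenset((person, m)) in met for m in g),
        )
        best.append(person)
    return groups


def pairs_in(groups):
    """Return every pair of participants who share a group."""
    return {frozenset(pair) for g in groups for pair in combinations(g, 2)}


# Toy example with made-up scores for two consecutive days.
scores = {"A": 9, "B": 8, "C": 3, "D": 2}
day1 = balanced_groups(scores, 2)
day2 = balanced_groups(scores, 2, met=pairs_in(day1))
print(day1)  # [['A', 'C'], ['B', 'D']]
print(day2)  # [['A', 'D'], ['B', 'C']]
```

In practice each day would use that day's topic-specific scores, and `met` would accumulate the pairs from all previous days. This greedy approach does not guarantee that no pair ever repeats; it only reduces repeats where group sizes allow.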

We sent out 2 emails on the first day and 6 emails on the second day to remind participants of the time and to check their media consent forms. There were a total of 37 email threads on the first day and 25 email threads on the second day for calendar invites, checking in with speakers, and scheduling. Attendance was 23/25 people on both days.

First week

Opening Plenary (Sunday): The pre-institute concluded, and the institute officially kicked off, with an opening plenary talk by Howard University President Dr. Wayne A.I. Frederick on Sunday evening. He discussed the importance of data in his experience as a surgeon and researcher, emphasized the impact of having SICSS at Howard University, and ended on a very personal note. Participants openly expressed their gratitude for his address.

Day 1 (Monday): SICSS-Howard/Mathematica specific activities included a lunchtime “real talk” roundtable discussion on “Getting the most out of your SICSS experience” with five SICSS alumni of color. Attendance was at its highest for the entire institute on the 1st day with 24/24 or 100% of participants present. For office hours, we had 2 professional Mathematica teaching assistants, but no participants attended office hours. There were 23 email threads in total and we sent out 7 emails to remind participants of the event, send updates, confirm speakers’ media consent, and schedule time with speakers.

Day 2 (Tuesday): SICSS-Howard/Mathematica site specific activities included a bite-sized lunchtime talk with Wiki Education. This live event was a follow-on to pre-taped original content participants had access to. Attendance continued to be at its highest for the entire institute on the 2nd day with 24/24 or 100% of participants present. For office hours, we had 11 professional Mathematica teaching assistants, and 20 participants attended office hours. The most common topics were web scraping, using R, exploring APIs, and reading data. When there was traffic in the office hour session, participants lined up to get their questions answered. There were 11 email threads and we sent out 7 emails to remind participants and share updates on resources.

Day 3 (Wednesday): SICSS-Howard/Mathematica site specific activities included a bite-sized lunchtime talk with the founders of the Data Nutrition Project. This live event was a follow-on to pre-taped original content participants had access to. Attendance remained high with 23/24 or 95.83% of all participants present. For office hours, we had 11 professional Mathematica teaching assistants and 6 participants attended office hours. The topics covered in office hour sessions included debugging web-scraping code and programming with R and Stata. There were 11 email threads and we sent out 7 emails to remind participants and check in with teaching assistants and speakers.

Day 4 (Thursday): SICSS-Howard/Mathematica site specific activities included a bite-sized lunchtime talk with a representative from the California Policy Lab. This live event was a follow-on to pre-taped original content participants had access to. Attendance continued to be at its highest for the entire institute on the 4th day with 24/24 or 100% of participants present. For office hours, we had 8 professional Mathematica teaching assistants, and 4 participants attended office hours. The most common topics were about using R, analysis ideas and approaches, and web scraping. There were 13 email threads and we sent out 8 emails to check in with speakers, send event reminders, and discuss agenda for the next day.

Day 5 (Friday): SICSS-Howard/Mathematica site specific activities included a bite-sized lunchtime talk with the founders of FactSpace West Africa and FakeNetAI. This live event was a follow-on to pre-taped original content participants had access to. At the end of the day, we had a surprise Juneteenth celebration with a live DJ. There was a slight drop-off, but attendance remained relatively high on the 5th day with 21/24 or 87.5% of participants present. For office hours, we had 10 professional Mathematica teaching assistants and 7 participants attended office hours. The topics covered included linear regression, data visualization, data manipulation, and statistical modeling. There were 10 email threads and we sent out 4 emails for office hour reminders, checking in with participants and speakers, and confirming media consent.

Second week (group projects)

Day 6 (Monday): SICSS-Howard/Mathematica site specific activities included two extended sessions of research speed dating, which we altered for our site, a bite-sized lunchtime talk with a board member of Black in AI, and live Q&A/office hours with Dr. Ruha Benjamin. Both live events were follow-ons to pre-taped original content participants had access to. Participants spent a fair amount of the day setting their intentions and conceptualizing research teams and/or planning individual enrichment projects. Attendance remained high on the 6th day with 23/24 or 95.83% of all participants present. For office hours, 8 professional Mathematica teaching assistants were available via email. Attendance was not possible to gauge during week two since all appointments with TAs were scheduled by participants directly. Many would meet their TA in one of our break-out rooms. There were 23 email threads and we sent out 2 emails for scheduling events, discussing run-of-shows, checking in with speakers, and communicating with other SICSS sites.

Day 7 (Tuesday): SICSS-Howard/Mathematica site specific activities included finalizing research teams and/or planning individual enrichment projects. Other activities included a bite-sized lunchtime talk with Jeffrey MacKie-Mason from the University of California’s Publishing Negotiation Team and live Q&A/office hours with guest speaker Chris Wheat, PhD. Both live events were follow-ons to pre-taped original content participants had access to. Attendance dipped a bit but remained high on the 7th day with 21/24 or 87.5% of all participants present. For office hours, 15 professional Mathematica teaching assistants were available via email. Attendance was not possible to gauge during week two since all appointments with TAs were scheduled by participants directly. Many would meet their TA in one of our break-out rooms. There were 22 email threads and we sent out 7 emails for checking in with speakers, reminding participants, and confirming availability.

Day 8 (Wednesday): SICSS-Howard/Mathematica site specific activities included team or individual project work in the morning and afternoon, in addition to a lunchtime “real talk” roundtable discussion with five researchers from Mathematica, moderated by a Mathematica research associate and Howard alumna, and live Q&A/office hours with guest speaker Dr. Naomi Sugie. Both live events were follow-ons to pre-taped original content participants had access to. Attendance remained relatively high on the 8th day with 21/24 or 87.5% of all participants present. For office hours, 12 professional Mathematica teaching assistants were available via email and 2 groups of participants scheduled meetings with teaching assistants to discuss questions about web scraping via R. Attendance was not possible to gauge during week two since all appointments with TAs were scheduled by participants directly. Many would meet their TA in one of our break-out rooms. There were 16 email threads and we sent out 16 emails to check with participants who showed a preference for solo projects, check with teaching assistants, and send reminders.

Day 9 (Thursday): SICSS-Howard/Mathematica site specific activities included team or individual project work in the morning and afternoon, in addition to a bite-sized lunchtime talk with vary CSS and live Q&A/office hours with guest speaker Dr. Laura K. Nelson. Both live events were follow-ons to pre-taped original content participants had access to. Attendance dipped to its lowest level for the entire institute on the 9th day, with only 20/24 or 83.33% of all participants present. This number is still relatively high for a fully virtual event. For office hours, 18 professional Mathematica teaching assistants were available via email and 3 groups of participants scheduled meetings with teaching assistants to discuss questions about topic modeling, accessing APIs, and web scraping via R. Office hours attendance was not possible to gauge during week two since all appointments with TAs were scheduled by participants directly. Many would meet their TA in one of our break-out rooms. There were 19 email threads and we sent out 9 emails for funding questions, event reminders, and checking in with participants, speakers and teaching assistants.

Day 10 (Friday): On the final day of the institute, SICSS-Howard/Mathematica site specific activities included a private screening of Dr. Timnit Gebru’s Keynote Address before it was posted publicly. Dr. Gebru detailed with candor her journey from childhood through her education to her experience in AI research and industry and her co-founding of Black in AI. Group and individual project presentations also took place on Friday. We invited all the speakers, sponsors, TAs, and members of the organizing team to be in community with our participants for the presentations and close of the institute. Many of those invited did join (including Chris Bail!) and were engaged in asking questions and providing feedback. We provided 20 minutes total for each presentation, including 10 minutes of Q&A. Participants came well-prepared for their own presentations, and supported others during their presentations with comments and questions in the chat box. Attendance increased on the final day of the institute, with 22/24 or 91.67% of all participants present. For office hours, 13 professional Mathematica teaching assistants were available via email. Office hours attendance was not possible to gauge during week two since all appointments with TAs were scheduled by participants directly. Many would meet their TA in one of our break-out rooms. There were 25 email threads and we sent out 4 emails for logistics, event reminders, and checking in with participants for the final presentation.

Closing Plenary (Friday): Dr. Paul Decker’s Closing Plenary was a celebration of participants’ dedication and learning over the course of the institute. He emphasized how much he had learned from their final individual and group presentations and the important role that data plays for him as President and CEO of Mathematica. His engagement with participants included an extensive Q&A session. SICSS-Howard/Mathematica officially ended with Dr. Paul Decker’s closing plenary.

Post-departure

Alongside the SICSS-wide survey from SICSS-Princeton, we also sent out a final survey unique to the SICSS-Howard/Mathematica site and received responses from 17 of the 24 participants. A list of the questions asked (some combined for brevity) appears below:

Please rate the importance of the following components at SICSS Howard/Mathematica to your experience: Praxis to Power Day 1 and 2, Days 1-10, guest speakers, roundtable talks, Bite-Sized Lunchtime talks, lectures about CSS by Matt and Chris, open-source materials (takeaway sheets, videos, etc.), plenary speakers, keynote speaker, office hours with Mathematica TAs?

How useful a time investment was Praxis to Power for you? Please share any additional thoughts about how useful Praxis to Power, the pre-institute provided exclusively to SICSS-Howard/Mathematica participants, was for you.

On a scale of 1 to 10, how accurately does this (description of imposter syndrome) describe your experience?

How many of the guest speaker talks and live Q&A sessions, roundtables, opening and closing plenary, and keynote address did you attend? How important were they to your experience, and is there anything else you’d like to share?

Which of the innovations founded at SICSS Howard/Mathematica institute would you recommend we continue at future iterations?

What did you think about group projects and individual projects? Do you have any suggestions on improving these? Did you do a group project? Did you do an individual project?

From 1-10, how would you rate your experience with: the R programming language, the Python programming language, extracting/web scraping digital trace data, automated text analysis, survey methods, wrangling data, building a simple linear regression model, building a more complex predictive model?

What is your overall impression of this summer institute in general and the summer institute at Howard sponsored by Mathematica in particular?

A big part of the mission of the SICSS-Howard/Mathematica site is the diversification of the field of computational social science. In what ways might we update the teaching materials (bootcamp and pre-arrival lectures, daily institute exercises, etc.) to continue to increase diversity in our field?

If you had to nominate two people to serve as Marshalls to foster this community after SICSS Howard/Mathematica ends, who would that be?

Any additional comments?

Final Remarks

The first Summer Institute in Computational Social Science at a Historically Black College or University, co-sponsored by Howard University and Mathematica and conceptualized, founded, and organized by SICSS-Princeton ’19 alumna and Berkeley PhD Candidate Naniette H. Coleman, reached its conclusion after a final “real talk” group debrief where 22 participants danced a little, cried a little, laughed a lot, and shared what the experience meant to them. Once an actual mic was dropped, we knew it was time to say goodbye. Until next time…peace.


SICSS-Istanbul

SICSS-Istanbul started with a welcoming event on May 30 and ended with research presentations on June 25. Our program can be divided into three parts: Pre-SICSS preparations, research talks, and group work. In this post-mortem, we evaluate what was done from the application process to the end of the program. Most of the notes were taken during the event so that we would not forget practical details, and participants’ feedback is also included.

Application Process and Welcome Session

Following SICSS-Princeton, we announced the call for applications in early January and accepted applications until February 28. We sent the invitation and rejection emails on March 15 so that participants would have a chance to check their availability. In the invitation email, we briefly described what we would be doing in the virtual SICSS this year and introduced our Pre-SICSS preparation tasks so that participants could look at them beforehand. As all of the materials (including SICSS Bootcamp) are available online this year, we outlined them in the invitation email as well. In the rejection email, we mainly described which tutorials and materials the applicants should study to improve their computational skills.

Next year, we aim not to call for applications too early, because most of the applications were submitted in the week before our deadline. We think it would be practical to hold the application process in the first two weeks of April and announce admissions in early May. We also plan to hold a welcoming session just after admissions to go over the Pre-Arrival materials and introduce the Pre-SICSS tasks, giving our participants more time for those tasks. At the same time, we can keep our code-through sessions and office hours.

For advertising, we mainly used social media. We created a dedicated Twitter account for SICSS Istanbul: https://twitter.com/SICSS_Istanbul. This allows us to be more organized and keeps applicants well informed. We also sent emails to faculty at eminent Turkish universities. It seems that SICSS Istanbul has gained recognition in the Turkish CSS community. Thanks to our alumni, we can easily reach a broader community.

Pre-SICSS Tasks

This year, we revised our tasks and divided them into five. Every participant is expected to finish a whole data project, including data collection, data preprocessing, data analysis, and reporting. Related SICSS lectures and SICSS Bootcamp videos are assigned for each task. Our tasks covered the Day 2 (Collecting Digital Trace Data) and Day 3 (Automated Text Analysis) curriculum. Some of the tasks also required completing SICSS Bootcamp videos. The aim was to give our participants more time to complete the materials from Days 2 and 3 of the main SICSS curriculum. In this way, we ensured that participants had hands-on experience with most of the coding and were ready for the group projects.

Code Walkthrough Sessions

This year, we added code walkthrough sessions to our program. Every session lasted two hours, including a short break. The main idea is to code in real time with the participants and answer their immediate questions during the session. While participation was optional, most people followed the sessions regularly. We should continue to have them.

We can add short small-group exercises to the session. For example, we can show example code in the first hour of the session and leave the second hour for small-group exercises, which can simply be a replication of that code. Small groups can be created at random. We, the organizers, can visit the groups and help them when they get stuck. Finally, all groups should come back together to evaluate the exercise and close the session.

We need sessions of 2 hours and 15 minutes for that.

  • 1 Hour: Code Walkthrough
  • 15 minutes: Dividing into small groups and short break
  • 45 minutes: group exercise (replication of the same code)
  • 15 minutes: whole group discussion

Getting Twitter API credentials is still a huge problem for our participants, and we should think of possible solutions. Some people were rejected from the Academic API application and then got Student API credentials. Fortunately, almost half of the participants were able to get either Academic or Regular API access.

Slack Management and Office Hours

After the welcoming session, we started to use Slack instead of email as the main way of communication. However, some of the participants were confused by the flow of messages on Slack, as all discussions took place in one channel. Therefore, we created three channels: #istanbul, #istanbul_projects, and #istanbul_random.

  • In the #istanbul channel, the aim is to discuss our schedule, materials, organization, office hours, etc.
  • In the #istanbul_projects channel, the aim is to discuss research ideas, articles, etc.
  • In the #istanbul_random channel, the aim is to have fun and share random things that are related to neither research ideas nor our organization.

We organized office hours on request via our Slack workspace. We also scheduled free talk sessions before and after every session so that participants had a chance to ask random questions in a relatively friendly environment.

Networking and Group Work Sessions

For online networking sessions, we used the Wonder platform during the Pre-SICSS period. It was really great! Most of our participants agreed that it can be used as a replacement for Zoom breakout rooms and office hours in the future. For the group meetings, we switched to Gather, following advice from other sites. It is more useful than Wonder for group work. We could also use Gather instead of Zoom for lectures.

First Week

We had research talks and group discussions in the first week of “Real SICSS”. By that time, we had already spent two weeks on the Pre-SICSS tasks and had covered some of the SICSS curriculum. By completing the Pre-SICSS tasks, participants gained hands-on experience in collecting data from the web and processing it. We started with small group discussions on ethics.

Then, we hosted eminent scholars of computational social science to see not only state-of-the-art methods in the field but also applications of those methods in multiple disciplines. Some of the talks had preliminary readings, which we put on our schedule. We had three research talks in the first week: on big data for studying human migration, the micro-geography of ethnic enclaves, and characterizing the nature of online political conversations at scale. Geospatial analysis is new to our curriculum, and we aim to organize a workshop on it again next year.

We also had research dating sessions to establish groups for the second-week program. Unlike the previous year, we did not prepare a Google Sheet to match participants’ interests. Instead, we started the research dating process on the second day of the first week. Participants thus had four dating sessions dedicated to discussing potential research ideas and finding suitable group members. They were expected to report their group members and a short description of their group project at the end of the week. It went well.

Second Week

This week, we had guest talks and alumni sessions alongside the group work. For the first time, we hosted a leading figure in the Turkish tech industry as a guest speaker to see the other side of the coin. SICSS Festival talks inspired us to organize such a local talk at our site. We also hosted an eminent figure in Turkish Natural Language Processing to learn about state-of-the-art NLP applications in the social sciences. This talk was of great importance for us considering the shortage of NLP studies in non-English languages.

Following the tradition from last year, we hosted two alumni talks on ongoing research that started at last year’s SICSS. One alumnus presented his latest published article, and the other presented his latest conference paper, which started as a group project at SICSS Istanbul 2020.

Alongside the talks, groups worked on their projects. We had five groups this year. At the end of the week, they presented their preliminary findings to the public. They put all their outputs on a dedicated GitHub page so that future applicants can get a better grasp of the group projects. The link can be found on our schedule: https://sicss.io/2021/istanbul/schedule

New Model for in-Person Event next year

In brief, we should extend the self-paced online Pre-SICSS period and keep the same in-person SICSS period.

Applications and Evaluation Period:

  • March 28, call for applications via Twitter
  • April 11, close applications
  • April 25, sending acceptance letters

Pre-SICSS Period:

  • May 1, Welcoming event and beginning of Pre-SICSS period
  • If we declare the offers by the end of April, we can launch the online Pre-SICSS period with an online welcoming event. In this event, we should describe the tasks and learning materials. This should be a self-paced learning experience. Aside from SICSS-Princeton office hours, we can also put weekly code-through sessions and office hours.
  • We should put networking sessions just before and after the events so that project group formation may get momentum.
  • Gather is useful for both office hours and networking sessions.

Welcoming Dinner

On June 11, we can have a welcoming dinner to get to know each other well.

SICSS Period

  • Between June 12 and 24, we can organize an in-person event for just group work and workshops.
  • A typical day should begin at 9:00 and end at 18:00.
  • The first week would be full of tutorials that include hands-on experience. We are thinking of organizing tutorial sessions on Image-as-data, Geo-Spatial Analysis, Text Analysis, and Network Analysis considering the demand of our previous participants and current trends in the field.
  • One evening should be reserved for the social event. The whole SICSS Istanbul alumni can be invited to the social event.

SICSS-Law

1. Outreach and application process

Timeline:

29 January: we started the first promotional activities and reached out to the guest speakers.

22 March: original application deadline.

22 March: a second round of promotional activities.

29 March: extended deadline.

21 April: results sent to the applicants.

Numbers:

29 applications (10 before the first deadline, 19 after)

23 invited participants. 3 withdrew due to other events, 1 did not respond to the acceptance email.

19 participated in Week 1, 2 participants were present only part of the time, 1 had to withdraw due to health issues.

14 participated in Week 2 research projects.

Additional info:

We reached out to various academic networks through emails, Twitter, and Facebook groups. We paid particular attention to contacting universities and research centres in Eastern Europe to promote computational social science skills in the countries where these methods are still uncommon among researchers. We emailed various research groups interested in law, computational methods, social science and more.

We used a designated SICSS-Law Gmail account for all communications with participants and Google Forms for the application process. Only one applicant did not use Google products and we accepted their application via email. All information was stored in one place allowing for a smooth selection process. All applications were discussed with SICSS-Law organisers and teaching assistants.

We required the following documents: (i) a curriculum vitae, (ii) a statement (maximum three pages) describing both any current research and your interest in computational social science, (iii) one writing sample (no more than 35 pages). The CV and statement of interest gave a good indication of whether the person would fit the SICSS-Law profile. It was very helpful to have an organisation team that largely overlapped with the Maastricht 2020 team - their experience with last year’s participants served as an additional filter for the applications where the results were not clear-cut.

Things to improve and some afterthoughts:

Given the low response rate before the original deadline, the promotional activities could have been more targeted, including more follow-ups with individuals and organisations that did not respond to the first emails.

We wanted to emphasise the unique curriculum of SICSS-Law and the commitment required from participants at the early stage of accepting their offer. However, given the two passive participants and the applicant who withdrew their application, we believe that even further emphasis on the intensity of the program should be added to the website descriptions and acceptance emails. We sent an additional email to participants with no coding experience this year, but the phrasing could be improved to make it even clearer.

Google Forms worked well for the application process.

2. Pre-arrival and onboarding

Email communication:

We sent out three emails to participants: (1) an acceptance email asking them to confirm their intent to participate and emphasising the values of SICSS, (2) an on-boarding email with all the pre-arrival and summer institute details, and (3) an on-boarding follow-up email.

Content of the on-boarding message:

We asked participants to (1) complete the pre-arrival tasks, (2) share a short bio and profile photo (through a Google Form), (3) record and share a video flash talk presenting their project ideas for Week 2 (uploaded to YouTube as unlisted videos), and (4) join the Slack workspaces (SICSS and SICSS-Law).

18 participants submitted their bios. 12 participants submitted flash talk videos, which were shared and discussed during the lunch sessions in Week 1. All participants joined or were added to Slack.

Slack communication:

Slack was our main communication tool during the summer school. It worked really well for an online event. We created three channels for SICSS-Law: announcements, general, and speakers. In the announcements channel, we sent reminders about the start of sessions, changes in schedule and Zoom links. Participants were reminded to check the channel regularly.

Things to improve and some afterthoughts:

We could put more emphasis on ensuring that participants understand the level of preparation required for SICSS, especially if they have no coding experience. While the website and on-boarding message provide all the materials and explanations, there were still a few participants who had misaligned expectations of the event and pre-arrival tasks.

One idea is to send a separate email focused only on expectations for participant coding skills and/or provide a two-day crash course before the summer school to those with no experience. We think this would be even more effective in circumstances where SICSS is held in person.

3. Week 1

General Info:

We opened SICSS-Law with a Meet and Greet on Sunday, June 13 at 17.00 CET. After a brief introduction, participants were randomly assigned to breakout rooms of 3-4 people and had a chance to meet the organisers and other participants. We repeated the random allocation once more, and then people who wished to remain on the call after the allocated time stayed in the main room. The event created a friendly atmosphere and allowed us to skip long introduction rounds on the first day of the intensive programme.

Lectures and group exercises were scheduled Monday - Friday from 12.00 CET to 19.00 CET. Most participants were from Europe, but we also tried to accommodate participants from GMT-6 and GMT+8. This proved to be challenging, but would not be a problem at an in-person SICSS. We made an announcement in the Slack announcements channel 15 min before every session with the relevant Zoom link. Participants found this very useful.

Recurring activities:

The first four days followed the SICSS-Princeton 2021 curriculum. Each day started with a guest lecture (1 hour), which was followed by a small lunch session where flash talk videos were discussed (1 hour), a group activity (3 hours) and debriefing (1 hour). We scheduled breaks between the sessions; these were later extended from 15-20 min to 30 min to reduce Zoom fatigue. We repeatedly reminded participants to take breaks away from the screen. The Day 5 schedule was changed due to some challenges encountered during the group activity on Day 4.

Guest speakers gave presentations of 30-40 minutes, leaving 20-30 min for questions from the audience and discussion. We selected speakers who are themselves working on SICSS-Law related topics. Therefore, participants not only had the opportunity to hear presentations on Week 1 topics, but could also ask questions about the presenters’ research methods and experiences in academia. Participants reflected very positively on all the lectures and networking opportunities. Depending on speaker availability, the talks were scheduled as either the first or second session of the day. We noticed that it worked well to have the lunch session before the first guest speaker on Day 1, since not all participants were present at the Meet and Greet session on Sunday.

For the small lunch sessions, we created a playlist of 3-4 flash talk videos on YouTube, which participants had to watch in preparation for the next day. We created one breakout room per topic, and the presenter had to be present in their room. Other participants could move freely between the breakout rooms during the one-hour session. We aimed to have one member of our team in each breakout room to moderate the discussion if needed. This was introduced after the first day in reaction to feedback from participants. We also created a document where people could share resources on the topics discussed during these sessions. The idea was to discuss potential research ideas for Week 2 activities and learn about the research interests of the other participants. The groups for Week 2 started forming early in the week, and we felt that participants whose flash talks were scheduled later were somewhat unlucky and risked their topic not being chosen.

Afternoon sessions were dedicated to group activities (3 hours) and debriefing (1 hour). We changed the group compositions every day to ensure that participants had plenty of opportunities to work with others and that those with no coding experience could work in groups with more experienced researchers.

Special activities on each day:

Day 1 started with a 30-minute introduction and logistics session. We reiterated the goals and motivation behind SICSS, emphasising the importance of teamwork and community building activities. We gave a brief introduction to the ways computational social science can be applied to law, providing several examples to contextualise the skills participants would be learning throughout the week. The introductory session ended with an explanation of the Zoom and Slack setups. After the introduction, we continued with the lunch session and the first four flash talks. It was followed by an interactive presentation on research ethics by Teun J Dekker. For the Day 1 group activity, we used the Patreon case study by SICSS-Princeton. We allocated 1 hour for small group discussions and 30 min for whole group discussion. Participants were allocated into groups of 3-4 people. There was some overlap between the topics discussed in the lecture and group activity, and we later received feedback that a full hour for the case study was too long. The first day concluded with the keynote by Michael A. Livermore. The keynote was open to a wider audience and was advertised on Twitter; it had around 40 attendees.

On Days 2-4 we had the following schedule. We started with an invited talk (60 minutes), which participants seemed to like a lot. We continued with a lunchtime session (60 minutes). Afterwards, we had group activities (3 hours). In the evening we had a one-hour debriefing during which we discussed the group activities. Each group presented what they did and reflected upon the activity. This worked very well.

On Day 5 we intended to cover the topic of legal predictions, but due to the great enthusiasm for the self-led group projects, issues with data collection via Prolific on Day 4, and some unforeseen issues in designing the prediction task, we changed the structure of the last day. Instead, the Day 5 morning guest lecture was followed by a debriefing of Day 4 and then by the group formation activities originally planned for Day 6. For group formation, we created a shared Google file where participants could add a title and short description of their proposed project. Others could then indicate their interest by adding their name next to a project. We created breakout rooms based on the topics to discuss the possible teams and directions of the study. Three topics emerged among others and these formed the groups for Week 2 activities. Additionally, one participant opted to continue working on their own CSS related project during Week 2.

Additional info:

Most days relied on materials shared by Matt Salganik and Chris Bail. With 15-19 active participants, most group activities were in groups of 3-4 people. We allocated 3 hours for exercises each day. After a short introduction of the tasks, the groups were left to organise the time among themselves, including taking sufficient breaks. During the group activities, teaching assistants and organisers were present and visited the breakout rooms to provide support.

On Day 4, as in SICSS-Maastricht 2020, we used Prolific instead of MTurk. We reused the 2020 accounts and topped them up for data collection. Due to some changes in Prolific policy, data collection was very slow. Since all groups experienced this issue, we adapted the task so that groups completed the survey design and deployed it overnight. We skipped the debriefing on Day 4 as none of the groups had results to analyse yet. Despite the challenges, the participants really enjoyed this activity.

We had three team meetings during the week to discuss participant feedback and changes in schedule. Most communication took place on Slack.

Things to improve and some afterthoughts:

We shared a Keep-Start-Stop survey (Google Forms) at the end of every day. Overall, participants enjoyed the activities and the communication with their peers and the organisers. We took their feedback into account and made some changes throughout the week. In particular, we extended some breaks between the sessions to give participants enough time to leave their devices and rest their eyes.

Participants appreciated discussions on research ethics on Day 1. However, the overlap between the guest talk and group activity meant there was some repetition. One idea is to either add more questions to the case study or choose a guest who covers topics related but not overlapping with the afternoon questions. Day 1 was the longest day and we noticed that participants were less active in interacting with the keynote speaker.

Twitter API access requests took longer than initially anticipated, so not all participants had access to Twitter data. For future editions, we should send out instructions about access requests before the start of Week 1. Participants reported that there were a few inactive members who either did not engage in discussions or kept their cameras off. At the debriefing, we tried to emphasise again the importance of active participation and communication. We also encouraged participants to switch roles from day to day. For example, if a person had shared their screen and coded on Day 2, on Day 3 another person should take that role. We changed the group compositions every day, which should have helped with switching roles too. We were also encouraged to share more information about the speakers. Not all speakers had sent their bios before the summer school. For the following days, we added their university profile page and links to their newest or most relevant CSS articles.

Due to changes in Prolific policy, we did not manage to complete the tasks on Day 4. However, we adjusted and took over Day 5 sessions to finish analysing the data. We intend to save this year’s data set in case of any future failures. We need to rethink the Day 5 group activity to engage participants with very varied coding experience. We had planned to focus on legal case outcome predictions. Yet only very few of our participants had the more advanced coding experience that such an activity would require. It is something to keep in mind for next year.

Overall, we had a very enthusiastic team of organisers and TAs that kept up the mood of the summer school. The atmosphere was informal and collaborative, and there was a lot of enthusiasm for the group projects in Week 2.

4. Week 2

Schedule:

Since the main tasks of group formation were already carried out on Friday, Day 6 began with a session in which each project topic was allocated its own breakout room. Undecided participants could move between the rooms to see which topic they would like to work on. By the end of Day 6, we had four well-formed groups with a total of 14 participants. Group size varied from 1 to 7 participants. We asked the groups to fill in a simple proposal form by 17.00 on Monday, indicating whether any financial support would be needed. None of the groups requested any funding.

We held office hours in the afternoons of Days 6-9. Participants could sign up for these in a shared Google document. On Days 7-9, we also continued the small lunch tradition: participants could propose topics they wished to discuss. We had 1-2 breakout rooms in most sessions.

We hosted a panel discussion on Day 8, which was open to the SICSS Festival participants.

On Day 10, we had four group project presentations of 45 min each, with a 15 min break between sessions. Participants actively engaged with the other groups’ projects. We finished the summer school with short closing remarks and a discussion of the next steps for the SICSS-Law group.

Things to improve and some afterthoughts:

During Week 2, some of the participants had to return to their daily commitments and could only partially participate in the group tasks. The groups tried to accommodate that.

It would have been a good idea to advertise the project presentations more widely, in particular among our Maastricht Law and Tech Lab community and SICSS-Law alumni, since the groups worked on very interesting topics.

The panel discussion lasted 2 hours. With three panelists and a general topic of career directions in SICSS-Law, this could have been a shorter session.

5. Logistics

Zoom: we had three Zoom links in total. One for all the invited talks, one for the lunch sessions and group activities, and one for the keynote and panel discussion, which were opened to the public. We originally thought that the guest talks would be attended by some of the faculty members, but in the end these were only attended by SICSS-Law participants and we could have used a single Zoom link. The lunch and group activities Zoom was set to enable participants to move between the breakout rooms freely and to share their screens when needed.

Slack: we created three channels: SICSS-Law general, SICSS-Law announcements, and SICSS-Law speakers. We posted reminders about the sessions with Zoom links in the announcements channel to help participants navigate through the sessions. The speakers channel remained largely unused.

We discussed and created a SICSS-Law GitHub page to gather the materials and code created during the programme.


SICSS-Lisbon

The SICSS-Lisbon program is a partner location of SICSS 2021 in Lisbon, Portugal. It was organized by Qiwei Han and Filipa Reis and co-hosted by two leading Portuguese business schools, Nova School of Business and Economics (NOVA SBE) and Catolica-Lisbon School of Business and Economics (CLSBE). Due to the COVID-19 risk, the program was held remotely from June 14th to June 25th. The focus of the Lisbon program is to nurture a community of computational social science (CSS) researchers in Iberia and neighboring areas, equipping them with essential skills for CSS research.

In this post-mortem, we have described our experiences in the following sections: 1) advertising and application process; 2) logistics issues and onboarding process; 3) main programs; 4) group projects; 5) extra activities and post-departure.

1. Advertising and application process

As this was the first-ever SICSS program held in Iberia (Portugal and Spain), promoting the program in the area was the top priority in order to attract strong participants interested in CSS research. At the beginning of the year, we received support from the host schools to help advertise the program to local Portuguese schools. Meanwhile, we also reached out to other schools through the organizers’ professional networks in Europe. Although many participants later said they applied because “SICSS is quite well-known”, we realized that this is not the case in Iberia, where SICSS is much less known than in other regions. Thus, we decided to extend the application deadline to April 7th and, at the same time, to circulate the message through social media and email lists.

All applications were managed through a customized Google Form and reviewed by both organizers, and decisions were sent on April 21st. In total, we received 25 applications and admitted 17 of them. Those who were not admitted had mainly submitted insufficient application material or did not meet the requirements (we were looking for participants currently enrolled in a Ph.D. program, postdocs, or pre-tenure faculty). However, 3 of the admitted applicants decided to withdraw for personal reasons, which was quite common this year, when travel and other restrictions constrained people’s schedules. As a result, we formally welcomed 14 participants with very diverse backgrounds. They represented 12 nationalities from the U.S., Europe, the Middle East, China, and Africa; 14 universities; and 5 major disciplines: Communications, Sociology, Business, Economics, and Political Science. It is also worth noting that 7 (50%) of the participants were female, giving us a gender-balanced group.

All program days were scheduled to start at 9 am Western European Summer Time (WEST), which potentially posed challenges for the several participants based in the U.S., although the majority of participants were in various European countries. Nevertheless, our participants managed to overcome the time zone barrier and take part in the full program.

2. Logistics issues and onboarding process

We recruited two TAs to help with logistical issues and the onboarding process. Both TAs had prior teaching experience in data science courses and provided good help throughout the program. All participants were notified of the pre-arrival preparations right after they were accepted. First, they were required to submit a bio and profile photo to be added to the SICSS website. Second, they were given instructions on how to join the Slack channel #SICSS-Lisbon and how to follow the instructional materials on the SICSS website. According to the application materials, all participants had some programming experience, while a few had good R knowledge. For those who needed to practice R programming, our TAs set up office hours to work with participants on individual concerns. In particular, participants reported that the instructional material provided by SICSS was very helpful and allowed them to get a head start on the program.

We also hosted a welcome session the day before the program, and all participants were able to use Zoom, which was the main communication tool for group work and guest talks. They practiced being assigned to different breakout groups and returning to the main room after the session ended. Overall, the pre-arrival process was quite smooth, and everyone was ready to launch the main program after months of preparation.

We built on the experience of SICSS-Duke 2020, as one organizer had been a participant there. We created a separate Zoom link for each day of the event, covering the morning and afternoon sessions, guest talks, and virtual social activities. The feedback forms and daily notes were created as Google Docs and collected in a shared Google spreadsheet that all participants could access.

We managed to invite two computational social science researchers to give guest talks through the organizers’ professional network. Prof. Joana Sa from the University of Lisbon spoke about “Digital Epidemiology”, and Dr. Moinul Zaber from United Nations University spoke about the “prospect of CSS for governance in the developing world”. Furthermore, we benefited greatly from the SICSS community and received support from SICSS-Oxford and SICSS-London to hold joint invited talks by Prof. David Lazer from Northeastern University and Prof. Eszter Hargittai from the University of Zurich.

3. Main program

As the SICSS program has universally adopted the “flipped classroom” model since 2020, we followed the main program at Princeton with the same teaching schedule. Participants were required to watch the program videos before each day and then worked together on the group exercises. Organizers and TAs were available to help with any questions.

Week 1

Given the well-established program schedule of the Princeton site, we generally had detailed instructions for each day’s learning materials and exercises. On the first two days, participants were randomly assigned to groups for the exercises, so that they could get to know each other more easily in the virtual setting. Then we switched to a skill-matching mechanism: we asked participants about their confidence in the subject and assigned them to groups with a balanced level of competence. This worked pretty well, as participants were more interested in getting to know each other at the beginning and later tended to rely on group wisdom to finish the exercises.
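
The post-mortem does not say how the balanced groups were actually computed. As a minimal sketch of one way to do it, the Python snippet below assumes each participant reports a single confidence score and uses a "snake draft" (a swapped-in technique, not necessarily the organizers' method) to spread stronger and weaker participants across groups. The function name, the 1-5 scale, and the participant labels are all hypothetical.

```python
from typing import Dict, List


def balanced_groups(confidence: Dict[str, int], n_groups: int) -> List[List[str]]:
    """Split participants into groups with similar total confidence (snake draft)."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    # Rank from most to least confident; break ties by name for reproducibility.
    ranked = sorted(confidence, key=lambda name: (-confidence[name], name))
    groups: List[List[str]] = [[] for _ in range(n_groups)]
    for i, name in enumerate(ranked):
        round_no, pos = divmod(i, n_groups)
        # Reverse the picking order on every other round so that no single
        # group collects all of the most confident participants.
        idx = pos if round_no % 2 == 0 else n_groups - 1 - pos
        groups[idx].append(name)
    return groups


# Hypothetical self-reported confidence scores (1 = low, 5 = high).
scores = {"A": 5, "B": 4, "C": 4, "D": 3, "E": 2, "F": 1}
groups = balanced_groups(scores, 2)
totals = [sum(scores[name] for name in group) for group in groups]
```

With these made-up scores, the two groups end up with confidence totals that differ by one point. In practice the scores would come from whatever form the organizers used, and role assignment (notetaker, coder, discussion leader) would still be handled within each group.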

This setup was not without limitations. As we monitored the groups’ progress, we sometimes still saw a few participants leading much of the discussion and the implementation. We tried to remedy this by asking each team to have at least a notetaker, a coder, and a discussion leader. This helped in many of the group sessions, but participants later told us that they would have preferred to work more with people they were already familiar with. We also tried holding some sessions in which all participants could discuss their experience of the day.

We also encouraged all participants to propose special topics of interest to discuss during the lunch break. We ended up with at least one lunch topic per day, with participants talking about their experience with academic job hunting, teaching themselves to code, research using unstructured data, etc. Although the lunch discussions were designed to be optional, they turned out to be very engaging, and a majority of participants actually had lunch together (except for those in distant time zones).

On most days we ran the daily activities independently. On the last day of the first week, the Princeton site supported us by giving us access to the Fragile Families Challenge. All participants and staff received data access and were able to share their results on the leaderboard hosted by Princeton. Again, this created a great sense of community, as our participants could also see the results from other partner locations. Overall, participants gave positive feedback on the organization, materials, and group exercises.

Week 2

Following the Princeton schedule, the participants mainly worked on group research projects during the second week. We created a spreadsheet for participants to express their interests, and participants then discussed ideas further in groups with the most similar and the most dissimilar interests. In total, 3 topics were initiated by the participants, and 2 of them ended with a research presentation of preliminary findings. Admittedly, it was not easy to track the groups’ progress on Zoom. According to our TAs, the participants were in the Zoom session most of the time while working on the group projects, which showed their dedication to the program.

The group projects were presented and discussed on the Friday afternoon of the second week. One project analyzed public reactions to the enactment of Juneteenth as a federal holiday, using #Juneteenth on Twitter. The other examined the role of emojis in the spread of information on Twitter. Both projects used fresh data collected through the Twitter API while staying focused on social science research questions. Participants enjoyed the presentations as well as the discussions afterward.

Meanwhile, all participants were encouraged to attend the activities organized by the SICSS Festival. They reported very positive experiences and viewed the SICSS Festival as a good supplement to the main program in the first week.

Thanks to the 2020 experience of virtual collaboration, this year the technology allowed us to bring people together across continents in a well-structured way. Participants were able to use Slack, Zoom, and Google’s office tools to collaborate on each day of the event. We also provided daily feedback forms to collect their opinions. Still, the COVID-19 situation created obstacles that kept participants from fully enjoying the program. As we know, great research ideas often come from informal interactions, small talk, and physical proximity. Hopefully, this will improve as vaccinations roll out and the world reopens.

The experience of the organizers and staff was also quite positive. This was the first time we had run the program, and there were many issues that we did not foresee. For example, the administrative process for getting approval from the university in Portugal was quite slow, which delayed the organization. This would no longer be the case if we organized the program again, because it was a one-time problem. Also, the CSS community in Portugal and Spain is not as strong as in many other parts of the world. Therefore, we aim to do a better job of promotion next year so that more people are aware of the program and apply. We like our participants a lot, but it would be great if we could have more of them, to create a dynamic and diverse community.

4. Post-departure

The participants have received their swag from the SICSS program. Several of them also asked about a participation certificate. We consulted with SICSS, and it does not seem to offer one, so we decided to ask the school to issue the certificate instead. I would suggest that SICSS consider issuing a program-wide participation certificate, as many European universities require one in some form to justify the time spent on the program. Next time, we would also like to invite more organizers, preferably from other countries such as Spain, to help us establish the program in Iberia.


SICSS-London

General Comments

I think a lot of this post-mortem will refer to things related to operating in a virtual world, but they also offer some more general lessons for SICSS planning. Some aspects may also have to do with my (Joshua) transition from the USA to the UK, and it’s hard to tell the difference. For example: following the standard model, we scheduled lunch as “optional co-working sessions” but nearly everybody logged off during lunch. I’m not sure if this has to do with the less frenetic work norms of Europe or the online experience. In the Zoom world, I think this is ultimately a good thing: we need a break from screens, and hanging out during lunch online is not the same as hanging out during lunch in person. The former can be exhausting while the latter is refreshing.

We had 5 faculty for this event. And although we didn’t really discuss expectations for who would attend which event (beyond the days we were running) the faculty were largely very present during the week during each others’ activities and discussions, which was fantastic. I (Joshua) am so grateful to the other faculty for their presence, especially since we didn’t discuss it, as I think this diverse set of voices really helped to enrich the participant experience. Not to mention, making it more fun for me as a faculty member.

We forgot to set up daily feedback surveys. Though I noticed this mid-week, I consciously chose not to correct this, perhaps out of laziness, perhaps out of a desire not to switch tack mid-way. I personally don’t value these very much, but perhaps I should find more value in them.

On teaching assistants: we hired 2 excellent teaching assistants who provided great help during some select sections of the program. One of them had expertise in virtual lab experiments and contributed heavily to Day 5; one of them had expertise in working with digital trace data and contributed heavily to Day 2. Both of them not only supported learning during the day but also helped prep materials. However, we didn’t really think through what they would be doing the rest of the time! Thus, I think they were rather underworked. It was very nice to have them around, but the virtual format meant there were far fewer logistics to be handled during the day.

We did not plan (and thus didn’t have) any social media presence. This would have been a great job for a TA.

For the days where we produced custom material, we ended up leaning a bit more heavily on lecture than the format modeled by SICSS-Princeton/Duke for the virtual sessions. I think in future sessions, we could make a more intentional effort to offload that content into the async material.

Pre-SICSS Faculty Planning

We initially envisioned a joint Oxford/London seminar series. While we did carry this out in fact, in the end it was really just sharing speakers at the same time. I don’t think we really clearly established goals for this program, and we didn’t build on it to actually forge connections between the participants at the two sites. In the virtual space, it certainly was very nice psychologically (I think) to have a sense of the shared experience, but it didn’t extend to anything practical. Thus, the major step to improvement would be more thorough planning—why we’re doing it, what we want to get out of it. Admittedly, a major goal (which was successful) for myself (Joshua) was simply a matter of branding during the recruiting by making the program sound exciting and full.

Regarding faculty. I (Joshua) initiated the planning of this program while living in Chicago, and so I reached out to the handful of computational people and SICSS alumni I knew in London to put together a dream team. It was indeed a dream team and I think this worked very successfully. And, I think we could have done even better. A major shortcoming on my own part was that I didn’t take the time to make sure the non-alumni faculty really understood what SICSS was or how it would be run. I applaud their willingness to dive in for sheer love of CSS, and I could have done a much better job about managing expectations.

Along these lines, another thing we could have done was to more explicitly determine roles and responsibilities. Ultimately, this all worked out well, but I don’t think we fully tapped the potential energy and expertise of all the faculty. For an in-person event, I think this kind of coordination would be even more crucial.

I think we missed an opportunity to connect more explicitly with other SICSS events in our time zone. Given the virtual format, we could have had some kind of European SICSS kickoff event.

Mike’s notes: Joshua makes great points throughout. I agree that integrating some talks with the Oxford group was a plus for the students, and in future it would be nice to do more, if they were willing! However, it seemed Oxford was trying to create a more independent set of materials, and was also trying to have a more global group, while we tried to be more focused on people close(ish) to London. Still, the small points of collaboration worked well, and I share Joshua’s optimism about doing more with other European groups in future. I also agree that the planning with the other faculty could have set clearer expectations about the goals of the group. In my view, the best versions of faculty participation were centered around teaching and in-class group exercises, whereas others felt more like “sit and listen” research seminars.

In the run-up, I think the co-operation of the faculty team also came in handy in lots of places - for example, in dividing responsibilities for funding TAs, sifting through applications and making tough choices at the margins, leading first-week activities, and so on. Also, I was incredibly impressed with Joshua’s leadership throughout the process, and he made sure there were no gaps throughout the two weeks. However, I think there could have been better ways to use the faculty team together. Our contributions tended to be a bit cloistered, without any integration. On some days where I wasn’t the assigned leader, I was essentially an observer (and perhaps vice versa). But I think this was mostly due to the virtual nature of the conference, and I felt more in touch with the folks who were also more involved in the planning. Had we been in person, and planning more explicitly, it perhaps would have felt easier to work together.

First Week

MONDAY

No notes for Monday. We followed the SICSS-Princeton/Duke schedule, and it was great.

TUESDAY

For this day, we followed a custom lesson plan by Nicola. This material included async lectures provided ahead of time and custom activities to replace the standard digital trace data curriculum. My (Joshua’s) main note is that it felt a bit unstructured, and could have benefitted from step-by-step instructions as modelled by the Princeton/Duke material. This again comes back to the notes above about failing to discuss expectations early on about how SICSS works and what ideal material looks like. Similarly, we would have benefitted from more explicit scheduling of breaks and transitions.

I wrote down this note during the day: Nicola did an amazing job of staying on time.

WEDNESDAY

For this day, we followed a custom lesson plan by Mike. My (Joshua’s) primary note on this is: offer the schedule in advance, with a bit more material. Breaking the code and activity into more discrete steps/chunks I think could be helpful for participants following along. Again, this comes back to the need to discuss these things early on. Much of this was due to leaving discussions about curriculum to relatively late in the planning stages.

Also noted: having pre-fabbed code was very effective for this day.

Mike’s notes: I agree the lead-up to my day was not fully spelled out to the other group members. However, I myself was quite prepped (in secret, haha) as I was mainly using slides I had created for other NLP workshops I’ve been conducting for other PhD programs. Instead I encouraged the students to prep by following the existing SICSS pre-recorded lectures, which I think were somewhat useful. In future, I might lean more on my own content (including coding examples) as part of the prep for the day. To take one example, text networks were interesting to think about, but probably didn’t stick for students without some practice or clear guide on when it’s applicable.

The morning was focused on teaching slides that were mirrored in pre-written code. The idea was that everything I discussed in class could be implemented directly by students as we went along, and we could stop and ask questions along the way to discuss how the code matches the concepts. I think this was a success, especially given the range of coding expertise in the group. Maybe half the students were ready to just go do their own thing (including some in Python), but were still active in the discussion. For the rest, some were still feeling out scientific code, or were new to text analysis. So in their projects they could basically copy/paste the code I wrote and modify things along the way. I spent more time with these students, and it seemed like they made good progress, though it would be unreasonable to expect they’d totally catch up with the rest.

The afternoon was modeled on the usual SICSS set-up, handing over the time to breakout group projects. I think this was very successful - I gave the students a sample dataset (US restaurant reviews from Yelp) and they came up with a lot of innovative ideas around CSS, looking at effects of gender, economic conditions, political lean, etc. Students also got to spend more time with one another. One thing that worked was having students divide up by preferred coding language - we had a Python team, with the rest in R (including some people who knew Python better and just wanted more R practice). This meant teams could focus more on the collaboration.

In general, I think the main concern here was lack of time - many had ambitions for their projects that exceeded the available time. Again, what I’d do differently is to have students interact/engage with my code base before the day, so they’re ready to hit the ground running. There’s really no value to the “surprise” factor of revealing the dataset that day - it’s not a game show! I was hesitant to give out assignments like I normally do in class, but in retrospect the students all wanted to learn and that would have made their time together even more rewarding. The presentations at the end of the day were also a success. The groups enjoyed sharing their ideas, discussing what worked, what didn’t - again this framework is copied directly from the usual SICSS pattern and I didn’t want to mess with a good thing.

THURSDAY

For this day, we followed the lesson plan provided by the Princeton/Duke site. Overall it worked well. While carrying it out, however, it became apparent that we should have adapted the exercise to the UK. One issue is that the content and demographics could be customized to make the exercise details more relevant. Another is that we want to ensure GDPR compliance.

FRIDAY

For this day, we followed a custom lesson plan by myself (Joshua). We started with an opening discussion on virtual labs led by Milena. This discussion was very lively, and I think really engaged both the faculty and participants. And, it took up a lot of time. I think a main lesson here is that we need more time during the week as a whole for discussion—a LOT of the time was taken up by activities. Because people largely parted ways when we didn’t have scheduled activities, it meant there wasn’t much ‘organic’ discussion so we need to build it in. People are hungry for discussion.

The bulk of the day was spent working through a custom built tutorial for the virtual lab tool Empirica (now part of the standard Empirica documentation). The main issue was that the breakout groups were a little chaotic, I think because it’s kind of difficult to work on a tutorial together. We did briefly discuss group/pair coding strategies, but then ultimately let people decide for themselves how to work. Some worked independently while others worked in groups. This was very successful overall, and while not everybody finished the tutorial, one group did actually make a custom “lemonade stand” (supply/demand market game) app rather than just strictly following the tutorial. One major downside of this approach is that people were largely on their own for the day, except the occasional questions as people needed help.

Second Week

Not many comments here. We had some guest speakers, and students worked on their projects. They presented their projects on Friday of week 2. The main area for improvement I think would be giving clearer expectations about the presentations: this seemed to stress participants out at times, though we intended it to be a very informal and fun thing.

Group Projects / Post SICSS

We were lucky to have £5k in funding for group projects. 1 group submitted a request during the week for some LIWC access, and 3 groups submitted larger (~£1k) post-SICSS projects.

My main comment here is that we didn’t give any guidelines for what to include in the proposals, so they ranged in their level of detail and formality. This led to one group submitting a budget for “500-ish”.

We also didn’t really have a plan or process in place for either quick during-the-week funding decisions or post-SICSS funding decisions. We were able to reach quick consensus via email, mainly because we were under budget. But if we’d had to make some tough decisions, we would have found it more difficult without a formal process in place. For future years, I think we probably want to plan something ahead, e.g. a voting procedure and a date on which to review proposals.


SICSS-Los Angeles

This is the post-mortem for the SICSS 2021 partner site hosted by UCLA. The institute took place from June 28 to July 2. We provide a brief summary of the different stages of the institute as well as our impression about what went well and what we would try to improve in the future.

1. Outreach, Application and Planning

Compared to SICSS-UCLA 2020, we cast an even wider net (e.g., to more departments and centers across more universities in Southern CA). We organized SICSS-UCLA 2019, and so we already had a list of where we reached out to in 2019, but we added to this list to ensure our list was complete.

Our application consisted of a Google Form, which we copied and improved from SICSS-UCLA 2019 and SICSS-UCLA 2020 and is very similar to the main site’s form. We intentionally did not ask for letters of recommendation this year or in prior years, to reduce the overhead for applicants. Compared to SICSS-UCLA 2019 and 2020, we also tried to convey even clearer expectations for the statement of purpose. We were also clear about the online format, time commitment, and the centrality of group project time to SICSS 2021. We felt that Google Forms was a smooth platform to use for applications in all years. Much like last year, our website stated that we were going to focus on causal inference and machine learning in our site-specific teaching, so applicants could decide to which site to apply based on their interest and each site’s particular emphasis.

Since SICSS-UCLA would be online, we decided to limit our cohort to a small number to preserve the community feel of SICSS (we received around 80 applications and accepted about 34% of applicants, and correctly expected some attrition). Our participants were largely in the same time zone, and we intentionally tried to accept a cohort in Southern CA with the hopes that collaborations would be more likely to continue after SICSS-UCLA.

2. Pre-arrival and Onboarding

Our onboarding felt far more organized than prior years and actually went very smoothly. This is because we have had the experience of prior years. For example, we could anticipate the on-boarding tasks better and were able to consolidate information to fewer emails, and we had a good sense of all the preparation tasks on our end.

We wanted participants to be able to get to know each other before the institute began, especially since it was online. This year, we asked participants to introduce themselves on Slack when they joined, and we invited them to Slack well in advance of the institute so that they had plenty of time to get to know each other if they wanted and to crowd-source R questions for those learning R. Last year we didn’t have a good system to ensure students would learn R before the institute began. To address that, this year we asked students to either post on our Slack or email us with R code (either to show us previous code they had written alone, or to show us a visualization + code from an R tutorial we suggested). This worked out well, and we felt like students were better prepared with R than in prior years.

One aspect that did not go smoothly was inviting participants to the overall SICSS Slack (we also had our own SICSS-UCLA Slack). Several students could not get access, and we are still fixing this.

3. Week 1

We spent a lot of time thinking about ideal ways to schedule SICSS-UCLA 2021 based on our experience with SICSS-UCLA 2020 and 2019. In contrast with prior years, we put a very strong emphasis on group projects this year. Our rationale was that we wanted to avoid too much lecture time via Zoom, and we also wanted participants to feel more invested and connected, and to come out with a concrete research product. We were clear in our application that the program would be focused on collaboration, since we realized from prior years that this format is not for everyone.

To ensure that participants could get started with group projects as soon as possible during the program, we asked them to post their ideas, comment on others’ ideas, and brainstorm on Slack before the institute began. We did speed dating for group project ideas for the first three days and asked each group to Slack a blurb of their project by the end of the third day.

Monday through Friday, we had a mix of lectures about the topic of the day (led by an organizer who specializes in that topic), a small group activity in breakout rooms, and guest lectures. In light of our own interests, we stuck to the lectures and activities for digital trace data and text analysis, but supplemented the lectures on text analysis with a broader focus on word embeddings and other methods. Like last year, we also spent several days using our own activities and lectures on supervised ML and causal inference.

In general, we tried to stick more closely to the timing on our schedule this year compared to prior years, to keep the program feeling more organized. However, we found that many participants really wanted to learn text analysis early on for their group projects, and we had initially scheduled it for the Monday of the second week. We adapted by switching the schedule around and moving some aspects of text analysis to early in the first week.

Another adaptation this year was that, rather than devoting a day specifically to ethics, we tried to incorporate discussions of ethics throughout the other topics (especially digital trace data, text analysis, and descriptions of word embeddings). This format worked well, but we could have made sure to incorporate ethics discussions throughout the remaining topics as well. In addition to guest speakers, in the first week we also hosted a panel with folks from industry to discuss career perspectives for computational social scientists outside academia. We thought this panel went very well, and participants were very engaged.

In contrast to last year, when we included some pre-recorded lectures, we did lectures live and specifically for our site, since we thought this would foster a more interactive learning environment and let us leverage the organizers’ own knowledge of computational social science topics. We also added short “skills workshops” during the first and second weeks, covering important topics that did not fit the lecture format well, such as Git/GitHub, Bash, and regular expressions. They were very well received by the participants.

Like last year, we found that it was more difficult to gauge participants’ engagement and level of understanding over Zoom, especially because many had their cameras turned off. It was often unclear whether participants were quiet because the material was too simple, too advanced, or for other reasons. The online format also made instruction difficult because we could not have side conversations, which are often very useful for explaining concepts in different ways, asking questions in a more informal format, or going in depth into an interesting tangent. To try to make the experience more interactive, we held virtual office hours, asked questions to everyone, joined breakout rooms to check in, had mandatory group project check-ins with an organizer, were active on Slack, and mixed breakout rooms with full group sessions. It was better than SICSS-UCLA 2020, but still not the same as in person. Overall, we thought this structure for the first week went well.

We found it very difficult to manage recording lectures for future use. In some cases, the material being presented was part of ongoing research and not ready to be posted online yet, or we would forget to start recording until partway through. The lectures that we did record turned out to be unwieldy files. We have also struggled with this in the past, and in hindsight, we should have thought more carefully about how to do this. However, for every lecture we shared our slides with the participants beforehand, and we have all of them archived.

4. Week 2

During the second week, we covered additional aspects of text analysis and machine learning, and left ample time for group projects. We again included a mix of lectures, activities, guest lectures, and discussion. We also had alumni from SICSS-UW 2018 give a talk about their SICSS group project, which they recently presented at a computer science conference. We thought that this was a great opportunity for our participants to ask fellow alumni questions about collaboration, and also to see an example of collaboration post-SICSS. The project involved several topics we had covered in previous days (e.g., topic modeling, dictionary methods, APIs, and ethics), so it also wove topics together well.

Unlike last year (and like SICSS 2019), we required flash talks for group presentations at the end of the two weeks, since we felt that this offered closure and a culmination to the two weeks, and participants had far more time for group projects than last year. We were really blown away by the presentations. It was amazing to see how much participants learned and accomplished together in their groups over the two weeks.

Finally, we created and sent out a feedback survey after SICSS-UCLA but have had trouble getting over a 50% response rate so far. We imagine that everyone (like us) needed some time after the institute to decompress, and perhaps it would have been better to send it out during the second week or well after the end of the program. Feedback was largely positive, but a few participants felt there was not enough time to meet people outside of the explicit project speed-dating.


SICSS-Montreal

The 2021 Montreal Computational Social Sciences Summer School took place from June 7 to 25. As in 2020, it was hosted entirely online due to the Covid-19 pandemic. In addition, the workshop was extended by one week. SICSS-Montreal is organized by Professor Vissého Adjiwanou of the Department of Sociology at the University of Quebec at Montreal (UQAM), who is also an adjunct professor at the Département de Démographie at the Université de Montréal. Dr. Adjiwanou is also the chairman of the scientific panel on computational social sciences of the Union for African Population Studies (UAPS).

We have been coordinating with SICSS-Princeton and have received funding from the Russell Sage and Alfred P. Sloan foundations as well as the Institute for Data Valorization (IVADO). SICSS-Montreal 2021 was the only institute in the SICSS network (https://sicss.io/) to be conducted in French. Thus, it enabled the SICSS network to reach an audience that was previously underrepresented.

This year, we received around sixty applications and selected around thirty. The participants were French-speaking or bilingual (English and French). Additionally, we admitted a few participants who spoke only English and placed them in bilingual groups. In doing so, this summer school could accomplish its three goals:

  1. Bring together researchers in social science with those in computational sciences.
  2. Identify topics relevant to both French-speaking and English-speaking contexts.
  3. Connect experienced researchers with junior researchers.

Recruiting and publicizing procedures

As before, we sent out a call for applications to the social science and computer science departments at the top Quebec universities (UQAM, UdeM, Sherbrooke, and McGill). We also submitted the posters and announcements to some scientific societies (e.g., IUSSP and UEPA) to make their members aware of us. To reach more French-speaking African students who study in the best schools on the continent, we further sent the announcements to various program directors of these schools. After receiving a favorable response from the Demography Training Institute (IFORD), ten places were reserved for its master’s and doctoral students. We continued discussions throughout the selection process and eventually established an academic association on computational social sciences within IFORD. We wrote a grant application to IVADO to support the students from this school as well. IVADO generously took the participation of these students into account in its award. We are currently working on setting up the association at this school to disseminate SICSS-Montreal training courses on the continent.

In addition to recruiting participants, we also contacted several guest speakers. In the end, we received a positive response from five researchers, all of whom were scheduled for the second week of SICSS.

First week

This summer school was conducted over three weeks instead of two, as it had been in the past. The purpose of adding another week was to give the participants a sufficient period of instruction and make them more comfortable working with R. This was intended to prepare them adequately for the project-oriented week, in which they would conduct research using R. As such, the first week became a comprehensive crash course, taught by Professor Adjiwanou with the help of three teaching assistants. The program took place over five days, from 9 a.m. to 1 p.m., was open to all participants, and covered the following topics:

  • Day 1: Introduction to R
  • Day 2: Data manipulation with R
  • Day 3: Data visualization with R
  • Day 4: Modeling with R
  • Day 5: Handling character strings with R

Participants were expected to be able to work with R by the second week of training. Each course concluded with exercises that trainees had to complete by the next day. They were also encouraged to discuss the exercises among themselves to promote an interactive atmosphere and peer support. Every morning, we answered participants’ questions, helped them resolve the problems they had run into, created pull requests for code repositories, and collected comments. These courses, which were originally designed for novices in R, were eventually followed by all participants, leading to a great exchange of information and experience. Although all the sessions were online, the participation and enthusiasm of the trainees were exceptional. The last introductory seminar on string manipulation proved to be a helpful introduction to what was to come in week two. All of our materials are available free of charge at: https://github.com/visseho/Cours_SICSS_Montreal/tree/main/Semaine1_Formation_Intensive_R

Second week

In the second week, we began our actual training on new social science methods that build on advances in computer science, particularly in data collection. Classes ran longer this week, starting at 9:30 a.m. and ending around 5:00 p.m. Usually, after the morning lectures and presentations, teams of three to five participants worked together in the afternoon. Even though staying focused online throughout the day was difficult, we preferred this schedule because of its success in the previous year.

The sessions given during the second week were as follows:

  • Day 6: Digital data collection: Several techniques were taught to the participants, ranging from web scraping to the use of APIs.
  • Day 7: Textual data analysis: We taught these courses directly to the participants, followed by team exercises.
  • Day 8: Deep learning: In the end, this training could not be given.
  • Day 9: Sampling: In the morning, the participants watched Matthew Salganik’s online videos together, and I added my own commentary to them. They then formed teams to do the exercises.
  • Day 10: Ethics: The same approach as on Day 9 was adopted for the course on ethics.

Furthermore, during this week, we held the lectures delivered by our invited speakers. The presentations, from 3:30 p.m. to 5:00 p.m., were particularly well received by the participants. Videos of the presentations will soon be made available to the general public.

Third and last week

In the last week, the participants were divided into groups according to their research interests to further develop and hone their acquired skills. As we told the trainees, their new research projects germinate at SICSS-Montreal, but they can continue to grow and branch off well beyond the summer school. And indeed, we witnessed several projects evolving during the last week and beyond, as trainees stayed in touch to collaborate on their projects. This year, the possibility of applying for a grant was an additional source of motivation for continued research and development. At the end of the third week, three groups presented their research work. The first team worked on the influence of the media on health measures taken in Canada against Covid-19 by comparing the provinces of Quebec and Ontario. The second group compared the discourse on vaccination in Quebec and France. Finally, the third group worked on Facebook posts about Covid-19. As we can see, all the research work turned to crucial questions about the current pandemic. Most importantly, these research projects enabled participants to deepen their understanding of the course material and establish solid research relationships among themselves.

Lessons learned

This second edition of the summer school was particularly successful on several fronts. The courses taught were very well received, demonstrating the importance of programming skills in social science research. In addition, our dropout rate was minimal. This improvement in interest and attendance is particularly gratifying, as it shows that our summer school has high growth potential. We also found our participants to be quite active and supportive during lectures, often asking questions, sharing thoughts and insights on others’ presentations, and proposing ideas and strategies for improving them. In addition, the extra week provided team members with an excellent opportunity to dig deeper into the R pipeline. This year, we also held open guest lectures, which attracted interest outside the workshop, although attendance was low. Finally, delivering the lessons, courses, and lectures live instead of streaming or recording videos allowed more interaction. We hope to keep improving our collaborative learning tools to reach even better results.

As the COVID pandemic slowly comes under control, we will adapt our course plans to a more face-to-face environment. Nonetheless, at SICSS-Montreal, we will favor a mixed-format approach to offer students both face-to-face and online training through various support mechanisms. Despite all its disadvantages, the online format can help us reach a targeted audience we otherwise wouldn’t have been able to contact.

Challenges

Our summer school was a great success, and we plan to organize it again next year. One of the biggest challenges we faced was coordinating the logistics for our participants from sub-Saharan Africa, who had significant internet and connection issues. This year our collaboration with IFORD started a little behind schedule and was not completely finished in time. We consider this a learning experience and are working on improving our partnership with them. Next year, the team aims to improve coordination and make the process more seamless for our African participants.

Conclusion

This event was made possible thanks to the involvement of SICSS alumni who put their experience into supporting the participants. We take this opportunity to thank Robert Djogbenou (SICSS Cape Town 2018) and Georges Ngalé (SICSS Montreal 2020), doctoral students in the demography department of the University of Montreal, and Nima Zahedinameghi (SICSS Montreal 2020), a postdoctoral fellow at the University of Quebec in Montreal. Finally, we would like to thank the participants from various backgrounds who, through their enthusiasm and their commitment, have made SICSS Montreal, the first SICSS in French, a real place of learning and exchange.


SICSS-Oxford

We divided this document into (1) Outreach and application process, (2) Pre-arrival and onboarding, (3) First Week, (4) Second Week, (5) Post-Departure, (6) General Notes.

1. Outreach and application process

Outreach:

Once we had prepared our call for applicants and the website, we undertook targeted outreach to several professional associations (such as the International Union for the Scientific Study of Population, the Economic and Social Research Council, the European Population Association, and the National Centre for Research Methods) and academic departments within and outside of Oxford, predominantly across the UK and Europe. In addition, the core organizers (Matt and Chris) posted announcements for our specific location on social media, which also helped our visibility. However, some students still contacted the organisers after the call deadline because they had missed the various announcements, and we are seeking to improve this strategy for next year, despite its broad success.

Application process:

As with the previous SICSS-Oxford (2019), we did not require a letter of recommendation, which made it easier for applicants to apply. We advertised SICSS-Oxford in the second week of January and set an application deadline for the middle of March.

Selection process:

We received 64 applications and accepted 24 (7 from Oxford). This was a reduction in the number of applications compared to 2019, when we received 166. We selected participants in the following way: first, each organiser went through all applications separately; second, organisers met to discuss each application and produce a shortlist; third, each organiser re-reviewed applications from shortlisted applicants; fourth, organisers met to discuss shortlisted applications and produce a final list of accepted applicants.

2. Pre-arrival and onboarding

Prior to arrival, we asked participants to provide a picture and short bio for the website, which all participants but one did. This allowed participants to get to know one another beforehand. We also asked them to log in to Slack and familiarize themselves with the platform, as it was new to several participants and, as always, was instrumental in the functioning of the course.

3. First week

We started with a ‘bring your own drink’ session in which organisers and participants introduced themselves briefly. It was a good session to break the ice. However, in hindsight, we would have organised it differently. In addition to meeting everyone in a single Zoom room, we could have used breakout rooms to facilitate more in-depth discussions.

During the week, we organised much of the training around lectures and exercises led by our internal organisers and other known experts in the area. On the first day, David Brazel (University of Oxford) gave a workshop on reproducible workflows followed by an exercise, which was deemed an extremely appropriate way to begin. On Monday evening, we had the first of five joint talks with SICSS-London, in which David Lazer spoke wonderfully about his research on polarization. Tuesday was the day on web scraping and APIs, led by two of our organisers (Chris Barrie and Charles Rahal). In the afternoon, we had two research talks: one from Ridhi Kashyap (a former SICSS-Oxford convenor) and one from Ken Benoit (jointly organised with SICSS-London). We had a lighter Wednesday in terms of Zoom hours, although our content was a bit more statistically oriented. We talked specifically about non-probability sampling with examples from Facebook surveys. Emanuele Del Fava (Max Planck Institute for Demographic Research) led a workshop on post-stratification. On Thursday, Chris Barrie taught the materials on ‘text-as-data’ with several hands-on exercises that the participants found very insightful for their own work. Laura Nelson (Northeastern University) gave a research talk on the use of Wikipedia data. On Friday, we focused on machine learning and computer vision through two workshops organised by Tom Robinson (Durham University) and Noah Waterfield Price (Optellum). Saul Newman gave a related research talk about machine learning in genetics. We ran a day on geospatial data on Saturday, which was led by Tobias Rüttenauer (organiser) and ended with a hands-on tutorial by Ilya Kashintsky.

Every day, we asked the participants to fill in a survey giving feedback on how the summer school was going. Even though the response rate was not high, it helped us adjust how we planned the days as we went along (e.g., more breaks, or sharing material in advance). We started every day at 10.00 am, but we could have started a bit earlier and had more breaks during the day. Moreover, running a full day on Saturday as well was a bit exhausting for both organisers and participants. Overall, we received very good feedback on all the sessions we organised and the materials we prepared. We had positive feedback on the research talks too.

4. Second week

In the second week, we worked on the research projects. We started by dividing the participants into clusters using minimum/maximum similarity clustering. We spent Monday morning letting participants speak in breakout rooms and organise themselves into teams. In the following days, the groups used their own Zoom accounts to work collaboratively. We organised coffee breaks at 11.00 am and 3.00 pm to answer questions and discuss research ideas with participants. The groups were heterogeneous in terms of affiliations as well as disciplinary background. We continued to have research talks during the second week, too. We heard from Eszter Hargittai (ETH, joint talk with SICSS-London), Peaks Krafft (UAL Creative Computing Institute, joint talk with SICSS-London), Viktoria Spaiser (University of Leeds, joint talk with SICSS-London), and Robin Lovelace (University of Leeds).
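To make the idea of similarity-based grouping concrete, here is a minimal sketch. It is not the procedure SICSS-Oxford actually ran: the Jaccard measure, the greedy strategy, the function names, and the participant data are all assumptions made purely for illustration. Each participant is described by a set of interests, and groups are built by repeatedly adding the person who is most similar (or, for deliberately mixed groups, least similar) to the current members.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of interests (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_groups(interests, group_size, maximize=True):
    """Greedily partition participants into groups of at most `group_size`.

    Each group is seeded with the first unassigned participant (in sorted
    name order). The group then grows by adding the unassigned participant
    whose mean similarity to the current members is highest
    (maximize=True) or lowest (maximize=False).
    """
    remaining = sorted(interests)
    groups = []
    while remaining:
        group = [remaining.pop(0)]
        while remaining and len(group) < group_size:
            def score(candidate):
                total = sum(
                    jaccard(interests[candidate], interests[member])
                    for member in group
                )
                return total / len(group)

            chooser = max if maximize else min
            pick = chooser(remaining, key=score)
            remaining.remove(pick)
            group.append(pick)
        groups.append(group)
    return groups


# Hypothetical participants and interests, for illustration only.
example = {
    "ana": {"text", "networks"},
    "ben": {"text", "networks"},
    "cy": {"surveys"},
    "dee": {"surveys", "experiments"},
}

similar_groups = greedy_groups(example, group_size=2, maximize=True)
mixed_groups = greedy_groups(example, group_size=2, maximize=False)
```

With `maximize=True` the sketch pairs people with overlapping interests, and with `maximize=False` it pairs people with different interests. A greedy pass like this is simple but not optimal, so in practice the resulting clusters would only be a starting point for participants to organise themselves into teams.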

On the last afternoon of the institute, we had presentations of the projects. Most of the participants were eager to continue working on their projects. We were truly impressed by the high quality of the presentations and the proofs of concept presented, given that it was all done online in an extremely short time-frame. An overview of the projects is as follows:

  • Exploring recent tweets about the Spanish enclave of Ceuta, William Allen (University of Oxford), Katharina Tittel (Sciences Po Paris), Yuru Li (University of Bremen), and Mobarak Hossain (University of Oxford);

  • COVID-19 Vaccine Sentiment on Twitter, Tom Endbrooke (University of Edinburgh), Elis Carlberg Larsson (Linköping University), Hannah Philips (University of Oxford), and Deivyd Velasquez (University of Bordeaux);

  • Language insights in the Crypto Movement, Fatima Zahrah (University of Oxford), Hayley Pring (University of Oxford), Tasos Spiliotopoulos (Newcastle University), and Timo Koch (Ludwig-Maximilians-Universität München);

  • Collaborative knowledge production in times of crisis, Nicole Schwitter (University of Warwick), Hannah-Marie Büttner (University of Bremen), Hao Cui (Central European University), Maksim Zubok (University of Oxford), Reham Tamine (Qatar Computing Research Institute), Jingwen Zhang (University of Manchester), and Jiani Jan (University of Oxford);

  • Incivility of Czech Twitter, Michael Skvrnak (Charles University);

  • Stop-and-Search in the COVID-19 pandemic, Marti Rovira (University of Oxford).

5. Post-departure

We continue to use the Slack Workspace to communicate amongst ourselves, and are keen to hear about (and help with) all future plans for the collaborative projects!

6. General notes

It was challenging to run a two-week-long summer institute completely online, but we were happy to receive good feedback from the participants. What was probably missing was the more informal side of the conversation: the lunches, chats in hallways, and so forth. However, our conclusion was that it was generally a resounding success, with much learnt not just by the students, but by the convenors, too!

7. Pictures

Ilya Kashintsky plotted where SICSS-Oxford participants and organisers come from and live.

Smiling faces after two weeks of SICSS-Oxford.


SICSS-Rutgers

SICSS-Rutgers 2021 took place from June 14-25, 2021. The organizers included Michael Kenwick, Katherine McCabe, Katherine Ognyanova, and Andrey Tomashevskiy. Teaching Assistants included Burcu Kolcak, Luxuan Wang, Katie Krumbholz, and Gabriel Varela. We hosted the site virtually via Zoom. This post-mortem will discuss our application process and onboarding, first week, and second week.

Application Process and Onboarding

We began our outreach in January 2021. We wanted to recruit a wide range of participants from different disciplinary backgrounds, institutions, and career stages. For recruitment, we sent emails to former participants from SICSS-Rutgers 2020 and to several professional email lists, and we posted on social media.

Our application deadline was March 22, and we informed participants we would get back to them by April 15. The application asked for name, email address, institution, career stage, discipline, self-reported frequency of using statistical analyses or machine learning in research, self-reported skill level in programming in R, as well as a CV, research statement, and writing sample. We had optional fields for gender and race/ethnicity. We did not require letters of recommendation. We used a Google Form to gather the information, which worked well. Some applicants may not be able to access a Google Form if their institution has strict security settings. In those rare cases (approximately two), we asked that the applicant send the information via email. To facilitate communication with applicants, we created a free Gmail account for the institute.

We received approximately 59 applications. The team of four organizers evaluated applications based on the applicant’s research in computational social science, contributions to public goods and creating educational opportunities for others, the likelihood the applicant would benefit from the experience and contribute to the educational experience of other participants, and the potential for the applicant to spread computational social science to new intellectual communities and areas of research.

We wanted to keep the institute small enough that we could reasonably hope to give everyone a chance to participate in full group sessions. With that in mind, we did not accept all applicants this year. We targeted a site size of approximately 20-40 participants. Upon notifying applicants of their acceptance to the institute, we asked them to fill out an RSVP form in which they committed to attending each day of the institute, to reviewing pre-arrival materials and seeking TA help as necessary, and to upholding the principles of SICSS: “These include openness, patience, generosity, and togetherness. Our goal is to build a community of computational social science researchers that brings together a wide range of backgrounds and expertise to help each other learn and develop exciting new research ideas to address major issues and challenges confronting our society.”

We had approximately 39 participants commit to attend the institute. More than half of participants were female, slightly more than two-thirds were doctoral students, with the remaining third composed of a mix of postdoctoral scholars and early career faculty. Disciplines represented among participants included political science, education, health policy, public policy, sociology, psychology, communication, management, and applied math. About half of participants reported they were intermediate in R programming skills, about one-third reported novice level, and a smaller number of participants reported advanced knowledge. Participants were from a wide range of institutions and geographic areas, though the plurality had an association with Rutgers. During the first week, we had one participant need to withdraw and an additional 2-3 participants who were only able to attend a limited number of events due to unexpected circumstances.

After participants committed to attend the institute, we began onboarding. We sent an email asking participants to fill out a Google Form provided by SICSS with their name, bio, and demographic information; begin reviewing pre-arrival materials; and look out for an invitation to join our Slack workspace. One of our Teaching Assistants helped to add participant information to the SICSS Rutgers website through GitHub. To facilitate pre-arrival learning, we provided participants with a recommended schedule for pacing themselves in going through the materials. Two of our Teaching Assistants also began providing limited office hours via Zoom each week from early May through the start of the institute. We also encouraged participants to post questions to the Slack, though this was used only occasionally prior to the start of the institute.

In addition to managing the application process, during the lead-up to the institute, the organizers met occasionally to prepare the curriculum, decide on a list of guest speakers to invite, recruit Teaching Assistants to support the institute, and solicit additional funding where possible from our institutional departments. We had four Teaching Assistants support the institute, up from one at the previous year’s institute, although only one Teaching Assistant participated throughout the event while the other three alternated based on their personal availability. All four of our Teaching Assistants were past participants from SICSS-Rutgers 2020, and that familiarity likely helped as they assisted participants with the materials. We primarily had two Teaching Assistants supporting us during each week of the institute.

First Week

The first week of the institute was focused on training and practice in a range of computational social science areas. Our schedule, along with hyperlinks to the lesson plans, is available here: https://sicss.io/2021/rutgers/schedule. The first week’s schedule generally ran from 10:00AM-5:00PM ET, with a lunch break and a shorter afternoon break during the day.

Prior to the start of each day, we asked participants to review the relevant videos from the main SICSS curriculum when applicable. These videos were also a part of the pre-arrival schedule, though we recognized that not all participants were able to get through the full pre-arrival schedule in advance of the institute.

The morning of Day 1 was reserved for welcome remarks and introductions. We heavily relied on the Zoom breakout room feature to facilitate small group communication among participants. This generally worked well, as a full group of 39 participants makes it somewhat harder for everyone to participate within a single session. In the afternoon of Day 1, we participated in the small group exercise on Ethics developed by SICSS-Princeton. Our day concluded with a guest speaker, Dr. Vivek Singh, who spoke on algorithmic fairness.

Our Day 2 was focused on collecting digital trace data. Given the more advanced technical nature of the day, we began the day with live tutorials on web scraping and using the Twitter API to supplement the materials provided online through SICSS. The afternoon was then devoted to a small group exercise where groups would develop a research question, collect digital trace data relevant to the question, and begin to describe and analyze the data. The day concluded with short presentations from the small groups.

Prior to the institute, we encouraged participants to apply for access to the Twitter API via a standard or academic application. About half of participants ended up with access to the academic version, while others either did not gain access or had access to the standard version. As a result, our Teaching Assistants provided two parallel tutorial sessions so that participants could practice using R to access Twitter through the API they had available.

Day 3 was focused on text as data. In contrast to Day 2, Day 3 primarily operated through the flipped classroom model where the bulk of the day included small group time devoted to developing and beginning to answer a research question using text as data. We provided groups with a few datasets they could work with for the day. The day concluded with short presentations from each group, followed by a guest talk from Dr. Chris Fariss, who spoke on using measurement models to describe social constructs.

Our Days 4 and 5 deviated from the primary SICSS-Princeton schedule. We opted not to participate in the Fragile Families Challenge due to concerns about the ability to coordinate access to and collaboration with the data with a large number of participants.

Our Day 4 focused on skill-building in machine learning for prediction and classification. In the morning, we had a live tutorial on implementing machine learning processes in R. In the afternoon, we then had a small group exercise where participants were provided with a few datasets to choose from and asked to develop a prediction or classification question and implement and evaluate the performance of machine learning models they developed. We asked participants to Slack a summary of their projects. The day concluded with a guest research talk from Dr. In Song Kim who spoke on the development of a large-scale database on lobbying with applications evaluating the influence of lobbyists in politics.

The morning of Day 5 was focused on survey and experimental methods and was our shortest session. We had a short group exercise where participants worked together to develop an experimental design. We asked them to Slack a summary of their design. The morning exercise concluded with a guest talk from Dr. Dan Nielsen on digital field experiments.

The afternoon of Day 5 and morning of Day 6 were devoted to network analysis. The afternoon of Day 5 included a live tutorial on network analysis in R. On the morning of Day 6, participants then took part in a small group exercise where they gained practice collecting social media network data and began to analyze it.

Overall, the first week went well. Participants tended to enjoy having a mix of formats. Fatigue appeared to set in most when sessions comprised too much of any one type of activity. We used the Zoom breakout rooms for all small group exercises and encouraged participants to share screens with each other and communicate using the Zoom chat and Slack channels. During group work, we asked small groups to report any difficulties via Slack so that an organizer or Teaching Assistant could visit their breakout rooms to assist. We occasionally also spontaneously visited the breakout rooms to help monitor progress for groups that we did not hear from during the sessions.

One of the biggest challenges during the first week of training was managing the different skill levels of participants. Some participants reported that the person with the most coding experience often took the lead in programming for the small group, which placed an extra burden on the more experienced coders while also preventing less experienced programmers from gaining additional hands-on experience. We began encouraging small groups to have the least-experienced coders share their screen and lead the group coding. We also encouraged participants to seek additional one-on-one assistance during break times, from organizers and Teaching Assistants during the sessions, and in the office hours that Teaching Assistants held directly after each day’s session. However, the diversity in skill level remains both a strength of the institute, because it assembles and facilitates communication across a wide range of scholars, and a difficult challenge for future institutes to address during the training sessions. Future sessions might place greater emphasis on discussing strategies for dividing up work among participants, perhaps tasking each team with creating a brief, informal memo each day describing how they plan to do so.

Second Week

The second week of the institute was primarily devoted to small group projects, beginning with a group formation process in the afternoon of Day 6. We followed the research speed dating model from Chris Bail. Prior to the start of the second week, we asked participants to fill out a spreadsheet indicating their research interests. This allowed us to skip this step at the beginning of the group formation session. Participants completed two rounds of small-group sessions where they were tasked with coming up with research project ideas. We then gave them approximately 20 minutes to choose the project idea they were most excited about joining. We ended up with 10 small group projects. We told participants that their group projects could take many different directions, including the launching of a traditional academic research project, the development of public goods (e.g., a database or Shiny application), or in-depth small group study of a topic for which they want to gain additional experience (e.g., we had a group focus on learning more about text as data and another group focus on learning how to access and analyze TikTok data). We had group projects of each type.

We asked participants to provide three check-ins prior to their Friday presentations of their projects. On Day 7, small groups provided a brief 1-page project plan. On Days 8-9, we asked groups to Slack a short one-paragraph update during the afternoon of each day. These updates allowed us an efficient way to track and provide feedback on each group’s progress. During Days 7-9, small groups were allowed to meet according to their own schedules. We told participants that the organizers and Teaching Assistants would actively monitor Slack to provide groups with assistance. We also had at least one Teaching Assistant or organizer available during the standard 10am-5pm window in the main Zoom session, where participants could drop in to seek assistance.

In addition to open small group work time, we had four guest speakers during week 2: Matthew Weber, Zhanna Terechshenko, Munmun De Choudhury, and Hana Shepherd, who spoke on a range of topics including web archival data, machine learning, and network analysis. Participants enjoyed the guest speaker sessions, although a couple remarked that these sessions contributed to Zoom fatigue. We found that participants tended to be most receptive to the presentations which were most clearly organized around substantive (rather than technical) themes and problems in computational social science. The guest speaker sessions during Days 7-9 also provided additional touchpoints for bringing the full group back together prior to the final day. We also made participants aware of the SICSS Festival, which took place during week 2 of our institute.

The last day of the institute (Day 10) was devoted to group presentations and closing remarks. Each of the 10 groups gave a 15-minute presentation in which each member had a role. Other participants attended and gave feedback via Zoom chat and Slack. We were incredibly impressed with the quality of the group projects. Several participants also remarked on how surprised they were by what groups could accomplish in such a short time relative to the usual, slower pace of academia.

At the closing of the institute, we encouraged participants to keep in touch by continuing to use the Slack channel for collaboration and resource-sharing with their co-participants. We also encouraged participants to pay it forward by considering spreading the training to their own communities.


SICSS-Stellenbosch

SICSS-Stellenbosch was organised by Richard Barnett (SICSS-Cape Town 2018) and Douglas Parry (SICSS-Cape Town 2019) at Stellenbosch University in Stellenbosch, South Africa. Additional support and experience were provided by Petrus Schoonwinkel as a TA.

This post-mortem is broken down into the following sections: 1) advertisement, application, and acceptance; 2) pre-arrival and onboarding; 3) the first week; 4) the second week; and finally, 5) concluding remarks.

1. Advertisement, application, and acceptance

Applications for SICSS-Stellenbosch opened in February of 2021. We sent advertisement text to the heads of department of major social science, computer science and information systems departments at universities in South Africa, as well as to a number of field-specific mailing lists and lists for postgraduate students in relevant degree programs. We also leveraged Twitter and past participants of SICSS-Cape Town and SICSS-Stellenbosch for word-of-mouth advertising. We hoped and expected that this would lead to as much diversity as possible in applications from across South Africa. Based on the number of applications we received in 2020, we did not feel as though we needed to expand our advertising into additional groups. We also adjusted our application requirements in line with the adjustments we made after SICSS 2020 was virtualized.

We received a constant stream of applications and had to evaluate 83 applications to determine participation. From these applications, we accepted 19 participants, anticipating that some may not be able to attend due to various circumstances. Six applicants were not able to participate and withdrew prior to the start, leaving us with 13 participants. We set the deadline for application submission considerably later than most other locations and as a result we believe that we received a number of applications from participants who were rejected from other sites.

SICSS-Stellenbosch coincided with the first two weeks of a fresh COVID-19 lockdown in South Africa. Such lockdowns in South Africa have sought to contain the spread of COVID-19 by placing a number of restrictions on movement, and a number of universities shut their campuses. This meant that a number of participants were only able to participate asynchronously: they caught up on the daily videos and material and communicated with us via Slack. Other participants could fully participate both synchronously and asynchronously throughout the full two weeks.

The guiding principle we applied in accepting participants was based around the content of the virtual event. We looked to accept participants whose background and CV suggested they would benefit from learning computational science skills. To this end, we did not accept applicants who had career experience in data science and whose CV suggested significant prior experience in R, or computational social science broadly interpreted. We also tried to give students who were starting out on their postgraduate studies more opportunity than those who had completed their studies or were close to doing so. As in 2020, we had to reject some applications from senior academics who felt that, as a growing field, computational science was something they should learn more about. While they were not a suitable fit for SICSS, we are engaging with them further to expand the field of computational social science in South Africa. We also decided not to accept any participants who were in time zones that were vastly out of alignment with our own. We felt that this was necessary to allow for better collaboration within the group. In the end, the vast majority of our participants were physically located in South Africa (but with nationalities from across Africa), with just a few from elsewhere in Africa (notably Nigeria and Zimbabwe) and a small number who were based in Europe.

2. Pre-arrival and onboarding

Based on our experience in 2020 and the intentionally late application deadline, we limited our requirements for onboarding to the collection of information required by SICSS and ourselves. We made use of our own platform to collect participant information to ensure that we adhered to privacy regulations in South Africa, and we shared this information with the core SICSS team as necessary after having received consent from participants. We required participants to submit biographical information and photographs as part of their acceptance to the programme, which greatly improved the collection of this information compared with last year. We, once again, struggled a little with the use of Slack and would like to encourage SICSS to look into alternatives for next year.

3. The first week

Based on our prior experiences at SICSS-Cape Town and SICSS-Stellenbosch and our collective experience of two decades of teaching programming, we realised that the vast majority of applicants to SICSS-Stellenbosch were unlikely to have had much prior experience in R and that most would have had little formal experience with programming. While it has previously been a pre-arrival task for participants to learn R themselves, we felt that participants would benefit more from instructor-led training in R than they would from self-study.

For each day (in both weeks) we split the content into four components. In the first component we began each morning with a short session in which we introduced the work for the day, took any questions, and discussed things to look out for or highlighted key extensions or examples. The second component involved a set of pre-recorded video sessions. In the first week these were videos that we produced for the workshop, while in the second week we used the core SICSS curriculum (with some adjustments). The third component involved, firstly, reading the chapters from R4DS that corresponded to the videos and, secondly, working through a series of coding exercises adapted from these chapters. Throughout this time, as questions arose, we interacted with the participants on Slack either through DMs or the main group channel. The fourth component took place as an afternoon debrief on the day’s content. In these sessions we first presented a general discussion on the videos and the material. Following this we worked through all of the exercise questions, with the participants asking questions or sharing their solutions as we went. These discussions often involved extra tangents, examples, or side discussions to expand on the content covered.

To manage the workload, we divided the days between ourselves such that each day was led by one of the organisers or TA with the others acting in support contributing as they needed in the discussions. Some of our participants struggled to always make the synchronous sessions so we recorded both the morning and afternoon sessions and made them available for participants to catch up afterwards if they couldn’t make it.

We kicked off our program with a live event the first morning for everyone to introduce themselves and for us to explain the structure of the program. The first week of the programme was dedicated to getting all participants up to speed in R, following the excellent book by Wickham and Grolemund: R for Data Science https://r4ds.had.co.nz/. This book, which is open access, served to give structure to our instructor-led training, which we presented via pre-recorded video lectures in the first week. Each day’s video lecture was accompanied by exercises from the book and textual discussion of the material on Slack. Throughout the day discussion took place on Slack and, where necessary, through one-on-one calls with participants.

To facilitate discussion, participants were encouraged to post their code, errors, and successes on Slack and respond to others’ questions. We held group calls each afternoon to catch up with participants and to debrief the day’s learning. In these calls we worked through the day’s exercises and discussed alternative solutions and related issues and concepts.

We purposefully limited the synchronous video component of the programme to limit the difficulties from poor Internet connectivity, which can be widespread in South Africa. While we did not have major connectivity issues in the calls we held, the glitches we did experience suggest that this was likely a good decision. On Monday we focused on getting set up with R and RStudio, general good practices for project workflows, and programming in R. On Tuesday the videos and exercises focused on data manipulation with the tidyverse. The focus of Wednesday was working with diverse data sources, while Thursday focused on the core foundations of data analysis with different data types. Finally, Friday involved the foundations of data visualisation and introduced ggplot2 to the participants.

On most days the video content was between 45 minutes and two hours, with the exercises designed to take another two hours depending on experience and skill level. On top of this we held an hour-long discussion each afternoon alongside the chats and calls on Slack. All in, participants spent approximately six hours per day in the first week. For most this was manageable (with many using study leave for SICSS), but for some this was quite a lot to juggle amongst other work and family-related demands at the time.

Feedback received from most participants was highly positive about the content of the first week, with many stating that this content is where they will benefit most in future. Most participants also indicated that this week’s content was necessary to prepare them for much of the content in week two. This feedback should be viewed as unsurprising, given that we knowingly accepted participants who we felt would benefit most from this content. Nevertheless, it was clear that participants were more readily able to engage with the content in week two than they had in prior years at SICSS-Cape Town.

4. The second week

In the second week, we focused on the core SICSS curriculum and presented the content that was pre-recorded by Matt and Chris. We largely followed the suggested programme on Monday through Wednesday, where the content was relevant to our audience.

We dropped the content on mass collaboration and experiments completely. This rearrangement was designed to try and alleviate fatigue on the Friday, and because the content on mass collaboration is not relevant to audiences where services such as Amazon Mechanical Turk are not available. In our attempt to alleviate fatigue, we appear to have introduced further fatigue, and by Thursday evening, most participants were behind in their progression through the videos and exercises.

It was particularly heartening for us to see the seriousness with which the participants engaged with the content on ethics, sparking significant discussion throughout the day, and with the debrief session running significantly over time. It was also exciting to see how the participants started applying the R skills and the trace data and text analysis content to their own fields and research.

The pre-recorded lectures proved to provide a better experience than had previously been possible at SICSS-Cape Town, due to the fact that they could be presented in the morning rather than starting only in the afternoon (due to the time-zone shift) and we would likely prefer these to live sessions for the same reason in future years. Collaborative work, as we expected, was severely impacted by the virtual nature of the event and we found it more practical to consider the programme an individual learning experience, with the collaborative aspects taking a backseat. We suspect that this may be some of the cause behind the fatigue as there was less scope for groups to work through problems together.

While we did not conduct research projects with our participants, we dedicated the final Friday of the programme to lightning talks, where we encouraged participants to give a conference-style presentation on their research and then to talk about how they believed CSS could contribute. The organisers also offered their own talks, more focused on completed research in the CSS space. We found that this was a particularly informative process, and through this and the rest of the programme we were able to give considerable feedback to some students, with one first-year (postgraduate) student able to refocus their research question. Once again, we found that most students’ biggest takeaway from the programme was knowledge of R, which, for some, will replace other statistical software such as Stata and SPSS.

5. Concluding remarks

Due to the limitations we experienced regarding COVID, electronic conferencing fatigue, and the slightly smaller group, we generally felt that 2021 did not have as thorough an engagement as 2020. This was counteracted by better knowledge of organising a virtual event, which we believe was presented largely without technical trouble. Nevertheless, both Douglas and Richard are in agreement that we would be interested in hosting SICSS again in the future, but only once an in-person workshop can happen.


SICSS-Taipei

Outreach

  • Traditional listserv through university institution
    • Our affiliated institution is Chengchi University. Our event information was circulated among different departments, including those in social science and computer science. We also sent the information to department admins at other schools whose contacts we had received.
  • Social Media (Clubhouse & Facebook)
    • We hosted an info session on Clubhouse in March to discuss our views on computational social science. At the peak of the chat, there were about 50 participants in the audience and we received around 10 questions. It reached a wider audience beyond the affiliated institutions and our organizing committee’s existing network. We estimate that around 20% of our applicants heard about SICSS-Taiwan through this Clubhouse chat. This format is quite useful for generating interaction with potential audiences and creating weak links in our outreach process.
    • We advertised both our Clubhouse info session and the demo day of research presentations on Facebook.
  • Outcomes
    • We received 53 applications and had 25 attendees.
    • In our demo day on Zoom, the peak attendance was 65 participants, while the number of registered attendees was 100.

Application and selection process

Attendee composition

| Current/Highest Academic Level | Number of Applicants | Proportion of Applicants | Number of Accepted Applicants | Proportion of Accepted Applicants | Acceptance Rate |
|---|---|---|---|---|---|
| Master Level | 24 | 45% | 11 | 42% | 46% |
| PhD Course Level | 6 | 11% | 3 | 12% | 50% |
| PhD Thesis Level | 10 | 19% | 7 | 27% | 70% |
| Post-doctoral Level | 5 | 9% | 1 | 4% | 20% |
| Early-career Faculty | 8 | 15% | 4 | 15% | 50% |
| Total | 53 | 100% | 26 | 100% | 49% |

| Gender | Accepted | Rejected | Waitlisted | Grand Total |
|---|---|---|---|---|
| No Response | 4 | 2 | 3 | 9 |
| Women | 10 | 4 | 6 | 20 |
| Men | 12 | 11 | 1 | 24 |
| Total | 26 | 17 | 10 | 53 |
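
For readers who want to check the percentages in the academic-level table, the short Python sketch below recomputes them from the applicant and accepted counts reported above. The counts come from the table; the variable names and the `pct` helper are our own illustrative choices, and rounding to whole percentages is an assumption about how the table was prepared.

```python
# Counts copied from the academic-level table above.
applicants = {
    "Master Level": 24,
    "PhD Course Level": 6,
    "PhD Thesis Level": 10,
    "Post-doctoral Level": 5,
    "Early-career Faculty": 8,
}
accepted = {
    "Master Level": 11,
    "PhD Course Level": 3,
    "PhD Thesis Level": 7,
    "Post-doctoral Level": 1,
    "Early-career Faculty": 4,
}


def pct(part, whole):
    """Return part/whole as a whole-number percentage (assumed rounding)."""
    return round(100 * part / whole)


total_applicants = sum(applicants.values())
total_accepted = sum(accepted.values())

# For each level: (share of applicants, share of accepted, acceptance rate).
rows = {
    level: (
        pct(applicants[level], total_applicants),
        pct(accepted[level], total_accepted),
        pct(accepted[level], applicants[level]),
    )
    for level in applicants
}

for level, (share_applied, share_accepted, rate) in rows.items():
    print(f"{level}: {share_applied}% of applicants, "
          f"{share_accepted}% of accepted, {rate}% acceptance rate")
print(f"Overall acceptance rate: {pct(total_accepted, total_applicants)}%")
```

The acceptance rate is computed within each level (accepted divided by applicants at that level), while the two proportion columns are shares of the respective column totals.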
  • Information we collect
    • Basic information
    • Experience/suggestion on how to make virtual events better
    • Motivation on attending SICSS
    • Project proposal
  • How we evaluate our attendees
    • Three organizers score from 1 to 5.
    • A range of criteria
      • Is their project idea clear, specific and feasible?
      • Do they have specific learning goals?
      • How unique is their academic and work experience?
      • Have they articulated specific suggestions that might make the virtual environment of SICSS 2021 better?
      • Do they have strong potential in academic research?
      • Do they show a track record in collaboration? Would they provide learning opportunities to other attendees?
    • Correlation of the three organizers’ scores is quite low.
  • We encouraged attendees to form groups at the stage of applications. This was an intentional choice because we believed this method would drive up group registration as well as encourage attendees to begin thinking about potential projects they want to work on in week 2. Around half of our attendees had formed groups at the time of registration.

  • We accepted 25 attendees and decided to hire 1 applicant as our TA. We had a waitlist of 10. All accepted applicants made a security deposit of NTD $1,000 (USD $35). One had to withdraw because of parental commitments. One did not show up and did not communicate with us. In total, we had 24 participants including our TA.

  • Lesson learned
    • In 2022, we plan to repeat our design of encouraging formation of groups prior to the event.
    • We should formulate scoring criteria among organizers prior to giving the scores.
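
One way to check agreement among organizers before finalizing decisions is to compute the pairwise correlation of their scores. The Python sketch below is only an illustration: the scores are hypothetical values invented for this example (not our actual data), and the organizer labels and the `pearson` helper are our own naming.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (spread_x * spread_y)


# HYPOTHETICAL 1-to-5 scores for six applicants, one list per organizer.
scores = {
    "organizer_a": [5, 4, 3, 2, 4, 1],
    "organizer_b": [3, 5, 2, 4, 1, 2],
    "organizer_c": [4, 2, 5, 3, 3, 1],
}

# Correlation for every pair of organizers.
pairwise = {
    (first, second): pearson(scores[first], scores[second])
    for first, second in combinations(scores, 2)
}

for (first, second), r in pairwise.items():
    print(f"{first} vs {second}: r = {r:.2f}")
```

Low pairwise values would suggest that organizers are interpreting the criteria differently, which is the situation that agreeing on scoring criteria in advance is meant to address.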

Week 1

  • Group activities
    • On the following topics, we followed the main SICSS materials on group activities.
    • Successes
      • In the ethics discussion activity, which was the first ice-breaker between participants, each group had a lively dialogue based on their background knowledge. Each group had unique opinions to share when returning to the full group.
    • Curriculum
      • We had the Application Programming Interfaces, Text Analysis, and Screen scraping/crawler group activities in the first week. We not only lectured on the materials but also demoed the coding process for attendees. Attendees practiced these skills and thought of potential research projects that might apply natural language processing techniques.
    • Challenges
      • At the beginning of the code-along sessions, differences in familiarity meant that some members completed the required tasks slowly while others found the course too easy. We soon adjusted the breakout rooms based on proficiency: one for those who followed the steps of the tutorials and another for advanced attendees to discuss the topic in more depth.
    • Flash talks
      • Our attendees were from various disciplines, such as computer science, education, political science, and social work. On Friday of Week 1, attendees shared their expertise and previous projects related to computational social science in 15-minute talks, each followed by 15 minutes of discussion and Q&A.

Week 1 and 2

  • Guest lectures
    • Successes
      • Two rounds of lecture and Q&A. Each guest lecture was 2 hours long. We divided it into two sessions so that we had two rounds of Q&A.
      • Most guest speakers delivered quality talks and generated a high amount of interaction during the Q&A sessions.
    • Challenges
      • Our speaker roster was not gender-balanced. We had 5 men and 1 woman as guest speakers. Some attendees suggested a list of women researchers that we can reach out to for future SICSS-Taiwan events.
    • Lesson learned
      • Some guest speakers did not have a clear understanding of who the audience was, resulting in talks that fell below expectations.
      • Attendees suggested organizers explicitly frame how the talks fit under the umbrella concept of computational social science to help the audience make connections between the guest lectures and the main SICSS curriculum.
      • We will build a list of researchers in computational social science from the input of SICSS-Taiwan 2021 participants for us to build a gender-balanced speaker roster in 2022.
  • Delayed viewing of SICSS Festival
    • This year, we tried to participate in the festival by viewing the recorded talks on YouTube. We allocated 1.5 hours, where the first 10 minutes were an introduction, followed by 40 minutes of watching the video on our own and 40 minutes of discussion of the materials. It was a new format for most attendees and worked well virtually. It is a useful way to account for time zone differences from most other sites, and it gives facilitators enough time to prepare discussion questions.
    • The two talks we watched and discussed are Digital and computational demography (2020) and Deep learning and causal inference (2021).
  • Because we wanted some group activities in addition to the research project in week 2, to foster discussion outside of the research groups, we had guest lectures and viewings of the SICSS Festival in both weeks 1 and 2. Some attendees said that the amount of time spent on guest lectures during week 2 left them less time to work on their week 2 projects. In 2022, we should be mindful of this trade-off, possibly limiting the total time spent on non-project sessions to 15% of week 2.

Week 2

  • Research projects
    • 21 of our 24 participants were involved in a research project. They formed 7 groups in total.
    • In our application process, we encouraged applicants to formulate their own research project and group.
    • At the beginning of the research speed matching process, 13 participants had already formed groups and decided on tentative topics. All of them still participated in the matching process to discuss different topics. During the breakout rooms in the speed matching process, some attendees enjoyed the opportunity to talk with people beyond their original group members, while a few felt awkward and did not enjoy the unstructured process.
    • The group formation process took place in both the speed dating segment and the informal segment.
    • We had a mid-point presentation on Wednesday afternoon where groups shared briefly on their progress. Many groups received relevant feedback from other attendees during this time. Many groups were able to act on the feedback to make their projects better in the next two days. We believe the feedback exchanged in this mid-point presentation was very helpful and will keep this feature in the future.
  • Case of TA’s group
    • During the speed matching process, the TA decided to join two attendees who shared mutual topics of interest.
    • Formed by members from diverse disciplines, the group had a clear division of labor and members shared the methodologies and techniques during discussion.
    • On the demo day, the three members respectively presented the question, result and implication parts.
    • After the SICSS workshop, the group members have been cooperating to continue the preliminary group work.
  • Demo Day
    • Public event, peak attendance was around 65 participants. Participants from SICSS-Hong Kong and SICSS-Tokyo also joined the talks throughout the day. 7 groups presented their work. Each group had 30 minutes for talk and Q&A.

Post-SICSS

  • In order to maintain the community beyond the two week workshop, we have been (or will be) doing the following
    • The organizing committee is holding 1:1 meetings of 45 to 60 minutes with attendees to discuss whether their expectations were fulfilled, whether and how they aim to continue their research projects, and what suggestions they have to improve SICSS.
    • During 1:1s, some attendees, especially women, expressed that SICSS-Taiwan provided them with opportunities to build confidence in trying out computational methods.
  • In-person meet-up: Attendees suggested ideas for in-person meet-ups at the closing activity.
  • Follow-up events/study groups online: We are using past talks from SICSS (starting with Brandon Stewart’s talk on topic models in 2019) to hold a bi-weekly online study group.

SICSS-Tokyo

SICSS-Tokyo took place from July 12-16, 2021, as a one-week event. The organizers were Hirokazu Shirado and Makiko Nakamuro and the teaching assistant was Atsushi Ueshima. We hosted the site virtually via Zoom. This was the first time for SICSS-Tokyo. The post-mortem discusses our application process, onboarding, event week, and self-review.

Application Process

We prepared the SICSS-Tokyo website in December 2020. In addition to the English version, we included a Japanese version on the website because we were concerned that some Japanese candidates might hesitate to apply due to the language barrier. In fact, we received a couple of emails during the application period asking what language we would use.

We began our outreach in February 2021. To recruit a wide range of participants, we sent the advertisement to several professional email lists and posted it on social media via the organizers’ networks. We also sent the ad to the applicants to SICSS-Tokyo 2020, which we had canceled due to the COVID-19 pandemic. However, we had difficulty finding effective outreach channels because the concept of “computational social science” is not yet common in Japan. (We’re also afraid that it might be misunderstood because the term is translated into “calculational” social science in Japanese.) There is room for improvement in outreach (see below).

Our application deadline was March 12, 2021, and we informed applicants that we would get back to them by early April. The application asked for name, email address, institution, career stage, discipline, preferred language (English or Japanese), as well as a CV, research statement, and writing sample. We had an optional field for gender. We did not require letters of recommendation. We created a free Gmail account for the institute and used a Google Form to gather the information.

We received 19 applications and accepted 17. To our surprise, over 25% of applications came from outside Japan, such as from the USA and South Korea, even with the time difference. Next time, we should advertise SICSS-Tokyo not only to researchers who live in Japan, but also to junior researchers worldwide who are interested in the contents of SICSS-Tokyo.

Each organizer and the TA went through all applications and separately evaluated whether to accept them. Because our evaluations agreed, we did not need further meetings to reach a consensus. Three of the 19 applicants submitted their applications after the due date. We evaluated them in the same way as the other applicants. We completed the selection process by March 20.

Onboarding

We let the applicants know the result on March 22 and confirmed the attendance of accepted applicants via another Google Form. All 17 applicants accepted our offer, but eventually one canceled just before the event due to a change in their university’s schedule. As a result, SICSS-Tokyo had 16 participants. We created a Google Group for SICSS-Tokyo and used the mailing list as well as Slack to inform participants of the prerequisites and the updated schedule. We also used the mailing list to ask participants to provide a picture and a short bio for the SICSS website. One participant did not provide the information because their institution did not allow it. We needed the mailing list even with Slack because not all the participants were familiar with Slack.

The main organizer (Hirokazu Shirado) and the TA (Atsushi Ueshima) had weekly one-hour meetings from May 31 through July 21. The meetings helped us prepare the event content, keep progress on schedule, and share improvements.

Event week

SICSS-Tokyo was a one-week event in 2021, from July 12 to 16. We chose this period based on the typical academic schedule in Japan, where a semester ends in mid-July. However, some participants could not attend the full event because they had classes and tests. Their universities had changed their schedules because of the pandemic and the Olympics in Japan.

We set three goals for participants in SICSS-Tokyo 2021:

  • Learn about computational social science and its techniques
  • Develop their research ideas in computational social science
  • Develop their academic network

To that end, we offered three components: live lectures, research-idea discussions, and participants’ talks. We decided not to run analytical and programming exercises because participants’ interests and programming skills varied considerably. Instead, SICSS-Tokyo emphasized developing research concepts. We also decided to give live lectures rather than recorded ones.

Our schedule was as follows:

  • Day 1: welcome remarks, an intro lecture (Hiro), Hirokazu Shirado’s research talk (Hiro), and each participant’s short research talk (Atsushi).
  • Day 2: a digital trace data lecture (Hiro), discussions on digital trace data (Hiro), and an automated text analysis lecture (Atsushi).
  • Day 3: an experiments lecture (Hiro), discussions on experiments (Hiro), and an in-class experiment (Hiro).
  • Day 4: a surveys lecture (Hiro), discussions on surveys (Atsushi), and an ethics lecture (Atsushi).
  • Day 5: a lecture on the frontier of computational social science (Hiro) and each participant’s research idea pitch (Hiro).

The name in parentheses indicates who led each component. After SICSS-Tokyo ended, we asked participants to complete a post-event survey.

Self-review

Overall, participants actively engaged in all the lectures and discussions. They asked many questions and offered many comments on each topic. More importantly, these questions and comments helped participants move forward with their own work and/or find a new research direction. Thus, we think the event structure worked well for the goals of SICSS-Tokyo. If we were to run an in-person two-week event, we might add guest lectures and participant-led research projects. We might also replace some topics, such as surveys, with social network analysis and machine learning to better fit participants’ interests. It might be better to ask participants in advance what they most want to learn at SICSS.

We found several details that we might want to change:

  • We had two discussions in the ethics lecture, but the second discussion (about improving research from an ethical perspective) was not as active as the others. We might need to focus the lecture on a single discussion (about why one study is ethically acceptable but another is not).
  • We should have scheduled a longer time slot for participants’ research idea pitches on the last day. We planned 1.5 hours, but the 14 talks actually took 2.5 hours (two participants did not give a talk).
  • Some participants were not familiar with Slack. To make the online space more active, we might give all participants a short task that uses Slack at the beginning of the event.
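
The pitch-session overrun reported above can be turned into a rough planning rule. The following is a minimal sketch, assuming the observed pace (2.5 hours for 14 talks, including questions and transitions) scales linearly with the number of talks. The function name `minutes_needed` is hypothetical and is not something the organizers describe using.

```python
import math


def minutes_needed(observed_minutes: float, observed_talks: int, planned_talks: int) -> int:
    """Scale an observed session length linearly to a different number of talks.

    Assumption: every talk takes about the same time, including Q&A and handover.
    """
    return math.ceil(observed_minutes * planned_talks / observed_talks)


observed_minutes = 2.5 * 60   # the session actually took 2.5 hours
planned_minutes = 1.5 * 60    # the slot that was originally scheduled

# Average time per pitch: observed versus what the original slot allowed.
print(round(observed_minutes / 14, 1))   # about 10.7 minutes per talk
print(round(planned_minutes / 14, 1))    # about 6.4 minutes per talk

# Slot needed if all 16 participants had pitched at the observed pace.
print(minutes_needed(observed_minutes, 14, 16))   # 172 minutes
```

Under this assumption, a session in which every one of the 16 participants pitches would need just under three hours rather than the 1.5 hours that were planned.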


SICSS-Zurich

SICSS-Zürich was hosted at the Center for Law and Economics (CLE) of the Department of Social Sciences (DGESS) at ETH Zürich. This was the CLE’s second Summer Institute partnership. We were a fully self-funded, virtual partner with a synchronous one-week program (14-18 June 2021). This post-mortem proceeds in roughly chronological order: application process, developing the program, pre-arrival/onboarding, first week, and debrief/departure.

APPLICATION PROCESS

We mostly advertised SICSS in Switzerland: 1/ internally in DGESS & MTEC (among PhD students, post-docs, and faculty) and 2/ externally in other top Swiss economics and political science departments (mostly at St. Gallen and the University of Zürich). We also used the department’s website and social media presence. For the applications, although some other partner locations used FLUXX, we opted for the more basic but sufficient Google Forms. We were happy with the quantity and the quality of the applicants: we could select around 20 participants, with a focus on economics and political science applications. We would make a few small tweaks to the application itself. We did not ask for a letter of recommendation but asked students to describe their attributes, skills, and projects.

DEVELOPING THE PROGRAM

In designing and developing the program, we relied on the input of previous participants, speakers, and the SICSS community. We were driven by a self-imposed constraint: we wanted to teach in Python, as this is the common first language of the teachers. We wanted both to provide teaching inputs and to give the participants some real experience with the tools and concepts. We also wanted to divide the days between Zoom and non-Zoom time (at least for the students). We ended up with a mixed schedule: teaching inputs in the morning, group work on a problem set in the afternoon, and a recap of the exercises in the late afternoon.

PRE-ARRIVAL/ONBOARDING

We told the students that the summer school would require some knowledge of Python, even though they could always work in R if needed. Yet it seems that this information was not taken seriously. We could have taken a more proactive approach and required that they complete an introduction to Python beforehand. To cope with this, we dedicated the first afternoon to learning the basics of Python (& Jupyter Notebook) to put everyone on the same page. We also shared a Slack workspace with all participants and organizers. It was actively used as the main means of communication (meeting links, webpages, papers, further discussions). We did not organize a social event, as we thought that everyone would already have had a lot of screen time.

FIRST WEEK

The first week was very similar to the schedules of many other partners. We used the mornings to discuss in more depth some of the key issues of Statistical Learning (Monday), Data Collection (Tuesday), Machine Learning (Wednesday), Text Analysis (Thursday), and Computer Vision (Friday). We allocated these based on our distribution of skill sets and interests. We made two main deviations from Princeton’s curriculum. First, statistical learning was more of an introduction, discussing the difference between machine learning and causal inference approaches. The lighter Monday morning made room for an introduction to Python (and pandas) in the afternoon. Second, on Friday, we covered computer vision: we were excited to present some methods that are new to social scientists. We started every day at 9:30 am and ended at 5 pm.

The afternoons were organized around activities that uniformly followed the same pattern: small-group work, then a return to the large group to debrief. The most successful approach was to group students together based on common research questions & diverse technical skills: this way, they could learn from each other as well. This was largely successful, although by the end of the week it was clear that mixing up the pedagogy might have been appreciated. We also benefited from important inputs from participants, who presented some computing tools that they found useful (Julia, Docker containers…).
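
The grouping rule described above (common research question, diverse technical skills) can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure the organizers used: the function `form_groups`, the self-rated integer skill score, and the sample roster are all hypothetical.

```python
from collections import defaultdict


def form_groups(students, group_size=4):
    """Group students who share a topic while spreading skill levels across groups.

    students: list of (name, topic, skill) tuples; skill is a self-rated integer
    (a hypothetical field, e.g. taken from an application form).
    Returns a list of (topic, [names]) pairs.
    """
    by_topic = defaultdict(list)
    for name, topic, skill in students:
        by_topic[topic].append((skill, name))

    groups = []
    for topic, members in by_topic.items():
        # Strongest first, then deal round-robin so each group mixes skill levels.
        members.sort(reverse=True)
        n_groups = -(-len(members) // group_size)  # ceiling division
        buckets = [[] for _ in range(n_groups)]
        for i, (_skill, name) in enumerate(members):
            buckets[i % n_groups].append(name)
        groups.extend((topic, bucket) for bucket in buckets)
    return groups


# Hypothetical roster for illustration only.
roster = [
    ("A", "text", 5), ("B", "text", 1), ("C", "text", 4), ("D", "text", 2),
    ("E", "networks", 3), ("F", "networks", 1),
]
for topic, names in form_groups(roster, group_size=2):
    print(topic, names)
```

Dealing the skill-sorted list round-robin is one simple way to avoid placing all the experienced programmers in the same group; other balancing schemes would work equally well.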

Overall, we thought that the online format was OK, but that it was limited by the time that everyone can spend interacting through a screen. Our Slack channel seemed more useful than in the previous year, which suggests that there is at least some substitutability. One aspect that the participants really liked was that everyone contributed to the content, providing interesting resources, tools, and approaches related to each day’s topic: these resources were curated and made available on our GitHub.

POST-SICSS

The Slack channel remained active for a few weeks but seems to be dead by now.