SICSS-Melbourne

June 22 to July 3, 2026 | Melbourne, Australia

Training Resources

This page compiles pre-reading, tools, and further reading for each confirmed session in the SICSS-Melbourne 2026 program. Use it to prepare before each day and to explore topics that interest you after the institute.

General Resources

  • Bit by Bit: Social Research in the Digital Age — Matthew J. Salganik, 2018. A comprehensive, open-access introduction to computational social science covering surveys, experiments, mass collaboration, and ethics. Book
  • Computational Social Science — David Lazer et al., 2009. The foundational paper arguing for a field that leverages large-scale digital data to understand human behaviour. Paper
  • National Statement on Ethical Conduct in Human Research — NHMRC / ARC / Universities Australia, 2023 (updated 2025). Australia's core framework for research ethics, essential for anyone working with human or digital trace data. Guide
  • Computational Analysis of Communication — Wouter van Atteveldt, Damian Trilling & Carlos Arcila, 2022. An open-access textbook with R and Python code covering text, network, and image analysis. Participants are not expected to know R or Python, but it is worth learning what each language is and picking up some foundational terms beforehand. Book
  • R for Data Science — Hadley Wickham, Mine Çetinkaya-Rundel & Garrett Grolemund, 2nd ed. A practical, free introduction to data science with R and the tidyverse for researchers with little coding experience, and a useful primer on the key aspects of working with data in R. Tutorial
  • Introduction to Cultural Analytics & Python — Melanie Walsh, 2021. A free online textbook covering Python basics, text analysis, and social media data specifically for humanities and social science scholars. Tutorial

Session-Related Resources

Day 1 — Introduction to Computational Social Science  ·  Monday, 22 June

11:00–12:30  |  Keynote Dialogue

What is Computational Social Science and Why It Matters in Australia?

Pre-reading

Tools & Platforms

Further reading

15:15–16:30  |  Keynote

Social Bias in Computational Social Science

Pre-reading

Further reading

Day 2 — Working with Data: Ethics and Practices  ·  Tuesday, 23 June

09:00–10:30  |  Panel

Ethics in Computational Social Science

Pre-reading

Tools & Platforms

Further reading

11:00–12:30  |  Workshop

Data Donations and Participant-Centric Research

Pre-reading

Tools & Platforms

Further reading

15:00–16:00  |  Talk

Nectar Research Cloud

Pre-reading

Tools & Platforms

Further reading

Day 3 — Data Collection and Working Across Disciplines  ·  Wednesday, 24 June

09:00–10:30  |  Panel

Does Computational Social Science Lack Theory?

Pre-reading

Further reading

11:00–12:30  |  Talk

The AIReD Platform for Australia-wide Social Media Discovery and Usage

Pre-reading

Tools & Platforms

13:30–15:00  |  Workshop

Collecting and Analysing Data Download Packages

Pre-reading

Tools & Platforms

  • Port — Open-source framework for locally processing donated data download packages.
  • Data Donation (datadonation.eu) — Resources on best practices for requesting and using DDPs from major platforms.

Further reading

15:30–17:00  |  Workshop

Working with Text Using Computational Techniques

Pre-reading

Tools & Platforms

  • tidytext (R) — R package for tidy text mining workflows.
  • quanteda (R) — Comprehensive R package for quantitative text analysis.
  • spaCy (Python) — Industrial-strength NLP library for tokenisation, NER, and text processing.

Further reading

Day 4 — Tools and Approaches to Data Analysis  ·  Thursday, 25 June

09:00–10:30  |  Workshop

Screen Capture for Data Collection

Pre-reading

  • MOAT — Description of the Australian Internet Observatory's Mobile Ad Observatory Toolkit (MOAT) and relevant work conducted. Guide

Tools & Platforms

  • AIO Mobile Screen Capture Tools — Research-grade mobile and browser extension tools for capturing personalised platform content (ads, feeds, recommendations).

Further reading

11:00–12:30  |  Workshop

Using LLMs to Create Data Analysis Pipelines for Text-as-Data Research

Pre-reading

Tools & Platforms

  • quallmer (GitHub) — R package for structured LLM-assisted coding, validation, and audit trails in text-as-data research. Setup
  • quallmer.app — Interactive Shiny companion app for manual coding, reviewing LLM annotations, and computing agreement metrics.
  • ellmer (R) — Backend package for connecting to multiple LLM providers (OpenAI, Anthropic, Google, Ollama).

Further reading

13:30–14:30  |  Workshop

RAG 101

Pre-reading

Tools & Platforms

  • LangChain — Popular Python framework for building RAG pipelines, with pre-built modules for document loading, embedding, and retrieval. Setup
  • LlamaIndex — Framework for connecting LLMs to external data sources, purpose-built for RAG applications.
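
For orientation before the session, the retrieve-then-prompt idea behind RAG can be sketched in plain Python. This is a toy illustration under stated assumptions, not LangChain or LlamaIndex code: the `embed` function is a bag-of-words stand-in for a real embedding model, and the function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Place the retrieved passages in front of the question for an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical mini-corpus for illustration only.
docs = [
    "Data donation lets participants share their platform data download packages.",
    "Inter-rater reliability measures agreement between coders.",
    "UMAP reduces embeddings to two dimensions for visualisation.",
]
question = "How do coders measure agreement?"
top = retrieve(question, docs)
prompt = build_prompt(question, docs)
print(prompt)
```

A real pipeline would swap `embed` for a learned embedding model and send `prompt` to an LLM; the frameworks listed above package those steps.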

Further reading

14:45–16:15  |  Workshop

Image Analysis for Qualitative and Quantitative Research

Pre-reading

Tools & Platforms

  • Image Machine — Open-source tool for clustering visually similar images using machine vision embeddings and identifying visual patterns in large datasets.
  • UMAP — Uniform Manifold Approximation and Projection for dimensionality reduction, used for 2D visualisation of image similarity.

Further reading

Day 5 — Disciplines, Careers, and Industry  ·  Friday, 26 June

09:00–10:30  |  Panel

Cross-Disciplinary Collaboration: Bringing Social Science and Computational Analysis Together

Pre-reading

Further reading

Week 2 — Collaborative Research Projects  ·  Wed 1 July

Wed 1 Jul  ·  09:00  |  Workshop

Validation in Computational Social Science

Pre-reading

Tools & Platforms

  • irr (R) — R package for computing inter-rater reliability statistics including Cohen's kappa and Krippendorff's alpha.
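
As a reminder of what Cohen's kappa measures, the sketch below computes it by hand in Python for two coders who labelled the same items. It is a minimal illustration, not the irr package's implementation; the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Share of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when both raters only ever use one identical label
    return (observed - expected) / (1 - expected)


# Hypothetical labels: a human coder versus an LLM annotator on eight texts.
human = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos"]
model = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]
kappa = cohens_kappa(human, model)
print(round(kappa, 3))  # 0.5
```

Here the coders agree on 6 of 8 items (0.75) while chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.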

Further reading

Thematic Index

Foundations & Theory

This theme covers the intellectual history, definitions, and core theoretical debates in computational social science. Participants will explore the origins of CSS, discuss the relationship between prediction and explanation, and examine critiques regarding whether CSS lacks theory. Additionally, resources under this theme address key issues of validation and research reproducibility in computational studies.

  • What is CSS and Why It Matters in Australia — Day 1
  • Does CSS Lack Theory? — Day 3
  • Validation in CSS — Week 2

Key resources: Lazer et al. (2009), Bit by Bit, Hofman et al. (2021)

Ethics & Research Design

Computational social science introduces ethical challenges and data quality concerns that go beyond those of traditional social research. This theme focuses on identifying and mitigating algorithmic and data biases, navigating ethical review processes under Australian and international guidelines, and implementing participant-centric methodologies such as privacy-preserving data donations. Participants will learn how to align their research designs with the FAIR and CARE principles.

  • Bias in CSS — Day 1
  • Ethics in CSS — Day 2
  • Data Donations and Participant-Centric Research — Day 2

Key resources: Olteanu et al. (2019), NHMRC National Statement, FAIR Principles

Data Collection Methods

This theme introduces practical techniques and infrastructure for gathering digital trace data. Resources cover the use of API-based dashboards (specifically the AIReD platform), methodologies for collecting and processing user-donated data download packages, and deployed screen capture solutions for observing personalised algorithmic feeds. It also covers the setup and use of the Nectar Research Cloud for scaling data collection.

  • The AIReD Platform — Day 3
  • Collecting and Analysing Data Download Packages — Day 3
  • Screen Capture for Data Collection — Day 4
  • Nectar Research Cloud — Day 2

Key resources: AIReD, Port, Nectar Cloud

Computational Analysis (Text, Images, LLMs)

Once data is collected, computational techniques are required to process and analyse it at scale. This theme spans quantitative and qualitative text analysis (using tidy workflows and R/Python NLP libraries), the deployment of large language models for text-as-data annotation, retrieval-augmented generation (RAG) pipelines, and computer vision methodologies for pattern discovery in large image datasets.

  • Working with Text Using Computational Techniques — Day 3
  • Using LLMs for Text-as-Data Research (Quallmer) — Day 4
  • RAG 101 — Day 4
  • Image Analysis for Qualitative and Quantitative Research — Day 4

Key resources: Text Mining with R, quallmer, Lewis et al. (2020) RAG paper, Image Machine (QUT DMRC)

Careers, Collaboration & Publishing

This theme focuses on the professional development of computational social scientists. It provides guides and advice on building successful interdisciplinary collaborations that bridge computer science and social research, working with industry and public policy partners, writing grants for CSS projects, and demystifying the publishing process in cross-disciplinary journals.

  • Cross-Disciplinary Collaboration — Day 5
  • Working With and In the Industry — Day 5
  • Career Success — Day 5
  • Grant Writing in CSS — Day 5
  • Demystifying Publishing in CSS — Day 2

Key resources: Bromham et al. (2016), Lazer et al. (2020)






The Australian Internet Observatory (https://doi.org/10.25956/twvn-ca19) is a co-investment partnership with RMIT University, QUT, University of Queensland, University of Melbourne, Swinburne University, Deakin University and the Australian Research Data Commons (ARDC) through the HASS and Indigenous Research Data Commons (DOI:10.3565/hjrp-b141). The ARDC is enabled by the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS).
