Tutorial 2: Extracting Information from Documents

NLP and CSS 201: Beyond the Basics

This tutorial extends the basic use of word embedding models: we fit multiple models on a social science corpus (using gensim’s word2vec implementation), then align and compare them. The method is used to explore variation across groups and change over time. We also discuss some tradeoffs and possible extensions of the approach.
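
As a rough illustration of the align-and-compare step, here is a minimal sketch. It assumes orthogonal Procrustes alignment, a common choice for this kind of comparison; the description above does not say which alignment method the tutorial uses. The matrices are synthetic stand-ins for vectors that would be read out of two separately trained word2vec models, with rows restricted to a shared vocabulary in the same order. The function names `procrustes_align` and `cosine_by_row` are hypothetical.

```python
import numpy as np


def procrustes_align(base, other):
    """Rotate `other` onto `base` using the orthogonal Procrustes solution.

    Both arrays have shape (n_shared_words, dim), and row i must refer to
    the same word in both. Returns the rotated copy of `other` and the
    rotation matrix.
    """
    # The SVD of the cross-covariance matrix yields the orthogonal matrix R
    # that minimizes the Frobenius norm of (other @ R - base).
    u, _, vt = np.linalg.svd(other.T @ base)
    rotation = u @ vt
    return other @ rotation, rotation


def cosine_by_row(a, b):
    """Cosine similarity between matching rows of two aligned matrices."""
    num = np.sum(a * b, axis=1)
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / denom


# Synthetic stand-in data (an assumption for illustration, not real embeddings):
# `other` is `base` under an arbitrary rotation, mimicking two models whose
# vector spaces differ only in orientation.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 8))
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
other = base @ q

aligned, rotation = procrustes_align(base, other)
similarity = cosine_by_row(base, aligned)
```

With real models, words whose post-alignment similarity is low would be candidates for differing usage across groups or time periods. Here the two matrices differ only by a rotation, so the alignment recovers `base` up to floating-point error.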

Author: Andrew Halterman, Professor

Duration: 58:19