My work centers on the study of information controls and the development of statistical machine learning methods to analyze political texts. I focus on the political economy of information control in authoritarian regimes, with a specialization in Chinese politics. My focus on deep learning, natural language processing, and information retrieval has informed the large-scale collection and quantitative analysis of text data from various Chinese sources in my research. With these data, I observe how states leverage private companies to aid efforts to restrict access to information.

In much of my research, I focus on three questions: 1) how do modern authoritarian states make use of market capitalism to ensure survival, 2) why does politically sensitive content abound in authoritarian countries despite herculean efforts to stamp it out, and 3) how do computational propaganda and censorship affect individual-level political behavior and opinion?


Working Papers:

Active Learning Approaches for Labeling Text: Review and Assessment of the Performance of Active Learning Approaches

Authored by: Blake Miller, Fridolin Linder, and Walter Mebane

Abstract: In the case where concepts to measure in corpora are known in advance, supervised methods are likely to provide better qualitative results, model selection procedures, and model performance measures. In this paper, we illustrate that much of the expense of manual corpus labeling comes from common sampling practices such as random sampling that result in sparse coverage across classes, and duplicated effort of the expert who is labeling texts... (continue reading)

Read Full Paper Here

The Limits of Commercialized Censorship in China

Authored by: Blake Miller

Abstract: Despite massive investment in censorship, politically sensitive content abounds online in China. Some have argued that censorship is centralized and targeted at specific types of content. However, leaked company censorship logs from popular social networking site Sina Weibo show no evidence of this claim. Instead, the logs, which are company notes made in the process of censorship, indicate that prevailing explanations overlook fragmentation, principal-agent relationships, and corporate delegation inherent in the Chinese censorship system... (continue reading)

Read Full Paper Here

Automated Detection of Chinese Government Astroturfers Using Network and Social Metadata

Authored by: Blake Miller

Awards: Best paper at the PolText 2016 conference in Dubrovnik, Croatia

Abstract: I present a method for automatically detecting pro-government astroturfers in China (colloquially referred to as the Fifty Cent Party), using comment metadata from a dataset of 45 million news media comments posted on 4 million news articles from 19 popular news websites in China.... (continue reading)

Read Full Paper Here

Astroturfing in China: Three Case Studies (Report)

Authored by: Blake Miller and Mary Gallagher

Abstract: We identify government-fabricated social media posts about three interesting cases of public opinion events: the G20 Summit in Hangzhou, the ruling of the International Tribunal on the South China Sea Disputes between China and the Philippines, and the explosions in the Port of Tianjin in the summer of 2015.... (continue reading)

Read Full Report Here

Frauds, Strategies and Complaints in Germany

Authored by: Walter Mebane, Joseph Klaver, and Blake Miller

Abstract: Many statistical methods that use low-level election vote count data to detect election frauds have the limitation that they have a hard time distinguishing distortions in vote counts that stem from voters’ strategic behavior from distortions that originate with election frauds... (continue reading)

Read Full Paper Here

Measuring Changes in Vatican Social Policy from Papal Documents

Authored by: Anna Grzymala-Busse and Blake Miller

Abstract: There is much debate about how, when, and why the Church has changed its emphasis on secular states and social issues. Unfortunately, large-scale, systematic research on papal policy... (continue reading)

Read Full Paper Here