This ongoing collaborative project with Catherine Lee (Rutgers Sociology) uses computational text analysis to study the use of diversity and population terms in biomedical research from 1990 to 2018.
While existing research on race and affirmative action has documented the growth of diversity projects in education and employment, the field of science & technology studies has documented scientists' persistent efforts to molecularize race in biomedical research. In our paper, we employ computational text analysis to examine quantitative patterns in the use of diversity and population-specific terminology in biomedical research over the past three decades. Drawing on two large samples of scientific abstracts, we document trends in both the raw and proportional usage of these terms. The study demonstrates a pronounced growth in the use of diversity-related terminology, including terms like diversity, race/ethnicity, and sex/gender. Moreover, while our results show that population terminology has grown dramatically over the same period, specific kinds of population terms have varied in ways that previous research may not have anticipated. Most notably, we find that the use of national and continent-specific terms continues to rise across our full analysis window, while terms like race and ethnicity and other terms derived from US Census categories have become less prevalent over the past decade. Taken together, our work points to a quantitative shift toward biomedical researchers enacting population distinctions that rely on geographical rather than racial and ethnic classifiers.
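
To make the distinction between raw and proportional usage concrete, here is a minimal Python sketch of one way such counts could be computed per year. It is an illustration only, not the project's actual pipeline: the term lists in `TERM_SETS`, the function name `term_usage_by_year`, the toy abstracts, and the choice of "share of abstracts containing at least one term" as the proportional measure are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical term lists for illustration; NOT the study's dictionaries.
TERM_SETS = {
    "diversity": ["diversity", "diverse"],
    "census": ["race", "ethnicity", "hispanic"],
    "geographic": ["european", "african", "asian", "japanese"],
}


def term_usage_by_year(records, term_sets=TERM_SETS):
    """Count term usage per category and year.

    records: iterable of (year, abstract_text) pairs.
    Returns {category: {year: (raw_count, share_of_abstracts)}}, where
    raw_count is the total number of term matches in that year and
    share_of_abstracts is the fraction of that year's abstracts that
    contain at least one term from the category.
    """
    # One case-insensitive, whole-word pattern per category.
    patterns = {
        name: re.compile(
            r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
            re.IGNORECASE,
        )
        for name, terms in term_sets.items()
    }
    totals = defaultdict(int)  # abstracts per year
    raw = {name: defaultdict(int) for name in patterns}
    hits = {name: defaultdict(int) for name in patterns}

    for year, text in records:
        totals[year] += 1
        for name, pattern in patterns.items():
            n = len(pattern.findall(text))
            raw[name][year] += n
            if n:
                hits[name][year] += 1

    return {
        name: {
            year: (raw[name][year], hits[name][year] / totals[year])
            for year in sorted(totals)
        }
        for name in patterns
    }


# Toy abstracts, invented for this example.
records = [
    (1990, "Genetic diversity in a Japanese cohort."),
    (1990, "Race and ethnicity were self-reported."),
    (2018, "A diverse sample of European and African ancestry."),
    (2018, "Outcomes in Asian populations."),
]
usage = term_usage_by_year(records)
```

Reporting both numbers matters because the volume of published abstracts itself changes over time: a raw count can rise simply because more abstracts are published each year, whereas the proportional measure shows whether a category of terms is becoming more or less common relative to the literature as a whole.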