2022–23 Center for Digital Humanities Data Fellow Caitlin Karyadi writes about her year of curating data and the impact it has had on her dissertation research.
CDH Data Fellows learn best practices in data selection, structuring, cleaning, transformation, and preservation, with the aim of producing a dataset suitable for computational analysis, open-access publication, and future use in research and in undergraduate or graduate courses. Princeton faculty, staff, postdoctoral fellows, and graduate students are eligible to apply.
This project, “Shen Nanpin and the problem of authenticity,” comes out of my dissertation, which focuses on the received legacy of the Chinese painter Shen Nanpin (1682–ca. 1760) in early modern Japan. Specifically, I consider the thousands of extant paintings attributed to Nanpin, who, decades after his brief journey to Japan, became a figurehead of the Japanese painting canon. While previous studies have disregarded the bulk of these paintings as “fake,” I instead interrogate how these objects were made, how they relate to one another, and how they came to construct Nanpin’s art-historical memory.
To organize and compare a large amount of data, I have enlisted practices and perspectives from the Digital Humanities to create and manage two datasets. The first consists of over two hundred works that I have viewed and photographed firsthand in Japanese, Chinese, European, and American museum and private collections. My second dataset encompasses every “Nanpin” painting that has been recorded in museum databases, auction catalogues, and private collections, among other sources. As I have not seen many of these objects, this dataset is less granular than the first.
As a CDH Data Fellow, and with the helpful feedback of Emily McGinn, Leigh Lieberman, and Joshua Seufert, I refined, cleaned, and added entries to my second dataset. This dataset is still a work in progress, but it has allowed me to chart interrelationships and quantify trends among Nanpin-attributed paintings, analysis that I have incorporated into the third chapter of my dissertation. As part of my ongoing research, I intend to complete the second dataset (a few hundred more entries to go), continue to view as many “Nanpin” paintings as possible, and create meaningful data visualizations to represent this work.
Through this process, I have experienced how a well-curated dataset forms an argument in and of itself. Hence, while data curation can be slow and involved, the cycle of designing an initial framework, routinely adding to the dataset, and then refining it based on usability and evolving goals has opened up new research directions and questions in my work.