NYWalker is a distributed project that aims to collect data and develop a rich database of places mentioned in various novels. As the name suggests, the initial focus is on novels about New York City, but the code doesn’t particularly care about New York City. We do.


As the geospatial Digital Humanities matures, a lot of the work being done on “space in literature,” as Franco Moretti refers to it, involves named-entity recognition of a giant dataset of novels. See, for example, the Textual Geographies project, which “presents, analyzes, and visualizes a collection of more than 5 billion named locations extracted from 5 million books and journals.”

We’re unsatisfied with the results available from that kind of analysis; it doesn’t answer the questions we have, as NER strips so much semantic (and probably more subjective) information away from each instance of a place mentioned in a text.

Instead, this application relies on time-consuming hand entry, one object at a time, serving as an example of tiny digital humanities. The default setting is no more semantically rich than what NER would return (simply the place name and its position in the text), but it is not terribly difficult to expand the Instance model to include, say, a boolean for whether the instance is inside dialog or part of a trip, or to create a Character model representing the character responsible for that instance in the text.
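To make that kind of extension concrete, here is a minimal sketch in Python. It is not the application's actual schema: the field names (`place_name`, `position`, `in_dialog`, `part_of_trip`, `character`), the helper `mentions_in_dialog`, and the sample records are all assumptions invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Character:
    """Hypothetical model: the character responsible for a mention."""
    name: str


@dataclass
class Instance:
    # Default fields, roughly what NER would also return.
    place_name: str
    position: int  # assumed to be a page or sequence number in the text
    # Hypothetical extensions that hand entry makes possible.
    in_dialog: bool = False
    part_of_trip: bool = False
    character: Optional[Character] = None


def mentions_in_dialog(instances: List[Instance]) -> List[Instance]:
    """Return only the instances flagged as occurring inside dialog."""
    return [inst for inst in instances if inst.in_dialog]


# Made-up sample records, not data from the project.
instances = [
    Instance("Washington Square", 12),
    Instance("Broadway", 40, in_dialog=True, character=Character("narrator")),
]

for inst in mentions_in_dialog(instances):
    print(inst.place_name, inst.position, inst.character.name)
```

Because the extra fields default to `False` or `None`, records entered with only a place name and position stay valid after the model is expanded.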


We want to understand specific novels better. We can do that, we believe, in part by creating a (geospatial) database that helps us answer questions about U.S. novels primarily related to New York City. But we also want this database to be available to the outside world. It’s an idiosyncratic product, possibly recreating issues related to selection bias, canonization, and the rest. But it’s a start.

In addition to the above research goals, we also use this application pedagogically. Entering data is part of the course requirements for the “Writing New York” course at New York University, and the software is also used in at least one version of NYU’s “Digital Literary Studies” course. We believe that it’s a lightweight point of entry into the (geospatial) digital humanities, both providing instant feedback (a map!) and encouraging students to collaborate, act as detectives hunting down geographical data, and the rest.

Finally, the work is public-facing, fulfilling a final pedagogical goal: giving students the opportunity to work on research projects with “real-world” applications.

Citing NYWalker

NYWalker depends on the labor put into it by students, staff, and faculty. For more information on how to cite this data repository, please see the citing page. That page is generated computationally, so it may take a while to load.


The above is a shorter version of the README.md on GitHub. More information, especially regarding the technology underlying the application, can be found there.