Distant Reading

What if, finding an intriguing term in her book, the reader pictured above wanted to see if it appeared in any of the books behind her? Not a realistic task without the tools of distant reading.

Photo by Seven Shooter on Unsplash.

What is it?

If close reading means zooming into a text, working at the micro-level of sentences and paragraphs to discover layers of meaning, distant reading means zooming out, working at the macro-level of a whole text or across texts to discover layers of meaning.

When we engage in distant reading, we take computational approaches—analyzing in ways that would be undesirable or impossible by hand. We can investigate class and gender in Shakespeare plays by quantifying the lengths of lines characters get. We can explore cultural values across epic poems to compare how frequently and when duty, valor, and honor appear in The Odyssey, The Ramayana, and The Iliad. We can chart the use of the word schadenfreude (taking pleasure in others’ misfortunes) in German texts[1] from the 1800s to the present—spoiler alert, it’s on the rise. We can even simply preview texts, looking to see what terms appear most frequently in a lengthy article on climate change.

While distant reading at its most sophisticated can involve complex programming and natural language processing libraries, you’ve probably already done some distant reading yourself. Ever use command+f or ctrl+f to find a term in a document? Ever create a word cloud to highlight the most frequent terms of an online discussion? If so, you have used computational methods to analyze a text. These approaches may seem simple, but they can be used to do some complex work. Plus, newer apps such as Voyant allow upper-level students and instructors to perform highly sophisticated computations without any programming.

Why do K-12 educators care?

Distant reading is particularly powerful in the classroom because it can hook both reluctant readers and avid readers into thinking about texts in new ways.

For reluctant readers, a quantified approach offers a fresh way into texts. In just minutes, students can uncover interesting finds that motivate a deeper dive and promote greater willingness to stick with a text over weeks. Revisiting distant reading throughout the course of close reading can yield deeper and deeper layers of analysis as well.

For bookworms, distant reading provides the thrill of a completely new reading experience. Their insights as close readers can turn into quantifiable questions, and they can start to think about the sentence, paragraph, and page in the broader context of the full text, an author’s full body of work, or a whole genre.

Most importantly, distant reading flattens the hierarchy of the classroom. Students feel empowered because they know that the teacher doesn’t have a fixed answer in mind. In fact, the teacher may not even have a fixed question in mind—a freedom that is liberating for reluctant readers and ideal for avid ones. Anyone can make a discovery and offer it up for distant or close analysis or refining by the group. In its exploratory nature, distant reading values diverse minds and experiences around the table, as each student brings their own agenda to the quantifying.

What can it look like in the classroom?

Using command+f to investigate a sophomore’s off-hand question

Recently, a 10th grader reading The Scarlet Letter (who was quite challenged by the difficult text) asked idly why chapter 14’s title, “Hester and the Physician,” referred only to Hester by name and not to Roger Chillingworth, the “physician.” We used distant reading to investigate her question. The table of contents (a sort of manual distant reading) revealed that only Hester and her daughter Pearl are named in chapter titles; all other characters who appear in chapter titles (all of them men) are referred to by their professions only, including the minister Arthur Dimmesdale, the father of Hester’s child.

Heading to the Project Gutenberg copy of the text, a quick distant read using command+f showed that “physician” appears 70 times in the text, “minister” 204 times, “Pearl” 255 times, and “Hester” 408 times. In the Chrome browser, the matches are also marked visually along the right-hand side according to where they appear in the text, adding a mapping component to distant reading. The revelation spurred interesting and important questions and conversation about what this quantification said about the narrative, the characters, and the professions of the two men compared to their actions in the plot.

A search for the word “minister” in The Scarlet Letter reveals 204 instances. The orange lines running the length of the right-hand border display the term’s sparse use in the opening third of the text followed by dense use in the remaining two thirds.
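For readers who want to reproduce this kind of count outside the browser, here is a minimal Python sketch. It is not the method used in class. It counts case-insensitive matches, roughly as a browser's find bar does, and reports how the matches fall across equal sections of the text, a rough numeric stand-in for the orange markers. The function names and the short sample string are illustrative assumptions; a real run would load the full Project Gutenberg text, and counts can differ slightly from the browser's depending on the edition.

```python
def find_positions(text: str, term: str) -> list[int]:
    """Return the start index of every case-insensitive match of term."""
    if not term:
        return []
    haystack, needle = text.lower(), term.lower()
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


def spread_by_section(text: str, term: str, sections: int = 3) -> list[int]:
    """Count matches in each equal-length section of the text."""
    counts = [0] * sections
    for pos in find_positions(text, term):
        # Map the character offset to a section index.
        index = min(pos * sections // len(text), sections - 1)
        counts[index] += 1
    return counts


# Illustrative stand-in for a full text loaded from a file.
sample = "The minister spoke. Hester waited. The minister and the physician left."
print(len(find_positions(sample, "minister")))
print(spread_by_section(sample, "minister"))
```

With a full novel in place of the sample, the list returned by spread_by_section gives a quick sense of whether a term clusters early, late, or evenly.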

Using word clouds to compare presidential inaugural addresses in the 7th grade classroom

In light of increased divisiveness between the United States’ two major parties leading up to the 2020 election, students distantly read the transcripts of inaugural addresses from Yale’s Avalon Project by generating word clouds to compare term frequencies. Amid lively shout-out discussions, they dropped their initial findings, along with a few notes, into a collaborative slide show and then dug into the meatier questions together. Students were motivated by their discoveries to read the addresses more closely, looking to answer the questions their distant reading uncovered. For example, one student noted that “common” was frequent in the two speeches she compared (Bush’s 2001 and Obama’s 2009 addresses), and she wondered in what sense the word was used. That drove the students back into the texts, and they discovered that “common” was most often used as a unifier, as in Grant’s “common country,” Roosevelt’s “common enterprise,” and Obama’s “common good,” “common purpose,” “common humanity,” and even “common dangers.” They started sharing the contexts of the finds, such as Bush’s line, “Our national courage has been clear in times of depression and war, when defending common dangers defined our common good,” and discussing the potential power of commonality and the use of emotion as a rhetorical device.

Students share their inaugural address word clouds with each other on Google Slides to generate lines of inquiry.

Using word clouds in 8th grade science to preview articles and boost literacy skills

Physical science teacher Peter Hill provides students with word clouds of articles before they read, asking them to “anticipate what this reading’s going to be about, what are the main ideas, and then what’s a weird word that you see that you don’t know?” In just five minutes, students have paved the way for the article to come.

Using word clouds in 4th grade to boost self-esteem

One Florida teacher used word clouds to wrap up last year in a positive way. She invited her students to write a little about each of their classmates, then created word clouds from their responses that she printed and sent home. The word clouds highlighted the most frequent adjectives, such as funny, hard-working, caring.

What should I be careful about?

Text availability. Students can distant read any text in digital format, from Supreme Court decisions to tweets. But when we think about the literature classroom, the rules of copyright shape what’s available. The majority of public-domain books, short stories, and poetry available are over 70 years old, so they often reflect the social values, language, and publishing world of an era less open to a range of voices. Not surprisingly, lots of literature-based distant reading has focused on white voices on the page, which amplifies those voices even further.

The issue presents educators with two opportunities: 1) to have discussions about copyright and how it affects what’s openly available, and 2) to investigate the widely available text with a more critical societal lens.

A screenshot of the first 22 titles on Project Gutenberg’s list of the top 100 books downloaded on October 9th reflects the prominence of white, male authors, voices that were even more dominant in publishing 70+ years ago than they are today.

Channeling the thrill of discovery into critical thinking. When students distant read, they are often able to make fast and thrilling discoveries. That leads to exuberant shouting out, which is great for collaborative nuancing and side explorations. However, students will sometimes jump to conclusions, apply 21st-century thinking to older texts, or simply misread. As a result, teachers may want to distinguish discovery time, where students explore a host of exciting paths quickly and generate lots of analytical possibilities, from investigation time, where students work together to validate or refine initial questions or conclusions.

Results skewed by translations and publisher text. At present, the most accessible distant reading tools don’t bring much human nuance into their processing. For example, a search I conducted in Voyant for Rama, the protagonist of the epic poem The Ramayana, suggested that he doesn’t exist in the text. Knowing that to be impossible, I took a closer look at the Project Gutenberg version I was using, only to realize that the translation transliterated his name as Ráma. A new search for the accented name found 1529 references, a number much more suitable for an epic hero. While that distant misread was an easily identifiable one, other translation issues may play into our unconscious bias. For example, investigating word complexity of a translated text reveals only the complexity of the translator’s decisions, not the original text.
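One way to guard against the Rama/Ráma miss in your own scripts is to fold accents before searching. The Python sketch below illustrates that idea using the standard unicodedata module; it does not describe how Voyant processes text, and the function names and sample sentence are invented for illustration.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks so 'Ráma' and 'Rama' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def count_folded(text: str, term: str) -> int:
    """Count case- and accent-insensitive occurrences of term in text."""
    return fold_accents(text).lower().count(fold_accents(term).lower())


# Invented sample sentence, not a quotation from the translation.
sample = "Ráma spoke, and Ráma's brother answered."
print(sample.count("Rama"))          # exact search misses the accented name
print(count_folded(sample, "Rama"))  # folded search finds it
```

Folding accents is itself an interpretive choice: it can merge words that a translator deliberately kept distinct, so it is worth telling students when it has been applied.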

Similarly, digital texts often include publication extras, from introductions to page numbers or chapter titles. For example, “pg” shows up in a word cloud of frequent terms for that version of The Ramayana, an artifact of the page markers carried over from the digitized edition.

Of course, these issues can yield fruitful conversations as well, as students consider what translation means in an age when algorithms shape our digital and lived experience and how publishers' additions to texts shape their interpretation.

How can I try it?

Got 5 minutes?

Convince yourself that distant reading can generate some fascinating questions. In fact, if you’ve only got 2 minutes, head to Google’s Ngram Viewer to see how often any word, name, or phrase has appeared over time in its collection of millions of digital books.

If you’ve really got 5 minutes, head to Project Gutenberg’s bookshelves to find a book you know pretty well or have taught. Grab the HTML option, if given a choice.

Use a simple command+f (or ctrl+f) search to look for terms, noting where they appear and how often. Try:

  • Character or place searches: Who shows up the most? By what name? When do they appear?
  • Gendered language: What genders and gender roles dominate the text and where? He? She? Husband? Wife? Father? Mother? King? Queen?
  • Value searches: Where is love? Duty? Honor? Kindness?
  • Historical or domain term searches: What language unique to the time or subject comes up often?

As you explore, note interesting limitations and workarounds. For example, if exploring Little Women by Louisa May Alcott, just searching for “Jo” will yield 1,841 hits including where the two letters appear in words like “joy” in addition to the protagonist’s name. Searching for “Jo ” (including the space) will narrow that to a still-impressive 706.
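The trailing-space workaround can be compared with a whole-word search in a short Python sketch. This is an illustration under stated assumptions rather than a recommended tool: the function names and the sample sentence are invented, and the numbers it prints apply only to that sample, not to Little Women.

```python
import re


def count_substring(text: str, term: str) -> int:
    """Count every case-insensitive occurrence, even inside longer words."""
    return text.lower().count(term.lower())


def count_whole_word(text: str, term: str) -> int:
    """Count term only where it stands alone as a word (case-sensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text))


# Invented sample sentence, not a quotation from the novel.
sample = 'Jo laughed with joy. "Jo!" cried Meg, and Jo\'s joke fell flat. John sighed.'
print(count_substring(sample, "Jo"))   # also matches joy, joke, John
print(count_substring(sample, "Jo "))  # misses "Jo!" and "Jo's"
print(count_whole_word(sample, "Jo"))  # catches all three uses of the name
```

The word-boundary pattern still counts the possessive “Jo’s” as a hit for “Jo,” which is usually what a character search wants but is worth knowing about.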

Got a whole class period?

Use a free word-cloud generator, such as WordArt, to have students explore and compare texts broadly, getting an overview of word frequency without having to generate specific inquiries. Students can compare articles about the same event from different news sources, separate chapters of a novel, or different novels from the same author. This exploration is great for a full class period because students can really play, making initial finds and design decisions—and then follow that exploration up in a more reflective homework assignment where students think more deeply about their findings or propose continued investigation.
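For teachers or students curious about what a word cloud is counting, the Python sketch below shows the underlying tally: the most frequent words in a text after common filler words are removed, plus the words that two texts share among their top terms. It is a simplified illustration, not how WordArt or any particular generator works. The stop list is deliberately tiny, and the two sample “speeches” are invented placeholders rather than quotations.

```python
import re
from collections import Counter

# A tiny illustrative stop list; real tools ship much longer ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "our", "we"}


def top_terms(text: str, n: int = 5) -> list[tuple[str, int]]:
    """Return the n most frequent words, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


def shared_terms(text_a: str, text_b: str, n: int = 5) -> set[str]:
    """Return words that rank in the top n of both texts."""
    top_a = {word for word, _ in top_terms(text_a, n)}
    top_b = {word for word, _ in top_terms(text_b, n)}
    return top_a & top_b


# Invented placeholder texts, not quotations from any address.
speech_a = "Our common good and our common purpose guide the nation."
speech_b = "We defend the common good of the nation in common cause."
print(top_terms(speech_a))
print(shared_terms(speech_a, speech_b))
```

Adding a term such as “pg” to the stop list is one simple way to keep publisher artifacts out of the tally.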

Got a whole unit or course?

For 9th grade and up, consider trying Voyant, a free web tool with powerful distant-reading capacities. Budget one day just for playing with the tool in a low-stakes way, perhaps by comparing translations of three world epics: The Ramayana, The Odyssey, and The Aeneid. You can guide students through the myriad lenses the app provides, such as the microsearch, summary, or word trees, or you can appoint student teams to choose a lens, play, and report back to the group.

Where can I learn more?

“What Is Distant Reading,” from Peace Ossom-Williamson and Kenton Rambsy’s The Data Notebook, provides a great intro to the practice.

Interested in what distant reading can do for even short texts? Read Heather Froehlich’s experience in playing around with Kate Chopin’s “The Awakening.”

Ted Underwood’s “Distant Reading and Recent Intellectual History” and Lisa Marie Rhody’s “Why I Dig: Feminist Approaches to Text Analysis,” both from Debates in the Digital Humanities 2016, offer some great scholarly context for distant reading.

Explore additional classroom ideas with the Resilient Educator’s thoughts on word cloud use to improve reading engagement and the Teaching Channel’s lesson share on using word clouds in the science classroom.

Interested in more guidance when playing with Voyant? Take a look at their tutorials or have your students review them and teach each other.


[1] At least we can do this investigation with the German texts that are part of Google Books’ collection of millions. Google’s Ngram Viewer provides a quick search for words, phrases, parts of speech, and other text bits across that vast collection. But note that while it does have an impressive repository of texts in languages other than English, such as Chinese and Spanish, its sample search does play into Western cultural biases.

Topics in Digital Pedagogy