There might not at first seem to be much common ground between minimal computing and artificial intelligence in its post-GPT-3 incarnation. But the frictionless interfaces and cheerful personas of generative AI chatbots obscure the energy, resource, and human costs of training large language models (see for example Kgomo 2025; Narayanan and Kapoor 2024, 144; Rowe 2023; and Schwartz et al. 2019), while big tech companies hoover up increasing amounts of data, shatter benchmarks, dominate competitors, and generate higher revenue. In this context, bringing minimal computing and its principles into the classroom provides a countervailing perspective to the “arms race” ethos driving developments in machine learning, and presents students with a framework for thinking critically about the AI-powered technologies that underpin our daily lives. Concomitantly, the need for AI literacy is becoming increasingly pressing (see Dimock 2020, 452–3; Mollick 2024, 45; Raley and Rhee 2023, 191; Willison 2024), as the rise of downstream problems such as misinformation indicates that many people do not necessarily understand that a sequence of probabilistically generated tokens is not in the same epistemological category as a statement whose reliability can be gauged by referring to a credible source. The confusion between categories is compounded by the rhetorical authority AI chatbots perform, and by the fact that the outputs of large language models (LLMs) reflect the limitations of the corpora on which they have been trained.
It is this latter problem of the relationship between the outputs of LLMs and the corpora they have been trained on that I take up in this essay, which presents a minimal computing approach to corpus literacy and how it can be incorporated into the classroom. While “corpus literacy” is used in the field of language learning to designate “the ability to use the technology of corpus linguistics to investigate language and enhance the language development of students” (Heather and Helt 2012, 417), I am using the term here in a broader sense to refer to questions such as where the texts comprising a corpus have come from, the principles governing text selection, and the preprocessing steps that have been applied to them. If students have only a vague sense that AI chatbots rely on “scraping the internet,” they are likely even hazier on the processes involved in turning texts into data. What are the practical steps and the curatorial decisions that go into constructing a corpus, and what are the implications of these decisions for the computational text analysis that follows? In a single semester, it may not be possible for teachers with limited resources to train a model, though there are ways of doing so on a laptop that align far better with the minimal computing principle of, as Roopika Risam and Alex Gil (2022) put it, “using only the technologies that are necessary and sufficient for developing digital humanities scholarship in … constrained environments” than the gargantuan models which require new data centers to be built. However, focusing on corpus construction, a prior condition for generative AI, makes it possible to address these questions and give students hands-on experience in constructing a corpus themselves.
In what follows, I outline how I take students through the process of corpus construction in my Digital Humanities for Literary Studies course at the University of Edinburgh, a course I designed in 2014 and have been teaching to final-year undergraduates and master’s students since. I am fortunate to work at a well-resourced institution with a cohort of students who, since the early 2020s, bring both laptops and smartphones to class. One constraint is that no computer lab is available for teaching the course, resulting in a mix of Mac, PC, and Linux machines in the classroom. This means relying on web-based interfaces and nonproprietary software that run across operating systems. Over the years I have put increasing emphasis on assigning students secondary reading on the historical and technical background to digitization projects such as Google Books, an approach which aligns the course with the ethos of minimal computing not so much in terms of a set of methodologies but rather as “a mode of thinking about digital humanities praxis that resists the idea that ‘innovation’ is defined by newness, scale, or scope” (Risam and Gil 2022). It also resonates with Jentery Sayers’s (2016) point that minimal computing has more to do with the material particulars of computation than elements of minimal design such as plain text, simplified layouts, and pared-back interfaces, which have been prominent in thinking about minimal computing so far. As AI hype in mainstream media discourse persistently directs attention away from the labor and the materiality underlying forms of computational work toward the latest model, the next benchmark achieved, and the newest Rubicon crossed, consciously going back down the big data scale to the OCR and preprocessing of a single text, and turning back in time from the apparently imminent AGI (artificial general intelligence) singularity to situate contemporary practices in their historical context, can itself feel like a radical move.
Thinking Critically and Historically about Corpora
Scholarly attention to corpus construction predates both LLMs and Google Books by some decades, as the body of scholarship on the topic within the discipline of corpus linguistics makes clear (see for example Biber 1993; Hardie and McEnery 2011, 2–22; Tognini-Bonelli 2001, 57–62). But while students in linguistics departments—often institutionally housed within social science or computer science administrative units—might be exposed to this body of literature, humanities students are less likely to encounter it. A reading which I assign in order to open up questions around the politics of corpus construction without the unfamiliar language of corpus linguistics is Tressie McMillan Cottom’s (2016, 542, 543–4) essay “More Scale, More Questions,” which elaborates the ways that assumptions are always embedded in the corpora on which so-called big data depends, and asserts the need for quantitative textual analysis to begin with an interrogation of the power relations, and the economic forces, involved in the construction of a corpus.
In preparation for querying the Ngram Viewer (a free online interface allowing users to search and visualize how frequently specific words or phrases have appeared in books over time) and eventually constructing their own corpus, students are assigned three essays by Robert Darnton published in the New York Review of Books on the Google Books digitization project. Writing in 2008, a few years after Google began digitizing the holdings of large research libraries and public libraries, Darnton was initially enthusiastic about the project’s potential to widen access to books, calling it “the ultimate stage in the democratization of knowledge set in motion by the invention of writing, the codex, movable type, and the Internet” (Darnton 2008). But he also raises salient—and prescient—concerns, for instance around the obsolescence of electronic media, the potential imperilment of digitized books should Google’s corporate fortunes decline, errors made while scanning, practices that deviate from the standards established by bibliographers and thus impede discoverability, and the potential for texts which are not digitized to become less visible and thus perceived as less important. In the second article, published in 2009, Darnton sets the Google Books project in the context of the Enlightenment and the economic imperatives shaping the dissemination of scholarly knowledge in both the eighteenth and the twenty-first centuries. If a certain level of privilege was required for individuals to participate in the burgeoning intellectual networks of the Republic of Letters, so too in the digital era economic inequity inflects access to knowledge when, for instance, publishers increase journal subscription prices beyond what university library budgets can bear. Weighing the possible harms of putting so much power, and control over so much information, into the hands of a single tech company, Darnton points out that as Google is a profit-making enterprise, its motivations in preserving books will inevitably differ from those of libraries, meaning that librarians “cannot sit on the sidelines, as if the market forces can be trusted to operate for the public good” (Darnton 2009). The third article, from 2014, seeks to “imagine a future free from the vested interests that have blocked the flow of information in the past” (Darnton 2014). It puts forward some mechanisms by which access to digitized books and digitized cultural heritage can be opened up: the use of preprint repositories, library consortia negotiating sustainable pricing structures with publishers, and the creation of institutions such as the Digital Public Library of America, which widens access to digitized holdings in libraries across the US through a distributed structure.
Darnton’s perspective as a book historian and librarian enables him to—in line with McMillan Cottom’s exhortation—cast light on the economic underpinnings of the Google Books project and of the data which the Ngram Viewer visualizes, and in the process to complicate the view, common among students, that information “wants to be free” and that the costs of making it accessible online and preserving it in perpetuity are negligible. Taken together, his three articles not only situate Google’s digitization efforts in a historical line of other attempts to disseminate—and monetize—access to information, and emphasize the material conditions giving rise to those efforts, but also make clear what is at stake in a mass digitization project of this sort. There are many other large digital corpora to which there is only time to gesture briefly, for example HathiTrust and Chronicling America, along with “shadow libraries”: pirated archives of copyrighted publications such as Library Genesis (see Eve 2022). Choosing to focus on Google Books via Darnton’s critical eye is an attempt to temporarily pause students’ automatic recourse to the Google search box, and to help them see that behind the frictionless process of searching there are multiple operations with a great deal of friction built in, which have unfurled over time and which have economic and legal ramifications. As the three articles predate questions that rose to public prominence in the early 2020s around the ethical, labor, and copyright implications of the web-scraping tactics of the big tech companies building LLMs, they demonstrate to students the historical continuities to be found between the current moment of AI hype and previous points when tech companies have forged ahead with ingesting large amounts of text without paying sufficient attention to established principles from fields such as bibliography and information science.
Ngrams, OCR, and Data Literacy
Students now get to encounter the Google Books data—along with some of its problems—for themselves, by exploring the Google Ngram Viewer. They begin this section of the course by reading the canonical “culturomics” paper accompanying its launch (Michel et al. 2011), alongside an essay conveying the sense of excitement that the Ngram Viewer generated on its release (Cohen 2010).
The Ngram Viewer has a number of pedagogical benefits. Importantly, it is fun: students can quickly devise queries relating to their own interests. The site is fairly stable, as the ngrams are precomputed. The search box allows for both simple keyword queries and more advanced queries that use a specific search syntax, for instance grammatical tags such as “searchword_ADJ” to find adjectives. For students with little or no exposure to more advanced search strategies, such as using Boolean operators, this is a step up in search literacy which is not too intimidating, and which can be readily grasped via a search syntax cheat sheet by Alan Liu (2022).
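For instructors who want to take the Ngram discussion one step further, the same queries can also be issued programmatically rather than through the web interface. The sketch below is a minimal example that calls the unofficial JSON endpoint sitting behind the Ngram Viewer; the endpoint, its parameters, and the “en-2019” corpus label are assumptions inferred from the web interface rather than a documented API, and they may change or be rate-limited without notice.

```python
# A minimal sketch of fetching Ngram Viewer data programmatically.
# The JSON endpoint, its parameters, and the "en-2019" corpus label are
# assumptions inferred from the web interface, not a documented API.
import requests

def ngram_frequencies(query, year_start=1800, year_end=2019,
                      corpus="en-2019", smoothing=0):
    """Return yearly relative frequencies for a comma-separated query string."""
    resp = requests.get(
        "https://books.google.com/ngrams/json",
        params={
            "content": query,
            "year_start": year_start,
            "year_end": year_end,
            "corpus": corpus,
            "smoothing": smoothing,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return {series["ngram"]: series["timeseries"] for series in resp.json()}

# The same search syntax as the web interface, including part-of-speech tags:
data = ngram_frequencies("colour, color, bright_ADJ")
print({ngram: values[-5:] for ngram, values in data.items()})  # last five years
```

Printing the returned time series alongside the web interface’s plot is a quick way of showing students that the smooth lines on screen are simply arrays of numbers.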
Alongside its ease of use, the Ngram Viewer also offers multiple ways into the unreliability of the Google Books corpus, bringing to light problems such as inaccurate publication dates and OCR errors. The concerns Darnton raises thus come to life on students’ screens, revealing some of the errors and fallibilities that underlie what might initially seem like the trustworthy infrastructure of Google. Gregory Crane is another useful reading here, as he goes into more detail on the types of noise found in massive digital libraries, which for example make books in historical languages like classical Greek essentially unsearchable (Crane 2006). Talking through how and why these problems occur is part of what the course seeks to teach about data literacy: when students encounter oddities such as unexpected dips or bumps in an Ngram plot, they are encouraged to return to the source—the scans of individual books—to investigate the possible causes.1
One type of oddity students encounter in their exploratory play with Ngrams is plots that shift between jagged and smooth, and these provide opportunities for statistical literacy. A simple plot such as the one in Figure 1 for a search for “colour” and “color” over the past four hundred years gives no explicit indication of the smaller amount of data (i.e., the number of published books) available from earlier centuries. Students can be shown that the blockiness of the lines pre-1850 compared with their relative smoothness post-1850 hints at how many more books were published in the second half of this historical period and, correspondingly, the greater reliability of the data from that period. This can lead to further discussion about what the smoothing function in the Ngrams interface might obscure or distort.
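That discussion can be made concrete by reimplementing the smoothing itself. The Ngram Viewer describes a smoothing value of n as replacing each year’s frequency with the average of that year and the n years on either side; the short sketch below, a minimal illustration using made-up values rather than real Ngram data, shows how such a centered moving average spreads an isolated spike in a sparsely attested early period across its neighboring years.

```python
def smooth(series, n=3):
    """Centered moving average: each value becomes the mean of itself and up to
    n neighbors on either side, mirroring how the Ngram Viewer describes its
    'smoothing' setting."""
    smoothed = []
    for i in range(len(series)):
        window = series[max(0, i - n):i + n + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Made-up relative frequencies for a sparsely attested early period (not real Ngram data):
raw = [0.0, 0.0, 0.0, 0.004, 0.0, 0.0, 0.0, 0.0, 0.002, 0.0]
print(smooth(raw, n=3))
# The isolated spikes are flattened and spread across neighboring years,
# which is what smoothing would hide in the jagged pre-1850 lines.
```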
Another problem that students notice, as they try to make sense of the plots, is unexpected shapes that, on closer inspection, turn out to be metadata errors. As of this writing, a search for “Smashwords”—the publisher and distributor first launched as an ebook publishing platform in 2008—shows a bump in publication dates in the 1940s (Figure 2). Following through to the scanned books reveals numerous metadata errors in Smashwords books, including incorrect publication dates (Figure 3).
Searches for “ebook” and “e-book,” meanwhile, show a spike in publication dates around 1900, thus demonstrating not only that caution is needed with metadata in Google Books data, but also that publication dates may be estimated or rounded to the nearest decade or century in ways that produce anomalies, as seen in Figure 4.
OCR errors also demonstrate to students the prescience of Darnton’s concerns that Google’s bibliographic capacities would prove somewhat lacking compared to those of professional librarians. Running a search for the common OCR error “tlie” from 1600 to 2022 shows how books in the earlier half of this period have considerably poorer OCR (Figure 5). Clicking through to view page scans of those books (Figure 6) illustrates the more uneven, blotchier printing, as well as features such as ligatures and the medial s, which are not as accurately classified by OCR engines trained on modern typography.
Clicking through from a scanned book that contains the “tlie” OCR error to Google’s metadata page reveals further errors. The title is given as Old English Drama: Students’ Facsimile Edition · Volume 100 (when it should be The Miseries of Inforst Mariage), the publisher as John S. Farmer (when it should be George Vincent), and an original publication date of 1598 (when it is 1607). The correct publication metadata is easily ascertained for this book by digitally flipping through the scanned pages, but for books whose full page scans are unavailable, checking the accuracy of the metadata is much more laborious.
As with search, OCR provides an example of a technological advance whose operation has become so frictionless that it can be hard for students to see it as a process with a history. As Ryan Cordell points out, treating OCR as an automatic process elides a substantial body of research by computer scientists: “[w]hile OCR certainly automates certain acts of transcription, it does so following constantly-developing rules and processes devised by human scholars” (2016). The human labor and expertise behind OCR were, for several years after 2019, made vividly visible to students when I was joined by a coteacher, Dr. Bea Alex, a computer scientist and linguist who gave students firsthand insights into her work on historical OCR for the project Plague Dot Text (Casey et al. 2021). As Alex’s account of her work on this project made clear, actions that might appear to an end user to be well aligned with minimal computing principles—for instance uploading a scanned image file of text into Google Drive and having the OCR immediately returned—are the result of processes that are energy- and compute-intensive and that required the deep expertise of many computer scientists for their development. While it is difficult for any user of digital technology to have their hands entirely clean in this respect, focusing students’ attention on the extent to which ostensibly minimal computational processes often obscure their more resource-intensive aspects is a valuable component of both minimal and “maximal” DH teaching.
These tasks could be seen as having drifted away from the territory of minimal computing toward that of book history and science and technology studies (STS). However, this material provides essential background knowledge for students coming from literary and historical disciplines, before they build their own corpus. It is also a part of the course well future-proofed against changes: the history of Google’s digitization project and the still-visible evidence of its errors will “stand still” in a course in which much else needs to be updated year after year.
Digital and Social Infrastructures for Coconstructing a Corpus
Having gotten some insights into how the corpus sausage is made, students now work together to construct a corpus of their own, the analysis of which will later form the core of their final project. As students do not have access to a digitization suite or a scanner, I supply each student with several dozen image files of pages which they are individually responsible for OCRing, proofreading, and then saving—with the correct metadata—to a secure online repository. Performing OCR used to require applications such as Adobe Acrobat or ABBYY Finereader, but with the advent of reliable machine learning-powered text recognition on some mobile phones from 2021 onward, most students are now able to do OCR with devices that came into the classroom in their pockets.3 On phones where this is not an inbuilt feature, scanning apps can perform the same function, and students without a smartphone can also upload PDFs to Google Docs and use its built-in OCR. On the one hand, these tools represent a considerable efficiency gain for boutique classroom OCR projects and a welcome workaround in place of expensive proprietary software. As Risam and Gil observe, working from the principle of using what we have encourages students “to focus on the assets available to them and thus resists a deficit mindset for those of us who are working under constraints” (2022). However, this part of students’ workflow also exemplifies one of the points of compromise for minimal computing principles: using tools from big technology companies, the development of which uses considerable resources and may, in addition, be underpinned by unethical and harmful practices.4 Acknowledging the difficulties of extricating oneself from the big tech companies, Risam and Gil argue that a complete divorce from maximal forms of computation is currently impossible, and so dependence on big tech and social media companies’ infrastructure will remain in place for the foreseeable future. This enmeshment with the products of big tech in our everyday lives does, however, offer opportunities for class discussion: later in the semester students read Kashmir Hill’s (2019) sobering account of how difficult she found it to divest herself of the products of the five largest tech companies, both in terms of her work as a technology journalist and the practicalities of everyday life. If those who teach digital humanities cannot—at least for the moment—avoid using digital infrastructures whose low costs, reliability, and ease of use make them a logical choice where time is limited and students’ access to the latest technology is unevenly distributed, this tension can at least be used to prompt critical reflection on the compromises involved in using such technologies.
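For classes that want to keep at least this step outside the big tech stack, an open-source route is also worth demonstrating alongside the phone- and Docs-based ones. The sketch below is a minimal example assuming the Tesseract OCR engine is installed locally along with the pytesseract and Pillow packages; the directory names are hypothetical stand-ins for however a class chooses to organize its page scans.

```python
# A minimal sketch of local, open-source OCR, assuming the Tesseract engine
# and the pytesseract and Pillow packages are installed; directory names are hypothetical.
from pathlib import Path

import pytesseract
from PIL import Image

def ocr_page(image_path, lang="eng"):
    """Run Tesseract on a single page image and return the recognized text."""
    return pytesseract.image_to_string(Image.open(image_path), lang=lang)

out_dir = Path("ocr_raw")
out_dir.mkdir(exist_ok=True)

# Hypothetical directory of page scans allocated to one student:
for page in sorted(Path("scans/student_07").glob("*.png")):
    (out_dir / f"{page.stem}.txt").write_text(ocr_page(page), encoding="utf-8")
```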
Once students have copied the text produced by running OCR on their images and pasted the results into a text file, they need to correct it, a task that some assume will involve little more than proofreading. As we undertake it in class, however, students begin to raise questions: “Should I put two hard returns between paragraphs, or indent them?”; “How are people representing the start of a new chapter?”; “Should I put spaces between each of the three dots in an ellipsis mark?” These might appear to be trivial formatting matters, but—as the students learn later when they begin querying the corpus with a concordancer—decisions about seemingly unimportant characters such as periods can have consequences when going beyond simple text searches to queries using wildcards and regular expressions. When I hear these kinds of consequential formatting problems coming to light in students’ discussions, I have taken to letting a few slip through the net, so that students discover for themselves at a later stage the importance of standardization across multiple contributions, and learn how to go back through the finished corpus to address those inconsistencies retroactively. Thus, though this task appears to be an individual one, it inducts students into what might be the most crucial skill of all for digital humanists to develop: working collaboratively with others. Used to writing essays on their own, or contributing in atomized ways to group projects, students learn that they need to check in with their fellow OCR correctors, keep a record of group decisions, and adapt their own practice accordingly.
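Some of these inconsistencies can also be surfaced mechanically once corrected files start coming in, which helps students see that their formatting decisions are machine-readable facts rather than matters of taste. The sketch below is a minimal, illustrative checker: the directory name, the ellipsis conventions flagged, and the single OCR confusion it looks for (“tlie”) are all assumptions that a class would replace with its own agreed standards.

```python
# A minimal, illustrative consistency checker; the directory name and the
# patterns flagged are assumptions a class would replace with its own standards.
import re
from pathlib import Path

CHECKS = {
    "spaced ellipsis ('. . .')": re.compile(r"\.\s\.\s\."),
    "unspaced ellipsis ('...')": re.compile(r"\.\.\."),
    "unicode ellipsis (single character)": re.compile("\u2026"),
    "likely OCR error 'tlie'": re.compile(r"\btlie\b", re.IGNORECASE),
}

for path in sorted(Path("ocr_corrected").glob("*.txt")):
    text = path.read_text(encoding="utf-8")
    for label, pattern in CHECKS.items():
        hits = len(pattern.findall(text))
        if hits:
            print(f"{path.name}: {hits} x {label}")
```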
Beyond the exigencies of standardization and the importance of consulting one’s collaborators to develop standards everyone can agree on, there are other obligations owed to one’s groupmates. As the task progresses, some students complete their OCR correction earlier than others and want to move on to the analysis, but no one can advance until the whole corpus is ready. Students thus learn how important it is to meet deadlines and to carry out their assigned tasks so as not to hold others in the group up. Here, project management tools can be useful, especially for accountability.5 As students finish their files and need to store them somewhere accessible and secure, emerging questions such as “Who OCRed this section?” and “Where can I find the latest version of this file?” reveal the importance of considerations of file-naming conventions, logical directory structures, accurate metadata recording, and version control.6 These considerations—which tend to be new to the humanities students in the class—show how data infrastructures are essential to the functioning of even the most minimal of digital projects. These data management lessons hold across operating systems and applications, and they are transferable: students report carrying them into their other classes and their capstone dissertation projects. Moreover, having to confer on file-naming conventions, directory structures, and so on is another way of putting the friction back into what would otherwise be largely frictionless processes. While I take care not to make the volume of work too onerous—choosing the length of the text to be digitized based on the number of students signed up for the class—one of my aims is for students to appreciate that, done properly, the creation of digital artifacts is laborious. Time, human labor, material resources, and other considerations7 must all be weighed when assessing how and by whom a corpus has been built, which goes well beyond the labor of authoring the texts that constitute it.
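The file-naming discussion can be given the same light scaffolding. Once a group has agreed a convention, a few lines of code can flag files that stray from it; the convention below (surname, chapter, page range, and processing status) is purely a hypothetical example rather than a recommendation.

```python
# A minimal sketch that flags files straying from a group-agreed naming convention.
# The convention itself (surname_chNN_ppNNN-NNN_raw|corrected.txt) is a hypothetical example.
import re
from pathlib import Path

NAME_PATTERN = re.compile(r"^[a-z]+_ch\d{2}_pp\d{3}-\d{3}_(raw|corrected)\.txt$")

for path in sorted(Path("corpus").glob("*.txt")):
    if not NAME_PATTERN.match(path.name):
        print(f"Does not match the agreed convention: {path.name}")
```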
Close Reading Through Data Cleaning
The idea that “the best way to get to know your data is by cleaning it” is likely a familiar one to anyone who has built a corpus from scratch. Accordingly, a benefit that flows from students doing OCR correction and proofreading is that they get to know their allocated section of the corpus well. Inspecting digitized text carefully for errors can function as a form of close reading, something which is useful preparation for the analytical work that follows. For instance, a student might spot something of interest in their allocated section and develop it into a query to be applied to the corpus as a whole. Care needs to be taken not just with cleaning and curation decisions, however, but also with the notion of cleaning as a process separate from, and subordinate to, the analysis. As Katie Rawson and Trevor Muñoz point out, the paradigms and practices that are gathered under the heading of “data cleaning” too often go unspecified in humanities work, and explicitly articulating these is important if one wants to work with data “without risking the foreclosure of specific and valuable humanistic modes of producing knowledge” (2019, 280). The work of data preparation itself incorporates cultural critical practices, they point out, and separating these from the work of analysis risks reinscribing the binary between cultural criticism and data analysis.
Rawson and Muñoz (2019, 282) illustrate this point via a case study of their efforts to “clean” a data set of food items listed on digitized historical menus, where variant spellings and orthographic conventions led them to seek ways to standardize labels for dishes, and in the process to discover that the data model around which the data had originally been organized was different from the one they needed to answer their research questions. Thinking about a data model is a crucial part of the intellectual work of analyzing and representing data, but asking students to conceptualize a data model’s relationship to data preparation decisions, research questions, and, eventually, the argument they want to make about their corpus can be forbiddingly abstract. However, if data modeling is a bridge too far for students new to the field, then preparing a corpus at least puts them in a position to see how decisions made at the level of cleaning (e.g., choosing to standardize variant spellings) and structuring (e.g., adding part-of-speech tags) will have effects on downstream queries, such as the ability to extract toponyms along with their collocating prepositions in order to map these as part of an argument about a corpus’s spatial imaginary.
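A small sketch can make that downstream effect tangible by showing how part-of-speech information supports exactly the kind of toponym query just described. The example below assumes NLTK is installed with its tokenizer and tagger data downloaded (the names of those data packages vary slightly across NLTK versions), and it treats any proper noun following a preposition as a candidate place name, a rough proxy rather than genuine named entity recognition.

```python
# A minimal sketch, assuming NLTK is installed and its tokenizer and tagger
# data have been downloaded (e.g. nltk.download("punkt") and
# nltk.download("averaged_perceptron_tagger"); package names vary by NLTK version).
import nltk

def preposition_place_pairs(text):
    """Return (preposition, proper noun) pairs as rough candidates for
    toponyms and their collocating prepositions."""
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    pairs = []
    for (word, tag), (next_word, next_tag) in zip(tagged, tagged[1:]):
        if tag == "IN" and next_tag == "NNP":  # preposition followed by a proper noun
            pairs.append((word.lower(), next_word))
    return pairs

# Candidate pairs such as ('from', 'Edinburgh') and ('through', 'Paris'):
print(preposition_place_pairs("She travelled from Edinburgh through Paris and into Rome."))
```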
Once students have their corpus—with uncorrected OCR files in one directory, corrected OCR files in another, and meaningful file names consistently assigned—they get to experience the satisfaction of knowing that they have, unusually for a course within a literature program, built something tangible. Much can be done with that thing, analytically and pedagogically, but even before any of that happens, something else has been achieved: a nascent community of practice. This is especially valuable in the face of the widely reported post-COVID decline in student engagement observed at universities across the globe (see, for example, Grove 2024; McMurtrie 2022; and Otte 2024). Meaningful collaborative activities like this—where no one can query the corpus unless everyone pulls their weight in building it—are a way of keeping students engaged with a course and accountable to each other. Scaffolding students’ ability to work collaboratively in what remains a resolutely individualist discipline—English literature—and in the specific context of the UK university system, whose relatively low contact hours militate against group projects, is not only valuable in itself (as illustrated by Croxall and Jakacki 2023; Ermolaev et al. 2024; and Kim 2024, among others), but is also well aligned with the ethos of minimal computing, given that these are skills for which no computing is required at all.
Returning to the analytical opportunities presented by a hand-curated corpus, these can also be explored in minimal ways. In class, we use the concordancer AntConc (Anthony 2022), an application which is free to download, supported on multiple platforms, relatively lightweight, and which has robust documentation. Students build on what they learned earlier about search syntax and Boolean expressions with the Ngram Viewer and take their searching up a level, for instance by specifying their own operators, which AntConc allows users to customize if they want to override its defaults. This introduces them to thinking about text searching at a more complex level than keywords or phrases, which can be connected to wider literary and theoretical ideas they have encountered in other parts of their degree. My go-to example is to have students compose a search string to investigate gendered pronouns before and after particular verbs, and then to point out that, with the verb to look, the greater prevalence of he looks/looked/is looking at her compared to she looks/looked/is looking at him is an illustration of the male gaze (see Mulvey 1975). Pursuing investigations of their own among the keyword in context (KWIC) lines and collocation tables that AntConc allows them to generate, students can then take these insights and incorporate them into an argument, which they will present as part of their final project for the course.
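The same gendered-gaze query can be reproduced outside AntConc with a few regular expressions, which is a useful way of showing students that the concordancer is not doing anything magical. The sketch below assumes the corrected corpus has been concatenated into a single plain-text file whose name is hypothetical, and it matches only the verb forms mentioned above.

```python
# A minimal sketch of the gendered-gaze query; the corpus file name is hypothetical
# and only the verb forms discussed above are matched.
import re

with open("corpus/full_corpus_corrected.txt", encoding="utf-8") as f:
    text = f.read().lower()

patterns = {
    "he looks/looked/is looking at her": r"\bhe (?:looks|looked|is looking) at her\b",
    "she looks/looked/is looking at him": r"\bshe (?:looks|looked|is looking) at him\b",
}

for label, pattern in patterns.items():
    print(label, len(re.findall(pattern, text)))
```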
These text analysis techniques are thus at a small enough scale to be manageable on a laptop. Importantly for a course located within an English literature degree, the quantitative analysis is done in conjunction with close reading: we think about how to move between the two modes, and how readings at scale might be informed by, and integrated into, scholarly articles on the primary texts. Rather than succumbing to the move toward ever-bigger data, the aim is for students to begin to be able to see the value of producing and working analytically with structured data. Even a basic level of structuring and a small dataset can be useful. Alphabetizing KWIC lines delivered by a concordancer can speed up the process of identifying variants in place names, for instance, and is more feasible than writing scripts to perform named entity recognition, which requires a level of coding expertise that is difficult to achieve in a single semester if students have no programming experience.
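The KWIC-and-sort workflow itself is compact enough to sketch in a dozen lines, which helps demystify what AntConc produces. The example below, a deliberately naive implementation that tokenizes on whitespace, builds keyword-in-context lines for a node word (here a preposition such as “near”) and sorts them by the right-hand context, the arrangement that tends to bring place-name variants next to one another; the corpus file name is again hypothetical.

```python
# A deliberately naive KWIC sketch: whitespace tokenization, fixed context window,
# and sorting by the right-hand context; the corpus file name is hypothetical.
def kwic(text, node, window=5):
    """Return keyword-in-context lines for a node word, sorted alphabetically
    by the right-hand context (which groups spelling variants together)."""
    tokens = text.split()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower().strip(".,;:!?\"'") == node.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((right.lower(), f"{left:>40} | {token} | {right}"))
    return [line for _, line in sorted(lines)]

with open("corpus/full_corpus_corrected.txt", encoding="utf-8") as f:
    for line in kwic(f.read(), "near"):
        print(line)
```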
Conclusion: Building the Foundations for AI Literacy
The more seamlessly processes such as OCR happen behind the scenes, the more important it is to teach students that they are activities which require labor, time, and resources, which were built in particular ways in specific times and places, and which will therefore have cultural, linguistic, and other biases embedded within them which reflect the specificities of their construction. Students’ experience of Google is that its suite of products manages emails, calendars, photos, documents, and much else seamlessly, but as Alison Booth and Miriam Posner put it, the errors and glitches in the scanned pages of books it has digitized serve “to peel back the glossy outer layer of Google Books to reveal an enterprise far from omnipotent” (2020, 15). Such reflections might be old news to those who have been working in digital humanities for some time, but are likely not so obvious to those whose habits of technological use have been formed in an era where IT infrastructure is, for the most part, the wallpaper of daily life: simply there, usually functional, and not something to be interrogated. Prompting such interrogation by pointing students to OCR errors—and asking them to do OCR themselves—would ideally go beyond following the principles of minimal computing to a recognition of the Anglophone and US biases that so frequently underlie digital infrastructures: coding languages whose putative human readability extends only to those who read English and in which American English is the default, applications that break when nonstandard characters such as accented letters are used, the dominance of left-to-right languages, and more.
To return to the bigger picture of AI literacy: this is a forbiddingly large thing to take on in the classroom, not only because LLMs are generally perceived as black boxes, but also because engaging critically and mindfully with them requires knowledge that extends across disciplinary boundaries. As a first step, however, understanding how textual corpora are assembled, and what is at stake when they are, can move us toward a better grasp of how the corpora on which LLMs are trained may be partial or flawed, and of the materiality and labor underpinning the otherwise ethereal experience of conversing with an AI chatbot. With a single text or a small number of texts, students do not directly address for themselves the question of selection and what is excluded, underrepresented, or overrepresented in a corpus, but trying out a tool like the Ngram Viewer and reading scholarship on corpus selection can help to show them the implications of those choices at a larger scale.
In this way, even a beginner-level digital humanities class can provide students with the opportunity to start to unfold some of the sociotechnical assemblages that constitute what is commonly referred to under the sign of “AI,” and that have produced the corpora on which generative AI and other algorithmic technologies are based. The STS scholar Lucy Suchman (2023) underscores the importance of this work of unfolding, urging us to talk about AI not as a static, singular thing, but rather in terms of the processes, material histories, actors, and other components that work to constitute it. The activities I have described above relate to only a few of the many practices and processes that need to be examined for this ongoing work of conceptual dereification to proceed, but they are one place to start with AI literacy, especially in a context where tech companies are not forthcoming about the specific texts that they have ingested into their training corpora. Coupling minimal computing principles to literary- and book history-aligned instantiations of DH in order to put some of the friction back into corpus-building is one thing we can do to better equip students to see through the rhetorical ferment of AI hype which drives the idea that bigger, better, and faster is the only way forward, and to resist technocapitalist imperatives to generate ever larger profits, consume ever greater amounts of energy and water, and naturalize the idea that computing needs to involve the disproportionate consumption of resources.
