Reading and Writing Electronic Text

Schedule: Spring 2024

Syllabus here. Readings should be generally available on the web, unless otherwise indicated. If you’re having trouble accessing a resource, try the NYU Library Proxy (it’s very easy to set up). Please contact me ASAP if you have trouble accessing any of the readings.

Items marked as “intertexts” are related work (artworks, poems, etc.) that are in dialogue with the readings and concepts presented in class. You’re encouraged to review and engage with these works.

Please use this form to turn in your homework assignments.

NOTE: If at any point you’re having trouble running the example code, try using the code/notes on Binder instead.

Session 01: Strings and expressions

Date: 2024-01-25.

Assignment #1

Due at the beginning of session 02.

Many well-known poetry generators, especially those from the beginning of the computational era, work by drawing randomly from a limited vocabulary of strings and composing those strings with straightforward string manipulation techniques (concatenation, splitting, joining, indexing, etc.). Create your own poetry generator using these techniques. Use one of the generators implemented in this notebook as a starting point for your creation. It’s okay if at this point you’re just copying and pasting code! You should be able to identify the parts of the code you can change without breaking things. (But also, don’t be afraid to break things.)
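For orientation, here’s a minimal sketch of the kind of program this assignment asks for, using only random choice and basic string operations. The vocabulary and line shape are invented placeholders, not taken from the notebook:

    import random

    # tiny hand-picked vocabularies of strings (placeholders; substitute your own)
    adjectives = ["quiet", "electric", "hollow", "violet"]
    nouns = ["garden", "signal", "archive", "moon"]
    verbs = ["dissolves", "hums", "waits", "unfolds"]

    def poem_line():
        # compose one line by joining randomly chosen words
        return " ".join(["the", random.choice(adjectives),
                         random.choice(nouns), random.choice(verbs)])

    # join several lines into a stanza
    print("\n".join(poem_line() for _ in range(4)))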

In your documentation, discuss the poetry generator you created and how it differs from the code it was based on. Why did you choose the vocabulary that you chose? Does your generator succeed in creating output that surprises you?

Reading assigned

To be discussed (briefly) in session 02.

The short chapter from Hartman’s Virtual Muse sets up a theoretical framework for understanding the appeal of computer-generated poetry: juxtaposition. Consider Hartman’s thesis in the context of the poem generators you modified in this week’s assignment. “Bots Should Punch Up,” on the other hand, presents a moral framework for computer-generated texts (and algorithmic action in general).

Optional, on that Pound poem cited in the Hartman chapter above:

Session 02: Lists and lines

Date: 2024-02-01.

Reading assigned

To be discussed in session 03.

How (if at all) is randomness uniquely suited to creating surprising juxtapositions in poetry? How does a procedural technique reflect the intentionality of its creator? What effect does the choice of source text have on the “output” of a procedural technique?

Optional but recommended:

Session 03: Lists and lines, continued

Date: 2024-02-08.

Assignment #2

Due at the beginning of session 04.

The digital cut-up. Create a notebook program that reads in two or more texts and stores portions of them in Python data structures. The program should create textual output that creatively rearranges the contents of the texts. Use functions from the random module as appropriate. You must use lists as part of your procedure. Choose one text that you created with your program to present in class.
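One possible shape for such a program, as a rough sketch; the filenames are hypothetical, and any two plain-text files will do:

    import random

    # read two source texts into lists of words (hypothetical filenames)
    words_a = open("text_a.txt").read().split()
    words_b = open("text_b.txt").read().split()

    lines = []
    for _ in range(10):
        # cut a short run of words from a random offset in each source
        a = random.randrange(len(words_a) - 5)
        b = random.randrange(len(words_b) - 5)
        chunk = words_a[a:a + 5] + words_b[b:b + 5]
        random.shuffle(chunk)  # rearrange the combined chunk
        lines.append(" ".join(chunk))

    print("\n".join(lines))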

Intertexts

On cut-ups.

Session 04: Keys and values

Date: 2024-02-15.

Reading assigned

To be discussed in session 05.

These readings concern the relationship of form, content, and affordance. What is a poetic form? To what extent are form and content independent? Does a particular subject matter or phenomenology demand a particular form? What kinds of forms are procedures effective at implementing (or emulating)?

Session 05: Grammars

Date: 2024-02-22.

Assignment #3

Due at the beginning of session 06.

The genre study: Choose a genre or form of writing. This can be anything from a literary genre to a meme format. Gather a small number of exemplars of the genre and make a list of the characteristics common to all of them. Then write a program that produces texts emulating that form or genre.
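One way to begin, sketched under the assumption that your genre’s shared characteristics can be captured in a fill-in-the-blank template; the “fortune cookie” rules below are invented placeholders:

    import random

    # shared characteristics of the exemplars, encoded as a template
    template = "A {adj} {noun} will {verb} you {when}."
    slots = {
        "adj": ["pleasant", "unexpected", "small"],
        "noun": ["surprise", "journey", "letter"],
        "verb": ["find", "follow", "change"],
        "when": ["soon", "before spring", "when you least expect it"],
    }

    for _ in range(3):
        # fill each slot with a random choice from its word list
        print(template.format(**{k: random.choice(v) for k, v in slots.items()}))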

Intertexts

On genre and poetic form.

Session 06: Natural language processing

Date: 2024-02-29 (“Leap Day Spectacular”).

Reading assigned

To be discussed in session 07. The Bucholtz reading addresses the process of turning a speech event into text, arguing that there is no such thing as a perfect transcription. Reflect on the importance of this point for computational models of language, which must always begin with some kind of transcription. Drucker presents a criticism of computational text analysis (often called “distant reading”) based on a distinction between “reading” text (“self-production and subject enunciation”) and “computationally sorting” text. Do you agree with this distinction? Does “distant reading” in fact “relieve us of the task of reading”? Soria describes the difficulty of making a language digital in the first place when the language in question is not already politically and economically entrenched.

Recommended:

Tatman’s talk illustrates how easy it is to unintentionally build “social category detectors” with text analysis, which can then serve as tools for discrimination, and offers some guidelines for avoiding those outcomes in new systems.

Intertexts

On transcription.

Session 07: Distributional semantics and vector representations of language

Date: 2024-03-07.

Intertexts

Assignment #4

Due at the beginning of session 08.

The digital cut-up revisited. In assignment #2, the tools available to you for cutting up and rearranging texts relied only on information present in the character data itself. Since then, we’ve learned several methods for incorporating outside information about syntax (e.g., with spaCy) and semantics (e.g., word vectors) into what we “know” about a text. Adapt your original digital cut-up assignment to make use of one of these new sources of information. What new aesthetic possibilities become available if the unit of the cut-up can be a syntactic unit (instead of a word, line, or character), and if stretches of text can be selected algorithmically, not at random but based on their meaning?
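As a sketch of the syntactic variant, assuming spaCy and its en_core_web_sm model are installed (the filename is hypothetical), the unit of the cut-up might be the noun chunk:

    import random
    import spacy

    nlp = spacy.load("en_core_web_sm")

    # parse the source text so the cut-up can operate on syntactic units
    doc = nlp(open("source.txt").read())
    chunks = [chunk.text for chunk in doc.noun_chunks]

    # rearrange the noun chunks instead of raw words or lines
    random.shuffle(chunks)
    print("\n".join(chunks[:10]))

A semantic variant might instead pick stretches of text by ranking their word-vector similarity to a probe word or phrase, which would require a model with vectors, such as en_core_web_md.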

Session 08: Language models, part 1

Date: 2024-03-14.

Reading assigned

To be discussed in session 09.

Hartman and Bender et al. represent opposite ends of the history of language models. Hartman explores the creative potential of Markov chains (a simple form of language model), concluding that their value partially lies in “the wickedness of exploding revered literary scripture into babble.” On the other hand, Bender and her colleagues outline the ways that large language models like GPT-3 can contribute to inequity in society and to externalized costs like the carbon emissions that drive climate change. What are the poetic potentials of language models? Can language models be used ethically, and under what conditions? Karawynn echoes many of Bender et al.’s criticisms of LLMs, and speculates that the tendency to attribute “intelligence” to LLMs is rooted in culturally specific attitudes about the relationship between language and intelligence. Golumbia’s piece is a provocation; let’s be provoked by it together.

Optional but recommended:

Intertexts

Session 09: Language models, part 2

Date: 2024-03-28.

Assignment #5

Due at the beginning of session 10.

Use predictive models to generate text: either a Markov chain or a neural network, or both. How does your choice of source text affect the output? Try combining predictive text with other methods we’ve used for analyzing and generating text: use neural network-generated text to fill Tracery templates, train a Markov model on the output of a part-of-speech parse of a text, or try some other combination. What works and what doesn’t? How does neural network-generated text “feel” different from Markov-generated text? How do the length and the unit of the n-gram affect the quality of the output?
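For the Markov option, here is a minimal word-level bigram sketch (the filename is hypothetical; a library like markovify implements a more robust version of the same idea):

    import random
    from collections import defaultdict

    words = open("source.txt").read().split()

    # bigram model: map each word to the list of words observed after it
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)

    # generate by repeatedly sampling a follower of the most recent word
    output = [random.choice(words)]
    for _ in range(30):
        followers = model.get(output[-1])
        if not followers:
            break
        output.append(random.choice(followers))

    print(" ".join(output))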

Session 10: Words and sound

Date: 2024-04-04.

Reading assigned

To be discussed at the beginning of session 11. These readings address how the way words sound can be deployed for poetic and rhetorical effect, the role of sound symbolism in language and literature at large, and how sound is represented computationally.
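For a concrete look at one computational representation of sound, here’s a quick sketch using the pronouncing library (an interface to the CMU Pronouncing Dictionary; install with pip install pronouncing):

    import pronouncing

    # ARPAbet phonemes for a word, one string per known pronunciation
    print(pronouncing.phones_for_word("poetry"))

    # words the CMU dictionary lists as rhyming with a given word
    print(pronouncing.rhymes("poetry")[:5])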

Optional, on sound symbolism:

Optional, on expressiveness and identity in the practice of spelling and speaking:

Session 11: Neighbors, clusters, classification

Date: 2024-04-11.

Session 12: Applications and leftovers

Date: 2024-04-18.

Session 13: Final project presentations, part 1

Date: 2024-04-25.

  • Final project presentations.

Session 14: Final project presentations, part 2

Date: 2024-05-02.

  • Final project presentations.