2010

"BUILD-A-BEAR" RESEARCH PLANNING: CLARIFICATION OF DEFINITIONS AS THE KEY TO CHOOSING OR WRITING PROGRAMS

(distributed as a handout at the 2010 Chicago Colloquium on Digital Humanities at Northwestern University)

Dr. Cora Angier Sowa
Minerva Systems
Croton-on-Hudson, NY
http://www.minervaclassics.com
casowa@aol.com


Abstract

Like the oral bard, who composed his poems by putting together traditional formulae, type scenes, and themes, the software developer creates applications by putting together reusable program "formulae" or inventing new ones based on old templates. A parallel can also be found in the "Build-A-Bear" toy stores, where children can choose features to form the desired toy. In every case, the "audience" (or customer or scholar) works with the consultant (or singer or "Bear Builder") to request features that yield the most agreeable or meaningful results. Often, the scholar thinks he or she has precisely defined his or her goals, but on closer inspection, this turns out not to be the case.


Composing a narrative

The oral bard created poems like the Iliad or the Chanson de Roland by recomposing them on the spot before an audience, like a jazz musician, using traditional verbal formulae (like "rosy-fingered Dawn"), type scenes (like "the Hero puts on his armor"), and themes (like "The Hero's Journey and Homecoming"). He could sing about traditional stories, like the Trojan War, or he could make up a song about a contemporary event. Depending on audience interest, the song could be longer or shorter. Milman Parry, in the 1930s, discovered living parallels in the oral bards of what was then Yugoslavia. They, too, could adapt the old formulae to compose new songs about new topics, including a song about Milman Parry himself, quoted in Albert Lord's The Singer of Tales. James Notopoulos, in the 1950s, recorded oral bards still singing in Cyprus, in the Greek islands, and in villages on the Greek mainland. They, too, could sing old songs about the hero Digenes Akritas or new ones about their own (embellished!) memories of World War II. Notopoulos (with whom I studied) always emphasized the role of the audience in shaping the oral narrative.

A computer program is a narrative of how a problem is to be solved. We, too, use formulaic elements (add, subtract, read a file, create a table or tree structure, etc.) and "type scenes" (reusable subroutines).

Building your bear

We can compare the design of an application to the Build-A-Bear Workshop, a chain of toy stores (now also online) where a Bear Builder guides the child (or adult!) in choosing and designing his or her stuffed toy, using customizable elements. The "furry friend" is stuffed and stitched up while you are there. You choose your animal, which can be a bear or other creature. You choose a 10-second message or song to be placed in the animal, which will play each time the toy is hugged. It may be a ready-made message or you may record your own. Then you stuff the animal and give it a heart: you choose a "heart," make a wish, and help stitch it inside the animal. You give your animal a name and receive a birth certificate. You choose a costume, dress your animal, and take it home.

Clarification of definitions

In the development of applications for studying texts, a user comes to the consultant (such as myself) with a request like, "I want to do X, Y, or Z; will your program help me?" My answer is that it depends on how you define your aims. Usually, the description of X, Y, or Z appears to be very precise, but in fact it is not. When computerization of literary material first got underway, the drive was to put everything in digitized form. Then the question, as Greg Crane of the Perseus Project put it, was "What do you do with a million books?" Now, programmers have written thousands of programs, hoping they will be useful in analyzing texts, and the question is "Which one do I use (if any)?" Having worked in both the academic and commercial worlds, I have seen that in analyzing requirements for both commercial and literary projects, you have to figure out exactly what the user is driving at. In the excellent book How to Do Systems Analysis by Gibson, Scherer, and Gibson, Chapter 10 is called "The 10 Golden Rules of Systems Analysis." Rules 1 and 2 are:

  1. Rule 1: There always is a client. . .
  2. Rule 2: Your client does not understand his own problem. . .

Examples:

I have worked with two scholars, whose projects are as follows:

Spenser and Shelley: a study of intertextuality.

I have worked with Scholar 1 on finding or developing a program to find literary echoes of the type seen in the following lines, from Spenser's The Ruines of Time and Shelley's The Triumph of Life:

Fled back too soone unto their native place (-Spenser)

Fled back like eagles to their native noon (-Shelley)

When I asked the professor how he would precisely define what he was looking for, he simply quoted the lines again, and (at my request) others like them. I have communicated to him my own analysis of what we might look for to describe such echoes to a computer:

  1. Identical lexical items ("Fled back"; "(un)to their native").
  2. Identical phrases in identical metrical positions.
  3. Similar metrical shapes ("too soone unto"/"like eagles to"; "native place"/"native noon").
  4. Rhyming allusions ("soone"/"noon").
  5. Some combination of all of these, or of other factors.

The first item is the easiest. It could be handled by a program developed by Joseph Raben and David Lieberman in the 1970s (in FORTRAN; it would have to be rewritten). As input, it requires a canonized dictionary of all words in the text. As it happens, my Minerva System includes such a dictionary and the means to expand it. It was developed for my Clump Finder, but can be reused.
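
To make the first two items in the list above concrete, here is a minimal sketch in Python of how identical lexical items, and the positions where they occur, might be described to a computer. It is not the Raben and Lieberman program, and it is not the Minerva dictionary: the CANON table, the function names, and the use of word position as a rough stand-in for metrical position are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for a "canonized dictionary": it maps variant
# spellings and forms to one canonical form. These two entries are
# assumptions chosen for the example lines only.
CANON = {"unto": "to", "soone": "soon"}


def canonical_words(line):
    """Lowercase a line, split it into words, and canonize each word."""
    words = re.findall(r"[a-z]+", line.lower())
    return [CANON.get(w, w) for w in words]


def shared_phrases(line_a, line_b, min_len=2):
    """Return maximal runs of identical canonical words found in both lines.

    Each result is (phrase, start_in_a, start_in_b), with word positions
    counted from 0. Word position is only a crude proxy for metrical
    position; real scansion is not attempted here.
    """
    a, b = canonical_words(line_a), canonical_words(line_b)
    found = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip matches that merely continue a run already reported.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the run as far as the two lines keep agreeing.
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k >= min_len:
                found.append((" ".join(a[i:i + k]), i, j))
    return found


spenser = "Fled back too soone unto their native place"
shelley = "Fled back like eagles to their native noon"
echoes = shared_phrases(spenser, shelley)
print(echoes)
# [('fled back', 0, 0), ('to their native', 4, 4)]
```

Both shared phrases begin at the same word position in each line, which is the kind of evidence item 2 asks for. Note that "unto" matches "to" only because the dictionary says so; without that entry the second echo shrinks to "their native." This is one small example of how the result depends on the definitions supplied.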

Herodotus' description of the Battle of Marathon.

Scholar 2 corresponded with me about programs to study the language of Herodotus, exemplified by Herodotus' description of the Battle of Marathon. He is using an English translation to test his methods. His definition of his goal is as follows:

"Our interest is in the recognition and linguistic analysis of sentences that express historical events."

Again, this seems precise, but needs to be refined. In order to find (or design) an appropriate program or programs, I have to ask the following questions:

  1. What is meant by a historical event? A battle? Founding a city? An important political debate?
  2. How large a semantic unit fits the definition of "sentences that express historical events"? Does this only mean sentences that are part of a paragraph or chapter describing a single event? What about an event referenced in a single phrase that is part of a sentence describing another event?
  3. What kind of "linguistic analysis" is meant? Parsing? Semantic representation? Mapping to logic? Mapping to a database? Statistical analysis? Symbolic analysis? Relations between concepts?
  4. What role does the translation play? If the initial study uses a translation, only the factual part can be done, not the semantic part.
  5. What questions does his team want answered, and what kind of answers are they looking for?

The Minerva Project Planner

My Minerva System for Study of Literary Texts, which I have demonstrated at the Chicago Colloquia (and which contains a number of specific applications), now contains a Project Planner with screens to fill in. (The idea is borrowed from the Appointment Planners, Wedding Planners, etc., that let you fill in all the things you need to remember.) These screens let you define your goals and subgoals, identify outputs (what do you want to get out of it?) and inputs (what tags or other information should the texts, databases, etc. contain?), choose appropriate programs, and identify who (principal scholar, student assistants, tech support) is assigned to do what, and when. Space is also provided for a final evaluation. After filling out all the screens, the user will have a complete Project File describing the entire research project.
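
As a rough illustration of what such a Project File might hold, the following Python sketch models the planner's sections as a simple record. It is not the Minerva implementation; the class names, field names, and the missing_sections helper are hypothetical, chosen only to mirror the categories described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assignment:
    """Who is assigned to do what, and when (all free-form text)."""
    person: str
    task: str
    due: str


@dataclass
class ProjectFile:
    """Hypothetical record with one field per planner screen."""
    title: str
    goals: List[str] = field(default_factory=list)
    subgoals: Dict[str, List[str]] = field(default_factory=dict)
    outputs: List[str] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    programs: List[str] = field(default_factory=list)
    assignments: List[Assignment] = field(default_factory=list)
    evaluation: str = ""

    def missing_sections(self):
        """Return the names of the sections that are still empty."""
        sections = ["goals", "subgoals", "outputs", "inputs",
                    "programs", "assignments", "evaluation"]
        return [name for name in sections if not getattr(self, name)]


project = ProjectFile(
    title="Spenser and Shelley: a study of intertextuality",
    goals=["Find literary echoes between the two poets"],
    inputs=["Canonized dictionary of all words in the text"],
)
print(project.missing_sections())
# ['subgoals', 'outputs', 'programs', 'assignments', 'evaluation']
```

A check like missing_sections makes the planner's purpose visible: a project whose outputs are still blank has not yet answered the question "what do you want to get out of it?"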