MMDA Transcription and Annotation Guidelines

These guidelines for creating MMDA transcriptions in MS Word (for subsequent conversion into TEI XML) are still in the process of being developed. If you are assisting with the transcription or annotation of a notebook, contact Cristanne Miller or Nikolaus Wasmoen for guidelines with more explicit examples and instructions.

General principles:

Our central aim is to produce a clear and accurate transcription of the words written on the page. We are not attempting to recreate the exact visual layout of the original document.

Numbered lines contain only: authorial handwriting, superscript numerals pointing to footnotes, and [tag] markers, as specified below.

In your initial transcription, editorial commentary or description should appear nested at the appropriate line-, page- or image-level depending on the scope of reference and type of information:

1) in an editorial comment tag bracketed with [!  !] to indicate that the commentary is not part of the transcription and is for internal use only;

2) in an [imagedesc] tag to indicate that it describes the image or the page;

3) in a footnote located within a numbered line of transcription, nested after the most immediate point within the line to which the footnote refers.

The word "image" designates the digital copy of a MM notebook or manuscript page, by number—e.g., image 001 would refer to the digital representation of the first image of a manuscript, which may contain one or more "pages" depending on how the notebook is structured and how it has been photographed. Many "images" contain a facing verso and recto opening. Those images will generate two "pages" of transcript, whereas some notebook images only contain one "page."

I. Specific Guidelines for Transcription:

1) Save each transcription file in MS Word .docx format. The file name should be [Number].[Genre].[Date(s)].[initials of editor].docx, e.g. "07.03.11.Misc.1929-1940.CM.docx".
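
The naming pattern can be checked mechanically. The following Python sketch is an illustration only and is not part of any MMDA tool; the function name and the exact shape of each field (three two-digit groups for the number, four-digit years, capital initials) are assumptions drawn from the single example above.

```python
import re

# Hypothetical pattern for [Number].[Genre].[Date(s)].[initials].docx.
# Field shapes are assumptions based on "07.03.11.Misc.1929-1940.CM.docx".
FILENAME_RE = re.compile(
    r"^(?P<number>\d{2}\.\d{2}\.\d{2})\."  # notebook number, e.g. 07.03.11
    r"(?P<genre>[A-Za-z]+)\."              # genre label, e.g. Misc
    r"(?P<dates>\d{4}(?:-\d{4})?)\."       # one year or a year range
    r"(?P<initials>[A-Z]{2,4})\.docx$"     # editor initials
)


def parse_filename(name):
    """Return the named parts of a transcription file name, or None."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None


print(parse_filename("07.03.11.Misc.1929-1940.CM.docx"))
```

A name that does not fit the pattern, such as "notes.docx", returns None.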

2) Transcribe a row of authorial script as one numbered line: each row of script receives a new line preceded by a bracketed numeral followed by a single space, e.g. "[01] Text of Line number 1." Where MM leaves white space between rows of print, leave a blank numbered line. See "Vertical spaces" under V below.

3) Each notebook page will constitute a transcription "page" (whether it continues onto more than one WORD.docx page or not).

4) We do not reproduce non-authorial marks or pre-printed words. In specific cases where this content is crucial to the transcription or to understanding the manuscript writing, we may address it in an editorial image description (see VIII) or an annotation.

5) Omit stray marks that have no relevance to punctuation, a word, or to underlining, cancellation, or other authorial mark with clear relation to the text.

6) Representation of a notebook "line": type everything that seems to be on the same row of script as one transcribed line, including brief material that slants off at the end of a row of script (if there are more than a few words, this needs to be described as a vertical line; see below).

7) Illegible words should be marked as question marks within brackets:

a) [????] = 4 contiguous characters that are indecipherable;

b) [??] [????] = 2 indecipherable words, of two and four letters, respectively. Use your best guess for the length of the word in characters.

c) Bou[???] = word beginning "Bou" followed by 3 unclear characters; flo[??]d = word with two unclear characters in the middle.

d) If you can make a guess, but are not sure, enclose the guessed-at part in an [unclear] tag, e.g. [unclear]When Pigs Fly[/unclear].

e) If you want to suggest multiple variants or alternative readings, just add [alt#] tags inside [unclear]. e.g. [unclear][alt1]When Pigs Fly[/alt1][alt2]When Pugs Flow[/alt2][/unclear].

f) MM frequently rewrites a word to clarify or writes over an initial word with a different one. Ignore simple rewriting, especially if it is of a single letter or if the rewritten word is entirely clear. If the rewriting raises questions of clarity, mark the second instance as an addition using an [add] tag: e.g. [add]rats[/add]; typically she will have deleted the first instance, which should also be noted using the [del] tag: e.g. [del]rats[/del][add]rats[/add].
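
Because each question mark stands for one indecipherable character, the markers in 7a-c can be counted automatically. The Python sketch below is a hypothetical helper, not part of the MMDA conversion process; it only reports the estimated length of each illegible run in a line.

```python
import re

# Matches a bracketed run of question marks such as [??] or [????].
ILLEGIBLE_RE = re.compile(r"\[(\?+)\]")


def illegible_lengths(line):
    """Return the estimated character count of each illegible run in a line."""
    return [len(match.group(1)) for match in ILLEGIBLE_RE.finditer(line)]


print(illegible_lengths("[??] [????]"))  # two words, of two and four letters
print(illegible_lengths("flo[??]d"))     # two unclear characters mid-word
```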

8) If a letter is ambiguously capitalized, assume what is syntactically or bibliographically appropriate (where there is no source text to check from). Ditto for punctuation.

9) Put annotation in MS Word footnotes.

10) Where MM uses square brackets in her text, describe this below the numbered lines for the page, as one would describe any other non-representational marking (see VIII).

e.g. [imagedesc]MM draws square brackets around the words on line 23.[/imagedesc]

11) We are not creating a genetic edition or a full record of Moore's composition process; our aim is to provide a transcription of the words as they appear in the facsimile. As a result, we generally do not designate the precise timing or grouping of specific additions/revisions (i.e., whether material was added concurrently or after initial composition of a passage). We assume that readers can see the transcribed page, including MM’s use of different writing implements. If, however, MM writes over a word that is still legible, use [del] and [add] tags to record all of the visible words or characters: e.g. th[del]e[/del][add]is[/add]. In this case, MM has written over the "e" to change "the" to "this." In cases where the editor feels it is necessary, additional clarification may be provided in notes.

12) Misspelling: leave incorrect spelling and MM’s graphic errors without comment (no "sic" or correction). E.g., she often omits the "e" on words that begin "ex": xist [exist], xcel [excel], and so on.

II. Punctuation

1) Make all MM dashes em-dashes. Do this whether MM’s own dash is long or short and regardless of how much space is left on either side of it. An em-dash has no space preceding or following the punctuation.

2) Do not correct MM’s misuse or absence of punctuation. E.g., she writes many contractions without an apostrophe: dont [don’t] or cant [can’t].

3) In annotation, where syntactically a period should follow an initial, use only one period, e.g., at the end of a sentence, the initials U.S. should be written "U.S." not "U.S.."

III. Metadata and Bracketed Keyword tags

Transcriptions should be done using plain text, written in standard characters, except for the MS Word footnotes used for annotation. To facilitate later TEI XML encoding, we use bracketed typographical tags to add structure, styling, and other metadata to the transcription files. All metadata appears in square brackets surrounding simple keywords or abbreviations ("tags") that are later transformed to conform with the MMDA's TEI XML encoding schema. Eventually these tags will be available within our online editor tool so that we don’t have to enter them manually each time.

Some tags will be "empty," meaning that they do not contain any "child" text or elements. An example would be [image=0012], where the tag does not have a separate or second closing set of brackets.

For tags that contain child text or child elements, there will be an "open" and a "closing" set of brackets: e.g. "[source]MM’s naming or citation of a source [/source]" where "[source]" indicates the start point of the citation or title and "[/source]" indicates the end. The closing tag always starts with a slash "[/tag]."

You can embed multiple tags within each other, such as "[add][???]tilus[/add]" where the uncertainty tag, [???] and the bit of text that comes after it "tilus" are both contained within the same parent element ("add"). This means that MM added a word that begins with 3 unclear characters and ends with "tilus."

You cannot have tags that contain the opening or closing of their own parent element: for example, "na[add][?????[/add]s]" would be invalid, because the uncertainty tag is both a child and a parent of the [add] tag. The correct way to handle this case could be to nest the tags like this: "[add]na[?????]s[/add]", indicating that a word was added that begins with "na" followed by 5 illegible characters and then "s."

Do not add spaces between bracketed tags and authorial text where none exists in the original document, or these will be interpreted as intentional spaces within the subsequent TEI XML encoded transcription.
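
The open/close rules in this section can be checked with a simple stack. The Python sketch below is an illustration only, not the MMDA conversion tooling: the function name and the assumption that tag names consist of lowercase letters are made for the example, and the list of empty tags is collected from the examples in these guidelines. Uncertainty markers such as [???], line numbers, and [! !] comments are ignored, so the check covers keyword tags only.

```python
import re

# Matches opening tags like [add] or [g=r] and closing tags like [/add].
# Assumption: tag names are lowercase letters, optionally followed by "=value".
TAG_RE = re.compile(r"\[(/?)([a-z]+)(?:=[^\]\[]*)?\]")

# Tags that appear without a closing partner in these guidelines.
EMPTY_TAGS = {
    "image", "imagenumber", "page", "space", "notebookid",
    "notebookstartdate", "notebookenddate", "notebookgenredefault",
}


def is_well_nested(text):
    """Return True if every opened keyword tag is closed in the right order."""
    stack = []
    for closing, name in TAG_RE.findall(text):
        if name in EMPTY_TAGS:
            continue  # empty tags never take a closing tag
        if closing:
            # A closing tag must match the most recently opened tag.
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    return not stack  # anything still open is an error


print(is_well_nested("[add][???]tilus[/add]"))      # properly nested
print(is_well_nested("[del]e[add]is[/del][/add]"))  # crossing tags
```

The first call reports a valid line; the second reports a problem because [del] is closed while [add] is still open.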

IV. Scansion

To represent Moore’s scansion of poetry, use the Word prosody mark or a hyphen ( - ) for an unstressed syllable and an acute accent (ˊ) for a stressed syllable, placed on the line above the words being scanned.

e.g.

    -      ˊ     -      ˊ        -       ˊ    
The frog is green and black.

V. Spacing

MS Word spacing between words will have to be reinserted after translation into TEI XML, because Word spacing does not register as a space in the TEI XML translation. Even so, try to approximate the spacing on the page for ease of reference in proofreading.

Verse Indentation

In transcribing poems MM quotes (that is, not her own drafted lines in poetry notebooks), mark a passage with consistent indentation or spacing with [space=#] tags:

e.g.

[01] [space=5]This is all
[02] [space=5]she wrote in
[03] [space=5]this indented passage.
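
For proofreading, it may help to preview how a [space=#] tag would look as literal indentation. The Python sketch below is hypothetical and assumes that the number counts character widths, which these guidelines do not state explicitly.

```python
import re

# Matches an indentation tag such as [space=5].
SPACE_RE = re.compile(r"\[space=(\d+)\]")


def expand_spaces(line):
    """Replace each [space=#] tag with that many literal spaces (assumed unit)."""
    return SPACE_RE.sub(lambda match: " " * int(match.group(1)), line)


print(repr(expand_spaces("[space=5]This is all")))
```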

Vertical spaces

Where MM skips 1 or more lines in writing, indicate the spaces by leaving a blank line on your page; the example below shows a skipped line (vertically) and a major indentation, approximately indicated, horizontally:

e.g.

[12] [color=pencil]rational at the time[/color]
[13]
[14]                     Spectator

VI. Editorial Comments

Editorial comments and any child elements or text they might contain will not be published or displayed, but may be useful if you want to call attention to something for a later proofreader or encoder who will be interpreting your transcription, or to keep notes for your own use during transcription and annotation. Insert a comment of any length at the appropriate spot in the transcription in the following form.

e.g.

[! This is how to write a comment. !]

[! This is how to write a comment
that spans more than one line. !]
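
Since comments are never published, a later processing step would need to drop them. The Python sketch below shows one way that could work; it is an assumption about the approach, not a description of the actual MMDA pipeline, and the function name is made up for the example.

```python
import re

# Matches [! ... !] comments; DOTALL lets a comment span several lines.
COMMENT_RE = re.compile(r"\[!.*?!\]", re.DOTALL)


def strip_comments(text):
    """Remove every editorial comment from a transcription string."""
    return COMMENT_RE.sub("", text)


print(strip_comments("it is the choicest & / [!shorthand!]the worst"))
```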

VII. Notebook Level Metadata

These fields will be part of the Template file that you open when you are starting a new notebook transcription. You will only fill out the Notebook Level Metadata fields once for each notebook transcription file.

NotebookID

A standardized MMDA notebook ID number, using the Rosenbach numbering system. Do not use roman numerals.

e.g. [notebookid=07.03.11]

NotebookStartDate and NotebookEndDate

The start/end dates of the notebook in format YEAR.MO.DY.

e.g.

[notebookstartdate=1929.06.26]
[notebookenddate=1940]

If the day or month is not known, just fill in what you do know. In this example, the exact start date is known, but the end date is only known to the year.
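
A partial date in YEAR.MO.DY format can be read by splitting on the periods and padding the missing parts. The Python sketch below is a hypothetical helper written for illustration; the function name and the use of None for unknown parts are assumptions.

```python
def parse_notebook_date(value):
    """Split YEAR.MO.DY into (year, month, day); unknown parts become None."""
    parts = [int(part) for part in value.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected YEAR, YEAR.MO, or YEAR.MO.DY: " + value)
    # Pad with None so the result always has three fields.
    year, month, day = parts + [None] * (3 - len(parts))
    return year, month, day


print(parse_notebook_date("1929.06.26"))
print(parse_notebook_date("1940"))
```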

NotebookGenreDefault

The primary genre type of the notebook. If a poetry notebook contains some reading or conversation notes, it would still be described generally as a "Poetry" notebook. A notebook is "Miscellaneous" only if a majority of images contain more than one genre or switch genres.

Common Notebook Level Genre Codes (more to be added as needed): 

r = reading 

misc = miscellaneous 

conv = conversation

p = poetry

l = lecture

concert = concerts

fin = financial

t = travel

e.g.

[notebookgenredefault=r] indicates that only genres besides reading notes would be tagged in this notebook.

To switch the genre within a notebook, use a [g=genre_code] tag. Any text that is not contained in a [g] genre tag will be automatically treated as the default genre for that notebook.

NB: Remember to include the closing [/g] genre tag to indicate the end of a non-default genre passage.

e.g.

[16]      & amazement because of its uncertainty[/g][g=poetry]
[17]
[18]    With simplicity & not w irony—
[19]    ivory white oyster white    without much stirring up[/g]
[20] [g=r]88  So one adds this brief position

In these lines MM moves from reading notes to poetry and then back to reading notes.

NotebookDescription

Provide a physical, bibliographic description of the whole notebook.

e.g.

[notebookdescription] Yearbook 1921, red leather binding. Includes  8 pages of pre-printed information of various kinds, e.g. Time Differences, Thermometers; Greatest Altitude in Each State; followed by lined calendar pages divided in half so that each leaf contains two dates. First image with MM’s writing is the inside front cover. RML number written in pencil: 1250-27 [/notebookdescription]

NotebookEditor

Identifies the person who made the initial primary transcription of the notebook, with annotation and introduction, and when it was started/finished; if multiple editors, include details about who did what.

e.g.

[notebookeditor]Cristanne Miller, begin 05/01/2015, end 10/15/2015, transcribed and annotated, pages 1-50.[/notebookeditor]
[notebookeditor] Robin Schulze, begin 08/16/2015, end 08/21/2015, transcribed and proofed, pages 51-60[/notebookeditor]

NotebookEdit or NotebookProof or NotebookCorrect

Add a new entry for each batch of proofing, revisions, corrections, or (if made by the primary editor, "edits") made after the initial transcription was completed. The site will store backups of earlier states of these files, but we should document when changes were made.

e.g.

[notebookproof]Heather Cass White, begin 08/25/2015, end 09/26/2015, proofed transcription and notes, pages 1-30[/notebookproof]

[notebookedit]Cristanne Miller, begin 02/22/2015, end 08/24/2015, edited transcription and footnotes, pages 1-30[/notebookedit]

Further Notebook Metadata Fields

Additional notebook-wide metadata can be gathered automatically using the embedded tags in the body of the transcription, such as a list of all the poems being drafted or of all of the writing implements being used within a given notebook. The editors do not need to create or maintain such lists during the initial transcription process.

Metadata defaults

"Hand" default is always MM; writing implement default is always black ink. Only note a new default if it differs from these assumptions (such as a notebook written primarily in pencil). If a change for the default continues for more than a page, still only note the beginning and ending of the change once (not per line or per page).

VIII. Image-level Metadata 

Image Number

The number of the image based on the standardized filenames. This will appear at the top of each Word page above the numbered lines.

e.g.

[imagenumber=0002]
[page=r.0002]
[01]
[02]

Image Description

An optional description of any special features contained at the image or page level, including drawings or expressive markings by Moore, as well as features of the image itself worth noting, such as tears, shadows, discoloration, or inserted items.

e.g.

[imagedesc]MM draws a fly, two wings and a small black body. This fly appears in a full-page advertisement for Johnnie Walker Scotch drawn by Alan MacNab, picturing Mr. B as a portly man wielding a newspaper in a vain effort to kill a fly while a sly and composed Lord C looks on amused. "Johnnie Walker Sees a Quaint Encounter," Illustrated London News (London, England), October 18, 1930, 654. [/imagedesc]

Do not repeat standard information—e.g., if an initial Notebook-wide entry describes the typical notebook page (perhaps lined, with pre-print features) or writing instrument (e.g., black ink) do not repeat this information for each image or page.

Ignore where MM’s handwriting appears in relation to pre-print information unless she is responding specifically to a pre-print cue (extremely rare).

IX. Page-Level Metadata

Page

Pages will be designated by the image number and "r" for "recto" or "v" for "verso."

e.g.

[page=r.0021] indicates the recto page contained within image 0021.

On recto/verso see https://en.wikipedia.org/wiki/Recto_and_verso, but note that our numbering is based on the order of images, not of the underlying physical pages, such that "r.0021" does not necessarily mean the "21st recto page" but only "the recto page contained within image number 0021."  Where images do not contain both recto/verso within a notebook, only the image number is required.
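
The page designation can be split into its side and image number as follows. This Python sketch is illustrative only: the function name is made up, and the bare form [page=0021] for images without a recto/verso split is an assumption, since the guidelines say only that the image number alone is required in that case.

```python
import re

# Optional side ("r" or "v") followed by the image number.
PAGE_RE = re.compile(r"\[page=(?:(?P<side>[rv])\.)?(?P<image>\d+)\]")


def parse_page_tag(tag):
    """Return (side, image) for a page tag, or None if it does not match."""
    match = PAGE_RE.fullmatch(tag)
    if match is None:
        return None
    # The image number stays a string so its zero padding is preserved.
    return match.group("side"), match.group("image")


print(parse_page_tag("[page=r.0021]"))
```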

Page-level Genre

The genre tag at the page level will be used only for text that varies from the NotebookGenreDefault, or from the genre at the bottom of the previous notebook page; that is, if MM is taking reading notes on three consecutive pages in a poetry notebook, mark the beginning and ending of this genre at bottom and top of each page, and then mark return to [g=poetry] where appropriate.

Mid-page, mark the beginning and ending of a non-default or continuing genre: for example, in a reading notebook, use a genre tag when MM adds a drawn figure or quoted conversation. In miscellaneous notebooks, you may need several tags per page. "Genre" may be abbreviated as "g." To mark the end of a non-default genre, you do not need to repeat the specific attribute.

e.g. 

[16]      & amazement because of its uncertainty[/g][g=poetry]
[17]
[18]    With simplicity & not w irony—
[19]    ivory white oyster white    without much stirring up[/g]
[20] [g=r]88  So one adds this brief position

In this example, the poetry passage is tagged as beginning at the end of line 16 and stopping at the end of line 19. 

Acceptable values:

draft

c (conversation)

r  (reading; any material working from a written source, starting with the date on which she is reading, including citation of source and page numbers, through actual reading notes) 

v (verse; for where she quotes published verse from others)

drawing (for any figure drawn by MM herself)

prose (where MM does not seem to be quoting either written text or conversation but is writing prose)

e.g. [g=conv]recalled conversation[/g]

A drawn figure requires no genre tag; annotation should occur in the location where it occurs on the page.

Within the value [genre=r] also use [qu] and [source]:

[qu]any material quoted or paraphrased from a written source, identified or not[/qu]

[source]title, naming, or citation of material she is reading or from which she is quoting or paraphrasing[/source]

e.g. [g=r]Dec 2 1932 [source]the Bible[/source] says [qu]cited or paraphrased passage[/qu][/g]

When MM continues quoting from a single text across several pages, mark the end of each page with [/qu] and the beginning of the following page with [qu], immediately followed by an annotation giving the source quoted. Annotation should immediately follow [qu].

e.g.

[01] [qu]1 28  w colors on of
. . .

1 W. D. Wilcox, The Rockies of Canada, G. P. Putnam's Sons, NY: 1909, p. 28.

When a genre is most likely "Conversation" but it is unclear, use: [g=conv?]  [/g].  Make your best guess.

Genre designators may continue across recto/verso or across several images (pages).

Poetry Notebook drafts of particular poems

In poetry notebooks, mark passages where MM is clearly working toward a single particular poem as follows.

1) Indicate where on the page MM begins the draft, including the genre=draft tag, title of poem being drafted, first publication date of the poem, and inclusive lines in which MM engages in this draft: [draft="title," first date of publication, lines 06-22].

2) Where an entire page is a draft toward the poem, no need to list lines; where the draft resumes on the following page, use the same tags for that and every following page as long as the draft continues.

3) Where MM is drafting two or more poems simultaneously, list all poems; tags can be embedded, for where some lines in the middle of a page or draft section concern a second poem in addition to the first.

4) Do not add annotation with publication or other information about the poem, since this will be keyed into the bracketed genre tag.

5) At the conclusion of the draft, or bottom of the page, use [/draft="title of poem"]; you can use abbreviated reference to title at conclusion of draft.

e.g.

[01] [draft="Silence," first published The Dial 77 (October 1924), lines 01-07, 16-23] —have to be shown Longfellows grave
. . .
[23]            not in silence but restraint [/draft]

Color

The color tag will eventually be separated into two kinds of tags in the XML, one kind for different implements and another for different colors. There is no need to set a color tag at the page level unless it differs from the default color for the entire notebook. Similarly, if a notebook is primarily in black ink but a page is primarily in blue ink, set blue as the page-level color and tag only the line-level departures from that page-level value.

Use "color" for inks; include implement for non-ink implements and colors:

color=black (black ink, this is the general default)

color=blue (blue ink)

color=red (red ink)

color=purple (purple ink)

color=green (green ink)

color=pencil (gray lead pencil)

color=pencilgreen (green pencil)

color=crayonblue (blue crayon)
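
The planned separation into implement and color can be sketched from the values listed above. The Python below is a hypothetical illustration, not the project's actual transformation: it assumes that "pencil" and "crayon" are the only implement prefixes and that every other value names an ink color.

```python
# Assumption: these are the only non-ink implement prefixes in use.
IMPLEMENTS = ("pencil", "crayon")


def split_color(value):
    """Split a color value into (implement, color)."""
    for implement in IMPLEMENTS:
        if value.startswith(implement):
            # "pencil" alone has no color suffix (gray lead in the list above),
            # so its color is left as None here.
            color = value[len(implement):] or None
            return implement, color
    return "ink", value


for value in ("black", "pencil", "pencilgreen", "crayonblue"):
    print(value, split_color(value))
```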

XI. Line-Level Metadata

Line numbers

The start of each new row of script is indicated by [##] followed by a single space. Use initial zeroes for [01]-[09]. You can have as many lines in a page as you want, and numbers may spill over a page break inside the MS Word document.

e.g.

[01] Text of line 1.
[02] Text of line 2.

Transcribe an insertion or interlineated word as its own textual line with its own line number, using [add][/add] (carets must also be noted with an [add][/add]).

e.g.

[01] Text of line 1.
[02]       [add]insertion[/add]
[03] Text of [add]^[/add] line 3.
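
The line-number convention is simple enough to generate and read back automatically. The Python sketch below uses made-up function names and is offered only as an illustration of the [##] format with initial zeroes; it treats vertical lines such as [Ver 24] as a separate case and does not match them.

```python
import re

# A bracketed number of two or more digits, one optional space, then the text.
LINE_RE = re.compile(r"^\[(\d{2,})\] ?(.*)$")


def format_line(number, text):
    """Build a numbered line, zero-padding single-digit numbers."""
    return "[{:02d}] {}".format(number, text)


def parse_line(raw):
    """Return (number, text) for a numbered line, or None if it is not one."""
    match = LINE_RE.match(raw)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(format_line(1, "Text of line 1."))
print(parse_line("[02] Text of line 2."))
```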

Cancellations

Only mark cancellations when they are specific to particular words; horizontal lines drawn across all or most of a page and slanted or wavy lines cancelling passages of 2 or more lines will be described at the bottom of the page in [imagedesc]  [/imagedesc].

For cancellations, use the tag [del] [/del].

e.g.

[15]  without [del]w[/del] perceptable bend or

This means that in line 15 the letter "w" is cancelled.

Underlined words

For underlining (underscoring) use the tag [ul] [/ul].

e.g.

[24] [ul]that but only one at a time[/ul] – shall go on it

Vertical lines

Transcribe vertically written material below the numbered lines of horizontal writing as the next sequential line, i.e. if there are 30 horizontal lines on the page and 1 vertical line, the vertical will be numbered [Ver 31]. Add descriptive language about where the text is found in a [desc] element after the line number element. The precise location of vertical text will be marked up using visual zones, so the line numbers are more of an index value than a precise descriptive value.

e.g.

[21] Text of line 21.
[22] Text of line 22.
[23] Text of  line 23.
[Ver 24] [desc]span lines 03-17, left margin[/desc]Text of vertical line 24.

For vertical lines, there is no need for an end-tag. In the example above, the vertical text spans much of the horizontal body text, beginning at line 3 and concluding at line 17, in the margin. A vertical line that instead begins at line 17 and ends at line 3 would be tagged: [Ver 24] [desc]span lines 17-03, left margin[/desc].

Vertical lines written in more than one row of print would be tagged as such and use the standard slash— / —to indicate where the rows of print break.

e.g.

[Ver 31] [desc]left margin, lines 29-19, in 8 rows of print[/desc] the raving / [del]a decent[/del] madman / of good taste / [del]It is the[/del] / that eats when it is in the mood / as much as any more than some / it is the choicest & / [!shorthand!]the worst
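
A vertical line written this way can be separated into its number, its description, and its rows of print. The Python sketch below is a hypothetical reader for the format shown above, not part of the MMDA tooling; it assumes the row separator is always a slash with a space on each side, which keeps closing tags such as [/del] intact.

```python
import re

# [Ver N], then a [desc]...[/desc] element, then the vertical text itself.
VER_RE = re.compile(r"^\[Ver (\d+)\] \[desc\](.*?)\[/desc\]\s*(.*)$")


def parse_vertical(raw):
    """Return (number, description, rows) for a vertical line, or None."""
    match = VER_RE.match(raw)
    if match is None:
        return None
    number, desc, body = match.groups()
    # Rows of print are separated by " / "; tags like [/del] have no spaces.
    rows = [row.strip() for row in body.split(" / ")]
    return int(number), desc, rows


line = ("[Ver 31] [desc]left margin, lines 29-19[/desc] the raving / "
        "[del]a decent[/del] madman / of good taste")
print(parse_vertical(line))
```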

Marks, squiggles, wavy lines, etc

For non-representational marks (circled or bracketed passages, cancellations across several lines, arrows, marginal lines apparently of emphasis), provide description following the transcription of vertical lines.

e.g.

[Ver 24] [desc]span 03-17, along left margin[/desc]Text of vertical line.
[imagedesc]MM draws a circle around right-margin words, 05-58; MM draws an arrow linking line 02 to 24[/imagedesc]

Do not use end punctuation. Descriptions of all non-representational marks on a page may be combined in a single [imagedesc] note.

XII. Annotation

For now, we are putting annotation in MS Word "footnote" format, which is then transformed into TEI XML note elements.

Annotation numbers should appear after commas and periods, but before em-dashes.

Quotations should be annotated to the extent necessary to clarify the source of the text; do not provide comparative text. If MM provides page numbers, no need to corroborate in annotation but do give the edition she is quoting from. If you have a URL for the precise edition or for the essay from which MM quotes, add it at the first reference, within editorial brackets: [!  !]. Repeat annotation for quotation on each page that MM quotes that text—e.g., if she quotes a single text on several successive pages, repeat it at the beginning of each one.

If there are several references within a single notebook to a glossary entry, use abbreviated short title to distinguish point of reference.

e.g. Davis, Frank ("Persian Painting"): 33 (see earlier note).

Where there is no author’s name, refer to an abbreviated title of the source.

e.g. "The Jewel": 13 (see earlier note).

In annotating unknown figures for glossary, use a question mark where appropriate.

e.g. Smith, John (1953?-2030).

Foreign Languages

Mark the presence of a foreign language using a [lang] tag that names the language.

e.g. [lang=french]L’Abbé Tempête Arnaud de Rancé[/lang]