The Interlinear Editor: Completing the Application

Basics of Events

This chapter builds on the foundations of Objects, Arrays, Classes, Events, and Views to describe a variety of applications which are usable within the browser. The selection of applications described here is intentionally broad: it includes interfaces which are specific to particular fieldwork contexts, and which demonstrate how basic components may be recombined into new tools.

User Interfaces: “Views”

So far we have developed a system of models of documentary units. In addition to enumerating their defining properties and values, we have begun to add functionality to those models (in the form of methods) which “automate” properties for which automation is possible and helpful. Thus, we have seen how an instance of the Word class has a .phonemes property which is dynamically generated from that word’s .form property (which is itself a string). Functionality like this is at the heart of what makes DLx helpful: it can make otherwise tedious or error-prone data-entry tasks manageable. And above all, our models serve to manage data.
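As a reminder of what such a derived property looks like, here is a minimal sketch. It is not the actual Word class: the constructor signature is an assumption, and the segmentation rule (one base letter plus any combining marks per phoneme) is a deliberately naive stand-in for a real phonemic analysis.

```javascript
// Hypothetical sketch of a Word model; not the dissertation's actual class.
class Word {
  constructor({ form = '', gloss = '' } = {}) {
    this.form = form
    this.gloss = gloss
  }

  // Computed on every access, so it can never fall out of sync with .form.
  // Assumption: each base character, together with any combining marks
  // that follow it, counts as one phoneme.
  get phonemes() {
    return this.form.normalize('NFD').match(/\P{M}\p{M}*/gu) || []
  }
}

const word = new Word({ form: 'domon', gloss: 'house-N-ACC' })
console.log(word.phonemes) // [ 'd', 'o', 'm', 'o', 'n' ]
```

Because .phonemes is a getter rather than a stored value, changing .form is enough to change the phoneme list: there is no second piece of data to keep up to date.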

But what about the process of actually entering all that data? It’s nice that we can access a .phonemes property of a Word instance to access a list of phoneme strings transparently, but that fact does nothing to alleviate the difficulty of inputting the word form in the first place. What does that process look like, and how should it be informed by the techniques which documentarians already make use of during fieldwork? To answer this question, we turn to a new field: application design, and its correlated domains of User Interface (UI) design and User Experience (UX) design.

A Basic Documentation Suite

The following section describes high-level requirements for what might be considered a “basic” suite of applications which would serve to create documentation which meets the criteria set out in @@create a list of criteria, like Bird & Simon@@.


A LanguageEditor is an interface for editing Language metadata.


A WordEditor is an interface for editing Words.

Lexicon Editor

Sentence Editor

Text Editor

Text View

Corpus Editor

Corpus View

Grammar Editor

Grammar View

System Editor

Beginning with Transcription

What is the minimum that we expect a transcription editor to do?

One of the advantages of browser-based applications is that the barrier to creation is quite low. In the simplest 1 case, there is no server configuration: there is simply an HTML file with embedded content (HTML), code (Javascript within a <script> tag), and style (CSS within a <style> tag).

In the examples below, I use the following HTML file as the basis for the application:

<!doctype html>
<html>
<head>
  <title>Empty Page</title>
  <meta charset=utf-8>
  <style>
  </style>
</head>
<body>
  <script>
  </script>
</body>
</html>

All CSS rules are placed within the <style> tag, and all Javascript within the <script> tag. 2

At the very least, we expect to have an input box.

Version 1

<input class=transcription>

This is not terribly exciting: it’s merely a text box with a class name which will help us to identify the element within a page which is bound to become more complex.3 But our transcriptions are not likely to fit within the default width (which happens to be 133.333px in Firefox, and is of a comparable scale in other browsers), so we may simply choose to change this with a CSS rule:

input.transcription {
  width: 90%;
}

The reader may ask at this point whether the addition of a text input to a page is at all meaningful as a first step in the construction of a transcription editor. To this we reply, the best way to start is simply to start.

Tokenization demo

This is a simple demonstration of an interactive tokenization algorithm: as you type words in the input box, the text is “tokenized” (split into words) and rendered in a numbered list.
    (tokens will appear here)
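The logic behind such a demo can be sketched as follows. This is an illustration under stated assumptions rather than the demo’s actual source: the function names tokenize and renderTokens and the ol.tokens selector are hypothetical, and whitespace is assumed to be the only word delimiter (so punctuation stays attached to its word).

```javascript
// Split a transcription into tokens on runs of whitespace.
// Assumption: whitespace is the only delimiter.
function tokenize(text) {
  return text.trim().split(/\s+/u).filter(token => token.length > 0)
}

// Turn tokens into <li> markup for a numbered list, escaping characters
// that are significant in HTML.
function renderTokens(tokens) {
  const escapeHTML = s =>
    s.replace(/[&<>]/g, c => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;' }[c]))
  return tokens.map(token => `<li>${escapeHTML(token)}</li>`).join('')
}

// Browser-only wiring: re-tokenize whenever the input's value changes.
// Skipped entirely when no DOM is available.
if (typeof document !== 'undefined') {
  const input = document.querySelector('input.transcription')
  const list = document.querySelector('ol.tokens') // hypothetical <ol class=tokens>
  if (input && list) {
    input.addEventListener('input', () => {
      list.innerHTML = renderTokens(tokenize(input.value))
    })
  }
}

console.log(tokenize('  mi vidas   domon '))
```

Keeping tokenize free of any reference to the page means that the same function can be reused by other interfaces, or tested in the console on its own.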

Beyond Basics

This section contains examples of novel interfaces which are specific to particular fieldwork contexts, which demonstrate how basic components may be recombined into new tools.

An Illustration Editor as a Lexicon Editor

Among other topics, contains information about health, primarily consisting of a small number of dialogs on particular health-related topics, produced by Emily John-Martin. Each such dialog contains a TextView with time-aligned playback, and some are also accompanied by illustrations, which were also produced by John-Martin. Topics include, for instance, ‘Knee Pain’, or ‘The Circulatory System’. While the dialogs and illustrations proved useful and interesting for community members, the diagrams were produced in a painstaking way, being hand-drawn by John-Martin.
This process is of course a bottleneck on the speed with which the diagrams can be produced. But there are pre-existing, open-source resources available. A search on Wikimedia Commons for “human body diagrams” leads to a useful overview page with links to dozens of images, all of which are released under a Creative Commons license, and are hence legally reusable.

Many of these diagrams are in SVG format. SVG, or Scalable Vector Graphics, is a format in which diagrams are essentially stored as structured textual data, in a dialect of XML. Because it is possible to embed SVG content statically or dynamically into an HTML file, the same methods which can be used to make HTML content interactive can be used to make an illustration interactive. The tags within an SVG diagram are parsed by the browser and inserted into the DOM, just as HTML tags are.

Thus, DOM element methods, including event listeners, can be used to make an application that allows the user to change the content of the labels that point to various parts of the diagram. That was the basic premise. If we think about these things in terms of DLx, we can analyze the resulting data: if we have a tool that allows us to edit the labels on an illustration, then what we end up with is a lexicon. That is one data structure you could end up with. You could also end up with a text, but that is less likely, as you would want the effort of producing a text transcription to be focused directly on something related to anatomy, like the dialogs mentioned before.

The basic point is that we want to design applications that produce data in the specified DLx data format. By storing data in that format, we are able to separate the storage of linguistic data from the interface that we use to produce that data, and the data is interoperable with the output of many other applications. There is no limit on the kinds of applications that we can design. When you first look at this application, you might think of it as an image-editing application of some kind, or a poster-editing application, or a diagram-editing application, rather than as a lexicon editor. But because the DOM interface allows us to track interactions with the interface, it is possible to track correlations between structured pieces of data, and to recombine that data into one of the familiar data structures of DLx.

Here’s the application. It has, basically, three panes. The top pane is a header, which allows exporting edited images or exporting a DLx lexicon as a JSON file. Then there are two panes below the header. The bottom right pane is the image, one of many appropriate images that might be imported from Wikipedia. The left pane is a lexicon-editing interface. Basically, the way it works is that you click on one of the labels, and you can type in a translation. So, for instance, the word for ‘shoulder’ is chi̠yó. (There are also some transliteration tools here.) The image updates as you change the contents of each label, and we build up the lexicon instance.
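The correspondence between edited labels and a lexicon can be sketched in a few lines. This is only an illustration under stated assumptions: the shape of the label records, the function name labelsToLexicon, and the { words: [{ form, gloss }] } layout are hypothetical stand-ins, not the actual DLx lexicon format or the application’s source.

```javascript
// Hypothetical label records, as an illustration editor might collect them:
// the original English label text plus whatever translation the user typed.
const labels = [
  { english: 'shoulder', translation: 'chi̠yó' },
  { english: 'knee', translation: '' } // not yet translated
]

// Convert the translated labels into lexicon entries.
// The { form, gloss } shape is an assumption, not the DLx specification.
function labelsToLexicon(labels) {
  const words = labels
    .filter(label => label.translation.trim() !== '')
    .map(label => ({ form: label.translation.trim(), gloss: label.english }))
  return { words }
}

// Serialize the result, ready to be offered as a JSON download.
const json = JSON.stringify(labelsToLexicon(labels), null, 2)
console.log(json)
```

Nothing in labelsToLexicon refers to the image or to the page, which is the separation of data from interface described above.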

An illustrated minimal pair lexicon

One interesting application of the basic object hierarchy produced here is in applications that are simultaneously useful in both research and revitalization contexts. The use of illustrations within revitalization is well-established, @@find sources@@. They are accessible and interesting to learners and speakers at varying degrees of expertise, and can serve as the basis for many kinds of pedagogical activities. By way of contrast, some tasks are often seen as more “research-oriented”, with only indirect relevance to revitalization work. The compilation of lists of minimal pairs, for instance, is often seen as a more research-oriented task associated with the fundamental task of establishing a phonological inventory, which is in turn necessary for determining a phonemic transcription system and perhaps a phonemic orthography.

In this section, I describe an application which is designed to be wedged into the early stages of documentation, on a language which has some previous documentation in existence. We will take as an example materials from the Loma language (Mandé: Guinea, Liberia). There is a fair amount of documentation of the language, but it is varied as to dialect, orthography, and phonological analysis. 4

A sample application

  1. Applications
  2. The Data Cycle
  3. Implementing Javascript Classes
    1. instantiation and the new keyword
    2. methods
    3. constructor functions
  4. data classes: Word, Sentence, Text
  5. view classes: WordView, SentenceView, TextView
  6. The basis for interaction: DOM events

Brief overview of programming concepts

  1. DOM nodes (objects which correspond to particular tags)
  2. events (objects which represent the results of particular user actions like clicks) and custom events (like normal events, except that they are defined in code, not “native”)
  3. event listeners (a DOM node object can run functions in reaction to events that take place “on” DOM nodes, either themselves or their children)
  4. view classes (objects which associate a DOM object and a data object, and which set up events that can trigger the conversion of user behavior (clicking, typing, mousing…) into changes to or creation of data)

Web Browsers and the DOM

It might not be obvious that web browsers can open files in the same way that, say, a word processing application can. The process works in precisely the same way: choose “Open File…” from the “File” menu, and use the file navigation dialog to select a file. Now, if you try to do this with a file which is in a format for a word processing document (like .doc or .odf or .docx), the browser won’t open it directly, although it might open another application which can handle that format.

However, there are a number of formats which browsers can read and display directly. These include .txt files and .html files, among others. (Many modern browsers can also render .pdf files.) But of these, it is HTML that the browser was designed to handle: it is the default format that web browsers expect. And as we shall see, most of the power of the browser is related to its ability to work with information of the type which may be stored in HTML files.

The easiest way to introduce yourself to this is to use the Javascript console, which might be thought of as a kind of toolbox for interacting with everything that the browser does.

Like the data structures we have been discussing, documents can be constructed as structured (heading notation, sections, chapters, etc.):

  * a primary function of the web platform is to implement documents in this way
  * each “node” in the tree structure of the document is associated with a special Javascript object
  * those nodes are programmable: we can move them around within the tree, change their attributes interactively, even remove them from the tree altogether
  * this approach to modeling interactive documents is called the DOM

Browsers do more than present HTML files.

Most of the population of Earth (ICT 2017) is familiar with using the internet as a place to “consume” content such as text, images, video, and audio. Consequently, one most often thinks of a “web browser” as a program which “gets” (downloads) such content from the internet, and then presents the contents of those various files in a “browsable” form on the user’s computer. And that is a correct characterization, insofar as it goes: web browsers do download content from the internet and present it on the user’s computer. But modern web browsers have a much less well-publicized “nature”, one which has evolved in tandem with the browser’s “consumption” nature.

“the Blank console”

We have seen that using a programming language like Javascript to create and search documentary data is useful. But thus far, the displays of data we have created have been fairly formulaic. In this chapter, we will explore a broader range of user interfaces for creating and using our data model.

<div class=word lang=eo>
  <p class=form>dom-o-n</p>
  <p class=gloss>house-N-ACC</p>
</div>

Which might be displayed visually as:



The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents. The document can be further processed and the results of that processing can be incorporated back into the presented page. [W3C]

There’s a lot to unpack here, so let’s take it step by step.


A View class is a custom class which maintains a relationship between a user interface (in the form of a DOM node) and some data.

As parameters, Views will require:

  1. A reference to an instance of one of the eight core data classes
  2. A reference to a DOM node within the user interface.
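A minimal sketch of a class that takes these two parameters might look like the following. The class name WordView appears in the outline above, but its body here is an assumption, as are the render method and the escapeHTML helper. The data object is simplified to anything with .form and .gloss properties.

```javascript
// Escape text before placing it into HTML markup.
const escapeHTML = text =>
  String(text).replace(/[&<>"]/g, c =>
    ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' }[c]))

// A hypothetical WordView: pairs one data object with one DOM node.
class WordView {
  constructor(word, el) {
    this.word = word // the data: any object with .form and .gloss
    this.el = el     // the interface: a DOM node (anything with .innerHTML)
  }

  // Rewrite the node's contents from the current state of the data.
  render() {
    this.el.innerHTML =
      `<p class="form">${escapeHTML(this.word.form)}</p>` +
      `<p class="gloss">${escapeHTML(this.word.gloss)}</p>`
    return this.el
  }
}

// In a browser, el would come from e.g. document.querySelector('.word').
// A plain object stands in here so that the sketch also runs without a DOM.
const el = { innerHTML: '' }
const view = new WordView({ form: 'dom-o-n', gloss: 'house-N-ACC' }, el)
view.render()
console.log(el.innerHTML)
```

The view holds references rather than copies, so calling render() again after the data changes is enough to bring the interface back in line with the model.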

A typology of views

  1. Static renderings: HTML
  2. Visual renderings: CSS, CSS Grid
  3. Interactive views: Events
  4. Editable views: Changing model data from the view, re-rendering


What a Browser Does

When the browser reads an HTML file, its first task is to work out the structure which is embedded in that file. One of the important features of HTML is that it can encode hierarchy: that is, there is a built-in notion of “nesting” or “embedding” which is part of the HTML specification. Here’s how that works.

Let’s consider a very simple HTML tag, <h1>. That tag means “the most important heading on the page.” But there’s some flexibility about what an HTML tag ‘means’. If you are looking at a newspaper web site, the title of the newspaper might be put into an HTML page as the contents of an <h1>:

<h1>The Los Angeles Times</h1>

Alternatively, if we’re looking at a single article page on the same site, the designers might decide that the title of the article is more important than the name of the paper. These are the kinds of choices web designers make all the time, and there isn’t an obvious “right” way to make such decisions.

You may recall that the simplest empty, but valid, HTML page would look something like this:

<!doctype html>
<html>
  <head></head>
  <body></body>
</html>

Note that this empty document already contains “nested” elements. After the doctype directive, we immediately hit the <html> tag, which is the outermost tag in any page. Its opening tag is the first tag (after the doctype 5), and its closing tag is always the last tag. The <html> tag almost always contains only two direct “children,” or nested tags: <head> and <body>. The <head> tag contains various kinds of metadata about the page, and the <body> tag contains everything that is either directly “rendered” or presented to the user within the browser’s viewport, or else somehow directly affects how that rendered content either behaves or appears.

So it’s the browser’s job to know these “facts” about all of these tags. You can think of the browser as “reading through” an HTML file, and “doing something” with each of the tags it finds. Exactly what it should “do” with each tag is part of what the browser is responsible for knowing.

The author of a web page is also responsible for knowing that something like an <h1></h1> can only go “under” the <body> tag, and never “under” the <head></head> tag. And just so, a <title></title> tag must only be placed “under” the <head></head> tag, and never “under” the body. Such rules are part of the HTML specification: ignoring such rules can lead to unpredictable results (although browsers try very hard to work around errors in HTML documents).

What HTML was Designed to Do

In the earliest days of HTML,

What HTML Actually Does Now

An introduction to Events

Let us consider the lowly click.

When a user pushes down on their mouse button (or trackpad, these days), the browser is said to “listen to” that click. Such a click is called an event. It is important to realize from the outset that an event is just another object, as far as Javascript is concerned: you can assign an event to a variable, or poke at it in the console, in order to query its properties.
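One way to see that an event is an ordinary object is to construct one by hand in the console, using the standard Event constructor. This is a synthetic event, made in code rather than by a user: a real click delivered by the browser is an instance of a richer subclass, with extra properties such as pointer coordinates.

```javascript
// Build a synthetic click event and assign it to a variable.
const click = new Event('click', { bubbles: true })

// Query its properties like those of any other object.
console.log(typeof click)  // 'object'
console.log(click.type)    // 'click'
console.log(click.bubbles) // true
console.log(click.target)  // null: it has not been dispatched to any node yet
```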

There are many such events defined in the browser specification, but here we will be most interested in (1) click events and (2) what are called keyboard events, which correspond to the various ways in which we pound on the keyboard. In either case, there is some subtlety to the question of how events correspond to the DOM, and without an understanding of those subtleties, confusion may arise. So, we will begin with this question.

Consider the following small page:

<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Event demo</title>
  <style>
div {
  background-color: hsla(200, 50%, 50%, .5);
}
#outer {
  width: 30px;
  height: 30px;
}
#inner {
  width: 20px;
  height: 20px;
}
  </style>
</head>
<body>
<div id="outer">
  <div id="inner"></div>
</div>
</body>
</html>

It contains two nested divs, the parent being identified by the selector #outer, and its child by the selector #inner. Both divs are affected by the following CSS rule:

div {
  background-color: hsla(200, 50%, 50%, .5);
}

The value of the background-color property here is given in hsla format. This is not the place to discuss the details of CSS colors, but suffice it to say that this rule gives both divs a blue shade which is partially transparent. Thus, the inner div is slightly darker than the outer, because its blue background is “added” to that of the div below it. This styling emphasizes the fact that the two divs are in a sense “stacked”: the larger #outer is fully “present” in the DOM just as if the #inner div were not there.

Which brings us back to the question of a click event: if the user clicks on #inner, it is also true that they are clicking on #outer, because the area of #outer fully subsumes that of #inner. It is possible to click “only” within #outer, without clicking #inner, if you click in the lighter area surrounding #inner. However, and this is the important point, it is not possible to click only the #inner div.

But as we have seen, @@ the entire DOM is a tree model. The root of the tree is the <html> tag. Everything visible in the page is under the <body> tag, which is an immediate child of <html>. So, strictly speaking, it is effectively impossible to click anything in a rendered web page without also clicking the <body> tag. And it is this fact which necessitates a mechanism for being more specific about what, exactly, should respond to user activity.
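The chain of nodes that a single click “lands on” can be modeled with a short function that walks upward through parentNode references. This is a simplified sketch, not the browser’s own dispatch algorithm: the function name bubblePath is hypothetical, and plain objects stand in for the nodes of the demo page so that the example runs anywhere.

```javascript
// Collect a node and all of its ancestors, innermost first: the order in
// which a bubbling click reaches them.
function bubblePath(node) {
  const path = []
  for (let current = node; current; current = current.parentNode) {
    path.push(current)
  }
  return path
}

// Plain objects standing in for the nodes of the demo page.
const html = { name: 'html', parentNode: null }
const body = { name: 'body', parentNode: html }
const outer = { name: '#outer', parentNode: body }
const inner = { name: '#inner', parentNode: outer }

console.log(bubblePath(inner).map(node => node.name))
// [ '#inner', '#outer', 'body', 'html' ]
```

The same function works on real DOM nodes, where the walk continues one step further, to the document object. Browsers also report the actual path of a dispatched event through the standard event.composedPath() method.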

The mechanism offered as a solution to this problem is the .addEventListener method of every DOM element. @@Recall that when an HTML tag is parsed by the browser, the browser constructs an instance of the corresponding class for that tag, and that instance is inserted at the right point of the DOM tree. Once this process has been carried out, we may refer to the erstwhile tag as a DOM “node”. Again: nodes are instances of specific classes. And it is those instances which have the method (attached function) called addEventListener.

Before looking at the syntax of how this method is used, we can think about it in general terms. The addEventListener method is used to associate a particular DOM node with a particular kind of event, and specifically, to associate that particular combination of a node and an event with a function. We’ll take the example above as a starting point:

“When the #inner node “hears” a click event, do the following thing to that event…”

Thus, we can see that .addEventListener is a kind of conditional instruction. Understanding this conceptually, we can go ahead and look at the actual syntax of the method:

let inner = document.querySelector('#inner')
let logEvent = event => console.log(event)

inner.addEventListener('click', logEvent)

The final line may be read “when #inner is clicked, run the logEvent function with the click event object as a parameter.”

@@other wrinkles to addEventListener

  1. useCapture parameter
  2. this binding
  3. vs event.currentTarget

@@ much more stuff on HTML, classes?

Mouse events

Keyboard events

Custom events

An Approach to Implementing Applications

Applications versus Documents

Software development is a vast and ever-changing field. At first, the very idea of attempting to learn a programming language, while simultaneously studying linguistics (and perhaps one or more human languages), can seem overwhelming. A goal of this dissertation is to design an ordered pedagogy (a chrestomathy) which introduces programming in a way which is focused closely on examples of obvious utility to documentary linguists. We will prioritize strings (text) over the subtleties of numbers, for instance, because the primary data in linguistics is of course textual. At the risk of swamping the reader slightly, I nonetheless present here a bird’s-eye view of the general kinds of programming abstractions we will make use of in this book.

Abstraction and the Psychology of Programming

Programming is all about abstraction.

Ignoring the details of levels below your level of abstraction is crucial.

Modeling data

Discuss oneness and sameness

Analyzing Data


true and false

functions, arrow functions

Programmable Documents: The DOM

Many are familiar with HTML and understand that it is the format in which “web pages” are composed. Far less familiar is the nevertheless crucial notion of the Document Object Model, or DOM. The DOM and its associated technologies (Application Programming Interfaces, or APIs, about which see @below) are actually what make the Open Web Platform different from any other information platform available.

When the browser parses an HTML page, it builds a series of Javascript objects in memory which are structured in a tree. Each node on that tree is of a particular type, whose characteristics depend on the corresponding tag in the HTML. Take for example the following minimal (but valid) HTML page:

<!doctype html>
<title>…</title>
<meta charset=utf-8>
<input>
<ul></ul>

Here we see only an obligatory doctype declaration, which indicates to the browser that it should parse the page as HTML; a <title> (which contains the text which is usually shown in a browser tab); the rather cryptic <meta charset=utf-8> (which indicates that the HTML page is encoded in UTF-8, an encoding of Unicode); and finally just two elements which are presented in the page itself: an <input> (which is a box where the user can type), and an empty <ul> (for “unordered list”, to which elements may be appended).

At this point, we finally turn to the task of building new applications which are interactive in the same way as the demos we have used to study user interfaces. This chapter will cover a simple approach to meeting this need.

Show why and how to create custom events

Custom events are created with the CustomEvent class, and may be used to define domain-specific events that are explicitly triggered in code.

  1. Using the CustomEvent class
    1. details
    2. {"bubbles": true}
  2. element.dispatchEvent()

This is one of the trickiest bits in the whole dissertation, tech-wise.

Rendering the phonological inventory

Perhaps the most common representation of consonantal and vocalic inventories is the table:

The complete IPA (pulmonic) consonant chart can be conceived of as a three-dimensional table, where the three dimensions (variables) are manner, place, and voicing. Any given symbol in the IPA pulmonic consonant chart can be represented as a symbol with a value for each of those three properties. We can visualize such data in various ways, perhaps most simply as a “flat” table where each phone is listed with its values for each of the three attributes. Here, for instance, is one analysis of the phonemic inventory of Loma:

This is a fairly direct and useful representation, but it is not the most familiar tabular format. How can we re-create that table? The key is to reimagine the data itself as a three-dimensional matrix, rather than an array of objects. The consonant table is a “Cartesian product” of the three variables: MANNER × PLACE × VOICING.

For the sake of simplicity, let us consider a much smaller inventory
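The reshaping from a flat array of objects into a MANNER × PLACE × VOICING matrix can be sketched as follows. The eight-phone inventory used here is a hypothetical illustration chosen for brevity, not Loma data, and the function names toMatrix and toRows are likewise assumptions.

```javascript
// A small, hypothetical inventory (not Loma data): one object per phone.
const phones = [
  { symbol: 'p', manner: 'plosive', place: 'bilabial', voicing: 'voiceless' },
  { symbol: 'b', manner: 'plosive', place: 'bilabial', voicing: 'voiced' },
  { symbol: 't', manner: 'plosive', place: 'alveolar', voicing: 'voiceless' },
  { symbol: 'd', manner: 'plosive', place: 'alveolar', voicing: 'voiced' },
  { symbol: 'm', manner: 'nasal', place: 'bilabial', voicing: 'voiced' },
  { symbol: 'n', manner: 'nasal', place: 'alveolar', voicing: 'voiced' },
  { symbol: 's', manner: 'fricative', place: 'alveolar', voicing: 'voiceless' },
  { symbol: 'z', manner: 'fricative', place: 'alveolar', voicing: 'voiced' }
]

// The values of each variable, in the order the chart should display them.
const MANNERS = ['plosive', 'nasal', 'fricative']
const PLACES = ['bilabial', 'alveolar']
const VOICINGS = ['voiceless', 'voiced']

// Build matrix[manner][place][voicing] over the full Cartesian product.
// Combinations with no corresponding phone are recorded as null.
function toMatrix(phones) {
  const matrix = {}
  for (const manner of MANNERS) {
    matrix[manner] = {}
    for (const place of PLACES) {
      matrix[manner][place] = {}
      for (const voicing of VOICINGS) {
        const phone = phones.find(p =>
          p.manner === manner && p.place === place && p.voicing === voicing)
        matrix[manner][place][voicing] = phone ? phone.symbol : null
      }
    }
  }
  return matrix
}

// Collapse the voicing dimension into each cell to obtain the familiar
// layout: one row per manner, one column per place.
function toRows(matrix) {
  return MANNERS.map(manner => [
    manner,
    ...PLACES.map(place =>
      VOICINGS.map(voicing => matrix[manner][place][voicing] || '').join(' ').trim())
  ])
}

console.log(toRows(toMatrix(phones)))
// [ [ 'plosive', 'p b', 't d' ], [ 'nasal', 'm', 'n' ], [ 'fricative', '', 's z' ] ]
```

The empty cell for bilabial fricatives falls out of the Cartesian product automatically: the matrix has a slot for every combination, whether or not the inventory fills it.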

  1. This is a minimum definition! By no means is this case taken to imply that all transcription editors should be designed in this way. In practice, a more likely file configuration would be a directory containing distinct files containing Javascript and CSS code, and HTML files would link to those files using <script src=externalJavascript.js> and <link rel=stylesheet href=style.css>.

  2. @@ explain why <script> is at the bottom of the <body> and not within the <head>.

  3. @@ why not id

  4. For example, Sadler (2006) describes “voiced labio-dental aspirated stop … pronounced by placing the lower lip against the upper teeth and momentarily stopping the air at this juncture” with “slight” aspiration. Later sources find no evidence of such a sound, reporting a voiced bilabial fricative [β] in the same contexts. What steps would be necessary to determine the character of that sound?

  5. The doctype is really more like an instruction to the browser than a proper tag: in effect it is serving to identify the file type, and hence the parsing logic, that the browser should use to process the page. Hence, it’s really outside the scope of the discussion of nested tags.