4.7 Transcribing in IPA, Writing in a Practical Orthography

An important decision when creating transcriptions is whether to transcribe in the International Phonetic Alphabet (IPA) or in a practical orthography (a script and set of rules used by the community to represent the language).

Each option has its own challenges.

  • IPA is not commonly known to non-linguists and needs at least a little specialized training to use accurately.
  • South Asian languages may be written in any one of the well-known scripts, such as Devanagari.  Some use the Latin alphabet. 
  • For under-resourced languages, such as those being archived at CoRSAL, there is often no standardized writing system.

Considerations for Orthographies

Here are some considerations for creating practical orthographies, writing conventions, and scripts for representing a language.  These may be useful in community orthography discussions.

Shallow vs. Deep Representation

The terms shallow and deep, when applied to orthographies, indicate the level of representation the orthography provides.  In other words, how much detailed phonetic information is conveyed through the orthography?  It may seem tempting to opt for as much phonetic information as possible.  But over-specifying predictable phonetic features may overburden the reader and make a passage hard to read.  For example, in Lamkang, a language spoken in parts of India and Burma, long vowels are tense.  The practical orthography, however, does not indicate the vowel quality distinction, only the length distinction: tren [trɛn] ‘buy’ versus treen [trēn] ‘cut’.  A shallow representation provides more phonetic detail; a deep representation assumes that the reader will fill in some amount of predictable information based on their unconscious knowledge of language structure.

Typographic Ease

Ease of use refers to how easily speakers can employ the orthography in their day-to-day activities and how easily they can learn to read and write it.  How easily can it be typed?  We live in a digital world where smartphones and other devices can provide platforms for improving literacy.  How easily can the orthography be written using a smartphone or standard computer keyboard?  Certain special characters are easy to write by hand but troublesome to type, although a growing number of apps and keyboards can be installed to help with typing diacritics.
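
One rough way to gauge typing difficulty is to list the characters in a spelling that fall outside plain ASCII.  The Python sketch below is only an illustration under stated assumptions: the function name hard_to_type is hypothetical, and treating ASCII as a stand-in for "what a standard keyboard can type" is a simplification, since real keyboards and input apps vary.  The text is first decomposed (Unicode NFD) so that a diacritic is reported separately from its base letter.

```python
import unicodedata

def hard_to_type(text):
    """Return (character, Unicode name) pairs for the non-ASCII
    characters in `text`, in order of first appearance.

    Assumption: ASCII approximates a standard keyboard layout.
    """
    # NFD splits a precomposed letter such as "ē" into "e" + combining macron,
    # so the diacritic itself is what gets flagged.
    decomposed = unicodedata.normalize("NFD", text)
    seen = []
    for ch in decomposed:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in seen]

# The phonetic forms cited earlier in this section need special characters;
# the practical spelling "treen" does not.
print(hard_to_type("treen"))
print(hard_to_type("tr\u025bn"))   # [trɛn]
print(hard_to_type("tr\u0113n"))   # [trēn]
```

A list like this can feed a community discussion: each flagged character is one that writers will need a special keyboard, app, or workaround to produce.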

Spacing and Punctuation

What constitutes a word, and where should spaces be placed between sequences of morphemes?  In English, we write going to as two separate words even though we often pronounce them as one word, i.e., gonna.  This is a convention that has been around for many hundreds of years.  But for under-resourced languages, conventions are new and not followed by all writers, so we find variation in what is considered one word or more than one word.  In Lamkang, for example, we see the following variation: mthungbi or mthung bi 'then'.
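
Spacing variation of this kind can be surfaced automatically in a set of transcriptions.  The Python sketch below is a minimal illustration, assuming you already have a list of transcribed expressions (for example, lexicon entries or glossed phrases); the function name spacing_variants is hypothetical.  It groups entries that are identical once spaces are removed and reports the groups that have more than one spelling.

```python
from collections import defaultdict

def spacing_variants(entries):
    """Group entries that differ only in where spaces are written.

    Returns a dict mapping the space-free form to the sorted list of
    spellings found, keeping only forms with more than one spelling.
    """
    groups = defaultdict(set)
    for entry in entries:
        parts = entry.split()            # split on any run of whitespace
        key = "".join(parts)             # form with all spaces removed
        groups[key].add(" ".join(parts)) # spelling with single spaces
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}

# Example using the Lamkang forms cited above.
entries = ["mthungbi", "mthung bi", "tren", "treen"]
print(spacing_variants(entries))
```

The output is only a list of candidates to review; whether mthungbi should be one word or two remains a decision for the community.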


Is the orthography similar to the orthography of the dominant language that the community may already be used to using?  This matters not only for ease of learning and use, but also for how readily the community will accept it.  For example, in the northern belt of languages in India, Devanagari is often used by Indigenous communities.


Another common concern with orthographic choice is aesthetics.  Beauty is in the eye of the beholder, and so is the sense that a spelling does not look acceptable.  Some language communities adopt conventions similar to major language orthographies, like English or Hindi.  In Lamkang, for instance, it was deemed necessary to add schwas to prefixal consonant clusters, as the clusters did not "look" like language.  Another example from Lamkang is the representation of repeated vowel sounds.  Some community groups wish to replace the second vowel sound with a semivowel because they perceive it as more aesthetically pleasing, as in kruung vs. kruwng 'lord, god'.

Differences Across Dialects

Another difficult issue with orthographies is linguistic variation across different dialects.  Which variety should be represented in the spelling system?  If there are only two or three varieties and the differences are predictable, then one might maintain a set of conventions for each.  However, the community's thoughts on standardization and the prestige of one variety over another will be the deciding factor.


Even when spelling conventions are developed by community members, it may take years for them to be widely accepted and implemented by the community.  Orthographic choices turn out to be a very visible sign of affiliation, whether to an individual, a religion, or a community history.  For example, the Lamkang community has a Latin-based orthography due to its predominantly Christian beliefs.  Attempting to use the Devanagari script or another orthography associated with non-Christian cultures may not be successful.

The Importance of Writing for Revitalization

Within a documentation project, as you create transcriptions for your recordings, be consistent in your representation.  You could also make clear that your transcription is for analytic purposes, or that it can be easily modified to a community standard.  But be aware that many of the materials you are creating will feed into revitalization efforts.  Literacy is an important tool for revitalization, and for literacy to thrive, there have to be materials to read (you are creating them).  We hope that a writing system that is easy to read, easy to type, and pleasing to the eye will be universally adopted and used by the community in all genres, from the novel to the shopping list.  Language adoption and frequent language use by younger generations are goals that could be met to some extent through the written language.