4.7 Transcribing in IPA, Writing in a Practical Orthography

An important decision in creating transcriptions is whether to transcribe in the International Phonetic Alphabet (IPA) or in a practical orthography (a script and set of rules that the community uses to represent the language).

Both options come with challenges.

  • IPA is not commonly known to nonlinguists and needs at least a little specialized training to use accurately.
  • South Asian languages may be written in any one of several well-known scripts, such as Devanagari; some use the Latin alphabet.
  • For under-resourced languages, such as those being archived at CORSAL, there is often no standardized writing system, and the individual writer must make decisions about spelling conventions.

Considerations for Orthographies

Here are some considerations that go into creating practical orthographies, writing conventions, and scripts for representing a language.

Shallow vs. Deep Representation

The terms shallow and deep, when referring to orthographies, indicate the level of representation the orthography has. A shallow orthography stays close to pronunciation; a deep one abstracts away from it. In other words, how much detailed phonetic information is conveyed through the orthography? It may seem tempting to opt for as much phonetic information as possible. But overspecifying predictable phonetic features may overburden the reader and make a passage hard to read.

For example, in Lamkang, a language spoken in parts of India and Burma, long vowels are tense, but the practical orthography indicates only the length distinction, not the vowel quality distinction: tren [trɛn] ‘buy’ versus treen [trēn] ‘cut’.

Typographic Ease

Typographic ease refers to how easily speakers can employ the orthography in their day-to-day activities and how easily they can learn to read and write it. This is another reason an orthography that encodes too much phonetic detail can be problematic: it requires more symbols to be learned and more diacritics (e.g., accent marks) to be written. Another consideration is how easily the orthography can be typed. We live in a digital world where smartphones and other devices can provide platforms for improving literacy. How easily can the orthography be written using a smartphone or a standard computer keyboard? Certain special characters are easy to write by hand but troublesome to type. Fortunately, some indigenous language communities have been able to make apps that let users input less conventional characters on phones and computers with ease.
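One rough way to gauge typing difficulty is to list the characters in a sample text that fall outside what a basic keyboard offers. The sketch below is only an illustration under stated assumptions: it treats printable ASCII (roughly a US keyboard) as "easy to type", the function name hard_to_type is hypothetical, and the sample string is made up for demonstration (only tren and treen come from the Lamkang example above). Adjust the easy set to match the keyboards your community actually uses.

```python
import unicodedata

def hard_to_type(text, easy=None):
    """Return {character: Unicode name} for characters not in the `easy` set.

    The text is decomposed (NFD) first, so a letter with a diacritic is
    counted as a base letter plus a separate combining mark.
    """
    if easy is None:
        # Assumption: "easy" means printable ASCII, as on a US keyboard.
        easy = {chr(code) for code in range(0x20, 0x7F)}
    decomposed = unicodedata.normalize("NFD", text)
    report = {}
    for ch in decomposed:
        if ch not in easy and not ch.isspace():
            report[ch] = unicodedata.name(ch, "UNKNOWN")
    return report

# Made-up sample: two words from the text plus illustrative symbols.
sample = "trɛn treen ŋ é"
for ch, name in hard_to_type(sample).items():
    print(f"U+{ord(ch):04X} {name}")
```

Each character reported is one that a writer would need a special keyboard layout, input app, or copy-and-paste to produce, so a long report suggests the orthography may be harder to use on everyday devices.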

Spacing and Punctuation

One of the tough decisions for a language that is not written frequently is what constitutes a word. In English, we write going to as two separate words even though we often pronounce them as one word, i.e., gonna. This is a convention that has been around for many hundreds of years. But for under-resourced languages, conventions are new and not followed by all writers, so we find variation in what is considered one word or more than one word. In Lamkang, for example, we see the following variation: mthungbi or mthung bi 'then'.
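If you have a collection of written texts, variation of this kind can be surfaced automatically. The following is a minimal sketch, not an established tool: the function name spacing_variants is hypothetical, tokens are assumed to be separated by whitespace, and the tokens w1 to w5 in the sample are placeholders rather than Lamkang words. It reports any form that appears both as a single word and as two adjacent words.

```python
from collections import Counter

def spacing_variants(lines):
    """Find forms written both joined and split in `lines`.

    Returns a sorted list of (joined, split, joined_count, split_count).
    """
    words = Counter()
    pairs = Counter()
    for line in lines:
        tokens = line.lower().split()
        words.update(tokens)
        # Count every pair of adjacent tokens within the line.
        pairs.update(zip(tokens, tokens[1:]))
    found = []
    for (first, second), split_count in pairs.items():
        joined = first + second
        if joined in words:
            found.append((joined, f"{first} {second}", words[joined], split_count))
    return sorted(found)

# Placeholder corpus: only mthungbi / mthung bi come from the text.
corpus = [
    "mthungbi w1 w2",
    "w3 mthung bi w4",
    "w5 mthung bi",
]
for joined, split, n_joined, n_split in spacing_variants(corpus):
    print(f"{joined} ({n_joined}) vs. {split} ({n_split})")
```

The counts show which spelling writers currently favor, which can inform a discussion with the community about which convention to recommend.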


Familiarity

Something else to keep in mind is the familiarity of the orthography being used. Is the orthography similar to that of the dominant language, which the community may already be used to? This matters not only for ease of learning and use but also for acceptance, which is discussed later in this section. It could be difficult for a community to accept an orthography based on the Tamil writing system when the community is in a region that predominantly uses Devanagari. Familiarity can be a tricky consideration, though, because it can pull against other preferences the community holds.


Aesthetics

Another common concern with orthographic choice is aesthetics: how nice the orthography looks. Some language communities want orthographies that remind them of major language orthographies like those of English or French. Another example from Lamkang is the representation of repeated vowel sounds. Some community groups wish to write the second vowel with a different letter because they perceive the result as more aesthetically pleasing, as in kruung vs. kruwng 'lord, god'.

Differences Across Dialects

Another difficult issue with orthographies is linguistic variation across dialects. Which variety should be represented in the spelling system? If there are only two or three varieties and the differences are predictable, then one might be able to develop conventions for each variety. However, the community's views on standardization and on the prestige of one variety over another will be the deciding factor.


Acceptance

Even when spelling conventions are developed by community members, it may take years for them to be widely accepted and implemented by the community. Orthographic choices turn out to be a very visible sign of affiliation, be it to an individual, a religion, or a community history.

For example, the Lamkang community has a Latin-based orthography, reflecting the community's predominantly Christian beliefs. Attempting to use the Devanagari script or another orthography associated with non-Christian cultures would likely be unsuccessful. Another example of how politics and culture affect orthography use is Chechen, spoken in Chechnya in the North Caucasus. Over the past century, Chechen has been written in Arabic, Latin, and Cyrillic scripts at different times, reflecting the political and religious influences on the region.

The Importance of Writing for Revitalization

Within the documentation project, as you create transcriptions for your recordings, be consistent in your representation. You could also make clear that your transcription is for analytic purposes, or that it can be easily modified to a community standard. But be aware that many of the materials you are creating will feed into revitalization efforts. This is a huge responsibility because literacy is an important tool for revitalization, and for literacy to thrive there have to be materials to read (you are creating them!) and there have to be guidelines for writing. We hope that a writing system that is easy to read, easy to type, and pleasing to the eye will be widely adopted and used by the community going forward in all types of genres, from the novel to the shopping list. Adoption and frequent use of the language by younger generations is a goal that could be met, to some extent, through the written language.
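One small, practical aspect of consistency is how characters are encoded. The same accented letter can be typed as a single precomposed character or as a base letter plus a combining mark; the two look identical on screen but do not match in searches or sorting. The sketch below, which assumes Unicode NFC is the form you settle on and uses a hypothetical helper name normalize_transcription, shows one way to bring transcriptions into a single form before saving or comparing them.

```python
import unicodedata

def normalize_transcription(text):
    """Return `text` in Unicode NFC form with whitespace runs collapsed."""
    composed = unicodedata.normalize("NFC", text)
    # split() with no arguments drops leading, trailing, and repeated spaces.
    return " ".join(composed.split())

typed_one_way = "tr\u0113n"     # precomposed e with macron
typed_another = "tre\u0304n "   # e + combining macron, plus a stray space

print(typed_one_way == typed_another)  # False
print(normalize_transcription(typed_one_way)
      == normalize_transcription(typed_another))  # True
```

Whichever form you choose, stating it in your project documentation makes it easier for others to search, reuse, and adapt your transcriptions to a community standard.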