NTI Buddhist Text Reader

Using the NTI Reader as a Translation Aid

One of the main purposes of the NTI Reader is to help translators. This page describes how the web site can be used in different translation workflows. The canonical texts included on the site can be used as an aid when translating them, and the site's dictionary tools can also be used to translate other Buddhist texts. Answers to more general questions are on the Frequently Asked Questions page.


Getting Started

Several of the most immediate benefits of using the NTI Reader for translation are:

  1. It can save you time by looking up many words at the same time.
  2. You can look at the dictionary definitions for whole phrases or sentences at one time.
  3. The NTI Reader contains a large number of Buddhist terms, literary Chinese words, and modern Chinese words, which saves you from having to buy and consult a collection of different specialist dictionaries. If you do need more information on a term, the NTI Reader helps with references using a system of abbreviations in the dictionary entry notes.
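
The first two benefits can be illustrated with a small sketch. This page does not describe how the NTI Reader itself splits a phrase into words, so the example below simply assumes a greedy longest-match against a dictionary. The function name lookup_phrase and the two-entry SAMPLE_DICT are hypothetical and exist only for illustration.

```python
# Hypothetical two-entry sample; the real dictionary is far larger.
SAMPLE_DICT = {
    "大悲": "great compassion; mahākaruṇā",
    "公案": "koan; gong'an",
}


def lookup_phrase(text, dictionary, max_len=4):
    """Split text into dictionary words using greedy longest match.

    Returns a list of (chunk, definition) pairs. The definition is None
    for single characters that are not found in the dictionary.
    """
    results = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in dictionary:
                results.append((chunk, dictionary[chunk]))
                i += size
                break
        else:
            # Nothing matched at this position: emit one unknown character.
            results.append((text[i], None))
            i += 1
    return results


for chunk, definition in lookup_phrase("大悲公案", SAMPLE_DICT):
    print(chunk, "->", definition)
```

Trying the longest match first means that a multi-character term such as 大悲 is looked up as a whole rather than character by character, which is the point of looking up a phrase in one step.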

Translation involves more than just looking up words. Translation needs to consider the lexicon, the syntactic structure of each language, and the message that the source text describes (Vinay & Darbelnet 1995, pp. 11-12). The NTI Reader aims to help translators by allowing them to look at the English equivalents of many Chinese words at one time and to drill down on individual Chinese words to relate these back to their use in canonical texts.

Translation can be difficult for any document because of a number of challenges, including the different structures and conventions of the two languages and the cultural conventions of their environments (Vinay & Darbelnet 1995, p. 4). Translation of a canonical text is especially difficult because it begins with the challenging task of understanding a source text written in a very different time and culture. This will most likely start with a document analysis that may include:

  1. Reading the source document, historic or modern, and using the web site to understand more about certain words or phrases.
  2. Getting a sense of an untranslated historic document. A full translation may not be feasible initially, but you may want to get an overall sense of the content and nature of the document, say by examining the keywords.
  3. Consulting a commentary or summary in a modern language, say from a secondary source such as an encyclopedia.

Translation Terminology

Here is an explanation of some of the terms used in the remainder of this page:

Casual Use for a Private Translation

In this workflow a user does not join the GitHub project or have contact with the NTI Reader site owner but simply wants to use the web site as a tool for a private translation. The user has their own private copy of the document and does not wish to share it. The user wants to use the site to look up word meanings and as a translation memory.

  1. Phrases or words are copied and pasted from the private Word document into the word or phrase search page to find the word meanings.
  2. If a word is found that is not in the dictionary then the user can email the site owner at alex@ntireader.org with the additional vocabulary to add.
  3. The user can copy material from the web site under the terms of the Creative Commons Attribution-Share Alike 3.0 License.

Collaborative Translation

In this workflow a user will either join the GitHub project or send linguistic artifacts to the site owner by email. Typically, the linguistic artifacts will be either entries for the words file containing new or changed vocabulary, or corrections to the files containing the canonical text. The artifacts can be added to the site and the dictionary and/or corpus rebuilt.

Getting the Sense of a Canonical Text

In this workflow the user wants to get the overall sense of a document in the Taishō.

  1. Find the document in the Table of Contents and browse the text. Mouse over the Chinese text to get a quick sense of the meaning and click on any word to get a full definition that also relates the word to how it is used in the canon.
  2. Look at the content analysis for the text. A link to the content analysis is given at the bottom of the colophon page for each document. The content analysis includes a list of proper nouns, frequencies of lexical words, frequencies of all words, and bigrams ordered by frequency. For example, for the Treatise on the Awakening of Faith in the Mahāyāna 《大乘起信論》 (T 1666) the colophon page is here and the content analysis is here. This may be useful for terminology extraction.


Language tone relates to the context it is used in and is indicated by word choice and other stylistic choices. For example, the modern English word 'deceased' indicates an administrative tone, while 'dead' indicates a conversational tone (Vinay & Darbelnet 1995, pp. 17-18). In translation of text from a modern source language to a modern target language the tone is usually preserved in the translation process. However, this is most often not the case for translation of historic texts, including canonical Buddhist texts.

The tone and style of translated Buddhist texts will usually vary with the intended audience. A target audience of lay Buddhist readers will probably appreciate a more digestible form, whereas a target audience of Buddhist scholars will appreciate greater precision and will already be familiar with basic terminology. For example, consider translation of the term 公案 gōng'àn “kōan” or “gong'an”. A lay readership will probably more easily understand “koan”, without the diacritics, which is a word now included in general English dictionaries, such as the Oxford Living Dictionary. A translator writing for a scholarly audience will probably prefer to use the term “gong'an” for a Chinese Buddhist text or “kōan” with diacritics for a Japanese canonical text.

Sanskrit and Pali words are particularly numerous in canonical texts. Take the word 大悲 dàbēi “mahākaruṇā” or “great compassion”. A lay audience will probably prefer the English equivalent “great compassion”. The problem with this is that a reader could only guess at the source word when back-translating from that English equivalent. A scholarly audience might regard it as an imprecise translation and may prefer “mahākaruṇā”, with the diacritics. This is one reason that the NTI Reader includes several English equivalents and an indication in the notes of which are Sanskrit, Pali, and Japanese. The NTI Reader uses the International Alphabet of Sanskrit Transliteration for diacritics of Sanskrit words, as described in the NTI Reader Style Guide.
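
As a rough illustration of how several equivalents for one term can serve different audiences, the sketch below stores each equivalent with a language tag and picks the first one matching an audience's preferences. The EQUIVALENTS table, the tag names, and the function choose_equivalent are all hypothetical; they are not the NTI Reader's actual data format.

```python
# Hypothetical table: each term maps to (equivalent, language tag) pairs.
EQUIVALENTS = {
    "大悲": [("great compassion", "English"), ("mahākaruṇā", "Sanskrit")],
    "公案": [("koan", "English"), ("gong'an", "Chinese"), ("kōan", "Japanese")],
}


def choose_equivalent(term, preferred_languages, equivalents=EQUIVALENTS):
    """Return the first equivalent whose tag matches a preferred language.

    Falls back to the first listed equivalent when nothing matches.
    """
    options = equivalents[term]
    for wanted in preferred_languages:
        for text, language in options:
            if language == wanted:
                return text
    return options[0][0]


# A lay readership might prefer English; a scholarly one might prefer Sanskrit.
print(choose_equivalent("大悲", ["English"]))
print(choose_equivalent("大悲", ["Sanskrit", "English"]))
```

Keeping the language tag alongside each equivalent is what makes the choice reversible: a reader who sees “mahākaruṇā” can identify the source term, which is not guaranteed for “great compassion”.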

Larger translation projects will generally use a style guide. If your project does not have one, you may consider consulting the Wisdom Publications’ Style Guide for books on Indian and Tibetan Buddhism, although it may not be sufficient for Chinese texts.

There are many more nuances to style in translation than this. For example, if the Chinese is itself a transliteration of a Sanskrit word then it may be best to keep the Sanskrit form in the target text.

If something about the NTI Reader makes it difficult for you to translate according to your style guide, please send an email to alex@ntireader.org.


Translation methodologies are often guided by linguistics. Vinay and Darbelnet developed a translation approach based on early modern linguistics in the 1950s (Malmkjær, 2011, pp. 58-60). Their approach views the translation process at the three levels of lexis, syntax, and message. One of the important aspects of their method is consideration of cultural context for both the source and target languages. In the approach of Catford, lexis is dealt with using collocations and lexical sets (Malmkjær, 2011, pp. 60-62). Nida's approach emphasizes grammatical structure using concepts from Chomsky's generative grammar (Malmkjær, 2011, pp. 62-64). Bell's and Halverson's approaches emphasize psycholinguistics and cognitive linguistics respectively (Malmkjær, 2011, pp. 64-67).

According to the classic translation methodology text by Vinay and Darbelnet, a translation unit is “the smallest segment of the utterance whose signs are linked in such a way that they should not be translated individually” (Vinay & Darbelnet 1995, p. 21). They recognise several types of translation units:

  1. Functional units - forming a syntactic group, for example, “at a location”
  2. Semantic units - having a unit of meaning, for example, “main feature”
  3. Dialectic units - expressing a unit of reasoning, for example, “on the other hand”
  4. Prosodic units - with the same intonation, for example, “You there!”

The NTI Reader can help you identify translation units by (1) giving a holistic view of a sentence and (2) finding collocations for specific words. Consider the word qiú 'to seek'. Looking at the detail page for this word you will find the collocation qiú lì “to seek profit”, which may help you identify the translation unit and choose a combination of English equivalents that fit well together.
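
The idea of finding collocations for a specific word can be sketched as follows. This is a minimal illustration that assumes the text has already been split into words and that a collocation is simply a word that frequently follows the target word; the function name collocations and the pinyin sample are hypothetical, and the NTI Reader may compute collocations differently.

```python
from collections import Counter


def collocations(tokens, target, top_n=5):
    """Count the words that immediately follow `target` in a token list."""
    following = Counter(
        nxt for word, nxt in zip(tokens, tokens[1:]) if word == target
    )
    return following.most_common(top_n)


# Hypothetical pre-segmented sample, written in pinyin for readability.
sample_tokens = ["qiú", "lì", "bù", "qiú", "lì", "qiú", "fǎ"]
print(collocations(sample_tokens, "qiú"))
```

In this sample lì follows qiú more often than any other word, which is the kind of signal that suggests treating qiú lì as a single translation unit.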

Terminology Extraction

Terminology is vocabulary for a specialized domain. Buddhism has a very large body of terminology, including subdomains and genres with their own terminology. Terminology extraction tools are software aids that help compile glossaries of terminology (Kenny, 2011, pp. 462-463). Although the NTI Reader was not specifically designed as a terminology extraction tool, it has general corpus analysis features that can help in terminology extraction. The Corpus Analysis of the Taishō version of the Chinese Buddhist canon may be used to help extract terminology for the canon as a whole. The corpus analysis includes frequencies of lexical words, frequencies of all words, and frequencies of bigrams. Frequency lists of lexical words exclude stop words, which are mainly function words that occur very frequently. Individual word entries relate the words back to the corpus with listings by frequency of occurrence, collocations, and usage examples.
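
The frequency lists described above can be approximated with a few lines of code. The sketch below assumes the text is already segmented into words; the stop-word set, the sample tokens, and the function names are hypothetical and are not taken from the NTI Reader itself.

```python
from collections import Counter

# Hypothetical stop-word list of common literary Chinese function words.
STOP_WORDS = {"之", "也", "者"}


def lexical_word_frequencies(tokens, stop_words=STOP_WORDS):
    """Count words, leaving out stop words."""
    return Counter(t for t in tokens if t not in stop_words)


def bigram_frequencies(tokens):
    """Count each pair of adjacent words."""
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical pre-segmented sample text.
tokens = ["大悲", "之", "心", "也", "大悲", "之", "心"]

print(lexical_word_frequencies(tokens).most_common(3))
print(bigram_frequencies(tokens).most_common(3))
```

Removing stop words before counting keeps very frequent function words from crowding out the content words that are candidates for a glossary.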

Corpus Analysis includes Frequencies of Lexical Words by Genre, for genres such as āgama, jātaka, and avadāna. Individual texts include content analysis for the text, including proper nouns, frequencies of lexical words, frequencies of all words, and bigrams.


References

  1. Kenny, D 2011, “Electronic Tools and Resources for Translators”, in Kirsten Malmkjær and Kevin Windle (eds), The Oxford Handbook of Translation Studies, Oxford University Press, Oxford.
  2. Malmkjær, K 2011, “Linguistic Approaches to Translation”, in Kirsten Malmkjær and Kevin Windle (eds), The Oxford Handbook of Translation Studies, Oxford University Press, Oxford.
  3. Vinay, J-P & Darbelnet, J 1995, Comparative Stylistics of French and English: A Methodology for Translation, John Benjamins Publishing, Amsterdam and Philadelphia.