How to import Chinese characters with ruby text into Adobe InDesign
When I was editing “The Little Prince in Cantonese with Jyutping” in 2017, I needed to add Jyutping (a romanization system for Cantonese) annotations to every word in the book. Doing this inside the InDesign interface was difficult and inefficient. Firstly, the English version of InDesign lacks tools for handling ruby characters, so I had to install the Chinese version. Even then, adding phonetic ruby characters word by word or line by line in the program was still inconvenient. In practice, it would be much quicker to edit and proofread the text in a separate file and then import it directly.
In general, finding a solution to this kind of problem should take no more than a quick Google search, considering how many children’s books are printed with phonetic annotations. Strangely, I couldn’t find any solutions online, and I was perplexed about how those children’s book publishers managed to do it. So I had no choice but to work out a way to quickly import ruby characters on my own.
Adobe InDesign Tagged Text
After some research, I discovered that InDesign can export text as an Adobe InDesign Tagged Text file, and conversely it can import tagged text too. Most importantly, these text files support ruby characters. The format of a tagged text file is somewhat intriguing: when font and layout styles are applied to the text, the exported file no longer looks like a simple plain text file. However, if we apply no additional formatting and only care about ruby characters, the markup syntax is not difficult to understand. Let’s take a simple example to illustrate:
<pstyle:><cr:1><crstr:siu2 wong4 zi2>小王子<cr:><crstr:>
The first two lines are basic specifications and can be ignored. The third and fourth lines start with <pstyle:>, indicating the beginning of a paragraph. The format for annotating Chinese characters is a bit unusual; it does not use opening and closing tags like HTML or XML. In any case, the tags required for each unit of Chinese characters are as follows. Note that the ruby text is specified within the first <crstr:> tag.
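To make the tag pattern concrete, here is a minimal Python sketch that wraps a unit of Chinese text and its ruby text in the tags shown in the example above. The function names ruby_unit and paragraph are hypothetical, chosen only for illustration, and the sketch assumes the text contains no characters that have a special meaning in tagged text (such as angle brackets).

```python
from typing import Iterable


def ruby_unit(base: str, ruby: str) -> str:
    """Wrap base text and its ruby text in the tag pattern from the example.

    Hypothetical helper: the ruby text goes inside the first <crstr:> tag,
    and the trailing empty <cr:> and <crstr:> tags close the annotation.
    """
    return f"<cr:1><crstr:{ruby}>{base}<cr:><crstr:>"


def paragraph(units: Iterable[str]) -> str:
    """Join annotated units into one paragraph line starting with <pstyle:>."""
    return "<pstyle:>" + "".join(units)


if __name__ == "__main__":
    print(paragraph([ruby_unit("小王子", "siu2 wong4 zi2")]))
```

Run as a script, this prints the same paragraph line as the example above.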
Additionally, it is important to note that this tagged text file needs to be saved in UTF-16 LE encoding with Windows CRLF line endings; otherwise, InDesign cannot import it correctly.
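As a rough illustration of that requirement, the following Python sketch writes a list of lines as UTF-16 LE bytes with CRLF line endings. The function name write_tagged_text and the output file name are hypothetical. The sketch writes exactly the lines it is given, so any specification lines at the top of the file have to be supplied by the caller, and Python's "utf-16-le" codec does not add a byte order mark; whether InDesign needs one is not something this sketch settles.

```python
from pathlib import Path
from typing import Iterable


def write_tagged_text(path: str, lines: Iterable[str]) -> None:
    """Write lines as UTF-16 LE with CRLF endings (hypothetical helper).

    Encoding to bytes by hand avoids any platform-dependent newline
    translation that text-mode file writing might apply.
    """
    data = "".join(line + "\r\n" for line in lines).encode("utf-16-le")
    Path(path).write_bytes(data)


if __name__ == "__main__":
    # Illustrative file name only; the paragraph line is the example from above.
    write_tagged_text(
        "example_indesign.txt",
        ["<pstyle:><cr:1><crstr:siu2 wong4 zi2>小王子<cr:><crstr:>"],
    )
```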
Once the technical aspects are understood, the remaining task is to write a small program that converts the Chinese text and its corresponding ruby text into the format required by the tagged text file. With this, you can easily import a large number of Chinese characters with ruby text into InDesign without having to edit each character manually. This is why I wrote the tool “ruby2indesign”:
ruby2indesign accepts .txt files in the following format:
The Chinese text for each sentence and its corresponding phonetic annotations are placed on alternate lines. Odd-numbered lines contain the Chinese characters, while even-numbered lines contain the ruby text. The Chinese characters do not require spacing, but a single space is needed between each group of ruby text.
nei5 hou2 , siu2 wong4 zi2 !
baai1 baai3 laa3 !
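The conversion from this input format can be sketched in Python as follows. This is not the actual ruby2indesign code, only an illustration under stated assumptions: each Chinese character is paired with exactly one space-separated ruby token, a punctuation token marks a position that receives no ruby, and every pair of lines becomes one <pstyle:> paragraph. The names is_punctuation, convert_pair and convert are hypothetical.

```python
import unicodedata
from typing import List


def is_punctuation(token: str) -> bool:
    """True if every character in the token is Unicode punctuation."""
    return all(unicodedata.category(c).startswith("P") for c in token)


def convert_pair(chinese: str, ruby: str) -> str:
    """Turn one Chinese line and its ruby line into a tagged paragraph."""
    chars = list(chinese.strip())
    tokens = ruby.split()
    if len(chars) != len(tokens):
        raise ValueError(
            f"{len(chars)} characters but {len(tokens)} ruby tokens: {chinese!r}"
        )
    parts = ["<pstyle:>"]
    for ch, tok in zip(chars, tokens):
        if is_punctuation(ch) or is_punctuation(tok):
            parts.append(ch)  # punctuation is copied through without ruby
        else:
            parts.append(f"<cr:1><crstr:{tok}>{ch}<cr:><crstr:>")
    return "".join(parts)


def convert(text: str) -> List[str]:
    """Convert alternating Chinese/ruby lines; blank lines are skipped."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected an even number of non-empty lines")
    return [convert_pair(lines[i], lines[i + 1]) for i in range(0, len(lines), 2)]


if __name__ == "__main__":
    # Made-up sample input in the alternating-line format.
    for paragraph in convert("小王子!\nsiu2 wong4 zi2 !\n"):
        print(paragraph)
```

Raising an error when the character count and token count differ is a deliberate choice here: a silent mismatch would shift every later annotation by one character, which is hard to spot when proofreading.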
After the format conversion, full-width punctuation marks will not be annotated with their corresponding half-width punctuation marks. The imported result will look like this:
Importing the tagged text file
Create a text frame in InDesign:
From the menu, select 檔案 > 置入 (File > Place). Note that ruby text is imported only in the Chinese version of InDesign, not in the English one.
Then select the _indesign.txt text file converted with ruby2indesign.