Constructing Names from Chinese Characters (漢字)
Constructing Japanese, Korean, Chinese, Vietnamese, and Okinawan Names for the SCA College of Heralds.
by Choi Min, last updated 2023.01.09
Introduction
A lot about Japanese and Chinese names has been covered by Mistress Sǫlveig Þrándardóttir and Master Ii Katsumori 井伊勝盛. However, I recognize that Mistress Sǫlveig's book is not accessible online, and Master Ii is not focused on the heraldic sector. So my goal is to make a concise(?) explanation that is geared towards consulting and commenting heralds, and hopefully it also helps any submitters and non-SCAdians who stumble across it.
This will be focused on Japanese and Chinese names because that is the bulk of what comes through as far as East Asia goes, but I will try to touch on other languages as I learn more. On paper, I have six semesters of Japanese and one semester of Chinese Calligraphy/Writing Systems from my university degree, but I have done a lot of linguistic reading since Uni, and am trying to learn Korean in my free time.
As I am writing this I am also running it by SCAdians who have more linguistic experience than I do: Henric of Drachenwald and Situ Zeming 司徒澤銘 of Ansteorra. Situ is contributing some writing to this page, and has a B.A. in Linguistics as well as a B.A. in Chinese Language and Culture. Also, somewhat amazingly and conveniently, someone is going through English Wikipedia as I am trying to compile this (late Sept and early Oct 2022) and is adding meticulous detail to the Korean, Japanese, and Ryukyuan language history articles with good-looking citations, so if you need more detail please look into those (and I now have too many new books to get).
Yours in Service, Choi Min 崔敏
Stepping Back from the European Lens
A message for consulting heralds:
The submitter, especially for names from regions that use Chinese characters, will likely come to you with some useful kernels of what they want. They will also likely know more about how that language works than you do, if this is not your specialty. Let them speak, and while they do so, make note of everything they say.
Starting Linguistic Terms
All definitions below are pulled from Oxford Languages Dictionary, except the one marked with an asterisk (*), which is pulled from Merriam-Webster. Where Chinese characters are used, they are Simplified Chinese Characters (简体字) with pinyin transcriptions; the eventual plan is to update these to traditional across the board to lessen confusion. Some definitions deviate slightly to account for differences between common and linguistic usage.
Morphemes:
a meaningful morphological unit of a language that cannot be further divided. (e.g. "un-" or "-done" are morphemes in the English word "undone" because they cannot be further divided and retain their meaning)
Morphograms:
A morphogram is a single symbol which represents a complete morpheme and which has a pronunciation that is language-specific. (e.g. "蝴" and "蝶" are morphograms in the Mandarin Chinese word "蝴蝶" (húdié) which means "butterfly" in English. Further, 狗 (gǒu, pinyin) is the Mandarin morphogram which corresponds to the English word "dog" but is not to be confused with 狗 (gau2, Cantonese pinyin) which is the Cantonese morphogram for the same concept.)
Pictograms/Pictographs:
a pictorial symbol for a word or phrase. Pictographs were used as the earliest known form of writing, examples having been discovered in Egypt and Mesopotamia from before 3000 BC. (e.g. early hieroglyphics in Egypt and oracle bone inscriptions in China)
Phonograms:
a symbol, or group of symbols, representing a vocal sound.
Ideograms:
a written character symbolizing the idea of a thing without indicating the sounds used to say it, e.g., numerals and Chinese characters.
Logograms:
a sign or character representing a word or phrase, such as those used in shorthand and some writing systems. (e.g. the Mandarin word for "dog" is 狗 (gǒu) which is a logogram because it is a symbol representing a whole word and also a morphogram because it represents a unit of language that cannot be further divided)
Radicals:
any of the basic set of 214 Chinese characters constituting semantically or functionally significant elements in the composition of other characters and used as a means of classifying characters in dictionaries. These are most obviously recognized in the characters 木 (mù), 林 (lín), and 森 (sēn), where 林 is two 木 placed side-by-side. Many radicals are slightly altered forms of their morphogram inspiration, as can be seen in 人 (rén), which becomes 亻 as a radical, like on the left side of the character 你 (nǐ).
Syllabary:
a set of written characters representing syllables and (in some languages or stages of writing) serving the purpose of an alphabet.
Alphabet:
a set of letters or symbols in a fixed order, used to represent the basic sounds of a language.
Transliteration:
to represent or spell in the characters of another writing system.*
Transcription:
a written or printed representation of something.
(Since this is all the dictionary entry gave us, I will add: we are specifically using this in the "write down what something sounds like" way, but more specifically, we are using this to describe someone writing down something they are hearing in a language they also know how to write in, because I do not currently know a more accurate term. For example, AI that knows English can generally transcribe a video that is run through it and generate captions. This word has other definitions, and can be used outside of this context, but for our purposes it will mean same-language script.)
Example statements using the above terms
Transcription can be transliteration if the perspective changes from "native speaker + native writing system," to "native speaker + outsider writing system."
Transcriptions typically attempt to accurately portray the pronunciation of something spoken. Alphabets originated as transcription systems for spoken language.
Transliteration attempts to approximate the sound of something spoken in another language using the target language's writing system, regardless of any meaning associated with the characters used. Loanwords are often transliterated, like the word "coffee" from English becoming 咖啡 (kāfēi) in Mandarin, which approximates the sound of coffee in English.
English words are written using an alphabet where each letter, or group of letters, makes a sound, and those sounds are combined to represent certain words.
English uses ideograms to describe numbers, such as writing "1" meaning the number unit of "one." If a German speaker were to see "1" they would pronounce it as "eins." If an English speaker were to see a "1" at the beginning of a list of numbers, they could instead pronounce it as "first". Hence, "1" could have multiple pronunciations depending on linguistic context, both within a language and between multiple languages.
Morphograms have pronunciations because they're tied to a specific morpheme within a language. The word "dog" in English is a morpheme but not a morphogram because it is made up of 3 characters whereas 狗 (gǒu) is the Mandarin morphogram for the concept "dog" because it is a single character that represents a complete morpheme.
Ideograms or pictograms are detached from specific languages, and will hold their meaning regardless of the pronunciation or context. This is the case for Arabic numerals (1, 2, 42, 785, etc.).
The radicals in Chinese characters can sometimes indicate the sound a character makes or the meaning of a character, but this is not consistent and mostly only works for Chinese languages.
Modern written Chinese languages are mostly morphograms because many words are written with two or more characters to convey a whole word. Written Literary/Classical Chinese is more logographic, because in that use, one character represents one word on its own much more often.
Chinese characters, used as ideograms, can represent words in other languages if the literate group all understand the meaning behind the ideogram. If a Japanese reader were to see 書 in a blog, they would know it means "body of writing" or "book" even if that blog is written in Chinese.
Historically, Chinese character morphograms have been used to initially transliterate many languages such as Japanese, Korean, and Vietnamese, based on the sound of a word attached to a character (in Chinese) and how it was pronounced when it was introduced. This leads to a lot of sound-alikes and loan words in languages that adopted Chinese characters as a writing system.
For hundreds of years, the ideographic nature of Classical/Literary Chinese used as a written language of governance and high literature made communication across vernaculars and different kingdoms possible.
Japanese has two syllabaries that can transcribe the different pronunciations and uses of a Chinese character. Usually one syllabary (katakana) is used to transliterate foreign words into Japanese, and the other (hiragana) is used to transcribe grammar and spell out words.
Korean has an alphabet (hangul) made of vowels and consonants, where the letters are blocked together in syllables. This alphabet can be used to transcribe the old Korean writing system which used Chinese characters, and it can transliterate foreign words into Korean.
Chinese Characters Origins
A quote from Master Þorfinnr (Alec Story) (from a calligraphy class we run together - I'll update this link if we get it uploaded to KWHSS as proceedings one year),
"Chinese characters originated out of pictograms which we find as early as 6500 BCE, but we only have good evidence of a writing system around 1200 BCE on bronze vessels and "oracle bones" used for divination. The script evolved over the next 1000 years until it was formalized in the early imperial period (~1 CE, give or take a few hundred years). By 500 CE, the form of most characters had been fixed into more or less what we use today."
New characters are occasionally added, but the bulk of what the language needed conceptually, and to communicate in writing, already existed. Over this time, over 100,000 characters have been created to represent nouns, verbs, sometimes punctuation, et cetera. Most of these are never used. Generally, the government established a group or committee to occasionally weed out duplicates and compile the most common ones, and an average modern Chinese speaker and reader who has graduated high school today should know roughly the 5,000 or 6,000 most common. An average Japanese high schooler might graduate knowing about 3,000 if they're a very good student. The list of characters a fluent speaker might know will change depending on language, career path, and education level. The same was true historically.
An Example
The word "Gryphon" did not have a single character and/or word to represent it directly in a historical context, as it is a European mythical creature. I was working on a scroll, and I wanted to carve a seal stamp representing my barony, Flaming Gryphon, thus needing a plausibly historical visage of the word. If I pull "griffin" up in a Chinese-English dictionary, I get "鹰头狮", which converts directly to eagle-head-lion. To cross-check that, I search the term as well and get "狮鹫", which is shorter and is the title of a Wiki page, so I will assume it's the more commonly used word today; it's constructed as "lion-vulture" or "lion-[large bird]". So we have a functioning word, but I want to make sure it's as old a form as possible for this project. I run the word through Wiktionary and/or other dictionaries. First issue: the dictionary entry for 鹫 marks it as a simplified form.
So, easy fix: the "bird" component within it is written in its Simplified form, and Simplified Chinese characters aren't pre-1600. I grab the traditional form instead, which is 鷲. Next is 狮, which I plug in, and again it says it has an old form, 獅, which I will use instead because for calligraphy purposes the traditional forms are almost always what would be written on a seal stamp, a doorway plaque, et cetera, even if in everyday writing a simpler form would be used when that form is within period. Following the Wiktionary link from there, it appears 獅's etymology is the character 師 for sound (how lion was pronounced), with the radical 犭 added in front, because that is the radical meaning "beast". We now have an optimal set of characters to represent the meaning of a griffin, 鷲獅, that I can carve/write.
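The character-by-character substitution described above can be sketched as a small lookup. This is only an illustration, assuming a hand-built table that contains just the two pairs worked through in this example; the names SIMPLIFIED_TO_TRADITIONAL and to_traditional are hypothetical. Each character should still be checked in a dictionary, since one simplified character can correspond to more than one traditional form.

```python
# Hypothetical lookup table: only the two pairs checked in the example above.
SIMPLIFIED_TO_TRADITIONAL = {
    "狮": "獅",  # lion
    "鹫": "鷲",  # vulture / large bird
}


def to_traditional(text: str) -> str:
    """Swap each character that has a known traditional form; keep the rest as-is."""
    return "".join(SIMPLIFIED_TO_TRADITIONAL.get(ch, ch) for ch in text)


# Convert the modern dictionary word one character at a time.
converted = to_traditional("狮鹫")
print(converted)
```

The sketch keeps the characters in the order they were typed in; which order to carve or write them in is a separate decision.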
I'll cover the aspects of seal stamp carving in another page, but we now have successfully compiled the traditional form (鷲獅) of "griffin/gryphon", using inspiration from the modern language and taking a literal approach to wordsmithing what a "griffin" is. Both "獅" lion and "鷲" vulture/condor would be visually recognizable and readable as separate concepts/words to a historical person based on their individual meanings, and that person would have the jump on the concept as a lion-bird creature whereas transliterating the phonetic syllables of "grif" and "fon" into some purely phonetic representation wouldn't be as effective in showing that meaning.
Romanization
Romanization is the term commonly used for East Asian languages to refer to systems that transliterate into the Latin (English) alphabet. In Japanese, these can even be referred to as "rōmaji" (Roman characters, as opposed to Chinese characters, "kanji"). For heraldic use, the SCA requires a uniform transliteration system for non-Latin scripts. Generally, this is the last step of the process if you are building a name from Chinese characters first: once you've built it and verified the way it is written in the native language, you choose a romanization system and convert. Thankfully, by plugging the characters into dictionaries, you can pull the correct romanizations from there.
You might be wondering, "How do I know this is a period romanization for this character?" The answer is: it's not. While there are currently scholars working on reconstructing late-Middle, early-Middle, and other period-specific versions of the languages we are about to delve into, that information is not fully developed, nor is it widely agreed upon, and all of that means that outside of writing down Chinese characters for names, we have a lack of any historical transliteration systems. When you read Joshua Badgley's "Japanese Names," or Sǫlveig Þrándardóttir's Name Construction in Mediæval Japan, the transliterations you are seeing start from the furiga'na/ふりがな/振仮名 in a period document (small phonetic notations next to Chinese characters that tell a reader how to pronounce them, essentially a transcription of the characters as best a medieval author knew how). Those are written in a writing system we can still read, but the spoken language behind them sounded different. Next, those furiga'na are transliterated, or romanized, via a modern transliteration system. We have much knowledge on how to interpret/read meaning from these period sources, but the period pronunciation is not as available. Looking to the furiga'na gives us how the local people would have written the pronunciation at the time, but does not provide the full context for how that reading might have drifted due to linguistic change.
The difference between Late Middle English (Chaucer) and Early Modern English (Shakespeare) is a decent comparison for how different the language sounds would be across time.
In order to register a perfectly faithful "period" version of a name after documenting the Chinese characters, there would need to exist full pronunciation conversion tables, laying out what different notations/transcription systems sounded like for each spoken language in the regions during different centuries, and then new, uniform transliteration systems would need to be developed to convert this knowledge into English sounds.
There exists some research for this in the respective languages, but so far there's not an easy conversion system, nor a uniform one, especially when you consider that modern China alone has many languages (see the map below for a small sample). This poses an immense barrier, because the historical data would need to account for location and time frame, and patterns would need to be established within each of those categories.
Ii Katsumori illustrates this well in Chinese Onomastics - "Different hanzi (Chinese characters) may be pronounced different ways at different times; for example, comparing Mandarin to Cantonese, Sun Zhongshan would be read Syun Jungsaan, Mao Zedong is Mou Jaakdung, and Bai Juyi is Baak Geuiyik, depending on the dialect. Unfortunately, to attempt to reconstruct each name for different regional and temporal variations is beyond the scope of this paper." As Katsumori is saying here, when written down in Chinese characters, Sun Zhongshan and Syun Jungsaan are the same name.
A map published in 1990 by the US CIA depicting the then linguistic layout within Chinese borders.
These variations are called "readings" of a character, as in how someone looks at a character and "reads" it in their language. So I could say "the Mandarin reading of 山 is shān." If English were part of this linguistic ecosystem, you could theoretically see 山 and go "Oh, that's pronounced 'mountain' in English."
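The idea of one written form carrying several readings can be sketched as a nested lookup. This is an illustration only: the romanizations are the ones quoted from Ii Katsumori above plus the Mandarin reading of 山, the characters for the three names are typed in by hand, and the names READINGS and reading_of are hypothetical.

```python
from typing import Optional

# Hand-typed illustration: one written form, keyed to its reading per language.
READINGS = {
    "孫中山": {"Mandarin": "Sun Zhongshan", "Cantonese": "Syun Jungsaan"},
    "毛澤東": {"Mandarin": "Mao Zedong", "Cantonese": "Mou Jaakdung"},
    "白居易": {"Mandarin": "Bai Juyi", "Cantonese": "Baak Geuiyik"},
    "山": {"Mandarin": "shān"},
}


def reading_of(written: str, language: str) -> Optional[str]:
    """Return the reading of a written form in one language, or None if not in the table."""
    return READINGS.get(written, {}).get(language)


# The same three characters, read two different ways.
print(reading_of("孫中山", "Mandarin"))
print(reading_of("孫中山", "Cantonese"))
```

A missing entry returns None rather than a guess, which mirrors the situation described above: for many places and centuries the reading simply is not available yet.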
Here's an example of radicals and readings in relation to each other: the Chinese name "Kunlun" 崑崙 (or 崐崘) is written with characters combining the mountain radical 山 with the phonetics of kun 昆 and lun 侖.
This is where we come to Chinese characters; even though all these variations existed, we have written records in this common writing system. Even if there are some hiccups, educated nobles from each region would have been able to identify characters and communicate on some level using writing and reading. This is why many often propose registering names in Chinese characters rather than just the phonetics, in order to skip worrying about budging this boulder until we have more information.
While we cannot include any characters in the final results of a name, having the characters included in commentary letters is extremely helpful for conflict checking, especially if we ever have similarly romanized names. Ideally, we could one day register one of the millions (billions?) of potential written name combinations with Chinese characters - which would be the way name-record-keeping was done through most of these regions in period - and then when historical and regional and linguistic transliteration systems to English catch up, submitters/SCAdians could apply more accurate pronunciations if they wish.
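Why the characters help with conflict checking can be sketched with a toy example. It reuses 師 and 獅 from the seal example earlier, which share the Mandarin reading shī. The function name group_by_romanization, the sample list, and the toneless spellings are assumptions made for illustration, not a College of Heralds procedure.

```python
from collections import defaultdict

# Hypothetical sample data: (characters, toneless Mandarin romanization).
ENTRIES = [
    ("師", "shi"),   # the sound component of 獅
    ("獅", "shi"),   # lion
    ("山", "shan"),  # mountain
]


def group_by_romanization(entries):
    """Collect every written form that ends up with the same romanized spelling."""
    groups = defaultdict(list)
    for characters, romanized in entries:
        groups[romanized].append(characters)
    return dict(groups)


# Keep only the spellings shared by more than one written form.
collisions = {
    romanized: forms
    for romanized, forms in group_by_romanization(ENTRIES).items()
    if len(forms) > 1
}
print(collisions)
```

Two different characters land under one spelling, so a commenter who only sees the romanization cannot tell them apart, while the characters make the difference obvious.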
Transliteration Systems to Consider Currently:
Note, these are all systems for the modern languages unless specified otherwise.
Mandarin: Pinyin, Wade-Giles, Yale
Cantonese: [will insert here]
Japanese: Hepburn, Nihon-shiki, JSL
Korean: Revised Romanization of Korean (RR), McCune–Reischauer (MR)
Okinawan: unofficially modified Hepburn