Saturday, 15 December 2012

Guangdong - Neih sīkm̀hsīk góng gwóngdùngwá a?

Like most people, I commonly use the word Chinese when I'm talking about the language that people speak in China.  Of course, the reality is that there isn't one single language called Chinese, but rather a collection of dialects and topolects that, collectively, make up the Chinese 'family' of languages.  I'm sure that the distinction between a dialect and a language is every bit as controversial, and just as dependent on the political situation, in China as it is anywhere else in the world.  I like the word topolect, as it suggests something more than a mere dialect.

Pǔtōnghuà and the rest

There's no disputing that Mandarin rules the roost in China - with more than 840 million speakers, Mandarin is, by far, the biggest Chinese topolect.  It is also politically important, as it is the official language of the Chinese government and power structures.  Mandarin has been adopted as the 'standard' version of Chinese, called Pǔtōnghuà or 'common language'.

Other major Chinese topolects include Wu, which is spoken around Shanghai by about 90 million speakers; Yue (a.k.a. Cantonese), spoken in Guangdong and the south, including Hong Kong, by about 70 million people; Xiang (65 million), spoken in Hunan; Min (60 million), in Fujian, Hainan and parts of Taiwan; Hakka (50 million), spoken in Fujian and Guangxi; and Gan (30 million), spoken in Jiangxi, Hubei and Hunan.

The bigger linguistic picture

The Languages of China
The Chinese languages themselves belong to a wider language family that includes Tibetan and many of the languages of Myanmar (Burma).

Alongside the Chinese languages, there are minority languages of hill tribe peoples, Turkic languages in the west, and Mongolian and Korean in the north.  The overall picture makes for a greater level of linguistic diversity than you would, at first, imagine.


Mandarin has the largest number of native speakers in the world, followed by Spanish, English, Hindi and Arabic, but Cantonese itself is a substantial language, equivalent in its number of native speakers to languages like Italian or Turkish.  Cantonese is also very much a world language and is spoken in immigrant communities all over Asia, Europe and the Americas.

Trying to learn Cantonese?

I tried to learn some Cantonese before taking a trip to Hong Kong - not so much to speak the language, as to get a sense of its sound and 'feeling'.  Whilst I have learned a tonal language before (Thai) and could, on a very basic level, understand the importance of hearing the different tones of Cantonese, it was the first time that I ever studied a language where there are two recordings for every listening exercise: one at the normal speed of a native Cantonese speaker and a second, slowed down for the benefit of the Cantonese student.  I think that says it all really!

Lexical environment analysis

The different dialects of Yue/Yuet (Cantonese)
As part of my research, I did a very rudimentary lexical analysis of Mandarin and Cantonese, to get a sense of how different they are, at least in terms of some basic vocabulary.

I've used a simple test that I call Lexical environment analysis to determine the relationship between words in different languages.  I used a very similar test with Uighur to explore its relationship with other Turkic languages and possible influences from (Mandarin) Chinese. 

The concept is straightforward and is based on my knowledge of the relationship between English, German and French.  I believe there are certain words that are part of the Natural environment of a language - things such as body parts, natural elements, native animals etc. that surely existed in the language before colonisation or influence by another people/language.  Then there are things which are part of a Constructed environment - furniture, inventions and other innovations that were (perhaps) introduced by another people in their own language.

English, French and German

You can see what I mean by the comparison of English, French and German (below):

Natural environment lexicon


English   German    French
Finger    Finger    doigt
Father    Vater     père
Moon      Mond      lune
Rain      Regen     pluie
Swine     Schwein   cochon/porc
Earth     Erde      terre

It's obvious here that English has much more in common with German, which makes sense as English is, fundamentally, a Germanic language.


Constructed environment lexicon


English    German            French
Glove      Handschuh         gant
Boss       Chef              chef
Candle     Kerze             bougie
Umbrella   Schirm            parapluie
Pork       Schweinefleisch   porc
Chair      Stuhl             chaise

The picture is more complicated here and you can see the influence of French on English words such as pork and chair.
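The comparison above is made by eye.  As a rough sketch of how it could be put into numbers, the snippet below scores each English word against its German and French counterpart using Python's standard difflib module.  The choice of spelling similarity as the measure, and the names NATURAL, CONSTRUCTED, similarity and mean_similarity, are my own assumptions for illustration - they are not part of the original test, and spelling is only a crude stand-in for a real relationship between words.

```python
from difflib import SequenceMatcher

# Word lists copied from the two tables above: (English, German, French).
NATURAL = [
    ("Finger", "Finger", "doigt"),
    ("Father", "Vater", "père"),
    ("Moon", "Mond", "lune"),
    ("Rain", "Regen", "pluie"),
    ("Swine", "Schwein", "cochon/porc"),
    ("Earth", "Erde", "terre"),
]
CONSTRUCTED = [
    ("Glove", "Handschuh", "gant"),
    ("Boss", "Chef", "chef"),
    ("Candle", "Kerze", "bougie"),
    ("Umbrella", "Schirm", "parapluie"),
    ("Pork", "Schweinefleisch", "porc"),
    ("Chair", "Stuhl", "chaise"),
]


def similarity(word, other):
    """Spelling similarity between 0 and 1 (an assumed, crude measure).

    If `other` lists alternatives separated by '/', the closest one is used.
    """
    return max(
        SequenceMatcher(None, word.lower(), alt.lower()).ratio()
        for alt in other.split("/")
    )


def mean_similarity(rows, column):
    """Average similarity of the English word to the word in `column`."""
    return sum(similarity(row[0], row[column]) for row in rows) / len(rows)


for label, rows in (("Natural", NATURAL), ("Constructed", CONSTRUCTED)):
    german = mean_similarity(rows, 1)
    french = mean_similarity(rows, 2)
    print(f"{label}: German {german:.2f}, French {french:.2f}")
```

A higher score only means that two words are spelled alike, so a pair like Boss and Chef scores nothing even though the idea is the same.  With a sample of six words per list, the figures are an illustration rather than evidence.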

Mandarin and Cantonese

Natural environment lexicon


English   Mandarin   Cantonese
Finger    shǒuzhǐ    sau ji
Father    fùqīn      foo chan
Moon      yuèliàng   yuet
Rain yue
Swine     zhū        jùe
Earth     dìqiú      dei kau

At a very basic level, it's clear that Mandarin and Cantonese are incredibly close in terms of their basic natural environment lexicon.  If we look at the constructed environment lexicon, however, it's clear that the two 'languages' are in a process of separating and developing lexicons which are unfamiliar to each other's native speakers.


English    Mandarin   Cantonese
Glove      shǒutào    sau mat
Boss       lǎobǎn     boh si
Candle     làzhú      laap juk
Umbrella   sǎn
Pork       zhūròu     jue yuk
Chair      yǐzi       dang

I think it's a process that takes centuries, but you can definitely see a bigger difference with these more 'modern' words.  It's also interesting that, despite the fact that this is such a small sample, I can already see the influence of English on Cantonese, more so than on Mandarin (boss and boh si).

Cantonese Wikipedia

I think Wikipedia can be a good indication of how languages are doing in terms of their online presence, so I did a quick survey of which Chinese languages have their own Wikipedias.

Not surprisingly, Mandarin is right up there and is the 11th biggest Wikipedia in terms of the number of articles, not far behind Portuguese.  Cantonese is currently number 92 - not great in European terms (fewer articles than Sicilian!) but equivalent to other 'big' languages from outside Europe, e.g. Gujarati, which has 49 million native speakers.  Min is also present on Wikipedia - interestingly written in a Romanised script.  Wu, Hakka and Gan are also there, but with very small numbers of articles, equivalent to Maltese, Cornish and Corsican, respectively!

It will be interesting to see how the 'other' Chinese topolects compete with Mandarin in the future - I wonder if they will forever be consigned to 'dialect' status, or whether they will become languages in their own right?

I'm going to leave you with a sample of how Cantonese sounds - you can find almost everything on YouTube, even a recital of Bai Juyi's beautiful poem, Song of Unending Sorrow, here spoken by Cantonese businessman James Chan.  Enjoy!



Image credits:

Both maps were taken from Wikimedia Commons and are in the public domain. 
