Monday, August 18, 2008

A guessing game with online translators?

In the translation forums, there have been comments to the effect that Google Translate and Yahoo Babel Fish were not designed as language learning tools, nor to translate one word at a time. Their main goal is to quickly translate or create whole sentences in other languages.

Although they may use technologies such as statistical machine translation (SMT) or Language Weaver, or license other algorithms to arrange the translation in the proper grammatical sequence, it is clear that they still translate mainly word by word.

For example, in English, there are words that are spelled the same but have totally different meanings based on the context of the sentence. Some examples are:

Train (transportation vehicle - railway engine with its carriages)
火车 huǒchē or 列车 lièchē

Train (prepare oneself, through instruction, practice, exercise)
培养 péiyǎng – cultivate or 锻炼 duànliàn – exercise

Temple (building in which people worship)
圣殿 shèngdiàn or 寺院 sìyuàn

Temple (flat part on the side of the forehead)
太阳穴 tàiyángxué

Calf (young cow or bull - young of certain other mammals)
犊 dú

Calf (back of the human leg)
小腿 xiǎotuǐ

Now enter each of these English terms into Google Translate or Yahoo Babel Fish as a single word and translate it into Chinese. Which Chinese word will it choose? The first word in its electronic dictionary list, perhaps? What about in a sentence? Will it be smart enough to know the context of the sentence and choose the best word? Try "the calf on my leg", or try "calf muscle". Each time it chooses the baby cow for those sentences instead of the part of the body.

There is definitely an issue with the translation. How can Google or Yahoo solve it? They will either need a better translation algorithm that understands the context of the sentence (very difficult), or they will need to let users select the proper meaning or word.

Reflection translation is often a useful way to make sure your translation is correct.

This lets you take the sentence just generated and run it back through the translation engine. For example, in Google or Yahoo, take your English-to-Chinese translation and run it back through as Chinese to English. You may be surprised at how it translates your sentences (it often fails to guess properly).
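
The reflection check described above can be sketched in a few lines of Python. This is only an illustration: `round_trip` is a hypothetical helper, and `toy_forward` and `toy_backward` are toy stand-ins for a translation engine that always picks its first dictionary entry. They are not the real Google or Yahoo services, and the two small dictionaries contain only words from the examples in this post.

```python
from typing import Callable, Tuple

Translator = Callable[[str], str]


def round_trip(text: str, forward: Translator, backward: Translator) -> Tuple[str, str, bool]:
    """Translate text forward, translate the result back, and compare."""
    translated = forward(text)
    returned = backward(translated)
    # Crude check: the text that comes back should equal what we started with.
    matches = returned.strip().lower() == text.strip().lower()
    return translated, returned, matches


# Toy stand-ins (assumption): each "engine" knows one fixed answer per word.
TOY_EN_ZH = {"calf": "犊", "temple": "寺院"}
TOY_ZH_EN = {"犊": "young cow", "寺院": "temple"}


def toy_forward(text: str) -> str:
    return TOY_EN_ZH[text]


def toy_backward(text: str) -> str:
    return TOY_ZH_EN[text]


for word in ("calf", "temple"):
    print(word, round_trip(word, toy_forward, toy_backward))
```

With these toy dictionaries, "temple" comes back unchanged while "calf" comes back as "young cow", which is the kind of mismatch a reflection translation is meant to expose. An exact string comparison is deliberately simple; in practice you would read the returned sentence yourself and judge whether the meaning survived.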

A better way to check a Chinese translation generated by Google Translate or Yahoo Babel Fish is this site, ThePureLanguage.com. If certain words of the translation are incorrect (for example, Google guesses wrong at your English words), try translating those words in The Pure Language.

The English to Chinese Translation will allow you to select the appropriate translation from the multiple word choices displayed.

The Pure Language doesn’t guess at the possible meanings of your English words. It lets you select based on your human intelligence. Who is more qualified to know the sentence's context than the creator, or the reader who knows the subject of the text?
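
The difference between guessing and letting the user select can be sketched as follows. This is a minimal illustration, not the code behind any of the sites mentioned: the `SENSES` table and the function names are hypothetical, and the table contains only the example words listed earlier in this post.

```python
# Hypothetical sense table built only from the examples in this post.
SENSES = {
    "train": [
        ("transportation vehicle", ["火车 huǒchē", "列车 lièchē"]),
        ("prepare through instruction or practice", ["培养 péiyǎng", "锻炼 duànliàn"]),
    ],
    "temple": [
        ("building in which people worship", ["圣殿 shèngdiàn", "寺院 sìyuàn"]),
        ("flat part on the side of the forehead", ["太阳穴 tàiyángxué"]),
    ],
    "calf": [
        ("young cow or bull", ["犊 dú"]),
        ("back of the human leg", ["小腿 xiǎotuǐ"]),
    ],
}


def candidates(word):
    """Return every known sense of the word, without guessing."""
    return SENSES.get(word.lower(), [])


def first_entry_guess(word):
    """Mimic a translator that always takes the first dictionary entry."""
    senses = candidates(word)
    return senses[0][1][0] if senses else None


def choose(word, sense_index):
    """Return the translations for the sense the user picked."""
    return candidates(word)[sense_index][1]


# Show the choices to the user instead of guessing for them.
for number, (meaning, translations) in enumerate(candidates("calf")):
    print(number, meaning, " / ".join(translations))
```

Here `first_entry_guess("calf")` returns the baby cow, while `choose("calf", 1)` returns the leg, because a person who knows the sentence made the choice.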

Try the Pure Language today and improve your Chinese translations from Google and Yahoo translators.

Friday, August 15, 2008

Where's the Pinyin?

Many people have asked why Google Translate and AltaVista/Yahoo Babel Fish cannot provide pinyin romanization for Chinese translations. Some would think that this would be an easy feature to add and that it would make their respective translators more useful.

It's interesting to note that even native Chinese speakers sometimes don’t know how to pronounce some of the less common characters. Would pinyin help even a native speaker? I have noticed that the older generations of Chinese people are not familiar with pinyin. The younger generation, however, is skilled enough to read and even write pinyin, although sometimes they are not too sure about the tones.

So what really are the challenges faced by Google Translate and AltaVista/Yahoo Babel Fish when it comes to showing pinyin? One of the biggest is how to display the pinyin cleanly alongside each character. You may have noticed that here at The Pure Language we use tables in our translation output to align the Chinese word, sentence, or idiom with the pinyin and English. We also split the Chinese into logical words, so it's not one big block of symbols. Splitting up the Chinese block is also something that Google and AltaVista/Yahoo do not attempt. To show the pinyin correctly positioned under the characters, they would need to change their output formatting and add the splitting functionality to their translation code.

Another obstacle they face is that some single Chinese characters have multiple pinyin pronunciations. So which one should be shown? It’s difficult to know without understanding the context of the sentence. The Pure Language doesn’t guess at the meaning of the content, so it displays all variations. Google and AltaVista/Yahoo, however, guess in order to provide the translation and grammar, so they could easily select the wrong pinyin variation.
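
To make the splitting and alignment idea concrete, here is a rough Python sketch. It assumes a greedy longest-match splitter over a tiny word list, which is one common approach and not necessarily how this site or any other translator does it. The `LEXICON` table, `segment`, and `rows` are hypothetical names, and the entries are words taken from the previous post.

```python
# Hypothetical mini-lexicon: Chinese word -> (pinyin, English).
LEXICON = {
    "火车": ("huǒchē", "train"),
    "小腿": ("xiǎotuǐ", "calf (leg)"),
    "太阳穴": ("tàiyángxué", "temple (forehead)"),
}
MAX_LEN = max(len(word) for word in LEXICON)


def segment(text):
    """Split a block of characters into words, longest match first."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, shrinking down to one character.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in LEXICON or size == 1:
                words.append(piece)
                i += size
                break
    return words


def rows(text):
    """Build one (Chinese, pinyin, English) row per word for a table."""
    return [(word, *LEXICON.get(word, ("?", "?"))) for word in segment(text)]


for chinese, pinyin, english in rows("太阳穴火车"):
    print(chinese, pinyin, english, sep="\t")
```

Each row keeps a word together with its own pinyin and English, so a page can lay them out as table cells instead of printing one unbroken block of characters. Characters that are not in the word list fall through as single-character rows marked with "?".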
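
Showing every reading instead of guessing one can be sketched like this. The `READINGS` table and `pinyin_all` are hypothetical and cover only a few characters for illustration; a real tool would need a full character dictionary.

```python
# Hypothetical readings table: character -> all known pinyin readings.
READINGS = {
    "长": ["cháng", "zhǎng"],
    "小": ["xiǎo"],
    "腿": ["tuǐ"],
}


def pinyin_all(text):
    """Return pinyin for each character, joining alternatives with slashes."""
    return " ".join("/".join(READINGS.get(ch, ["?"])) for ch in text)


print(pinyin_all("长腿"))  # cháng/zhǎng tuǐ
```

The reader sees both cháng and zhǎng and picks the one that fits the sentence, rather than trusting a guess.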

But wait! Hold on! Google has created a Chinese to English Dictionary feature that allows you to look up a Chinese word and get its pinyin and English translation. Unfortunately, it's limited to one word at a time, and there are no pinyin tones. That’s right, no tones! Very strange!

For example, when I translated the character 长, I got the following (somewhat correct, but no tones):
[Pinyin] [chang]
[Pinyin] [zhang]
· long
· to grow
· a strong point ; strength of someone or something
· the length
· the person in charge
· a senior ; a superior
· elder ; older
· to increase ; to acquire

This is extremely limited for someone who wants to convert more than a few words into pinyin. How could someone be expected to use this for a whole page of Chinese characters? Which part of the block of Chinese characters is actually the word or idiom? Also, regarding the above translation from Google, it is my experience that cháng refers to long (length) and zhǎng refers to older (chief). Google makes no distinction here, whereas The Pure Language makes the distinction clear by separating each pinyin and English variation with slashes.

But who really would want to use this dictionary feature for more than a few words? There are faster and better methods, such as this site, The Pure Language.

Someone said in another online post that Google and AltaVista/Yahoo's translators were not designed as language learning tools, but for quickly translating or creating sentences in other languages. I see their main goal. That is why I have created this site. Not only is it a great learning tool, but you can also use it to validate the translations generated by Google, AltaVista/Yahoo, and others, to make sure they guessed correctly at the words and meaning of your sentence. You might be very surprised when you check what you got from these other sites.

On the Pure Language site, you can also generate pinyin from the English you enter, or even translate pinyin back into English and Chinese. Give it a try today! Enjoy!!!