Sunday, February 25, 2007

One Character Foods

Other matters are demanding my attention right now. So this post will have to be lighter than the first dozen. A sorbet, if you will.

The challenge is to find food words of only one character. Here, character has the very specific technical meaning of an assigned code point in Unicode 5.0. Word means something that would have its own entry in a dictionary.
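For anyone who wants to check candidates mechanically, here is a minimal Python sketch of the "one character" half of the rule. It assumes Python 3 and its bundled `unicodedata` module, whose database is newer than Unicode 5.0 (see `unicodedata.unidata_version`), so it may accept characters assigned after 5.0. The function name `is_one_character` is made up for illustration.

```python
import unicodedata

def is_one_character(word):
    """True if word is exactly one code point that is assigned in
    Python's bundled Unicode database (not necessarily Unicode 5.0)."""
    if len(word) != 1:
        return False
    # General category "Cn" marks code points with no assigned character.
    return unicodedata.category(word) != "Cn"

# U+98EF and U+1008E are the Han and Linear B examples from this post.
for candidate in ["\u98EF", "\U0001008E", "rice"]:
    print(repr(candidate), is_one_character(candidate))
```

Whether the character is also a dictionary word about food is the part that still needs a human.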


Listing all these words would be hopelessly tedious. So the actual challenge is to find interesting new examples for categories of one character words. In general, one example per script group.

The biggest source is the Unified Han characters. So, just by way of a concrete example, take U+98EF 飯 'cooked rice; dish; meal'.

Some of the early mostly phonetic scripts still had a few ideograms. For example, Linear B has a handful, such as U+1008E 𐂎 'wheat'.

Unicode has some ranges of symbols, mostly for compatibility with other encoding standards. The closest to a food I can think of from Miscellaneous Symbols is U+2615 ☕, which is meant to be 'coffee' or 'tea', depending on which side of the Atlantic you are on, but might just be 'soup'. Any better suggestions? What about other symbol groups? Is there a better example from the Yijing than U+4DDA ䷚, the open mouth?

Some scripts assign code points to syllables. The most obvious example is Hangul, and Korean has many one-syllable words. To be concrete, take U+BC25 밥 bab, which again means things like 'cooked rice', as in 비빔밥 bibimbap. (Bibimbap traditionally has meat in it, but Korean restaurants around here offer vegetarian versions; Wikipedia even says that might be the original. Those three syllables also share just the right number of sounds to show off just how elegant the Hangul writing system is.) What about other syllabic scripts, like Ethiopic? Are there any one-syllable food words in Ge'ez or Amharic?
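The way those three syllables share sounds can be seen by taking them apart. The following is a small Python sketch of the arithmetic decomposition of precomposed Hangul syllables described in the Unicode Standard; the function name `decompose` and the constant names are my own choices for illustration.

```python
import unicodedata

S_BASE = 0xAC00                                # first precomposed Hangul syllable
L_BASE, V_BASE, T_BASE = 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28         # leads, vowels, tails (incl. "no tail")

def decompose(syllable):
    """Split one precomposed Hangul syllable into its conjoining jamo."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, tail = divmod(rest, T_COUNT)
    jamo = [chr(L_BASE + lead), chr(V_BASE + vowel)]
    if tail:                                   # tail index 0 means no final consonant
        jamo.append(chr(T_BASE + tail))
    return jamo

# Print the jamo names for each syllable of bibimbap.
for syllable in "비빔밥":
    names = [unicodedata.name(j) for j in decompose(syllable)]
    print(f"U+{ord(syllable):04X}", names)
```

The result should agree with what `unicodedata.normalize("NFD", ...)` gives for the same syllable, which is a handy cross-check.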

In general, Unicode does not encode ligatures, since, like fonts, they relate to rendering. But again, some are encoded because of the need for reversible transcoding with other standards. In particular, there are three-letter Arabic ligatures in the U+FD50-FDC7 range. Are any of those that work in isolation food words?
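As a starting point for that search, here is a hedged Python sketch that walks the range and reads each ligature's compatibility decomposition from the `unicodedata` module. It only lists candidates with their positional tag (such as `<initial>` or `<final>`) and underlying letters; it makes no claim about which, if any, are words. The function name `three_letter_ligatures` is hypothetical.

```python
import unicodedata

def three_letter_ligatures(first=0xFD50, last=0xFDC7):
    """Return (ligature, positional tag, underlying letters) for each
    assigned code point in the range that has a decomposition mapping."""
    found = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        mapping = unicodedata.decomposition(ch)
        if not mapping:                      # unassigned, or no mapping
            continue
        parts = mapping.split()
        # A compatibility mapping starts with a tag like "<initial>".
        tag = parts.pop(0) if parts[0].startswith("<") else ""
        letters = "".join(chr(int(h, 16)) for h in parts)
        found.append((ch, tag, letters))
    return found

for ligature, tag, letters in three_letter_ligatures():
    print(f"U+{ord(ligature):04X}", tag, " ".join(letters))
```

Filtering the result on the tag `<isolated>` would narrow the list to forms that stand alone, if the range has any.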

Single-letter conjunctions, copulas and prepositions are not hard to find, but what about food nouns? Diacritics are allowed, of course, provided Unicode offers a composed character with them. One source says that é (U+00E9) is a regional Vietnamese word for húng 'basil'.
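The "composed character" condition can be tested with normalization. This is a minimal sketch, assuming Python 3's `unicodedata` module (again newer than Unicode 5.0): if a base letter plus combining marks normalizes under NFC to a single code point, a precomposed character exists. The helper name `composes_to_one` is invented for this example.

```python
import unicodedata

def composes_to_one(base, *marks):
    """Return the precomposed character for base + combining marks,
    or None if NFC leaves more than one code point."""
    composed = unicodedata.normalize("NFC", base + "".join(marks))
    return composed if len(composed) == 1 else None

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

for base in ["e", "q"]:
    result = composes_to_one(base, ACUTE)
    if result:
        print(base, "+ acute composes to", f"U+{ord(result):04X}")
    else:
        print(base, "+ acute has no precomposed form")
```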

I am disinclined to allow abbreviations. So, I do not think that the L in BLT counts; too much context is required. (Of course, around here it's VLT, which the editorial review for this book says is made with fried leeks.) Nor do longer ones count, such as trying to have U+FB00 ﬀ stand for French Fries.

Leave a comment with additions or improvements. With luck, I'll have time for a more interesting post next time.


Peter said...

Japanese has a moraic script and we find there す su 'vinegar'. Unicode 3059, I think.