Monday, November 4, 2013

ne-rom-translit: Roman Transliterated Nepali input method for Linux


We already have ne-rom, so why another method?

ne-rom-translit is more intuitive, with fewer things to memorize, which means easier adoption. It's similar to Google transliterate or Deepak Khanal's http://unicodenepali.com/, except for one important difference: there is no dictionary in ne-rom-translit.
Let's look at a few examples.
देवनागरी | you type in ne-rom | you type in ne-rom-translit
नेपाल | nepal | nepaal/nepAl
नेपाली | nepalI | nepaalii/nepAlI
ठिटो | Qiqo | ThiTo
मलाई | mla{ | malaai
जाऊ | jaF | jaauu

Is this transliteration scheme perfect? Is there nothing to memorize?

No, from the examples above you can see it's far from it. There are enough differences between the Latin and Devanagari scripts to make transliteration schemes complicated. To tackle this inherent impedance mismatch, different transliteration schemes have been standardized and put into practice, Harvard-Kyoto and ITRANS to name a few. These schemes have found their use in Sanskrit and Hindi. The current implementation does not strictly implement any of these standards but is closer to Harvard-Kyoto, the goal being that transliteration should be as natural as possible to us Nepalese. I took Deepak's implementation chart as reference.
Another thing that worries me is that this might not promote शुद्ध नेपाली लेखन.

I don't like typing an extra a in nepaal

Well, there is an inherent ambiguity in Nepali transliterated into Latin:
sometimes pal -> पल
in other cases pal -> पाल
I had to choose one, and I chose the former. The problem is that every Devanagari consonant carries an implicit अ. In transliteration this sometimes becomes an explicit 'a' and sometimes it doesn't, which results in this ambiguity.
I am not aware of any standardization efforts at the government level, so whenever there are choices like these it is best that the community decides. We don't have to stick with a single input method; if there is no consensus we can go with multiple variants. The implementation is open source and everybody is welcome to contribute.
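The implicit अ can be made concrete with a small Python sketch. This is not the actual m17n implementation: the tiny letter tables, the `transliterate` function and the rule that two consecutive consonants are joined with a halant are all assumptions made for illustration, and only the letters needed for the examples in this post are included. It follows the choice described above, so a consonant keeps its implicit अ unless another vowel follows, and 'aa' (or 'A') has to be typed to get ा.

```python
# Illustrative tables only: a handful of letters, not the full scheme.
CONSONANTS = {"n": "न", "p": "प", "l": "ल", "j": "ज", "Th": "ठ", "T": "ट"}

# Each vowel key maps to (independent letter, dependent sign).
# "a" has an empty sign because अ is already implicit in the consonant.
VOWELS = {
    "a": ("अ", ""),
    "aa": ("आ", "ा"), "A": ("आ", "ा"),
    "i": ("इ", "ि"),
    "ii": ("ई", "ी"), "I": ("ई", "ी"),
    "uu": ("ऊ", "ू"),
    "e": ("ए", "े"),
    "o": ("ओ", "ो"),
}

HALANT = "्"  # assumption: joins two consonants typed back to back

# Try longer keys first so that "aa" wins over "a" and "Th" over "T".
KEYS = sorted(list(CONSONANTS) + list(VOWELS), key=len, reverse=True)


def transliterate(text):
    """Convert Latin input to Devanagari using the tables above."""
    out = []
    after_consonant = False
    i = 0
    while i < len(text):
        key = next((k for k in KEYS if text.startswith(k, i)), None)
        if key is None:
            # Unknown character: pass it through unchanged.
            out.append(text[i])
            after_consonant = False
            i += 1
            continue
        if key in CONSONANTS:
            if after_consonant:
                out.append(HALANT)
            out.append(CONSONANTS[key])
            after_consonant = True
        else:
            independent, sign = VOWELS[key]
            # After a consonant a vowel becomes a sign; elsewhere it stands alone.
            out.append(sign if after_consonant else independent)
            after_consonant = False
        i += len(key)
    return "".join(out)
```

With these tables, `transliterate("pal")` gives पल while `transliterate("paal")` gives पाल, which is exactly the extra 'a' the question is about.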

I typed `table` and something silly gets typed instead of टेबल

The main reason behind this is that English is, I would say, phonetically inconsistent.
no → नो, go → गो, but do → डु and then done → डन, i.e. the same set of characters tends to be pronounced differently. We all know this; it is one of the pain points of learning English.
But when we write an English word in Nepali the conversion is phonetic: we write characters that sound similar to the English pronunciation, which is almost never what the English spelling would give.
cat -> क्याट [kyat]
mouse -> माउस [maaus]
screen -> स्क्रिन [skrin]
chemistry -> केमेस्ट्री [kemestri] (you might say that's रसायनशास्त्र in Nepali; well, how about calculus?)
It is a pity that our language does not have many words, especially in the sci-tech field, and we are forced to use borrowed words quite often. This becomes a real nuisance with transliterated layouts: you have to mentally form a phonetically correct spelling before you type. If you want to see the result of this, look at the comments on mysansar; the effect ranges from funny to illegible and at times painful.
Another facet of this problem is that English words have been accepted into the Nepali vocabulary (आगन्तुक शब्द) and they tend to be pronounced in a more Nepali way.
table -> टेबुल
glass -> गिलास
Can't this be made natural? Maybe with a dictionary or some sophisticated algorithm, yes. Or maybe we are trying to solve the wrong problem here. If you need to type regularly in Nepali, anything more than occasional Facebook status updates and tweets, switch to a fixed layout where each keyboard key maps to a fixed Devanagari character. Use the MPP romanized layout if you wish, but the time-tested Traditional layout that was used on typewriters is worth practicing if you want speed as well. Here is one variation right here.

Where is the keyboard map?

You can't make a keyboard map for this as such.
Here is a chart of which Latin letters (or groups of letters) map to each Devanagari letter:
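स्वर Vowels
(In each row, semicolons separate letters in their usual Devanagari order; commas separate alternative keys for the same letter.)
a; aa, A; i (matra ि); ii, I, ee; u; uu, U, oo; rri; rrii, rree; e; ai; o; au; M, N; H

व्यन्जन Consonants
Velar: k, q, c; kh; g; gh; ng
Palatal: ch; chh; j, z; jh; yn
Retroflex: T, t/; Th, th/; D, d/; Dh, dh/; n/
Dental: t; th; d; dh; n
Labial: p; ph, f; b; bh; m
Semi vowel: y; r; l; v, w
Fricative: sh; Sh, shh; s; h
Compound: क्ष ksh; त्र tr; ज्ञ gyn
Others
्+ZWJ ्+ZWNJ
M N * | \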


You can also refer to Deepak Khanal's chart.
The wiki has a page with usage examples.
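Because several keys in the chart share a prefix (t, th and th/, or sh and shh), typed text has to be split into keys before anything can be mapped to Devanagari. A minimal Python sketch of one way to do that, longest match first, is shown here. The `tokenize` function is hypothetical and only the list of letter keys comes from the chart; the way m17n itself resolves overlapping keys may differ.

```python
# Letter keys copied from the chart (symbols under "Others" are left out).
KEYS = """
a aa A i ii I ee u uu U oo rri rrii rree e ai o au M N H
k q c kh g gh ng ch chh j z jh yn
T t/ Th th/ D d/ Dh dh/ n/ t th d dh n
p ph f b bh m y r l v w sh Sh shh s h ksh tr gyn
""".split()

# Longest keys first, so "chh" is tried before "ch" and "ch" before "c".
BY_LENGTH = sorted(set(KEYS), key=len, reverse=True)


def tokenize(text):
    """Split typed text into chart keys, always taking the longest match."""
    tokens = []
    i = 0
    while i < len(text):
        # Fall back to the single character when no key matches (e.g. a space).
        key = next((k for k in BY_LENGTH if text.startswith(k, i)), text[i])
        tokens.append(key)
        i += len(key)
    return tokens
```

For example, `tokenize("ThiTo")` yields Th, i, T, o, and `tokenize("malaai")` yields m, a, l, aa, i: the greedy rule consumes "aa" before it can read "ai".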

How to install this in Linux?

This input method is in the early stages of development. Once it has received enough feedback and become stable, it will be contributed to m17n so that it becomes part of standard distributions.
For now installation is manual. Source and instructions are at github.
Bug fixes and enhancements are welcome. You can also contribute by sending your feedback and suggestions, updating the wiki or reporting issues on the issue tracker page.

How to install this in Windows?

You can't use this in Windows. There you have a better alternative: Google Input Tools.


Background info everybody can skip

I was looking for a decent implementation of the Nepali Traditional layout for Linux, and the closest I could get was a half-baked implementation of the Nepali Traditional Layout by MPP in IBus with ibus-m17n, which has several characters missing. I do appreciate the efforts put in by MPP; their pioneering work has brought Nepali Unicode into the mainstream, but I was never a fan of their Traditional Layout. In my opinion that layout is mostly academic. In Windows I have always used a customized version of their layout which is much closer to the TTF layout I am comfortable with, so when I switched to Linux I was looking for something similar.
The first thing I tried was editing the xkb map files, which is similar in functionality to Microsoft Keyboard Layout Creator but without the nice GUI. Then I stumbled upon m17n mim, which turned out to be way more powerful than xkb. I have implemented a custom traditional layout for my own use, ne-trad-ttf.
Then I saw that transliterating implementations were available for several Indian languages such as Hindi, Bengali, Assamese, Gujarati and others, but Nepali input methods were in a sorry state. We could in theory use the hi-itrans method to type Nepali, as we share the same script, Devanagari, but it has several characters that are needed for Hindi and never used in Nepali. I took the liberty of removing such characters to make it suitable for everyday use in Nepali, and hence ne-rom-translit.