
Langmixer stuff

The tool Langmixer has caught my attention because I think it can help with learning foreign languages, especially with the more boring part: learning vocabulary.

I am currently trying to learn Russian by myself, mostly by trying to read real texts (no useless textbook stuff). The problem with this approach is that my Russian vocabulary is not very large yet; this makes reading the texts difficult, as I have to look up nearly every word in the dictionary. Even computer dictionaries (KSocrat) or web-based ones don't really speed this process up, especially as I also want to record the words I have read in a list so that I can memorize them.

KSocrat2Langmixer Script

I haven't yet found any truly free Russian->English dictionary file on the web to use with Langmixer. If you know of one, drop me a mail.

In the meantime, I hope to get by with the data file from KSocrat. It contains a list of Russian->English word translations in a KOI8-R encoded text file. I wrote a little script to convert that file into the dictionary .js files needed by Langmixer. Unfortunately, use of the KSocrat data file is restricted to KSocrat only (it seems to come from a commercial company). BUT, if you have KSocrat, you have the data file, and can use it as you please. You can download my conversion script and run it on the data file (which is somewhere around /usr/share/ksocrat).
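To give an idea of what such a conversion involves, here is a minimal Python sketch. It is not the actual ksocrat2langmixer script. The input layout (one entry per line, Russian word and English translation separated by a tab) and the output layout (a JavaScript variable named `dictionary` holding an array of pairs) are assumptions made purely for illustration; the real KSocrat data file and the real Langmixer dictionary format may look different.

```python
import json


def parse_entries(raw_bytes, separator="\t"):
    """Decode KOI8-R bytes and split each line into (russian, english).

    ASSUMPTION: one entry per line, fields separated by `separator`.
    The real KSocrat file layout may differ.
    """
    entries = []
    for line in raw_bytes.decode("koi8_r").splitlines():
        line = line.strip()
        if not line or separator not in line:
            continue  # skip blank or malformed lines
        russian, english = line.split(separator, 1)
        entries.append((russian.strip(), english.strip()))
    return entries


def to_js(entries, var_name="dictionary"):
    """Render the entries as a JavaScript array definition.

    ASSUMPTION: the variable name and the pair layout are hypothetical.
    json.dumps escapes non-ASCII characters as \\uXXXX by default, so the
    resulting text is plain ASCII and still a valid JavaScript literal.
    """
    rows = ",\n".join("  " + json.dumps([ru, en]) for ru, en in entries)
    return "var %s = [\n%s\n];\n" % (var_name, rows)


# Tiny in-memory sample standing in for the real data file.
sample = "привет\thello\nмир\tworld; peace\n".encode("koi8_r")
js_text = to_js(parse_entries(sample))
print(js_text)
```

Writing the Cyrillic text as \uXXXX escapes sidesteps the question of which encoding the browser assumes when it loads the .js file.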

Download ksocrat2langmixer.

How to use it:

Known Problems

Well... the major problem is the 3000-entry limit of the Langmixer dictionaries. I haven't yet found out whether this is related to the Mozilla JavaScript implementation (the dictionary files are actually dynamically loaded JavaScript files that define the entries as an array), to a limit in the JavaScript language spec, or to something else. The KSocrat dictionary has about 50000 entries; this would normally be a good thing, but in this case it causes Mozilla to crash (as far as I can tell) when it tries to load the dictionary .js file.
The current version of my script avoids this by taking only every 10th entry of the dictionary, which leaves you with a somewhat random selection of words in the output file and often leaves out important ones.
I am not sure whether there is a simple solution to this; some sort of rewrite of the responsible Langmixer code might be required.
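The "every 10th entry" workaround boils down to a strided slice over the entry list. The following Python sketch only illustrates the idea (the function name `take_every_nth` and the toy word list are made up for this example, and it is not taken from the actual script):

```python
def take_every_nth(entries, step=10):
    """Keep entries 0, step, 2*step, ... and drop everything in between."""
    return entries[::step]


# Toy list standing in for the dictionary entries.
words = ["word%02d" % i for i in range(35)]
kept = take_every_nth(words)
print(kept)
```

Because the selection depends only on an entry's position in the file, not on how common the word is, frequently used words are just as likely to be dropped as rare ones.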

Many words are also not recognized because of declension, conjugation, etc. I think there are solutions to this in the Langmixer code, and I hope to be able to look into it at some point.