TOK2VOC
An AGI to SCI vocabulary resource converter by John Nickerson (Mokalus of Borg), mokalus@yahoo.com.au
(c) John Nickerson, January 2002
License Agreement
You are hereby granted the right to do whatever you like with this software
and its source code, including redistribution and alteration, on the following
conditions:
-
The author's name and contact details (John Nickerson, mokalus@yahoo.com.au)
are not removed or hidden.
-
The author accepts no responsibility for any damage that may be caused
to your hardware, other software, or wetware, through use of the program.
-
No warranty is given, either express or implied, including, but not limited
to, usability or fitness for a particular purpose, except where prohibited
by law, and on the planet Neptune ('coz we all know what they're like about
warranties).
Executive summary of license agreement
Go nuts, but don't bother me.
Current version
1.1.0b ('b' for 'beta')
Using the program
(The file "tok2voc.exe" has been compiled from "tok2voc.py" with "py2exe", available from Starship Python, and requires "python21.dll" to run.)
(The file "tok2voc.py" requires the Python interpreter, which can be downloaded for free at www.python.org for a variety of platforms.)
The program can be run either with command-line arguments or with interactive user input. The command-line syntax is as follows:
tok2voc agipath scipath
where 'agipath' is the path to the AGI vocabulary resource (WORDS.TOK),
and 'scipath' is the path of the SCI output file to write.
If no arguments are given on the command line, the program prompts for both path names during execution. If the program has trouble opening a file, try writing out the path in full.
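The two ways of supplying the paths can be sketched roughly as follows. This is only an illustration of the behaviour described above, not the actual code in tok2voc.py; the function name get_paths, the prompt wording, and the handling of a wrong argument count are all assumptions.

```python
import sys


def get_paths(argv, prompt=input):
    """Return (agipath, scipath) from the command line, or ask the user.

    Hypothetical helper: the real tok2voc.py may be organised differently.
    """
    args = argv[1:]  # drop the program name
    if len(args) == 2:
        return args[0], args[1]
    if not args:
        # No arguments: fall back to prompting, as the README describes.
        agipath = prompt("Path to the AGI WORDS.TOK file: ")
        scipath = prompt("Path of the SCI output file: ")
        return agipath, scipath
    # Assumption: any other argument count is treated as a usage error.
    raise SystemExit("usage: tok2voc agipath scipath")


if __name__ == "__main__":
    agipath, scipath = get_paths(sys.argv)
    print("Reading", agipath, "and writing", scipath)
```

The prompt function is passed in as a parameter only so that the sketch can be exercised without a keyboard.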
There is no command for online help.
Further information/programs
The AGI WORDS.TOK format: http://members.ozemail.com.au/~ptrkelly/agi/specs/otherdata/8-2.html
To view SCI vocabulary resources: http://www.bripro.com/scistudio/download.html
The SCI vocabulary file format: http://freesci.linuxgames.com/scihtml/x5171.html#AEN5173
The Python interpreter and language: http://www.python.org
Known issues
-
SCI uses "parts of speech" in parsing, but AGI does not. TOK2VOC compensates by assigning all part-of-speech codes to every word except those already designated as "noword".
Fix: Using an SCI vocab editor (such as the one included in SCI Studio), go through the file one word group at a time, assigning parts of speech manually.
-
SCI uses only 12 bits for word numbers, as opposed to AGI's 16 bits. This means that word numbers over 4095 cannot be directly ported over. As of version 1.1.0b, TOK2VOC retains most of the original numbers, but some word numbers may be altered in the translation.
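One possible way to fit 16-bit word numbers into 12 bits is sketched below. This README does not say how TOK2VOC actually chooses the replacement numbers, so the strategy shown here (keep every number up to 4095, and give each larger number the highest value that is still unused) is purely an assumption, as are the names MAX_SCI_NUMBER and remap_word_numbers.

```python
MAX_SCI_NUMBER = 0xFFF  # 4095, the largest value that fits in 12 bits


def remap_word_numbers(agi_numbers):
    """Map AGI word numbers to numbers that fit in 12 bits.

    Hypothetical strategy, not necessarily what TOK2VOC does:
    numbers <= 4095 are kept, larger ones get the highest free number.
    """
    unique = sorted(set(agi_numbers))
    used = {n for n in unique if n <= MAX_SCI_NUMBER}
    mapping = {}
    candidate = MAX_SCI_NUMBER
    for number in unique:
        if number <= MAX_SCI_NUMBER:
            mapping[number] = number  # fits already, keep as-is
            continue
        # Search downwards for a number no other word group is using.
        while candidate in used:
            candidate -= 1
        if candidate < 0:
            raise ValueError("more word groups than 12 bits can hold")
        mapping[number] = candidate
        used.add(candidate)
    return mapping


if __name__ == "__main__":
    print(remap_word_numbers([0, 1, 5, 5000, 9999]))
```

With this approach only the word groups numbered above 4095 change, which is consistent with "most of the original numbers" being retained.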
Bug reporting
To report a bug, please email me at mokalus@yahoo.com.au,
using the following format:
Name:
Date:
Program version:
Command line used:
Description of error:
Error message (if any):