Lingucomponent Sub-Project: Thesaurus Development
The goal of this project is to improve existing thesauri for OpenOffice.org and to create new thesauri for languages that don't have one yet.
This project began by finding an English (US) synonym list that was compatible with the OpenOffice.org license, then using that list and some simple software to build a thesaurus for OpenOffice.org 1.x. OpenOffice.org 2.x now uses a thesaurus built automatically from WordNet data, and the internal file format has changed to a text-based one.
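To give a rough idea of what a text-based thesaurus file can look like, here is a minimal sketch of a reader. It assumes a layout modeled on the MyThes data files: an encoding name on the first line, then for each entry a line `word|count` followed by `count` lines of pipe-separated fields, the first field being a part-of-speech tag. The function name `parse_thesaurus` and the sample data are hypothetical, and the exact format should be checked against the MyThes distribution.

```python
from typing import Dict, List, Tuple


def parse_thesaurus(text: str) -> Tuple[str, Dict[str, List[List[str]]]]:
    """Parse a small text-based thesaurus (assumed MyThes-style layout).

    Returns the declared encoding and a mapping from each entry word to
    its meanings; each meaning is a list of pipe-separated fields.
    """
    lines = text.splitlines()
    encoding = lines[0].strip()  # first line names the character encoding
    entries: Dict[str, List[List[str]]] = {}
    i = 1
    while i < len(lines):
        if not lines[i].strip():  # tolerate blank lines between entries
            i += 1
            continue
        # Entry header: "word|number_of_meanings"
        word, count = lines[i].rsplit("|", 1)
        n = int(count)
        # The next n lines each hold one meaning: "(pos)|syn1|syn2|..."
        entries[word] = [lines[i + 1 + k].split("|") for k in range(n)]
        i += 1 + n
    return encoding, entries


# Hypothetical sample data, for illustration only.
SAMPLE = """ISO8859-1
simple|2
(adj)|plain|unsubdivided
(noun)|herb|simpleton
"""

if __name__ == "__main__":
    enc, data = parse_thesaurus(SAMPLE)
    for fields in data["simple"]:
        print(fields[0], ", ".join(fields[1:]))
```

A plain-text layout like this is easy to generate from a script and easy to inspect or correct by hand, which is one practical advantage over a binary format.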
TODO
- See the list of all open thesaurus issues
- Create new thesauri (see below)
Downloads
- MyThes-1.zip (4.5 MB) - standalone version of the MyThes thesaurus code. It includes an en_US thesaurus in the new format for OOo 2.0 (but not yet the WordNet-based thesaurus).
- wn2ooo, the script used to create the OOo thesaurus from WordNet data.
Creating a new thesaurus
If you are willing to maintain a website to collect and coordinate a community-developed synonym list for any language, we need your help. Please send an e-mail to dev@lingucomponent.openoffice.org describing your skills and your interest in the project. OpenThesaurus is a web-based application for building a new thesaurus; it is already used successfully to maintain the German, Polish, and other thesauri. To run your own instance of OpenThesaurus, all you need is some knowledge of MySQL and server space with Java support.