[Evangelism] Help to improve text search for East-Asia languages
Takeshi Yamamoto
tyam at mac.com
Thu Nov 6 04:33:18 UTC 2008
I am also posting this outside the Plone-AsiaPacific ML, for people
who use Plone with non-English/Latin languages.
Some languages need to be handled differently for text search to work well.
Help to improve text search for East-Asia languages
Here is the first initiative (in other words, a request for help) for
the Asia-Pacific area.
Some of you may know that the Japanese Plone community is working on
improving the text search feature of Plone for East Asian languages.
For example, Japanese words are not separated by spaces, and splitting
on spaces is similarly inadequate for Chinese and Korean.
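To illustrate the general idea behind a bigram splitter, here is a
minimal sketch in plain Python 3. It is not the bigramsplitter code
itself, and the function names bigrams() and matches() are made up for
this example. Text that has no spaces is cut into overlapping
two-character tokens, so a query can match in the middle of a run of
characters.

```python
def bigrams(text):
    """Split text into overlapping two-character tokens.

    Whitespace still separates chunks; a one-character chunk is kept as is.
    (Hypothetical helper for illustration, not the project's API.)
    """
    tokens = []
    for chunk in text.split():
        if len(chunk) < 2:
            tokens.append(chunk)
            continue
        # Slide a window of width 2 across the chunk.
        tokens.extend(chunk[i:i + 2] for i in range(len(chunk) - 1))
    return tokens


def matches(query, document):
    """Crude test: every bigram of the query also occurs in the document.

    This ignores token positions, so it is only a rough approximation
    of what a real index would do.
    """
    query_tokens = bigrams(query)
    if not query_tokens:
        return False
    doc_tokens = set(bigrams(document))
    return all(token in doc_tokens for token in query_tokens)


if __name__ == "__main__":
    print(bigrams("全文検索"))
    print(matches("検索", "全文検索エンジン"))
```

A splitter that only cuts on spaces would treat "全文検索エンジン" as one
token, and a search for "検索" would find nothing. With overlapping
bigrams the query token is present in the document's token list.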
Mr. Terada, CEO of CMSCOM, took the lead and worked on this in Google
Summer of Code this year, as one of the Plone Foundation-supported
projects. Unfortunately, the student gave up and the work was not
completed. Terada-san decided to finish it and started it again as his
company's project. Since the feature is valuable for many people
(1.5 billion people live in the Kanji region), it is open source, and
we hope it can be built into out-of-the-box Plone, the Japanese
community is supporting this project. We will hold a sprint for it at
World Plone Day 2008 Tokyo.
The software is currently in BETA. You can download and try it, or
just go to the test site and play with it. We appreciate any bug
reports or suggestions. We do not have enough testers for
"non-Japanese" languages.
The languages we would like to cover with bigramsplitter are:
Japanese
Mandarin Chinese (Beijing)
Cantonese (Canton)
Taiwanese (Taiwan)
Korean (Korea)
Mongolian (Mongolia)
Thai (Thailand)
Vietnamese (Viet Nam)
Jawi (Malaysia)
Bahasa Indonesia (Indonesia)
Hebrew (Israel)
Arabic (Middle-East)
etc.
Languages that are not used in Asia but differ from English/Latin
languages are welcome too, of course.
The project site is here:
http://code.google.com/p/bigramsplitter/
You can download the code from here:
http://code.google.com/p/bigramsplitter/downloads/list
The test site to play with is here:
http://c2search.cmscom.jp/
You may need an account to add text in your own language to be
searched. Request your login account here:
http://c2search.cmscom.jp/contact-info
Sorry that the test site is not well internationalized, but it is no
problem to write your request in English.
Thanks a lot in advance.
Takeshi Yamamoto / retsu