[Product-Developers] Search feature slang

Raphael Ritz r.ritz at biologie.hu-berlin.de
Mon May 18 11:40:19 UTC 2009


Hi folks,

Sorry if this is considered off-topic, but I'd appreciate input
from those who know more than I do about search technologies:

In short, I need to support full-text searches with

1. plural versus singular forms treated equally
2. American versus British English treated equally
3. (Obvious?) spelling errors corrected/taken care of.

From what I know, this translates to

1. -> stemming support
2. -> appropriate normalization? thesaurus based search?
    (if so, what would be appropriate normalizers or
     thesauri?)
3. -> similarity search where similarity is defined
    according to some algorithm (e.g., Levenshtein edit distance)
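
To make the three terms concrete, here is a minimal Python sketch of
each idea. It is illustrative only and not how TextIndexNG or any other
index implements them: naive_stem, BRITISH_TO_AMERICAN, normalize and
suggest are hypothetical names, the suffix rules are far cruder than a
real stemmer (e.g., Porter), and the spelling table is a hand-made
stand-in for a proper word list.

```python
def naive_stem(word):
    """Crude plural stripping (assumption: English, regular plurals only)."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"          # studies -> study
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]                # genes -> gene
    return w


# Hypothetical, hand-made table; a real one would come from a word list.
BRITISH_TO_AMERICAN = {"colour": "color", "centre": "center", "analyse": "analyze"}


def normalize(word):
    """Map a British spelling to its American form when the table knows it."""
    w = word.lower()
    return BRITISH_TO_AMERICAN.get(w, w)


def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def suggest(word, vocabulary, max_distance=2):
    """Return vocabulary terms within max_distance edits, closest first."""
    scored = [(levenshtein(word, v), v) for v in vocabulary]
    return [v for d, v in sorted(scored) if d <= max_distance]
```

In such a setup, stemming and normalization would be applied to both
the indexed text and the query, while suggest would only be consulted
when a query term is not found in the index vocabulary.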

First question: did I get the vocabulary right here?

Second question: looking around, I am (obviously) considering
TextIndexNG, but one thing I found there is that stemming
support is incompatible with globbing (wildcard) support.
While it seems obvious to me that the two are more or less
mutually exclusive, I'd like to know how others are dealing
with this (do you keep two differently configured text indexes
for the full-text search and query one or the other?).
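
The two-index idea can be sketched as follows. This is a toy in-memory
model written under stated assumptions, not TextIndexNG's API: DualIndex
and crude_stem are hypothetical names, tokenization is a plain
whitespace split, and wildcard terms are matched with the standard
library's fnmatch against the unstemmed vocabulary.

```python
import fnmatch
from collections import defaultdict


def crude_stem(word):
    """Stand-in stemmer (assumption): strips a single trailing 's'."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


class DualIndex:
    """Keeps a raw index for wildcards and a stemmed index for plain terms."""

    def __init__(self):
        self.raw = defaultdict(set)      # exact token -> document ids
        self.stemmed = defaultdict(set)  # stemmed token -> document ids

    def index(self, doc_id, text):
        for token in text.lower().split():
            self.raw[token].add(doc_id)
            self.stemmed[crude_stem(token)].add(doc_id)

    def search(self, term):
        term = term.lower()
        if "*" in term or "?" in term:
            # Wildcard query: match the pattern against unstemmed tokens.
            hits = set()
            for token in list(self.raw):
                if fnmatch.fnmatchcase(token, term):
                    hits |= self.raw[token]
            return hits
        # Plain query: stem the term and look it up in the stemmed index.
        return set(self.stemmed.get(crude_stem(term), set()))
```

The cost of this design is that every document is indexed twice, and a
single query gets either stemming or wildcard expansion, never both.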

Last but not least, I'd appreciate pointers to docs that teach
the general concepts, constraints, and vocabulary, so that
I know what I'm talking about in the future.

Thanks,
	Raphael




