Qizx/Open v0.4p2
java.lang.Object
  |
  +--net.xfra.qizxopen.util.DefaultWordSifter
A default word extractor suitable for European languages compatible with ISO-8859-1.
By default, a word starts with a letter and may continue with letters or digits. Unless the sifter is configured otherwise, characters are folded to lowercase and accented letters are converted to the corresponding unaccented letters.
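The rules above can be sketched in a few lines of Java. This is a hypothetical stand-in, not the Qizx/Open source: the class name `FoldSketch`, the extra boolean parameters on `mapChar`, and the use of `java.text.Normalizer` for accent stripping are all assumptions made for illustration.

```java
import java.text.Normalizer;

// Hypothetical stand-in for illustration; not the Qizx/Open implementation.
class FoldSketch {
    // Assumed word-start rule: any letter.
    static boolean isWordStart(char c) {
        return Character.isLetter(c);
    }

    // Assumed word-part rule: letters or digits.
    static boolean isWordPart(char c) {
        return Character.isLetterOrDigit(c);
    }

    // Folds one character: drops diacritics unless accentSensitive,
    // lowercases unless caseSensitive.
    static char mapChar(char c, boolean caseSensitive, boolean accentSensitive) {
        if (!accentSensitive) {
            // NFD splits an accented letter into base letter + combining marks;
            // keeping the first char keeps only the base letter.
            String decomposed = Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
            c = decomposed.charAt(0);
        }
        return caseSensitive ? c : Character.toLowerCase(c);
    }
}
```

With both flags false, the sketch maps an uppercase accented letter to its plain lowercase base letter. Letters that have no canonical decomposition pass through unchanged apart from case folding.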
| Constructor Summary |
DefaultWordSifter()
    Builds a case-insensitive and accent-insensitive sifter.
DefaultWordSifter(boolean caseSensitive, boolean accentSensitive)
    Builds a sifter with the specified case and accent sensitivity.
| Method Summary |
char charAt(int ahead)
    Returns the character at the current position + ahead, or 0 if that position is past the end.
boolean isWordPart(char c)
    Returns true if the character can be part of a word.
boolean isWordStart(char c)
    Returns true if the character can start a word.
char mapChar(char c)
    Normalizes a character belonging to a word.
char nextChar()
    Moves to the next character and returns it, or returns 0 if at the end.
char[] nextWord()
    Gets the next normalized word, or null if there are no more words.
void start(char[] text, int length)
    Starts the analysis of a new text chunk.
char wildcardSeveral()
    Returns the wildcard character that matches several characters.
char wildcardSingle()
    Returns the wildcard character that matches a single character.
int wordLength()
    Returns the original length of the last word returned by nextWord.
int wordOffset()
    Returns the offset of the last word returned by nextWord.
| Methods inherited from class java.lang.Object |
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
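The call sequence implied by the method summary (start, then nextWord until it returns null, reading wordOffset and wordLength after each word) might look like the sketch below. The class `SifterLoopSketch` is a hypothetical stand-in that only mirrors the documented method names; its scanning rules are assumptions, and it folds case but does not strip accents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in mirroring the documented method names; not the Qizx/Open source.
class SifterLoopSketch {
    private char[] text;
    private int length;
    private int pos;         // current scan position
    private int wordOffset;  // offset of the last word returned
    private int wordLength;  // original length of the last word returned

    void start(char[] text, int length) {
        this.text = text;
        this.length = length;
        this.pos = 0;
    }

    char[] nextWord() {
        // Skip characters that cannot start a word (assumed rule: letters only).
        while (pos < length && !Character.isLetter(text[pos])) pos++;
        if (pos >= length) return null;
        wordOffset = pos;
        // Consume letters and digits (assumed word-part rule).
        while (pos < length && Character.isLetterOrDigit(text[pos])) pos++;
        wordLength = pos - wordOffset;
        char[] word = new char[wordLength];
        for (int i = 0; i < wordLength; i++) {
            word[i] = Character.toLowerCase(text[wordOffset + i]); // case folding only
        }
        return word;
    }

    int wordOffset() { return wordOffset; }

    int wordLength() { return wordLength; }

    // Collects "word@offset+length" entries for one text chunk.
    static List<String> scan(String s) {
        SifterLoopSketch sifter = new SifterLoopSketch();
        char[] chars = s.toCharArray();
        sifter.start(chars, chars.length);
        List<String> out = new ArrayList<>();
        for (char[] w = sifter.nextWord(); w != null; w = sifter.nextWord()) {
            out.add(new String(w) + "@" + sifter.wordOffset() + "+" + sifter.wordLength());
        }
        return out;
    }
}
```

Because nextWord returns the normalized form, wordOffset and wordLength are what let a caller map each word back to the original text, for instance to highlight a match.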
| Constructor Detail |
public DefaultWordSifter()

public DefaultWordSifter(boolean caseSensitive, boolean accentSensitive)
    Parameters:
        caseSensitive - if false, uppercase and lowercase characters are equivalent.
        accentSensitive - if false, a letter with a diacritic is equivalent to the same letter without it; for example, 'é' is equivalent to 'e'.

| Method Detail |
public void start(char[] text, int length)
    Specified by: start in interface WordSifter

public boolean isWordStart(char c)
    Specified by: isWordStart in interface WordSifter

public boolean isWordPart(char c)
    Specified by: isWordPart in interface WordSifter

public char wildcardSeveral()
    Specified by: wildcardSeveral in interface WordSifter

public char wildcardSingle()
    Specified by: wildcardSingle in interface WordSifter

public char mapChar(char c)
    Specified by: mapChar in interface WordSifter

public char[] nextWord()
    Specified by: nextWord in interface WordSifter

public char charAt(int ahead)
    Specified by: charAt in interface WordSifter

public char nextChar()
    Specified by: nextChar in interface WordSifter

public int wordOffset()
    Specified by: wordOffset in interface WordSifter

public int wordLength()
    Specified by: wordLength in interface WordSifter
Copyright Xavier FRANC 2003-2004