xapian-core
1.4.27
Handle text without explicit word breaks.
Classes
class NgramIterator
    Iterator returning unigrams and bigrams.
Functions
bool is_ngram_enabled ()
    Should we use the n-gram code?
bool is_unbroken_script (unsigned codepoint)
void get_unbroken (Xapian::Utf8Iterator &it)
Handle text without explicit word breaks.
Definition in file word-breaker.h.
void get_unbroken(Xapian::Utf8Iterator & it)
Definition at line 86 of file word-breaker.cc.
References is_unbroken_script(), and Xapian::Unicode::is_wordchar().
Referenced by Xapian::QueryParser::Internal::parse_term().
bool is_ngram_enabled()
Should we use the n-gram code?
The first time this is called it reads the environment variable XAPIAN_CJK_NGRAM and returns true if it is set to a non-empty value. The result is cached, so subsequent calls return the same value without reading the environment again.
Definition at line 41 of file word-breaker.cc.
Referenced by Xapian::TermGenerator::Internal::index_text(), and Xapian::QueryParser::Internal::parse_query().
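The read-once-then-cache behaviour described for is_ngram_enabled() can be illustrated with a minimal sketch. This is not the code from word-breaker.cc: the function name ngram_enabled_sketch() is hypothetical, and the use of a function-local static for the cache is an assumption made for illustration. Only the environment variable name and the "non-empty means true" rule come from the description above.

```cpp
#include <cstdlib>

// Hypothetical stand-in for is_ngram_enabled(); an illustrative sketch,
// not the implementation in word-breaker.cc.
bool ngram_enabled_sketch()
{
    // A function-local static is initialised only on the first call
    // (and thread-safely since C++11), so the environment is consulted
    // once and the result is reused afterwards.
    static const bool enabled = [] {
        const char* p = std::getenv("XAPIAN_CJK_NGRAM");
        // True only if the variable is set to a non-empty value.
        return p != nullptr && *p != '\0';
    }();
    return enabled;
}
```

With this shape, changing XAPIAN_CJK_NGRAM after the first call has no effect on later calls within the same process, which matches the caching described above.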
bool is_unbroken_script | ( | unsigned | codepoint | ) |
Definition at line 71 of file word-breaker.cc.
Referenced by get_unbroken(), NgramIterator::init(), NgramIterator::operator++(), Xapian::QueryParser::Internal::parse_term(), and Xapian::parse_terms().