xapian-core
2.0.0
Handle text without explicit word breaks.
Classes
| Kind | Name | Description |
|---|---|---|
| class | NgramIterator | Iterator returning unigrams and bigrams. |

Functions
| Return type | Function | Description |
|---|---|---|
| bool | is_ngram_enabled () | Should we use the n-gram code? |
| bool | is_unbroken_script (unsigned codepoint) | |
| bool | is_unbroken_wordchar (unsigned codepoint) | |
| size_t | get_unbroken (Xapian::Utf8Iterator &it) | |
Definition in file word-breaker.h.
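To illustrate what "unigrams and bigrams" means for text without explicit word breaks, here is a minimal sketch over a sequence of Unicode code points. It is not the NgramIterator implementation: the function name `unigrams_and_bigrams` is hypothetical, and the output order (each unigram followed by the bigram starting at the same position) is an assumption made for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xapian: emits every code point as a
// unigram and every pair of adjacent code points as a bigram.
// The output order is an assumption chosen for this illustration.
std::vector<std::u32string> unigrams_and_bigrams(const std::u32string& text)
{
    std::vector<std::u32string> out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Unigram: the single code point at position i.
        out.push_back(std::u32string(1, text[i]));
        // Bigram: code points i and i + 1, if a following one exists.
        if (i + 1 < text.size())
            out.push_back(text.substr(i, 2));
    }
    return out;
}
```

For a three-character input this produces three unigrams and two bigrams, so a search for any single character or any adjacent pair can match.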
size_t get_unbroken (Xapian::Utf8Iterator &it)
Definition at line 142 of file word-breaker.cc.
References is_unbroken_wordchar().
Referenced by Xapian::break_words() and Xapian::QueryParser::Internal::parse_term().
bool is_ngram_enabled ()
Should we use the n-gram code?
On the first call, this reads the environment variable XAPIAN_CJK_NGRAM and returns true if it is set to a non-empty value. The result is cached, so subsequent calls return the same value without reading the environment again.
Definition at line 43 of file word-breaker.cc.
References p.
Referenced by Xapian::TermGenerator::Internal::index_text(), Xapian::QueryParser::Internal::parse_query(), and Xapian::MSet::Internal::snippet().
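The read-once-then-cache behaviour described for is_ngram_enabled() could be sketched as follows. This is an illustration under stated assumptions, not the code from word-breaker.cc: the function name `ngram_enabled_sketch` is hypothetical, and using a function-local static is just one way to get the caching.

```cpp
#include <cstdlib>

// Hypothetical stand-in for is_ngram_enabled(); the real implementation
// in word-breaker.cc may be structured differently.
bool ngram_enabled_sketch()
{
    // The initialiser of a function-local static runs once, on the first
    // call. Every later call returns the stored value.
    static const bool enabled = [] {
        const char* p = std::getenv("XAPIAN_CJK_NGRAM");
        // True only if the variable is set to a non-empty value.
        return p != nullptr && *p != '\0';
    }();
    return enabled;
}
```

One consequence of this pattern is that changing XAPIAN_CJK_NGRAM after the first call has no effect for the rest of the process lifetime.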
bool is_unbroken_script (unsigned codepoint)
Definition at line 51 of file word-breaker.cc.
References p.
Referenced by is_unbroken_wordchar(), Xapian::QueryParser::Internal::parse_term(), and Xapian::parse_terms().
bool is_unbroken_wordchar (unsigned codepoint)
Definition at line 136 of file word-breaker.cc.
References is_unbroken_script(), Xapian::Unicode::is_wordchar(), and p.
Referenced by get_unbroken(), NgramIterator::init(), NgramIterator::operator++(), and Xapian::parse_terms().