xapian-core
1.4.26
NgramIterator Class Reference
Iterator returning unigrams and bigrams.
#include <word-breaker.h>
Public Member Functions
    NgramIterator (const std::string &s)
    NgramIterator (const Xapian::Utf8Iterator &it_)
    NgramIterator ()
    const std::string & operator* () const
    NgramIterator & operator++ ()
    bool unigram () const
        Is this a unigram?
    const Xapian::Utf8Iterator & get_utf8iterator () const
    bool operator== (const NgramIterator &other) const
    bool operator!= (const NgramIterator &other) const
Private Member Functions
    void init ()
        Call to set current_token at the start.
Private Attributes
    Xapian::Utf8Iterator it
    unsigned offset = 0
        Offset to penultimate Unicode character in current_token.
    std::string current_token
Iterator returning unigrams and bigrams.
Definition at line 52 of file word-breaker.h.
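The interface listed above can be illustrated with a small stand-in class. This is a minimal sketch under stated assumptions, not the xapian-core implementation: `SketchNgramIterator`, `first_len`, `pos`, and `next_len()` are hypothetical names, the interleaving of unigrams and bigrams is one plausible emission order, and the sketch treats every UTF-8 character as eligible, whereas the real class also consults `is_unbroken_script()` and `Xapian::Unicode::is_wordchar()`. The only behaviour taken from this page is the shape of the interface and the fact that equality is decided by `current_token`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in mirroring the documented interface.
// NOT the xapian-core code; input is assumed to be well-formed UTF-8.
class SketchNgramIterator {
    std::string text;           // copy of the input
    std::size_t pos = 0;        // byte position of the next unread character
    std::size_t first_len = 0;  // 0 for a unigram, else byte length of the
                                // first character of the current bigram
    std::string current_token;  // empty means "at end"

    // Byte length of the UTF-8 sequence at text[pos], clamped to the input.
    // Callers guarantee pos < text.size().
    std::size_t next_len() const {
        unsigned char b = static_cast<unsigned char>(text[pos]);
        std::size_t n = b < 0x80 ? 1
                      : (b & 0xE0) == 0xC0 ? 2
                      : (b & 0xF0) == 0xE0 ? 3
                      : (b & 0xF8) == 0xF0 ? 4
                      : 1;  // invalid lead byte: consume a single byte
        std::size_t left = text.size() - pos;
        return n < left ? n : left;
    }

  public:
    explicit SketchNgramIterator(const std::string& s) : text(s) {
        // Start on the first character as a unigram, if there is one.
        if (pos < text.size()) {
            std::size_t n = next_len();
            current_token.assign(text, pos, n);
            pos += n;
        }
    }

    // Default construction yields the end sentinel (empty token).
    SketchNgramIterator() {}

    const std::string& operator*() const { return current_token; }

    bool unigram() const { return first_len == 0; }

    SketchNgramIterator& operator++() {
        if (first_len == 0) {
            if (pos < text.size()) {
                // Unigram -> bigram: append the next character.
                std::size_t n = next_len();
                first_len = current_token.size();
                current_token.append(text, pos, n);
                pos += n;
            } else {
                // No more input: become equal to the end sentinel.
                current_token.clear();
            }
        } else {
            // Bigram -> unigram: drop the first character.
            current_token.erase(0, first_len);
            first_len = 0;
        }
        return *this;
    }

    bool operator==(const SketchNgramIterator& other) const {
        return current_token == other.current_token;
    }
    bool operator!=(const SketchNgramIterator& other) const {
        return !(*this == other);
    }
};
```

With this sketch, a loop such as `for (SketchNgramIterator i("abc"); i != SketchNgramIterator(); ++i)` visits `a`, `ab`, `b`, `bc`, `c` in that order, and `i.unigram()` is true for the single-character tokens only.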
NgramIterator::NgramIterator (const std::string &s) [inline, explicit]
Definition at line 67 of file word-breaker.h.
References init().
NgramIterator::NgramIterator (const Xapian::Utf8Iterator &it_) [inline, explicit]
Definition at line 71 of file word-breaker.h.
References init().
NgramIterator::NgramIterator () [inline]
Definition at line 75 of file word-breaker.h.
const Xapian::Utf8Iterator & NgramIterator::get_utf8iterator () const [inline]
void NgramIterator::init () [private]
Call to set current_token at the start.
Definition at line 96 of file word-breaker.cc.
References Xapian::Unicode::append_utf8(), is_unbroken_script(), and Xapian::Unicode::is_wordchar().
Referenced by NgramIterator().
bool NgramIterator::operator!= (const NgramIterator &other) const [inline]
Definition at line 94 of file word-breaker.h.
const std::string & NgramIterator::operator* () const [inline]
Definition at line 77 of file word-breaker.h.
References current_token, and operator++().
NgramIterator & NgramIterator::operator++ ()
Definition at line 110 of file word-breaker.cc.
References Xapian::Unicode::append_utf8(), is_unbroken_script(), and Xapian::Unicode::is_wordchar().
Referenced by operator*().
bool NgramIterator::operator== (const NgramIterator &other) const [inline]
Definition at line 88 of file word-breaker.h.
References current_token.
bool NgramIterator::unigram () const [inline]
Is this a unigram?
Definition at line 84 of file word-breaker.h.
Referenced by Xapian::parse_terms().
std::string NgramIterator::current_token [private]
Definition at line 61 of file word-breaker.h.
Referenced by operator*(), and operator==().
Xapian::Utf8Iterator NgramIterator::it [private]
Definition at line 53 of file word-breaker.h.
Referenced by get_utf8iterator().
unsigned NgramIterator::offset = 0 [private]
Offset to penultimate Unicode character in current_token.
If current_token has one Unicode character, this is 0.
Definition at line 59 of file word-breaker.h.