cjk-tokenizer.h File Reference

Tokenise CJK text as n-grams.

#include "xapian/unicode.h"
#include <string>

Classes

class CJKTokenIterator
 Iterator returning unigrams and bigrams.

Functions

bool CJK::is_cjk_enabled ()
 Should we use the CJK n-gram code?
bool CJK::codepoint_is_cjk (unsigned codepoint)
void CJK::get_cjk (Xapian::Utf8Iterator &it)

Detailed Description

Tokenise CJK text as n-grams.
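
To illustrate what tokenising into unigrams and bigrams means, here is a minimal, standalone sketch. It does not use or reproduce the Xapian implementation: sketch_is_cjk and sketch_ngrams are hypothetical names, the codepoint ranges are a small illustrative subset rather than the ranges tested by CJK::codepoint_is_cjk, and the token order (each unigram followed by the bigram starting at the same position) is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true for a small, illustrative subset of CJK blocks.
// This is NOT the range list used by CJK::codepoint_is_cjk().
bool sketch_is_cjk(char32_t cp) {
    return (cp >= 0x3040 && cp <= 0x30FF) ||  // Hiragana and Katakana
           (cp >= 0x4E00 && cp <= 0x9FFF) ||  // CJK Unified Ideographs
           (cp >= 0xAC00 && cp <= 0xD7AF);    // Hangul Syllables
}

// Hypothetical helper: for a run of codepoints, emit each codepoint as a
// unigram, followed by the bigram it forms with the next codepoint, if any.
// The ordering is an assumption made for this sketch.
std::vector<std::u32string> sketch_ngrams(const std::u32string& run) {
    std::vector<std::u32string> out;
    for (std::size_t i = 0; i < run.size(); ++i) {
        out.push_back(run.substr(i, 1));      // unigram at position i
        if (i + 1 < run.size()) {
            out.push_back(run.substr(i, 2));  // bigram at positions i, i+1
        }
    }
    return out;
}
```

Under these assumptions, the three-codepoint run U+4E2D U+56FD U+4EBA yields five tokens: three unigrams interleaved with the two overlapping bigrams.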

