|
Namespaces |
| namespace | Xapian |
| namespace | Xapian::Unicode |
| namespace | Xapian::Unicode::Internal |
Classes |
| class | Xapian::Utf8Iterator |
| | An iterator which returns Unicode character values from a UTF-8 encoded string.
|
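The decoding that such an iterator performs can be illustrated with a minimal, self-contained sketch. The helper `decode_utf8` below is hypothetical and is not part of the Xapian API; it assumes well-formed input and simply passes through any byte that cannot start a sequence, which may differ from how Xapian::Utf8Iterator treats invalid data.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not Xapian code): decode a UTF-8 encoded string into
// a sequence of Unicode character values.
// Assumption: the input is well-formed UTF-8. A stray or truncated lead byte
// is emitted as its own value, which is a simplification.
std::vector<unsigned> decode_utf8(const std::string& s) {
    std::vector<unsigned> out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        unsigned ch = b;
        std::size_t len = 1;
        if ((b & 0xE0) == 0xC0) {        // 110xxxxx: 2-byte sequence
            ch = b & 0x1F;
            len = 2;
        } else if ((b & 0xF0) == 0xE0) { // 1110xxxx: 3-byte sequence
            ch = b & 0x0F;
            len = 3;
        } else if ((b & 0xF8) == 0xF0) { // 11110xxx: 4-byte sequence
            ch = b & 0x07;
            len = 4;
        }
        if (i + len > s.size()) {        // truncated sequence: emit the byte alone
            ch = b;
            len = 1;
        }
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            ch = (ch << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        }
        out.push_back(ch);
        i += len;
    }
    return out;
}
```

An iterator interface exposes the same values one at a time rather than building a vector, which avoids allocating when the caller only needs a single pass.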
Enumerations |
| enum | Xapian::Unicode::category {
Xapian::Unicode::UNASSIGNED,
Xapian::Unicode::UPPERCASE_LETTER,
Xapian::Unicode::LOWERCASE_LETTER,
Xapian::Unicode::TITLECASE_LETTER,
Xapian::Unicode::MODIFIER_LETTER,
Xapian::Unicode::OTHER_LETTER,
Xapian::Unicode::NON_SPACING_MARK,
Xapian::Unicode::ENCLOSING_MARK,
Xapian::Unicode::COMBINING_SPACING_MARK,
Xapian::Unicode::DECIMAL_DIGIT_NUMBER,
Xapian::Unicode::LETTER_NUMBER,
Xapian::Unicode::OTHER_NUMBER,
Xapian::Unicode::SPACE_SEPARATOR,
Xapian::Unicode::LINE_SEPARATOR,
Xapian::Unicode::PARAGRAPH_SEPARATOR,
Xapian::Unicode::CONTROL,
Xapian::Unicode::FORMAT,
Xapian::Unicode::PRIVATE_USE,
Xapian::Unicode::SURROGATE,
Xapian::Unicode::CONNECTOR_PUNCTUATION,
Xapian::Unicode::DASH_PUNCTUATION,
Xapian::Unicode::OPEN_PUNCTUATION,
Xapian::Unicode::CLOSE_PUNCTUATION,
Xapian::Unicode::INITIAL_QUOTE_PUNCTUATION,
Xapian::Unicode::FINAL_QUOTE_PUNCTUATION,
Xapian::Unicode::OTHER_PUNCTUATION,
Xapian::Unicode::MATH_SYMBOL,
Xapian::Unicode::CURRENCY_SYMBOL,
Xapian::Unicode::MODIFIER_SYMBOL,
Xapian::Unicode::OTHER_SYMBOL
} |
| | Each Unicode character is in exactly one of these categories.
|
Functions |
| int | Xapian::Unicode::Internal::get_character_info (unsigned ch) |
| | For internal use only.
Extract the information about a character from the Unicode character tables.
|
| int | Xapian::Unicode::Internal::get_case_type (int info) |
| | For internal use only.
Extract how to convert the case of a Unicode character from its info.
|
| category | Xapian::Unicode::Internal::get_category (int info) |
| | For internal use only.
Extract the category of a Unicode character from its info.
|
| int | Xapian::Unicode::Internal::get_delta (int info) |
| | For internal use only.
Extract the delta to use for case conversion of a character from its info.
|
| unsigned | Xapian::Unicode::nonascii_to_utf8 (unsigned ch, char *buf) |
| | Convert a single non-ASCII Unicode character to UTF-8.
|
| unsigned | Xapian::Unicode::to_utf8 (unsigned ch, char *buf) |
| | Convert a single Unicode character to UTF-8.
|
| void | Xapian::Unicode::append_utf8 (std::string &s, unsigned ch) |
| | Append the UTF-8 representation of a single Unicode character to a std::string.
|
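As a rough illustration of what the UTF-8 conversion functions above involve, here is a self-contained sketch of encoding one character into a caller-supplied buffer and appending it to a std::string. The names `encode_utf8` and `append_code_point` are hypothetical and are not Xapian functions. The sketch assumes the buffer holds at least 4 bytes, that the return value is the number of bytes written, and that `ch` is no greater than 0x10FFFF; none of these details is stated in the summary above.

```cpp
#include <string>

// Hypothetical helper (not Xapian code): write the UTF-8 encoding of ch into
// buf and return the number of bytes written (1 to 4).
// Assumptions: buf has room for 4 bytes and ch <= 0x10FFFF.
unsigned encode_utf8(unsigned ch, char* buf) {
    if (ch < 0x80) {                     // ASCII: one byte, unchanged
        buf[0] = static_cast<char>(ch);
        return 1;
    }
    if (ch < 0x800) {                    // 11 bits: two bytes
        buf[0] = static_cast<char>(0xC0 | (ch >> 6));
        buf[1] = static_cast<char>(0x80 | (ch & 0x3F));
        return 2;
    }
    if (ch < 0x10000) {                  // 16 bits: three bytes
        buf[0] = static_cast<char>(0xE0 | (ch >> 12));
        buf[1] = static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
        buf[2] = static_cast<char>(0x80 | (ch & 0x3F));
        return 3;
    }
    // 21 bits: four bytes
    buf[0] = static_cast<char>(0xF0 | (ch >> 18));
    buf[1] = static_cast<char>(0x80 | ((ch >> 12) & 0x3F));
    buf[2] = static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
    buf[3] = static_cast<char>(0x80 | (ch & 0x3F));
    return 4;
}

// Hypothetical helper: append the UTF-8 form of ch to s.
void append_code_point(std::string& s, unsigned ch) {
    char buf[4];
    unsigned len = encode_utf8(ch, buf);
    s.append(buf, len);
}
```

Separating the ASCII case from the multi-byte cases, as the split between a general and a non-ASCII conversion function suggests, lets a caller handle the common single-byte case inline and only call out for longer sequences.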
| category | Xapian::Unicode::get_category (unsigned ch) |
| | Return the category which a given Unicode character falls into.
|
| bool | Xapian::Unicode::is_wordchar (unsigned ch) |
| | Test if a given Unicode character is a "word character".
|
| bool | Xapian::Unicode::is_whitespace (unsigned ch) |
| | Test if a given Unicode character is a whitespace character.
|
| bool | Xapian::Unicode::is_currency (unsigned ch) |
| | Test if a given Unicode character is a currency symbol.
|
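Predicates like these can be built on top of the category enumeration with a bitmask. The sketch below mirrors the enumerator order listed above in a local `sketch` namespace so that it compiles on its own. Which categories count as "word characters" is an assumption here (letters, marks, numbers and connector punctuation), and the functions take a category rather than a character because the character tables are not reproduced; the names `is_word_category` and `is_currency_category` are hypothetical.

```cpp
namespace sketch {

// Local copy of the category list, in the order given above.
enum category {
    UNASSIGNED, UPPERCASE_LETTER, LOWERCASE_LETTER, TITLECASE_LETTER,
    MODIFIER_LETTER, OTHER_LETTER, NON_SPACING_MARK, ENCLOSING_MARK,
    COMBINING_SPACING_MARK, DECIMAL_DIGIT_NUMBER, LETTER_NUMBER,
    OTHER_NUMBER, SPACE_SEPARATOR, LINE_SEPARATOR, PARAGRAPH_SEPARATOR,
    CONTROL, FORMAT, PRIVATE_USE, SURROGATE, CONNECTOR_PUNCTUATION,
    DASH_PUNCTUATION, OPEN_PUNCTUATION, CLOSE_PUNCTUATION,
    INITIAL_QUOTE_PUNCTUATION, FINAL_QUOTE_PUNCTUATION, OTHER_PUNCTUATION,
    MATH_SYMBOL, CURRENCY_SYMBOL, MODIFIER_SYMBOL, OTHER_SYMBOL
};

// One bit per category; the 30 enumerators fit in a 32-bit mask.
constexpr unsigned bit(category c) { return 1u << c; }

// Assumption: word characters are letters, marks, numbers and connector
// punctuation. The library's actual definition may differ.
constexpr unsigned WORD_MASK =
    bit(UPPERCASE_LETTER) | bit(LOWERCASE_LETTER) | bit(TITLECASE_LETTER) |
    bit(MODIFIER_LETTER) | bit(OTHER_LETTER) | bit(NON_SPACING_MARK) |
    bit(ENCLOSING_MARK) | bit(COMBINING_SPACING_MARK) |
    bit(DECIMAL_DIGIT_NUMBER) | bit(LETTER_NUMBER) | bit(OTHER_NUMBER) |
    bit(CONNECTOR_PUNCTUATION);

// Test membership with a shift and a mask instead of a chain of comparisons.
inline bool is_word_category(category c) {
    return ((WORD_MASK >> c) & 1u) != 0;
}

// A currency test needs only a single comparison.
inline bool is_currency_category(category c) {
    return c == CURRENCY_SYMBOL;
}

}  // namespace sketch
```

With the real API, the category would come from Xapian::Unicode::get_category(ch) rather than being passed in directly.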
| unsigned | Xapian::Unicode::tolower (unsigned ch) |
| | Convert a Unicode character to lowercase.
|
| unsigned | Xapian::Unicode::toupper (unsigned ch) |
| | Convert a Unicode character to uppercase.
|
| std::string | Xapian::Unicode::tolower (const std::string &term) |
| | Convert a UTF-8 std::string to lowercase.
|
| std::string | Xapian::Unicode::toupper (const std::string &term) |
| | Convert a UTF-8 std::string to uppercase.
|
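Converting the case of a whole UTF-8 string plausibly reduces to three steps: decode each character, map it, and re-encode it. The sketch below shows that shape in self-contained form. `map_utf8` and `lower_ascii_only` are hypothetical names, the per-character mapping handles ASCII only as a stand-in for a table-driven Unicode conversion, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a per-character case mapping: ASCII only.
// A real conversion would consult the Unicode character tables.
unsigned lower_ascii_only(unsigned ch) {
    if (ch >= 0x41u && ch <= 0x5Au) return ch + 0x20u;  // 'A'..'Z' -> 'a'..'z'
    return ch;
}

// Hypothetical helper (not Xapian code): decode each character of a
// well-formed UTF-8 string, apply map_char, and re-encode the result.
template <typename Map>
std::string map_utf8(const std::string& in, Map map_char) {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        // Decode: the lead byte determines the sequence length.
        unsigned char b = static_cast<unsigned char>(in[i]);
        std::size_t len = 1;
        if ((b & 0xE0) == 0xC0) len = 2;
        else if ((b & 0xF0) == 0xE0) len = 3;
        else if ((b & 0xF8) == 0xF0) len = 4;
        if (i + len > in.size()) len = 1;  // truncated: treat byte alone
        unsigned ch = (len == 1) ? b : (b & (0xFFu >> (len + 1)));
        for (std::size_t k = 1; k < len; ++k) {
            ch = (ch << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        i += len;

        ch = map_char(ch);

        // Re-encode: the mapped character may need a different byte count.
        if (ch < 0x80) {
            out += static_cast<char>(ch);
        } else if (ch < 0x800) {
            out += static_cast<char>(0xC0 | (ch >> 6));
            out += static_cast<char>(0x80 | (ch & 0x3F));
        } else if (ch < 0x10000) {
            out += static_cast<char>(0xE0 | (ch >> 12));
            out += static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (ch & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (ch >> 18));
            out += static_cast<char>(0x80 | ((ch >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (ch & 0x3F));
        }
    }
    return out;
}
```

For example, `map_utf8("HeLLo", lower_ascii_only)` yields `"hello"`, while non-ASCII characters pass through unchanged because the stand-in mapping ignores them.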