Xapian::QueryParser Class Reference

Build a Xapian::Query object from a user query string.

Public Types

enum  feature_flag {
  FLAG_BOOLEAN = 1, FLAG_PHRASE = 2, FLAG_LOVEHATE = 4, FLAG_BOOLEAN_ANY_CASE = 8,
  FLAG_WILDCARD = 16, FLAG_PURE_NOT = 32, FLAG_PARTIAL = 64, FLAG_SPELLING_CORRECTION = 128,
  FLAG_SYNONYM = 256, FLAG_AUTO_SYNONYMS = 512, FLAG_AUTO_MULTIWORD_SYNONYMS = 1024 | FLAG_AUTO_SYNONYMS, FLAG_DEFAULT = FLAG_PHRASE|FLAG_BOOLEAN|FLAG_LOVEHATE
}
 Enum of feature flags.
 
enum  stem_strategy
 Stemming strategies, for use with set_stemming_strategy().
 

Public Member Functions

 QueryParser (const QueryParser &o)
 Copy constructor.
 
QueryParser & operator= (const QueryParser &o)
 Assignment.
 
 QueryParser ()
 Default constructor.
 
 ~QueryParser ()
 Destructor.
 
void set_stemmer (const Xapian::Stem &stemmer)
 Set the stemmer.

void set_stemming_strategy (stem_strategy strategy)
 Set the stemming strategy.

void set_stopper (const Stopper *stop=NULL)
 Set the stopper.

void set_default_op (Query::op default_op)
 Set the default operator.

Query::op get_default_op () const
 Get the current default operator.

void set_database (const Database &db)
 Specify the database being searched.

void set_max_wildcard_expansion (Xapian::termcount limit)
 Specify the maximum expansion of a wildcard term.

Query parse_query (const std::string &query_string, unsigned flags=FLAG_DEFAULT, const std::string &default_prefix=std::string())
 Parse a query.

void add_prefix (const std::string &field, const std::string &prefix)
 Add a probabilistic term prefix.

void add_boolean_prefix (const std::string &field, const std::string &prefix, bool exclusive)
 Add a boolean term prefix allowing the user to restrict a search with a boolean filter specified in the free text query.
 
TermIterator stoplist_begin () const
 Iterate over terms omitted from the query as stopwords.
 
TermIterator unstem_begin (const std::string &term) const
 Iterate over unstemmed forms of the given (stemmed) term used in the query.
 
void add_valuerangeprocessor (Xapian::ValueRangeProcessor *vrproc)
 Register a ValueRangeProcessor.
 
std::string get_corrected_query_string () const
 Get the spelling-corrected query string.
 
std::string get_description () const
 Return a string describing this object.
 

Detailed Description

Build a Xapian::Query object from a user query string.

Member Enumeration Documentation

Enum of feature flags.

Enumerator
FLAG_BOOLEAN 

Support AND, OR, etc and bracketed subexpressions.

FLAG_PHRASE 

Support quoted phrases.

FLAG_LOVEHATE 

Support + and -.

FLAG_BOOLEAN_ANY_CASE 

Support AND, OR, etc even if they aren't in ALLCAPS.

FLAG_WILDCARD 

Support right truncation (e.g. Xap*).

Currently you can't use wildcards with boolean filter prefixes, or in a phrase (either an explicitly quoted one, or one implicitly generated by hyphens or other punctuation).

NB: You need to tell the QueryParser object which database to expand wildcards from by calling set_database().

FLAG_PURE_NOT 

Allow queries such as 'NOT apples'.

These require the use of a list of all documents in the database, which is potentially expensive, so this feature isn't enabled by default.

FLAG_PARTIAL 

Enable partial matching.

Partial matching causes the parser to treat the query as a "partially entered" search. This will automatically treat the final word as a wildcarded match, unless it is followed by whitespace, to produce more stable results from interactive searches.

Currently FLAG_PARTIAL doesn't do anything if the final word in the query has a boolean filter prefix, or if it is in a phrase (either an explicitly quoted one, or one implicitly generated by hyphens or other punctuation). It also doesn't do anything if the final word is part of a value range.

NB: You need to tell the QueryParser object which database to expand wildcards from by calling set_database().

FLAG_SPELLING_CORRECTION 

Enable spelling correction.

For each word in the query which doesn't exist as a term in the database, Database::get_spelling_suggestion() will be called. If a suggestion is returned, a corrected version of the query string is built up, which can be read using QueryParser::get_corrected_query_string(). The query returned is still based on the uncorrected query string, however: if you want a parsed query based on the corrected query string, you must call QueryParser::parse_query() again.

NB: You must also call set_database() for this to work.

FLAG_SYNONYM 

Enable synonym operator '~'.

NB: You must also call set_database() for this to work.

FLAG_AUTO_SYNONYMS 

Enable automatic use of synonyms for single terms.

NB: You must also call set_database() for this to work.

FLAG_AUTO_MULTIWORD_SYNONYMS 

Enable automatic use of synonyms for single terms and groups of terms.

NB: You must also call set_database() for this to work.

FLAG_DEFAULT 

The default flags.

Used if you don't explicitly pass any to parse_query(). The default flags are FLAG_PHRASE|FLAG_BOOLEAN|FLAG_LOVEHATE.

Added in Xapian 1.0.11.
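
The flag values above are bit masks, so they combine with bitwise-or. The following is a minimal standalone sketch of that arithmetic. It mirrors the enum values listed above in a local enum so that it compiles without the Xapian headers; has_flag() and interactive_flags() are hypothetical helpers introduced only for illustration. In real code you would use Xapian::QueryParser::FLAG_* directly.

```cpp
#include <cassert>

// Local mirror of the feature_flag values listed above (assumption: copied
// by hand for illustration; real code uses Xapian::QueryParser::FLAG_*).
enum feature_flag {
    FLAG_BOOLEAN = 1,
    FLAG_PHRASE = 2,
    FLAG_LOVEHATE = 4,
    FLAG_BOOLEAN_ANY_CASE = 8,
    FLAG_WILDCARD = 16,
    FLAG_PURE_NOT = 32,
    FLAG_PARTIAL = 64,
    FLAG_SPELLING_CORRECTION = 128,
    FLAG_SYNONYM = 256,
    FLAG_AUTO_SYNONYMS = 512,
    FLAG_AUTO_MULTIWORD_SYNONYMS = 1024 | FLAG_AUTO_SYNONYMS,
    FLAG_DEFAULT = FLAG_PHRASE | FLAG_BOOLEAN | FLAG_LOVEHATE
};

// Hypothetical helper: true if every bit of `flag` is set in `flags`.
inline bool has_flag(unsigned flags, unsigned flag) {
    assert(flag != 0);  // testing for "no bits" would always succeed
    return (flags & flag) == flag;
}

// Hypothetical helper: the defaults plus wildcard and spelling support,
// as might be passed as the `flags` argument of parse_query().
inline unsigned interactive_flags() {
    return FLAG_DEFAULT | FLAG_WILDCARD | FLAG_SPELLING_CORRECTION;
}
```

Because FLAG_AUTO_MULTIWORD_SYNONYMS already includes the FLAG_AUTO_SYNONYMS bit, passing it alone enables both behaviours.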

Member Function Documentation

void Xapian::QueryParser::add_boolean_prefix ( const std::string &  field,
const std::string &  prefix,
bool  exclusive 
)

Add a boolean term prefix allowing the user to restrict a search with a boolean filter specified in the free text query.

For example:

    qp.add_boolean_prefix("site", "H");

This allows the user to restrict a search with site:xapian.org which will be converted to Hxapian.org combined with any probabilistic query with Xapian::Query::OP_FILTER.

If multiple boolean filters are specified in a query for the same prefix, they will be combined with the Xapian::Query::OP_OR operator. Then, if there are boolean filters for different prefixes, they will be combined with the Xapian::Query::OP_AND operator.

Multiple fields can be mapped to the same prefix (so for example you can make site: and domain: aliases for each other). Instances of fields with different aliases but the same prefix will still be combined with the OR operator.

For example, if "site" and "domain" map to "H", but author maps to "A", a search for "site:foo domain:bar author:Fred" will map to "(Hfoo OR Hbar) AND Afred".

As of 1.0.4, you can call this method multiple times with the same value of field to allow a single field to be mapped to multiple prefixes. Multiple terms are generated for such a field and combined with Xapian::Query::OP_OR.

Calling this method with an empty string for field will cause a Xapian::InvalidArgumentError to be thrown.

If you call add_prefix() and add_boolean_prefix() for the same value of field, a Xapian::InvalidOperationError exception will be thrown.

In 1.0.3 and earlier, subsequent calls to this method with the same value of field had no effect.

Parameters
field      The user visible field name
prefix     The term prefix to map this to
exclusive  If true, each document can have at most one term with this prefix, so multiple filters with this prefix should be combined with OP_OR. If false, each document can have multiple terms with this prefix, so multiple filters should be combined with OP_AND, as happens with filters with different prefixes. [default: true]
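
The grouping rules described above (OR within one prefix when exclusive is true, AND across different prefixes) can be modelled with a short standalone sketch. It does not call Xapian and is not how the library is implemented; describe_filters() and its parameters are hypothetical names used only to show the resulting structure. For simplicity the sketch applies one exclusive setting to every prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Xapian: renders how boolean filters are
// grouped, following the rules described above.
//   field_to_prefix: what add_boolean_prefix() calls would have registered
//   filters:         (field, value) pairs as typed by the user
//   exclusive:       applied to every prefix in this sketch
std::string describe_filters(
    const std::map<std::string, std::string>& field_to_prefix,
    const std::vector<std::pair<std::string, std::string> >& filters,
    bool exclusive = true)
{
    // Group terms by prefix, keeping prefixes in order of first appearance.
    std::vector<std::pair<std::string, std::vector<std::string> > > groups;
    for (const auto& f : filters) {
        auto it = field_to_prefix.find(f.first);
        assert(it != field_to_prefix.end());  // sketch: every field is registered
        const std::string& prefix = it->second;
        const std::string term = prefix + f.second;
        bool found = false;
        for (auto& g : groups) {
            if (g.first == prefix) {
                g.second.push_back(term);
                found = true;
                break;
            }
        }
        if (!found)
            groups.emplace_back(prefix, std::vector<std::string>(1, term));
    }

    // Same prefix: OR (exclusive) or AND (non-exclusive). Different prefixes: AND.
    const std::string inner = exclusive ? " OR " : " AND ";
    std::string out;
    for (std::size_t i = 0; i < groups.size(); ++i) {
        if (i != 0) out += " AND ";
        const std::vector<std::string>& terms = groups[i].second;
        if (terms.size() > 1) out += "(";
        for (std::size_t j = 0; j < terms.size(); ++j) {
            if (j != 0) out += inner;
            out += terms[j];
        }
        if (terms.size() > 1) out += ")";
    }
    return out;
}
```

With "site" and "domain" both mapped to "H" and "author" mapped to "A", the filters site:foo, domain:bar and author:fred come out as (Hfoo OR Hbar) AND Afred, matching the example above. The sketch does not change case, so values are passed in already lowercased.
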
void Xapian::QueryParser::add_prefix ( const std::string &  field,
const std::string &  prefix 
)

Add a probabilistic term prefix.

For example:

    qp.add_prefix("author", "A");

This allows the user to search for author:Orwell which will be converted to a search for the term "Aorwell".

Multiple fields can be mapped to the same prefix. For example, you can make title: and subject: aliases for each other.

As of 1.0.4, you can call this method multiple times with the same value of field to allow a single field to be mapped to multiple prefixes. Multiple terms are generated for such a field and combined with Xapian::Query::OP_OR.

If any prefixes are specified for the empty field name (i.e. you call this method with an empty string as the first parameter) these prefixes will be used for terms without a field specifier. If you do this and also specify the default_prefix parameter to parse_query(), then the default_prefix parameter will override.

If the prefix parameter is empty, then "field:word" will produce the term "word" (and this can be one of several prefixes for a particular field, or for terms without a field specifier).

If you call add_prefix() and add_boolean_prefix() for the same value of field, a Xapian::InvalidOperationError exception will be thrown.

In 1.0.3 and earlier, subsequent calls to this method with the same value of field had no effect.

Parameters
field   The user visible field name
prefix  The term prefix to map this to
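
As a rough model of the mapping described above, the sketch below turns a field and a word into the list of prefixed terms that would be combined with OP_OR. It is standalone and does not call Xapian; terms_for() is a hypothetical helper, and the ASCII-only lowercasing is an assumption made to reproduce the author:Orwell to "Aorwell" example.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xapian.
//   prefixes: (field, prefix) pairs, one per add_prefix() call; a multimap
//             lets a single field carry several prefixes
//   returns:  one term per registered prefix, in registration order
std::vector<std::string> terms_for(
    const std::multimap<std::string, std::string>& prefixes,
    const std::string& field,
    const std::string& word)
{
    // Assumption: ASCII-only lowercasing, enough for this illustration.
    std::string lowered;
    for (unsigned char c : word)
        lowered += static_cast<char>(std::tolower(c));

    std::vector<std::string> terms;
    auto range = prefixes.equal_range(field);
    for (auto it = range.first; it != range.second; ++it)
        terms.push_back(it->second + lowered);  // empty prefix yields the bare word
    return terms;
}
```

A field registered with an empty prefix yields the unprefixed word, and registering the empty field name models the prefixes used for terms without a field specifier.
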
std::string Xapian::QueryParser::get_corrected_query_string ( ) const

Get the spelling-corrected query string.

This will only be set if FLAG_SPELLING_CORRECTION was specified when QueryParser::parse_query() was last called.

If there were no corrections, an empty string is returned.

Query::op Xapian::QueryParser::get_default_op ( ) const

Get the current default operator.

Query Xapian::QueryParser::parse_query ( const std::string &  query_string,
unsigned  flags = FLAG_DEFAULT,
const std::string &  default_prefix = std::string() 
)

Parse a query.

Parameters
query_string    A free-text query as entered by a user
flags           Zero or more QueryParser::feature_flag values specifying which features the QueryParser should support. Combine multiple values with bitwise-or (|) (default FLAG_DEFAULT).
default_prefix  The default term prefix to use (default none). For example, you can pass "A" when parsing an "Author" field.
Exceptions
Xapian::QueryParserError  Thrown if the query string can't be parsed. You can get an English error message to report to the user by catching it and calling get_msg() on the caught exception. The current possible values (in case you want to translate them) are:
  • Unknown range operation
  • parse error
  • Syntax: <expression> AND <expression>
  • Syntax: <expression> AND NOT <expression>
  • Syntax: <expression> NOT <expression>
  • Syntax: <expression> OR <expression>
  • Syntax: <expression> XOR <expression>
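
Since the set of messages is small and fixed, one way to translate them is a lookup table keyed on the English text. The sketch below is standalone and does not include the Xapian headers; localize_parse_error() is a hypothetical helper, the German strings are illustrative translations rather than anything shipped with Xapian, and only three of the messages are filled in.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper, not part of Xapian: maps an English parser message
// to a translated one, falling back to the original text when the message
// is not in the table (for example if a later release adds new messages).
std::string localize_parse_error(const std::string& msg)
{
    // Illustrative translations only; remaining messages follow the same pattern.
    static const std::map<std::string, std::string> table = {
        {"Unknown range operation", "Unbekannte Bereichsoperation"},
        {"parse error", "Fehler beim Parsen"},
        {"Syntax: <expression> AND <expression>",
         "Syntax: <Ausdruck> AND <Ausdruck>"},
    };
    auto it = table.find(msg);
    return it == table.end() ? msg : it->second;
}

// Intended call site (needs <xapian.h>, so it is not compiled here;
// report() is a placeholder for your own error display):
//
//   try {
//       query = qp.parse_query(text);
//   } catch (const Xapian::QueryParserError& e) {
//       report(localize_parse_error(e.get_msg()));
//   }
```

Falling back to the untranslated message keeps the user informed even when the table is incomplete.
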
void Xapian::QueryParser::set_database ( const Database &  db)

Specify the database being searched.

Parameters
db  The database to use for wildcard expansion (FLAG_WILDCARD and FLAG_PARTIAL), spelling correction (FLAG_SPELLING_CORRECTION), and synonyms (FLAG_SYNONYM, FLAG_AUTO_SYNONYMS, and FLAG_AUTO_MULTIWORD_SYNONYMS).
void Xapian::QueryParser::set_default_op ( Query::op  default_op)

Set the default operator.

Parameters
default_op  The operator to use to combine non-filter query items when no explicit operator is used.

The most useful values for this are OP_OR (the default) and OP_AND. OP_NEAR and OP_PHRASE can also be useful.

So for example, 'weather forecast' is parsed as if it were 'weather OR forecast' by default.
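
To make the effect of the default operator concrete, here is a tiny standalone model that joins bare terms with an operator name. It does not call Xapian; join_terms() is a hypothetical helper and the operator is passed as plain text rather than as a Query::op value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xapian: shows how terms written without
// an explicit operator are read once a default operator is chosen.
std::string join_terms(const std::vector<std::string>& terms,
                       const std::string& op_name = "OR")
{
    std::string out;
    for (std::size_t i = 0; i < terms.size(); ++i) {
        if (i != 0) out += " " + op_name + " ";
        out += terms[i];
    }
    return out;
}
```

With the default "OR" the terms weather and forecast read as weather OR forecast; passing "AND" models a parser on which set_default_op(Query::OP_AND) has been called.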

void Xapian::QueryParser::set_max_wildcard_expansion ( Xapian::termcount  limit)

Specify the maximum expansion of a wildcard term.

Note: you must also set FLAG_WILDCARD for wildcard expansion to happen.

Parameters
limit  The maximum number of terms each wildcard in the query can expand to, or 0 for no limit (which is the default).
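
The meaning of the limit can be sketched with a standalone model of right-truncation expansion over a sorted term list. This is not Xapian's implementation: expand_wildcard() is a hypothetical helper, and throwing std::length_error when the limit is exceeded is a choice made for this sketch, since this section does not say how the library reports that case.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xapian: collects every term that starts
// with `stem`. A `limit` of 0 means no limit, as documented above.
std::vector<std::string> expand_wildcard(const std::set<std::string>& all_terms,
                                         const std::string& stem,
                                         std::size_t limit = 0)
{
    std::vector<std::string> out;
    // Terms sharing a prefix are contiguous in sorted order, so start at the
    // first term >= stem and stop at the first term that lacks the prefix.
    for (auto it = all_terms.lower_bound(stem);
         it != all_terms.end() && it->compare(0, stem.size(), stem) == 0;
         ++it) {
        if (limit != 0 && out.size() == limit)
            throw std::length_error("wildcard " + stem +
                                    "* expands to too many terms");
        out.push_back(*it);
    }
    return out;
}
```

An expansion that produces exactly limit terms is accepted; only a further match triggers the error.
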
void Xapian::QueryParser::set_stemmer ( const Xapian::Stem &  stemmer)

Set the stemmer.

This sets the stemming algorithm which will be used by the query parser. Note that the stemming algorithm will only be used according to the stemming strategy set by set_stemming_strategy(), which defaults to STEM_NONE. Therefore, to use a stemming algorithm, you will also need to call set_stemming_strategy() with a value other than STEM_NONE.

Parameters
stemmer  The Xapian::Stem object to set.
void Xapian::QueryParser::set_stemming_strategy ( stem_strategy  strategy)

Set the stemming strategy.

This controls how the query parser will apply the stemming algorithm. Note that the stemming algorithm is only applied to words in probabilistic fields - boolean filter terms are never stemmed.

Parameters
strategy  The strategy to use. Possible values are:
  • STEM_NONE: Don't perform any stemming. (default in Xapian <= 1.3.0)
  • STEM_SOME: Stem all terms except for those which start with a capital letter, or are followed by certain characters (currently: (/@<>=*[{" ), or are used with operators which need positional information. Stemmed terms are prefixed with 'Z'. (default in Xapian >= 1.3.1)
  • STEM_ALL: Stem all terms (note: no 'Z' prefix is added).
  • STEM_ALL_Z: Stem all terms (note: 'Z' prefix is added). (new in Xapian 1.2.11 and 1.3.1)
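
The STEM_SOME rule listed above can be written as a small predicate. The sketch below is standalone and does not call Xapian; stem_some_applies() is a hypothetical helper, the character set is copied from the list above, and the capital-letter test is ASCII-only, which is a simplification.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not part of Xapian: would STEM_SOME stem this word?
//   word:            the word as typed by the user
//   next_char:       the character right after the word ('\0' at end of input)
//   needs_positions: true if the word is used with an operator that needs
//                    positional information (for example inside a phrase)
bool stem_some_applies(const std::string& word, char next_char,
                       bool needs_positions)
{
    // Characters that suppress stemming when they follow the word,
    // copied from the STEM_SOME description above.
    static const std::string blockers = "(/@<>=*[{\"";

    if (word.empty()) return false;
    // Assumption: ASCII-only check for a leading capital letter.
    if (std::isupper(static_cast<unsigned char>(word[0]))) return false;
    if (next_char != '\0' && blockers.find(next_char) != std::string::npos)
        return false;
    if (needs_positions) return false;
    return true;
}
```

Under this model "running" followed by a space is stemmed, while "Running", "user" followed by '@', and any word inside a phrase are left alone.
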
void Xapian::QueryParser::set_stopper ( const Stopper *  stop = NULL)

Set the stopper.

Parameters
stop  The Stopper object to set (default NULL, which means no stopwords).

Documentation for Xapian (version 1.2.19).
Generated on Tue Oct 21 2014 by Doxygen 1.8.5.