xapian-core  1.4.26
Xapian::Query Class Reference

Class representing a query.

#include <query.h>

Public Types

enum  op {
  OP_AND = 0 , OP_OR = 1 , OP_AND_NOT = 2 , OP_XOR = 3 ,
  OP_AND_MAYBE = 4 , OP_FILTER = 5 , OP_NEAR = 6 , OP_PHRASE = 7 ,
  OP_VALUE_RANGE = 8 , OP_SCALE_WEIGHT = 9 , OP_ELITE_SET = 10 , OP_VALUE_GE = 11 ,
  OP_VALUE_LE = 12 , OP_SYNONYM = 13 , OP_MAX = 14 , OP_WILDCARD = 15 ,
  OP_INVALID = 99 , LEAF_TERM = 100 , LEAF_POSTING_SOURCE , LEAF_MATCH_ALL ,
  LEAF_MATCH_NOTHING
}
 Query operators.
 
enum  { WILDCARD_LIMIT_ERROR , WILDCARD_LIMIT_FIRST , WILDCARD_LIMIT_MOST_FREQUENT }
 

Public Member Functions

 Query ()
 Construct a query matching no documents.
 
 ~Query ()
 Destructor.
 
 Query (const Query &o)
 Copying is allowed.
 
Query & operator= (const Query &o)
 Copying is allowed.
 
 Query (const std::string &term, Xapian::termcount wqf=1, Xapian::termpos pos=0)
 Construct a Query object for a term.
 
 Query (Xapian::PostingSource *source)
 Construct a Query object for a PostingSource.
 
 Query (double factor, const Xapian::Query &subquery)
 Scale using OP_SCALE_WEIGHT.
 
 Query (op op_, const Xapian::Query &subquery, double factor)
 Scale using OP_SCALE_WEIGHT.
 
 Query (op op_, const Xapian::Query &a, const Xapian::Query &b)
 Construct a Query object by combining two others.
 
 Query (op op_, const std::string &a, const std::string &b)
 Construct a Query object by combining two terms.
 
 Query (op op_, Xapian::valueno slot, const std::string &range_limit)
 Construct a Query object for a single-ended value range.
 
 Query (op op_, Xapian::valueno slot, const std::string &range_lower, const std::string &range_upper)
 Construct a Query object for a value range.
 
 Query (op op_, const std::string &pattern, Xapian::termcount max_expansion=0, int max_type=WILDCARD_LIMIT_ERROR, op combiner=OP_SYNONYM)
 Query constructor for OP_WILDCARD queries.
 
template<typename I >
 Query (op op_, I begin, I end, Xapian::termcount window=0)
 Construct a Query object from a begin/end iterator pair.
 
const TermIterator get_terms_begin () const
 Begin iterator for terms in the query object.
 
const TermIterator get_terms_end () const
 End iterator for terms in the query object.
 
const TermIterator get_unique_terms_begin () const
 Begin iterator for unique terms in the query object.
 
const TermIterator get_unique_terms_end () const
 End iterator for unique terms in the query object.
 
Xapian::termcount get_length () const
 Return the length of this query object.
 
bool empty () const
 Check if this query is Xapian::Query::MatchNothing.
 
std::string serialise () const
 Serialise this object into a string.
 
op get_type () const
 Get the type of the top level of the query.
 
size_t get_num_subqueries () const
 Get the number of subqueries of the top level query.
 
const Query get_subquery (size_t n) const
 Read a top level subquery.
 
std::string get_description () const
 Return a string describing this object.
 
const Query operator&= (const Query &o)
 Combine with another Xapian::Query object using OP_AND.
 
const Query operator|= (const Query &o)
 Combine with another Xapian::Query object using OP_OR.
 
const Query operator^= (const Query &o)
 Combine with another Xapian::Query object using OP_XOR.
 
const Query operator*= (double factor)
 Scale using OP_SCALE_WEIGHT.
 
const Query operator/= (double factor)
 Inverse scale using OP_SCALE_WEIGHT.
 
 Query (Query::op op_)
 Construct with just an operator.
 

Static Public Member Functions

static const Query unserialise (const std::string &serialised, const Registry &reg=Registry())
 Unserialise a string and return a Query object.
 

Static Public Attributes

static const Xapian::Query MatchNothing
 A query matching no documents.
 
static const Xapian::Query MatchAll
 A query matching all documents.
 

Detailed Description

Class representing a query.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum
Enumerator
WILDCARD_LIMIT_ERROR 

Throw an error if OP_WILDCARD exceeds its expansion limit.

    Xapian::WildcardError will be thrown when the query is actually
    run.
WILDCARD_LIMIT_FIRST 

Stop expanding when OP_WILDCARD reaches its expansion limit.

    This makes the wildcard expand to only the first N terms (sorted
    by byte order).
WILDCARD_LIMIT_MOST_FREQUENT 

Limit OP_WILDCARD expansion to the most frequent terms.

    If OP_WILDCARD would expand to more than its expansion limit, the
    most frequent terms are taken.  This approach works well for cases
    such as expanding a partial term at the end of a query string which
    the user hasn't finished typing yet - as well as being less expensive
    to evaluate than the full expansion, using only the most frequent
    terms tends to give better results too.

◆ op

Query operators.

Enumerator
OP_AND 

Match only documents which all subqueries match.

    When used in a weighted context, the weight is the sum of the
    weights for all the subqueries.
OP_OR 

Match documents which at least one subquery matches.

    When used in a weighted context, the weight is the sum of the
    weights for matching subqueries (so additional matching subqueries
    will mean a higher weight).
OP_AND_NOT 

Match documents which the first subquery matches but no others do.

    When used in a weighted context, the weight is just the weight of
    the first subquery.
OP_XOR 

Match documents which an odd number of subqueries match.

    When used in a weighted context, the weight is the sum of the
    weights for matching subqueries (so additional matching subqueries
    will mean a higher weight).
OP_AND_MAYBE 

Match the first subquery taking extra weight from other subqueries.

    When used in a weighted context, the weight is the sum of the
    weights for matching subqueries (so additional matching subqueries
    will mean a higher weight).

    Because only the first subquery determines which documents are
    matched, in a non-weighted context only the first subquery matters.
OP_FILTER 

Match like OP_AND but only taking weight from the first subquery.

    When used in a non-weighted context, OP_FILTER and OP_AND are
    equivalent.

    In older 1.4.x, the third and subsequent subqueries were ignored
    in some situations.  This was fixed in 1.4.15.
OP_NEAR 

Match only documents where all subqueries match near each other.

    The subqueries must match at term positions within the specified
    window size, in any order.

    Currently subqueries must be terms or terms composed with OP_OR.

    When used in a weighted context, the weight is the sum of the
    weights for all the subqueries.
OP_PHRASE 

Match only documents where all subqueries match near and in order.

    The subqueries must match at term positions within the specified
    window size, in the same term position order as subquery order.

    Currently subqueries must be terms or terms composed with OP_OR.

    When used in a weighted context, the weight is the sum of the
    weights for all the subqueries.
OP_VALUE_RANGE 

Match only documents where a value slot is within a given range.

    This operator never contributes weight.
OP_SCALE_WEIGHT 

Scale the weight contributed by a subquery.

    The weight is the weight of the subquery multiplied by the
    specified non-negative scale factor (so if the scale factor is
    zero then the subquery contributes no weight).
OP_ELITE_SET 

Pick the best N subqueries and combine with OP_OR.

    If you want to implement a feature which finds documents similar to
    a piece of text, an obvious approach is to build an "OR" query from
    all the terms in the text, and run this query against a database
    containing the documents.  However such a query can contain lots
    of terms and be quite slow to perform, yet many of these terms
    don't contribute usefully to the results.

    The OP_ELITE_SET operator can be used instead of OP_OR in this
    situation.  OP_ELITE_SET selects the most important N terms and
    then acts as an OP_OR query with just these, ignoring any other
    terms.  This will usually return results just as good as the full
    OP_OR query, but much faster.

    In general, the OP_ELITE_SET operator can be used when you have a
    large OR query, but it doesn't matter if the search completely
    ignores some of the less important terms in the query.

    The subqueries don't have to be terms.  If they aren't then
    OP_ELITE_SET could potentially pick a subset which doesn't
    actually match any documents even if the full OR would match some
    (because OP_ELITE_SET currently selects those subqueries which can
    return the highest weights).  This is probably rare in practice
    though.

    You can specify a parameter to the query constructor which controls
    the number of subqueries which OP_ELITE_SET will pick.  If not
    specified, this defaults to 10 (Xapian used to default to
    ceil(sqrt(number_of_subqueries)) if there are more
    than 100 subqueries, but this rather arbitrary special case was
    dropped in 1.3.0).  For example, this will pick the best 7 terms:

        Xapian::Query query(Xapian::Query::OP_ELITE_SET, subqs.begin(), subqs.end(), 7);

    If the number of subqueries is less than this threshold,
    OP_ELITE_SET behaves identically to OP_OR.

    When used with a sharded database, OP_ELITE_SET currently picks
    the subqueries to use separately for each shard based on the
    maximum weight they can return in that shard.  This means it
    probably won't select exactly the same terms, and so the results
    of the search may not be exactly the same as for a single database
    with equivalent contents.
OP_VALUE_GE 

Match only documents where a value slot is >= a given value.

    Similar to OP_VALUE_RANGE, but open-ended.

    This operator never contributes weight.
OP_VALUE_LE 

Match only documents where a value slot is <= a given value.

    Similar to OP_VALUE_RANGE, but open-ended.

    This operator never contributes weight.
OP_SYNONYM 

Match like OP_OR but weighting as if a single term.

    The weight is calculated combining the statistics for the
    subqueries to approximate the weight of a single term occurring
    with those statistics.
OP_MAX 

Pick the maximum weight of any subquery.

    Matches the same documents as OP_OR, but the weight contributed
    is the maximum weight from any matching subquery (for OP_OR, it's
    the sum of the weights from the matching subqueries).

    Added in Xapian 1.3.2.
OP_WILDCARD 

Wildcard expansion.

    Added in Xapian 1.3.3.
OP_INVALID 

Construct an invalid query.

    This can be useful as a placeholder - for example RangeProcessor
    uses it as a return value to indicate that a range hasn't been
    recognised.
LEAF_TERM 

Value returned by get_type() for a term.

LEAF_POSTING_SOURCE 

Value returned by get_type() for a PostingSource.

LEAF_MATCH_ALL 

Value returned by get_type() for MatchAll or equivalent.

    This is returned for any Xapian::Query(std::string())
    object.
LEAF_MATCH_NOTHING 

Value returned by get_type() for MatchNothing or equivalent.

    This is returned for any Xapian::Query() object.

Constructor & Destructor Documentation

◆ Query() [1/12]

Xapian::Query::Query ( )
inline

Construct a query matching no documents.

MatchNothing is a static instance of this.

When combined with other Query objects using the various supported operators, Query() works like false in boolean logic, so Query() & q is Query(), while Query() | q is q.

Referenced by operator&=(), operator^=(), and operator|=().

◆ Query() [2/12]

Xapian::Query::Query ( const Query &  o)
inline

Copying is allowed.

The internals are reference counted, so copying is cheap.

◆ Query() [3/12]

Xapian::Query::Query ( const std::string &  term,
Xapian::termcount  wqf = 1,
Xapian::termpos  pos = 0 
)

Construct a Query object for a term.

Parameters
term: The term. An empty string constructs a query matching all documents (MatchAll is a static instance of this).
wqf: The within-query frequency. (default: 1)
pos: The query position. Currently this is mainly used to determine the order of terms obtained via get_terms_begin(). (default: 0)

◆ Query() [4/12]

Xapian::Query::Query ( double  factor,
const Xapian::Query &  subquery 
)

Scale using OP_SCALE_WEIGHT.

Parameters
factor: Non-negative real number to multiply weights by.
subquery: Query object to scale weights from.

◆ Query() [5/12]

Xapian::Query::Query ( op  op_,
const Xapian::Query &  subquery,
double  factor 
)

Scale using OP_SCALE_WEIGHT.

In this form, the op_ parameter is totally redundant - use Query(factor, subquery) in preference.

Parameters
op_: Must be OP_SCALE_WEIGHT.
factor: Non-negative real number to multiply weights by.
subquery: Query object to scale weights from.

◆ Query() [6/12]

Xapian::Query::Query ( op  op_,
const Xapian::Query &  a,
const Xapian::Query &  b 
)
inline

Construct a Query object by combining two others.

Parameters
op_: The operator to combine the queries with.
a: First subquery.
b: Second subquery.

◆ Query() [7/12]

Xapian::Query::Query ( op  op_,
const std::string &  a,
const std::string &  b 
)
inline

Construct a Query object by combining two terms.

Parameters
op_: The operator to combine the terms with.
a: First term.
b: Second term.

◆ Query() [8/12]

Xapian::Query::Query ( op  op_,
Xapian::valueno  slot,
const std::string &  range_limit 
)

Construct a Query object for a single-ended value range.

Parameters
op_: Must be OP_VALUE_LE or OP_VALUE_GE currently.
slot: The value slot to work over.
range_limit: The limit of the range.

◆ Query() [9/12]

Xapian::Query::Query ( op  op_,
Xapian::valueno  slot,
const std::string &  range_lower,
const std::string &  range_upper 
)

Construct a Query object for a value range.

Parameters
op_: Must be OP_VALUE_RANGE currently.
slot: The value slot to work over.
range_lower: Lower end of the range.
range_upper: Upper end of the range.

◆ Query() [10/12]

Xapian::Query::Query ( op  op_,
const std::string &  pattern,
Xapian::termcount  max_expansion = 0,
int  max_type = WILDCARD_LIMIT_ERROR,
op  combiner = OP_SYNONYM 
)

Query constructor for OP_WILDCARD queries.

Parameters
op_: Must be OP_WILDCARD.
pattern: The wildcard pattern - currently this is just a string and the wildcard expands to terms which start with exactly this string.
max_expansion: The maximum number of terms to expand to (default: 0, which means no limit)
max_type: How to enforce max_expansion - one of WILDCARD_LIMIT_ERROR (the default), WILDCARD_LIMIT_FIRST or WILDCARD_LIMIT_MOST_FREQUENT. When searching multiple databases, the expansion limit is currently applied independently for each database, so the total number of terms may be higher than the limit. This is arguably a bug, and may change in future versions.
combiner: The Query::op to combine the terms with - one of OP_SYNONYM (the default), OP_OR or OP_MAX.

◆ Query() [11/12]

template<typename I >
Xapian::Query::Query ( op  op_,
I  begin,
I  end,
Xapian::termcount  window = 0 
)
inline

Construct a Query object from a begin/end iterator pair.

Dereferencing the iterator should return a Xapian::Query, a non-NULL Xapian::Query*, a std::string or a type which converts to one of these (e.g. const char*).

If begin == end then there are no subqueries and the resulting Query won't match anything.

Parameters
op_: The operator to combine the queries with.
begin: Begin iterator.
end: End iterator.
window: Window size for OP_NEAR and OP_PHRASE, or 0 to use the number of subqueries as the window size (default: 0).

◆ Query() [12/12]

Xapian::Query::Query ( Query::op  op_)
inline explicit

Construct with just an operator.

Parameters
op_: The operator to use - currently only OP_INVALID is useful.

Member Function Documentation

◆ get_subquery()

const Query Xapian::Query::get_subquery ( size_t  n) const

Read a top level subquery.

Parameters
n: Return the n-th subquery (starting from 0) - only valid when 0 <= n < get_num_subqueries().

◆ get_terms_begin()

const TermIterator Xapian::Query::get_terms_begin ( ) const

Begin iterator for terms in the query object.

The iterator returns terms in ascending query position order, and will return the same term in each unique position it occurs in. If you want the terms in sorted order and without duplicates, see get_unique_terms_begin().

◆ get_unique_terms_begin()

const TermIterator Xapian::Query::get_unique_terms_begin ( ) const

Begin iterator for unique terms in the query object.

Terms are sorted and terms with the same name removed from the list.

If you want the terms in ascending query position order, see get_terms_begin().

◆ operator&=()

const Query Xapian::Query::operator&= ( const Query &  o)
inline

Combine with another Xapian::Query object using OP_AND.

Since
Since Xapian 1.4.10, when called on a Query object which is OP_AND and has a reference count of 1, then o is appended as a new subquery (provided o is a different Query object and !o.empty()).

References OP_AND, and Query().

◆ operator*=()

const Query Xapian::Query::operator*= ( double  factor)
inline

Scale using OP_SCALE_WEIGHT.

Parameters
factor: Non-negative real number to multiply weights by.

◆ operator/=()

const Query Xapian::Query::operator/= ( double  factor)
inline

Inverse scale using OP_SCALE_WEIGHT.

Parameters
factor: Positive real number to divide weights by.

◆ operator=()

Query & Xapian::Query::operator= ( const Query &  o)
inline

Copying is allowed.

The internals are reference counted, so assignment is cheap.

◆ operator^=()

const Query Xapian::Query::operator^= ( const Query &  o)
inline

Combine with another Xapian::Query object using OP_XOR.

Since
Since Xapian 1.4.10, when called on a Query object which is OP_XOR and has a reference count of 1, then o is appended as a new subquery (provided o is a different Query object and !o.empty()).

References empty(), OP_XOR, and Query().

◆ operator|=()

const Query Xapian::Query::operator|= ( const Query &  o)
inline

Combine with another Xapian::Query object using OP_OR.

Since
Since Xapian 1.4.10, when called on a Query object which is OP_OR and has a reference count of 1, then o is appended as a new subquery (provided o is a different Query object and !o.empty()).

References empty(), OP_OR, and Query().

◆ unserialise()

static const Query Xapian::Query::unserialise ( const std::string &  serialised,
const Registry &  reg = Registry() 
)
static

Unserialise a string and return a Query object.

Parameters
serialised: The string to unserialise.
reg: Xapian::Registry object to use to unserialise user-subclasses of Xapian::PostingSource (default: standard registry).

Member Data Documentation

◆ MatchAll

const Xapian::Query Xapian::Query::MatchAll
static

A query matching all documents.

This is a static instance of Xapian::Query(std::string()). If you are constructing Query objects which use MatchAll in different threads then the reference counting of the static object can get messed up by concurrent access so you should instead use Xapian::Query(std::string()) directly.

◆ MatchNothing

const Xapian::Query Xapian::Query::MatchNothing
static

A query matching no documents.

This is a static instance of a default-constructed Xapian::Query object. It is safe to use concurrently from different threads, unlike MatchAll (this is because MatchNothing has a NULL internal object so there's no reference counting happening).

When combined with other Query objects using the various supported operators, MatchNothing works like false in boolean logic, so MatchNothing & q is MatchNothing, while MatchNothing | q is q.


The documentation for this class was generated from the following file: