xapian-core  1.4.25
Xapian::MSet Class Reference

Class representing a list of search results.

#include <mset.h>

Classes

class  Internal
 

Public Types

enum  {
  SNIPPET_BACKGROUND_MODEL = 1, SNIPPET_EXHAUSTIVE = 2, SNIPPET_EMPTY_WITHOUT_MATCH = 4, SNIPPET_NGRAMS = 2048,
  SNIPPET_CJK_NGRAM = SNIPPET_NGRAMS
}
 

Public Member Functions

 MSet (const MSet &o)
 Copying is allowed.
 
MSet & operator= (const MSet &o)
 Copying is allowed.
 
 MSet ()
 Default constructor.
 
 ~MSet ()
 Destructor.
 
int convert_to_percent (double weight) const
 Convert a weight to a percentage.
 
int convert_to_percent (const MSetIterator &it) const
 Convert the weight of the current iterator position to a percentage.
 
Xapian::doccount get_termfreq (const std::string &term) const
 Get the termfreq of a term.
 
double get_termweight (const std::string &term) const
 Get the term weight of a term.
 
Xapian::doccount get_firstitem () const
 Rank of first item in this MSet.
 
Xapian::doccount get_matches_lower_bound () const
 Lower bound on the total number of matching documents.
 
Xapian::doccount get_matches_estimated () const
 Estimate of the total number of matching documents.
 
Xapian::doccount get_matches_upper_bound () const
 Upper bound on the total number of matching documents.
 
Xapian::doccount get_uncollapsed_matches_lower_bound () const
 Lower bound on the total number of matching documents before collapsing.
 
Xapian::doccount get_uncollapsed_matches_estimated () const
 Estimate of the total number of matching documents before collapsing.
 
Xapian::doccount get_uncollapsed_matches_upper_bound () const
 Upper bound on the total number of matching documents before collapsing.
 
double get_max_attained () const
 The maximum weight attained by any document.
 
double get_max_possible () const
 The maximum possible weight any document could achieve.
 
std::string snippet (const std::string &text, size_t length=500, const Xapian::Stem &stemmer=Xapian::Stem(), unsigned flags=SNIPPET_BACKGROUND_MODEL|SNIPPET_EXHAUSTIVE, const std::string &hi_start="<b>", const std::string &hi_end="</b>", const std::string &omit="...") const
 Generate a snippet.
 
void fetch (const MSetIterator &begin, const MSetIterator &end) const
 Prefetch hint a range of items.
 
void fetch (const MSetIterator &item) const
 Prefetch hint a single MSet item.
 
void fetch () const
 Prefetch hint the whole MSet.
 
Xapian::doccount size () const
 Return number of items in this MSet object.
 
bool empty () const
 Return true if this MSet object is empty.
 
void swap (MSet &o)
 Efficiently swap this MSet object with another.
 
MSetIterator begin () const
 Return iterator pointing to the first item in this MSet.
 
MSetIterator end () const
 Return iterator pointing to just after the last item in this MSet.
 
MSetIterator operator[] (Xapian::doccount i) const
 Return iterator pointing to the i-th object in this MSet.
 
MSetIterator back () const
 Return iterator pointing to the last object in this MSet.
 
std::string get_description () const
 Return a string describing this object.
 
Xapian::doccount max_size () const
 

Private Types

typedef Xapian::MSetIterator value_type
 
typedef Xapian::doccount size_type
 
typedef Xapian::doccount_diff difference_type
 
typedef Xapian::MSetIterator iterator
 
typedef Xapian::MSetIterator const_iterator
 
typedef value_type * pointer
 
typedef const value_type * const_pointer
 
typedef value_type & reference
 
typedef const value_type & const_reference
 

Private Member Functions

void fetch_ (Xapian::doccount first, Xapian::doccount last) const
 

Private Attributes

Xapian::Internal::intrusive_ptr< Internal > internal
 

Friends

class MSetIterator
 

Detailed Description

Class representing a list of search results.

Definition at line 44 of file mset.h.

Member Typedef Documentation

◆ const_iterator

typedef Xapian::MSetIterator Xapian::MSet::const_iterator
private

MSet is what the C++ STL calls a container.

The following typedefs allow the class to be used in templates in the same way the standard containers can be.

These are deliberately hidden from the Doxygen-generated docs, as the machinery here isn't interesting to API users. They just need to know that Xapian container classes are compatible with the STL.

See "The C++ Programming Language", 3rd ed. section 16.3.1.

Definition at line 341 of file mset.h.

◆ const_pointer

typedef const value_type* Xapian::MSet::const_pointer
private

Definition at line 345 of file mset.h.

◆ const_reference

typedef const value_type& Xapian::MSet::const_reference
private

Definition at line 349 of file mset.h.

◆ difference_type

typedef Xapian::doccount_diff Xapian::MSet::difference_type
private

Definition at line 337 of file mset.h.

◆ iterator

typedef Xapian::MSetIterator Xapian::MSet::iterator
private

Definition at line 339 of file mset.h.

◆ pointer

typedef value_type* Xapian::MSet::pointer
private

Definition at line 343 of file mset.h.

◆ reference

typedef value_type& Xapian::MSet::reference
private

Definition at line 347 of file mset.h.

◆ size_type

typedef Xapian::doccount Xapian::MSet::size_type
private

Definition at line 335 of file mset.h.

◆ value_type

typedef Xapian::MSetIterator Xapian::MSet::value_type
private

Definition at line 333 of file mset.h.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum
Enumerator
SNIPPET_BACKGROUND_MODEL 

Model the relevancy of non-query terms in MSet::snippet().

Non-query terms will be assigned a small weight, and snippet selection will tend to prefer candidates which contain a more interesting background (where the query term content is equivalent).

SNIPPET_EXHAUSTIVE 

Exhaustively evaluate candidate snippets in MSet::snippet().

Without this flag, snippet generation will stop once it thinks it has found a "good enough" snippet, which will generally reduce the time taken to generate a snippet.

SNIPPET_EMPTY_WITHOUT_MATCH 

Return the empty string if no term got matched.

If enabled, snippet() returns an empty string if not a single match was found in text. If not enabled, snippet() returns a (sub)string of text without any highlighted terms.

SNIPPET_NGRAMS 

Generate n-grams for scripts without explicit word breaks.

Text in other scripts is split into words as normal.

Enable this option to highlight search results for queries parsed with the QueryParser::FLAG_NGRAMS flag.

The TermGenerator::FLAG_NGRAMS flag needs to have been used at index time.

This mode can also be enabled by setting environment variable XAPIAN_CJK_NGRAM to a non-empty value (but doing so was deprecated in 1.4.11).

In 1.4.x this feature was specific to CJK (Chinese, Japanese and Korean), but in 1.5.0 it's been extended to other languages. To reflect this change the new and preferred name is SNIPPET_NGRAMS, which was added as an alias for forward compatibility in Xapian 1.4.23. Use SNIPPET_CJK_NGRAM instead if you aim to support Xapian < 1.4.23.

Since
Added in Xapian 1.4.23.
SNIPPET_CJK_NGRAM 

Generate n-grams for scripts without explicit word breaks.

Old name - use SNIPPET_NGRAMS instead unless you aim to support Xapian < 1.4.23.

Since
Added in Xapian 1.4.11.

Definition at line 165 of file mset.h.
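
As a rough illustration of how these enumerators combine into the flags argument of snippet(), the sketch below mirrors the values listed above in a local enum so that it compiles without the Xapian headers. Real code would use the Xapian::MSet::SNIPPET_* names directly. The helpers build_snippet_flags and has_flag are hypothetical names introduced here, not part of Xapian.

```cpp
#include <cassert>

// Local mirror of the enumerator values documented above.  Illustration
// only: real code should use Xapian::MSet::SNIPPET_* from the Xapian headers.
enum {
    SNIPPET_BACKGROUND_MODEL = 1,
    SNIPPET_EXHAUSTIVE = 2,
    SNIPPET_EMPTY_WITHOUT_MATCH = 4,
    SNIPPET_NGRAMS = 2048,
    SNIPPET_CJK_NGRAM = SNIPPET_NGRAMS
};

// Hypothetical helper: start from the documented default flags of snippet()
// and optionally add the "empty result when nothing matched" and n-gram bits.
inline unsigned build_snippet_flags(bool empty_without_match, bool ngrams) {
    unsigned flags = SNIPPET_BACKGROUND_MODEL | SNIPPET_EXHAUSTIVE;
    if (empty_without_match) flags |= SNIPPET_EMPTY_WITHOUT_MATCH;
    if (ngrams) flags |= SNIPPET_NGRAMS;
    return flags;
}

// Hypothetical helper: test whether a single flag bit is set.
inline bool has_flag(unsigned flags, unsigned flag) {
    assert(flag != 0);
    return (flags & flag) == flag;
}
```

Because SNIPPET_CJK_NGRAM is an alias with the same value, testing for either name gives the same answer.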

Constructor & Destructor Documentation

◆ MSet() [1/2]

Xapian::MSet::MSet ( const MSet & o)

Copying is allowed.

The internals are reference counted, so copying is cheap.

Definition at line 173 of file omenquire.cc.

◆ MSet() [2/2]

Xapian::MSet::MSet ( )

Default constructor.

Creates an empty MSet, mostly useful as a placeholder.

Definition at line 165 of file omenquire.cc.

Referenced by operator=().

◆ ~MSet()

Xapian::MSet::~MSet ( )

Destructor.

Definition at line 169 of file omenquire.cc.

Member Function Documentation

◆ back()

MSetIterator Xapian::MSet::back ( ) const
inline

Return iterator pointing to the last object in this MSet.

Definition at line 641 of file mset.h.

Referenced by DEFINE_TESTCASE().

◆ begin()

MSetIterator Xapian::MSet::begin ( ) const
inline

Return iterator pointing to the first item in this MSet.

Definition at line 624 of file mset.h.

Referenced by DEFINE_TESTCASE(), main(), mset_expect_order_(), print_mset_percentages(), print_mset_weights(), and test_mset_order_equal().

◆ convert_to_percent() [1/2]

int Xapian::MSet::convert_to_percent ( double  weight) const

Convert a weight to a percentage.

The matching document with the highest weight will get 100% if it matches all the weighted query terms, and proportionally less if it only matches some, and other weights are scaled by the same factor.

Documents with a non-zero score will always score at least 1%.

Note that these generally aren't percentages of anything meaningful (unless you use a custom weighting formula where they are).

Definition at line 198 of file omenquire.cc.

References Assert, internal, LOGCALL, and RETURN.

Referenced by DEFINE_TESTCASE(), Xapian::MSetIterator::get_percent(), and print_mset_percentages().
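
The scaling rule described above can be modelled roughly as follows. This is a simplified sketch and not Xapian's implementation: it assumes the top-ranked document matches every weighted query term (so the scale factor is 100 divided by the highest weight), and rounding down is an assumption made here. The function name scale_to_percent is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Simplified model, NOT Xapian's implementation.  Assumes the document with
// weight max_weight matches all weighted query terms, so it maps to 100%.
inline int scale_to_percent(double weight, double max_weight) {
    assert(max_weight > 0.0 && weight >= 0.0 && weight <= max_weight);
    // A zero weight stays at 0%.
    if (weight == 0.0) return 0;
    // Scale by the same factor for every weight (rounding down is assumed).
    int percent = static_cast<int>(std::floor(weight * 100.0 / max_weight));
    // Any non-zero weight scores at least 1%, as documented above.
    return percent < 1 ? 1 : percent;
}
```

In real code, call convert_to_percent() on the MSet itself, since only the library knows how many weighted query terms the top document matched.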

◆ convert_to_percent() [2/2]

int Xapian::MSet::convert_to_percent ( const MSetIterator & it) const
inline

Convert the weight of the current iterator position to a percentage.

The matching document with the highest weight will get 100% if it matches all the weighted query terms, and proportionally less if it only matches some, and other weights are scaled by the same factor.

Documents with a non-zero score will always score at least 1%.

Note that these generally aren't percentages of anything meaningful (unless you use a custom weighting formula where they are).

Definition at line 646 of file mset.h.

References Xapian::MSetIterator::get_weight().

◆ empty()

bool Xapian::MSet::empty ( ) const
inline

Return true if this MSet object is empty.

Definition at line 300 of file mset.h.

Referenced by DEFINE_TESTCASE(), and operator==().

◆ end()

MSetIterator Xapian::MSet::end ( ) const
inline

Return iterator pointing to just after the last item in this MSet.

Definition at line 629 of file mset.h.

Referenced by DEFINE_TESTCASE(), main(), print_mset_percentages(), print_mset_weights(), and test_mset_order_equal().
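
Since begin() and end() follow the usual container conventions, a result set can be walked with an ordinary iterator loop. The sketch below is written as a template so that it compiles without the Xapian headers. It assumes only that dereferencing the iterator yields an unsigned document id, which is what MSetIterator provides (Xapian::docid). The name collect_docids is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Collect the document ids from any container exposing begin(), end() and
// size(), whose iterator dereferences to an unsigned document id.
template <typename Container>
std::vector<unsigned> collect_docids(const Container& results) {
    std::vector<unsigned> ids;
    for (auto it = results.begin(); it != results.end(); ++it) {
        ids.push_back(*it);
    }
    // Every item between begin() and end() should have been visited.
    assert(ids.size() == results.size());
    return ids;
}
```

Passing a Xapian::MSet would return the matching document ids in rank order. A std::vector<unsigned> can serve as a stand-in when experimenting without a database.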

◆ fetch() [1/3]

void Xapian::MSet::fetch ( const MSetIterator & begin,
const MSetIterator & end 
) const
inline

Prefetch hint a range of items.

For a remote database, this may start a pipelined fetch of the requested documents from the remote server.

For a disk-based database, this may send prefetch hints to the operating system such that the disk blocks the requested documents are stored in are more likely to be in the cache when we come to actually read them.

Definition at line 612 of file mset.h.

References Xapian::MSetIterator::off_from_end.

Referenced by DEFINE_TESTCASE().

◆ fetch() [2/3]

void Xapian::MSet::fetch ( const MSetIterator & item) const
inline

Prefetch hint a single MSet item.

For a remote database, this may start a pipelined fetch of the requested documents from the remote server.

For a disk-based database, this may send prefetch hints to the operating system such that the disk blocks the requested documents are stored in are more likely to be in the cache when we come to actually read them.

Definition at line 618 of file mset.h.

References Xapian::MSetIterator::off_from_end.

◆ fetch() [3/3]

void Xapian::MSet::fetch ( ) const
inline

Prefetch hint the whole MSet.

For a remote database, this may start a pipelined fetch of the requested documents from the remote server.

For a disk-based database, this may send prefetch hints to the operating system such that the disk blocks the requested documents are stored in are more likely to be in the cache when we come to actually read them.

Definition at line 294 of file mset.h.
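
A typical pattern is to issue the prefetch hint before walking the results, so that any pipelined fetch or cache warming can overlap with the loop. The sketch below shows that ordering only. FakeResults is a hypothetical stand-in for Xapian::MSet so that the example compiles without the Xapian headers, and prefetch_then_visit is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Xapian::MSet: it only records that fetch() was
// called and exposes begin()/end() over a list of document ids.
struct FakeResults {
    std::vector<unsigned> ids;
    mutable bool prefetched = false;
    void fetch() const { prefetched = true; }
    std::vector<unsigned>::const_iterator begin() const { return ids.begin(); }
    std::vector<unsigned>::const_iterator end() const { return ids.end(); }
};

// Issue the prefetch hint for the whole result set, then visit each item.
// Returns the number of items visited.
template <typename Results, typename Visitor>
std::size_t prefetch_then_visit(const Results& results, Visitor visit) {
    results.fetch();
    std::size_t visited = 0;
    for (auto it = results.begin(); it != results.end(); ++it) {
        visit(*it);
        ++visited;
    }
    return visited;
}
```

The hint is advisory: as described above, it may start a fetch or warm the cache, and the loop behaves the same either way.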

◆ fetch_()

void Xapian::MSet::fetch_ ( Xapian::doccount  first,
Xapian::doccount  last 
) const
private

Definition at line 190 of file omenquire.cc.

References Assert, and LOGCALL_VOID.

◆ get_description()

string Xapian::MSet::get_description ( ) const

Return a string describing this object.

Definition at line 325 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE().

◆ get_firstitem()

Xapian::doccount Xapian::MSet::get_firstitem ( ) const

Rank of first item in this MSet.

This is the parameter first passed to Xapian::Enquire::get_mset().

Definition at line 239 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), Xapian::MSetIterator::get_rank(), and serialise_mset().

◆ get_matches_estimated()

Xapian::doccount Xapian::MSet::get_matches_estimated ( ) const

Estimate of the total number of matching documents.

Definition at line 253 of file omenquire.cc.

References Assert, internal, and round_estimate().

Referenced by DEFINE_TESTCASE(), main(), operator==(), and PerfTestLogger::search_end().

◆ get_matches_lower_bound()

Xapian::doccount Xapian::MSet::get_matches_lower_bound ( ) const

Lower bound on the total number of matching documents.

Definition at line 246 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), main(), operator==(), and PerfTestLogger::search_end().

◆ get_matches_upper_bound()

Xapian::doccount Xapian::MSet::get_matches_upper_bound ( ) const

Upper bound on the total number of matching documents.

Definition at line 262 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), main(), operator==(), and PerfTestLogger::search_end().
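
One common use of the three match-count accessors is to decide how to word a result count for display. The sketch below assumes the estimate always lies between the lower and upper bounds. FakeCounts is a hypothetical stand-in for Xapian::MSet so that the example compiles without the Xapian headers, and describe_match_count is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in exposing the three accessors documented above.
struct FakeCounts {
    unsigned lower, estimated, upper;
    unsigned get_matches_lower_bound() const { return lower; }
    unsigned get_matches_estimated() const { return estimated; }
    unsigned get_matches_upper_bound() const { return upper; }
};

// Report an exact count when the bounds agree, otherwise the estimate.
template <typename Results>
std::string describe_match_count(const Results& results) {
    const auto lower = results.get_matches_lower_bound();
    const auto estimated = results.get_matches_estimated();
    const auto upper = results.get_matches_upper_bound();
    // Assumed invariant: lower bound <= estimate <= upper bound.
    assert(lower <= estimated && estimated <= upper);
    if (lower == upper) return "exactly " + std::to_string(estimated);
    return "about " + std::to_string(estimated);
}
```

The same helper should work with a Xapian::MSet, since it relies only on the three accessor names.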

◆ get_max_attained()

double Xapian::MSet::get_max_attained ( ) const

The maximum weight attained by any document.

Definition at line 297 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), and serialise_mset().

◆ get_max_possible()

double Xapian::MSet::get_max_possible ( ) const

The maximum possible weight any document could achieve.

Definition at line 290 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), operator==(), and serialise_mset().

◆ get_termfreq()

Xapian::doccount Xapian::MSet::get_termfreq ( const std::string &  term) const

Get the termfreq of a term.

Returns
The number of documents which term occurs in. This considers all documents in the database being searched, so gives the same answer as db.get_termfreq(term) (but is more efficient for query terms as it returns a value cached during the search.)

Definition at line 206 of file omenquire.cc.

References Assert, internal, LOGCALL, RETURN, and usual.

Referenced by DEFINE_TESTCASE(), and main().

◆ get_termweight()

double Xapian::MSet::get_termweight ( const std::string &  term) const

Get the term weight of a term.

Returns
The maximum weight that term could have contributed to a document.

Definition at line 222 of file omenquire.cc.

References Assert, internal, LOGCALL, and RETURN.

Referenced by DEFINE_TESTCASE().

◆ get_uncollapsed_matches_estimated()

Xapian::doccount Xapian::MSet::get_uncollapsed_matches_estimated ( ) const

Estimate of the total number of matching documents before collapsing.

Conceptually the same as get_matches_estimated() for the same query without any collapse part (though the actual value may differ).

Definition at line 276 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), and serialise_mset().

◆ get_uncollapsed_matches_lower_bound()

Xapian::doccount Xapian::MSet::get_uncollapsed_matches_lower_bound ( ) const

Lower bound on the total number of matching documents before collapsing.

Conceptually the same as get_matches_lower_bound() for the same query without any collapse part (though the actual value may differ).

Definition at line 269 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), and serialise_mset().

◆ get_uncollapsed_matches_upper_bound()

Xapian::doccount Xapian::MSet::get_uncollapsed_matches_upper_bound ( ) const

Upper bound on the total number of matching documents before collapsing.

Conceptually the same as get_matches_upper_bound() for the same query without any collapse part (though the actual value may differ).

Definition at line 283 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE(), and serialise_mset().

◆ max_size()

Xapian::doccount Xapian::MSet::max_size ( ) const
inline

MSet is what the C++ STL calls a container.

The following methods allow the class to be used in templates in the same way the standard containers can be.

These are deliberately hidden from the Doxygen-generated docs, as the machinery here isn't interesting to API users. They just need to know that Xapian container classes are compatible with the STL.

Definition at line 363 of file mset.h.

◆ operator=()

MSet & Xapian::MSet::operator= ( const MSet & o)
default

Copying is allowed.

The internals are reference counted, so assignment is cheap.

Definition at line 178 of file omenquire.cc.

References internal, and MSet().

◆ operator[]()

MSetIterator Xapian::MSet::operator[] ( Xapian::doccount  i) const
inline

Return iterator pointing to the i-th object in this MSet.

Definition at line 636 of file mset.h.

◆ size()

Xapian::doccount Xapian::MSet::size ( ) const

Return number of items in this MSet object.

◆ snippet()

string Xapian::MSet::snippet ( const std::string &  text,
size_t  length = 500,
const Xapian::Stem & stemmer = Xapian::Stem(),
unsigned  flags = SNIPPET_BACKGROUND_MODEL|SNIPPET_EXHAUSTIVE,
const std::string &  hi_start = "<b>",
const std::string &  hi_end = "</b>",
const std::string &  omit = "..." 
) const

Generate a snippet.

This method selects a continuous run of words from text, based mainly on where the query matches (currently terms, exact phrases and wildcards are taken into account). If flag SNIPPET_BACKGROUND_MODEL is used (which it is by default) then the selection algorithm also considers the non-query terms in the text with the aim of showing a context which provides more useful information.

The size of the text selected can be controlled by the length parameter, which specifies a number of bytes of text to aim to select. However slightly more text may be selected. Also the size of any escaping, highlighting or omission markers is not considered.

The returned text is escaped to make it suitable for use in HTML (though beware that in upstream releases 1.4.5 and earlier this escaping was sometimes incomplete), and matches with the query will be highlighted using hi_start and hi_end.

If the snippet seems to start or end mid-sentence, then omit is prepended or appended (respectively) to indicate this.

The same stemming algorithm which was used to build the query should be specified in stemmer.

The flags parameter contains flags controlling behaviour.

Added in 1.3.5.

Definition at line 304 of file omenquire.cc.

References Assert.

Referenced by DEFINE_TESTCASE().

◆ swap()

void Xapian::MSet::swap ( MSet & o)
inline

Efficiently swap this MSet object with another.

Definition at line 303 of file mset.h.

References internal.

Friends And Related Function Documentation

◆ MSetIterator

friend class MSetIterator
friend

Definition at line 45 of file mset.h.

Member Data Documentation

◆ internal

Xapian::Internal::intrusive_ptr<Internal> Xapian::MSet::internal
private

The documentation for this class was generated from the following files: