Class representing a list of search results.

#include <mset.h>

Public Types
enum	{ SNIPPET_BACKGROUND_MODEL = 1 , SNIPPET_EXHAUSTIVE = 2 , SNIPPET_EMPTY_WITHOUT_MATCH = 4 , SNIPPET_NGRAMS = 2048 , SNIPPET_CJK_NGRAM = SNIPPET_NGRAMS }

Public Member Functions
	MSet (const MSet &o)
	Copying is allowed.

MSet &	operator= (const MSet &o)
	Copying is allowed.

	MSet ()
	Default constructor.

	~MSet ()
	Destructor.

int	convert_to_percent (double weight) const
	Convert a weight to a percentage.

int	convert_to_percent (const MSetIterator &it) const
	Convert the weight of the current iterator position to a percentage.

Xapian::doccount	get_termfreq (const std::string &term) const
	Get the termfreq of a term.

double	get_termweight (const std::string &term) const
	Get the term weight of a term.

Xapian::doccount	get_firstitem () const
	Rank of first item in this MSet.

Xapian::doccount	get_matches_lower_bound () const
	Lower bound on the total number of matching documents.

Xapian::doccount	get_matches_estimated () const
	Estimate of the total number of matching documents.

Xapian::doccount	get_matches_upper_bound () const
	Upper bound on the total number of matching documents.

Xapian::doccount	get_uncollapsed_matches_lower_bound () const
	Lower bound on the total number of matching documents before collapsing.

Xapian::doccount	get_uncollapsed_matches_estimated () const
	Estimate of the total number of matching documents before collapsing.

Xapian::doccount	get_uncollapsed_matches_upper_bound () const
	Upper bound on the total number of matching documents before collapsing.

double	get_max_attained () const
	The maximum weight attained by any document.

double	get_max_possible () const
	The maximum possible weight any document could achieve.

std::string	snippet (const std::string &text, size_t length=500, const Xapian::Stem &stemmer=Xapian::Stem(), unsigned flags=SNIPPET_BACKGROUND_MODEL|SNIPPET_EXHAUSTIVE, const std::string &hi_start="<b>", const std::string &hi_end="</b>", const std::string &omit="...") const
	Generate a snippet.

void	fetch (const MSetIterator &begin, const MSetIterator &end) const
	Give a prefetch hint for a range of items.

void	fetch (const MSetIterator &item) const
	Give a prefetch hint for a single MSet item.

void	fetch () const
	Give a prefetch hint for the whole MSet.

Xapian::doccount	size () const
	Return number of items in this MSet object.

bool	empty () const
	Return true if this MSet object is empty.

void	swap (MSet &o)
	Efficiently swap this MSet object with another.

MSetIterator	begin () const
	Return iterator pointing to the first item in this MSet.

MSetIterator	end () const
	Return iterator pointing to just after the last item in this MSet.

MSetIterator	operator[] (Xapian::doccount i) const
	Return iterator pointing to the i-th object in this MSet.

MSetIterator	back () const
	Return iterator pointing to the last object in this MSet.

std::string	get_description () const
	Return a string describing this object.

Detailed Description

Class representing a list of search results.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum

Enumerator
SNIPPET_BACKGROUND_MODEL	Model the relevancy of non-query terms in MSet::snippet(). Non-query terms are assigned a small weight, so when two candidate snippets have equivalent query term content, the one with the more interesting background tends to be preferred.
SNIPPET_EXHAUSTIVE	Exhaustively evaluate candidate snippets in MSet::snippet(). Without this flag, snippet generation will stop once it thinks it has found a "good enough" snippet, which will generally reduce the time taken to generate a snippet.
SNIPPET_EMPTY_WITHOUT_MATCH	Return the empty string if no term matched. If enabled, snippet() returns an empty string when no match at all is found in text. If not enabled, snippet() returns a (sub)string of text without any highlighted terms.
SNIPPET_NGRAMS	Generate n-grams for scripts without explicit word breaks. Text in other scripts is split into words as normal. Enable this option to highlight search results for queries parsed with the QueryParser::FLAG_NGRAMS flag. The TermGenerator::FLAG_NGRAMS flag needs to have been used at index time. This mode can also be enabled by setting the environment variable XAPIAN_CJK_NGRAM to a non-empty value (but doing so was deprecated in 1.4.11). In 1.4.x this feature was specific to CJK (Chinese, Japanese and Korean), but in 1.5.0 it has been extended to other languages. To reflect this change, the new and preferred name is SNIPPET_NGRAMS, which was added as an alias for forward compatibility in Xapian 1.4.23. Use SNIPPET_CJK_NGRAM instead if you aim to support Xapian < 1.4.23. Added in Xapian 1.4.23.
SNIPPET_CJK_NGRAM	Generate n-grams for scripts without explicit word breaks. Old name: use SNIPPET_NGRAMS instead unless you aim to support Xapian < 1.4.23. Added in Xapian 1.4.11.
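
As a rough illustration of how these enumerators combine, the sketch below builds a flags value with bitwise OR, in the same way as the default argument of MSet::snippet(). The enumerator values are copied from the list above so that the example compiles without Xapian; make_snippet_flags() and has_flag() are hypothetical helpers and not part of the Xapian API. Real code would use the Xapian::MSet::SNIPPET_* constants directly.

```cpp
// Values copied from the enumerator list above. This local enum is only a
// stand-in; real code should use Xapian::MSet::SNIPPET_* from <xapian.h>.
enum SnippetFlag : unsigned {
    SNIPPET_BACKGROUND_MODEL = 1,
    SNIPPET_EXHAUSTIVE = 2,
    SNIPPET_EMPTY_WITHOUT_MATCH = 4,
    SNIPPET_NGRAMS = 2048,
    SNIPPET_CJK_NGRAM = SNIPPET_NGRAMS  // old name, same bit
};

// Hypothetical helper: start from the background model and add optional bits.
unsigned make_snippet_flags(bool exhaustive, bool empty_without_match,
                            bool ngrams) {
    unsigned flags = SNIPPET_BACKGROUND_MODEL;
    if (exhaustive) flags |= SNIPPET_EXHAUSTIVE;
    if (empty_without_match) flags |= SNIPPET_EMPTY_WITHOUT_MATCH;
    if (ngrams) flags |= SNIPPET_NGRAMS;
    return flags;
}

// Hypothetical helper: test whether a given bit is set in a flags value.
bool has_flag(unsigned flags, unsigned flag) {
    return (flags & flag) != 0;
}
```

Calling make_snippet_flags(true, false, false) reproduces the documented default, SNIPPET_BACKGROUND_MODEL|SNIPPET_EXHAUSTIVE. Because SNIPPET_CJK_NGRAM is an alias, has_flag() gives the same answer for either name.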

Constructor & Destructor Documentation

◆ MSet() [1/2]

Xapian::MSet::MSet ( const MSet & o )

Copying is allowed.

The internals are reference counted, so copying is cheap.

◆ MSet() [2/2]

Xapian::MSet::MSet ( )

Default constructor.

Creates an empty MSet, mostly useful as a placeholder.

Member Function Documentation

◆ convert_to_percent() [1/2]

int Xapian::MSet::convert_to_percent ( const MSetIterator & it ) const

inline

Convert the weight of the current iterator position to a percentage.

The matching document with the highest weight will get 100% if it matches all the weighted query terms, and proportionally less if it only matches some. Other weights are scaled by the same factor.

Documents with a non-zero score will always score at least 1%.

Note that these generally aren't percentages of anything meaningful, unless you use a custom weighting formula where they are.

References convert_to_percent(), and Xapian::MSetIterator::get_weight().

◆ convert_to_percent() [2/2]

int Xapian::MSet::convert_to_percent ( double weight ) const

Convert a weight to a percentage.

The matching document with the highest weight will get 100% if it matches all the weighted query terms, and proportionally less if it only matches some. Other weights are scaled by the same factor.

Documents with a non-zero score will always score at least 1%.

Note that these generally aren't percentages of anything meaningful, unless you use a custom weighting formula where they are.

Referenced by convert_to_percent(), and Xapian::MSetIterator::get_percent().
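
The scaling rule described above can be pictured with a simplified model. This is only a sketch under stated assumptions, not Xapian's implementation: model_percent() is a hypothetical function, matched_fraction stands for the share of the weighted query terms matched by the top document, and truncation toward zero is an assumed rounding rule.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of the documented behaviour (not Xapian's actual code).
//   weight           : weight of the document being converted
//   max_attained     : highest weight of any matching document (> 0)
//   matched_fraction : assumed share of the weighted query terms matched by
//                      the top document, in (0, 1]
int model_percent(double weight, double max_attained,
                  double matched_fraction) {
    assert(max_attained > 0.0);
    assert(matched_fraction > 0.0 && matched_fraction <= 1.0);
    if (weight <= 0.0) return 0;  // a zero score stays at 0%
    // One factor is shared by every document, so relative order is kept.
    const double factor = 100.0 * matched_fraction / max_attained;
    const int percent = static_cast<int>(weight * factor);
    // Non-zero scores are reported as at least 1% and never above 100%.
    return std::max(1, std::min(percent, 100));
}
```

In this model a top document of weight 8.0 that matches every weighted term maps to 100, a document of weight 4.0 in the same result set maps to 50, and a very small non-zero weight is raised to 1.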

◆ fetch() [1/3]

void Xapian::MSet::fetch ( ) const

inline

Give a prefetch hint for the whole MSet.

For a remote database, this may start a pipelined fetch of the requested documents from the remote server.

For a disk-based database, this may send prefetch hints to the operating system so that the disk blocks storing the requested documents are more likely to be in the cache when they are actually read.

◆ fetch() [2/3]

void Xapian::MSet::fetch	(	const MSetIterator &	begin,
		const MSetIterator &	end
	)		const

inline

Give a prefetch hint for a range of items.

For a remote database, this may start a pipelined fetch of the requested documents from the remote server.

For a disk-based database, this may send prefetch hints to the operating system so that the disk blocks storing the requested documents are more likely to be in the cache when they are actually read.

◆ fetch() [3/3]

void Xapian::MSet::fetch ( const MSetIterator & item ) const

inline

Give a prefetch hint for a single MSet item.

For a remote database, this may start a pipelined fetch of the requested documents from the remote server.

For a disk-based database, this may send prefetch hints to the operating system so that the disk blocks storing the requested documents are more likely to be in the cache when they are actually read.

◆ get_firstitem()

Xapian::doccount Xapian::MSet::get_firstitem ( ) const

Rank of first item in this MSet.

This is the value that was passed as the parameter first to Xapian::Enquire::get_mset().

Referenced by Xapian::MSetIterator::get_rank().
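
To show how this value relates to paging, here is a small sketch of the arithmetic involved. It assumes zero-based page numbers, that Enquire::get_mset() is called with a starting rank and a maximum number of items, and that the rank of an item is get_firstitem() plus its position within the MSet. The doccount alias, Window, page_window() and rank_at() are hypothetical stand-ins and not Xapian names.

```cpp
// Stand-in for Xapian::doccount so the sketch compiles without Xapian.
using doccount = unsigned;

// Hypothetical pair of values to request one page of results.
struct Window {
    doccount first;     // rank of the first item wanted
    doccount maxitems;  // maximum number of items wanted
};

// Compute the window for a zero-based page number.
Window page_window(doccount page, doccount page_size) {
    return Window{page * page_size, page_size};
}

// Assumed relation: overall rank = rank of first item + position in the MSet.
doccount rank_at(doccount firstitem, doccount index) {
    return firstitem + index;
}
```

For a page size of 10, the third page (page number 2) starts at rank 20, so under these assumptions the item at position 3 of that MSet has rank 23.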

◆ get_termfreq()

Xapian::doccount Xapian::MSet::get_termfreq ( const std::string & term ) const

Get the termfreq of a term.

Returns: The number of documents in which term occurs. This considers all documents in the database being searched, so it gives the same answer as db.get_termfreq(term), but is more efficient for query terms because it returns a value cached during the search.

◆ get_termweight()

double Xapian::MSet::get_termweight ( const std::string & term ) const

Get the term weight of a term.

Returns: The maximum weight that term could have contributed to a document.

◆ get_uncollapsed_matches_estimated()

Xapian::doccount Xapian::MSet::get_uncollapsed_matches_estimated ( ) const

Estimate of the total number of matching documents before collapsing.

Conceptually the same as get_matches_estimated() for the same query without any collapse part (though the actual value may differ).

◆ get_uncollapsed_matches_lower_bound()

Xapian::doccount Xapian::MSet::get_uncollapsed_matches_lower_bound ( ) const

Lower bound on the total number of matching documents before collapsing.

Conceptually the same as get_matches_lower_bound() for the same query without any collapse part (though the actual value may differ).

◆ get_uncollapsed_matches_upper_bound()

Xapian::doccount Xapian::MSet::get_uncollapsed_matches_upper_bound ( ) const

Upper bound on the total number of matching documents before collapsing.

Conceptually the same as get_matches_upper_bound() for the same query without any collapse part (though the actual value may differ).
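
One way an application might use a lower bound, estimate and upper bound together is to decide whether a count can be shown as exact. The sketch below applies equally to the get_matches_*() and get_uncollapsed_matches_*() triples. describe_match_count() is a hypothetical helper, and the ordering lower <= estimated <= upper is assumed here rather than stated by this page.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turn the three values into display text.
std::string describe_match_count(unsigned lower, unsigned estimated,
                                 unsigned upper) {
    // Assumed invariant: the estimate lies between the two bounds.
    assert(lower <= estimated && estimated <= upper);
    if (lower == upper) {
        // The bounds agree, so the count is known exactly.
        return "matches: " + std::to_string(estimated);
    }
    return "matches: about " + std::to_string(estimated);
}
```

With bounds of 42 and 42 the text reads "matches: 42"; with bounds of 10 and 200 and an estimate of 57 it reads "matches: about 57".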

◆ operator=()

MSet & Xapian::MSet::operator= ( const MSet & o )

Copying is allowed.

The internals are reference counted, so assignment is cheap.

◆ snippet()

std::string Xapian::MSet::snippet	(	const std::string &	text,
		size_t	length = 500,
		const Xapian::Stem &	stemmer = Xapian::Stem(),
		unsigned	flags = SNIPPET_BACKGROUND_MODEL|SNIPPET_EXHAUSTIVE,
		const std::string &	hi_start = "<b>",
		const std::string &	hi_end = "</b>",
		const std::string &	omit = "..."
	)		const

Generate a snippet.

This method selects a continuous run of words from text, based mainly on where the query matches (currently terms, exact phrases and wildcards are taken into account). If flag SNIPPET_BACKGROUND_MODEL is used (which it is by default) then the selection algorithm also considers the non-query terms in the text with the aim of showing a context which provides more useful information.

The size of the text selected can be controlled by the length parameter, which specifies a number of bytes of text to aim to select. However, slightly more text may be selected, and the size of any escaping, highlighting or omission markers is not counted.

The returned text is escaped to make it suitable for use in HTML (though beware that in upstream releases 1.4.5 and earlier this escaping was sometimes incomplete), and matches with the query will be highlighted using hi_start and hi_end.

If the snippet seems to start or end mid-sentence, then omit is prepended or appended (respectively) to indicate this.

The same stemming algorithm which was used to build the query should be specified in stemmer.

The flags parameter is a bitwise OR of the SNIPPET_* values described above and controls the behaviour of snippet generation.

Added in 1.3.5.

The documentation for this class was generated from the following file:

xapian/mset.h
