Xapian::Weight Class Reference

Abstract base class for weighting schemes.

#include <weight.h>

Classes

class  Internal
 Class to hold statistics for a given collection.

Public Member Functions

virtual ~Weight ()
 Virtual destructor, because we have virtual methods.
virtual Weight * clone () const =0
 Clone this object.
virtual std::string name () const
 Return the name of this weighting scheme.
virtual std::string serialise () const
 Return this object's parameters serialised as a single string.
virtual Weight * unserialise (const std::string &s) const
 Unserialise parameters.
virtual Xapian::weight get_sumpart (Xapian::termcount wdf, Xapian::termcount doclen) const =0
 Calculate the weight contribution for this object's term to a document.
virtual Xapian::weight get_maxpart () const =0
 Return an upper bound on what get_sumpart() can return for any document.
virtual Xapian::weight get_sumextra (Xapian::termcount doclen) const =0
 Calculate the term-independent weight component for a document.
virtual Xapian::weight get_maxextra () const =0
 Return an upper bound on what get_sumextra() can return for any document.

Protected Types

enum  stat_flags {
  COLLECTION_SIZE = 1, RSET_SIZE = 2, AVERAGE_LENGTH = 4, TERMFREQ = 8,
  RELTERMFREQ = 16, QUERY_LENGTH = 32, WQF = 64, WDF = 128,
  DOC_LENGTH = 256, DOC_LENGTH_MIN = 512, DOC_LENGTH_MAX = 1024, WDF_MAX = 2048
}
 Stats which the weighting scheme can use (see need_stat()).

Protected Member Functions

void need_stat (stat_flags flag)
 Tell Xapian that your subclass will want a particular statistic.
virtual void init (double factor)=0
 Allow the subclass to perform any initialisation it needs to.
 Weight (const Weight &)
 Don't allow copying.
 Weight ()
 Default constructor, needed by subclass constructors.
Xapian::doccount get_collection_size () const
 The number of documents in the collection.
Xapian::doccount get_rset_size () const
 The number of documents marked as relevant.
Xapian::doclength get_average_length () const
 The average length of a document in the collection.
Xapian::doccount get_termfreq () const
 The number of documents which this term indexes.
Xapian::doccount get_reltermfreq () const
 The number of relevant documents which this term indexes.
Xapian::termcount get_query_length () const
 The length of the query.
Xapian::termcount get_wqf () const
 The within-query-frequency of this term.
Xapian::termcount get_doclength_upper_bound () const
 An upper bound on the maximum length of any document in the database.
Xapian::termcount get_doclength_lower_bound () const
 A lower bound on the minimum length of any document in the database.
Xapian::termcount get_wdf_upper_bound () const
 An upper bound on the wdf of this term.

Private Member Functions

void operator= (const Weight &)
 Don't allow assignment.
void init_ (const Internal &stats, Xapian::termcount query_len_, const std::string &term, Xapian::termcount wqf_, double factor)
void init_ (const Internal &stats, Xapian::termcount query_len_, double factor, Xapian::doccount termfreq, Xapian::doccount reltermfreq)
void init_ (const Internal &stats, Xapian::termcount query_len_)
bool get_sumpart_needs_doclength_ () const
bool get_sumpart_needs_wdf_ () const

Private Attributes

stat_flags stats_needed
 A bitmask of the statistics this weighting scheme needs.
Xapian::doccount collection_size_
 The number of documents in the collection.
Xapian::doccount rset_size_
 The number of documents marked as relevant.
Xapian::doclength average_length_
 The average length of a document in the collection.
Xapian::doccount termfreq_
 The number of documents which this term indexes.
Xapian::doccount reltermfreq_
 The number of relevant documents which this term indexes.
Xapian::termcount query_length_
 The length of the query.
Xapian::termcount wqf_
 The within-query-frequency of this term.
Xapian::termcount doclength_lower_bound_
 A lower bound on the minimum length of any document in the database.
Xapian::termcount doclength_upper_bound_
 An upper bound on the maximum length of any document in the database.
Xapian::termcount wdf_upper_bound_
 An upper bound on the wdf of this term.


Detailed Description

Abstract base class for weighting schemes.

Definition at line 33 of file weight.h.


Member Enumeration Documentation

enum Xapian::Weight::stat_flags [protected]

Stats which the weighting scheme can use (see need_stat()).

Enumerator:
COLLECTION_SIZE 
RSET_SIZE 
AVERAGE_LENGTH 
TERMFREQ 
RELTERMFREQ 
QUERY_LENGTH 
WQF 
WDF 
DOC_LENGTH 
DOC_LENGTH_MIN 
DOC_LENGTH_MAX 
WDF_MAX 

Definition at line 36 of file weight.h.


Constructor & Destructor Documentation

Xapian::Weight::~Weight (  )  [virtual]

Virtual destructor, because we have virtual methods.

Definition at line 112 of file weight.cc.

Xapian::Weight::Weight ( const Weight &  )  [protected]

Don't allow copying.

This would ideally be private, but that causes a compilation error with GCC 4.1 (which appears to be a bug).

Xapian::Weight::Weight (  )  [inline, protected]

Default constructor, needed by subclass constructors.

Definition at line 274 of file weight.h.


Member Function Documentation

virtual Weight* Xapian::Weight::clone (  )  const [pure virtual]

Clone this object.

This method allocates and returns a copy of the object it is called on.

If your subclass is called FooWeight and has parameters a and b, then you would implement FooWeight::clone() like so:

FooWeight * FooWeight::clone() const { return new FooWeight(a, b); }

Note that the returned object will be deallocated by Xapian after use with "delete". If you want to handle the deletion in a special way (for example when wrapping the Xapian API for use from another language) then you can define a static operator delete method in your subclass as shown here: http://trac.xapian.org/ticket/554#comment:1
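As a rough illustration of the class-specific operator delete mentioned above, here is a minimal sketch. It does not use the real Xapian headers: WeightBase is a stand-in for Xapian::Weight, and foo_weight_deletes and release() are hypothetical names added only so the effect can be observed. The linked ticket comment remains the authoritative example.

```cpp
#include <new>

// Stand-in for the part of the Xapian::Weight interface this sketch needs.
// It is NOT the real class; only the virtual destructor and clone() matter.
struct WeightBase {
    virtual ~WeightBase() {}
    virtual WeightBase* clone() const = 0;
};

// Hypothetical counter, used only to make the custom deallocator observable.
static int foo_weight_deletes = 0;

class FooWeight : public WeightBase {
    double a, b;

  public:
    FooWeight(double a_, double b_) : a(a_), b(b_) {}

    // Same shape as the clone() implementation shown above.
    FooWeight* clone() const { return new FooWeight(a, b); }

    // Class-specific deallocation function. Because the base destructor is
    // virtual, "delete" through a WeightBase* ends up here for FooWeight
    // objects, so special handling can be added before the memory is freed.
    static void operator delete(void* p) {
        ++foo_weight_deletes;
        ::operator delete(p);
    }
};

// Mimics the library deallocating a cloned object with "delete" after use.
inline void release(WeightBase* w) { delete w; }
```

A wrapper for another language could, under the same assumption, replace the body of operator delete with whatever its runtime needs, such as handing the object back to a reference-counted owner.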

Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.

Referenced by LocalSubMatch::get_postlist_and_term_info(), LocalSubMatch::make_synonym_postlist(), LocalSubMatch::postlist_from_op_leaf_query(), and Xapian::Enquire::set_weighting_scheme().

Xapian::doclength Xapian::Weight::get_average_length (  )  const [inline, protected]

The average length of a document in the collection.

Definition at line 283 of file weight.h.

Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().

Xapian::doccount Xapian::Weight::get_collection_size (  )  const [inline, protected]

The number of documents in the collection.

Definition at line 277 of file weight.h.

Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().

Xapian::termcount Xapian::Weight::get_doclength_lower_bound (  )  const [inline, protected]

A lower bound on the minimum length of any document in the database.

This bound does not include any zero-length documents.

This should only be used by get_maxpart() and get_maxextra().

Definition at line 311 of file weight.h.

Referenced by Xapian::BM25Weight::get_maxextra(), Xapian::TradWeight::get_maxpart(), Xapian::BM25Weight::get_maxpart(), and CheckStatsWeight::get_maxpart().

Xapian::termcount Xapian::Weight::get_doclength_upper_bound (  )  const [inline, protected]

An upper bound on the maximum length of any document in the database.

This should only be used by get_maxpart() and get_maxextra().

Definition at line 301 of file weight.h.

Referenced by CheckStatsWeight::get_maxpart().

virtual Xapian::weight Xapian::Weight::get_maxextra (  )  const [pure virtual]

Return an upper bound on what get_sumextra() can return for any document.

This information is used by the matcher to perform various optimisations, so strive to make the bound as tight as possible.

Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.

virtual Xapian::weight Xapian::Weight::get_maxpart (  )  const [pure virtual]

Return an upper bound on what get_sumpart() can return for any document.

This information is used by the matcher to perform various optimisations, so strive to make the bound as tight as possible.
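One way to obtain a tight bound is to evaluate the weighting formula at the most favourable statistics. The sketch below assumes a hypothetical saturating formula (not BM25Weight or TradWeight) that increases with wdf and decreases with document length; the free functions sumpart() and maxpart() are illustrative stand-ins for get_sumpart() and get_maxpart(), and their arguments stand in for get_wdf_upper_bound(), get_doclength_lower_bound() and get_average_length().

```cpp
#include <cassert>

// Hypothetical term weight: rises with wdf, falls as the document gets
// longer relative to the average length. Not a formula used by Xapian.
inline double sumpart(unsigned wdf, unsigned doclen, double avg_len) {
    assert(avg_len > 0.0 && doclen > 0);
    double norm_len = doclen / avg_len;
    return wdf / (wdf + norm_len);
}

// Because sumpart() is increasing in wdf and decreasing in doclen, its
// largest possible value is reached at the wdf upper bound combined with
// the document length lower bound.
inline double maxpart(unsigned wdf_upper, unsigned doclen_lower,
                      double avg_len) {
    return sumpart(wdf_upper, doclen_lower, avg_len);
}
```

For a formula that is not monotonic in its inputs, this shortcut would not be valid and the maximum would have to be derived some other way.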

Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.

Referenced by SynonymPostList::get_maxweight(), and LeafPostList::get_maxweight().

Xapian::termcount Xapian::Weight::get_query_length (  )  const [inline, protected]

The length of the query.

Definition at line 292 of file weight.h.

Referenced by Xapian::BM25Weight::get_maxextra(), Xapian::BM25Weight::get_sumextra(), and CheckStatsWeight::get_sumpart().

Xapian::doccount Xapian::Weight::get_reltermfreq (  )  const [inline, protected]

The number of relevant documents which this term indexes.

Definition at line 289 of file weight.h.

Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().

Xapian::doccount Xapian::Weight::get_rset_size (  )  const [inline, protected]

The number of documents marked as relevant.

Definition at line 280 of file weight.h.

Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().

virtual Xapian::weight Xapian::Weight::get_sumextra ( Xapian::termcount  doclen  )  const [pure virtual]

Calculate the term-independent weight component for a document.

The parameter gives information about the document which may be used in the calculations:

Parameters:
doclen The document's length (unnormalised).

Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.

Referenced by ExtraWeightPostList::get_weight().

virtual Xapian::weight Xapian::Weight::get_sumpart ( Xapian::termcount  wdf,
Xapian::termcount  doclen 
) const [pure virtual]

Calculate the weight contribution for this object's term to a document.

The parameters give information about the document which may be used in the calculations:

Parameters:
wdf The within-document frequency of the term in the document.
doclen The document's length (unnormalised).

Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.

Referenced by SynonymPostList::get_weight(), and LeafPostList::get_weight().

bool Xapian::Weight::get_sumpart_needs_doclength_ (  )  const [inline, private]

For internal use only.

Return true if the document length is needed.

If this method returns true, then the document length will be fetched and passed to get_sumpart(). Otherwise 0 may be passed for the document length.

Definition at line 252 of file weight.h.

Referenced by LeafPostList::set_termweight(), and SynonymPostList::set_weight().

bool Xapian::Weight::get_sumpart_needs_wdf_ (  )  const [inline, private]

For internal use only.

Return true if the WDF is needed.

If this method returns true, then the WDF will be fetched and passed to get_sumpart(). Otherwise 0 may be passed for the wdf.

Definition at line 261 of file weight.h.

Referenced by SynonymPostList::set_weight().

Xapian::doccount Xapian::Weight::get_termfreq (  )  const [inline, protected]

The number of documents which this term indexes.

Definition at line 286 of file weight.h.

Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().

Xapian::termcount Xapian::Weight::get_wdf_upper_bound (  )  const [inline, protected]

An upper bound on the wdf of this term.

This should only be used by get_maxpart() and get_maxextra().

Definition at line 319 of file weight.h.

Referenced by Xapian::TradWeight::get_maxpart(), Xapian::BM25Weight::get_maxpart(), and CheckStatsWeight::get_maxpart().

Xapian::termcount Xapian::Weight::get_wqf (  )  const [inline, protected]

The within-query-frequency of this term.

Definition at line 295 of file weight.h.

Referenced by CheckStatsWeight::get_sumpart(), and Xapian::BM25Weight::init().

virtual void Xapian::Weight::init ( double  factor  )  [protected, pure virtual]

Allow the subclass to perform any initialisation it needs to.

Parameters:
factor Any scaling factor (e.g. from OP_SCALE_WEIGHT). If the Weight object is for the term-independent weight supplied by get_sumextra()/get_maxextra(), then init(0.0) is called. This applies from Xapian 1.2.11 and 1.3.1 onwards; earlier versions failed to call init() for such Weight objects.
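The following sketch shows one way a subclass might use init() to precompute a per-term value and to handle the init(0.0) case. It is a self-contained stand-in rather than a real Xapian::Weight subclass: FooWeightSketch is hypothetical, the idf-style formula is an assumption chosen for illustration, and the constructor arguments stand in for the values that get_collection_size() and get_termfreq() would supply.

```cpp
#include <cmath>

class FooWeightSketch {
    unsigned collection_size;  // stand-in for get_collection_size()
    unsigned termfreq;         // stand-in for get_termfreq()
    double termweight;         // value precomputed by init()

  public:
    FooWeightSketch(unsigned collection_size_, unsigned termfreq_)
        : collection_size(collection_size_), termfreq(termfreq_),
          termweight(0.0) {}

    // Precompute everything that does not depend on the document.
    void init(double factor) {
        if (factor == 0.0 || termfreq == 0) {
            // factor 0.0 marks the object used for the term-independent
            // weight, so this sketch avoids touching per-term statistics.
            termweight = 0.0;
            return;
        }
        // Hypothetical idf-style weight, scaled by the supplied factor.
        termweight = factor * std::log(double(collection_size) / termfreq);
    }

    // Per-document work is then a single multiplication.
    double get_sumpart(unsigned wdf, unsigned /*doclen*/) const {
        return wdf * termweight;
    }
};
```

Doing the logarithm once in init() rather than in every get_sumpart() call is the main point of the design: get_sumpart() runs once per candidate document, init() once per term.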

Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.

Referenced by init_().

void Xapian::Weight::init_ ( const Internal &  stats,
Xapian::termcount  query_len_ 
) [private]

void Xapian::Weight::init_ ( const Internal &  stats,
Xapian::termcount  query_len_,
double  factor,
Xapian::doccount  termfreq,
Xapian::doccount  reltermfreq 
) [private]

For internal use only.

Initialise this object to calculate weights for a synonym.

Parameters:
stats Source of statistics.
query_len_ Query length.
factor Any scaling factor (e.g. from OP_SCALE_WEIGHT).
termfreq The termfreq to use.
reltermfreq The reltermfreq to use.

Definition at line 81 of file weight.cc.

References AVERAGE_LENGTH, average_length_, Xapian::Weight::Internal::collection_size, collection_size_, Xapian::Weight::Internal::db, DOC_LENGTH_MAX, DOC_LENGTH_MIN, doclength_lower_bound_, doclength_upper_bound_, Xapian::Weight::Internal::get_average_length(), Xapian::Database::get_doclength_lower_bound(), Xapian::Database::get_doclength_upper_bound(), init(), LOGCALL_VOID, query_length_, reltermfreq_, Xapian::Weight::Internal::rset_size, rset_size_, stats_needed, termfreq_, WDF_MAX, wdf_upper_bound_, and wqf_.

void Xapian::Weight::init_ ( const Internal &  stats,
Xapian::termcount  query_len_,
const std::string &  term,
Xapian::termcount  wqf_,
double  factor 
) [private]

std::string Xapian::Weight::name (  )  const [virtual]

Return the name of this weighting scheme.

This name is used by the remote backend. It is passed along with the serialised parameters to the remote server so that it knows which class to create.

Return the full namespace-qualified name of your class here. If your class is called FooWeight, return "FooWeight" from this method (Xapian::BM25Weight returns "Xapian::BM25Weight" here).

If you don't want to support the remote backend, you can use the default implementation which simply returns an empty string.

Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.

Definition at line 115 of file weight.cc.

Referenced by Xapian::Registry::Internal::add_defaults(), DEFINE_TESTCASE(), Xapian::Registry::register_weighting_scheme(), and RemoteDatabase::set_query().

void Xapian::Weight::need_stat ( stat_flags  flag  )  [inline, protected]

Tell Xapian that your subclass will want a particular statistic.

Some of the statistics can be costly to fetch or calculate, so Xapian needs to know which are actually going to be used. You should call need_stat() from your constructor for each such statistic.

Parameters:
flag The stat_flags value for a required statistic.
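The sketch below illustrates requesting statistics from a subclass constructor. WeightBase is a stand-in for Xapian::Weight, not the real class: the flag values are copied from the stat_flags enumeration on this page, but the body of need_stat() (OR-ing the flag into the stats_needed bitmask) and the wants() helper are assumptions made for illustration.

```cpp
// Flag values copied from the stat_flags enumeration documented above.
enum stat_flags {
    COLLECTION_SIZE = 1, RSET_SIZE = 2, AVERAGE_LENGTH = 4, TERMFREQ = 8,
    RELTERMFREQ = 16, QUERY_LENGTH = 32, WQF = 64, WDF = 128,
    DOC_LENGTH = 256, DOC_LENGTH_MIN = 512, DOC_LENGTH_MAX = 1024,
    WDF_MAX = 2048
};

// Stand-in base class; only the bitmask handling is modelled.
class WeightBase {
    stat_flags stats_needed;

  protected:
    WeightBase() : stats_needed(stat_flags(0)) {}

    // Assumed implementation: record the flag in the bitmask.
    void need_stat(stat_flags flag) {
        stats_needed = stat_flags(stats_needed | flag);
    }

  public:
    virtual ~WeightBase() {}

    // Hypothetical helper so the recorded flags can be inspected.
    bool wants(stat_flags flag) const { return (stats_needed & flag) != 0; }
};

class FooWeight : public WeightBase {
  public:
    FooWeight() {
        // One call per statistic the weighting formula will read.
        need_stat(WDF);
        need_stat(DOC_LENGTH);
        need_stat(AVERAGE_LENGTH);
    }
};
```

Requesting only the statistics the formula actually reads matters because, as noted above, some of them are costly to fetch or calculate.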

Definition at line 60 of file weight.h.

Referenced by CheckStatsWeight::CheckStatsWeight().

void Xapian::Weight::operator= ( const Weight &  )  [private]

Don't allow assignment.

std::string Xapian::Weight::serialise (  )  const [virtual]

Return this object's parameters serialised as a single string.

If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.

Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, and MyWeight.

Definition at line 121 of file weight.cc.

Referenced by RemoteDatabase::set_query().

Weight * Xapian::Weight::unserialise ( const std::string &  s  )  const [virtual]

Unserialise parameters.

This method unserialises parameters serialised by the serialise() method and allocates and returns a new object initialised with them.

If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.

Note that the returned object will be deallocated by Xapian after use with "delete". If you want to handle the deletion in a special way (for example when wrapping the Xapian API for use from another language) then you can define a static operator delete method in your subclass as shown here: http://trac.xapian.org/ticket/554#comment:1

Parameters:
s A string containing the serialised parameters.
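A round trip through serialise() and unserialise() might look like the sketch below. It is a self-contained stand-in, not a real Xapian::Weight subclass: FooWeight with parameters a and b follows the naming used for clone() above, the plain-text encoding and the std::invalid_argument exception are assumptions chosen to keep the example free of library dependencies, and get_a()/get_b() are hypothetical accessors.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

class FooWeight {
    double a, b;

  public:
    FooWeight(double a_, double b_) : a(a_), b(b_) {}

    // Name sent alongside the serialised parameters.
    std::string name() const { return "FooWeight"; }

    // Encode the parameters as text; 17 significant digits is enough for
    // a double to survive the round trip unchanged.
    std::string serialise() const {
        std::ostringstream out;
        out.precision(17);
        out << a << ' ' << b;
        return out.str();
    }

    // Decode the parameters and allocate a new object initialised with
    // them. The caller takes ownership and frees it with "delete".
    FooWeight* unserialise(const std::string& s) const {
        std::istringstream in(s);
        double a_, b_;
        if (!(in >> a_ >> b_)) {
            throw std::invalid_argument("bad FooWeight parameters");
        }
        return new FooWeight(a_, b_);
    }

    double get_a() const { return a; }
    double get_b() const { return b; }
};
```

The existing object's own parameters play no part in unserialise(); it acts only as a factory for the class identified by name().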

Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, and MyWeight.

Definition at line 127 of file weight.cc.

Referenced by DEFINE_TESTCASE(), and RemoteServer::msg_query().


Member Data Documentation

Xapian::doclength Xapian::Weight::average_length_ [private]

The average length of a document in the collection.

Definition at line 89 of file weight.h.

Referenced by init_().

Xapian::doccount Xapian::Weight::collection_size_ [private]

The number of documents in the collection.

Definition at line 83 of file weight.h.

Referenced by init_().

Xapian::termcount Xapian::Weight::doclength_lower_bound_ [private]

A lower bound on the minimum length of any document in the database.

Definition at line 104 of file weight.h.

Referenced by init_().

Xapian::termcount Xapian::Weight::doclength_upper_bound_ [private]

An upper bound on the maximum length of any document in the database.

Definition at line 107 of file weight.h.

Referenced by init_().

Xapian::termcount Xapian::Weight::query_length_ [private]

The length of the query.

Definition at line 98 of file weight.h.

Referenced by init_().

Xapian::doccount Xapian::Weight::reltermfreq_ [private]

The number of relevant documents which this term indexes.

Definition at line 95 of file weight.h.

Referenced by init_().

Xapian::doccount Xapian::Weight::rset_size_ [private]

The number of documents marked as relevant.

Definition at line 86 of file weight.h.

Referenced by init_().

stat_flags Xapian::Weight::stats_needed [private]

A bitmask of the statistics this weighting scheme needs.

Definition at line 80 of file weight.h.

Referenced by init_().

Xapian::doccount Xapian::Weight::termfreq_ [private]

The number of documents which this term indexes.

Definition at line 92 of file weight.h.

Referenced by init_().

Xapian::termcount Xapian::Weight::wdf_upper_bound_ [private]

An upper bound on the wdf of this term.

Definition at line 110 of file weight.h.

Referenced by init_().

Xapian::termcount Xapian::Weight::wqf_ [private]

The within-query-frequency of this term.

Definition at line 101 of file weight.h.

Referenced by init_().


The documentation for this class was generated from the following files:

weight.h
weight.cc

Documentation for Xapian (version 1.2.13).
Generated on 9 Jan 2013 by Doxygen 1.5.9.