#include <weight.h>

Classes | |
| class | Internal |
| Class to hold statistics for a given collection. More... | |
Public Member Functions | |
| virtual | ~Weight () |
| Virtual destructor, because we have virtual methods. | |
| virtual Weight * | clone () const =0 |
| Clone this object. | |
| virtual std::string | name () const |
| Return the name of this weighting scheme. | |
| virtual std::string | serialise () const |
| Return this object's parameters serialised as a single string. | |
| virtual Weight * | unserialise (const std::string &s) const |
| Unserialise parameters. | |
| virtual Xapian::weight | get_sumpart (Xapian::termcount wdf, Xapian::termcount doclen) const =0 |
| Calculate the weight contribution for this object's term to a document. | |
| virtual Xapian::weight | get_maxpart () const =0 |
| Return an upper bound on what get_sumpart() can return for any document. | |
| virtual Xapian::weight | get_sumextra (Xapian::termcount doclen) const =0 |
| Calculate the term-independent weight component for a document. | |
| virtual Xapian::weight | get_maxextra () const =0 |
| Return an upper bound on what get_sumextra() can return for any document. | |
Protected Types | |
| enum | stat_flags { COLLECTION_SIZE = 1, RSET_SIZE = 2, AVERAGE_LENGTH = 4, TERMFREQ = 8, RELTERMFREQ = 16, QUERY_LENGTH = 32, WQF = 64, WDF = 128, DOC_LENGTH = 256, DOC_LENGTH_MIN = 512, DOC_LENGTH_MAX = 1024, WDF_MAX = 2048 } |
| Stats which the weighting scheme can use (see need_stat()). More... | |
Protected Member Functions | |
| void | need_stat (stat_flags flag) |
| Tell Xapian that your subclass will want a particular statistic. | |
| virtual void | init (double factor)=0 |
| Allow the subclass to perform any initialisation it needs to. | |
| Weight (const Weight &) | |
| Don't allow copying. | |
| Weight () | |
| Default constructor, needed by subclass constructors. | |
| Xapian::doccount | get_collection_size () const |
| The number of documents in the collection. | |
| Xapian::doccount | get_rset_size () const |
| The number of documents marked as relevant. | |
| Xapian::doclength | get_average_length () const |
| The average length of a document in the collection. | |
| Xapian::doccount | get_termfreq () const |
| The number of documents which this term indexes. | |
| Xapian::doccount | get_reltermfreq () const |
| The number of relevant documents which this term indexes. | |
| Xapian::termcount | get_query_length () const |
| The length of the query. | |
| Xapian::termcount | get_wqf () const |
| The within-query-frequency of this term. | |
| Xapian::termcount | get_doclength_upper_bound () const |
| An upper bound on the maximum length of any document in the database. | |
| Xapian::termcount | get_doclength_lower_bound () const |
| A lower bound on the minimum length of any document in the database. | |
| Xapian::termcount | get_wdf_upper_bound () const |
| An upper bound on the wdf of this term. | |
Private Member Functions | |
| void | operator= (const Weight &) |
| Don't allow assignment. | |
| void | init_ (const Internal &stats, Xapian::termcount query_len_, const std::string &term, Xapian::termcount wqf_, double factor) |
| void | init_ (const Internal &stats, Xapian::termcount query_len_, double factor, Xapian::doccount termfreq, Xapian::doccount reltermfreq) |
| void | init_ (const Internal &stats, Xapian::termcount query_len_) |
| bool | get_sumpart_needs_doclength_ () const |
| bool | get_sumpart_needs_wdf_ () const |
Private Attributes | |
| stat_flags | stats_needed |
| A bitmask of the statistics this weighting scheme needs. | |
| Xapian::doccount | collection_size_ |
| The number of documents in the collection. | |
| Xapian::doccount | rset_size_ |
| The number of documents marked as relevant. | |
| Xapian::doclength | average_length_ |
| The average length of a document in the collection. | |
| Xapian::doccount | termfreq_ |
| The number of documents which this term indexes. | |
| Xapian::doccount | reltermfreq_ |
| The number of relevant documents which this term indexes. | |
| Xapian::termcount | query_length_ |
| The length of the query. | |
| Xapian::termcount | wqf_ |
| The within-query-frequency of this term. | |
| Xapian::termcount | doclength_lower_bound_ |
| A lower bound on the minimum length of any document in the database. | |
| Xapian::termcount | doclength_upper_bound_ |
| An upper bound on the maximum length of any document in the database. | |
| Xapian::termcount | wdf_upper_bound_ |
| An upper bound on the wdf of this term. | |
Definition at line 33 of file weight.h.
enum Xapian::Weight::stat_flags [protected] |
Stats which the weighting scheme can use (see need_stat()).
| Xapian::Weight::~Weight | ( | ) | [virtual] |
| Xapian::Weight::Weight | ( | const Weight & | ) | [protected] |
Don't allow copying.
This would ideally be private, but that causes a compilation error with GCC 4.1 (which appears to be a bug).
| Xapian::Weight::Weight | ( | ) | [inline, protected] |
| virtual Weight* Xapian::Weight::clone | ( | ) | const [pure virtual] |
Clone this object.
This method allocates and returns a copy of the object it is called on.
If your subclass is called FooWeight and has parameters a and b, then you would implement FooWeight::clone() like so:
FooWeight * FooWeight::clone() const { return new FooWeight(a, b); }
Note that the returned object will be deallocated by Xapian after use with "delete". It must therefore have been allocated with "new".
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
Referenced by LocalSubMatch::get_postlist_and_term_info(), LocalSubMatch::make_synonym_postlist(), LocalSubMatch::postlist_from_op_leaf_query(), and Xapian::Enquire::set_weighting_scheme().
| Xapian::doclength Xapian::Weight::get_average_length | ( | ) | const [inline, protected] |
The average length of a document in the collection.
Definition at line 270 of file weight.h.
Referenced by Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::doccount Xapian::Weight::get_collection_size | ( | ) | const [inline, protected] |
The number of documents in the collection.
Definition at line 264 of file weight.h.
Referenced by Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::termcount Xapian::Weight::get_doclength_lower_bound | ( | ) | const [inline, protected] |
A lower bound on the minimum length of any document in the database.
This should only be used by get_maxpart() and get_maxextra().
Definition at line 296 of file weight.h.
Referenced by Xapian::BM25Weight::get_maxextra(), Xapian::TradWeight::get_maxpart(), and Xapian::BM25Weight::get_maxpart().
| Xapian::termcount Xapian::Weight::get_doclength_upper_bound | ( | ) | const [inline, protected] |
An upper bound on the maximum length of any document in the database.
This should only be used by get_maxpart() and get_maxextra().
| virtual Xapian::weight Xapian::Weight::get_maxextra | ( | ) | const [pure virtual] |
Return an upper bound on what get_sumextra() can return for any document.
This information is used by the matcher to perform various optimisations, so strive to make the bound as tight as possible.
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
| virtual Xapian::weight Xapian::Weight::get_maxpart | ( | ) | const [pure virtual] |
Return an upper bound on what get_sumpart() can return for any document.
This information is used by the matcher to perform various optimisations, so strive to make the bound as tight as possible.
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
Referenced by SynonymPostList::get_maxweight(), and LeafPostList::get_maxweight().
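As an illustration of a tight bound, consider a formula that only grows with wdf: evaluating it at the largest possible wdf gives a bound that some document could actually reach. The sketch below is self-contained and does not use the Xapian headers; SaturatingTf, its members and the formula factor * wdf / (wdf + k) are hypothetical, and it assumes factor >= 0 and k > 0.

```cpp
#include <cstdint>

// Hypothetical saturating formula: factor * wdf / (wdf + k).
// Assumes factor >= 0 and k > 0, so the value never decreases as wdf grows.
struct SaturatingTf {
    double factor;
    double k;
    std::uint32_t wdf_upper_bound;  // stands in for get_wdf_upper_bound()

    // Per-document contribution, playing the role of get_sumpart().
    double sumpart(std::uint32_t wdf) const {
        return factor * wdf / (wdf + k);
    }

    // The formula peaks at the largest possible wdf, so evaluating it there
    // gives the tightest bound, playing the role of get_maxpart().
    double maxpart() const { return sumpart(wdf_upper_bound); }
};
```

Returning factor alone would also be a valid bound for this formula, but it is looser, which leaves the matcher less room to optimise.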
| Xapian::termcount Xapian::Weight::get_query_length | ( | ) | const [inline, protected] |
The length of the query.
Definition at line 279 of file weight.h.
Referenced by Xapian::BM25Weight::get_maxextra(), and Xapian::BM25Weight::get_sumextra().
| Xapian::doccount Xapian::Weight::get_reltermfreq | ( | ) | const [inline, protected] |
The number of relevant documents which this term indexes.
Definition at line 276 of file weight.h.
Referenced by Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::doccount Xapian::Weight::get_rset_size | ( | ) | const [inline, protected] |
The number of documents marked as relevant.
Definition at line 267 of file weight.h.
Referenced by Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| virtual Xapian::weight Xapian::Weight::get_sumextra | ( | Xapian::termcount | doclen | ) | const [pure virtual] |
Calculate the term-independent weight component for a document.
The parameter gives information about the document which may be used in the calculations:
| doclen | The document's length (unnormalised). |
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
Referenced by ExtraWeightPostList::get_weight().
| virtual Xapian::weight Xapian::Weight::get_sumpart | ( | Xapian::termcount | wdf, | |
| Xapian::termcount | doclen | |||
| ) | const [pure virtual] |
Calculate the weight contribution for this object's term to a document.
The parameters give information about the document which may be used in the calculations:
| wdf | The within document frequency of the term in the document. | |
| doclen | The document's length (unnormalised). |
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
Referenced by SynonymPostList::get_weight(), and LeafPostList::get_weight().
| bool Xapian::Weight::get_sumpart_needs_doclength_ | ( | ) | const [inline, private] |
For internal use only.
Return true if the document length is needed.
If this method returns true, then the document length will be fetched and passed to get_sumpart(). Otherwise 0 may be passed for the document length.
Definition at line 239 of file weight.h.
Referenced by LeafPostList::set_termweight(), and SynonymPostList::set_weight().
| bool Xapian::Weight::get_sumpart_needs_wdf_ | ( | ) | const [inline, private] |
For internal use only.
Return true if the WDF is needed.
If this method returns true, then the WDF will be fetched and passed to get_sumpart(). Otherwise 0 may be passed for the wdf.
Definition at line 248 of file weight.h.
Referenced by SynonymPostList::set_weight().
| Xapian::doccount Xapian::Weight::get_termfreq | ( | ) | const [inline, protected] |
The number of documents which this term indexes.
Definition at line 273 of file weight.h.
Referenced by Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::termcount Xapian::Weight::get_wdf_upper_bound | ( | ) | const [inline, protected] |
An upper bound on the wdf of this term.
This should only be used by get_maxpart() and get_maxextra().
Definition at line 304 of file weight.h.
Referenced by Xapian::TradWeight::get_maxpart(), and Xapian::BM25Weight::get_maxpart().
| Xapian::termcount Xapian::Weight::get_wqf | ( | ) | const [inline, protected] |
The within-query-frequency of this term.
Definition at line 282 of file weight.h.
Referenced by Xapian::BM25Weight::init().
| virtual void Xapian::Weight::init | ( | double | factor | ) | [protected, pure virtual] |
Allow the subclass to perform any initialisation it needs to.
| factor | Any scaling factor (e.g. from OP_SCALE_WEIGHT). |
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
Referenced by init_().
| void Xapian::Weight::init_ | ( | const Internal & | stats, | |
| Xapian::termcount | query_len_ | |||
| ) | [private] |
For internal use only.
Initialise this object to calculate the extra weight component.
| stats | Source of statistics. | |
| query_len_ | Query length. |
Definition at line 37 of file weight.cc.
References AVERAGE_LENGTH, average_length_, Xapian::Weight::Internal::collection_size, collection_size_, Xapian::Weight::Internal::db, DOC_LENGTH_MAX, DOC_LENGTH_MIN, doclength_lower_bound_, doclength_upper_bound_, Xapian::Weight::Internal::get_average_length(), Xapian::Database::get_doclength_lower_bound(), Xapian::Database::get_doclength_upper_bound(), LOGCALL_VOID, query_length_, reltermfreq_, Xapian::Weight::Internal::rset_size, rset_size_, stats_needed, termfreq_, wdf_upper_bound_, and wqf_.
| void Xapian::Weight::init_ | ( | const Internal & | stats, | |
| Xapian::termcount | query_len_, | |||
| double | factor, | |||
| Xapian::doccount | termfreq, | |||
| Xapian::doccount | reltermfreq | |||
| ) | [private] |
For internal use only.
Initialise this object to calculate weights for a synonym.
| stats | Source of statistics. | |
| query_len_ | Query length. | |
| factor | Any scaling factor (e.g. from OP_SCALE_WEIGHT). | |
| termfreq | The termfreq to use. | |
| reltermfreq | The reltermfreq to use. |
Definition at line 80 of file weight.cc.
References AVERAGE_LENGTH, average_length_, Xapian::Weight::Internal::collection_size, collection_size_, Xapian::Weight::Internal::db, DOC_LENGTH_MAX, DOC_LENGTH_MIN, doclength_lower_bound_, doclength_upper_bound_, Xapian::Weight::Internal::get_average_length(), Xapian::Database::get_doclength_lower_bound(), Xapian::Database::get_doclength_upper_bound(), init(), LOGCALL_VOID, query_length_, reltermfreq_, Xapian::Weight::Internal::rset_size, rset_size_, stats_needed, termfreq_, WDF_MAX, wdf_upper_bound_, and wqf_.
| void Xapian::Weight::init_ | ( | const Internal & | stats, | |
| Xapian::termcount | query_len_, | |||
| const std::string & | term, | |||
| Xapian::termcount | wqf_, | |||
| double | factor | |||
| ) | [private] |
For internal use only.
Initialise this object to calculate weights for the given term.
| stats | Source of statistics. | |
| query_len_ | Query length. | |
| term | The term for the new object. | |
| wqf_ | The within-query-frequency of term. | |
| factor | Any scaling factor (e.g. from OP_SCALE_WEIGHT). |
Definition at line 56 of file weight.cc.
References AVERAGE_LENGTH, average_length_, Xapian::Weight::Internal::collection_size, collection_size_, Xapian::Weight::Internal::db, DOC_LENGTH_MAX, DOC_LENGTH_MIN, doclength_lower_bound_, doclength_upper_bound_, Xapian::Weight::Internal::get_average_length(), Xapian::Database::get_doclength_lower_bound(), Xapian::Database::get_doclength_upper_bound(), Xapian::Weight::Internal::get_reltermfreq(), Xapian::Weight::Internal::get_termfreq(), Xapian::Database::get_wdf_upper_bound(), init(), LOGCALL_VOID, query_length_, RELTERMFREQ, reltermfreq_, Xapian::Weight::Internal::rset_size, rset_size_, stats_needed, TERMFREQ, termfreq_, WDF_MAX, wdf_upper_bound_, and wqf_.
| string Xapian::Weight::name | ( | ) | const [virtual] |
Return the name of this weighting scheme.
This name is used by the remote backend. It is passed along with the serialised parameters to the remote server so that it knows which class to create.
Return the full namespace-qualified name of your class here - if your class is called FooWeight, return "FooWeight" from this method (Xapian::BM25Weight returns "Xapian::BM25Weight" here).
If you don't want to support the remote backend, you can use the default implementation which simply returns an empty string.
Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
Definition at line 114 of file weight.cc.
Referenced by Xapian::Registry::Internal::add_defaults(), DEFINE_TESTCASE(), Xapian::Registry::register_weighting_scheme(), and RemoteDatabase::set_query().
| void Xapian::Weight::need_stat | ( | stat_flags | flag | ) | [inline, protected] |
Tell Xapian that your subclass will want a particular statistic.
Some of the statistics can be costly to fetch or calculate, so Xapian needs to know which are actually going to be used. You should call need_stat() from your constructor for each such statistic.
| flag | The stat_flags value for a required statistic. |
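The stat_flags values are distinct powers of two, so repeated requests can accumulate in a single bitmask. The sketch below mimics that bookkeeping without the Xapian headers. StatRequest, needs() and make_length_normalised_request() are hypothetical names; in the real class the stats_needed member is private and maintained by Xapian::Weight itself.

```cpp
#include <cstdint>

// Flag values copied from the stat_flags enum documented on this page.
enum stat_flags : std::uint32_t {
    COLLECTION_SIZE = 1, RSET_SIZE = 2, AVERAGE_LENGTH = 4, TERMFREQ = 8,
    RELTERMFREQ = 16, QUERY_LENGTH = 32, WQF = 64, WDF = 128,
    DOC_LENGTH = 256, DOC_LENGTH_MIN = 512, DOC_LENGTH_MAX = 1024,
    WDF_MAX = 2048
};

// Hypothetical stand-in for the bookkeeping behind need_stat().
class StatRequest {
    std::uint32_t stats_needed = 0;

  public:
    // Record that one more statistic is wanted.
    void need_stat(stat_flags flag) { stats_needed |= flag; }

    // Report whether a statistic has been requested.
    bool needs(stat_flags flag) const { return (stats_needed & flag) != 0; }
};

// What the constructor of a length-normalised scheme might request.
inline StatRequest make_length_normalised_request() {
    StatRequest r;
    r.need_stat(AVERAGE_LENGTH);
    r.need_stat(WDF);
    r.need_stat(DOC_LENGTH);
    r.need_stat(WDF_MAX);
    return r;
}
```

In a real subclass the same four need_stat() calls would sit directly in the constructor body, and statistics that were never requested should not be read afterwards.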
| void Xapian::Weight::operator= | ( | const Weight & | ) | [private] |
Don't allow assignment.
| string Xapian::Weight::serialise | ( | ) | const [virtual] |
Return this object's parameters serialised as a single string.
If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.
Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, and MyWeight.
Definition at line 120 of file weight.cc.
Referenced by RemoteDatabase::set_query().
| Weight * Xapian::Weight::unserialise | ( | const std::string & | s | ) | const [virtual] |
Unserialise parameters.
This method unserialises parameters serialised by the serialise() method and allocates and returns a new object initialised with them.
If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.
Note that the returned object will be deallocated by Xapian after use with "delete". It must therefore have been allocated with "new".
| s | A string containing the serialised parameters. |
Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, and MyWeight.
Definition at line 126 of file weight.cc.
Referenced by DEFINE_TESTCASE(), and RemoteServer::msg_query().
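The round trip between serialise() and unserialise() can be sketched for a scheme with two parameters a and b, as in the FooWeight example used for clone(). This is a self-contained illustration rather than a real subclass: FooParams is a hypothetical stand-in, and the space-separated text encoding is an assumption chosen for readability, not the encoding used by Xapian's built-in schemes.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical parameter holder; a real scheme would subclass Xapian::Weight.
struct FooParams {
    double a;
    double b;

    // Pack both parameters into a single string.
    std::string serialise() const {
        std::ostringstream out;
        out.precision(17);  // enough digits to round-trip a double
        out << a << ' ' << b;
        return out.str();
    }

    // Rebuild an object from the string produced by serialise().
    // Allocated with new because the caller releases it with delete.
    static FooParams* unserialise(const std::string& s) {
        std::istringstream in(s);
        double a = 0.0;
        double b = 0.0;
        if (!(in >> a >> b)) {
            throw std::invalid_argument("bad FooParams string");
        }
        return new FooParams{a, b};
    }
};
```

Whatever encoding is chosen, unserialise() has to accept exactly what serialise() emits, because the two ends of a remote connection only share the string and the value returned by name().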
Xapian::doccount Xapian::Weight::reltermfreq_ [private] |
Xapian::doccount Xapian::Weight::rset_size_ [private] |
stat_flags Xapian::Weight::stats_needed [private] |
Xapian::doccount Xapian::Weight::termfreq_ [private] |
Xapian::termcount Xapian::Weight::wqf_ [private] |