#include <weight.h>

Classes | |
| class | Internal |
| Class to hold statistics for a given collection. More... | |
Public Member Functions | |
| virtual | ~Weight () |
| Virtual destructor, because we have virtual methods. | |
| virtual Weight * | clone () const =0 |
| Clone this object. | |
| virtual std::string | name () const |
| Return the name of this weighting scheme. | |
| virtual std::string | serialise () const |
| Return this object's parameters serialised as a single string. | |
| virtual Weight * | unserialise (const std::string &s) const |
| Unserialise parameters. | |
| virtual Xapian::weight | get_sumpart (Xapian::termcount wdf, Xapian::termcount doclen) const =0 |
| Calculate the weight contribution for this object's term to a document. | |
| virtual Xapian::weight | get_maxpart () const =0 |
| Return an upper bound on what get_sumpart() can return for any document. | |
| virtual Xapian::weight | get_sumextra (Xapian::termcount doclen) const =0 |
| Calculate the term-independent weight component for a document. | |
| virtual Xapian::weight | get_maxextra () const =0 |
| Return an upper bound on what get_sumextra() can return for any document. | |
Protected Types | |
| enum | stat_flags { COLLECTION_SIZE = 1, RSET_SIZE = 2, AVERAGE_LENGTH = 4, TERMFREQ = 8, RELTERMFREQ = 16, QUERY_LENGTH = 32, WQF = 64, WDF = 128, DOC_LENGTH = 256, DOC_LENGTH_MIN = 512, DOC_LENGTH_MAX = 1024, WDF_MAX = 2048 } |
| Stats which the weighting scheme can use (see need_stat()). More... | |
Protected Member Functions | |
| void | need_stat (stat_flags flag) |
| Tell Xapian that your subclass will want a particular statistic. | |
| virtual void | init (double factor)=0 |
| Allow the subclass to perform any initialisation it needs to. | |
| Weight (const Weight &) | |
| Don't allow copying. | |
| Weight () | |
| Default constructor, needed by subclass constructors. | |
| Xapian::doccount | get_collection_size () const |
| The number of documents in the collection. | |
| Xapian::doccount | get_rset_size () const |
| The number of documents marked as relevant. | |
| Xapian::doclength | get_average_length () const |
| The average length of a document in the collection. | |
| Xapian::doccount | get_termfreq () const |
| The number of documents which this term indexes. | |
| Xapian::doccount | get_reltermfreq () const |
| The number of relevant documents which this term indexes. | |
| Xapian::termcount | get_query_length () const |
| The length of the query. | |
| Xapian::termcount | get_wqf () const |
| The within-query-frequency of this term. | |
| Xapian::termcount | get_doclength_upper_bound () const |
| An upper bound on the maximum length of any document in the database. | |
| Xapian::termcount | get_doclength_lower_bound () const |
| A lower bound on the minimum length of any document in the database. | |
| Xapian::termcount | get_wdf_upper_bound () const |
| An upper bound on the wdf of this term. | |
Private Member Functions | |
| void | operator= (const Weight &) |
| Don't allow assignment. | |
| void | init_ (const Internal &stats, Xapian::termcount query_len_, const std::string &term, Xapian::termcount wqf_, double factor) |
| void | init_ (const Internal &stats, Xapian::termcount query_len_, double factor, Xapian::doccount termfreq, Xapian::doccount reltermfreq) |
| void | init_ (const Internal &stats, Xapian::termcount query_len_) |
| bool | get_sumpart_needs_doclength_ () const |
| bool | get_sumpart_needs_wdf_ () const |
Private Attributes | |
| stat_flags | stats_needed |
| A bitmask of the statistics this weighting scheme needs. | |
| Xapian::doccount | collection_size_ |
| The number of documents in the collection. | |
| Xapian::doccount | rset_size_ |
| The number of documents marked as relevant. | |
| Xapian::doclength | average_length_ |
| The average length of a document in the collection. | |
| Xapian::doccount | termfreq_ |
| The number of documents which this term indexes. | |
| Xapian::doccount | reltermfreq_ |
| The number of relevant documents which this term indexes. | |
| Xapian::termcount | query_length_ |
| The length of the query. | |
| Xapian::termcount | wqf_ |
| The within-query-frequency of this term. | |
| Xapian::termcount | doclength_lower_bound_ |
| A lower bound on the minimum length of any document in the database. | |
| Xapian::termcount | doclength_upper_bound_ |
| An upper bound on the maximum length of any document in the database. | |
| Xapian::termcount | wdf_upper_bound_ |
| An upper bound on the wdf of this term. | |
Definition at line 33 of file weight.h.
enum Xapian::Weight::stat_flags [protected] |
Stats which the weighting scheme can use (see need_stat()).
| Xapian::Weight::~Weight | ( | ) | [virtual] |
| Xapian::Weight::Weight | ( | const Weight & | ) | [protected] |
Don't allow copying.
This would ideally be private, but that causes a compilation error with GCC 4.1 (which appears to be a bug in that compiler).
| Xapian::Weight::Weight | ( | ) | [inline, protected] |
| virtual Weight* Xapian::Weight::clone | ( | ) | const [pure virtual] |
Clone this object.
This method allocates and returns a copy of the object it is called on.
If your subclass is called FooWeight and has parameters a and b, then you would implement FooWeight::clone() like so:
FooWeight * FooWeight::clone() const { return new FooWeight(a, b); }
Note that the returned object will be deallocated by Xapian after use with "delete". If you want to handle the deletion in a special way (for example when wrapping the Xapian API for use from another language) then you can define a static operator delete method in your subclass as shown here: http://trac.xapian.org/ticket/554#comment:1
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.
Referenced by LocalSubMatch::get_postlist_and_term_info(), LocalSubMatch::make_synonym_postlist(), LocalSubMatch::postlist_from_op_leaf_query(), and Xapian::Enquire::set_weighting_scheme().
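The ownership rule described above can be sketched without the Xapian headers. In the sketch below, SketchWeight is a hypothetical stand-in for Xapian::Weight that declares only clone(), and foo_deletes is a counter added purely for illustration. The class-specific operator new/operator delete pair is an assumption about what such a hook might look like; it is not copied from the linked ticket.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for Xapian::Weight: only the members needed to
// illustrate clone() are declared here.
class SketchWeight {
  public:
    virtual ~SketchWeight() { }
    virtual SketchWeight * clone() const = 0;
};

// Counts calls to FooWeight's operator delete (illustration only).
static int foo_deletes = 0;

class FooWeight : public SketchWeight {
    double a, b;

  public:
    FooWeight(double a_, double b_) : a(a_), b(b_) { }

    // Allocate a copy carrying the same parameters; the caller owns it.
    FooWeight * clone() const { return new FooWeight(a, b); }

    double get_a() const { return a; }
    double get_b() const { return b; }

    // Class-specific allocation functions. Because the base destructor is
    // virtual, "delete ptr" through a SketchWeight pointer ends up here.
    static void * operator new(std::size_t size) {
        void * p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void * p) {
        ++foo_deletes;
        std::free(p);
    }
};
```

Deleting the pointer returned by clone() through the base class type runs FooWeight's operator delete, which is the point at which a language binding could intercept the deallocation.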
| Xapian::doclength Xapian::Weight::get_average_length | ( | ) | const [inline, protected] |
The average length of a document in the collection.
Definition at line 283 of file weight.h.
Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::doccount Xapian::Weight::get_collection_size | ( | ) | const [inline, protected] |
The number of documents in the collection.
Definition at line 277 of file weight.h.
Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::termcount Xapian::Weight::get_doclength_lower_bound | ( | ) | const [inline, protected] |
A lower bound on the minimum length of any document in the database.
This bound does not include any zero-length documents.
This should only be used by get_maxpart() and get_maxextra().
Definition at line 311 of file weight.h.
Referenced by Xapian::BM25Weight::get_maxextra(), Xapian::TradWeight::get_maxpart(), Xapian::BM25Weight::get_maxpart(), and CheckStatsWeight::get_maxpart().
| Xapian::termcount Xapian::Weight::get_doclength_upper_bound | ( | ) | const [inline, protected] |
An upper bound on the maximum length of any document in the database.
This should only be used by get_maxpart() and get_maxextra().
Definition at line 301 of file weight.h.
Referenced by CheckStatsWeight::get_maxpart().
| virtual Xapian::weight Xapian::Weight::get_maxextra | ( | ) | const [pure virtual] |
Return an upper bound on what get_sumextra() can return for any document.
This information is used by the matcher to perform various optimisations, so strive to make the bound as tight as possible.
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.
| virtual Xapian::weight Xapian::Weight::get_maxpart | ( | ) | const [pure virtual] |
Return an upper bound on what get_sumpart() can return for any document.
This information is used by the matcher to perform various optimisations, so strive to make the bound as tight as possible.
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.
Referenced by SynonymPostList::get_maxweight(), and LeafPostList::get_maxweight().
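As a rough illustration of how a tight bound can be derived, the sketch below uses a made-up scheme (not one of Xapian's) whose per-document weight rises with wdf and falls with document length. SketchScheme is hypothetical: its plain data members stand in for the values a real subclass would read through get_average_length(), get_wdf_upper_bound() and get_doclength_lower_bound().

```cpp
#include <cassert>

// Hypothetical, simplified scheme used only to show how the bound relates
// to the per-document weight. Not a real Xapian weighting formula.
struct SketchScheme {
    double average_length;           // stand-in for get_average_length()
    unsigned wdf_upper_bound;        // stand-in for get_wdf_upper_bound()
    unsigned doclength_lower_bound;  // stand-in for get_doclength_lower_bound()

    // Per-document contribution: grows with wdf, shrinks as doclen grows.
    double get_sumpart(unsigned wdf, unsigned doclen) const {
        assert(average_length > 0);
        double normlen = doclen / average_length;
        return wdf / (wdf + normlen);
    }

    // Upper bound: the largest possible wdf in the shortest possible
    // document. This is valid because get_sumpart() is monotonic in both
    // arguments.
    double get_maxpart() const {
        return get_sumpart(wdf_upper_bound, doclength_lower_bound);
    }
};
```

The same reasoning applies to any formula that is monotonic in wdf and doclen; a formula without that property needs its maximum worked out separately.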
| Xapian::termcount Xapian::Weight::get_query_length | ( | ) | const [inline, protected] |
The length of the query.
Definition at line 292 of file weight.h.
Referenced by Xapian::BM25Weight::get_maxextra(), Xapian::BM25Weight::get_sumextra(), and CheckStatsWeight::get_sumpart().
| Xapian::doccount Xapian::Weight::get_reltermfreq | ( | ) | const [inline, protected] |
The number of relevant documents which this term indexes.
Definition at line 289 of file weight.h.
Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::doccount Xapian::Weight::get_rset_size | ( | ) | const [inline, protected] |
The number of documents marked as relevant.
Definition at line 280 of file weight.h.
Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| virtual Xapian::weight Xapian::Weight::get_sumextra | ( | Xapian::termcount | doclen | ) | const [pure virtual] |
Calculate the term-independent weight component for a document.
The parameter gives information about the document which may be used in the calculations:
| doclen | The document's length (unnormalised). |
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.
Referenced by ExtraWeightPostList::get_weight().
| virtual Xapian::weight Xapian::Weight::get_sumpart | ( | Xapian::termcount wdf, Xapian::termcount doclen | ) | const [pure virtual] |
Calculate the weight contribution for this object's term to a document.
The parameters give information about the document which may be used in the calculations:
| wdf | The within-document frequency of the term in the document. |
| doclen | The document's length (unnormalised). |
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.
Referenced by SynonymPostList::get_weight(), and LeafPostList::get_weight().
| bool Xapian::Weight::get_sumpart_needs_doclength_ | ( | ) | const [inline, private] |
For internal use only.
Return true if the document length is needed.
If this method returns true, then the document length will be fetched and passed to get_sumpart(). Otherwise 0 may be passed for the document length.
Definition at line 252 of file weight.h.
Referenced by LeafPostList::set_termweight(), and SynonymPostList::set_weight().
| bool Xapian::Weight::get_sumpart_needs_wdf_ | ( | ) | const [inline, private] |
For internal use only.
Return true if the WDF is needed.
If this method returns true, then the WDF will be fetched and passed to get_sumpart(). Otherwise 0 may be passed for the wdf.
Definition at line 261 of file weight.h.
Referenced by SynonymPostList::set_weight().
| Xapian::doccount Xapian::Weight::get_termfreq | ( | ) | const [inline, protected] |
The number of documents which this term indexes.
Definition at line 286 of file weight.h.
Referenced by CheckStatsWeight::get_sumpart(), Xapian::TradWeight::init(), and Xapian::BM25Weight::init().
| Xapian::termcount Xapian::Weight::get_wdf_upper_bound | ( | ) | const [inline, protected] |
An upper bound on the wdf of this term.
This should only be used by get_maxpart() and get_maxextra().
Definition at line 319 of file weight.h.
Referenced by Xapian::TradWeight::get_maxpart(), Xapian::BM25Weight::get_maxpart(), and CheckStatsWeight::get_maxpart().
| Xapian::termcount Xapian::Weight::get_wqf | ( | ) | const [inline, protected] |
The within-query-frequency of this term.
Definition at line 295 of file weight.h.
Referenced by CheckStatsWeight::get_sumpart(), and Xapian::BM25Weight::init().
| virtual void Xapian::Weight::init | ( | double | factor | ) | [protected, pure virtual] |
Allow the subclass to perform any initialisation it needs to.
| factor | Any scaling factor (e.g. from OP_SCALE_WEIGHT). If the Weight object is for the term-independent weight supplied by get_sumextra()/get_maxextra(), then init(0.0) is called (starting from Xapian 1.2.11 and 1.3.1 - earlier versions failed to call init() for such Weight objects). |
Implemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, ExceptionalWeight, CheckInitWeight, and CheckStatsWeight.
Referenced by init_().
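One possible way to use init() is to cache the document-independent part of the weight once, so that get_sumpart() stays cheap. The sketch below is an assumption-laden illustration: SketchTermWeight is hypothetical, its data members stand in for get_collection_size() and get_termfreq(), and the logarithmic formula is chosen only for the example. It treats a factor of 0.0 as the term-independent case described above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a Weight subclass; the statistics are plain
// members here rather than values fetched from the base class.
struct SketchTermWeight {
    unsigned collection_size;  // stand-in for get_collection_size()
    unsigned termfreq;         // stand-in for get_termfreq()
    double termweight;         // cached by init()

    void init(double factor) {
        if (factor == 0.0) {
            // Object used only for get_sumextra()/get_maxextra(): there is
            // no term, so skip the per-term calculation entirely.
            termweight = 0.0;
            return;
        }
        assert(termfreq > 0 && termfreq <= collection_size);
        // Illustrative formula only: rarer terms get a larger weight.
        termweight = factor * std::log(double(collection_size) / termfreq);
    }

    double get_sumpart(unsigned wdf, unsigned /*doclen*/) const {
        return wdf ? termweight : 0.0;
    }
};
```

Returning early for a zero factor also avoids dividing by a term frequency that has no meaning for the term-independent object.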
| void Xapian::Weight::init_ | ( | const Internal & stats, Xapian::termcount query_len_ | ) | [private] |
For internal use only.
Initialise this object to calculate the extra weight component.
| stats | Source of statistics. |
| query_len_ | Query length. |
Definition at line 37 of file weight.cc.
References AVERAGE_LENGTH, average_length_, Xapian::Weight::Internal::collection_size, collection_size_, Xapian::Weight::Internal::db, DOC_LENGTH_MAX, DOC_LENGTH_MIN, doclength_lower_bound_, doclength_upper_bound_, Xapian::Weight::Internal::get_average_length(), Xapian::Database::get_doclength_lower_bound(), Xapian::Database::get_doclength_upper_bound(), init(), LOGCALL_VOID, query_length_, reltermfreq_, Xapian::Weight::Internal::rset_size, rset_size_, stats_needed, termfreq_, wdf_upper_bound_, and wqf_.
| void Xapian::Weight::init_ | ( | const Internal & stats, Xapian::termcount query_len_, double factor, Xapian::doccount termfreq, Xapian::doccount reltermfreq | ) | [private] |
For internal use only.
Initialise this object to calculate weights for a synonym.
| stats | Source of statistics. |
| query_len_ | Query length. |
| factor | Any scaling factor (e.g. from OP_SCALE_WEIGHT). |
| termfreq | The termfreq to use. |
| reltermfreq | The reltermfreq to use. |
Definition at line 81 of file weight.cc.
References AVERAGE_LENGTH, average_length_, Xapian::Weight::Internal::collection_size, collection_size_, Xapian::Weight::Internal::db, DOC_LENGTH_MAX, DOC_LENGTH_MIN, doclength_lower_bound_, doclength_upper_bound_, Xapian::Weight::Internal::get_average_length(), Xapian::Database::get_doclength_lower_bound(), Xapian::Database::get_doclength_upper_bound(), init(), LOGCALL_VOID, query_length_, reltermfreq_, Xapian::Weight::Internal::rset_size, rset_size_, stats_needed, termfreq_, WDF_MAX, wdf_upper_bound_, and wqf_.
| void Xapian::Weight::init_ | ( | const Internal & stats, Xapian::termcount query_len_, const std::string & term, Xapian::termcount wqf_, double factor | ) | [private] |
For internal use only.
Initialise this object to calculate weights for the term given by the term parameter.
| stats | Source of statistics. |
| query_len_ | Query length. |
| term | The term for the new object. |
| wqf_ | The within-query-frequency of term. |
| factor | Any scaling factor (e.g. from OP_SCALE_WEIGHT). |
Definition at line 57 of file weight.cc.
References AVERAGE_LENGTH, average_length_, Xapian::Weight::Internal::collection_size, collection_size_, Xapian::Weight::Internal::db, DOC_LENGTH_MAX, DOC_LENGTH_MIN, doclength_lower_bound_, doclength_upper_bound_, Xapian::Weight::Internal::get_average_length(), Xapian::Database::get_doclength_lower_bound(), Xapian::Database::get_doclength_upper_bound(), Xapian::Weight::Internal::get_reltermfreq(), Xapian::Weight::Internal::get_termfreq(), Xapian::Database::get_wdf_upper_bound(), init(), LOGCALL_VOID, query_length_, RELTERMFREQ, reltermfreq_, Xapian::Weight::Internal::rset_size, rset_size_, stats_needed, TERMFREQ, termfreq_, WDF_MAX, wdf_upper_bound_, and wqf_.
| string Xapian::Weight::name | ( | ) | const [virtual] |
Return the name of this weighting scheme.
This name is used by the remote backend. It is passed along with the serialised parameters to the remote server so that it knows which class to create.
Return the full namespace-qualified name of your class here - if your class is called FooWeight, return "FooWeight" from this method (Xapian::BM25Weight returns "Xapian::BM25Weight" here).
If you don't want to support the remote backend, you can use the default implementation which simply returns an empty string.
Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, MyWeight, and ExceptionalWeight.
Definition at line 115 of file weight.cc.
Referenced by Xapian::Registry::Internal::add_defaults(), DEFINE_TESTCASE(), Xapian::Registry::register_weighting_scheme(), and RemoteDatabase::set_query().
| void Xapian::Weight::need_stat | ( | stat_flags | flag | ) | [inline, protected] |
Tell Xapian that your subclass will want a particular statistic.
Some of the statistics can be costly to fetch or calculate, so Xapian needs to know which are actually going to be used. You should call need_stat() from your constructor for each such statistic.
| flag | The stat_flags value for a required statistic. |
Definition at line 60 of file weight.h.
Referenced by CheckStatsWeight::CheckStatsWeight().
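A minimal sketch of the constructor pattern follows. SketchWeight is a hypothetical stand-in for Xapian::Weight: the flag values are copied from stat_flags above, but the body of need_stat() is an assumption inferred from stats_needed being described as a bitmask. LengthNormWeight, wants() and mask() are invented for the illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the statistics-related parts of Xapian::Weight.
class SketchWeight {
  protected:
    enum stat_flags {
        COLLECTION_SIZE = 1, RSET_SIZE = 2, AVERAGE_LENGTH = 4, TERMFREQ = 8,
        RELTERMFREQ = 16, QUERY_LENGTH = 32, WQF = 64, WDF = 128,
        DOC_LENGTH = 256, DOC_LENGTH_MIN = 512, DOC_LENGTH_MAX = 1024,
        WDF_MAX = 2048
    };

    SketchWeight() : stats_needed() { }

    // Assumed implementation: OR the requested flag into the bitmask.
    void need_stat(stat_flags flag) {
        stats_needed = stat_flags(stats_needed | flag);
    }

    // Helper for the example, not part of the documented interface.
    bool wants(stat_flags flag) const { return (stats_needed & flag) != 0; }

  public:
    virtual ~SketchWeight() { }

    // Helper for the example: expose the raw bitmask.
    unsigned mask() const { return stats_needed; }

  private:
    stat_flags stats_needed;
};

// Hypothetical subclass that normalises wdf by document length, so it
// requests exactly the three statistics it reads.
class LengthNormWeight : public SketchWeight {
  public:
    LengthNormWeight() {
        need_stat(AVERAGE_LENGTH);
        need_stat(WDF);
        need_stat(DOC_LENGTH);
    }

    bool wants_average_length() const { return wants(AVERAGE_LENGTH); }
    bool wants_termfreq() const { return wants(TERMFREQ); }
};
```

Requesting only the statistics actually read keeps the costly ones, such as those that are not requested here, from being fetched at all.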
| void Xapian::Weight::operator= | ( | const Weight & | ) | [private] |
Don't allow assignment.
| string Xapian::Weight::serialise | ( | ) | const [virtual] |
Return this object's parameters serialised as a single string.
If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.
Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, and MyWeight.
Definition at line 121 of file weight.cc.
Referenced by RemoteDatabase::set_query().
| Weight * Xapian::Weight::unserialise | ( | const std::string & | s | ) | const [virtual] |
Unserialise parameters.
This method unserialises parameters serialised by the serialise() method and allocates and returns a new object initialised with them.
If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.
Note that the returned object will be deallocated by Xapian after use with "delete". If you want to handle the deletion in a special way (for example when wrapping the Xapian API for use from another language) then you can define a static operator delete method in your subclass as shown here: http://trac.xapian.org/ticket/554#comment:1
| s | A string containing the serialised parameters. |
Reimplemented in Xapian::BoolWeight, Xapian::BM25Weight, Xapian::TradWeight, and MyWeight.
Definition at line 127 of file weight.cc.
Referenced by DEFINE_TESTCASE(), and RemoteServer::msg_query().
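The round trip between serialise() and unserialise() can be sketched as follows. Everything here is illustrative: SketchWeight is a hypothetical stand-in for Xapian::Weight, std::logic_error stands in for Xapian::UnimplementedError, and the space-separated text encoding is an assumption made for readability rather than the format Xapian's own schemes use.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the remote-backend parts of Xapian::Weight.
class SketchWeight {
  public:
    virtual ~SketchWeight() { }
    virtual std::string name() const { return std::string(); }
    virtual std::string serialise() const {
        throw std::logic_error("serialise() not implemented");
    }
    virtual SketchWeight * unserialise(const std::string &) const {
        throw std::logic_error("unserialise() not implemented");
    }
};

class FooWeight : public SketchWeight {
    double a, b;

  public:
    FooWeight(double a_, double b_) : a(a_), b(b_) { }

    std::string name() const { return "FooWeight"; }

    // Any encoding works as long as unserialise() can parse it.
    std::string serialise() const {
        std::ostringstream out;
        out.precision(17);  // enough digits to round-trip a double
        out << a << ' ' << b;
        return out.str();
    }

    // Allocate a new object from the serialised parameters; the caller
    // owns the result and releases it with delete.
    FooWeight * unserialise(const std::string & s) const {
        std::istringstream in(s);
        double a_, b_;
        if (!(in >> a_ >> b_))
            throw std::invalid_argument("bad FooWeight parameters");
        return new FooWeight(a_, b_);
    }

    double get_a() const { return a; }
    double get_b() const { return b; }
};
```

Note that unserialise() is called on an existing object that acts only as a factory: its own parameters are ignored and the new object is built entirely from the string.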
Xapian::doccount Xapian::Weight::reltermfreq_ [private] |
Xapian::doccount Xapian::Weight::rset_size_ [private] |
stat_flags Xapian::Weight::stats_needed [private] |
Xapian::doccount Xapian::Weight::termfreq_ [private] |
Xapian::termcount Xapian::Weight::wqf_ [private] |