xapian-core 2.0.0
Class for calculating ESet term weights.
#include <expandweight.h>
Public Member Functions
| | ExpandWeight (const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_, double expand_k_=0.0) |
| | Constructor. |
| void | collect_stats (TermList *merger, const std::string &term) |
| | Get the term statistics. |
| virtual double | get_weight () const =0 |
| | Calculate the weight. |
Protected Member Functions
| double | get_average_length () const |
| | Return the average length of the database. |
| Xapian::doccount | get_rsize () const |
| | Return the number of documents in the RSet. |
| Xapian::termcount | get_collection_freq () const |
| | Return the collection frequency of the term. |
| Xapian::totallength | get_collection_len () const |
| | Return the length of the collection. |
| Xapian::doccount | get_dbsize () const |
| | Return the size of the database. |
Protected Attributes
| ExpandStats | stats |
| | ExpandStats object to accumulate statistics. |
Private Attributes
| const Xapian::Database | db |
| | The combined database. |
| Xapian::doccount | dbsize |
| | The number of documents in the whole database. |
| Xapian::doccount | rsize |
| | The number of documents in the RSet. |
| Xapian::termcount | collection_freq = 0 |
| | The collection frequency of the term. |
| Xapian::totallength | collection_len |
| | The total length of the database. |
| bool | use_exact_termfreq |
| | Should we calculate the exact term frequency when generating an ESet? |
| bool | want_collection_freq |
| | Does the expansion scheme use collection frequency? |
Class for calculating ESet term weights.
Definition at line 110 of file expandweight.h.
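| Xapian::Internal::ExpandWeight::ExpandWeight (const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_, double expand_k_=0.0) | inline |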
Constructor.
| db_ | The database |
| rsize_ | Number of documents in the RSet |
| use_exact_termfreq_ | Whether to use the exact termfreq when expanding over a combined database (if false, a cheaper approximation is used) |
| want_collection_freq_ | Does the expansion scheme use collection frequency? |
| expand_k_ | Parameter for ProbEWeight (default: 0) |
Definition at line 154 of file expandweight.h.
| void Xapian::Internal::ExpandWeight::collect_stats (TermList *merger, const std::string &term) |
Get the term statistics.
| merger | The tree of TermList objects. |
| term | The current term name. |
Definition at line 37 of file expandweight.cc.
References Xapian::TermIterator::Internal::accumulate_stats(), AssertEqParanoid, AssertRel, LOGCALL_VOID, LOGLINE, LOGVALUE, and term.
Referenced by Xapian::ESet::Internal::expand().
| double Xapian::Internal::ExpandWeight::get_average_length () const | inline, protected |
Return the average length of the database.
Definition at line 181 of file expandweight.h.
References Xapian::Internal::ExpandStats::get_average_length() and stats.
| Xapian::termcount Xapian::Internal::ExpandWeight::get_collection_freq () const | inline, protected |
Return the collection frequency of the term.
Definition at line 187 of file expandweight.h.
References collection_freq.
| Xapian::totallength Xapian::Internal::ExpandWeight::get_collection_len () const | inline, protected |
Return the length of the collection.
Definition at line 190 of file expandweight.h.
References collection_len.
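| Xapian::doccount Xapian::Internal::ExpandWeight::get_dbsize () const | inline, protected |
Return the size of the database.
| Xapian::doccount Xapian::Internal::ExpandWeight::get_rsize () const | inline, protected |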
Return the number of documents in the RSet.
Definition at line 184 of file expandweight.h.
References rsize.
| virtual double Xapian::Internal::ExpandWeight::get_weight () const | pure virtual |
Calculate the weight.
Implemented in Xapian::Internal::Bo1EWeight and Xapian::Internal::ProbEWeight.
Referenced by Xapian::ESet::Internal::expand().
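The pattern described above (a base class that gathers statistics, with subclasses supplying get_weight()) can be illustrated with a simplified, self-contained sketch. Everything here is a stand-in: `ExpandWeightSketch`, `TermStats` and `RatioEWeight` are hypothetical names, the real class takes a Xapian::Database and a TermList rather than plain numbers, and the toy ratio below is not the Bo1 or probabilistic formula used by the actual subclasses.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-term statistics the base class gathers.
struct TermStats {
    unsigned termfreq = 0;   // documents in the database containing the term
    unsigned rtermfreq = 0;  // documents in the RSet containing the term
};

// Simplified sketch of the base class: it holds the shared state and
// leaves the weighting formula to subclasses.
class ExpandWeightSketch {
  public:
    ExpandWeightSketch(unsigned dbsize_, unsigned rsize_)
        : dbsize(dbsize_), rsize(rsize_) {}
    virtual ~ExpandWeightSketch() = default;

    // Stand-in for collect_stats(): the caller supplies the numbers directly
    // instead of having them accumulated from a tree of TermList objects.
    void collect_stats(const TermStats& s) { stats = s; }

    // Each expansion scheme provides its own formula.
    virtual double get_weight() const = 0;

  protected:
    unsigned get_dbsize() const { return dbsize; }
    unsigned get_rsize() const { return rsize; }
    TermStats stats;

  private:
    unsigned dbsize;
    unsigned rsize;
};

// Toy scheme (an assumption for illustration only): how much more common
// the term is in the RSet than in the database as a whole.
class RatioEWeight : public ExpandWeightSketch {
  public:
    using ExpandWeightSketch::ExpandWeightSketch;

    double get_weight() const override {
        assert(stats.termfreq > 0 && get_rsize() > 0 && get_dbsize() > 0);
        double in_rset = double(stats.rtermfreq) / get_rsize();
        double in_db = double(stats.termfreq) / get_dbsize();
        return in_rset / in_db;
    }
};
```

A caller would mirror the documented sequence: call collect_stats() for the current term, then get_weight(), and repeat for each candidate term.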
| Xapian::termcount Xapian::Internal::ExpandWeight::collection_freq = 0 | private |
The collection frequency of the term.
Definition at line 121 of file expandweight.h.
Referenced by get_collection_freq().
| Xapian::totallength Xapian::Internal::ExpandWeight::collection_len | private |
The total length of the database.
Definition at line 124 of file expandweight.h.
Referenced by get_collection_len().
| const Xapian::Database Xapian::Internal::ExpandWeight::db | private |
The combined database.
Definition at line 112 of file expandweight.h.
|
private |
The number of documents in the whole database.
Definition at line 115 of file expandweight.h.
Referenced by get_dbsize().
|
private |
The number of documents in the RSet.
Definition at line 118 of file expandweight.h.
Referenced by get_rsize().
|
protected |
ExpandStats object to accumulate statistics.
Definition at line 178 of file expandweight.h.
Referenced by get_average_length().
| bool Xapian::Internal::ExpandWeight::use_exact_termfreq | private |
Should we calculate the exact term frequency when generating an ESet?
This only has any effect if we're using a combined database.
If this member is true, the exact term frequency will be obtained from the Database object. If this member is false, then an approximation is used to estimate the term frequency based on the term frequencies in the sub-databases which we see while collating term statistics, and the relative sizes of the sub-databases.
Definition at line 136 of file expandweight.h.
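The text above does not give the approximation formula. One plausible form, assuming the term is about as dense in the sub-databases not yet seen as in those already visited, is to scale the observed term frequency by the ratio of database sizes. The function `approx_termfreq` and its parameter names are hypothetical and not part of Xapian; the sketch only illustrates the idea of estimating from relative sizes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in; the real Xapian::doccount typedef may differ.
using doccount = std::uint32_t;

// Estimate the term frequency over the whole combined database.
//   seen_termfreq: sum of the term's frequency in the sub-databases visited
//   seen_dbsize:   total number of documents in those sub-databases
//   total_dbsize:  number of documents in the combined database
doccount approx_termfreq(doccount seen_termfreq,
                         doccount seen_dbsize,
                         doccount total_dbsize) {
    assert(seen_dbsize > 0 && seen_dbsize <= total_dbsize);
    assert(seen_termfreq <= seen_dbsize);
    // Scale linearly and round to nearest; 64-bit arithmetic avoids
    // overflow in the intermediate product.
    std::uint64_t scaled =
        (std::uint64_t(seen_termfreq) * total_dbsize + seen_dbsize / 2) /
        seen_dbsize;
    return static_cast<doccount>(scaled);
}
```

When every sub-database has been visited, seen_dbsize equals total_dbsize and the estimate reduces to the observed value, so the approximation only matters when some sub-databases contributed no statistics for the term.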
| bool Xapian::Internal::ExpandWeight::want_collection_freq | private |
Does the expansion scheme use collection frequency?
Definition at line 139 of file expandweight.h.