xapian-core
1.4.26
Class for calculating ESet term weights.
#include <expandweight.h>
Public Member Functions
ExpandWeight (const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_)
Constructor.
ExpandWeight (const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_, double expand_k_)
Constructor.
void | collect_stats (TermList *merger, const std::string &term)
Get the term statistics.
virtual double | get_weight () const =0
Calculate the weight.
Protected Member Functions
double | get_avlen () const
Return the average length of the database.
Xapian::doccount | get_rsize () const
Return the number of documents in the RSet.
Xapian::termcount | get_collection_freq () const
Return the collection frequency of the term.
Xapian::totallength | get_collection_len () const
Return the length of the collection.
Xapian::doccount | get_dbsize () const
Return the size of the database.
Protected Attributes
ExpandStats | stats
An ExpandStats object to accumulate statistics.
Private Attributes
const Xapian::Database | db
The combined database.
Xapian::doccount | dbsize
The number of documents in the whole database.
Xapian::doclength | avlen
Average document length in the whole database.
Xapian::doccount | rsize
The number of documents in the RSet.
Xapian::termcount | collection_freq
The collection frequency of the term.
Xapian::totallength | collection_len
The total length of the database.
bool | use_exact_termfreq
Should we calculate the exact term frequency when generating an ESet?
bool | want_collection_freq
Does the expansion scheme use collection frequency?
Class for calculating ESet term weights.
Definition at line 114 of file expandweight.h.
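Xapian::Internal::ExpandWeight::ExpandWeight (const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_) [inline]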
Constructor.
db_ | The database. |
rsize_ | The number of documents in the RSet. |
use_exact_termfreq_ | When expanding over a combined database, whether to use the exact termfreq (if false, a cheaper approximation is used). |
want_collection_freq_ | Whether the expansion scheme uses collection frequency. |
Definition at line 157 of file expandweight.h.
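Xapian::Internal::ExpandWeight::ExpandWeight (const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_, double expand_k_) [inline]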
Constructor.
db_ | The database. |
rsize_ | The number of documents in the RSet. |
use_exact_termfreq_ | When expanding over a combined database, whether to use the exact termfreq (if false, a cheaper approximation is used). |
want_collection_freq_ | Whether the expansion scheme uses collection frequency. |
expand_k_ | The parameter for TradWeight query expansion. |
Definition at line 177 of file expandweight.h.
void Xapian::Internal::ExpandWeight::collect_stats (TermList *merger, const std::string &term)
Get the term statistics.
merger | The tree of TermList objects. |
term | The current term name. |
Definition at line 37 of file expandweight.cc.
References Xapian::TermIterator::Internal::accumulate_stats(), AssertEqParanoid, AssertRel, LOGCALL_VOID, LOGLINE, and LOGVALUE.
Referenced by Xapian::ESet::Internal::expand().
double Xapian::Internal::ExpandWeight::get_avlen () const [inline, protected]
Return the average length of the database.
Definition at line 203 of file expandweight.h.
References Xapian::Internal::ExpandStats::avlen.
Xapian::termcount Xapian::Internal::ExpandWeight::get_collection_freq () const [inline, protected]
Return the collection frequency of the term.
Definition at line 209 of file expandweight.h.
Xapian::totallength Xapian::Internal::ExpandWeight::get_collection_len () const [inline, protected]
Return the length of the collection.
Definition at line 212 of file expandweight.h.
Xapian::doccount Xapian::Internal::ExpandWeight::get_dbsize () const [inline, protected]
Return the size of the database.
Definition at line 215 of file expandweight.h.
References Xapian::Internal::ExpandStats::dbsize.
Xapian::doccount Xapian::Internal::ExpandWeight::get_rsize () const [inline, protected]
Return the number of documents in the RSet.
Definition at line 206 of file expandweight.h.
virtual double Xapian::Internal::ExpandWeight::get_weight () const [pure virtual]
Calculate the weight.
Implemented in Xapian::Internal::Bo1EWeight and Xapian::Internal::TradEWeight.
Referenced by Xapian::ESet::Internal::expand().
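The pure virtual get_weight() means each expansion scheme supplies its own formula in a subclass. The following is a minimal, self-contained sketch of that pattern, not Xapian's code: ToyStats, ToyExpandWeight, ToyBo1Weight and the field names are hypothetical, and the formula is the Bo1 expression from the divergence-from-randomness literature; that Xapian::Internal::Bo1EWeight computes exactly this is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the statistics a weighting scheme reads.
struct ToyStats {
    double rcollection_freq;  // occurrences of the term in the RSet documents
    double collection_freq;   // occurrences of the term in the whole collection
    double dbsize;            // number of documents in the database
};

// Toy abstract base: subclasses must provide get_weight().
class ToyExpandWeight {
  public:
    explicit ToyExpandWeight(const ToyStats& s) : stats(s) {}
    virtual ~ToyExpandWeight() = default;

    // Calculate the weight for the term described by `stats`.
    virtual double get_weight() const = 0;

  protected:
    ToyStats stats;
};

// Toy scheme using the Bo1 formula (assumed, not taken from Xapian):
//   w = tfx * log2((1 + P) / P) + log2(1 + P),  with P = F / N
class ToyBo1Weight : public ToyExpandWeight {
  public:
    using ToyExpandWeight::ToyExpandWeight;

    double get_weight() const override {
        assert(stats.collection_freq > 0 && stats.dbsize > 0);
        double mean = stats.collection_freq / stats.dbsize;
        return stats.rcollection_freq * std::log2((1.0 + mean) / mean) +
               std::log2(1.0 + mean);
    }
};
```

A caller holding a reference to the base class can invoke get_weight() without knowing which scheme is in use, which is presumably how Xapian::ESet::Internal::expand() uses the real class.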
Xapian::doclength Xapian::Internal::ExpandWeight::avlen [private]
Average document length in the whole database.
Definition at line 122 of file expandweight.h.
Xapian::termcount Xapian::Internal::ExpandWeight::collection_freq [private]
The collection frequency of the term.
Definition at line 128 of file expandweight.h.
Xapian::totallength Xapian::Internal::ExpandWeight::collection_len [private]
The total length of the database.
Definition at line 131 of file expandweight.h.
const Xapian::Database Xapian::Internal::ExpandWeight::db [private]
The combined database.
Definition at line 116 of file expandweight.h.
Xapian::doccount Xapian::Internal::ExpandWeight::dbsize [private]
The number of documents in the whole database.
Definition at line 119 of file expandweight.h.
Xapian::doccount Xapian::Internal::ExpandWeight::rsize [private]
The number of documents in the RSet.
Definition at line 125 of file expandweight.h.
ExpandStats Xapian::Internal::ExpandWeight::stats [protected]
An ExpandStats object to accumulate statistics.
Definition at line 200 of file expandweight.h.
bool Xapian::Internal::ExpandWeight::use_exact_termfreq [private]
Should we calculate the exact term frequency when generating an ESet?
This only has any effect if we're using a combined database.
If this member is true, the exact term frequency will be obtained from the Database object. If this member is false, then an approximation is used to estimate the term frequency based on the term frequencies in the sub-databases which we see while collating term statistics, and the relative sizes of the sub-databases.
Definition at line 143 of file expandweight.h.
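To illustrate the approximation described above, here is a minimal sketch of one way to scale a term frequency seen in some sub-databases up to the whole combined database by relative size. The helper estimate_termfreq and its parameters are hypothetical; the exact arithmetic Xapian uses lives in expandweight.cc and may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of Xapian.
//   seen_termfreq: sum of the term's frequencies in the sub-databases visited
//   seen_dbsize:   sum of the document counts of those same sub-databases
//   total_dbsize:  document count of the whole combined database
// Assumes the term is about as common in the unvisited sub-databases as in
// the visited ones, so the observed frequency is scaled linearly.
inline std::uint32_t estimate_termfreq(std::uint32_t seen_termfreq,
                                       std::uint32_t seen_dbsize,
                                       std::uint32_t total_dbsize)
{
    assert(seen_dbsize > 0);
    assert(seen_dbsize <= total_dbsize);
    assert(seen_termfreq <= seen_dbsize);
    double scaled = double(seen_termfreq) * total_dbsize / seen_dbsize;
    // Round to the nearest whole document count.
    return static_cast<std::uint32_t>(std::llround(scaled));
}
```

For instance, a term found in 10 of the 100 documents visited would be estimated to occur in 40 documents of a 400-document combined database. The exact alternative avoids this assumption at the cost of an extra lookup against the Database object.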
bool Xapian::Internal::ExpandWeight::want_collection_freq [private]
Does the expansion scheme use collection frequency?
Definition at line 146 of file expandweight.h.