#ifndef XAPIAN_INCLUDED_EXPANDWEIGHT_H
#define XAPIAN_INCLUDED_EXPANDWEIGHT_H
if (wdf == 0) wdf = 1;
if (shard_index >= dbs_seen.size()) {
bool use_exact_termfreq_,
bool want_collection_freq_)
bool use_exact_termfreq_,
bool want_collection_freq_,
bool use_exact_termfreq_,
bool use_exact_termfreq_)
This class is used to access a database, or a group of databases.
This class implements the Bo1 scheme for query expansion.
double get_weight() const
Calculate the weight.
Bo1EWeight(const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_)
Constructor.
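The Bo1 weight can be illustrated with a minimal sketch. It uses the Bo1 formula as it is usually stated in the Divergence From Randomness literature, which is an assumption here because the listing does not show the body of get_weight(). The free function bo1_expand_weight and its parameter names are hypothetical and not part of the Xapian API. The arguments correspond to rcollection_freq, collection_freq and dbsize as described in this listing.

```cpp
#include <cmath>

// Hypothetical stand-alone helper, NOT the Xapian implementation.
// Assumed formula (standard Bo1):
//   w = r * log2((1 + m) / m) + log2(1 + m),  where m = F / N
// r: occurrences of the term in the RSet (rcollection_freq)
// F: occurrences of the term in the whole collection (collection_freq)
// N: number of documents in the database (dbsize)
inline double bo1_expand_weight(double rcollection_freq,
                                double collection_freq,
                                double dbsize)
{
    // Without any occurrences or documents the mean is undefined,
    // so this sketch returns a neutral weight of zero.
    if (collection_freq <= 0.0 || dbsize <= 0.0) return 0.0;

    // Mean number of occurrences of the term per document.
    const double mean = collection_freq / dbsize;

    return rcollection_freq * std::log2((1.0 + mean) / mean)
           + std::log2(1.0 + mean);
}
```

When a term occurs on average once per document (F equal to N), both logarithms equal 1, so the weight reduces to r + 1.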
Collates statistics while calculating term weight in an ESet.
Xapian::doclength avlen
Average document length in the whole database.
Xapian::doccount termfreq
Term frequency (for a multidb, may be for a subset of the databases).
ExpandStats(Xapian::doclength avlen_, double expand_k_)
Constructor for expansion schemes which require the "expand_k" parameter.
void accumulate(size_t shard_index, Xapian::termcount wdf, Xapian::termcount doclen, Xapian::doccount subtf, Xapian::doccount subdbsize)
Xapian::doccount rtermfreq
The number of documents from the RSet indexed by the current term (r).
double multiplier
The multiplier to be used in TradWeight query expansion.
ExpandStats(Xapian::doclength avlen_)
Constructor for expansion schemes which do not require the "expand_k" parameter.
Xapian::doccount dbsize
Size of the subset of a multidb to which the value in termfreq applies.
Xapian::termcount rcollection_freq
The number of times the term occurs in the RSet.
double expand_k
The parameter k to be used for TradWeight query expansion.
std::vector< bool > dbs_seen
Which databases in a multidb are included in termfreq.
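The way these members might fit together in accumulate() can be sketched as follows. Only the wdf == 0 guard and the dbs_seen.size() check appear in the listing. The rest of the body is an assumption inferred from the member descriptions. ShardStatsSketch is a hypothetical stand-in that uses plain unsigned counters instead of the Xapian typedefs, and it omits doclen, avlen, expand_k and multiplier.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ExpandStats, NOT the Xapian class.
struct ShardStatsSketch {
    unsigned rtermfreq = 0;         // RSet documents indexed by the term
    unsigned rcollection_freq = 0;  // occurrences of the term in the RSet
    unsigned termfreq = 0;          // term frequency over the shards seen
    unsigned dbsize = 0;            // total size of the shards seen
    std::vector<bool> dbs_seen;     // which shards are counted in termfreq

    // Called once per RSet document that contains the term.
    void accumulate(std::size_t shard_index, unsigned wdf,
                    unsigned subtf, unsigned subdbsize)
    {
        // Guard shown in the listing: treat a zero wdf as one occurrence.
        if (wdf == 0) wdf = 1;
        ++rtermfreq;
        rcollection_freq += wdf;

        // Grow the bitmap on demand (size check shown in the listing).
        if (shard_index >= dbs_seen.size()) {
            dbs_seen.resize(shard_index + 1, false);
        }
        // Assumption: per-shard totals are added only the first time a
        // shard is seen, so termfreq and dbsize cover the same subset.
        if (!dbs_seen[shard_index]) {
            dbs_seen[shard_index] = true;
            termfreq += subtf;
            dbsize += subdbsize;
        }
    }
};
```

Counting each shard once keeps termfreq and dbsize consistent with each other, which matches the description of dbsize as the size of the subset to which termfreq applies.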
Class for calculating ESet term weights.
void collect_stats(TermList *merger, const std::string &term)
Get the term statistics.
Xapian::doccount dbsize
The number of documents in the whole database.
Xapian::totallength get_collection_len() const
Return the length of the collection.
double get_avlen() const
Return the average document length in the database.
ExpandWeight(const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_)
Constructor.
bool use_exact_termfreq
Should we calculate the exact term frequency when generating an ESet?
Xapian::doccount get_rsize() const
Return the number of documents in the RSet.
virtual double get_weight() const =0
Calculate the weight.
const Xapian::Database db
The combined database.
ExpandWeight(const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, bool want_collection_freq_, double expand_k_)
Constructor.
bool want_collection_freq
Does the expansion scheme use collection frequency?
Xapian::doccount get_dbsize() const
Return the size of the database.
Xapian::totallength collection_len
The total length of the database.
Xapian::doclength avlen
Average document length in the whole database.
Xapian::termcount collection_freq
The collection frequency of the term.
Xapian::termcount get_collection_freq() const
Return the collection frequency of the term.
ExpandStats stats
An ExpandStats object to accumulate statistics.
Xapian::doccount rsize
The number of documents in the RSet.
This class implements the TradWeight scheme for query expansion.
TradEWeight(const Xapian::Database &db_, Xapian::doccount rsize_, bool use_exact_termfreq_, double expand_k_)
Constructor.
double get_weight() const
Calculate the weight.
Abstract base class for termlists.
API for working with Xapian databases.
The Xapian namespace contains public interfaces for the Xapian library.
unsigned XAPIAN_TERMCOUNT_BASE_TYPE termcount
A count of terms.
double doclength
A normalised document length.
unsigned XAPIAN_DOCID_BASE_TYPE doccount
A count of documents.
XAPIAN_TOTALLENGTH_TYPE totallength
The total length of all documents in a database.