xapian-core 2.0.0
Class to hold statistics for a given collection.
#include <weightinternal.h>
Public Member Functions | |
| Internal () | |
| Internal & | operator+= (const Internal &inc) |
| Add in the supplied statistics from a sub-database. | |
| void | merge (const Weight::Internal &o) |
| void | set_query (const Xapian::Query &query_) |
| void | accumulate_stats (const Xapian::Database::Internal &sub_db, const Xapian::RSet &rset) |
| Accumulate the rtermfreqs for terms in the query. | |
| bool | get_stats (std::string_view term, Xapian::doccount &termfreq, Xapian::doccount &reltermfreq, Xapian::termcount &collfreq) const |
| Get the frequencies for the given term. | |
| bool | get_stats (std::string_view term, Xapian::doccount &termfreq) const |
| Get just the termfreq. | |
| bool | get_termweight (std::string_view term, double &termweight) const |
| Get the termweight. | |
| void | get_max_termweight (double &min_tw, double &max_tw) |
| Get the minimum and maximum termweights. | |
| void | set_max_part (const std::string &term, double max_part) |
| Set max_part for a term. | |
| Xapian::doclength | get_average_length () const |
| std::string | get_description () const |
| Return a std::string describing this object. | |
Static Public Member Functions | |
| static bool | double_param (const char **p, double *ptr_val) |
| static bool | param_name (const char **p, std::string &name) |
| static void | parameter_error (const char *msg, const std::string &scheme, const char *params) |
Public Attributes | |
| Xapian::totallength | total_length = 0 |
| Total length of all documents in the collection. | |
| Xapian::doccount | collection_size = 0 |
| Number of documents in the collection. | |
| Xapian::doccount | rset_size = 0 |
| Number of relevant documents in the collection. | |
| Xapian::termcount | db_doclength_lower_bound = 0 |
| A lower bound on the minimum length of any document in the database. | |
| Xapian::termcount | db_doclength_upper_bound = 0 |
| An upper bound on the maximum length of any document in the database. | |
| Xapian::termcount | db_unique_terms_lower_bound = 0 |
| A lower bound on the number of unique terms in any document. | |
| Xapian::termcount | db_unique_terms_upper_bound = 0 |
| An upper bound on the number of unique terms in any document. | |
| bool | have_max_part = false |
| Has max_part been set for any term? | |
| Xapian::Query | query |
| The query. | |
| std::map< std::string, TermFreqs, std::less<> > | termfreqs |
| Map of term frequencies and relevant term frequencies for the collection. | |
Class to hold statistics for a given collection.
Definition at line 106 of file weightinternal.h.
| Xapian::Weight::Internal::Internal | ( | ) | inline |
Definition at line 153 of file weightinternal.h.
| void Xapian::Weight::Internal::accumulate_stats | ( | const Xapian::Database::Internal & | sub_db, |
| const Xapian::RSet & | rset | ||
| ) |
Accumulate the rtermfreqs for terms in the query.
Definition at line 83 of file weightinternal.cc.
References Assert, Xapian::Internal::TermFreqs::collfreq, Xapian::Database::Internal::get_doccount(), Xapian::Database::Internal::get_doclength_lower_bound(), Xapian::Database::Internal::get_doclength_upper_bound(), Xapian::Database::Internal::get_freqs(), Xapian::Database::Internal::get_total_length(), Xapian::Query::get_unique_terms_begin(), Xapian::Database::Internal::get_unique_terms_lower_bound(), Xapian::Database::Internal::get_unique_terms_upper_bound(), Xapian::RSet::internal, min_non_zero(), Xapian::Database::Internal::open_term_list(), query(), Xapian::RSet::size(), Xapian::TermIterator::Internal::skip_to(), term, and Xapian::Internal::TermFreqs::termfreq.
Referenced by LocalSubMatch::prepare_match().
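| static bool Xapian::Weight::Internal::double_param | ( | const char ** | p, |
| double * | ptr_val | ||
| ) | inline static |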
Definition at line 285 of file weightinternal.h.
References C_isspace(), and p.
Referenced by Xapian::Weight::create(), Xapian::BM25Weight::create_from_parameters(), Xapian::BM25PlusWeight::create_from_parameters(), Xapian::InL2Weight::create_from_parameters(), Xapian::IfB2Weight::create_from_parameters(), Xapian::IneB2Weight::create_from_parameters(), Xapian::BB2Weight::create_from_parameters(), Xapian::PL2Weight::create_from_parameters(), Xapian::PL2PlusWeight::create_from_parameters(), Xapian::LMJMWeight::create_from_parameters(), Xapian::LMDirichletWeight::create_from_parameters(), Xapian::LMAbsDiscountWeight::create_from_parameters(), and Xapian::LM2StageWeight::create_from_parameters().
| Xapian::doclength Xapian::Weight::Internal::get_average_length | ( | ) | const | inline |
Definition at line 272 of file weightinternal.h.
References Assert, collection_size, and total_length.
Referenced by Xapian::Weight::init_().
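As a rough illustration of how an average document length could be derived from the two attributes referenced above, the sketch below divides total_length by collection_size. `CollectionStatsSketch` and `average_length_sketch` are hypothetical names introduced here, not part of xapian-core, and returning 0 for an empty collection is an assumption rather than documented behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two statistics referenced above.
struct CollectionStatsSketch {
    std::uint64_t total_length = 0;     // sum of all document lengths
    std::uint32_t collection_size = 0;  // number of documents
};

// Mean document length. Returning 0 for an empty collection is an
// assumption made for this sketch.
inline double average_length_sketch(const CollectionStatsSketch& s) {
    if (s.collection_size == 0) {
        // An empty collection should not carry any length.
        assert(s.total_length == 0);
        return 0.0;
    }
    return static_cast<double>(s.total_length) / s.collection_size;
}
```

With total_length = 30 and collection_size = 4, the function returns 7.5.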
| string Xapian::Weight::Internal::get_description | ( | ) | const |
Return a std::string describing this object.
Definition at line 151 of file weightinternal.cc.
References Xapian::Internal::str().
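| void Xapian::Weight::Internal::get_max_termweight | ( | double & | min_tw, |
| double & | max_tw | ||
| ) | inline |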
Get the minimum and maximum termweights.
Used by the snippet code.
Definition at line 243 of file weightinternal.h.
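One way to picture a minimum/maximum scan over per-term weights is sketched below. It assumes that the weight tracked for each term is the max_part value recorded for it, and that both outputs are 0 when no terms are present. Both points are assumptions made for illustration, and `TermWeightSketch`, `WeightMapSketch` and `min_max_termweight_sketch` are hypothetical names, not xapian-core API.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical per-term record holding only the weight of interest.
struct TermWeightSketch {
    double max_part = 0.0;
};

using WeightMapSketch = std::map<std::string, TermWeightSketch, std::less<>>;

// Writes the smallest and largest max_part into min_tw and max_tw.
// Both are set to 0 when the map is empty (an assumption).
inline void min_max_termweight_sketch(const WeightMapSketch& terms,
                                      double& min_tw, double& max_tw) {
    min_tw = max_tw = 0.0;
    bool first = true;
    for (const auto& entry : terms) {
        const double w = entry.second.max_part;
        if (first) {
            min_tw = max_tw = w;
            first = false;
        } else {
            if (w < min_tw) min_tw = w;
            if (w > max_tw) max_tw = w;
        }
    }
}
```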
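| bool Xapian::Weight::Internal::get_stats | ( | std::string_view | term, |
| Xapian::doccount & | termfreq | ||
| ) | const | inline |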
Get just the termfreq.
Definition at line 213 of file weightinternal.h.
References get_stats(), and term.
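| bool Xapian::Weight::Internal::get_stats | ( | std::string_view | term, |
| Xapian::doccount & | termfreq, | ||
| Xapian::doccount & | reltermfreq, | ||
| Xapian::termcount & | collfreq | ||
| ) | const | inline |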
Get the frequencies for the given term.
termfreq is "n_t", the number of documents in the collection indexed by the given term.
reltermfreq is "r_t", the number of relevant documents in the collection indexed by the given term.
collfreq is the total number of occurrences of the term in all documents.
Definition at line 184 of file weightinternal.h.
References collection_size, rset_size, term, and termfreqs.
Referenced by get_stats(), and Xapian::Weight::init_().
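The lookup described above takes a std::string_view while termfreqs is keyed by std::string with a std::less<> comparator. The sketch below shows how such a lookup can work in C++17 without building a temporary std::string. `TermFreqsSketch`, `TermMapSketch` and `lookup_stats_sketch` are hypothetical names, and zeroing the outputs for an unknown term is an assumption, not the documented behaviour of get_stats().

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical per-term record; field names follow the parameters above.
struct TermFreqsSketch {
    unsigned termfreq = 0;     // n_t: documents indexed by the term
    unsigned reltermfreq = 0;  // r_t: relevant documents indexed by the term
    unsigned collfreq = 0;     // total occurrences of the term
};

// std::less<> is a transparent comparator, so find() accepts a
// std::string_view directly.
using TermMapSketch = std::map<std::string, TermFreqsSketch, std::less<>>;

// Returns true and fills the outputs if the term is present; otherwise
// zeroes the outputs and returns false (an assumption for this sketch).
inline bool lookup_stats_sketch(const TermMapSketch& terms,
                                std::string_view term,
                                unsigned& termfreq,
                                unsigned& reltermfreq,
                                unsigned& collfreq) {
    auto it = terms.find(term);
    if (it == terms.end()) {
        termfreq = reltermfreq = collfreq = 0;
        return false;
    }
    // By definition r_t counts a subset of the documents counted by n_t.
    assert(it->second.reltermfreq <= it->second.termfreq);
    termfreq = it->second.termfreq;
    reltermfreq = it->second.reltermfreq;
    collfreq = it->second.collfreq;
    return true;
}
```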
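| bool Xapian::Weight::Internal::get_termweight | ( | std::string_view | term, |
| double & | termweight | ||
| ) | const | inline |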
Get the termweight.
Definition at line 221 of file weightinternal.h.
References term, and termfreqs.
Referenced by Xapian::check_term().
| void Xapian::Weight::Internal::merge | ( | const Weight::Internal & | o | ) |
Definition at line 141 of file weightinternal.cc.
References have_max_part, and termfreqs.
| Weight::Internal & Xapian::Weight::Internal::operator+= | ( | const Internal & | inc | ) |
Add in the supplied statistics from a sub-database.
Used for remote databases, where we pass across a serialised stats object, unserialise it, and add it to our total.
Definition at line 55 of file weightinternal.cc.
References Assert, collection_size, db_doclength_lower_bound, db_doclength_upper_bound, db_unique_terms_lower_bound, db_unique_terms_upper_bound, min_non_zero(), rset_size, termfreqs, and total_length.
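To make the idea of adding sub-database statistics into a running total concrete, here is a minimal sketch over a few scalar fields. It assumes counts are summed, that a lower bound of 0 means "not set" (suggested by the name min_non_zero in the references above), and that upper bounds combine by taking the larger value. `StatsSketch` and `min_non_zero_sketch` are hypothetical, and the per-term map is left out for brevity.

```cpp
#include <algorithm>
#include <cstdint>

// Smaller of a and b, treating 0 as "unset". Hypothetical helper modelled
// on the name min_non_zero referenced above.
inline std::uint32_t min_non_zero_sketch(std::uint32_t a, std::uint32_t b) {
    if (a == 0) return b;
    if (b == 0) return a;
    return std::min(a, b);
}

// Hypothetical subset of the collection statistics.
struct StatsSketch {
    std::uint64_t total_length = 0;
    std::uint32_t collection_size = 0;
    std::uint32_t rset_size = 0;
    std::uint32_t doclength_lower_bound = 0;
    std::uint32_t doclength_upper_bound = 0;

    // Fold the statistics of one sub-database into this total.
    StatsSketch& operator+=(const StatsSketch& inc) {
        total_length += inc.total_length;
        collection_size += inc.collection_size;
        rset_size += inc.rset_size;
        doclength_lower_bound =
            min_non_zero_sketch(doclength_lower_bound, inc.doclength_lower_bound);
        doclength_upper_bound =
            std::max(doclength_upper_bound, inc.doclength_upper_bound);
        return *this;
    }
};
```

In the remote case described above, each unserialised stats object would play the role of `inc` and be added to the running total in turn.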
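| static bool Xapian::Weight::Internal::param_name | ( | const char ** | p, |
| std::string & | name | ||
| ) | inline static |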
Definition at line 309 of file weightinternal.h.
References p.
Referenced by Xapian::TfIdfWeight::create_from_parameters().
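| static void Xapian::Weight::Internal::parameter_error | ( | const char * | msg, |
| const std::string & | scheme, | ||
| const char * | params | ||
| ) | inline static |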
Definition at line 322 of file weightinternal.h.
Referenced by Xapian::parameter_error(), and parameter_error().
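| void Xapian::Weight::Internal::set_max_part | ( | const std::string & | term, |
| double | max_part | ||
| ) | inline |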
Set max_part for a term.
Definition at line 262 of file weightinternal.h.
References Assert, have_max_part, term, and termfreqs.
| void Xapian::Weight::Internal::set_query | ( | const Xapian::Query & | query_ | ) | inline |
Definition at line 164 of file weightinternal.h.
| Xapian::doccount Xapian::Weight::Internal::collection_size = 0 |
Number of documents in the collection.
Definition at line 123 of file weightinternal.h.
Referenced by get_average_length(), get_stats(), Xapian::Weight::init_(), operator+=(), serialise_stats(), and unserialise_stats().
| Xapian::termcount Xapian::Weight::Internal::db_doclength_lower_bound = 0 |
A lower bound on the minimum length of any document in the database.
Definition at line 129 of file weightinternal.h.
Referenced by Xapian::Weight::init_(), operator+=(), serialise_stats(), and unserialise_stats().
| Xapian::termcount Xapian::Weight::Internal::db_doclength_upper_bound = 0 |
An upper bound on the maximum length of any document in the database.
Definition at line 132 of file weightinternal.h.
Referenced by Xapian::Weight::init_(), operator+=(), serialise_stats(), and unserialise_stats().
| Xapian::termcount Xapian::Weight::Internal::db_unique_terms_lower_bound = 0 |
A lower bound on the number of unique terms in any document.
Definition at line 135 of file weightinternal.h.
Referenced by Xapian::Weight::init_(), operator+=(), serialise_stats(), and unserialise_stats().
| Xapian::termcount Xapian::Weight::Internal::db_unique_terms_upper_bound = 0 |
An upper bound on the number of unique terms in any document.
Definition at line 138 of file weightinternal.h.
Referenced by Xapian::Weight::init_(), operator+=(), serialise_stats(), and unserialise_stats().
| bool Xapian::Weight::Internal::have_max_part = false |
Has max_part been set for any term?
If not, we can avoid having to serialise max_part.
Definition at line 144 of file weightinternal.h.
Referenced by merge(), serialise_stats(), set_max_part(), and unserialise_stats().
| Xapian::Query Xapian::Weight::Internal::query |
The query.
| Xapian::doccount Xapian::Weight::Internal::rset_size = 0 |
Number of relevant documents in the collection.
Definition at line 126 of file weightinternal.h.
Referenced by get_stats(), Xapian::Weight::init_(), operator+=(), serialise_stats(), and unserialise_stats().
| std::map<std::string, TermFreqs, std::less<> > Xapian::Weight::Internal::termfreqs |
Map of term frequencies and relevant term frequencies for the collection.
Definition at line 151 of file weightinternal.h.
Referenced by get_max_termweight(), get_stats(), get_termweight(), merge(), operator+=(), LocalSubMatch::register_lazy_postlist_for_stats(), LeafPostList::resolve_lazy_termweight(), serialise_stats(), set_max_part(), and unserialise_stats().
| Xapian::totallength Xapian::Weight::Internal::total_length = 0 |
Total length of all documents in the collection.
Definition at line 120 of file weightinternal.h.
Referenced by get_average_length(), Xapian::Weight::init_(), operator+=(), serialise_stats(), and unserialise_stats().