PL2PlusWeight::PL2PlusWeight(double c, double delta)
    : param_c(c), param_delta(delta)
// ...
    if (rare(wdf_upper_bound == 0 || mean > 1)) {
// ...
    double base_change(1.0 / log(2.0));
    P1 = mean * base_change + 0.5 * log2(2.0 * M_PI);
// ...
    dw = P_delta / (param_delta + 1.0);
// ...
    double P_max2a = (wdfn_upper + 0.5) * log2(wdfn_upper) / (wdfn_upper + 1.0);
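// ...
    // (P1 - P2 * wdfn) / (wdfn + 1) equals (P1 + P2) / (wdfn + 1) - P2, which
    // decreases with wdfn when P1 + P2 > 0, so the maximum is then at wdfn_lower.
    double wdfn_optb = (P1 + P2 > 0) ? wdfn_lower : wdfn_upper;
    double P_max2b = (P1 - P2 * wdfn_optb) / (wdfn_optb + 1.0);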
// ...
    return "Xapian::PL2PlusWeight";
// ...
    const char *ptr = s.data();
    const char *end = ptr + s.size();
// ...
    if (rare(ptr != end))
// ...
    if (wdf == 0 || mean > 1) {
// ...
    double wdfn = wdf * log2(1 + cl / len);
// ...
    double P = P1 + (wdfn + 0.5) * log2(wdfn) - P2 * wdfn;
// ...
    double wt = (P / (wdfn + 1.0)) + dw;
// ...
    if (rare(wt <= 0))
        return 0.0;
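The per-document fragments of the listing (the `wdf == 0 || mean > 1` guard, `wdfn`, `P`, `wt`, and the clamp at zero) can be pulled together into one standalone function. This is only a sketch under stated assumptions: `PL2PlusTermStats` and `pl2plus_sumpart` are hypothetical names, the per-term constants are passed in rather than computed, and any final scaling by `factor` is left out because the listing does not show it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical bundle of the per-term values that init() prepares.
struct PL2PlusTermStats {
    double P1;    // term constant
    double P2;    // term constant (its definition is not shown in the listing)
    double cl;    // param_c * average document length
    double dw;    // contribution of the delta term
    double mean;  // collection frequency / collection size
};

// Sketch of the per-document score, following the get_sumpart() fragments.
double pl2plus_sumpart(const PL2PlusTermStats& s, unsigned wdf, unsigned len) {
    assert(len > 0);  // assumption: callers never pass an empty document
    if (wdf == 0 || s.mean > 1) return 0.0;

    // Normalise the within-document frequency by the document length.
    double wdfn = wdf * std::log2(1 + s.cl / len);

    double P = s.P1 + (wdfn + 0.5) * std::log2(wdfn) - s.P2 * wdfn;
    double wt = P / (wdfn + 1.0) + s.dw;

    // Non-positive weights are clamped to zero, as in the listing.
    return wt <= 0 ? 0.0 : wt;
}
```

With `cl` equal to the document length, `wdfn` reduces to `wdf`, which makes small cases easy to check by hand.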
The Xapian namespace contains public interfaces for the Xapian library.
double factor
The factor to multiply weights by.
double param_delta
Additional parameter delta in the PL2+ weighting formula.
Xapian::doccount get_collection_size() const
The number of documents in the collection.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms) const
Calculate the term-independent weight component for a document.
double upper_bound
The upper bound on the weight.
Xapian::termcount get_collection_freq() const
The collection frequency of the term.
double param_c
The wdf normalization parameter in the formula.
Upper bound on document lengths.
PL2PlusWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
std::string serialise() const
Return this object's parameters serialised as a single string.
Lower bound on (non-zero) document lengths.
double dw
Weight contribution of delta term in the PL2+ function.
Xapian::Weight subclass implementing the PL2+ probabilistic formula.
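Read off the get_sumpart() and init() fragments in the listing, the weight for a term with within-document frequency wdf in a document of length len appears to take the following form. The symbols c and delta stand for param_c and param_delta, and avglen for get_average_length(); the definitions of P_2 and P_delta are not part of the listing and are left unexpanded here.

```latex
\begin{align*}
\mathrm{wdfn} &= \mathrm{wdf} \cdot \log_2\!\left(1 + \frac{cl}{\mathrm{len}}\right),
  \qquad cl = c \cdot \mathrm{avglen} \\
P_1 &= \frac{\mathrm{mean}}{\ln 2} + \tfrac{1}{2}\log_2(2\pi) \\
P &= P_1 + (\mathrm{wdfn} + 0.5)\,\log_2(\mathrm{wdfn}) - P_2 \cdot \mathrm{wdfn} \\
dw &= \frac{P_\delta}{\delta + 1} \\
wt &= \frac{P}{\mathrm{wdfn} + 1} + dw
\end{align*}
```

A weight of zero is returned when wdf is 0, when mean exceeds 1, or when wt is not positive.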
double P1
Constants for a given term in a given query.
std::string name() const
Return the name of this weighting scheme.
Hierarchy of classes which Xapian can throw as exceptions.
unsigned XAPIAN_TERMCOUNT_BASE_TYPE termcount
A count of terms.
Functions to serialise and unserialise a double.
Length of the current document (sum wdf).
InvalidArgumentError indicates an invalid parameter value was passed to the API.
Xapian::termcount get_doclength_lower_bound() const
A lower bound on the minimum length of any document in the shard.
double unserialise_double(const char **p, const char *end)
Unserialise a double serialised by serialise_double.
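The `const char **p, const char *end` signature implies a cursor-style reader: each call consumes one value and advances the pointer, and the caller then checks that nothing is left over (the `ptr != end` test in the listing). The sketch below illustrates that pattern only. The `toy_*` functions and `Params` are hypothetical, the byte layout is a raw `memcpy` and is NOT Xapian's portable encoding, and `std::runtime_error` stands in for the library's serialisation error.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Toy encoding: the raw bytes of the double (not portable, not Xapian's format).
std::string toy_serialise_double(double v) {
    std::string out(sizeof v, '\0');
    std::memcpy(&out[0], &v, sizeof v);
    return out;
}

// Reads one double from [*p, end) and advances *p past it.
double toy_unserialise_double(const char** p, const char* end) {
    assert(*p <= end);
    if (static_cast<std::size_t>(end - *p) < sizeof(double))
        throw std::runtime_error("truncated double");
    double v;
    std::memcpy(&v, *p, sizeof v);
    *p += sizeof v;
    return v;
}

// Hypothetical holder for the two PL2+ parameters.
struct Params {
    double c;
    double delta;
};

std::string toy_serialise(const Params& prm) {
    return toy_serialise_double(prm.c) + toy_serialise_double(prm.delta);
}

Params toy_unserialise(const std::string& s) {
    const char* ptr = s.data();
    const char* end = ptr + s.size();
    Params prm;
    prm.c = toy_unserialise_double(&ptr, end);
    prm.delta = toy_unserialise_double(&ptr, end);
    // Trailing bytes mean the string was not produced by toy_serialise().
    if (ptr != end)
        throw std::runtime_error("extra data in serialised parameters");
    return prm;
}
```

Checking `ptr != end` after the last read catches strings that carry more data than expected, which a read-only check would miss.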
Indicates an error in the std::string serialisation of an object.
Within-query-frequency of the current term.
Average length of documents in the collection.
double mean
Set by init() to get_collection_freq() / get_collection_size()
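As a rough illustration of how mean feeds into the per-term setup, the sketch below computes mean as a floating-point ratio, applies the guard from the listing (`wdf_upper_bound == 0 || mean > 1`), and derives P1. `TermConstants` and `init_term` are hypothetical names, the floating-point division is an assumption, and the constant `pi` replaces `M_PI`, which is not part of standard C++.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result of the per-term initialisation step.
struct TermConstants {
    bool usable;  // false when the guard from the listing fires
    double mean;  // collection frequency / collection size
    double P1;    // only meaningful when usable is true
};

TermConstants init_term(unsigned collection_freq,
                        unsigned collection_size,
                        unsigned wdf_upper_bound) {
    assert(collection_size > 0);  // assumption: the collection is not empty
    const double pi = 3.14159265358979323846;

    // Assumption: the ratio is taken in floating point, not integer division.
    double mean = static_cast<double>(collection_freq) / collection_size;

    // Terms that never occur, or that average more than one occurrence
    // per document, get no weight.
    if (wdf_upper_bound == 0 || mean > 1) return {false, mean, 0.0};

    double base_change = 1.0 / std::log(2.0);  // converts natural log to log2
    double P1 = mean * base_change + 0.5 * std::log2(2.0 * pi);
    return {true, mean, P1};
}
```

For a term with 50 occurrences across 100 documents, mean is 0.5 and the term stays usable; with 300 occurrences mean is 3 and the guard fires.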
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
Xapian::termcount get_wqf() const
The within-query-frequency of this term.
PL2PlusWeight * clone() const
Clone this object.
Xapian::termcount get_doclength_upper_bound() const
An upper bound on the maximum length of any document in the shard.
Sum of wdf over the whole collection for the current term.
Within-document-frequency of the current term in the current document.
Xapian::doclength get_average_length() const
The average length of a document in the collection.
std::string serialise_double(double v)
Serialise a double to a string.
void init(double factor_)
Allow the subclass to perform any initialisation it needs to.
double cl
Set by init() to (param_c * get_average_length())
Number of documents in the collection.
Defines a log2() function to find the logarithm to base 2 if not already defined in the library...
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms) const
Calculate the weight contribution for this object's term to a document.
void need_stat(stat_flags flag)
Tell Xapian that your subclass will want a particular statistic.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
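One way to obtain such a bound, sketched here under assumptions, is to split P / (wdfn + 1) into two parts and maximise each over the interval of possible wdfn values. Part (a), (wdfn + 0.5) * log2(wdfn) / (wdfn + 1), grows with wdfn. Part (b), (P1 - P2 * wdfn) / (wdfn + 1), can be rewritten as (P1 + P2) / (wdfn + 1) - P2, so it is monotonic and the sign of P1 + P2 decides which endpoint maximises it. The name `pl2plus_p_bound` and its parameters are hypothetical, and the delta term and `factor` are left out.

```cpp
#include <cassert>
#include <cmath>

// Upper bound on P / (wdfn + 1) for wdfn in [wdfn_lower, wdfn_upper],
// where P = P1 + (wdfn + 0.5) * log2(wdfn) - P2 * wdfn.
double pl2plus_p_bound(double P1, double P2,
                       double wdfn_lower, double wdfn_upper) {
    assert(0 < wdfn_lower && wdfn_lower <= wdfn_upper);

    // Part (a) increases with wdfn, so evaluate it at the upper end.
    double P_max2a =
        (wdfn_upper + 0.5) * std::log2(wdfn_upper) / (wdfn_upper + 1.0);

    // Part (b) has derivative -(P1 + P2) / (wdfn + 1)^2: it decreases when
    // P1 + P2 > 0 (maximum at the lower end), otherwise it does not decrease.
    double wdfn_optb = (P1 + P2 > 0) ? wdfn_lower : wdfn_upper;
    double P_max2b = (P1 - P2 * wdfn_optb) / (wdfn_optb + 1.0);

    return P_max2a + P_max2b;
}
```

Because the two parts are maximised at possibly different values of wdfn, the result is a valid bound but not necessarily a tight one.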
Xapian::termcount get_wdf_upper_bound() const
An upper bound on the wdf of this term in the shard.