PL2Weight::PL2Weight(double c) : param_c(c)
double base_change(1.0 / log(2.0));

P1 = mean * base_change + 0.5 * log2(2.0 * M_PI);
P2 = log2(mean) + base_change;

double P_max2a = (wdfn_upper + 0.5) * log2(wdfn_upper) / (wdfn_upper + 1.0);
double wdfn_optb = (P1 + P2 > 0) ? wdfn_upper : wdfn_lower;
double P_max2b = (P1 - P2 * wdfn_optb) / (wdfn_optb + 1.0);
return "Xapian::PL2Weight";
const char *ptr = s.data();
const char *end = ptr + s.size();

if (rare(ptr != end))
if (wdf == 0) return 0.0;

double wdfn = wdf * log2(1 + cl / len);

double P = P1 + (wdfn + 0.5) * log2(wdfn) - P2 * wdfn;
if (rare(P <= 0)) return 0.0;
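The fragments above can be pieced together into a standalone sketch of the per-term calculation. This is an illustration under stated assumptions, not the library's implementation: `PL2Constants`, `make_constants()` and `pl2_sumpart()` are hypothetical names, `mean` is assumed to be the term's collection frequency divided by the collection size (it is used but not defined in the fragments), and the result is assumed to be divided by `wdfn + 1`, the denominator that appears in `P_max2a` and `P_max2b`. The query-dependent multiplying factor is left out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical holder for the per-term constants; not a Xapian type.
struct PL2Constants {
    double cl;  // param_c * average document length
    double P1;
    double P2;
};

// Assumption: mean = collection frequency of the term / number of documents.
PL2Constants make_constants(double param_c, double average_length,
                            double collection_freq, double collection_size) {
    assert(collection_freq > 0 && collection_size > 0);
    const double pi = std::acos(-1.0);               // portable stand-in for M_PI
    const double base_change = 1.0 / std::log(2.0);  // equals log2(e)
    const double mean = collection_freq / collection_size;
    PL2Constants k;
    k.cl = param_c * average_length;
    k.P1 = mean * base_change + 0.5 * std::log2(2.0 * pi);
    k.P2 = std::log2(mean) + base_change;
    return k;
}

// Normalise wdf by document length, evaluate P, and clamp at zero.
// Assumption: the weight is P / (wdfn + 1); the multiplying factor is omitted.
double pl2_sumpart(const PL2Constants& k, unsigned wdf, unsigned len) {
    assert(len > 0);
    if (wdf == 0) return 0.0;
    const double wdfn = wdf * std::log2(1.0 + k.cl / len);
    // P1 and P2 collect the terms that do not depend on the document, so
    // only two log2() calls are needed per document.
    const double P = k.P1 + (wdfn + 0.5) * std::log2(wdfn) - k.P2 * wdfn;
    if (P <= 0) return 0.0;
    return P / (wdfn + 1.0);
}
```

Precomputing `P1` and `P2` once per term keeps the per-document work to the normalisation of `wdf` and a single evaluation of `P`.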
The Xapian namespace contains public interfaces for the Xapian library.
Xapian::doccount get_collection_size() const
The number of documents in the collection.
Xapian::termcount get_collection_freq() const
The collection frequency of the term.
double param_c
The wdf normalization parameter in the formula.
Upper bound on document lengths.
double upper_bound
The upper bound on the weight.
Lower bound on (non-zero) document lengths.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
double lower_bound
The factor to multiply weights by.
Hierarchy of classes which Xapian can throw as exceptions.
unsigned XAPIAN_TERMCOUNT_BASE_TYPE termcount
A count of terms.
Functions to serialise and unserialise a double.
Length of the current document (sum wdf).
InvalidArgumentError indicates an invalid parameter value was passed to the API.
Xapian::termcount get_doclength_lower_bound() const
A lower bound on the minimum length of any document in the shard.
double unserialise_double(const char **p, const char *end)
Unserialise a double serialised by serialise_double.
Indicates an error in the std::string serialisation of an object.
Within-query-frequency of the current term.
Average length of documents in the collection.
PL2Weight * clone() const
Clone this object.
void init(double factor)
Allow the subclass to perform any initialisation it needs to.
std::string serialise() const
Return this object's parameters serialised as a single string.
std::string name() const
Return the name of this weighting scheme.
This class implements the PL2 weighting scheme.
Xapian::termcount get_wqf() const
The within-query-frequency of this term.
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms) const
Calculate the weight contribution for this object's term to a document.
Xapian::termcount get_doclength_upper_bound() const
An upper bound on the maximum length of any document in the shard.
Sum of wdf over the whole collection for the current term.
Within-document-frequency of the current term in the current document.
double P1
Constants for a given term in a given query.
Xapian::doclength get_average_length() const
The average length of a document in the collection.
std::string serialise_double(double v)
Serialise a double to a string.
double cl
Set by init() to (param_c * get_average_length())
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
PL2Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
Number of documents in the collection.
Defines a log2() function to find the logarithm to base 2 if not already defined in the library...
void need_stat(stat_flags flag)
Tell Xapian that your subclass will want a particular statistic.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms) const
Calculate the term-independent weight component for a document.
Xapian::termcount get_wdf_upper_bound() const
An upper bound on the wdf of this term in the shard.
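The `serialise()`, `unserialise()` and `unserialise_double(const char **p, const char *end)` entries, together with the `ptr`/`end` fragment in the listing, suggest a cursor-style parse followed by a check for leftover bytes. A minimal sketch of that pattern follows. `encode_double()`, `decode_double()` and `parse_single_param()` are hypothetical stand-ins, and the raw-byte copy used here is an assumption for illustration only: it is not Xapian's encoding and is not portable between machines.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for serialise_double(): copies the raw bytes of the
// double into a string. NOT Xapian's format; native byte order only.
std::string encode_double(double v) {
    char buf[sizeof(double)];
    std::memcpy(buf, &v, sizeof(double));
    return std::string(buf, sizeof(double));
}

// Hypothetical stand-in for unserialise_double(): reads one double starting
// at *p and advances *p past the bytes it consumed.
double decode_double(const char** p, const char* end) {
    assert(p != nullptr && *p <= end);
    if (static_cast<std::size_t>(end - *p) < sizeof(double))
        throw std::runtime_error("Insufficient data for a double");
    double v;
    std::memcpy(&v, *p, sizeof(double));
    *p += sizeof(double);
    return v;
}

// Parse a parameter string that must hold exactly one double.
double parse_single_param(const std::string& s) {
    const char* ptr = s.data();
    const char* end = ptr + s.size();
    double c = decode_double(&ptr, end);
    // Anything left between ptr and end means the input was not produced by
    // the matching encoder, so reject it rather than ignore it.
    if (ptr != end)
        throw std::runtime_error("Extra data after parameter");
    return c;
}
```

Passing the cursor by pointer lets the decoder report how much input it consumed, which is what makes the trailing-data check a single pointer comparison.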