xapian-core 1.4.26
#include <documentterm.h>
Public Types | |
typedef vector< Xapian::termpos > | term_positions |
Public Member Functions | |
OmDocumentTerm (Xapian::termcount wdf_) | |
Make a new term. | |
void | merge () const |
Merge sorted ranges before and after split. | |
const term_positions * | get_vector_termpos () const |
Xapian::termcount | positionlist_count () const |
void | remove () |
bool | add_position (Xapian::termcount wdf_inc, Xapian::termpos termpos) |
Add a position. | |
void | append_position (Xapian::termpos termpos) |
Append a position. | |
void | remove_position (Xapian::termpos tpos) |
Remove an entry from the position list. | |
Xapian::termpos | remove_positions (Xapian::termpos termpos_first, Xapian::termpos termpos_last) |
Remove a range of positions. | |
bool | increase_wdf (Xapian::termcount delta) |
Increase within-document frequency. | |
void | decrease_wdf (Xapian::termcount delta) |
Decrease within-document frequency. | |
Xapian::termcount | get_wdf () const |
Get the wdf. | |
bool | is_deleted () const |
Has this term been deleted from this document? | |
string | get_description () const |
Return a string describing this object. | |
Public Attributes | |
Xapian::termcount | wdf |
Within-document frequency of the term. | |
unsigned | split = 0 |
Split point in the position range. | |
Private Attributes | |
term_positions | positions |
Positional information. | |
A term in a document.
Definition at line 38 of file documentterm.h.
typedef vector<Xapian::termpos> OmDocumentTerm::term_positions |
Definition at line 77 of file documentterm.h.
OmDocumentTerm::OmDocumentTerm | ( | Xapian::termcount | wdf_ | ) |
inline explicit |
Make a new term.
wdf_ | Initial wdf. |
Definition at line 44 of file documentterm.h.
References LOGCALL_CTOR.
bool OmDocumentTerm::add_position | ( | Xapian::termcount | wdf_inc, |
Xapian::termpos | termpos | ||
) |
Add a position.
void OmDocumentTerm::append_position | ( | Xapian::termpos | termpos | ) |
inline |
Append a position.
The position must be >= the largest currently in the list.
Definition at line 125 of file documentterm.h.
References remove_position(), and remove_positions().
Referenced by Xapian::Document::Internal::add_posting(), and Xapian::Document::Internal::need_terms().
void OmDocumentTerm::decrease_wdf | ( | Xapian::termcount | delta | ) |
inline |
Decrease within-document frequency.
Definition at line 167 of file documentterm.h.
string OmDocumentTerm::get_description | ( | ) | const |
Return a string describing this object.
Definition at line 388 of file omdocument.cc.
References Xapian::Internal::str().
Referenced by is_deleted().
const term_positions * OmDocumentTerm::get_vector_termpos | ( | ) | const |
inline |
Definition at line 96 of file documentterm.h.
Xapian::termcount OmDocumentTerm::get_wdf | ( | ) | const |
inline |
Get the wdf.
bool OmDocumentTerm::increase_wdf | ( | Xapian::termcount | delta | ) |
inline |
Increase within-document frequency.
Definition at line 156 of file documentterm.h.
References is_deleted(), and rare.
bool OmDocumentTerm::is_deleted | ( | ) | const |
inline |
Has this term been deleted from this document?
We flag entries as deleted instead of actually deleting them to avoid invalidating existing TermIterator objects.
Definition at line 184 of file documentterm.h.
References get_description().
Referenced by increase_wdf().
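The flag-instead-of-erase idea can be sketched outside Xapian as follows. This is a minimal illustration, not the library's code: `TombstoneTerm`, `flagged_entry_still_reachable()` and the use of `std::map` are all hypothetical, and the encoding (empty position list plus non-zero `split`) follows the description of the `split` attribute on this page.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical miniature of a term entry; not Xapian's class.
struct TombstoneTerm {
    unsigned wdf = 0;
    std::vector<unsigned> positions;
    unsigned split = 0;

    // Flag the entry as deleted rather than erasing it from its container.
    void remove() {
        positions.clear();
        split = 1;  // assumption: any non-zero value would serve as the flag
        assert(is_deleted());
    }

    // Deleted means: no positions and a non-zero split.
    bool is_deleted() const { return positions.empty() && split > 0; }
};

// An iterator obtained before the removal still refers to a live map entry,
// because nothing was erased from the container.
inline bool flagged_entry_still_reachable() {
    std::map<std::string, TombstoneTerm> terms;
    terms["apple"].positions = {1, 4};
    auto it = terms.find("apple");
    it->second.remove();
    return it->second.is_deleted() && terms.size() == 1;
}
```

Code that walks the container then has to skip flagged entries itself, which is the cost of keeping iterators valid.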
void OmDocumentTerm::merge | ( | ) | const |
Merge sorted ranges before and after split.
Definition at line 245 of file omdocument.cc.
References Assert.
Referenced by get_vector_termpos().
Xapian::termcount OmDocumentTerm::positionlist_count | ( | ) | const |
inline |
Definition at line 101 of file documentterm.h.
void OmDocumentTerm::remove | ( | ) |
inline |
Definition at line 105 of file documentterm.h.
References add_position().
void OmDocumentTerm::remove_position | ( | Xapian::termpos | tpos | ) |
Remove an entry from the position list.
This removes an entry from the list of positions.
This does not change the value of the wdf.
Xapian::InvalidArgumentError | is thrown if the position does not occur in the position list. |
Definition at line 324 of file omdocument.cc.
References Assert, LOGCALL_VOID, rare, and Xapian::Internal::str().
Referenced by append_position().
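The contract described for remove_position() (remove one entry, leave the wdf alone, throw if the position is absent) might look like this on a plain sorted vector. It is a sketch under assumptions: `remove_position_sketch` is a hypothetical free function, `std::invalid_argument` stands in for `Xapian::InvalidArgumentError`, and the list is assumed to be a single sorted range.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Remove pos from a sorted position list. No wdf is passed in, mirroring the
// documented behaviour that the wdf is not changed.
inline void remove_position_sketch(std::vector<unsigned>& positions,
                                   unsigned pos) {
    assert(std::is_sorted(positions.begin(), positions.end()));
    // Binary search for the first entry >= pos.
    auto it = std::lower_bound(positions.begin(), positions.end(), pos);
    if (it == positions.end() || *it != pos) {
        // Stand-in for Xapian::InvalidArgumentError.
        throw std::invalid_argument("position not in position list");
    }
    positions.erase(it);
}
```

Calling it twice with the same position on the same list throws on the second call, since the entry is already gone.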
Xapian::termpos OmDocumentTerm::remove_positions | ( | Xapian::termpos | termpos_first, |
Xapian::termpos | termpos_last | ||
) |
Remove a range of positions.
termpos_first | First position to remove |
termpos_last | Last position to remove |
It's OK if there are no positions in the specified range.
Definition at line 362 of file omdocument.cc.
References Assert, and LOGCALL.
Referenced by append_position().
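Removing an inclusive range from a sorted list can be sketched with two binary searches. `remove_range_sketch` is a hypothetical helper, not Xapian's function; returning the number of entries removed is this sketch's own choice, since this page does not say what remove_positions() returns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Erase every position p with first <= p <= last from a sorted list.
// An empty match is fine and simply removes nothing.
inline std::size_t remove_range_sketch(std::vector<unsigned>& positions,
                                       unsigned first, unsigned last) {
    assert(std::is_sorted(positions.begin(), positions.end()));
    if (first > last) return 0;  // empty range by construction
    auto lo = std::lower_bound(positions.begin(), positions.end(), first);
    auto hi = std::upper_bound(lo, positions.end(), last);
    const std::size_t removed = static_cast<std::size_t>(hi - lo);
    positions.erase(lo, hi);
    return removed;
}
```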
term_positions OmDocumentTerm::positions |
mutable private |
Positional information.
This is a list of positions at which the term occurs in the document. The list is in strictly increasing order of term position.
The positions start at 1.
Note that, even if positional information is present, the WDF might not be equal to the length of the position list, since a term might occur multiple times at a single position, but will only have one entry in the position list for each position.
Definition at line 93 of file documentterm.h.
Referenced by get_vector_termpos().
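The gap between the wdf and the length of the position list can be illustrated with a small stand-alone sketch. `TermSketch` and `add_occurrence()` are hypothetical names, and incrementing the wdf by exactly one per call is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical miniature of a term entry; not Xapian's class.
struct TermSketch {
    unsigned wdf = 0;
    std::vector<unsigned> positions;  // strictly increasing, values start at 1

    // Record one occurrence at pos. The wdf always grows; the position list
    // only grows when pos has not been seen before.
    void add_occurrence(unsigned pos) {
        assert(pos >= 1);
        ++wdf;
        auto it = std::lower_bound(positions.begin(), positions.end(), pos);
        if (it == positions.end() || *it != pos) positions.insert(it, pos);
    }
};
```

Recording occurrences at positions 1, 7 and 7 leaves a wdf of 3 but only two entries in the position list.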
unsigned OmDocumentTerm::split = 0 |
mutable |
Split point in the position range.
To allow more efficient insertion, the positions may be stored as two sorted ranges. In that case split is > 0 and the two sorted ranges are [0, split) and [split, positions.size()).
If split is 0, then [0, positions.size()) form a single sorted range.
If positions.empty(), then split > 0 indicates that the term has been deleted (this allows us to delete terms without invalidating existing TermIterator objects).
Use type unsigned here to avoid bloating this structure. More than 4 billion positions in one document is not sensible (and not possible unless termpos is configured to be 64-bit).
Definition at line 72 of file documentterm.h.
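The two-range layout can be sketched as follows. This is a simplified illustration rather than Xapian's implementation: `SplitPositions` and `add()` are hypothetical, positions are assumed not to be added twice, and using `std::inplace_merge` to combine the ranges is this sketch's choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the positions/split pair; not Xapian's class.
struct SplitPositions {
    std::vector<unsigned> positions;
    std::size_t split = 0;  // 0 means the whole vector is one sorted range

    // Add a position, assuming it is not already present.
    void add(unsigned pos) {
        if (positions.empty() || pos > positions.back()) {
            // In-order append: the last range stays sorted.
            positions.push_back(pos);
            return;
        }
        // Out-of-order: collapse any existing second range, then start a
        // new second range holding just this position.
        if (split > 0) merge();
        split = positions.size();
        positions.push_back(pos);
    }

    // Merge [0, split) and [split, size()) into a single sorted range.
    void merge() {
        assert(split <= positions.size());
        std::inplace_merge(positions.begin(),
                           positions.begin() + split,
                           positions.end());
        split = 0;
    }
};
```

Adding 5, 9, 3, 4 in that order leaves the vector as 5, 9, 3, 4 with split equal to 2. A later merge() yields 3, 4, 5, 9, so an out-of-order insertion costs an append rather than a shift of the whole vector.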
Xapian::termcount OmDocumentTerm::wdf |
Within-document frequency of the term.
This is the number of occurrences of the term in the document.
Definition at line 53 of file documentterm.h.
Referenced by get_wdf().