xapian-core  2.0.0
TermInfo Class Reference

Metadata for a term in a document.

#include <terminfo.h>

Public Member Functions

 TermInfo (Xapian::termcount wdf_)
 Constructor.
 
 TermInfo (Xapian::termcount wdf_, Xapian::termpos termpos)
 Constructor which also adds an initial position.
 
const Xapian::VecCOW< Xapian::termpos > * get_positions () const
 Get a pointer to the positions.
 
bool has_positions () const
 
size_t count_positions () const
 
Xapian::termcount get_wdf () const
 Get the within-document frequency.
 
bool increase_wdf (Xapian::termcount delta)
 Increase within-document frequency.
 
bool decrease_wdf (Xapian::termcount delta)
 Decrease within-document frequency.
 
bool remove ()
 
bool add_position (Xapian::termcount wdf_inc, Xapian::termpos termpos)
 Add a position.
 
void append_position (Xapian::termpos termpos)
 Append a position.
 
bool remove_position (Xapian::termpos termpos)
 Remove a position.
 
Xapian::termpos remove_positions (Xapian::termpos termpos_first, Xapian::termpos termpos_last)
 Remove a range of positions.
 
bool is_deleted () const
 Has this term been deleted from this document?
 

Private Member Functions

void merge () const
 Merge sorted ranges before and after split.
 

Private Attributes

Xapian::termcount wdf
 
unsigned split = 0
 Split point in the position range.
 
Xapian::VecCOW< Xapian::termpos > positions
 Positions at which the term occurs.
 

Detailed Description

Metadata for a term in a document.

Definition at line 28 of file terminfo.h.

Constructor & Destructor Documentation

◆ TermInfo() [1/2]

TermInfo::TermInfo ( Xapian::termcount  wdf_)
inline explicit

Constructor.

Parameters
wdf_  Within-document frequency

Definition at line 65 of file terminfo.h.

◆ TermInfo() [2/2]

TermInfo::TermInfo ( Xapian::termcount  wdf_,
Xapian::termpos  termpos 
)
inline

Constructor which also adds an initial position.

Parameters
wdf_     Within-document frequency
termpos  Position to add

Definition at line 72 of file terminfo.h.

References positions.

Member Function Documentation

◆ add_position()

bool TermInfo::add_position ( Xapian::termcount  wdf_inc,
Xapian::termpos  termpos 
)

Add a position.

If termpos is already present, this is a no-op.

Parameters
wdf_inc  wdf increment
termpos  Position to add
Returns
true if the term was flagged as deleted before the operation.

Definition at line 43 of file terminfo.cc.

References AssertRel, and rare.
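
The "no-op if already present" rule can be illustrated with a small standalone sketch. It is not Xapian's implementation: it uses a plain std::vector instead of Xapian::VecCOW, ignores the wdf increment and the split point, and the name insert_sorted_unique is hypothetical. Unlike TermInfo::add_position(), its return value reports whether an insertion happened.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for Xapian::termpos (an unsigned integer type).
using termpos = unsigned;

// Insert pos into a sorted, duplicate-free vector.
// Returns true if pos was inserted, false if it was already present
// (in which case the vector is left unchanged).
bool insert_sorted_unique(std::vector<termpos>& positions, termpos pos) {
    // Precondition: the list is already sorted.
    assert(std::is_sorted(positions.begin(), positions.end()));
    // Binary search for the first element that is not less than pos.
    auto it = std::lower_bound(positions.begin(), positions.end(), pos);
    if (it != positions.end() && *it == pos) {
        return false;  // Already present: no-op.
    }
    positions.insert(it, pos);
    return true;
}
```

Inserting into the middle of a vector shifts every later element, which is presumably the cost that the split point described under Member Data Documentation is meant to reduce.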

◆ append_position()

void TermInfo::append_position ( Xapian::termpos  termpos)
inline

Append a position.

The position must be >= the largest currently in the list.

Definition at line 145 of file terminfo.h.

References positions.

◆ count_positions()

size_t TermInfo::count_positions ( ) const
inline

Definition at line 84 of file terminfo.h.

References positions.

◆ decrease_wdf()

bool TermInfo::decrease_wdf ( Xapian::termcount  delta)
inline

Decrease within-document frequency.

Returns
true if the adjusted wdf is zero and there are no positions.

Definition at line 107 of file terminfo.h.

References positions, split, and wdf.
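
A minimal sketch of the documented return condition, using a hypothetical TermEntry struct rather than the real class. Clamping at zero when delta exceeds the current wdf is an assumption made here for illustration; the text above does not say how that case is handled.

```cpp
#include <vector>

// Hypothetical stand-ins for Xapian::termcount and Xapian::termpos.
using termcount = unsigned;
using termpos = unsigned;

// Hypothetical, simplified model of a term entry.
struct TermEntry {
    termcount wdf = 0;
    std::vector<termpos> positions;

    // Returns true if the adjusted wdf is zero and there are no positions,
    // i.e. nothing about the term remains in the document.
    bool decrease_wdf(termcount delta) {
        // Assumption: clamp at zero instead of letting the unsigned
        // value wrap around.
        wdf = (wdf >= delta) ? wdf - delta : 0;
        return wdf == 0 && positions.empty();
    }
};
```

A caller could use a true result as the cue to drop the term from the document.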

◆ get_positions()

const Xapian::VecCOW<Xapian::termpos>* TermInfo::get_positions ( ) const
inline

Get a pointer to the positions.

Definition at line 77 of file terminfo.h.

References merge(), positions, and split.

◆ get_wdf()

Xapian::termcount TermInfo::get_wdf ( ) const
inline

Get the within-document frequency.

Definition at line 87 of file terminfo.h.

References wdf.

◆ has_positions()

bool TermInfo::has_positions ( ) const
inline

Definition at line 82 of file terminfo.h.

References positions.

◆ increase_wdf()

bool TermInfo::increase_wdf ( Xapian::termcount  delta)
inline

Increase within-document frequency.

Returns
true if the term was flagged as deleted before the operation.

Definition at line 93 of file terminfo.h.

References is_deleted(), rare, split, and wdf.

◆ is_deleted()

bool TermInfo::is_deleted ( ) const
inline

Has this term been deleted from this document?

We flag entries as deleted instead of actually deleting them to avoid invalidating existing TermIterator objects.

Definition at line 174 of file terminfo.h.

References positions, and split.

Referenced by increase_wdf(), and remove().
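
The flag-instead-of-erase approach can be sketched as follows. The encoding (empty positions with split > 0) follows the description of the split member; the TermEntry struct is hypothetical, and the body and return value of remove() are assumptions, since this page does not document them.

```cpp
#include <vector>

// Hypothetical stand-in for Xapian::termpos.
using termpos = unsigned;

// Hypothetical, simplified model of a term entry.
struct TermEntry {
    std::vector<termpos> positions;
    unsigned split = 0;

    // An entry with no positions and a non-zero split is a tombstone.
    bool is_deleted() const { return positions.empty() && split > 0; }

    // Flag the entry as deleted without erasing it from its container,
    // so iterators over the container stay valid.
    // Assumption: returns false if the entry was already deleted.
    bool remove() {
        if (is_deleted()) return false;
        positions.clear();
        split = 1;
        return true;
    }
};
```

An entry that merely has no positions (split == 0) is not treated as deleted, so terms without positional information remain valid.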

◆ merge()

void TermInfo::merge ( ) const
private

Merge sorted ranges before and after split.

Definition at line 33 of file terminfo.cc.

References Assert.

Referenced by get_positions().

◆ remove()

bool TermInfo::remove ( )
inline

Definition at line 122 of file terminfo.h.

References is_deleted(), positions, and split.

◆ remove_position()

bool TermInfo::remove_position ( Xapian::termpos  termpos)

Remove a position.

Parameters
termpos  Position to remove
Returns
If termpos wasn't present, returns false.

Definition at line 110 of file terminfo.cc.

References Assert, and rare.

◆ remove_positions()

Xapian::termpos TermInfo::remove_positions ( Xapian::termpos  termpos_first,
Xapian::termpos  termpos_last 
)

Remove a range of positions.

Parameters
termpos_first  First position to remove
termpos_last   Last position to remove

It's OK if there are no positions in the specified range.

Returns
the number of positions removed.

Definition at line 145 of file terminfo.cc.

References Assert.
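
As an illustration of range removal on a sorted list, here is a standalone sketch over a plain std::vector. It assumes the range is inclusive at both ends (suggested by "First position to remove" and "Last position to remove") and that the list is a single sorted range; the name remove_range is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Xapian::termpos.
using termpos = unsigned;

// Remove every position in the inclusive range [first, last] from a
// sorted vector and return how many were removed. It is fine if no
// positions fall in the range; the result is then 0.
std::size_t remove_range(std::vector<termpos>& positions,
                         termpos first, termpos last) {
    assert(first <= last);
    // First element >= first.
    auto lo = std::lower_bound(positions.begin(), positions.end(), first);
    // First element > last, searching only from lo onwards.
    auto hi = std::upper_bound(lo, positions.end(), last);
    std::size_t removed = static_cast<std::size_t>(hi - lo);
    positions.erase(lo, hi);
    return removed;
}
```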

Member Data Documentation

◆ positions

Xapian::VecCOW<Xapian::termpos> TermInfo::positions
mutable private

Positions at which the term occurs.

The entries are sorted in strictly increasing order (so duplicate entries are not allowed).

Definition at line 55 of file terminfo.h.

Referenced by append_position(), count_positions(), decrease_wdf(), get_positions(), has_positions(), is_deleted(), remove(), and TermInfo().

◆ split

unsigned TermInfo::split = 0
mutable private

Split point in the position range.

To allow more efficient insertion of positions, the positions may be split into two sorted ranges. When this is the case, split is > 0 and the two sorted ranges are [0, split) and [split, positions.size()).

If split is 0, then [0, positions.size()) form a single sorted range.

If positions.empty(), then split > 0 indicates that the term has been deleted (this allows us to delete terms without invalidating existing TermIterator objects).

Type unsigned is used here to avoid bloating this structure. More than 4 billion positions in one document is not sensible (and not possible unless termpos is configured to be 64-bit).

Definition at line 48 of file terminfo.h.

Referenced by decrease_wdf(), get_positions(), increase_wdf(), is_deleted(), and remove().
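
The two-range layout and its merge step can be sketched with std::inplace_merge. This is an illustrative model, not the code in terminfo.cc: it uses a plain std::vector, the free function merge_split is hypothetical, and it assumes the two ranges share no duplicate values.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for Xapian::termpos.
using termpos = unsigned;

// Merge the sorted ranges [0, split) and [split, positions.size())
// into a single sorted range and reset split to 0.
void merge_split(std::vector<termpos>& positions, unsigned& split) {
    // Nothing to do for a single sorted range (split == 0) or for an
    // empty list (where split > 0 marks a deleted term).
    if (split == 0 || positions.empty()) return;
    assert(split <= positions.size());
    std::inplace_merge(positions.begin(),
                       positions.begin() + split,
                       positions.end());
    split = 0;
}
```

Deferring the merge until the positions are read (get_positions() references merge()) lets several insertions share the cost of one merge pass.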

◆ wdf

Xapian::termcount TermInfo::wdf
private

Definition at line 29 of file terminfo.h.

Referenced by decrease_wdf(), get_wdf(), and increase_wdf().


The documentation for this class was generated from the following files: