xapian-core  1.4.27
OmDocumentTerm Class Reference

A term in a document.

#include <documentterm.h>


Public Types

typedef vector< Xapian::termpos > term_positions
 

Public Member Functions

 OmDocumentTerm (Xapian::termcount wdf_)
 Make a new term.
 
void merge () const
 Merge sorted ranges before and after split.
 
const term_positions * get_vector_termpos () const
 
Xapian::termcount positionlist_count () const
 
void remove ()
 
bool add_position (Xapian::termcount wdf_inc, Xapian::termpos termpos)
 Add a position.
 
void append_position (Xapian::termpos termpos)
 Append a position.
 
void remove_position (Xapian::termpos tpos)
 Remove an entry from the position list.
 
Xapian::termpos remove_positions (Xapian::termpos termpos_first, Xapian::termpos termpos_last)
 Remove a range of positions.
 
bool increase_wdf (Xapian::termcount delta)
 Increase within-document frequency.
 
void decrease_wdf (Xapian::termcount delta)
 Decrease within-document frequency.
 
Xapian::termcount get_wdf () const
 Get the wdf.
 
bool is_deleted () const
 Has this term been deleted from this document?
 
string get_description () const
 Return a string describing this object.
 

Public Attributes

Xapian::termcount wdf
 Within document frequency of the term.
 
unsigned split = 0
 Split point in the position range.
 

Private Attributes

term_positions positions
 Positional information.
 

Detailed Description

A term in a document.

Definition at line 38 of file documentterm.h.

Member Typedef Documentation

◆ term_positions

typedef vector<Xapian::termpos> OmDocumentTerm::term_positions

Definition at line 77 of file documentterm.h.

Constructor & Destructor Documentation

◆ OmDocumentTerm()

OmDocumentTerm::OmDocumentTerm ( Xapian::termcount  wdf_)
inline explicit

Make a new term.

Parameters
wdf_    Initial wdf.

Definition at line 44 of file documentterm.h.

References LOGCALL_CTOR.

Member Function Documentation

◆ add_position()

bool OmDocumentTerm::add_position ( Xapian::termcount  wdf_inc,
Xapian::termpos  termpos 
)

Add a position.

If termpos is already present, this is a no-op.

Parameters
wdf_inc    wdf increment
termpos    Position to add
Returns
true if the term was flagged as deleted before the operation.
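
The contract above can be illustrated with a simplified standalone model. This is a sketch only, not the Xapian implementation: `TermSketch` and its `deleted` member are hypothetical names, plain `unsigned` stands in for `Xapian::termcount` and `Xapian::termpos`, and the sketch assumes that the wdf increment is still applied when the position is already in the list, and that adding to a deleted term revives it with fresh state.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical simplified model of a document term; not Xapian code.
struct TermSketch {
    unsigned wdf = 0;
    std::vector<unsigned> positions;  // kept strictly increasing in this model
    bool deleted = false;             // stand-in for the real deleted flag

    // Returns true if the term was flagged as deleted before the call.
    bool add_position(unsigned wdf_inc, unsigned termpos) {
        const bool was_deleted = deleted;
        if (was_deleted) {
            // Assumption: re-adding revives the term with fresh state.
            deleted = false;
            wdf = 0;
            positions.clear();
        }
        // Assumption: the wdf grows even if termpos is already recorded.
        wdf += wdf_inc;
        auto it = std::lower_bound(positions.begin(), positions.end(), termpos);
        if (it == positions.end() || *it != termpos) {
            positions.insert(it, termpos);  // new position: insert in order
        }
        // Otherwise the position list is left unchanged (the no-op case).
        assert(std::is_sorted(positions.begin(), positions.end()));
        return was_deleted;
    }
};
```

In this model, adding position 5 twice with an increment of 1 leaves one entry in the position list and a wdf of 2.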

Definition at line 255 of file omdocument.cc.

References AssertRel, LOGCALL, and rare.

Referenced by remove().

◆ append_position()

void OmDocumentTerm::append_position ( Xapian::termpos  termpos)
inline

Append a position.

The position must be >= the largest currently in the list.

Definition at line 125 of file documentterm.h.

References remove_position(), and remove_positions().

Referenced by Xapian::Document::Internal::add_posting(), and Xapian::Document::Internal::need_terms().

◆ decrease_wdf()

void OmDocumentTerm::decrease_wdf ( Xapian::termcount  delta)
inline

Decrease within-document frequency.

Definition at line 167 of file documentterm.h.

◆ get_description()

string OmDocumentTerm::get_description ( ) const

Return a string describing this object.

Definition at line 388 of file omdocument.cc.

References Xapian::Internal::str().

Referenced by is_deleted().

◆ get_vector_termpos()

const term_positions* OmDocumentTerm::get_vector_termpos ( ) const
inline

Definition at line 96 of file documentterm.h.

References merge(), and positions.

◆ get_wdf()

Xapian::termcount OmDocumentTerm::get_wdf ( ) const
inline

Get the wdf.

Definition at line 177 of file documentterm.h.

References wdf.

◆ increase_wdf()

bool OmDocumentTerm::increase_wdf ( Xapian::termcount  delta)
inline

Increase within-document frequency.

Returns
true if the term was flagged as deleted before the operation.

Definition at line 156 of file documentterm.h.

References is_deleted(), and rare.

◆ is_deleted()

bool OmDocumentTerm::is_deleted ( ) const
inline

Has this term been deleted from this document?

We flag entries as deleted instead of actually deleting them to avoid invalidating existing TermIterator objects.
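
A minimal sketch of this tombstone approach follows. It is an illustration under stated assumptions, not Xapian code: `TermEntry`, `mark_deleted()` and `count_live_terms()` are hypothetical names, a `std::map` stands in for whatever container holds the terms, and the deleted state uses the encoding documented for split (empty positions with split > 0).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a term entry; not the Xapian class.
struct TermEntry {
    unsigned wdf = 0;
    std::vector<unsigned> positions;
    unsigned split = 0;

    // Deleted is encoded as: no positions, but a non-zero split.
    bool is_deleted() const { return positions.empty() && split > 0; }

    // Flag the entry instead of erasing it from its container.
    void mark_deleted() {
        positions.clear();
        split = 1;
        assert(is_deleted());
    }
};

// Readers walk the container and skip flagged entries.
std::size_t count_live_terms(const std::map<std::string, TermEntry>& terms) {
    std::size_t live = 0;
    for (const auto& kv : terms) {
        if (!kv.second.is_deleted()) ++live;
    }
    return live;
}
```

Because the entry stays in the container, an iterator obtained before `mark_deleted()` still refers to a valid element afterwards; code that walks the terms only has to skip flagged entries.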

Definition at line 184 of file documentterm.h.

References get_description().

Referenced by increase_wdf().

◆ merge()

void OmDocumentTerm::merge ( ) const

Merge sorted ranges before and after split.

Definition at line 245 of file omdocument.cc.

References Assert.

Referenced by get_vector_termpos().

◆ positionlist_count()

Xapian::termcount OmDocumentTerm::positionlist_count ( ) const
inline

Definition at line 101 of file documentterm.h.

◆ remove()

void OmDocumentTerm::remove ( )
inline

Definition at line 105 of file documentterm.h.

References add_position().

◆ remove_position()

void OmDocumentTerm::remove_position ( Xapian::termpos  tpos)

Remove an entry from the position list.

This removes an entry from the list of positions.

This does not change the value of the wdf.

Exceptions
Xapian::InvalidArgumentError    is thrown if the position does not occur in the position list.
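
The behaviour described above might look like the following on a plain sorted vector. This is a hedged sketch, not the library's code: `remove_position_sketch` is a hypothetical free function, `std::invalid_argument` stands in for `Xapian::InvalidArgumentError`, and the list is assumed to be a single sorted range (no pending split).

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper; std::invalid_argument stands in for
// Xapian::InvalidArgumentError.
void remove_position_sketch(std::vector<unsigned>& positions, unsigned tpos) {
    // Assumption: the list is one sorted range.
    assert(std::is_sorted(positions.begin(), positions.end()));
    auto it = std::lower_bound(positions.begin(), positions.end(), tpos);
    if (it == positions.end() || *it != tpos) {
        throw std::invalid_argument("position not in position list");
    }
    positions.erase(it);  // only the list changes; no wdf is touched here
}
```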

Definition at line 324 of file omdocument.cc.

References Assert, LOGCALL_VOID, rare, and Xapian::Internal::str().

Referenced by append_position().

◆ remove_positions()

Xapian::termpos OmDocumentTerm::remove_positions ( Xapian::termpos  termpos_first,
Xapian::termpos  termpos_last 
)

Remove a range of positions.

Parameters
termpos_first    First position to remove
termpos_last    Last position to remove

It's OK if there are no positions in the specified range.

Returns
the number of positions removed.
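
As an illustration of these semantics, a range removal on a sorted vector could be sketched as below. The function name `remove_positions_sketch` is hypothetical, `unsigned` stands in for `Xapian::termpos`, and the sketch assumes that the range is inclusive at both ends and that the list is a single sorted range.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not Xapian code. Removes every position p with
// first <= p <= last and returns how many were removed.
unsigned remove_positions_sketch(std::vector<unsigned>& positions,
                                 unsigned first, unsigned last) {
    // Assumption: the list is one sorted range.
    assert(std::is_sorted(positions.begin(), positions.end()));
    if (first > last) return 0;  // empty range: nothing to do
    auto lo = std::lower_bound(positions.begin(), positions.end(), first);
    auto hi = std::upper_bound(lo, positions.end(), last);
    const unsigned removed = static_cast<unsigned>(hi - lo);
    positions.erase(lo, hi);  // erasing an empty range is fine
    return removed;
}
```

A range that matches no positions simply returns 0, which mirrors the note that an empty match is not an error.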

Definition at line 362 of file omdocument.cc.

References Assert, and LOGCALL.

Referenced by append_position().

Member Data Documentation

◆ positions

term_positions OmDocumentTerm::positions
mutable private

Positional information.

This is a list of positions at which the term occurs in the document. The list is in strictly increasing order of term position.

The positions start at 1.

Note that, even if positional information is present, the WDF might not be equal to the length of the position list, since a term might occur multiple times at a single position, but will only have one entry in the position list for each position.

Definition at line 93 of file documentterm.h.

Referenced by get_vector_termpos().

◆ split

unsigned OmDocumentTerm::split = 0
mutable

Split point in the position range.

To allow more efficient insertion of positions, we support the positions being split into two sorted ranges, and if this is the case, split will be > 0 and there will be two sorted ranges [0, split) and [split, positions.size()).

If split is 0, then [0, positions.size()) form a single sorted range.

If positions.empty(), then split > 0 indicates that the term has been deleted (this allows us to delete terms without invalidating existing TermIterator objects).

Use type unsigned here to avoid bloating this structure. More than 4 billion positions in one document is not sensible (and not possible unless termpos is configured to be 64 bit).
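
Combining the two sorted ranges back into one can be sketched with `std::inplace_merge`. This is an illustration of the idea rather than a statement of how merge() is written: `merge_split_ranges` is a hypothetical free function operating on a plain vector and a split index.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not Xapian code. Merges the sorted ranges
// [0, split) and [split, positions.size()) and resets split to 0.
void merge_split_ranges(std::vector<unsigned>& positions, unsigned& split) {
    // split == 0: already a single sorted range.
    // Empty list with split > 0: the deleted marker, which is left alone.
    if (split == 0 || positions.empty()) return;
    assert(split <= positions.size());
    std::inplace_merge(positions.begin(),
                       positions.begin() + split,
                       positions.end());
    split = 0;
}
```

For example, positions {1, 4, 9, 2, 3, 7} with split = 3 become {1, 2, 3, 4, 7, 9} with split = 0. Deferring the merge like this lets an out-of-order insertion be a cheap append, with the sorting cost paid once when a single sorted list is needed.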

Definition at line 72 of file documentterm.h.

◆ wdf

Xapian::termcount OmDocumentTerm::wdf

Within document frequency of the term.

This is the number of occurrences of the term in the document.

Definition at line 53 of file documentterm.h.

Referenced by get_wdf().


The documentation for this class was generated from the following files: