xapian-core  1.4.26
ChertPostList Class Reference

A postlist in a chert database.

#include <chert_postlist.h>


Public Member Functions

 ChertPostList (Xapian::Internal::intrusive_ptr< const ChertDatabase > this_db_, const string &term, bool keep_reference)
 Default constructor.

 ~ChertPostList ()
 Destructor.

bool jump_to (Xapian::docid desired_did)
 Used for looking up doclens.

Xapian::doccount get_termfreq () const
 Returns number of docs indexed by this term.

Xapian::docid get_docid () const
 Returns the current docid.

Xapian::termcount get_doclength () const
 Returns the length of current document.

Xapian::termcount get_unique_terms () const
 Return the number of unique terms in the current document.

Xapian::termcount get_wdf () const
 Returns the Within Document Frequency of the term in the current document.

PositionList * read_position_list ()
 Get the list of positions of the term in the current document.

PositionList * open_position_list () const
 Get the list of positions of the term in the current document.

PostList * next (double w_min)
 Move to the next document.

PostList * skip_to (Xapian::docid desired_did, double w_min)
 Skip to next document with docid >= desired_did.

bool at_end () const
 Return true if and only if we're off the end of the list.

Xapian::termcount get_wdf_upper_bound () const

std::string get_description () const
 Get a description of the document.
 
Public Member Functions inherited from LeafPostList
 ~LeafPostList ()

void set_termweight (const Xapian::Weight *weight_)
 Set the weighting scheme to use during matching.

double resolve_lazy_termweight (Xapian::Weight *weight_, Xapian::Weight::Internal *stats, Xapian::termcount qlen, Xapian::termcount wqf, double factor)

Xapian::doccount get_termfreq_min () const
 Get a lower bound on the number of documents indexed by this term.

Xapian::doccount get_termfreq_max () const
 Get an upper bound on the number of documents indexed by this term.

Xapian::doccount get_termfreq_est () const
 Get an estimate of the number of documents indexed by this term.

double get_maxweight () const
 Return an upper bound on what get_weight() can return.

double get_weight () const
 Return the weight contribution for the current position.

double recalc_maxweight ()
 Recalculate the upper bound on what get_weight() can return.

TermFreqs get_termfreq_est_using_stats (const Xapian::Weight::Internal &stats) const
 Get an estimate for the termfreq and reltermfreq, given the stats.

Xapian::termcount count_matching_subqs () const
 Count the number of leaf subqueries which match at the current position.

void gather_position_lists (OrPositionList *orposlist)
 Gather PositionList* objects for a subtree.

virtual LeafPostList * open_nearby_postlist (const std::string &term_) const
 Open another postlist from the same database.

void set_term (const std::string &term_)
 Set the term name.
 
Public Member Functions inherited from Xapian::PostingIterator::Internal
virtual ~Internal ()
 We have virtual methods and want to be able to delete derived classes using a pointer to the base class, so we need a virtual destructor.

virtual const std::string * get_sort_key () const

virtual const std::string * get_collapse_key () const
 If the collapse key is already known, return it.

virtual Internal * check (Xapian::docid did, double w_min, bool &valid)
 Check if the specified docid occurs in this postlist.

Internal * next ()
 Advance the current position to the next document in the postlist.

Internal * skip_to (Xapian::docid did)
 Skip forward to the specified docid.
 
Public Member Functions inherited from Xapian::Internal::intrusive_base
 intrusive_base ()
 Construct with no references.
 

Static Public Member Functions

static void read_number_of_entries (const char **posptr, const char *end, Xapian::doccount *number_of_entries_ptr, Xapian::termcount *collection_freq_ptr)
 Read the number of entries and the collection frequency.
 

Protected Attributes

Xapian::Internal::intrusive_ptr< const ChertDatabase > this_db
 The database we are searching.

ChertPositionList positionlist
 The position list object for this posting list.

bool have_started
 Whether we've started reading the list yet.

Xapian::termcount wdf_upper_bound
 Upper bound on wdf for this postlist.
 
Protected Attributes inherited from LeafPostList
const Xapian::Weight * weight

bool need_doclength

bool need_unique_terms

std::string term
 The term name for this postlist (empty for an alldocs postlist).
 

Private Member Functions

 ChertPostList (const ChertPostList &)
 Copying is not allowed.

void operator= (const ChertPostList &)
 Assignment is not allowed.

bool next_in_chunk ()
 Move to the next item in the chunk, if possible.

void next_chunk ()
 Move to the next chunk.

bool current_chunk_contains (Xapian::docid desired_did)
 Return true if the given document ID lies in the range covered by the current chunk.

void move_to_chunk_containing (Xapian::docid desired_did)
 Move to chunk containing the specified document ID.

bool move_forward_in_chunk_to_at_least (Xapian::docid desired_did)
 Scan forward in the current chunk for the specified document ID.
 

Private Attributes

bool is_last_chunk
 True if this is the last chunk.

bool is_at_end
 Whether we've run off the end of the list yet.

AutoPtr< ChertCursor > cursor
 Cursor pointing to current chunk of postlist.

Xapian::docid first_did_in_chunk
 The first document id in this chunk.

Xapian::docid last_did_in_chunk
 The last document id in this chunk.

const char * pos
 Position of iteration through current chunk.

const char * end
 Pointer to byte after end of current chunk.

Xapian::docid did
 Document id we're currently at.

Xapian::termcount wdf
 The wdf of the current document.

Xapian::doccount number_of_entries
 The number of entries in the posting list.
 

Additional Inherited Members

Public Attributes inherited from Xapian::Internal::intrusive_base
unsigned _refs
 Reference count.

Protected Member Functions inherited from LeafPostList
 LeafPostList (const std::string &term_)
 Only constructable as a base class for derived classes.

Protected Member Functions inherited from Xapian::PostingIterator::Internal
 Internal ()
 Only constructable as a base class for derived classes.
 

Detailed Description

A postlist in a chert database.

Definition at line 132 of file chert_postlist.h.

Constructor & Destructor Documentation

◆ ChertPostList() [1/2]

ChertPostList::ChertPostList ( const ChertPostList & )
private

Copying is not allowed.

◆ ChertPostList() [2/2]

ChertPostList::ChertPostList ( Xapian::Internal::intrusive_ptr< const ChertDatabase >  this_db_,
const string &  term_,
bool  keep_reference 
)

Default constructor.

The format of a postlist is:

Split into chunks. Key for first chunk is the termname (encoded as length : name). Key for subsequent chunks is the same, followed by the document ID of the first document in the chunk (encoded as length of representation in first byte, and then docid).

A chunk (except for the first chunk) contains:

1) bool - true if this is the last chunk.
2) difference between final docid in chunk and first docid.
3) wdf for the first item.
4) increment in docid to next item, followed by wdf for the item.
5) (4) repeatedly.

The first chunk begins with the number of entries, the collection frequency, then the docid of the first document, then has the header of a standard chunk.
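
As a rough illustration of items (3) to (5) of the chunk layout above, the following self-contained sketch decodes a chunk body into (docid, wdf) pairs. It is not Xapian's implementation: Posting, decode_varint() and decode_chunk_body() are hypothetical names, the base-128 variable-length integer encoding is an assumption, and the docid increment is treated as a plain difference, which may not match the real on-disk format.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record: a document ID and the term's wdf in that document.
struct Posting {
    std::uint32_t did;
    std::uint32_t wdf;
};

// Assumed encoding (not confirmed by this page): little-endian base-128,
// with the high bit set on every byte except the last.
static std::uint32_t decode_varint(const char** pos, const char* end) {
    std::uint32_t value = 0;
    unsigned shift = 0;
    while (*pos != end) {
        unsigned char byte = static_cast<unsigned char>(*(*pos)++);
        value |= static_cast<std::uint32_t>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) return value;
        shift += 7;
        if (shift > 28) break;  // more than 5 bytes cannot fit in 32 bits
    }
    throw std::runtime_error("truncated or overlong integer");
}

// Decode the part of a chunk that follows its header, given the chunk's
// first docid (which comes from the key or the first-chunk header).
std::vector<Posting> decode_chunk_body(std::uint32_t first_did,
                                       const std::string& body) {
    const char* pos = body.data();
    const char* end = pos + body.size();
    std::vector<Posting> out;
    std::uint32_t did = first_did;
    // Item (3): wdf for the first entry.
    out.push_back({did, decode_varint(&pos, end)});
    // Items (4) and (5): repeated (docid increment, wdf) pairs.
    while (pos != end) {
        did += decode_varint(&pos, end);
        std::uint32_t wdf = decode_varint(&pos, end);
        out.push_back({did, wdf});
    }
    return out;
}
```

Storing increments rather than absolute docids keeps most values small, so a variable-length integer encoding can usually represent each entry in one or two bytes.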

Definition at line 669 of file chert_postlist.cc.

References cursor, did, end, first_did_in_chunk, Xapian::Internal::intrusive_ptr< T >::get(), is_at_end, is_last_chunk, last_did_in_chunk, LOGCALL_CTOR, LOGLINE, ChertPostListTable::make_key(), number_of_entries, pos, read_start_of_chunk(), read_start_of_first_chunk(), read_wdf(), LeafPostList::term, wdf, and wdf_upper_bound.

◆ ~ChertPostList()

ChertPostList::~ChertPostList ( )

Destructor.

Definition at line 708 of file chert_postlist.cc.

References LOGCALL_DTOR.

Member Function Documentation

◆ at_end()

bool ChertPostList::at_end ( ) const
inline virtual

◆ current_chunk_contains()

bool ChertPostList::current_chunk_contains ( Xapian::docid  desired_did)
private

Return true if the given document ID lies in the range covered by the current chunk.

This does not say whether the document ID is actually present. It will return false if the document ID is greater than the last document ID in the chunk, even if it is less than the first document ID in the next chunk: it is possible for no chunk to contain a particular document ID.

Definition at line 835 of file chert_postlist.cc.

References first_did_in_chunk, last_did_in_chunk, LOGCALL, and RETURN.

Referenced by jump_to(), and skip_to().
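
The range check described for current_chunk_contains() can be pictured with a small sketch. ChunkRange and chunk_range_contains() are hypothetical stand-ins introduced here for illustration; only the field names mirror the members documented on this page.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-chunk state a postlist keeps.
struct ChunkRange {
    std::uint32_t first_did_in_chunk;
    std::uint32_t last_did_in_chunk;
};

// True if desired_did lies within [first, last] for this chunk.
// This says nothing about whether the docid is actually present.
bool chunk_range_contains(const ChunkRange& chunk, std::uint32_t desired_did) {
    return desired_did >= chunk.first_did_in_chunk &&
           desired_did <= chunk.last_did_in_chunk;
}
```

With chunks covering 10..20 and 30..40, a docid of 25 falls in the gap, so the check is false for both chunks.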

◆ get_description()

string ChertPostList::get_description ( ) const
virtual

Get a description of the document.

Implements Xapian::PostingIterator::Internal.

Definition at line 984 of file chert_postlist.cc.

References number_of_entries, Xapian::Internal::str(), and LeafPostList::term.

Referenced by ChertModifiedPostList::get_description().

◆ get_docid()

Xapian::docid ChertPostList::get_docid ( ) const
inline virtual

◆ get_doclength()

Xapian::termcount ChertPostList::get_doclength ( ) const
virtual

◆ get_termfreq()

Xapian::doccount ChertPostList::get_termfreq ( ) const
inline virtual

Returns number of docs indexed by this term.

This is the length of the postlist.

Implements LeafPostList.

Definition at line 252 of file chert_postlist.h.

◆ get_unique_terms()

Xapian::termcount ChertPostList::get_unique_terms ( ) const
virtual

Return the number of unique terms in the current document.

Implements Xapian::PostingIterator::Internal.

Definition at line 724 of file chert_postlist.cc.

References Assert, did, Xapian::Internal::intrusive_ptr< T >::get(), ChertDatabase::get_unique_terms(), have_started, is_at_end, LOGCALL, RETURN, and this_db.

Referenced by ChertModifiedPostList::get_unique_terms().

◆ get_wdf()

Xapian::termcount ChertPostList::get_wdf ( ) const
inline virtual

Returns the Within Document Frequency of the term in the current document.

Reimplemented from Xapian::PostingIterator::Internal.

Definition at line 265 of file chert_postlist.h.

References Assert.

Referenced by ChertAllDocsPostList::get_doclength(), and ChertModifiedPostList::get_wdf().

◆ get_wdf_upper_bound()

Xapian::termcount ChertPostList::get_wdf_upper_bound ( ) const
virtual

Implements LeafPostList.

Definition at line 1312 of file chert_postlist.cc.

References wdf_upper_bound.

◆ jump_to()

bool ChertPostList::jump_to ( Xapian::docid  desired_did)

Used for looking up doclens.

Returns
true if docid desired_did has a document length.

Definition at line 955 of file chert_postlist.cc.

References current_chunk_contains(), did, have_started, is_at_end, LOGCALL, move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), pos, and RETURN.

◆ move_forward_in_chunk_to_at_least()

bool ChertPostList::move_forward_in_chunk_to_at_least ( Xapian::docid  desired_did)
private

Scan forward in the current chunk for the specified document ID.

This is particularly efficient if the desired document ID is greater than the last in the chunk - it then skips straight to the end.

Returns
true if we moved to a valid document, false if we reached the end of the chunk.

Definition at line 894 of file chert_postlist.cc.

References Assert, did, end, last_did_in_chunk, LOGCALL, pos, read_did_increase(), read_wdf(), RETURN, and wdf.

Referenced by jump_to(), and skip_to().
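
The forward scan in move_forward_in_chunk_to_at_least(), including the shortcut when the target lies past the end of the chunk, might look roughly like the sketch below. It works on an already decoded, ascending list of docids instead of the encoded bytes; ChunkCursor and move_forward_to_at_least() are hypothetical names, not part of Xapian.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory view of one chunk: ascending docids plus a position.
struct ChunkCursor {
    std::vector<std::uint32_t> dids;
    std::size_t index = 0;
};

// Advance to the first entry with docid >= desired_did.
// Returns true if positioned on a valid entry, false at end of chunk.
bool move_forward_to_at_least(ChunkCursor& c, std::uint32_t desired_did) {
    // Shortcut: a target beyond the chunk's last docid skips straight to the end.
    if (c.dids.empty() || desired_did > c.dids.back()) {
        c.index = c.dids.size();
        return false;
    }
    // Otherwise scan forward from the current position only.
    while (c.index < c.dids.size() && c.dids[c.index] < desired_did) {
        ++c.index;
    }
    return c.index < c.dids.size();
}
```

The shortcut is what makes the skip cheap: knowing the last docid in the chunk lets the scan be avoided entirely when the target cannot be in this chunk.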

◆ move_to_chunk_containing()

void ChertPostList::move_to_chunk_containing ( Xapian::docid  desired_did)
private

Move to chunk containing the specified document ID.

This moves to the chunk whose starting document ID is <= desired_did, but such that the next chunk's starting document ID is > desired_did.

It is thus possible that current_chunk_contains(desired_did) will return false after this call, since the document ID might lie after the end of this chunk, but before the start of the next chunk.

Definition at line 846 of file chert_postlist.cc.

References Assert, C_unpack_uint_preserving_sort(), check_tname_in_key_lite(), cursor, did, end, first_did_in_chunk, is_at_end, is_last_chunk, last_did_in_chunk, LOGCALL_VOID, ChertPostListTable::make_key(), next_chunk(), number_of_entries, pos, read_start_of_chunk(), read_start_of_first_chunk(), read_wdf(), report_read_error(), LeafPostList::term, and wdf.

Referenced by jump_to(), and skip_to().
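
The chunk selection rule for move_to_chunk_containing() (largest starting docid <= desired_did) can be sketched with an ordered map standing in for the B-tree cursor. ChunkIndex and find_chunk_for() are hypothetical; returning end() when the target precedes every chunk is a choice made for this sketch, not documented behaviour.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical index: first docid of each chunk -> last docid of that chunk.
using ChunkIndex = std::map<std::uint32_t, std::uint32_t>;

// Find the chunk whose first docid is <= desired_did and whose successor
// (if any) starts after desired_did.
ChunkIndex::const_iterator find_chunk_for(const ChunkIndex& chunks,
                                          std::uint32_t desired_did) {
    // upper_bound gives the first chunk starting strictly after desired_did.
    ChunkIndex::const_iterator it = chunks.upper_bound(desired_did);
    if (it == chunks.begin()) return chunks.end();  // target precedes all chunks
    return std::prev(it);
}
```

With chunks 10..20 and 30..40, looking up 25 selects the 10..20 chunk even though 25 is greater than its last docid, which is the gap case described above.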

◆ next()

PostList * ChertPostList::next ( double  w_min)
virtual

Move to the next document.

Implements Xapian::PostingIterator::Internal.

Definition at line 814 of file chert_postlist.cc.

References did, have_started, is_at_end, LOGCALL, LOGLINE, next_chunk(), next_in_chunk(), RETURN, and wdf.

◆ next_chunk()

void ChertPostList::next_chunk ( )
private

◆ next_in_chunk()

bool ChertPostList::next_in_chunk ( )
private

Move to the next item in the chunk, if possible.

If already at the end of the chunk, returns false.

Definition at line 734 of file chert_postlist.cc.

References Assert, did, end, last_did_in_chunk, LOGCALL, pos, read_did_increase(), read_wdf(), RETURN, and wdf.

Referenced by next().

◆ open_position_list()

PositionList * ChertPostList::open_position_list ( ) const
virtual

Get the list of positions of the term in the current document.

Reimplemented from Xapian::PostingIterator::Internal.

Definition at line 806 of file chert_postlist.cc.

References Assert, did, Xapian::Internal::intrusive_ptr< T >::get(), LOGCALL, ChertDatabase::position_table, RETURN, LeafPostList::term, and this_db.

Referenced by ChertModifiedPostList::open_position_list().

◆ operator=()

void ChertPostList::operator= ( const ChertPostList & )
private

Assignment is not allowed.

◆ read_number_of_entries()

void ChertPostList::read_number_of_entries ( const char **  posptr,
const char *  end,
Xapian::doccount *  number_of_entries_ptr,
Xapian::termcount *  collection_freq_ptr 
)
static

Read the number of entries and the collection frequency.

Read the number of entries in the posting list.

This must only be called when *posptr is pointing to the start of the first chunk of the posting list.

Definition at line 639 of file chert_postlist.cc.

References report_read_error(), and unpack_uint().

Referenced by ChertPostListTable::get_freqs(), read_start_of_first_chunk(), and ChertAllTermsList::read_termfreq().

◆ read_position_list()

PositionList * ChertPostList::read_position_list ( )
virtual

◆ skip_to()

PostList * ChertPostList::skip_to ( Xapian::docid  desired_did,
double  w_min 
)
virtual

Member Data Documentation

◆ cursor

AutoPtr<ChertCursor> ChertPostList::cursor
private

Cursor pointing to current chunk of postlist.

Definition at line 154 of file chert_postlist.h.

Referenced by ChertPostList(), ChertPostListTable::get_chunk(), ChertPostListTable::merge_changes(), move_to_chunk_containing(), and next_chunk().

◆ did

Xapian::docid ChertPostList::did
private

◆ end

const char* ChertPostList::end
private

◆ first_did_in_chunk

Xapian::docid ChertPostList::first_did_in_chunk
private

The first document id in this chunk.

Definition at line 157 of file chert_postlist.h.

Referenced by ChertPostList(), current_chunk_contains(), ChertPostListTable::get_chunk(), move_to_chunk_containing(), and next_chunk().

◆ have_started

bool ChertPostList::have_started
protected

Whether we've started reading the list yet.

Definition at line 144 of file chert_postlist.h.

Referenced by get_doclength(), get_unique_terms(), jump_to(), ChertAllDocsModifiedPostList::next(), ChertModifiedPostList::next(), next(), and skip_to().

◆ is_at_end

bool ChertPostList::is_at_end
private

Whether we've run off the end of the list yet.

Definition at line 151 of file chert_postlist.h.

Referenced by ChertPostList(), get_doclength(), get_unique_terms(), jump_to(), move_to_chunk_containing(), next(), next_chunk(), and skip_to().

◆ is_last_chunk

bool ChertPostList::is_last_chunk
private

True if this is the last chunk.

Definition at line 148 of file chert_postlist.h.

Referenced by ChertPostList(), ChertPostListTable::get_chunk(), move_to_chunk_containing(), and next_chunk().

◆ last_did_in_chunk

Xapian::docid ChertPostList::last_did_in_chunk
private

◆ number_of_entries

Xapian::doccount ChertPostList::number_of_entries
private

The number of entries in the posting list.

Definition at line 175 of file chert_postlist.h.

Referenced by ChertPostList(), get_description(), and move_to_chunk_containing().

◆ pos

const char* ChertPostList::pos
private

◆ positionlist

ChertPositionList ChertPostList::positionlist
protected

The position list object for this posting list.

Definition at line 141 of file chert_postlist.h.

Referenced by read_position_list().

◆ this_db

Xapian::Internal::intrusive_ptr<const ChertDatabase> ChertPostList::this_db
protected

◆ wdf

Xapian::termcount ChertPostList::wdf
private

The wdf of the current document.

Definition at line 172 of file chert_postlist.h.

Referenced by ChertPostList(), move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), next(), next_chunk(), next_in_chunk(), and skip_to().

◆ wdf_upper_bound

Xapian::termcount ChertPostList::wdf_upper_bound
protected

Upper bound on wdf for this postlist.

Definition at line 179 of file chert_postlist.h.

Referenced by ChertModifiedPostList::ChertModifiedPostList(), ChertPostList(), and get_wdf_upper_bound().


The documentation for this class was generated from the following files: