xapian-core
1.4.27
A postlist in a glass database.
#include <glass_postlist.h>
Public Member Functions
GlassPostList (Xapian::Internal::intrusive_ptr< const GlassDatabase > this_db_, const string &term, bool keep_reference)
Default constructor.
~GlassPostList ()
Destructor.
LeafPostList * open_nearby_postlist (const std::string &term_) const
Open another postlist from the same database.
bool jump_to (Xapian::docid desired_did)
Used for looking up doclens.
Xapian::doccount get_termfreq () const
Returns number of docs indexed by this term.
Xapian::docid get_docid () const
Returns the current docid.
Xapian::termcount get_doclength () const
Returns the length of current document.
Xapian::termcount get_unique_terms () const
Returns the number of unique terms in the current document.
Xapian::termcount get_wdf () const
Returns the Within Document Frequency of the term in the current document.
PositionList * read_position_list ()
Get the list of positions of the term in the current document.
PositionList * open_position_list () const
Get the list of positions of the term in the current document.
PostList * next (double w_min)
Move to the next document.
PostList * skip_to (Xapian::docid desired_did, double w_min)
Skip to next document with docid >= desired_did.
bool at_end () const
Return true if and only if we're off the end of the list.
Xapian::termcount get_wdf_upper_bound () const
std::string get_description () const
Get a description of the document.
Public Member Functions inherited from LeafPostList
~LeafPostList ()
void set_termweight (const Xapian::Weight *weight_)
Set the weighting scheme to use during matching.
double resolve_lazy_termweight (Xapian::Weight *weight_, Xapian::Weight::Internal *stats, Xapian::termcount qlen, Xapian::termcount wqf, double factor)
Xapian::doccount get_termfreq_min () const
Get a lower bound on the number of documents indexed by this term.
Xapian::doccount get_termfreq_max () const
Get an upper bound on the number of documents indexed by this term.
Xapian::doccount get_termfreq_est () const
Get an estimate of the number of documents indexed by this term.
double get_maxweight () const
Return an upper bound on what get_weight() can return.
double get_weight () const
Return the weight contribution for the current position.
double recalc_maxweight ()
Recalculate the upper bound on what get_weight() can return.
TermFreqs get_termfreq_est_using_stats (const Xapian::Weight::Internal &stats) const
Get an estimate for the termfreq and reltermfreq, given the stats.
Xapian::termcount count_matching_subqs () const
Count the number of leaf subqueries which match at the current position.
void gather_position_lists (OrPositionList *orposlist)
Gather PositionList* objects for a subtree.
void set_term (const std::string &term_)
Set the term name.
Public Member Functions inherited from Xapian::PostingIterator::Internal
virtual ~Internal ()
We have virtual methods and want to be able to delete derived classes using a pointer to the base class, so we need a virtual destructor.
virtual const std::string * get_sort_key () const
virtual const std::string * get_collapse_key () const
If the collapse key is already known, return it.
virtual Internal * check (Xapian::docid did, double w_min, bool &valid)
Check if the specified docid occurs in this postlist.
Internal * next ()
Advance the current position to the next document in the postlist.
Internal * skip_to (Xapian::docid did)
Skip forward to the specified docid.
Public Member Functions inherited from Xapian::Internal::intrusive_base
intrusive_base ()
Construct with no references.
Static Public Member Functions
static void read_number_of_entries (const char **posptr, const char *end, Xapian::doccount *number_of_entries_ptr, Xapian::termcount *collection_freq_ptr)
Read the number of entries and the collection frequency.
Private Member Functions
GlassPostList (const GlassPostList &)
Copying is not allowed.
void operator= (const GlassPostList &)
Assignment is not allowed.
bool next_in_chunk ()
Move to the next item in the chunk, if possible.
void next_chunk ()
Move to the next chunk.
bool current_chunk_contains (Xapian::docid desired_did)
Return true if the given document ID lies in the range covered by the current chunk.
void move_to_chunk_containing (Xapian::docid desired_did)
Move to chunk containing the specified document ID.
bool move_forward_in_chunk_to_at_least (Xapian::docid desired_did)
Scan forward in the current chunk for the specified document ID.
GlassPostList (Xapian::Internal::intrusive_ptr< const GlassDatabase > this_db_, const string &term, GlassCursor *cursor_)
void init ()
Private Attributes
Xapian::Internal::intrusive_ptr< const GlassDatabase > this_db
The database we are searching.
GlassPositionList positionlist
The position list object for this posting list.
bool have_started
Whether we've started reading the list yet.
bool is_last_chunk
True if this is the last chunk.
bool is_at_end
Whether we've run off the end of the list yet.
AutoPtr< GlassCursor > cursor
Cursor pointing to current chunk of postlist.
Xapian::docid first_did_in_chunk
The first document id in this chunk.
Xapian::docid last_did_in_chunk
The last document id in this chunk.
const char * pos
Position of iteration through current chunk.
const char * end
Pointer to byte after end of current chunk.
Xapian::docid did
Document id we're currently at.
Xapian::termcount wdf
The wdf of the current document.
Xapian::doccount number_of_entries
The number of entries in the posting list.
Xapian::termcount wdf_upper_bound
Upper bound on wdf for this postlist.
Additional Inherited Members
Public Attributes inherited from Xapian::Internal::intrusive_base
unsigned _refs
Reference count.
Protected Member Functions inherited from LeafPostList
LeafPostList (const std::string &term_)
Only constructable as a base class for derived classes.
Protected Member Functions inherited from Xapian::PostingIterator::Internal
Internal ()
Only constructable as a base class for derived classes.
Protected Attributes inherited from LeafPostList
const Xapian::Weight * weight
bool need_doclength
bool need_unique_terms
std::string term
The term name for this postlist (empty for an alldocs postlist).
A postlist in a glass database.
Definition at line 55 of file glass_postlist.h.
GlassPostList::GlassPostList (const GlassPostList &) [private]
Copying is not allowed.
Referenced by open_nearby_postlist().
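GlassPostList::GlassPostList (Xapian::Internal::intrusive_ptr< const GlassDatabase > this_db_, const string &term, GlassCursor *cursor_) [private]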
Definition at line 711 of file glass_postlist.cc.
References Xapian::Internal::intrusive_ptr< T >::get(), init(), and LOGCALL_CTOR.
GlassPostList::GlassPostList (Xapian::Internal::intrusive_ptr< const GlassDatabase > this_db_, const string &term_, bool keep_reference)
Default constructor.
The format of a postlist is:
Split into chunks. Key for first chunk is the termname (encoded as length : name). Key for subsequent chunks is the same, followed by the document ID of the first document in the chunk (encoded as length of representation in first byte, and then docid).
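The key layout described above can be illustrated with a small sketch. It is not the library's encoder: `encode_sortable_uint` and `make_chunk_key` are hypothetical names, and the one-byte length prefix on the term is a simplifying assumption that only holds for terms shorter than 256 bytes. The point it shows is why the docid is stored as "length, then bytes": with big-endian bytes behind a length byte, byte-wise key comparison agrees with numeric docid order, so chunks of one term sort by their first docid.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: one length byte followed by that many big-endian
// bytes, so that comparing the encoded strings matches numeric order.
std::string encode_sortable_uint(std::uint32_t value) {
    std::string bytes;
    while (value != 0) {
        bytes.insert(bytes.begin(), static_cast<char>(value & 0xff));
        value >>= 8;
    }
    std::string out(1, static_cast<char>(bytes.size()));
    out += bytes;
    return out;
}

// Hypothetical key for the first chunk: length-prefixed term name.
// Assumption: the term is shorter than 256 bytes.
std::string make_chunk_key(const std::string& term) {
    std::string key(1, static_cast<char>(term.size()));
    key += term;
    return key;
}

// Hypothetical key for a later chunk: the first-chunk key followed by the
// sortable encoding of the chunk's first docid.
std::string make_chunk_key(const std::string& term, std::uint32_t first_did) {
    return make_chunk_key(term) + encode_sortable_uint(first_did);
}
```

Because the first chunk's key is a strict prefix of every later chunk's key, it sorts before all of them, which is what lets a cursor walk the chunks of one term in docid order.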
A chunk (except for the first chunk) contains:
1) bool - true if this is the last chunk.
2) difference between final docid in chunk and first docid.
3) wdf for the first item.
4) increment in docid to next item, followed by wdf for the item.
5) (4) repeatedly.
The first chunk begins with the number of entries, the collection frequency, then the docid of the first document, then has the header of a standard chunk.
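As a rough illustration of the standard chunk layout listed above, the sketch below decodes a non-first chunk into (docid, wdf) pairs. The integer encoding is an assumption made for the example (7 bits per byte, low group first), as is storing the last-chunk flag as such an integer; `put_uint`, `get_uint`, `Chunk` and `decode_chunk` are hypothetical names and not part of Xapian. The first docid is passed in because, per the description, it comes from the chunk's key.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Assumed encoding for this sketch: 7 bits per byte, least significant
// group first, high bit set on every byte except the last.
void put_uint(std::string& out, std::uint32_t v) {
    while (v >= 0x80) {
        out += static_cast<char>((v & 0x7f) | 0x80);
        v >>= 7;
    }
    out += static_cast<char>(v);
}

std::uint32_t get_uint(const char*& pos, const char* end) {
    std::uint32_t value = 0;
    for (int shift = 0; pos != end && shift <= 28; shift += 7) {
        unsigned char byte = static_cast<unsigned char>(*pos++);
        value |= static_cast<std::uint32_t>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) return value;
    }
    throw std::runtime_error("truncated or overlong integer");
}

struct Chunk {
    bool is_last;
    std::vector<std::pair<std::uint32_t, std::uint32_t>> postings;  // (docid, wdf)
};

// Decode a standard (non-first) chunk whose first docid is known from its key.
Chunk decode_chunk(const std::string& data, std::uint32_t first_did) {
    const char* pos = data.data();
    const char* end = pos + data.size();
    Chunk chunk;
    chunk.is_last = get_uint(pos, end) != 0;                  // 1) last-chunk flag
    std::uint32_t last_did = first_did + get_uint(pos, end);  // 2) last - first
    std::uint32_t did = first_did;
    chunk.postings.emplace_back(did, get_uint(pos, end));     // 3) wdf of first item
    while (pos != end) {                                      // 4) and 5)
        did += get_uint(pos, end);
        std::uint32_t wdf = get_uint(pos, end);
        chunk.postings.emplace_back(did, wdf);
    }
    if (did != last_did)
        throw std::runtime_error("chunk does not end at the expected docid");
    return chunk;
}
```

Storing increments rather than absolute docids keeps most values small, which is what makes a variable-length integer encoding pay off.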
Definition at line 698 of file glass_postlist.cc.
References Xapian::Internal::intrusive_ptr< T >::get(), init(), and LOGCALL_CTOR.
GlassPostList::~GlassPostList ()
bool GlassPostList::at_end () const [inline, virtual]
Return true if and only if we're off the end of the list.
Implements Xapian::PostingIterator::Internal.
Definition at line 210 of file glass_postlist.h.
Referenced by GlassAllDocsPostList::get_wdf().
bool GlassPostList::current_chunk_contains (Xapian::docid desired_did) [private]
Return true if the given document ID lies in the range covered by the current chunk.
This does not say whether the document ID is actually present. It will return false if the document ID is greater than the last document ID in the chunk, even if it is less than the first document ID in the next chunk: it is possible for no chunk to contain a particular document ID.
Definition at line 892 of file glass_postlist.cc.
References first_did_in_chunk, last_did_in_chunk, LOGCALL, and RETURN.
std::string GlassPostList::get_description () const [virtual]
Get a description of the document.
Implements Xapian::PostingIterator::Internal.
Definition at line 1041 of file glass_postlist.cc.
References description_append(), number_of_entries, Xapian::Internal::str(), and LeafPostList::term.
Xapian::docid GlassPostList::get_docid () const [inline, virtual]
Returns the current docid.
Implements Xapian::PostingIterator::Internal.
Definition at line 182 of file glass_postlist.h.
References Assert.
Xapian::termcount GlassPostList::get_doclength () const [virtual]
Returns the length of current document.
Implements Xapian::PostingIterator::Internal.
Definition at line 773 of file glass_postlist.cc.
References Assert, did, Xapian::Internal::intrusive_ptr< T >::get(), GlassDatabase::get_doclength(), have_started, LOGCALL, RETURN, and this_db.
Xapian::doccount GlassPostList::get_termfreq () const [inline, virtual]
Returns number of docs indexed by this term.
This is the length of the postlist.
Implements LeafPostList.
Definition at line 179 of file glass_postlist.h.
Xapian::termcount GlassPostList::get_unique_terms () const [virtual]
Returns the number of unique terms in the current document.
Implements Xapian::PostingIterator::Internal.
Definition at line 782 of file glass_postlist.cc.
References Assert, did, Xapian::Internal::intrusive_ptr< T >::get(), GlassDatabase::get_unique_terms(), have_started, LOGCALL, RETURN, and this_db.
Xapian::termcount GlassPostList::get_wdf () const [inline, virtual]
Returns the Within Document Frequency of the term in the current document.
Reimplemented from Xapian::PostingIterator::Internal.
Definition at line 193 of file glass_postlist.h.
References Assert.
Referenced by GlassAllDocsPostList::get_doclength().
Xapian::termcount GlassPostList::get_wdf_upper_bound () const [virtual]
Implements LeafPostList.
Definition at line 1369 of file glass_postlist.cc.
References Assert, LeafPostList::term, and wdf_upper_bound.
void GlassPostList::init () [private]
Definition at line 725 of file glass_postlist.cc.
References cursor, did, end, first_did_in_chunk, is_at_end, is_last_chunk, last_did_in_chunk, LOGLINE, GlassPostListTable::make_key(), number_of_entries, pos, read_start_of_chunk(), read_start_of_first_chunk(), read_wdf(), LeafPostList::term, wdf, and wdf_upper_bound.
Referenced by GlassPostList().
bool GlassPostList::jump_to (Xapian::docid desired_did)
Used for looking up doclens.
Definition at line 1012 of file glass_postlist.cc.
References current_chunk_contains(), did, have_started, is_at_end, LOGCALL, move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), pos, and RETURN.
bool GlassPostList::move_forward_in_chunk_to_at_least (Xapian::docid desired_did) [private]
Scan forward in the current chunk for the specified document ID.
This is particularly efficient if the desired document ID is greater than the last in the chunk - it then skips straight to the end.
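The shortcut mentioned above can be sketched on an already-decoded chunk. This is a simplified model, not the class's code: `ChunkCursor` and `move_forward_to_at_least` are hypothetical names, the chunk is assumed to be non-empty, and the real method works on the encoded bytes rather than on a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory view of one chunk (assumed non-empty).
struct ChunkCursor {
    std::vector<std::uint32_t> dids;  // docids in the chunk, ascending
    std::size_t index = 0;            // current position within the chunk
    std::uint32_t last_did() const { return dids.back(); }
};

// Returns true if the cursor now rests on a docid >= desired_did,
// false if the chunk has been exhausted.
bool move_forward_to_at_least(ChunkCursor& c, std::uint32_t desired_did) {
    if (desired_did > c.last_did()) {
        // Shortcut: nothing in this chunk can match, so jump to the end
        // without examining the remaining entries.
        c.index = c.dids.size();
        return false;
    }
    while (c.index < c.dids.size() && c.dids[c.index] < desired_did) {
        ++c.index;
    }
    return c.index < c.dids.size();
}
```

The early exit relies only on the last docid of the chunk, which is why a chunk header that records it lets whole chunks be skipped without decoding their entries.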
Definition at line 951 of file glass_postlist.cc.
References Assert, did, end, last_did_in_chunk, LOGCALL, pos, read_did_increase(), read_wdf(), RETURN, and wdf.
void GlassPostList::move_to_chunk_containing (Xapian::docid desired_did) [private]
Move to chunk containing the specified document ID.
This moves to the chunk whose starting document ID is <= desired_did, but such that the next chunk's starting document ID is > desired_did.
It is thus possible that current_chunk_contains(desired_did) will return false after this call, since the document ID might lie after the end of this chunk, but before the start of the next chunk.
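The lookup rule and the gap case can be modelled with an ordered map keyed by each chunk's first docid. This is a sketch under assumptions: `ChunkRange`, `ChunkIndex`, `find_chunk` and `chunk_contains` are hypothetical names, the index is assumed to be non-empty, and falling back to the first chunk when the docid precedes every chunk is a choice made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct ChunkRange {
    std::uint32_t first_did;
    std::uint32_t last_did;
};

// Hypothetical stand-in for the on-disk table: chunks keyed by first docid.
using ChunkIndex = std::map<std::uint32_t, ChunkRange>;

// Find the chunk with the greatest first docid <= desired_did.
// Assumes a non-empty index; falls back to the first chunk otherwise.
const ChunkRange& find_chunk(const ChunkIndex& chunks, std::uint32_t desired_did) {
    auto it = chunks.upper_bound(desired_did);  // first chunk starting after desired_did
    if (it != chunks.begin()) --it;
    return it->second;
}

// Range test only: says nothing about whether did is actually present.
bool chunk_contains(const ChunkRange& c, std::uint32_t did) {
    return did >= c.first_did && did <= c.last_did;
}
```

With chunks covering 1..40 and 50..90, looking up docid 45 selects the first chunk, yet `chunk_contains` reports false for it: the docid falls in the gap between the two chunks, which is the situation the description warns about.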
Definition at line 903 of file glass_postlist.cc.
References Assert, check_tname_in_key_lite(), cursor, did, end, first_did_in_chunk, is_at_end, is_last_chunk, last_did_in_chunk, LOGCALL_VOID, GlassPostListTable::make_key(), next_chunk(), number_of_entries, pos, read_start_of_chunk(), read_start_of_first_chunk(), read_wdf(), report_read_error(), LeafPostList::term, unpack_uint_preserving_sort(), and wdf.
PostList * GlassPostList::next (double w_min) [virtual]
Move to the next document.
Implements Xapian::PostingIterator::Internal.
Definition at line 871 of file glass_postlist.cc.
References did, have_started, is_at_end, LOGCALL, LOGLINE, next_chunk(), next_in_chunk(), RETURN, and wdf.
void GlassPostList::next_chunk () [private]
Move to the next chunk.
If there are no more chunks in this postlist, this will set is_at_end to true.
Definition at line 808 of file glass_postlist.cc.
References check_tname_in_key_lite(), cursor, did, end, first_did_in_chunk, is_at_end, is_last_chunk, last_did_in_chunk, LOGCALL_VOID, pos, read_start_of_chunk(), read_wdf(), report_read_error(), Xapian::Internal::str(), LeafPostList::term, unpack_uint_preserving_sort(), and wdf.
Referenced by GlassPostListTable::merge_changes(), move_to_chunk_containing(), and next().
bool GlassPostList::next_in_chunk () [private]
Move to the next item in the chunk, if possible.
If already at the end of the chunk, returns false.
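The way next(), next_in_chunk() and next_chunk() plausibly fit together can be sketched as a two-level iteration. This is an assumed reading of the descriptions, not the actual implementation: `ChunkedList` is a hypothetical class over in-memory chunks (each assumed non-empty), and treating the first call to `next()` as moving onto the first entry is an inference from the `have_started` flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

class ChunkedList {
    std::vector<std::vector<std::uint32_t>> chunks_;  // each chunk non-empty
    std::size_t chunk_ = 0;
    std::size_t item_ = 0;
    bool have_started_ = false;
    bool is_at_end_;

    // Move to the next item in the chunk; false if already at its end.
    bool next_in_chunk() {
        if (item_ + 1 >= chunks_[chunk_].size()) return false;
        ++item_;
        return true;
    }

    // Move to the next chunk, or flag the end if there is none.
    void next_chunk() {
        if (chunk_ + 1 >= chunks_.size()) {
            is_at_end_ = true;
            return;
        }
        ++chunk_;
        item_ = 0;
    }

public:
    explicit ChunkedList(std::vector<std::vector<std::uint32_t>> chunks)
        : chunks_(std::move(chunks)), is_at_end_(chunks_.empty()) {}

    void next() {
        if (is_at_end_) return;
        if (!have_started_) {  // first call lands on the first entry
            have_started_ = true;
            return;
        }
        if (!next_in_chunk()) next_chunk();
    }

    bool at_end() const { return is_at_end_; }
    std::uint32_t get_docid() const { return chunks_[chunk_][item_]; }
};
```

A caller would loop with `for (pl.next(); !pl.at_end(); pl.next())` and read `pl.get_docid()` inside the loop, never seeing the chunk boundaries.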
Definition at line 791 of file glass_postlist.cc.
References Assert, did, end, last_did_in_chunk, LOGCALL, pos, read_did_increase(), read_wdf(), RETURN, and wdf.
Referenced by next().
LeafPostList * GlassPostList::open_nearby_postlist (const std::string &term_) const [virtual]
Open another postlist from the same database.
Parameters
term_: The term to open a postlist for. If term_ is near to this postlist's term, then this can be a lot more efficient (and if it isn't very near, there's not much of a penalty). Using this method can make a wildcard expansion much more memory efficient.
Reimplemented from LeafPostList.
Definition at line 762 of file glass_postlist.cc.
References cursor, Xapian::Internal::intrusive_ptr< T >::get(), GlassPostList(), GlassTable::is_writable(), LOGCALL, GlassDatabase::postlist_table, RETURN, and this_db.
PositionList * GlassPostList::open_position_list () const [virtual]
Get the list of positions of the term in the current document.
Reimplemented from Xapian::PostingIterator::Internal.
Definition at line 863 of file glass_postlist.cc.
References Assert, did, Xapian::Internal::intrusive_ptr< T >::get(), LOGCALL, GlassDatabase::open_position_list(), RETURN, LeafPostList::term, and this_db.
void GlassPostList::operator= (const GlassPostList &) [private]
Assignment is not allowed.
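void GlassPostList::read_number_of_entries (const char **posptr, const char *end, Xapian::doccount *number_of_entries_ptr, Xapian::termcount *collection_freq_ptr) [static]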
Read the number of entries and the collection frequency.
Read the number of entries in the posting list.
This must only be called when *posptr is pointing to the start of the first chunk of the posting list.
Definition at line 668 of file glass_postlist.cc.
References report_read_error(), and unpack_uint().
Referenced by GlassPostListTable::get_freqs(), read_start_of_first_chunk(), and GlassAllTermsList::read_termfreq().
PositionList * GlassPostList::read_position_list () [virtual]
Get the list of positions of the term in the current document.
Reimplemented from Xapian::PostingIterator::Internal.
Definition at line 854 of file glass_postlist.cc.
References Assert, did, Xapian::Internal::intrusive_ptr< T >::get(), LOGCALL, positionlist, GlassDatabase::read_position_list(), RETURN, LeafPostList::term, and this_db.
PostList * GlassPostList::skip_to (Xapian::docid desired_did, double w_min) [virtual]
Skip to next document with docid >= desired_did.
Implements Xapian::PostingIterator::Internal.
Definition at line 977 of file glass_postlist.cc.
References Assert, current_chunk_contains(), did, have_started, is_at_end, LOGCALL, LOGLINE, move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), RETURN, and wdf.
AutoPtr< GlassCursor > GlassPostList::cursor [private]
Cursor pointing to current chunk of postlist.
Definition at line 75 of file glass_postlist.h.
Referenced by GlassPostListTable::get_chunk(), init(), GlassPostListTable::merge_changes(), move_to_chunk_containing(), next_chunk(), and open_nearby_postlist().
Xapian::docid GlassPostList::did [private]
Document id we're currently at.
Definition at line 90 of file glass_postlist.h.
Referenced by get_doclength(), GlassDatabase::get_postlist_cursor(), get_unique_terms(), init(), jump_to(), GlassPostListTable::merge_changes(), GlassPostListTable::merge_doclen_changes(), move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), next(), next_chunk(), next_in_chunk(), open_position_list(), read_position_list(), and skip_to().
const char * GlassPostList::end [private]
Pointer to byte after end of current chunk.
Definition at line 87 of file glass_postlist.h.
Referenced by GlassPostListTable::get_chunk(), init(), GlassPostListTable::merge_changes(), move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), next_chunk(), and next_in_chunk().
Xapian::docid GlassPostList::first_did_in_chunk [private]
The first document id in this chunk.
Definition at line 78 of file glass_postlist.h.
Referenced by current_chunk_contains(), GlassPostListTable::get_chunk(), init(), move_to_chunk_containing(), and next_chunk().
bool GlassPostList::have_started [private]
Whether we've started reading the list yet.
Definition at line 66 of file glass_postlist.h.
Referenced by get_doclength(), get_unique_terms(), jump_to(), next(), and skip_to().
bool GlassPostList::is_at_end [private]
Whether we've run off the end of the list yet.
Definition at line 72 of file glass_postlist.h.
Referenced by init(), jump_to(), move_to_chunk_containing(), next(), next_chunk(), and skip_to().
bool GlassPostList::is_last_chunk [private]
True if this is the last chunk.
Definition at line 69 of file glass_postlist.h.
Referenced by GlassPostListTable::get_chunk(), init(), move_to_chunk_containing(), and next_chunk().
Xapian::docid GlassPostList::last_did_in_chunk [private]
The last document id in this chunk.
Definition at line 81 of file glass_postlist.h.
Referenced by current_chunk_contains(), GlassPostListTable::get_chunk(), init(), move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), next_chunk(), and next_in_chunk().
Xapian::doccount GlassPostList::number_of_entries [private]
The number of entries in the posting list.
Definition at line 96 of file glass_postlist.h.
Referenced by get_description(), init(), and move_to_chunk_containing().
const char * GlassPostList::pos [private]
Position of iteration through current chunk.
Definition at line 84 of file glass_postlist.h.
Referenced by GlassPostListTable::get_chunk(), init(), jump_to(), GlassPostListTable::merge_changes(), move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), next_chunk(), and next_in_chunk().
GlassPositionList GlassPostList::positionlist [private]
The position list object for this posting list.
Definition at line 63 of file glass_postlist.h.
Referenced by read_position_list().
Xapian::Internal::intrusive_ptr< const GlassDatabase > GlassPostList::this_db [private]
The database we are searching.
This pointer is held so that the database doesn't get deleted before us, and also to give us access to the position_table.
Definition at line 60 of file glass_postlist.h.
Referenced by get_doclength(), get_unique_terms(), open_nearby_postlist(), open_position_list(), and read_position_list().
Xapian::termcount GlassPostList::wdf [private]
The wdf of the current document.
Definition at line 93 of file glass_postlist.h.
Referenced by init(), move_forward_in_chunk_to_at_least(), move_to_chunk_containing(), next(), next_chunk(), next_in_chunk(), and skip_to().
Xapian::termcount GlassPostList::wdf_upper_bound [private]
Upper bound on wdf for this postlist.
Definition at line 99 of file glass_postlist.h.
Referenced by get_wdf_upper_bound(), and init().