xapian-core  2.0.0
Xapian::Document::Internal Class Reference

Abstract base class for a document.

#include <documentinternal.h>


Public Types

enum  remove_posting_result { OK , NO_TERM , NO_POS }
 

Public Member Functions

 Internal ()
 Construct an empty document.

virtual ~Internal ()
 We have virtual methods and want to be able to delete derived classes using a pointer to the base class, so we need a virtual destructor.

bool data_modified () const
 Return true if the document data might have been modified.

bool terms_modified () const
 Return true if the document's terms might have been modified.

bool values_modified () const
 Return true if the document's values might have been modified.

bool modified () const
 Return true if the document might have been modified in any way.

bool positions_modified () const
 Return true if the document's term positions might have been modified.

Xapian::docid get_docid () const
 Get the document ID this document came from.

Xapian::doccount get_index () const
 Internal method used by MSet::diversify().

void set_index (Xapian::doccount new_index)
 Internal method used by MSet::diversify().

std::string get_data () const
 Get the document data.

void set_data (std::string_view data_)
 Set the document data.

void add_term (std::string_view term, Xapian::termcount wdf_inc)
 Add a term to this document.

bool remove_term (std::string_view term)
 Remove a term from this document.

void add_posting (std::string_view term, Xapian::termpos term_pos, Xapian::termcount wdf_inc)
 Add a posting for a term.

remove_posting_result remove_posting (std::string_view term, Xapian::termpos term_pos, Xapian::termcount wdf_dec)
 Remove a posting for a term.

remove_posting_result remove_postings (std::string_view term, Xapian::termpos term_pos_first, Xapian::termpos term_pos_last, Xapian::termcount wdf_dec, Xapian::termpos &n_removed)
 Remove a range of postings for a term.

void clear_terms ()
 Clear all terms from the document.

Xapian::termcount termlist_count () const
 Return the number of distinct terms in this document.

TermList * open_term_list () const
 Start iterating the terms in this document.

std::string get_value (Xapian::valueno slot) const
 Read a value slot in this document.

void add_value (Xapian::valueno slot, std::string_view value)
 Add a value to a slot in this document.

void clear_values ()
 Clear all value slots in this document.

Xapian::valueno values_count () const
 Count the value slots used in this document.

Xapian::ValueIterator values_begin () const

std::string get_description () const
 Return a string describing this object.

Public Member Functions inherited from Xapian::Internal::intrusive_base
 intrusive_base ()
 Construct with no references.
 

Protected Member Functions

 Internal (Xapian::Internal::intrusive_ptr< const Xapian::Database::Internal > database_, Xapian::docid did_)
 Constructor used by subclasses.

 Internal (const Xapian::Database::Internal *database_, Xapian::docid did_, std::string &&data_, std::map< Xapian::valueno, std::string > &&values_)
 Constructor used by RemoteDocument subclass.

virtual std::string fetch_data () const
 Fetch the document data from the database.

virtual void fetch_all_values (std::map< Xapian::valueno, std::string > &values_) const
 Fetch all set values from the database.

virtual std::string fetch_value (Xapian::valueno slot) const
 Fetch a single value from the database.
 

Protected Attributes

std::unique_ptr< std::map< Xapian::valueno, std::string > > values
 Document value slots and their contents.

Xapian::Internal::intrusive_ptr< const Xapian::Database::Internal > database
 Database this document came from.

Xapian::docid did
 The document ID this document came from in database.
 

Private Member Functions

void operator= (const Internal &)=delete
 Don't allow assignment.

 Internal (const Internal &)=delete
 Don't allow copying.

void ensure_terms_fetched () const
 Ensure terms have been fetched from database.

void ensure_values_fetched () const
 Ensure values have been fetched from database.
 

Private Attributes

std::unique_ptr< std::string > data
 The document data.

std::unique_ptr< std::map< std::string, TermInfo, std::less<> > > terms
 Terms in the document and their associated metadata.

Xapian::termcount termlist_size
 The number of distinct terms in terms.

Xapian::doccount index: 31
 An index value, unused by Document itself.

bool positions_modified_: 1
 Are there any changes to term positions in terms?
 

Friends

class ::DocumentTermList
 
class ::DocumentValueList
 
class ::GlassValueManager
 
class ::HoneyValueManager
 
class ::ValueStreamDocument
 

Additional Inherited Members

Public Attributes inherited from Xapian::Internal::intrusive_base
unsigned _refs
 Reference count.
 

Detailed Description

Abstract base class for a document.

Definition at line 49 of file documentinternal.h.

Member Enumeration Documentation

◆ remove_posting_result

Enumerator
OK 
NO_TERM 
NO_POS 

Definition at line 322 of file documentinternal.h.
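
This page does not say what each enumerator means. The following is a minimal sketch of how a caller might branch on the result, assuming (from the names alone) that NO_TERM means the term is not in the document and NO_POS means the term is present but not at the requested position. The enum is a local stand-in and describe_result() is a hypothetical helper, not part of Xapian.

```cpp
#include <cassert>
#include <string>

// Local stand-in mirroring the enumerators listed above; not the real class.
enum remove_posting_result { OK, NO_TERM, NO_POS };

// Hypothetical helper: map a result to a message. The meanings given for
// NO_TERM and NO_POS are assumptions based only on the enumerator names.
inline std::string describe_result(remove_posting_result r) {
    switch (r) {
        case OK:      return "posting removed";
        case NO_TERM: return "term not present in document";
        case NO_POS:  return "term present, but not at that position";
    }
    return "unknown result";
}
```

A switch without a default label lets the compiler warn if an enumerator is later added and not handled.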

Constructor & Destructor Documentation

◆ Internal() [1/4]

Xapian::Document::Internal::Internal ( const Internal & )
private delete

Don't allow copying.

◆ Internal() [2/4]

Xapian::Document::Internal::Internal ( Xapian::Internal::intrusive_ptr< const Xapian::Database::Internal >  database_,
Xapian::docid  did_ 
)
inline protected

Constructor used by subclasses.

Definition at line 158 of file documentinternal.h.

◆ Internal() [3/4]

Xapian::Document::Internal::Internal ( const Xapian::Database::Internal *  database_,
Xapian::docid  did_,
std::string &&  data_,
std::map< Xapian::valueno, std::string > &&  values_ 
)
inline protected

Constructor used by RemoteDocument subclass.

Definition at line 163 of file documentinternal.h.

◆ Internal() [4/4]

Xapian::Document::Internal::Internal ( )
inline

Construct an empty document.

Definition at line 197 of file documentinternal.h.

◆ ~Internal()

Xapian::Document::Internal::~Internal ( )
virtual

We have virtual methods and want to be able to delete derived classes using a pointer to the base class, so we need a virtual destructor.

Definition at line 96 of file documentinternal.cc.
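
The reason for the virtual destructor can be illustrated with a small standalone sketch. BaseSketch and DerivedSketch are hypothetical names, not Xapian classes; the point is only that destroying a derived object through a base pointer runs the derived destructor when the base destructor is virtual.

```cpp
#include <cassert>
#include <memory>

struct BaseSketch {
    virtual ~BaseSketch() = default;  // virtual: safe to delete via base pointer
    virtual int kind() const { return 0; }
};

struct DerivedSketch : BaseSketch {
    explicit DerivedSketch(bool* flag) : destroyed(flag) {}
    ~DerivedSketch() override { *destroyed = true; }
    int kind() const override { return 1; }
    bool* destroyed;  // set to true when this destructor runs
};

// Destroys a DerivedSketch through a BaseSketch pointer and reports whether
// the derived destructor ran.
inline bool derived_destructor_runs() {
    bool destroyed = false;
    {
        std::unique_ptr<BaseSketch> p = std::make_unique<DerivedSketch>(&destroyed);
        assert(p->kind() == 1);
    }  // p goes out of scope here and deletes through the base pointer
    return destroyed;
}
```

Without the virtual destructor, deleting through the base pointer would be undefined behaviour.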

Member Function Documentation

◆ add_posting()

void Xapian::Document::Internal::add_posting ( std::string_view  term,
Xapian::termpos  term_pos,
Xapian::termcount  wdf_inc 
)
inline

Add a posting for a term.

Definition at line 306 of file documentinternal.h.

References ensure_terms_fetched(), positions_modified_, term, termlist_size, and terms.

◆ add_term()

void Xapian::Document::Internal::add_term ( std::string_view  term,
Xapian::termcount  wdf_inc 
)
inline

Add a term to this document.

Definition at line 274 of file documentinternal.h.

References ensure_terms_fetched(), term, termlist_size, and terms.

◆ add_value()

void Xapian::Document::Internal::add_value ( Xapian::valueno  slot,
std::string_view  value 
)
inline

Add a value to a slot in this document.

Definition at line 428 of file documentinternal.h.

References ensure_values_fetched(), and values.

◆ clear_terms()

void Xapian::Document::Internal::clear_terms ( )
inline

Clear all terms from the document.

Definition at line 376 of file documentinternal.h.

References database, Xapian::Database::Internal::has_positions(), positions_modified_, termlist_size, and terms.

◆ clear_values()

void Xapian::Document::Internal::clear_values ( )
inline

Clear all value slots in this document.

Definition at line 441 of file documentinternal.h.

References database, and values.

◆ data_modified()

bool Xapian::Document::Internal::data_modified ( ) const
inline

Return true if the document data might have been modified.

If the document is from a database, this means modifications compared to the version read, otherwise it means modifications compared to an empty database.

Definition at line 210 of file documentinternal.h.

References data.

Referenced by modified().
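
One way such a check can work is to treat a null cached pointer as "untouched". The sketch below assumes that approach, which the reference to data suggests but this page does not confirm. LazyDataSketch is a hypothetical class, not the Xapian implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>

class LazyDataSketch {
    // Null means the data has been neither fetched nor set.
    std::unique_ptr<std::string> data;

  public:
    // "Might have been modified": true once anything has been stored, even
    // if the stored content happens to equal the original.
    bool data_modified() const { return data != nullptr; }

    void set_data(std::string_view d) {
        data = std::make_unique<std::string>(d);
    }

    // With no backing store in this sketch, unset data reads as empty.
    std::string get_data() const { return data ? *data : std::string(); }
};
```

This makes the check a single pointer test, at the cost of occasionally reporting a modification that changed nothing.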

◆ ensure_terms_fetched()

void Xapian::Document::Internal::ensure_terms_fetched ( ) const
private

Ensure terms have been fetched from database.

After this call, terms will be non-NULL. If database is NULL, terms will be initialised to an empty map if it was NULL.

Definition at line 39 of file documentinternal.cc.

References Xapian::PositionIterator::Internal::get_position(), Xapian::PositionIterator::Internal::next(), p, and term.

Referenced by add_posting(), add_term(), remove_posting(), remove_postings(), and remove_term().

◆ ensure_values_fetched()

void Xapian::Document::Internal::ensure_values_fetched ( ) const
private

Ensure values have been fetched from database.

After this call, values will be non-NULL. If database is NULL, values will be initialised to an empty map if it was NULL.

Definition at line 66 of file documentinternal.cc.

Referenced by add_value(), and values_count().
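
The lazy-fetch contract described above can be sketched as follows. All class names are hypothetical, unsigned stands in for Xapian::valueno, and the has_database flag stands in for a non-NULL database pointer. This is an illustration of the pattern, not the code in documentinternal.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <string_view>

class LazyValuesSketch {
  protected:
    // Null until first access; non-null after ensure_values_fetched().
    mutable std::unique_ptr<std::map<unsigned, std::string>> values;
    bool has_database;

    // Default used when there is no backing store: leave the map empty.
    virtual void fetch_all_values(std::map<unsigned, std::string>& v) const {
        v.clear();
    }

    void ensure_values_fetched() const {
        if (values) return;  // already fetched or set
        values = std::make_unique<std::map<unsigned, std::string>>();
        if (has_database) fetch_all_values(*values);
    }

  public:
    explicit LazyValuesSketch(bool has_db = false) : has_database(has_db) {}
    virtual ~LazyValuesSketch() = default;

    void add_value(unsigned slot, std::string_view value) {
        ensure_values_fetched();
        (*values)[slot] = std::string(value);
    }

    std::size_t values_count() const {
        ensure_values_fetched();
        assert(values);
        return values->size();
    }
};

// Hypothetical backend that supplies one stored value in slot 0.
class StoredValuesSketch : public LazyValuesSketch {
  public:
    StoredValuesSketch() : LazyValuesSketch(true) {}

  protected:
    void fetch_all_values(std::map<unsigned, std::string>& v) const override {
        v.clear();
        v[0] = "stored";
    }
};
```

Because the map is filled on first use, a document that is only read for its data never pays for loading its values.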

◆ fetch_all_values()

void Xapian::Document::Internal::fetch_all_values ( std::map< Xapian::valueno, std::string > &  values_) const
protected virtual

Fetch all set values from the database.

The default implementation (used when there's no associated database) clears values_.

Reimplemented in ValueStreamDocument, RemoteDocument, InMemoryDocument, HoneyDocument, and GlassDocument.

Definition at line 84 of file documentinternal.cc.

◆ fetch_data()

string Xapian::Document::Internal::fetch_data ( ) const
protected virtual

Fetch the document data from the database.

The default implementation (used when there's no associated database) returns an empty string.

Reimplemented in ValueStreamDocument, RemoteDocument, InMemoryDocument, HoneyDocument, and GlassDocument.

Definition at line 78 of file documentinternal.cc.

Referenced by get_data().

◆ fetch_value()

string Xapian::Document::Internal::fetch_value ( Xapian::valueno  slot) const
protected virtual

Fetch a single value from the database.

The default implementation (used when there's no associated database) returns an empty string.

Reimplemented in ValueStreamDocument, RemoteDocument, InMemoryDocument, HoneyDocument, and GlassDocument.

Definition at line 91 of file documentinternal.cc.

Referenced by get_value().
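
Since get_value() references both values and fetch_value(), one plausible arrangement is to consult the cached map when it exists and fall back to the virtual hook otherwise. The sketch below assumes that arrangement. The class names and the load() helper are hypothetical, and unsigned stands in for Xapian::valueno.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

class ValueReaderSketch {
  protected:
    // Null means the full set of values has not been loaded.
    std::unique_ptr<std::map<unsigned, std::string>> values;

    // Default with no associated database: every slot reads as empty.
    virtual std::string fetch_value(unsigned /*slot*/) const {
        return std::string();
    }

  public:
    virtual ~ValueReaderSketch() = default;

    // Hypothetical helper that installs a cached copy of all values.
    void load(std::map<unsigned, std::string> v) {
        values = std::make_unique<std::map<unsigned, std::string>>(std::move(v));
    }

    std::string get_value(unsigned slot) const {
        if (values) {
            auto it = values->find(slot);
            return it != values->end() ? it->second : std::string();
        }
        return fetch_value(slot);  // single-slot fetch, no full load
    }
};

// Hypothetical backend that answers single-slot requests.
class BackendReaderSketch : public ValueReaderSketch {
  protected:
    std::string fetch_value(unsigned slot) const override {
        return "slot" + std::to_string(slot);
    }
};
```

Fetching one slot on demand avoids loading every value when a caller only needs one.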

◆ get_data()

std::string Xapian::Document::Internal::get_data ( ) const
inline

Get the document data.

Definition at line 262 of file documentinternal.h.

References data, and fetch_data().

◆ get_description()

string Xapian::Document::Internal::get_description ( ) const

Return a string describing this object.

Definition at line 129 of file documentinternal.cc.

References description_append(), and Xapian::Internal::str().

◆ get_docid()

Xapian::docid Xapian::Document::Internal::get_docid ( ) const
inline

Get the document ID this document came from.

If this document didn't come from a database, this will be 0.

Note that this is the docid in the sub-database when multiple databases are being searched.

Definition at line 253 of file documentinternal.h.

References did.

◆ get_index()

Xapian::doccount Xapian::Document::Internal::get_index ( ) const
inline

Internal method used by MSet::diversify().

Definition at line 256 of file documentinternal.h.

References index.

◆ get_value()

std::string Xapian::Document::Internal::get_value ( Xapian::valueno  slot) const
inline

Read a value slot in this document.

Returns
The value in slot slot, or an empty string if not set.

Definition at line 416 of file documentinternal.h.

References fetch_value(), and values.

Referenced by Collapser::check().

◆ modified()

bool Xapian::Document::Internal::modified ( ) const
inline

Return true if the document might have been modified in any way.

If the document is from a database, this means modifications compared to the version read, otherwise it means modifications compared to an empty database.

Definition at line 234 of file documentinternal.h.

References data_modified(), terms_modified(), and values_modified().

◆ open_term_list()

TermList * Xapian::Document::Internal::open_term_list ( ) const

Start iterating the terms in this document.

Returns
A new TermList object (caller takes ownership) or NULL if there are no terms.

Definition at line 103 of file documentinternal.cc.

Referenced by Xapian::Document::termlist_begin().
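
Because the caller takes ownership and the result may be NULL, a caller would typically wrap the pointer immediately and test it before use. The sketch below shows that pattern with a stand-in type. TermListSketch, open_term_list_sketch() and count_terms() are hypothetical, and the real TermList interface is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for TermList with just enough interface to iterate.
class TermListSketch {
    std::vector<std::string> terms;
    std::size_t pos = 0;

  public:
    explicit TermListSketch(std::vector<std::string> t) : terms(std::move(t)) {}
    bool at_end() const { return pos == terms.size(); }
    void next() { ++pos; }
};

// Returns a heap-allocated list the caller must delete, or nullptr if there
// are no terms.
inline TermListSketch* open_term_list_sketch(const std::vector<std::string>& terms) {
    if (terms.empty()) return nullptr;
    return new TermListSketch(terms);
}

inline std::size_t count_terms(const std::vector<std::string>& terms) {
    // Take ownership straight away so the list is freed on every path.
    std::unique_ptr<TermListSketch> tl(open_term_list_sketch(terms));
    if (!tl) return 0;  // NULL means "no terms", not an error
    std::size_t n = 0;
    for (; !tl->at_end(); tl->next()) ++n;
    return n;
}
```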

◆ operator=()

void Xapian::Document::Internal::operator= ( const Internal & )
private delete

Don't allow assignment.

◆ positions_modified()

bool Xapian::Document::Internal::positions_modified ( ) const
inline

Return true if the document's term positions might have been modified.

If the document is from a database, this means modifications compared to the version read, otherwise it means modifications compared to an empty database.

Definition at line 244 of file documentinternal.h.

References positions_modified_.

◆ remove_posting()

remove_posting_result Xapian::Document::Internal::remove_posting ( std::string_view  term,
Xapian::termpos  term_pos,
Xapian::termcount  wdf_dec 
)
inline

Remove a posting for a term.

Definition at line 326 of file documentinternal.h.

References ensure_terms_fetched(), positions_modified_, term, termlist_size, and terms.

◆ remove_postings()

remove_posting_result Xapian::Document::Internal::remove_postings ( std::string_view  term,
Xapian::termpos  term_pos_first,
Xapian::termpos  term_pos_last,
Xapian::termcount  wdf_dec,
Xapian::termpos &  n_removed 
)
inline

Remove a range of postings for a term.

Can only return OK or NO_TERM.

Definition at line 349 of file documentinternal.h.

References ensure_terms_fetched(), mul_overflows(), positions_modified_, term, termlist_size, and terms.
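
A range removal with these parameters might be sketched as below. Two points are assumptions not stated on this page: that the wdf is reduced by wdf_dec for each position actually removed (suggested by the reference to mul_overflows()), and that the wdf is clamped at zero. TermSketch and remove_postings_sketch() are hypothetical names, and std::uint32_t stands in for the Xapian integer types.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <string_view>

enum remove_posting_result { OK, NO_TERM, NO_POS };

struct TermSketch {
    std::uint32_t wdf = 0;
    std::set<std::uint32_t> positions;
};

using TermsSketch = std::map<std::string, TermSketch, std::less<>>;

inline remove_posting_result remove_postings_sketch(
        TermsSketch& terms, std::string_view term,
        std::uint32_t first, std::uint32_t last,
        std::uint32_t wdf_dec, std::uint32_t& n_removed) {
    n_removed = 0;
    auto it = terms.find(term);
    if (it == terms.end()) return NO_TERM;
    if (first > last) return OK;  // empty range: nothing to remove

    auto& pos = it->second.positions;
    auto lo = pos.lower_bound(first);  // first position >= first
    auto hi = pos.upper_bound(last);   // first position > last
    n_removed = static_cast<std::uint32_t>(std::distance(lo, hi));
    pos.erase(lo, hi);

    // Multiply in 64 bits so the product cannot overflow, then clamp at 0.
    std::uint64_t dec = static_cast<std::uint64_t>(wdf_dec) * n_removed;
    std::uint32_t& wdf = it->second.wdf;
    wdf = dec >= wdf ? 0 : static_cast<std::uint32_t>(wdf - dec);
    return OK;  // an empty match is still OK; NO_POS is never returned
}
```

Reporting the count through n_removed lets the caller tell "range matched nothing" apart from "term missing" without a third result code.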

◆ remove_term()

bool Xapian::Document::Internal::remove_term ( std::string_view  term)
inline

Remove a term from this document.

Definition at line 288 of file documentinternal.h.

References ensure_terms_fetched(), positions_modified_, term, termlist_size, and terms.

Referenced by Xapian::Document::remove_term().

◆ set_data()

void Xapian::Document::Internal::set_data ( std::string_view  data_)
inline

Set the document data.

Definition at line 269 of file documentinternal.h.

References data.

◆ set_index()

void Xapian::Document::Internal::set_index ( Xapian::doccount  new_index)
inline

Internal method used by MSet::diversify().

Definition at line 259 of file documentinternal.h.

References index.

◆ termlist_count()

Xapian::termcount Xapian::Document::Internal::termlist_count ( ) const
inline

Return the number of distinct terms in this document.

Definition at line 393 of file documentinternal.h.

References database, did, Xapian::Database::Internal::open_term_list(), termlist_size, and terms.

◆ terms_modified()

bool Xapian::Document::Internal::terms_modified ( ) const
inline

Return true if the document's terms might have been modified.

If the document is from a database, this means modifications compared to the version read, otherwise it means modifications compared to an empty database.

Definition at line 218 of file documentinternal.h.

References terms.

Referenced by modified().

◆ values_begin()

Xapian::ValueIterator Xapian::Document::Internal::values_begin ( ) const

Definition at line 115 of file documentinternal.cc.

◆ values_count()

Xapian::valueno Xapian::Document::Internal::values_count ( ) const
inline

Count the value slots used in this document.

Definition at line 455 of file documentinternal.h.

References ensure_values_fetched(), and values.

◆ values_modified()

bool Xapian::Document::Internal::values_modified ( ) const
inline

Return true if the document's values might have been modified.

If the document is from a database, this means modifications compared to the version read, otherwise it means modifications compared to an empty database.

Definition at line 226 of file documentinternal.h.

References values.

Referenced by modified().

Friends And Related Function Documentation

◆ ::DocumentTermList

friend class ::DocumentTermList
friend

Definition at line 50 of file documentinternal.h.

◆ ::DocumentValueList

friend class ::DocumentValueList
friend

Definition at line 51 of file documentinternal.h.

◆ ::GlassValueManager

friend class ::GlassValueManager
friend

Definition at line 53 of file documentinternal.h.

◆ ::HoneyValueManager

friend class ::HoneyValueManager
friend

Definition at line 54 of file documentinternal.h.

◆ ::ValueStreamDocument

friend class ::ValueStreamDocument
friend

Definition at line 55 of file documentinternal.h.

Member Data Documentation

◆ data

std::unique_ptr<std::string> Xapian::Document::Internal::data
private

The document data.

If NULL, this hasn't been fetched or set yet.

Definition at line 67 of file documentinternal.h.

Referenced by data_modified(), get_data(), and set_data().

◆ database

Xapian::Internal::intrusive_ptr<const Xapian::Database::Internal> Xapian::Document::Internal::database
protected

Database this document came from.

If this document didn't come from a database, this will be NULL.

Definition at line 146 of file documentinternal.h.

Referenced by clear_terms(), clear_values(), and termlist_count().

◆ did

Xapian::docid Xapian::Document::Internal::did
protected

The document ID this document came from in database.

If this document didn't come from a database, this will be 0.

Note that this is the docid in the sub-database when multiple databases are being searched.

Definition at line 155 of file documentinternal.h.

Referenced by get_docid(), ValueStreamDocument::set_shard_document(), and termlist_count().

◆ index

Xapian::doccount Xapian::Document::Internal::index
private

An index value, unused by Document itself.

This is used by the diversification code.

It is in a bit field with a bool flag so that it doesn't incur any additional space cost for cases where it isn't used.

The bool flag is stored in the top bit, which is likely to be very cheap to check (since it's the sign bit for a signed integer value).

We initialise this in the constructors to avoid valgrind warning that positions_modified_ is used uninitialised. Valgrind is meant to track undefined-ness at the bit level, so this shouldn't be needed. FIXME: Investigate!

Definition at line 103 of file documentinternal.h.

Referenced by get_index(), and set_index().
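
The packing described above can be sketched with a 31-bit field next to a 1-bit bool. PackedSketch is a hypothetical struct; the exact layout of bit fields (including which bit holds the flag, and whether fields of different types share a storage unit) is implementation-defined, so the sketch only relies on the two fields being independent.

```cpp
#include <cassert>
#include <cstdint>

struct PackedSketch {
    std::uint32_t index : 31;  // values 0 .. 2^31 - 1
    bool flag : 1;             // may share the same 32-bit unit as index

    // Initialise both fields explicitly, as the text above recommends.
    PackedSketch() : index(0), flag(false) {}
};

// Sets both fields and checks that changing one leaves the other intact.
inline bool fields_are_independent() {
    PackedSketch p;
    p.index = 0x7fffffff;  // largest value a 31-bit field can hold
    p.flag = true;
    bool ok = p.index == 0x7fffffff && p.flag;
    p.flag = false;
    return ok && p.index == 0x7fffffff && !p.flag;
}
```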

◆ positions_modified_

bool Xapian::Document::Internal::positions_modified_
mutable private

Are there any changes to term positions in terms?

If a document is read from a database, modified and then replaced at the same docid, then we can save a lot of work if we know when there are no changes to term positions, even if there are changes to terms (a common example is adding filter terms to an existing document).

It's OK for this to be true when there aren't any modifications (it just means that the backend can't shortcut as directly).

Definition at line 115 of file documentinternal.h.

Referenced by add_posting(), clear_terms(), positions_modified(), remove_posting(), remove_postings(), and remove_term().

◆ termlist_size

Xapian::termcount Xapian::Document::Internal::termlist_size
mutable private

The number of distinct terms in terms.

Only valid when terms is non-NULL.

This may be less than terms.size() if any terms have been deleted.

Definition at line 86 of file documentinternal.h.

Referenced by add_posting(), add_term(), clear_terms(), remove_posting(), remove_postings(), remove_term(), and termlist_count().
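
A count that can be smaller than the map's size suggests that removed terms stay in the map, marked as deleted. The sketch below assumes that design. The deleted flag, TermInfoSketch and TermStoreSketch are hypothetical, and this page does not describe how TermInfo records deletion.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

struct TermInfoSketch {
    unsigned wdf = 0;
    bool deleted = false;  // assumed marker for a removed term
};

class TermStoreSketch {
    std::map<std::string, TermInfoSketch, std::less<>> terms;
    unsigned termlist_size = 0;  // live (not deleted) entries only

  public:
    void add_term(std::string_view term, unsigned wdf_inc) {
        auto it = terms.find(term);
        if (it == terms.end()) {
            terms[std::string(term)].wdf = wdf_inc;
            ++termlist_size;
        } else if (it->second.deleted) {
            it->second.deleted = false;  // revive the existing entry
            it->second.wdf = wdf_inc;
            ++termlist_size;
        } else {
            it->second.wdf += wdf_inc;
        }
    }

    bool remove_term(std::string_view term) {
        auto it = terms.find(term);
        if (it == terms.end() || it->second.deleted) return false;
        it->second.deleted = true;  // keep the entry, drop it from the count
        --termlist_size;
        return true;
    }

    unsigned termlist_count() const { return termlist_size; }
    std::size_t stored_entries() const { return terms.size(); }
};
```

Keeping the entry records that the term used to exist, which a backend could use when writing changes back.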

◆ terms

std::unique_ptr<std::map<std::string, TermInfo, std::less<> > > Xapian::Document::Internal::terms
mutable private

Terms in the document and their associated metadata.

If NULL, the terms haven't been fetched or set yet.

We use std::map<> rather than std::unordered_map<> because the latter invalidates existing iterators upon insert() if rehashing occurs, whereas existing iterators remain valid for std::map<>.

Definition at line 78 of file documentinternal.h.

Referenced by add_posting(), add_term(), clear_terms(), remove_posting(), remove_postings(), remove_term(), termlist_count(), and terms_modified().
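
The std::less<> comparator in the type above is a transparent comparator, which lets the map be searched with a std::string_view without first building a std::string. A minimal sketch, using int in place of TermInfo and a hypothetical has_term() helper:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <string_view>

// Same key and comparator as the terms map; int stands in for TermInfo.
using TermMapSketch = std::map<std::string, int, std::less<>>;

// Looks up a term by string_view. With std::less<> this calls the
// heterogeneous overload of find(), so no temporary std::string is created.
inline bool has_term(const TermMapSketch& m, std::string_view term) {
    return m.find(term) != m.end();
}
```

With the default std::less<std::string> comparator, the same call would not compile, because std::string_view does not convert implicitly to std::string.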

◆ values

std::unique_ptr<std::map<Xapian::valueno, std::string> > Xapian::Document::Internal::values
mutable protected

Document value slots and their contents.

If NULL, the values haven't been fetched or set yet.

We use std::map<> rather than std::unordered_map<> because the latter invalidates existing iterators upon insert() if rehashing occurs, whereas existing iterators remain valid for std::map<>.

Definition at line 140 of file documentinternal.h.

Referenced by add_value(), clear_values(), get_value(), values_count(), and values_modified().
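
The iterator guarantee mentioned above can be demonstrated with a small sketch. The function name is hypothetical, and unsigned stands in for Xapian::valueno.

```cpp
#include <cassert>
#include <map>
#include <string>

// Takes an iterator into a std::map, inserts many more elements, then checks
// the iterator still refers to the same element.
inline bool map_iterator_survives_inserts() {
    std::map<unsigned, std::string> values;
    values[5] = "five";
    auto it = values.find(5);

    // std::map::insert never invalidates existing iterators. With
    // std::unordered_map, an insert that triggers rehashing would.
    for (unsigned slot = 100; slot < 1100; ++slot) {
        values[slot] = "x";
    }
    return it->first == 5 && it->second == "five";
}
```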


The documentation for this class was generated from the following files: