xapian-core  1.4.26
Public Member Functions | Static Public Member Functions | Protected Member Functions | Static Protected Member Functions | Protected Attributes | Private Member Functions | Friends | List of all members
ChertTable Class Reference

Class managing a Btree table in a Chert database. More...

#include <chert_table.h>

+ Inheritance diagram for ChertTable:
+ Collaboration diagram for ChertTable:

Public Member Functions

 ChertTable (const char *tablename_, const std::string &path_, bool readonly_, int compress_strategy_=DONT_COMPRESS, bool lazy=false)
 Create a new Btree object. More...
 
 ~ChertTable ()
 Close the Btree. More...
 
void close (bool permanent=false)
 Close the Btree. More...
 
bool readahead_key (const string &key) const
 
bool exists () const
 Determine whether the btree exists on disk. More...
 
void open ()
 Open the btree at the latest revision. More...
 
bool open (chert_revision_number_t revision_)
 Open the btree at a given revision. More...
 
bool is_open () const
 Return true if this table is open. More...
 
void flush_db ()
 Flush any outstanding changes to the DB file of the table. More...
 
void commit (chert_revision_number_t revision, int changes_fd=-1, const std::string *changes_tail=NULL)
 Commit any outstanding changes to the table. More...
 
void write_changed_blocks (int changes_fd)
 Append the list of blocks changed to a changeset file. More...
 
void cancel ()
 Cancel any outstanding changes. More...
 
bool get_exact_entry (const std::string &key, std::string &tag) const
 Read an entry from the table, if and only if it is exactly that being asked for. More...
 
bool key_exists (const std::string &key) const
 Check if a key exists in the Btree. More...
 
bool read_tag (Cursor *C_, std::string *tag, bool keep_compressed) const
 Read the tag value for the key pointed to by cursor C_. More...
 
void add (const std::string &key, std::string tag, bool already_compressed=false)
 Add a key/tag pair to the table, replacing any existing pair with the same key. More...
 
bool del (const std::string &key)
 Delete an entry from the table. More...
 
void erase ()
 Erase this table from disk. More...
 
void set_block_size (unsigned int block_size_)
 Set the block size. More...
 
unsigned int get_block_size () const
 Get the block size. More...
 
void create_and_open (unsigned int blocksize)
 Create a new empty btree structure on disk and open it at the initial revision. More...
 
void set_full_compaction (bool parity)
 
chert_revision_number_t get_latest_revision_number () const
 Get the latest revision number stored in this table. More...
 
chert_revision_number_t get_open_revision_number () const
 Get the revision number at which this table is currently open. More...
 
chert_tablesize_t get_entry_count () const
 Return a count of the number of entries in the table. More...
 
bool empty () const
 Return true if there are no entries in the table. More...
 
ChertCursorcursor_get () const
 Get a cursor for reading from the table. More...
 
bool is_modified () const
 Determine whether the object contains uncommitted modifications. More...
 
void set_max_item_size (size_t block_capacity)
 Set the maximum item size given the block capacity. More...
 
string get_path () const
 

Static Public Member Functions

static void throw_database_closed ()
 Throw an exception indicating that the database is closed. More...
 

Protected Member Functions

bool do_open_to_read (bool revision_supplied, chert_revision_number_t revision_)
 Perform the opening operation to read. More...
 
bool do_open_to_write (bool revision_supplied, chert_revision_number_t revision_, bool create_db=false)
 Perform the opening operation to write. More...
 
bool basic_open (bool revision_supplied, chert_revision_number_t revision)
 
bool find (Cursor *) const
 find(C_) searches for the key of B->kt in the B-tree. More...
 
int delete_kt ()
 
void read_block (uint4 n, uint8_t *p) const
 read_block(n, p) reads block n of the DB file to address p. More...
 
void write_block (uint4 n, const uint8_t *p) const
 write_block(n, p) writes block n in the DB file from address p. More...
 
void set_overwritten () const
 
void block_to_cursor (Cursor *C_, int j, uint4 n) const
 
void alter ()
 Btree::alter(); is called when the B-tree is to be altered. More...
 
void compact (uint8_t *p)
 compact(p) compact the block at p by shuffling all the items up to the end. More...
 
void enter_key (int j, Key prevkey, Key newkey)
 enter_key(j, prevkey, newkey) is called after a block split. More...
 
int mid_point (uint8_t *p) const
 mid_point(p) finds the directory entry in c that determines the approximate mid point of the data in the block at p. More...
 
void add_item_to_block (uint8_t *p, Item_wr kt, int c)
 add_item_to_block(p, kt_, c) adds item kt_ to the block at p. More...
 
void add_item (Item_wr kt, int j)
 ChertTable::add_item(kt_, j) adds item kt_ to the block at cursor level C[j]. More...
 
void delete_item (int j, bool repeatedly)
 ChertTable::delete_item(j, repeatedly) is (almost) the converse of add_item. More...
 
int add_kt (bool found)
 add_kt(found) adds the item (key-tag pair) at B->kt into the B-tree, using cursor C. More...
 
void read_root ()
 
void split_root (uint4 split_n)
 Btree needs to gain a new level to insert more items: so split root block and construct a new one. More...
 
void form_key (const std::string &key) const
 
char other_base_letter () const
 
void lazy_alloc_deflate_zstream () const
 Allocate the zstream for deflating, if not already allocated. More...
 
void lazy_alloc_inflate_zstream () const
 Allocate the zstream for inflating, if not already allocated. More...
 
bool prev (Cursor *C_, int j) const
 
bool next (Cursor *C_, int j) const
 
bool prev_default (Cursor *C_, int j) const
 
bool next_default (Cursor *C_, int j) const
 
bool prev_for_sequential (Cursor *C_, int dummy) const
 
bool next_for_sequential (Cursor *C_, int dummy) const
 

Static Protected Member Functions

static int find_in_block (const uint8_t *p, Key key, bool leaf, int c)
 find_in_block(p, key, leaf, c) searches for the key in the block at p. More...
 
static uint4 block_given_by (const uint8_t *p, int c)
 block_given_by(p, c) finds the item at block address p, directory offset c, and returns its tag value as an integer. More...
 

Protected Attributes

const char * tablename
 The name of the table (used when writing changesets). More...
 
chert_revision_number_t revision_number
 revision number of the opened B-tree. More...
 
chert_tablesize_t item_count
 keeps a count of the number of items in the B-tree. More...
 
unsigned int block_size
 block size of the B tree in bytes More...
 
chert_revision_number_t latest_revision_number
 Revision number of the other base, or zero if there is only one base file. More...
 
bool both_bases
 set to true if baseA and baseB both exist as valid bases. More...
 
char base_letter
 the value 'A' or 'B' of the current base More...
 
bool faked_root_block
 true if the root block is faked (not written to disk). More...
 
bool sequential
 true iff the data has been written in a single write in sequential order. More...
 
int handle
 File descriptor of the table. More...
 
int level
 number of levels, counting from 0 More...
 
uint4 root
 the root block of the B-tree More...
 
Item_wr kt
 buffer of size block_size for making up key-tag items More...
 
uint8_t * buffer
 buffer of size block_size for reforming blocks More...
 
ChertTable_base base
 For writing back as file baseA or baseB. More...
 
std::string name
 The path name of the B tree. More...
 
int seq_count
 count of the number of successive instances of purely sequential addition, starting at SEQ_START_POINT (neg) and going up to zero. More...
 
uint4 changed_n
 the last block to be changed by an addition More...
 
int changed_c
 directory offset corresponding to last block to be changed by an addition More...
 
size_t max_item_size
 maximum size of an item (key-tag pair) More...
 
bool Btree_modified
 Set to true the first time the B-tree is modified. More...
 
bool full_compaction
 set to true when full compaction is to be achieved More...
 
bool writable
 Set to true when the database is opened to write. More...
 
bool cursor_created_since_last_modification
 Flag for tracking when cursors need to rebuild. More...
 
unsigned long cursor_version
 Version count for tracking when cursors need to rebuild. More...
 
Cursor C [BTREE_CURSOR_LEVELS]
 
uint8_t * split_p
 Buffer used when splitting a block. More...
 
int compress_strategy
 DONT_COMPRESS or Z_DEFAULT_STRATEGY, Z_FILTERED, Z_HUFFMAN_ONLY, Z_RLE. More...
 
z_stream * deflate_zstream
 Zlib state object for deflating. More...
 
z_stream * inflate_zstream
 Zlib state object for inflating. More...
 
bool lazy
 If true, don't create the table until it's needed. More...
 
uint4 last_readahead
 Last block readahead_key() preread. More...
 

Private Member Functions

 ChertTable (const ChertTable &)
 Copying not allowed. More...
 
ChertTableoperator= (const ChertTable &)
 Assignment not allowed. More...
 
bool really_empty () const
 Return true if there are no entries in the table. More...
 

Friends

class ChertCursor
 

Detailed Description

Class managing a Btree table in a Chert database.

A table is a store holding a set of key/tag pairs.

A key is used to access a block of data in a chert table.

Keys are of limited length.

Keys may not be empty (each Btree has a special empty key for internal use).

A tag is a piece of data associated with a given key. The contents of the tag are opaque to the Btree.

Tags may be of arbitrary length (the Btree imposes a very large limit). Note though that they will be loaded into memory in their entirety, so should not be permitted to grow without bound in normal usage.

Tags which are null strings are valid, and are different from a tag simply not being in the table.

Definition at line 347 of file chert_table.h.

Constructor & Destructor Documentation

◆ ChertTable() [1/2]

ChertTable::ChertTable ( const ChertTable )
private

Copying not allowed.

◆ ChertTable() [2/2]

ChertTable::ChertTable ( const char *  tablename_,
const std::string &  path_,
bool  readonly_,
int  compress_strategy_ = DONT_COMPRESS,
bool  lazy = false 
)

Create a new Btree object.

This does not create the table on disk - the create_and_open() method must be called to create the table on disk.

This also does not open the table - either the create_and_open() or open() methods must be called before use is made of the table.

Parameters
tablename_The name of the table (used in changesets).
path_Path at which the table is stored.
readonly_whether to open the table for read only access.
compress_strategy_DONT_COMPRESS, Z_DEFAULT_STRATEGY, Z_FILTERED, Z_HUFFMAN_ONLY, or Z_RLE.
lazyIf true, don't create the table until it's needed.

Definition at line 1562 of file chert_table.cc.

References LOGCALL_CTOR.

◆ ~ChertTable()

ChertTable::~ChertTable ( )

Close the Btree.

Any outstanding changes (ie, changes made without commit() having subsequently been called) will be lost.

Definition at line 1754 of file chert_table.cc.

References close(), deflate_zstream, inflate_zstream, and LOGCALL_DTOR.

Member Function Documentation

◆ add()

void ChertTable::add ( const std::string &  key,
std::string  tag,
bool  already_compressed = false 
)

Add a key/tag pair to the table, replacing any existing pair with the same key.

If an error occurs during the operation, an exception will be thrown.

If key is empty, then the null item is replaced.

e.g. btree.add("TODAY", "Mon 9 Oct 2000");

Parameters
keyThe key to store in the table.
tagThe tag to store in the table.
already_compressedtrue if tag is already compressed, for example because it is being opaquely copied (default: false).

Definition at line 978 of file chert_table.cc.

References Assert, BYTE_PAIR_RANGE, C, C2, COMPRESS_MIN, D2, DONT_COMPRESS, I2, K1, LOGCALL_VOID, and TOTAL_FREE().

Referenced by Chert::PostlistChunkWriter::flush(), ChertCompact::merge_docid_keyed(), ChertCompact::merge_postlists(), ChertCompact::merge_spellings(), ChertCompact::merge_synonyms(), ChertRecordTable::replace_record(), ChertWritableDatabase::set_metadata(), ChertDatabaseStats::write(), and ValueUpdater::write_tag().

◆ add_item()

void ChertTable::add_item ( Item_wr  kt_,
int  j 
)
protected

ChertTable::add_item(kt_, j) adds item kt_ to the block at cursor level C[j].

If there is not enough room the block splits and the item is then added to the appropriate half.

Definition at line 672 of file chert_table.cc.

References Assert, AssertRel, C, D2, DIR_END(), DIR_START, LOGCALL_VOID, MAX_FREE(), SET_DIR_END(), Item_base< T >::size(), and TOTAL_FREE().

◆ add_item_to_block()

void ChertTable::add_item_to_block ( uint8_t *  p,
Item_wr  kt_,
int  c 
)
protected

add_item_to_block(p, kt_, c) adds item kt_ to the block at p.

c is the offset in the directory that needs to be expanded to accommodate the new entry for the item. We know before this is called that there is enough contiguous room for the item in the block, so it's just a matter of shuffling up any directory entries after where we're inserting and copying in the item.

Definition at line 636 of file chert_table.cc.

References Assert, AssertRel, D2, DIR_END(), DIR_START, Item_base< T >::get_address(), LOGCALL_VOID, MAX_FREE(), SET_DIR_END(), SET_MAX_FREE(), SET_TOTAL_FREE(), setD(), Item_base< T >::size(), and TOTAL_FREE().

◆ add_kt()

int ChertTable::add_kt ( bool  found)
protected

add_kt(found) adds the item (key-tag pair) at B->kt into the B-tree, using cursor C.

found == find() is handed over as a parameter from Btree::add. Btree::alter() prepares for the alteration to the B-tree. Then there are a number of cases to consider:

If an item with the same key is in the B-tree (found is true), the new kt replaces it.

If then kt is smaller, or the same size as, the item it replaces, kt is put in the same place as the item it replaces, and the TOTAL_FREE measure is reduced.

If kt is larger than the item it replaces it is put in the MAX_FREE space if there is room, and the directory entry and space counts are adjusted accordingly.

  • But if there is not room we do it the long way: the old item is deleted with delete_item and kt is added in with add_item.

If the key of kt is not in the B-tree (found is false), the new kt is added in with add_item.

Definition at line 846 of file chert_table.cc.

References Assert, AssertRel, C, Item_base< T >::components_of(), D2, DIR_END(), DIR_START, Item_base< T >::get_address(), LOGCALL, MAX_FREE(), RETURN, SEQ_START_POINT, SET_MAX_FREE(), SET_TOTAL_FREE(), setD(), Item_base< T >::size(), and TOTAL_FREE().

◆ alter()

void ChertTable::alter ( )
protected

Btree::alter(); is called when the B-tree is to be altered.

It causes new blocks to be forced for the current set of blocks in the cursor.

The point is that if a block at level 0 is to be altered it may get a new number. Then the pointer to this block from level 1 will need changing. So the block at level 1 needs altering and may get a new block number. Then the pointer to this block from level 2 will need changing ... and so on back to the root.

The clever bit here is spotting the cases when we can make an early exit from this process. If C[j].rewrite is true, C[j+k].rewrite will be true for k = 1,2 ... We have been through all this before, and there is no need to do it again. If C[j].n was free at the start of the transaction, we can copy it back to the same place without violating the integrity of the B-tree. We don't then need a new n and can return. The corresponding C[j].rewrite may be true or false in that case.

Definition at line 338 of file chert_table.cc.

References Assert, C, LOGCALL_VOID, REVISION(), Item_wr::set_block_given_by(), and SET_REVISION().

◆ basic_open()

bool ChertTable::basic_open ( bool  revision_supplied,
chert_revision_number_t  revision 
)
protected

◆ block_given_by()

static uint4 ChertTable::block_given_by ( const uint8_t *  p,
int  c 
)
staticprotected

block_given_by(p, c) finds the item at block address p, directory offset c, and returns its tag value as an integer.

Referenced by next_default(), and prev_default().

◆ block_to_cursor()

void ChertTable::block_to_cursor ( Cursor C_,
int  j,
uint4  n 
) const
protected

◆ cancel()

void ChertTable::cancel ( )

◆ close()

void ChertTable::close ( bool  permanent = false)

Close the Btree.

This closes and frees any of the btree structures which have been created and opened.

Parameters
permanentIf true, the Btree will not reopen on demand.

Definition at line 1773 of file chert_table.cc.

References buffer, C, Item_base< T >::get_address(), handle, kt, level, LOGCALL_VOID, Cursor::p, and split_p.

Referenced by ChertDatabase::close(), commit(), create_and_open(), do_open_to_read(), erase(), open(), and ~ChertTable().

◆ commit()

void ChertTable::commit ( chert_revision_number_t  revision,
int  changes_fd = -1,
const std::string *  changes_tail = NULL 
)

Commit any outstanding changes to the table.

Commit changes made by calling add() and del() to the Btree.

If an error occurs during the operation, this will be signalled by an exception. In case of error, changes made will not be committed to the Btree - they will be discarded.

Parameters
new_revisionThe new revision number to store. This must be greater than the latest revision number (see get_latest_revision_number()), or an exception will be thrown.
changes_fdThe file descriptor to write changes to. Defaults to -1, meaning no changes will be written.

Definition at line 1826 of file chert_table.cc.

References Assert, base, base_letter, BLK_UNUSED, both_bases, BTREE_CURSOR_LEVELS, Btree_modified, Cursor::c, C, changed_c, changed_n, ChertTable_base::clear_bit_map(), close(), ChertTable_base::commit(), DIR_START, faked_root_block, handle, io_full_sync(), io_sync(), io_tmp_rename(), item_count, latest_revision_number, level, LOGCALL_VOID, Cursor::n, name, other_base_letter(), read_root(), Xapian::revision(), revision_number, Cursor::rewrite, root, seq_count, SEQ_START_POINT, sequential, ChertTable_base::set_have_fakeroot(), ChertTable_base::set_item_count(), ChertTable_base::set_level(), ChertTable_base::set_revision(), ChertTable_base::set_root(), ChertTable_base::set_sequential(), tablename, throw_database_closed(), writable, and ChertTable_base::write_to_file().

Referenced by ChertDatabase::compact(), ChertCompact::multimerge_postlists(), and ChertDatabase::set_revision_number().

◆ compact()

void ChertTable::compact ( uint8_t *  p)
protected

compact(p) compact the block at p by shuffling all the items up to the end.

MAX_FREE(p) is then maximized, and is equal to TOTAL_FREE(p).

Definition at line 469 of file chert_table.cc.

References Assert, D2, DIR_END(), DIR_START, Item_base< T >::get_address(), LOGCALL_VOID, SET_MAX_FREE(), SET_TOTAL_FREE(), setD(), and Item_base< T >::size().

◆ create_and_open()

void ChertTable::create_and_open ( unsigned int  blocksize)

Create a new empty btree structure on disk and open it at the initial revision.

The table must be writable - it doesn't make sense to create a table that is read-only!

The block size must be less than 64K, where K = 1024. It is unwise to use a small block size (less than 1024 perhaps), so we enforce a minimum block size of 2K.

Example:

Btree btree("X-"); btree.create_and_open(8192); // Files will be X-DB, X-baseA (and X-baseB).

Parameters
blocksize- Size of blocks to use.
Exceptions
Xapian::DatabaseCreateErrorif the table can't be created.
Xapian::InvalidArgumentErrorif the requested blocksize is unsuitable.

Definition at line 1719 of file chert_table.cc.

References Assert, block_size, close(), do_open_to_write(), handle, io_unlink(), LOGCALL_VOID, revision_number, ChertTable_base::set_block_size(), set_block_size(), ChertTable_base::set_have_fakeroot(), ChertTable_base::set_revision(), ChertTable_base::set_sequential(), throw_database_closed(), writable, and ChertTable_base::write_to_file().

Referenced by ChertDatabase::compact(), ChertTermListTable::create_and_open(), ChertDatabase::create_and_open_tables(), and ChertCompact::multimerge_postlists().

◆ cursor_get()

ChertCursor * ChertTable::cursor_get ( ) const

Get a cursor for reading from the table.

The cursor is owned by the caller - it is the caller's responsibility to ensure that it is deleted.

Definition at line 1318 of file chert_table.cc.

References LOGCALL, RETURN, and throw_database_closed().

Referenced by check_chert_table(), Chert::PostlistChunkWriter::flush(), ChertDatabase::get_postlist_cursor(), ChertAllTermsList::next(), ChertDatabase::open_metadata_keylist(), ChertDatabase::open_spelling_wordlist(), ChertDatabase::open_synonym_keylist(), ChertAllTermsList::skip_to(), and ValueUpdater::update().

◆ del()

bool ChertTable::del ( const std::string &  key)

Delete an entry from the table.

The entry will be removed from the table, if it exists. If it does not exist, no action will be taken. The item with an empty key can't be removed, and false is returned.

If an error occurs during the operation, this will be signalled by an exception.

e.g. bool deleted = btree.del("TODAY")

Parameters
keyThe key to remove from the table.
Returns
true if an entry was removed; false if it did not exist.

Definition at line 1105 of file chert_table.cc.

References Assert, CHERT_BTREE_MAX_KEY_LEN, LOGCALL, RETURN, and throw_database_closed().

Referenced by ChertRecordTable::delete_record(), Chert::PostlistChunkWriter::flush(), ChertWritableDatabase::set_metadata(), and ValueUpdater::write_tag().

◆ delete_item()

void ChertTable::delete_item ( int  j,
bool  repeatedly 
)
protected

ChertTable::delete_item(j, repeatedly) is (almost) the converse of add_item.

If repeatedly is true, the process repeats at the next level when a block has been completely emptied, freeing the block and taking out the pointer to it. Emptied root blocks are also removed, which reduces the number of levels in the B-tree.

Definition at line 771 of file chert_table.cc.

References Assert, AssertRel, BLK_UNUSED, Item_base< T >::block_given_by(), C, D2, DIR_END(), DIR_START, LOGCALL_VOID, MAX_FREE(), SET_DIR_END(), SET_MAX_FREE(), SET_TOTAL_FREE(), Item_base< T >::size(), and TOTAL_FREE().

◆ delete_kt()

int ChertTable::delete_kt ( )
protected

Definition at line 914 of file chert_table.cc.

References Assert, C, Item_base< T >::components_of(), LOGCALL, RETURN, and SEQ_START_POINT.

◆ do_open_to_read()

bool ChertTable::do_open_to_read ( bool  revision_supplied,
chert_revision_number_t  revision_ 
)
protected

Perform the opening operation to read.

Return true iff the open succeeded.

Definition at line 2007 of file chert_table.cc.

References basic_open(), BLK_UNUSED, block_size, C, close(), errno_to_string(), handle, io_open_block_rd(), lazy, level, LOGCALL, Cursor::n, name, Cursor::p, read_root(), RETURN, revision_number, and throw_database_closed().

Referenced by open().

◆ do_open_to_write()

bool ChertTable::do_open_to_write ( bool  revision_supplied,
chert_revision_number_t  revision_,
bool  create_db = false 
)
protected

Perform the opening operation to write.

Return true iff the open succeeded.

Definition at line 1509 of file chert_table.cc.

References BLK_UNUSED, C, close(), DIR_START, errno_to_string(), io_open_block_wr(), LOGCALL, name, RETURN, SEQ_START_POINT, throw_database_closed(), and zeroed_new().

Referenced by create_and_open(), and open().

◆ empty()

bool ChertTable::empty ( ) const
inline

◆ enter_key()

void ChertTable::enter_key ( int  j,
Key  prevkey,
Key  newkey 
)
protected

enter_key(j, prevkey, newkey) is called after a block split.

It enters in the block at level C[j] a separating key for the block at level C[j - 1]. The key itself is newkey. prevkey is the preceding key, and at level 1 newkey can be trimmed down to the first point of difference to prevkey for entry in C[j].

This code looks longer than it really is. If j exceeds the number of B-tree levels the root block has split and we have to construct a new one, but this is a rare event.

The key is constructed in b, with block number C[j - 1].n as tag, and this is added in with add_item. add_item may itself cause a block split, with a further call to enter_key. Hence the recursion.

Definition at line 538 of file chert_table.cc.

References Assert, AssertEq, AssertRel, C, C2, D2, Item_wr::form_null_key(), Key::get_address(), getint4(), I2, K1, Item_base< T >::key(), Key::length(), LOGCALL_VOID, Item_wr::set_key_and_block(), SET_TOTAL_FREE(), and TOTAL_FREE().

◆ erase()

void ChertTable::erase ( )

Erase this table from disk.

Definition at line 1696 of file chert_table.cc.

References close(), io_unlink(), and LOGCALL_VOID.

Referenced by ChertDatabase::compact(), and ChertLazyTable::create_and_open().

◆ exists()

bool ChertTable::exists ( ) const

Determine whether the btree exists on disk.

Definition at line 1689 of file chert_table.cc.

References file_exists(), LOGCALL, and RETURN.

Referenced by ChertDatabase::database_exists().

◆ find()

bool ChertTable::find ( Cursor C_) const
protected

find(C_) searches for the key of B->kt in the B-tree.

Result is true if found, false otherwise. When false, the B_tree cursor is positioned at the last key in the B-tree <= the search key. Goes to first (null) item in B-tree when key length == 0.

Note: The cursor can be left with C_[0].c == DIR_START - D2 if the requested key doesn't exist and is less than the smallest key in a leaf block, but after the dividing key. The caller needs to fix up C_[0].c in this case, either explicitly or by performing an operation which gives C_[0].c a valid value.

Definition at line 433 of file chert_table.cc.

References DIR_START, LOGCALL, Key::p, and RETURN.

◆ find_in_block()

int ChertTable::find_in_block ( const uint8_t *  p,
Key  key,
bool  leaf,
int  c 
)
staticprotected

find_in_block(p, key, leaf, c) searches for the key in the block at p.

leaf is true for a data block, and false for an index block (when the first key is dummy and never needs to be tested). What we get is the directory entry to the last key <= the key being searched for.

The lookup is by binary chop, with i and j set to the left and right ends of the search area. In sequential addition, c will often be the answer, so we test the keys round c and move i and j towards c if possible.

The returned value is < DIR_END(p). If leaf is false, the returned value is >= DIR_START; if leaf is true, it can also be == DIR_START - D2.

Definition at line 386 of file chert_table.cc.

References Assert, AssertRel, D2, DIR_END(), DIR_START, Key::get_address(), Item_base< T >::key(), LOGCALL_STATIC, and RETURN.

◆ flush_db()

void ChertTable::flush_db ( )

Flush any outstanding changes to the DB file of the table.

This must be called before commit, to ensure that the DB file is ready to be switched to a new version by the commit.

Definition at line 1803 of file chert_table.cc.

References Assert, Btree_modified, C, faked_root_block, handle, level, LOGCALL_VOID, throw_database_closed(), writable, and write_block().

Referenced by ChertDatabase::compact(), ChertSynonymTable::flush_db(), ChertSpellingTable::flush_db(), ChertCompact::multimerge_postlists(), and ChertDatabase::set_revision_number().

◆ form_key()

void ChertTable::form_key ( const std::string &  key) const
protected

Definition at line 948 of file chert_table.cc.

References LOGCALL_VOID.

◆ get_block_size()

unsigned int ChertTable::get_block_size ( ) const
inline

Get the block size.

Definition at line 559 of file chert_table.h.

Referenced by ChertDatabase::open_tables(), and ChertDatabase::open_tables_consistent().

◆ get_entry_count()

chert_tablesize_t ChertTable::get_entry_count ( ) const
inline

Return a count of the number of entries in the table.

The count does not include the ever-present item with null key.

Use empty() if you only want to know if the table is empty or not.

Returns
The number of entries in the table.

Definition at line 623 of file chert_table.h.

Referenced by ChertTableCheck::check(), check_chert_table(), and ChertRecordTable::get_doccount().

◆ get_exact_entry()

bool ChertTable::get_exact_entry ( const std::string &  key,
std::string &  tag 
) const

Read an entry from the table, if and only if it is exactly that being asked for.

If the key is found in the table, then the tag is copied to tag. If the key is not found tag is left unchanged.

The result is true iff the specified key is found in the Btree.

Parameters
keyThe key to look for in the table.
tagA tag object to fill with the value if found.
Returns
true if key is found in table, false if key is not found in table.

Definition at line 1187 of file chert_table.cc.

References Assert, C, CHERT_BTREE_MAX_KEY_LEN, LOGCALL, RETURN, and throw_database_closed().

Referenced by ChertTermList::ChertTermList(), Chert::PostlistChunkWriter::flush(), ChertPostListTable::get_freqs(), ChertDatabase::get_metadata(), ChertRecordTable::get_record(), ChertDatabaseStats::read(), and ChertPositionList::read_data().

◆ get_latest_revision_number()

chert_revision_number_t ChertTable::get_latest_revision_number ( ) const
inline

Get the latest revision number stored in this table.

This gives the higher of the revision numbers held in the base files of the B-tree, or just the revision number if there's only one base file.

It is possible that there are other, older, revisions of this table available, and indeed that the revision currently open is one of these older revisions.

Definition at line 598 of file chert_table.h.

Referenced by ChertDatabase::ChertDatabase(), and ChertDatabase::get_next_revision_number().

◆ get_open_revision_number()

chert_revision_number_t ChertTable::get_open_revision_number ( ) const
inline

Get the revision number at which this table is currently open.

It is possible that there are other, more recent or older revisions available.

Returns
the current revision number.

Definition at line 610 of file chert_table.h.

Referenced by ChertDatabaseReplicator::apply_changeset_from_conn(), ChertTableCheck::check(), ChertDatabase::ChertDatabase(), ChertDatabase::create_and_open_tables(), ChertDatabase::get_revision_number(), and ChertDatabase::open_tables_consistent().

◆ get_path()

string ChertTable::get_path ( ) const
inline

Definition at line 672 of file chert_table.h.

References name, and Key::p.

Referenced by ChertDatabase::compact().

◆ is_modified()

bool ChertTable::is_modified ( ) const
inline

Determine whether the object contains uncommitted modifications.

Returns
true if there have been modifications since the last the last call to commit().

Definition at line 653 of file chert_table.h.

Referenced by ChertDatabase::apply(), ChertWritableDatabase::has_uncommitted_changes(), ChertSynonymTable::is_modified(), and ChertSpellingTable::is_modified().

◆ is_open()

bool ChertTable::is_open ( ) const
inline

Return true if this table is open.

NB If the table is lazy and doesn't yet exist, returns false.

Definition at line 433 of file chert_table.h.

References ChertCursor::read_tag(), and Xapian::revision().

Referenced by ChertWritableDatabase::add_document_(), ChertWritableDatabase::delete_document(), ChertDatabase::open_term_list(), ChertWritableDatabase::replace_document(), and ChertDatabase::throw_termlist_table_close_exception().

◆ key_exists()

bool ChertTable::key_exists ( const std::string &  key) const

Check if a key exists in the Btree.

This is just like get_exact_entry() except it doesn't read the tag value so is more efficient if you only want to check that the key exists.

Parameters
keyThe key to look for in the table.
Returns
true if key is found in table, false if key is not found in table.

Definition at line 1210 of file chert_table.cc.

References Assert, C, CHERT_BTREE_MAX_KEY_LEN, LOGCALL, and RETURN.

◆ lazy_alloc_deflate_zstream()

void ChertTable::lazy_alloc_deflate_zstream ( ) const
protected

Allocate the zstream for deflating, if not already allocated.

Definition at line 1614 of file chert_table.cc.

References compress_strategy, deflate_zstream, rare, Xapian::Internal::str(), and usual.

◆ lazy_alloc_inflate_zstream()

void ChertTable::lazy_alloc_inflate_zstream ( ) const
protected

Allocate the zstream for inflating, if not already allocated.

Definition at line 1652 of file chert_table.cc.

References inflate_zstream, rare, Xapian::Internal::str(), and usual.

◆ mid_point()

int ChertTable::mid_point ( uint8_t *  p) const
protected

mid_point(p) finds the directory entry in c that determines the approximate mid point of the data in the block at p.

Definition at line 604 of file chert_table.cc.

References Assert, D2, DIR_END(), DIR_START, LOGCALL, RETURN, Item_base< T >::size(), and TOTAL_FREE().

◆ next()

bool ChertTable::next ( Cursor C_,
int  j 
) const
inlineprotected

◆ next_default()

bool ChertTable::next_default ( Cursor C_,
int  j 
) const
protected

◆ next_for_sequential()

bool ChertTable::next_for_sequential ( Cursor C_,
int  dummy 
) const
protected

◆ open() [1/2]

void ChertTable::open ( )

Open the btree at the latest revision.

Exceptions
Xapian::DatabaseCorruptErrorwill be thrown if the table is in a corrupt state.
Xapian::DatabaseOpeningErrorwill be thrown if the table cannot be opened (but is not corrupt - eg, permission problems, not present, etc).

Definition at line 2050 of file chert_table.cc.

References close(), do_open_to_read(), do_open_to_write(), LOGCALL_VOID, LOGLINE, and writable.

Referenced by ChertDatabaseReplicator::apply_changeset_from_conn(), ChertTableCheck::check(), check_chert_table(), ChertPostListTable::open(), ChertDatabase::open_tables(), and ChertDatabase::open_tables_consistent().

◆ open() [2/2]

bool ChertTable::open ( chert_revision_number_t  revision_)

Open the btree at a given revision.

Like Btree::open, but try to open at the given revision number and fail if that isn't possible.

Parameters
revision_- revision number to open.
Returns
true if table is successfully opened at desired revision; false if table cannot be opened at desired revision (but table is otherwise consistent).
Exceptions
Xapian::DatabaseCorruptErrorwill be thrown if the table is in a corrupt state.
Xapian::DatabaseOpeningErrorwill be thrown if the table cannot be opened (but is not corrupt - eg, permission problems, not present, etc).

Definition at line 2067 of file chert_table.cc.

References AssertEq, close(), do_open_to_read(), do_open_to_write(), LOGCALL, LOGLINE, RETURN, revision_number, and writable.

◆ operator=()

ChertTable& ChertTable::operator= ( const ChertTable )
private

Assignment not allowed.

◆ other_base_letter()

char ChertTable::other_base_letter ( ) const
inlineprotected

Definition at line 711 of file chert_table.h.

Referenced by commit().

◆ prev()

bool ChertTable::prev ( Cursor C_,
int  j 
) const
inlineprotected

Definition at line 816 of file chert_table.h.

◆ prev_default()

bool ChertTable::prev_default ( Cursor C_,
int  j 
) const
protected

◆ prev_for_sequential()

bool ChertTable::prev_for_sequential ( Cursor C_,
int  dummy 
) const
protected

◆ read_block()

void ChertTable::read_block ( uint4  n,
uint8_t *  p 
) const
protected

read_block(n, p) reads block n of the DB file to address p.

Definition at line 170 of file chert_table.cc.

References Assert, DIR_END(), DIR_START, io_read_block(), LOGCALL_VOID, rare, Xapian::Internal::str(), and throw_database_closed().

Referenced by next_for_sequential(), prev_for_sequential(), and write_changed_blocks().

◆ read_root()

void ChertTable::read_root ( )
protected

◆ read_tag()

bool ChertTable::read_tag ( Cursor C_,
std::string *  tag,
bool  keep_compressed 
) const

Read the tag value for the key pointed to by cursor C_.

Parameters
keep_compressedDon't uncompress the tag - e.g. useful if it's just being opaquely copied.
Returns
true if current_tag holds compressed data (always false if keep_compressed was false).

Definition at line 1223 of file chert_table.cc.

References aligned_write4(), Item_base< T >::append_chunk(), C2, Item_base< T >::components_of(), Item_base< T >::get_compressed(), I2, K1, LOGCALL, LOGLINE, RETURN, and Xapian::Internal::str().

◆ readahead_key()

bool ChertTable::readahead_key ( const string &  key) const

◆ really_empty()

bool ChertTable::really_empty ( ) const
private

Return true if there are no entries in the table.

Definition at line 1600 of file chert_table.cc.

References ChertCursor::find_entry(), handle, ChertCursor::next(), and throw_database_closed().

◆ set_block_size()

void ChertTable::set_block_size ( unsigned int  block_size_)

Set the block size.

It's only safe to do this before the table is created.

Definition at line 1707 of file chert_table.cc.

References block_size, BYTE_PAIR_RANGE, CHERT_DEFAULT_BLOCK_SIZE, and LOGCALL_VOID.

Referenced by ChertDatabase::compact(), ChertLazyTable::create_and_open(), create_and_open(), ChertDatabase::open_tables(), and ChertDatabase::open_tables_consistent().

◆ set_full_compaction()

void ChertTable::set_full_compaction ( bool  parity)

Definition at line 1309 of file chert_table.cc.

References Assert, and LOGCALL_VOID.

Referenced by ChertDatabase::compact().

◆ set_max_item_size()

void ChertTable::set_max_item_size ( size_t  block_capacity)
inline

Set the maximum item size given the block capacity.

At least this many items of maximum size must fit into a block. The default is BLOCK_CAPACITY (which is currently 4).

Definition at line 660 of file chert_table.h.

References BLOCK_CAPACITY, CHERT_MAX_ITEM_SIZE, D2, DIR_START, and throw_database_closed().

Referenced by ChertDatabase::compact().

◆ set_overwritten()

void ChertTable::set_overwritten ( ) const
protected

Definition at line 251 of file chert_table.cc.

References LOGCALL_VOID.

Referenced by next_for_sequential(), and prev_for_sequential().

◆ split_root()

void ChertTable::split_root ( uint4  split_n)
protected

Btree needs to gain a new level to insert more items: so split root block and construct a new one.

Definition at line 493 of file chert_table.cc.

References BTREE_CURSOR_LEVELS, C, DIR_START, Item_wr::form_null_key(), LOGCALL_VOID, SET_DIR_END(), SET_LEVEL(), SET_REVISION(), STRINGIZE, and zeroed_new().

◆ throw_database_closed()

void ChertTable::throw_database_closed ( )
static

◆ write_block()

void ChertTable::write_block ( uint4  n,
const uint8_t *  p 
) const
protected

write_block(n, p) writes block n in the DB file from address p.

When writing we check to see if the DB file has already been modified. If not (so this is the first write) the old base is deleted. This prevents the possibility of it being opened subsequently as an invalid base.

Definition at line 199 of file chert_table.cc.

References Assert, AssertEqParanoid, AssertParanoid, io_unlink(), io_write_block(), LOGCALL_VOID, and REVISION().

Referenced by flush_db().

◆ write_changed_blocks()

void ChertTable::write_changed_blocks ( int  changes_fd)

Append the list of blocks changed to a changeset file.

Parameters
changes_fdThe file descriptor to write changes to.

Definition at line 1909 of file chert_table.cc.

References Assert, base, block_size, ChertTable_base::calculate_last_block(), faked_root_block, ChertTable_base::find_changed_block(), handle, io_write(), LOGCALL_VOID, pack_string(), pack_uint(), read_block(), and tablename.

Referenced by ChertDatabase::set_revision_number().

Friends And Related Function Documentation

◆ ChertCursor

friend class ChertCursor
friend

Definition at line 348 of file chert_table.h.

Member Data Documentation

◆ base

ChertTable_base ChertTable::base
protected

For writing back as file baseA or baseB.

Definition at line 780 of file chert_table.h.

Referenced by cancel(), ChertTableCheck::check(), commit(), next_for_sequential(), and write_changed_blocks().

◆ base_letter

char ChertTable::base_letter
protected

the value 'A' or 'B' of the current base

Definition at line 745 of file chert_table.h.

Referenced by cancel(), ChertTableCheck::check(), and commit().

◆ block_size

unsigned int ChertTable::block_size
protected

◆ both_bases

bool ChertTable::both_bases
mutableprotected

set to true if baseA and baseB both exist as valid bases.

The unused base is deleted as soon as a write to the Btree takes place.

Definition at line 742 of file chert_table.h.

Referenced by commit().

◆ Btree_modified

bool ChertTable::Btree_modified
mutableprotected

Set to true the first time the B-tree is modified.

Definition at line 801 of file chert_table.h.

Referenced by cancel(), commit(), and flush_db().

◆ buffer

uint8_t* ChertTable::buffer
protected

buffer of size block_size for reforming blocks

Definition at line 777 of file chert_table.h.

Referenced by close().

◆ C

Cursor ChertTable::C[BTREE_CURSOR_LEVELS]
mutableprotected

◆ changed_c

int ChertTable::changed_c
protected

directory offset corresponding to last block to be changed by an addition

Definition at line 795 of file chert_table.h.

Referenced by cancel(), and commit().

◆ changed_n

uint4 ChertTable::changed_n
protected

the last block to be changed by an addition

Definition at line 791 of file chert_table.h.

Referenced by cancel(), and commit().

◆ compress_strategy

int ChertTable::compress_strategy
protected

DONT_COMPRESS or Z_DEFAULT_STRATEGY, Z_FILTERED, Z_HUFFMAN_ONLY, Z_RLE.

Definition at line 852 of file chert_table.h.

Referenced by lazy_alloc_deflate_zstream().

◆ cursor_created_since_last_modification

bool ChertTable::cursor_created_since_last_modification
mutableprotected

Flag for tracking when cursors need to rebuild.

Definition at line 810 of file chert_table.h.

Referenced by cancel().

◆ cursor_version

unsigned long ChertTable::cursor_version
protected

Version count for tracking when cursors need to rebuild.

Definition at line 813 of file chert_table.h.

Referenced by cancel().

◆ deflate_zstream

z_stream* ChertTable::deflate_zstream
mutableprotected

Zlib state object for deflating.

Definition at line 855 of file chert_table.h.

Referenced by lazy_alloc_deflate_zstream(), and ~ChertTable().

◆ faked_root_block

bool ChertTable::faked_root_block
protected

true if the root block is faked (not written to disk).

false otherwise. This is true when the btree hasn't been modified yet.

Definition at line 751 of file chert_table.h.

Referenced by cancel(), ChertTableCheck::check(), commit(), flush_db(), and write_changed_blocks().

◆ full_compaction

bool ChertTable::full_compaction
protected

set to true when full compaction is to be achieved

Definition at line 804 of file chert_table.h.

◆ handle

int ChertTable::handle
protected

File descriptor of the table.

If the table is lazily created and doesn't yet exist, this will be -1.

If close() has been called, this will be -2.

Definition at line 765 of file chert_table.h.

Referenced by cancel(), close(), commit(), create_and_open(), do_open_to_read(), flush_db(), really_empty(), and write_changed_blocks().

◆ inflate_zstream

z_stream* ChertTable::inflate_zstream
mutableprotected

Zlib state object for inflating.

Definition at line 858 of file chert_table.h.

Referenced by lazy_alloc_inflate_zstream(), and ~ChertTable().

◆ item_count

chert_tablesize_t ChertTable::item_count
protected

keeps a count of the number of items in the B-tree.

Definition at line 728 of file chert_table.h.

Referenced by cancel(), ChertTableCheck::check(), and commit().

◆ kt

Item_wr ChertTable::kt
mutableprotected

buffer of size block_size for making up key-tag items

Definition at line 774 of file chert_table.h.

Referenced by close().

◆ last_readahead

uint4 ChertTable::last_readahead
mutableprotected

Last block readahead_key() preread.

Definition at line 864 of file chert_table.h.

◆ latest_revision_number

chert_revision_number_t ChertTable::latest_revision_number
mutableprotected

Revision number of the other base, or zero if there is only one base file.

Definition at line 736 of file chert_table.h.

Referenced by cancel(), commit(), next_for_sequential(), and prev_for_sequential().

◆ lazy

bool ChertTable::lazy
protected

If true, don't create the table until it's needed.

Definition at line 861 of file chert_table.h.

Referenced by do_open_to_read().

◆ level

int ChertTable::level
protected

◆ max_item_size

size_t ChertTable::max_item_size
protected

maximum size of an item (key-tag pair)

Definition at line 798 of file chert_table.h.

◆ name

std::string ChertTable::name
protected

The path name of the B tree.

Definition at line 783 of file chert_table.h.

Referenced by commit(), and do_open_to_read().

◆ revision_number

chert_revision_number_t ChertTable::revision_number
protected

revision number of the opened B-tree.

Definition at line 725 of file chert_table.h.

Referenced by cancel(), ChertTableCheck::check(), commit(), create_and_open(), do_open_to_read(), next_for_sequential(), open(), and prev_for_sequential().

◆ root

uint4 ChertTable::root
protected

the root block of the B-tree

Definition at line 771 of file chert_table.h.

Referenced by cancel(), and commit().

◆ seq_count

int ChertTable::seq_count
protected

count of the number of successive instances of purely sequential addition, starting at SEQ_START_POINT (neg) and going up to zero.

Definition at line 788 of file chert_table.h.

Referenced by cancel(), and commit().

◆ sequential

bool ChertTable::sequential
protected

true iff the data has been written in a single write in sequential order.

Definition at line 756 of file chert_table.h.

Referenced by cancel(), ChertTableCheck::check(), and commit().

◆ split_p

uint8_t* ChertTable::split_p
protected

Buffer used when splitting a block.

This buffer holds the split off part of the block. It's only used when updating (in ChertTable::add_item().

Definition at line 848 of file chert_table.h.

Referenced by close().

◆ tablename

const char* ChertTable::tablename
protected

The name of the table (used when writing changesets).

Definition at line 716 of file chert_table.h.

Referenced by commit(), and write_changed_blocks().

◆ writable

bool ChertTable::writable
protected

Set to true when the database is opened to write.

Definition at line 807 of file chert_table.h.

Referenced by cancel(), commit(), create_and_open(), flush_db(), next_for_sequential(), open(), and prev_for_sequential().


The documentation for this class was generated from the following files: