xapian-core  1.4.25
Classes | Macros | Functions
editdistance.cc File Reference

Edit distance calculation algorithm. More...

#include <config.h>
#include "editdistance.h"
#include "omassert.h"
#include <algorithm>
#include <cstdlib>
+ Include dependency graph for editdistance.cc:

Go to the source code of this file.

Classes

struct  edist_seq< CHR >
 
class  edist_state< CHR >
 

Macros

#define INF   1000000
 

Functions

template<class CHR >
static int seqcmp_editdist (const CHR *ptr1, int len1, const CHR *ptr2, int len2, int max_distance)
 
int edit_distance_unsigned (const unsigned *ptr1, int len1, const unsigned *ptr2, int len2, int max_distance)
 Calculate the edit distance between two sequences. More...
 

Detailed Description

Edit distance calculation algorithm.

Based on that described in:

"An extension of Ukkonen's enhanced dynamic programming ASM algorithm" by Hal Berghel, University of Arkansas and David Roach, Acxiom Corporation

http://berghel.net/publications/asm/asm.php

Definition in file editdistance.cc.

Macro Definition Documentation

◆ INF

#define INF   1000000

Definition at line 137 of file editdistance.cc.

Referenced by edist_state< CHR >::edist_state().

Function Documentation

◆ edit_distance_unsigned()

int edit_distance_unsigned ( const unsigned *  ptr1,
int  len1,
const unsigned *  ptr2,
int  len2,
int  max_distance 
)

Calculate the edit distance between two sequences.

Edit distance is defined as the minimum number of edit operations required to move from one sequence to another. The edit operations considered are:

  • Insertion of a character at an arbitrary position.
  • Deletion of a character at an arbitrary position.
  • Substitution of a character at an arbitrary position.
  • Transposition of two neighbouring characters at an arbitrary position in the string.
Parameters
ptr1A pointer to the start of the first sequence.
len1The length of the first sequence.
ptr2A pointer to the start of the second sequence.
len2The length of the first sequence.
max_distanceThe greatest edit distance that's interesting to us. If the true edit distance is > max_distance, any value > max_distance may be returned instead (which allows the edit distance algorithm to avoid work for poor matches).
Returns
The edit distance from one item to the other.

Definition at line 206 of file editdistance.cc.

Referenced by Xapian::Database::get_spelling_suggestion().

◆ seqcmp_editdist()

template<class CHR >
static int seqcmp_editdist ( const CHR *  ptr1,
int  len1,
const CHR *  ptr2,
int  len2,
int  max_distance 
)
static