xapian-core  1.4.25
Xapian::Stem Class Reference

Class representing a stemming algorithm.

#include <stem.h>

Public Member Functions

 Stem (const Stem &o)
 Copy constructor.
 
Stem & operator= (const Stem &o)
 Assignment.
 
 Stem ()
 Construct a Xapian::Stem object which doesn't change terms.
 
 Stem (StemImplementation *p)
 Construct a Xapian::Stem object with a user-provided stemming algorithm.
 
 ~Stem ()
 Destructor.
 
std::string operator() (const std::string &word) const
 Stem a word.
 
bool is_none () const
 Return true if this is a no-op stemmer.
 
std::string get_description () const
 Return a string describing this object.
 
 Stem (const std::string &language)
 Construct a Xapian::Stem object for a particular language.
 
 Stem (const std::string &language, bool fallback)
 Construct a Xapian::Stem object for a particular language.
 

Static Public Member Functions

static std::string get_available_languages ()
 Return a list of available languages.
 

Private Attributes

Xapian::Internal::intrusive_ptr< StemImplementation > internal
 Reference counted internals.
 

Detailed Description

Class representing a stemming algorithm.

Definition at line 62 of file stem.h.

Constructor & Destructor Documentation

◆ Stem() [1/5]

Xapian::Stem::Stem ( const Stem &  o)

Copy constructor.

Definition at line 40 of file stem.cc.

◆ Stem() [2/5]

Xapian::Stem::Stem ( )

Construct a Xapian::Stem object which doesn't change terms.

Equivalent to Stem("none").

Definition at line 54 of file stem.cc.

Referenced by operator=().

◆ Stem() [3/5]

Xapian::Stem::Stem ( const std::string &  language)
explicit

Construct a Xapian::Stem object for a particular language.

Parameters
language  Either the English name for the language or the two-letter ISO 639 code.

The following language names are understood (aliases follow the name):

  • none - don't stem terms
  • arabic (ar) - Since Xapian 1.3.5
  • armenian (hy) - Since Xapian 1.3.0
  • basque (eu) - Since Xapian 1.3.0
  • catalan (ca) - Since Xapian 1.3.0
  • danish (da)
  • dutch (nl)
  • english (en) - Martin Porter's 2002 revision of his stemmer
  • earlyenglish - Early English (e.g. Shakespeare, Dickens) stemmer (since Xapian 1.3.2)
  • english_lovins (lovins) - Lovins' stemmer
  • english_porter (porter) - Porter's stemmer as described in his 1980 paper
  • finnish (fi)
  • french (fr)
  • german (de)
  • german2 - Normalises umlauts and ß
  • hungarian (hu)
  • indonesian (id) - Since Xapian 1.4.6
  • irish (ga) - Since Xapian 1.4.7
  • italian (it)
  • kraaij_pohlmann - A different Dutch stemmer
  • lithuanian (lt) - Since Xapian 1.4.7
  • nepali (ne) - Since Xapian 1.4.7
  • norwegian (nb, nn, no)
  • portuguese (pt)
  • romanian (ro)
  • russian (ru)
  • spanish (es)
  • swedish (sv)
  • tamil (ta) - Since Xapian 1.4.7
  • turkish (tr)
Parameters
fallback  If true then treat unknown language as "none", otherwise an exception is thrown (default: false). Parameter added in Xapian 1.4.14 - older versions always threw an exception.
Exceptions
Xapian::InvalidArgumentError  is thrown if language isn't recognised and fallback is false.

Definition at line 129 of file stem.cc.

◆ Stem() [4/5]

Xapian::Stem::Stem ( const std::string &  language,
bool  fallback 
)

Construct a Xapian::Stem object for a particular language.

Parameters
language  Either the English name for the language or the two-letter ISO 639 code.

The following language names are understood (aliases follow the name):

  • none - don't stem terms
  • arabic (ar) - Since Xapian 1.3.5
  • armenian (hy) - Since Xapian 1.3.0
  • basque (eu) - Since Xapian 1.3.0
  • catalan (ca) - Since Xapian 1.3.0
  • danish (da)
  • dutch (nl)
  • english (en) - Martin Porter's 2002 revision of his stemmer
  • earlyenglish - Early English (e.g. Shakespeare, Dickens) stemmer (since Xapian 1.3.2)
  • english_lovins (lovins) - Lovins' stemmer
  • english_porter (porter) - Porter's stemmer as described in his 1980 paper
  • finnish (fi)
  • french (fr)
  • german (de)
  • german2 - Normalises umlauts and ß
  • hungarian (hu)
  • indonesian (id) - Since Xapian 1.4.6
  • irish (ga) - Since Xapian 1.4.7
  • italian (it)
  • kraaij_pohlmann - A different Dutch stemmer
  • lithuanian (lt) - Since Xapian 1.4.7
  • nepali (ne) - Since Xapian 1.4.7
  • norwegian (nb, nn, no)
  • portuguese (pt)
  • romanian (ro)
  • russian (ru)
  • spanish (es)
  • swedish (sv)
  • tamil (ta) - Since Xapian 1.4.7
  • turkish (tr)
Parameters
fallback  If true then treat unknown language as "none", otherwise an exception is thrown (default: false). Parameter added in Xapian 1.4.14 - older versions always threw an exception.
Exceptions
Xapian::InvalidArgumentError  is thrown if language isn't recognised and fallback is false.

Definition at line 132 of file stem.cc.

◆ Stem() [5/5]

Xapian::Stem::Stem ( StemImplementation *  p)
explicit

Construct a Xapian::Stem object with a user-provided stemming algorithm.

You can subclass Xapian::StemImplementation to implement your own stemming algorithm (or to wrap a third-party algorithm) and then wrap your implementation in a Xapian::Stem object to pass to the Xapian API.

Parameters
p  The user-subclassed StemImplementation object. This is reference counted, and so will be automatically deleted by the Xapian::Stem wrapper when no longer required.

Definition at line 135 of file stem.cc.

◆ ~Stem()

Xapian::Stem::~Stem ( )

Destructor.

Definition at line 137 of file stem.cc.

Member Function Documentation

◆ get_available_languages()

static std::string Xapian::Stem::get_available_languages ( )
inline static

Return a list of available languages.

Each stemmer is only included once in the list (not once for each alias). The name included is the English name of the language.

The list is returned as a string, with language names separated by spaces. This is a static method, so a Xapian::Stem object is not required for this operation.

Definition at line 181 of file stem.h.

References Xapian::Internal::get_constinfo_(), Xapian::Internal::constinfo::stemmer_data, and Xapian::Internal::constinfo::stemmer_name_len.

Referenced by DEFINE_TESTCASE(), and main().

◆ get_description()

std::string Xapian::Stem::get_description ( ) const

Return a string describing this object.

Definition at line 147 of file stem.cc.

Referenced by DEFINE_TESTCASE().

◆ is_none()

bool Xapian::Stem::is_none ( ) const
inline

Return true if this is a no-op stemmer.

Definition at line 166 of file stem.h.

Referenced by DEFINE_TESTCASE(), Xapian::TermGenerator::Internal::index_text(), and Xapian::QueryParser::Internal::parse_query().

◆ operator()()

std::string Xapian::Stem::operator() ( const std::string &  word) const

Stem a word.

Parameters
word  a word to stem.
Returns
the stem

Definition at line 140 of file stem.cc.

◆ operator=()

Stem & Xapian::Stem::operator= ( const Stem &  o)
default

Assignment.

Definition at line 43 of file stem.cc.

References internal, and Stem().

Member Data Documentation

◆ internal

Xapian::Internal::intrusive_ptr<StemImplementation> Xapian::Stem::internal
private

Reference counted internals.

Definition at line 65 of file stem.h.

Referenced by operator=().


The documentation for this class was generated from the following files: