latin-stemming

0.0.2 • Public • Published

Functions for finding the stems of Latin words, based on the Schinke algorithm. The key point about this approach is that it can be applied to a word even when its part of speech (e.g. noun vs. verb) is unknown. In those cases it will usually return more than one possible stem.

var stemming = require("latin-stemming");
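
To illustrate why a word of unknown part of speech usually yields more than one stem, here is a minimal sketch of Schinke-style stemming. It is not this module's implementation: the names sketchStems, QUE_WORDS, NOUN_SUFFIXES and VERB_REPLACEMENTS are hypothetical, the lists are a small illustrative subset, and the module's own quewords, nounsuffixes and verbsuffixes may differ.

```javascript
// Illustrative subset of Schinke-style rules. These are assumptions for the
// sketch, NOT the lists shipped with latin-stemming.
const QUE_WORDS = ["atque", "quoque", "neque"];
const NOUN_SUFFIXES = [
  "ibus", "ius", "ae", "am", "as", "em", "es", "ia", "is", "nt",
  "os", "ud", "um", "us", "a", "e", "i", "o", "u"
];
// Verb suffix -> replacement appended to the stem ("" means plain removal).
const VERB_REPLACEMENTS = {
  iuntur: "i", erunt: "i", untur: "i", iunt: "i", unt: "i",
  beris: "bi", bor: "bi", bo: "bi", ero: "eri",
  mini: "", ntur: "", stis: "", mur: "", mus: "", ris: "", sti: "",
  tis: "", tur: "", ns: "", nt: "", ri: "", m: "", r: "", s: "", t: ""
};

// Return the longest suffix in `suffixes` that `word` ends with, or null.
function longestSuffix(word, suffixes) {
  let best = null;
  for (const s of suffixes) {
    if (word.endsWith(s) && (best === null || s.length > best.length)) {
      best = s;
    }
  }
  return best;
}

// Remove `suffix`, append `replacement`, and keep the original word if the
// result would be shorter than two characters.
function strip(word, suffix, replacement) {
  if (suffix === null) return word;
  const stem = word.slice(0, word.length - suffix.length) + replacement;
  return stem.length >= 2 ? stem : word;
}

// Produce one stem per interpretation: the word read as a noun and as a verb.
function sketchStems(word) {
  let w = word.toLowerCase().replace(/j/g, "i").replace(/v/g, "u");
  if (w.endsWith("que")) {
    // Atomic -que words are returned unchanged for both interpretations.
    if (QUE_WORDS.includes(w)) return { noun: w, verb: w };
    w = w.slice(0, -3); // otherwise drop the enclitic "and"
  }
  const nounSuffix = longestSuffix(w, NOUN_SUFFIXES);
  const verbSuffix = longestSuffix(w, Object.keys(VERB_REPLACEMENTS));
  return {
    noun: strip(w, nounSuffix, ""),
    verb: strip(w, verbSuffix, verbSuffix === null ? "" : VERB_REPLACEMENTS[verbSuffix])
  };
}

console.log(sketchStems("puellam")); // { noun: 'puell', verb: 'puella' }
```

In this sketch the same surface form gives "puell" under the noun rules and "puella" under the verb rules, which is why a caller that does not know the part of speech gets both candidates back.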

Constants

The module uses several hard-coded arrays for lookups and replacements. You can access them yourself via:

  • stemming.quewords => an array of words ending in -que where the ending is part of the word itself and NOT the enclitic 'and'
  • stemming.nounsuffixes => an array of regexes for matching noun suffixes
  • stemming.verbsuffixes => an array of regexes for matching verb suffixes, and what those suffixes should be replaced with in the stem

Functions

stemming.stem(word, config) // => []

word is the Latin word to be stemmed. config is an object that can contain several optional values:

  • quewords - override the default quewords list provided by the module
  • nounsuffixes - override the default noun suffix regexes
  • verbsuffixes - override the default verb suffixes and replacements
  • type - if this is "Noun", "Adjective", "Adverb", or "Verb", the stemmer will only apply the relevant stemming rules
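
As a sketch of how the config overrides might be combined, the helper below extends the default -que list instead of replacing it and optionally restricts stemming to one word type. stemWithExtraQueWords is a hypothetical name, not part of the module, and the sketch assumes stemming.quewords is a plain array that can be concatenated. The module object is passed in as a parameter so the helper can be tried with or without the package installed.

```javascript
// Hypothetical helper, not part of latin-stemming.
// `stemming` is the module object, e.g. the result of require("latin-stemming").
function stemWithExtraQueWords(stemming, word, extraQueWords, type) {
  const config = {
    // Keep the module's defaults and append our own atomic -que words.
    quewords: stemming.quewords.concat(extraQueWords)
  };
  if (type) {
    // "Noun", "Adjective", "Adverb" or "Verb", as listed above.
    config.type = type;
  }
  return stemming.stem(word, config);
}

// Possible usage (requires the package to be installed):
//   var stemming = require("latin-stemming");
//   var stems = stemWithExtraQueWords(stemming, "plerique", ["plerique"], "Noun");
```

Passing only quewords leaves the noun and verb suffix lists at their defaults, since each config value is optional.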

stemming.couchkey(config) // => Function

Returns a contextless function that can be used for CouchDB indexes. Config can contain several optional values:

  • condition - string to be used inside an if (...) statement; if you include this, documents will only be processed if they pass the condition
  • wordkey (defaults to 'word') - the document property containing the word to be stemmed
  • typekey (defaults to 'wordclass') - the document property that contains the part of speech of the word (e.g. Noun, Verb); if the property is not available, keys for both verb and noun interpretations will be emitted
  • quewords - override the default quewords list provided by the module
  • nounsuffixes - override the default noun suffix regexes
  • verbsuffixes - override the default verb suffixes and replacements
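
One way the returned function might be used is as the map function of a CouchDB view, which CouchDB stores as a string inside a design document. The sketch below rests on several assumptions that the text above does not confirm: that the returned function can be serialised with toString(), that the condition string may refer to the document as doc, and that the property names lemma and pos exist in your documents. buildDesignDoc, the design document id and the view name are all hypothetical.

```javascript
// Hypothetical helper, not part of latin-stemming.
// `stemming` is the module object, e.g. the result of require("latin-stemming").
function buildDesignDoc(stemming) {
  const mapFn = stemming.couchkey({
    // Assumption: the document is visible as `doc` inside the condition.
    condition: "doc.type === 'entry'",
    wordkey: "lemma", // assumed document property holding the word
    typekey: "pos"    // assumed document property holding the part of speech
  });
  return {
    _id: "_design/latin", // hypothetical design document id
    views: {
      // CouchDB expects the map function as source text.
      by_stem: { map: mapFn.toString() }
    }
  };
}

// Possible usage (requires the package to be installed):
//   var stemming = require("latin-stemming");
//   var ddoc = buildDesignDoc(stemming);
//   // then save ddoc to the database with your CouchDB client of choice
```

Documents that lack the typekey property would, per the description above, be indexed under both their noun and verb stems, so a lookup by either stem can find them.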

Install

npm i latin-stemming

Version

0.0.2

License

BSD

Collaborators

  • shib71