# Linguistics Robin

Linguistics Robin is a Python collection of linguistics tools that grew out of a phonetics-only library, which is why most of the current tooling is phonetic. The following phonetic algorithms are currently implemented and supported:

  • Soundex
  • Metaphone
  • Double-Metaphone
  • Refined Soundex
  • Fuzzy Soundex
  • Lein
  • Match Rating Approach
  • New York State Identification and Intelligence System (NYSIIS)
  • Caverphone
  • Caverphone 2

In addition, the following distance metrics are available:

  • Hamming
  • Levenshtein
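
For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of what each one computes. The function names `hamming` and `levenshtein` are illustrative only and are not taken from this library; in particular, raising `ValueError` for strings of unequal length is an assumption of the sketch, not documented behaviour of the package.

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Assumption of this sketch: unequal lengths are rejected.
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(hamming("karolin", "kathrin"))     # 3
print(levenshtein("kitten", "sitting"))  # 3
```

Hamming only compares strings position by position, so it suits fixed-length codes; Levenshtein also handles codes of different lengths.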

More algorithms are planned. Please refer to the issues list for those slated for future releases.

Pull requests are always welcome to assist with the addition of new algorithms.

# Installation

The module is available on PyPI. To install it, run pip install linguistics_robin.

# Usage

>>> from linguistics_robin import Soundex
>>> soundex = Soundex()
>>> soundex.phonetics('Rupert')
'R163'
>>> soundex.phonetics('Robert')
'R163'
>>> soundex.sounds_like('Robert', 'Rupert')
True
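
To show why 'Rupert' and 'Robert' share a code, here is a rough standalone sketch of the classic American Soundex rules. It is not the library's implementation, and `soundex_sketch` is a hypothetical name; edge cases may be handled differently by the `Soundex` class.

```python
# Consonant groups of classic Soundex; vowels, H, W and Y carry no digit.
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex_sketch(word):
    """Return a 4-character Soundex-style code (illustrative only)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = CODES.get(first, "")  # the first letter's own digit counts for merging
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same digit
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this, so a repeated digit is kept
    # Pad with zeros and truncate to one letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


print(soundex_sketch("Rupert"))  # R163
print(soundex_sketch("Robert"))  # R163
```

Both names reduce to R, then P or B (1), R (6), T (3), which is why sounds_like treats them as a match.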

The same API applies to every algorithm, for example:

>>> from linguistics_robin import Metaphone
>>> metaphone = Metaphone()
>>> metaphone.phonetics('discrimination')
'TSKRMNXN'

You can also use the distance(word1, word2, metric='levenshtein') method to find the distance between the phonetic representations of two words.

>>> from linguistics_robin import RefinedSoundex
>>> rs = RefinedSoundex()
>>> rs.distance('Rupert', 'Robert')
0
>>> rs.distance('assign', 'assist', metric='hamming')
2