#Pyphonetics

Pyphonetics is a Python 3 library for phonetic algorithms. Right now, the following algorithms are implemented and supported:

  • Soundex
  • Metaphone
  • Refined Soundex
  • Fuzzy Soundex
  • Lein
  • Matching Rating Approach

In addition, the following distance metrics are available:

  • Hamming
  • Levenshtein

More will be added in the future.

#Installation

The module is available on PyPI. To install it, run pip install pyphonetics.

#Usage

>>> from pyphonetics import Soundex
>>> soundex = Soundex()
>>> soundex.phonetics('Rupert')
'R163'
>>> soundex.phonetics('Robert')
'R163'
>>> soundex.sounds_like('Robert', 'Rupert')
True
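
To illustrate why 'Rupert' and 'Robert' receive the same code, here is a minimal sketch of classic American Soundex in plain Python. It is not the code pyphonetics uses. The names simple_soundex and simple_sounds_like are hypothetical, and the sketch assumes that sounds_like amounts to comparing the two phonetic codes for equality.

```python
# Simplified American Soundex, for illustration only (not pyphonetics' code).
# Consonants with a similar sound share a digit; vowels, H, W and Y have none.
CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def simple_soundex(word: str, length: int = 4) -> str:
    """Return the first letter followed by digits, zero-padded to `length`."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        digit = CODES.get(letter, "")
        # Emit a digit only when it differs from the previous letter's digit.
        if digit and digit != previous:
            code += digit
        # H and W do not break a run of identical digits; vowels do.
        if letter not in "HW":
            previous = digit
    return (code + "0" * length)[:length]


def simple_sounds_like(word1: str, word2: str) -> bool:
    """Assumed behaviour: two words sound alike when their codes match."""
    return simple_soundex(word1) == simple_soundex(word2)


print(simple_soundex("Rupert"))                # R163
print(simple_sounds_like("Robert", "Rupert"))  # True
```

Both names reduce to R, then 1 (P or B), 6 (R) and 3 (T), because the vowels between them carry no digit.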

The same API applies to every algorithm, e.g.:

>>> from pyphonetics import Metaphone
>>> metaphone = Metaphone()
>>> metaphone.phonetics('discrimination')
'TSKRMNXN'

You can also use the distance(word1, word2, metric='levenshtein') method to compute the distance between the phonetic representations of two words. The metric can be 'levenshtein' (the default) or 'hamming'.

>>> from pyphonetics import RefinedSoundex
>>> rs = RefinedSoundex()
>>> rs.distance('Rupert', 'Robert')
0
>>> rs.distance('assign', 'assist', metric='hamming')
2
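
For readers unfamiliar with the two metrics, the following is a rough sketch of what each one computes on a pair of strings such as two phonetic codes. The functions hamming and levenshtein are hypothetical stand-ins written for this example, not the library's implementation. In particular, this hamming raises an error for strings of different lengths, which is an assumption of the sketch; pyphonetics may handle that case differently.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Assumption of this sketch: unequal lengths are rejected.
        raise ValueError("hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions from a to b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("R163", "R163"))       # 0, identical codes
print(levenshtein("kitten", "sitting"))  # 3
print(hamming("karolin", "kathrin"))     # 3
```

Levenshtein accepts strings of any length, which makes it a sensible default when the codes produced by an algorithm vary in length.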

#Credits

The module was largely based on the implementation of phonetic algorithms found in the Talisman.js Node NLP library.