The tools listed below go back to research by Lauri Karttunen and others at Xerox Research Centre Europe in Grenoble, which set out to build highly efficient compilers and runtime libraries for finite-state transducers, i.e. to move beyond regular expressions to regular RELATIONS. This not only makes it possible to formalize replacements (regular expressions with an "output tape"), but also yields reversible automata (the input and output roles can be swapped) and leads to a domain-specific language that describes transducers in a very readable way, including naming of sub-automata, which makes it useful for formal specification or for linguistic rules (phonology and morphology, i.e. sound and word grammar). The last two projects in the list, FOMA and HfstXfst, are open-source clones of the Xerox tools. Once you have used these for a week, you will never want to go back to ugly "ordinary" regexes again.
1. Xerox xfst
2. Xerox lexc
3. Xerox twolc
4. Mans Hulden's FOMA:
Repo: https://fomafst.github.io/
Paper: https://dingo.sbs.arizona.edu/~mhulden/hulden_foma_2009.pdf
Demo: https://dsacl3-2018.github.io/xfst-demo/
Tutorial: https://foma.sourceforge.net/lrec2010/
5. Helsinki HfstXfst:
Homepage: https://github.com/hfst/hfst/wiki/HfstXfst
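To make the idea of a reversible regular relation a bit more concrete, here is a toy sketch in plain Python. It is not the API of xfst, FOMA or HFST; the names `FST`, `apply_down`, `apply_up` and `invert` are made up for illustration, and the sketch assumes a letter transducer without epsilon arcs (every arc reads one symbol and writes one symbol).

```python
from typing import NamedTuple


class FST(NamedTuple):
    """Hypothetical minimal transducer: no epsilon arcs, one symbol in, one out."""
    start: int
    finals: frozenset
    arcs: dict  # arcs[state] = list of (input_symbol, output_symbol, next_state)


def apply_down(fst, symbols):
    """Return every output sequence the transducer relates to `symbols`."""
    # A configuration is (current_state, output_written_so_far).
    configs = {(fst.start, ())}
    for sym in symbols:
        configs = {
            (nxt, out + (o,))
            for state, out in configs
            for i, o, nxt in fst.arcs.get(state, [])
            if i == sym
        }
    return {out for state, out in configs if state in fst.finals}


def invert(fst):
    """Swap the input and output tapes, which gives the inverse relation."""
    swapped = {s: [(o, i, n) for i, o, n in arcs] for s, arcs in fst.arcs.items()}
    return FST(fst.start, fst.finals, swapped)


def apply_up(fst, symbols):
    """Run the same machine in the other direction."""
    return apply_down(invert(fst), symbols)


# Toy relation over the alphabet {a, b, c}: rewrite every "a" as "b".
A_TO_B = FST(0, frozenset({0}), {0: [("a", "b", 0), ("b", "b", 0), ("c", "c", 0)]})


def down(s):
    return sorted("".join(out) for out in apply_down(A_TO_B, s))


def up(s):
    return sorted("".join(out) for out in apply_up(A_TO_B, s))
```

With this sketch, `down("abca")` gives `['bbcb']`, while `up("bb")` gives `['aa', 'ab', 'ba', 'bb']`: the same machine answers "what does this become?" and "what could this have come from?", which is what an ordinary search-and-replace regex cannot do.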
Books:
(a) https://www.amazon.co.uk/Finite-State-Processing-Synthesis-L...
(b) https://www.amazon.co.uk/Recognition-Algorithms-Finite-State...
(c) https://www.amazon.co.uk/Finite-State-Techniques-Transducers...
(d) https://www.amazon.co.uk/Finite-state-Language-Processing-Sp...
(e) https://www.amazon.co.uk/Finite-State-Morphology-CSLI-Comput...