Assignment 1: Finite-state Morphology

Recommended deadline: December 1, 2019
Deadline: March 1, 2020

In this assignment, you are required to create a finite-state morphological analyzer for English nouns and verbs (you are welcome to incorporate adjectives from the foma tutorial, but you do not have to). Your morphological analyzer should model the following morphological processes:

Your analyzer should produce all possible analyses of a word (only for the above morphological processes). For example, books should be analyzed both as book<N><PL> and book<V><pres><3sg>.

Your system should, at a minimum, work for the items provided in the file lemmas.txt, which contains a lemma in the first column and comma-separated POS tag(s) in the second column. You may extend your analyzer with additional lexical entries and affixes. However, please do not add more POS tags for the words provided in this lexicon.
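As an illustration of the lemmas.txt format described above, the following is a minimal Python sketch that reads such a file and prints lexc-style stem entries. It is not part of the assignment requirements. Several things here are assumptions: that the two columns are separated by whitespace, that the POS tags look like N and V, and that the sample lines resemble the real file. The lexicon names (NStems, VStems) and continuation classes (NounInfl, VerbInfl) are hypothetical names chosen for the example.

```python
from collections import defaultdict


def parse_lemmas(lines):
    """Map each lemma to its list of POS tags.

    Assumes one entry per line: a lemma, whitespace, then
    comma-separated POS tags (e.g. "book<TAB>N,V").
    """
    lexicon = {}
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        lemma, tags = parts
        lexicon[lemma] = [t.strip() for t in tags.split(",") if t.strip()]
    return lexicon


def to_lexc(lexicon, classes):
    """Render one lexc LEXICON block per POS tag.

    `classes` maps a POS tag to a continuation class name; both the
    block names and the class names are hypothetical.
    """
    by_tag = defaultdict(list)
    for lemma, tags in lexicon.items():
        for tag in tags:
            by_tag[tag].append(lemma)

    out = []
    for tag in sorted(by_tag):
        out.append(f"LEXICON {tag}Stems")
        for lemma in sorted(by_tag[tag]):
            out.append(f"{lemma} {classes[tag]} ;")
        out.append("")
    return "\n".join(out)


# Made-up sample lines standing in for lemmas.txt (assumed format).
SAMPLE = ["book\tN,V", "cat\tN", "walk\tV"]

lexicon = parse_lemmas(SAMPLE)
lexc_text = to_lexc(lexicon, {"N": "NounInfl", "V": "VerbInfl"})
print(lexc_text)
```

To use the real file, replace SAMPLE with the lines read from lemmas.txt. A word listed with several tags, such as book above, ends up in every matching stem lexicon, which is one way to obtain multiple analyses for the same surface form. Lemmas containing characters that are special in lexc would need escaping, which this sketch does not handle.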

Try to develop an analyzer that is easy to extend to other words. That is, prefer general solutions over hard-coded exceptions that work only for particular cases.

Please use the Xerox xfst/lexc formalisms. You can use the free implementations HFST or foma. If you are interested in using another formalism, e.g., SFST or OpenFST, please contact me first.

Please follow the assignment link and implement your solution in the repository created for you. Observe good programming practices (clear coding style, comments where necessary).