Biblos.js, a.k.a. Morpheus for Ancient Greek
- see the main Philosophy section
All words in the text (usually a sentence or a few sentences) are looked up in a special hand-made dictionary of “terms.” By “terms” I mean words that do not require further analysis: all forms of pronouns and articles, all indeclinable dictionary entries, irregular verb forms, etc. For example, the phrase εἴ περ γάρ ἐστιν ἡ ψυχὴ ἐν contains only one non-term word, ψυχὴ; all the others are particles, prepositions, articles, and a finite form of the verb “to be”, that is, “terms”.
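The term lookup can be pictured with a minimal sketch. The `TERMS` set and the `splitTerms` function are hypothetical names made up for this illustration, and the set holds only the words from the example phrase; the real dictionary is hand-made and much larger.

```javascript
// Hypothetical mini-dictionary of "terms" (illustration only).
// Strings are NFC-normalized so that accent encodings compare equal.
const TERMS = new Set(
  ['εἴ', 'περ', 'γάρ', 'ἐστιν', 'ἡ', 'ἐν'].map(w => w.normalize('NFC'))
);

// Split a phrase into words, then separate terms from the words
// that still need morphological analysis.
function splitTerms(phrase) {
  const words = phrase.normalize('NFC').split(/\s+/).filter(Boolean);
  return {
    terms: words.filter(w => TERMS.has(w)),
    rest: words.filter(w => !TERMS.has(w)),
  };
}

const result = splitTerms('εἴ περ γάρ ἐστιν ἡ ψυχὴ ἐν');
console.log(result.rest); // the only word passed on for further analysis
```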
The Antrax module (https://github.com/mbykov/antrax) divides a word form into chains of all possible segments (for a long word there can be tens of thousands of chains). It then selects the best chains by establishing correspondences between segments. The matching patterns may be:
- prefix - prefix - stem - ending
- stem - connecting vowel - stem - ending
- stem - suffix (e.g. participles) - ending
- etc., any other meaningful combination
As a result, Morpheus easily identifies rather complex forms, for example προσδιαιρέω => προσ-δι-αιρέω, ἀμφοτερογνώμων => ἀμφοτερ-ο-γνώμων, etc.
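The idea of "all chains, then the best chains" can be sketched as follows. This is not the Antrax code. The tiny `PREFIXES`, `STEMS`, and `ENDINGS` inventories and the function names are assumptions made for the illustration, and only the "prefix - prefix - stem - ending" pattern is checked, with the ending split off from the stem.

```javascript
const nfc = s => s.normalize('NFC');

// Hypothetical toy inventories; a real system takes these from its dictionaries.
const PREFIXES = new Set(['προσ', 'δι'].map(nfc));
const STEMS = new Set(['αιρ'].map(nfc));
const ENDINGS = new Set(['έω'].map(nfc));

// Enumerate every way to cut the word into contiguous segments.
// A word of n characters has 2^(n-1) such chains.
function allChains(word) {
  const chars = Array.from(nfc(word));
  const chains = [];
  const walk = (start, chain) => {
    if (start === chars.length) { chains.push(chain); return; }
    for (let end = start + 1; end <= chars.length; end++) {
      walk(end, [...chain, chars.slice(start, end).join('')]);
    }
  };
  walk(0, []);
  return chains;
}

// Keep a chain only if it reads as: zero or more prefixes, a stem, an ending.
function isMeaningful(chain) {
  if (chain.length < 2) return false;
  const ending = chain[chain.length - 1];
  const stem = chain[chain.length - 2];
  const prefixes = chain.slice(0, -2);
  return ENDINGS.has(ending) && STEMS.has(stem) &&
    prefixes.every(p => PREFIXES.has(p));
}

const chains = allChains('προσδιαιρέω');
const best = chains.filter(isMeaningful);
console.log(chains.length);                 // 1024 chains for an 11-letter word
console.log(best.map(c => c.join('-')));    // [ 'προσ-δι-αιρ-έω' ]
```

Even this short word yields over a thousand candidate chains, which is why the filtering step matters more than the enumeration.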
To establish the connection between a stem and an ending, Morpheus has a special module. This module builds a Flex database from all the wordforms listed at https://en.wiktionary.org/wiki/Category:Ancient_Greek_lemmas, more than 300 thousand wordforms. The same wordforms serve as tests, so Morpheus also has more than 300 thousand tests. The Flex database acts as a filter trained on data from wiktionary.org, something like a black box: what happens inside it does not interest us as long as all tests pass. So Morpheus has no ready-made “paradigm system.” It automatically establishes an unambiguous correspondence between a wordform and the dictionary entries, and nothing more.
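A minimal sketch of the "same data builds the filter and tests it" idea, under loud assumptions: the `ROWS` sample, the row shape, and the `buildFlex` and `analyze` functions are all hypothetical, and the real Flex database is certainly more elaborate than an ending-to-morph map.

```javascript
const nfc = s => s.normalize('NFC');

// Hypothetical sample rows in the spirit of Wiktionary inflection tables.
const ROWS = [
  { lemma: 'λόγος', stem: 'λόγ', form: 'λόγος', morph: 'nom.sg' },
  { lemma: 'λόγος', stem: 'λόγ', form: 'λόγου', morph: 'gen.sg' },
  { lemma: 'λόγος', stem: 'λόγ', form: 'λόγον', morph: 'acc.sg' },
  { lemma: 'νόμος', stem: 'νόμ', form: 'νόμος', morph: 'nom.sg' },
  { lemma: 'νόμος', stem: 'νόμ', form: 'νόμου', morph: 'gen.sg' },
  { lemma: 'νόμος', stem: 'νόμ', form: 'νόμον', morph: 'acc.sg' },
].map(r => ({ ...r, lemma: nfc(r.lemma), stem: nfc(r.stem), form: nfc(r.form) }));

// Build two lookups from the rows: ending -> morphs, and stem -> lemma.
function buildFlex(rows) {
  const flex = new Map();
  const stems = new Map();
  for (const { lemma, stem, form, morph } of rows) {
    const ending = form.slice(stem.length);
    if (!flex.has(ending)) flex.set(ending, new Set());
    flex.get(ending).add(morph);
    stems.set(stem, lemma);
  }
  return { flex, stems };
}

// Try every stem/ending split and keep those that both lookups accept.
function analyze(form, { flex, stems }) {
  const results = [];
  for (let i = 1; i <= form.length; i++) {
    const stem = form.slice(0, i);
    const ending = form.slice(i);
    if (!stems.has(stem) || !flex.has(ending)) continue;
    for (const morph of flex.get(ending)) {
      results.push({ lemma: stems.get(stem), morph });
    }
  }
  return results;
}

// The rows that built the database double as its tests.
const db = buildFlex(ROWS);
const failures = ROWS.filter(r =>
  !analyze(r.form, db).some(a => a.lemma === r.lemma && a.morph === r.morph));
console.log(failures.length); // number of wordforms the filter fails to recover
```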
In addition, the Flex database classifies not the usual “paradigm” but each individual wordform. This decision made it possible to abandon the otherwise inevitable concept of an exception. Morpheus easily and automatically processes forms that are usually considered exceptions, for example the forms of the verb “to be”, forms like θρίξ (nom) - τριχός (gen), etc. If we nevertheless look into the black box, we will see both “classical paradigms” and “paradigms with unexpected additions”, and in many cases a “certain horror” in general. Therefore it is better not to look there, provided that all tests pass. The use of a “black box” instead of a “paradigm system” follows from not using the concept of “language” and from having no intention of researching or modeling it.
An important corollary: antrax has no separate “noun paradigm” and “adjective paradigm”; the ways of declension are common to all types of names. If the dictionary entry has a “gender” parameter (a noun, in modern terms), the result takes the gender from the dictionary entry; if it does not (an adjective, in modern terms), the gender comes from the properties of the ending. This is fully consistent with ancient grammar.
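The gender rule can be sketched in a few lines. The entry objects, the `endingOs` properties, and the `describe` function are hypothetical shapes chosen for this illustration, not the actual antrax data model.

```javascript
// Hypothetical dictionary entries: a noun carries "gender", an adjective does not.
const hodos = { lemma: 'ὁδός', gender: 'fem' };
const agathos = { lemma: 'ἀγαθός' };

// Hypothetical properties attached to the ending -ος by the ending lookup.
const endingOs = { gender: 'masc', case: 'nom', number: 'sg' };

// The dictionary gender wins when present; otherwise the ending decides.
function describe(entry, endingProps) {
  return {
    lemma: entry.lemma,
    ...endingProps,
    gender: entry.gender ?? endingProps.gender,
  };
}

console.log(describe(hodos, endingOs).gender);   // 'fem', from the dictionary entry
console.log(describe(agathos, endingOs).gender); // 'masc', from the ending
```

Both words share the same ending and the same declension, so no separate noun and adjective tables are needed; only the source of the gender differs.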
The ultimate goal of the application is to show in the interface what an ancient didaskalos, rather than a modern linguist, would explain to us about a given sentence or wordform.