Logion: Machine-Learning Based Detection and Correction of Textual Errors in Greek Philology (New Publication)

Congratulations to the Logion team on a new publication in the Proceedings of the Ancient Language Processing Workshop!

Abstract: We present statistical and machine-learning based techniques for detecting and correcting errors in text and apply them to the challenge of textual corruption in Greek philology. Most ancient Greek texts reach us through a long process of copying, in relay, from earlier manuscripts (now lost). In this process of textual transmission, copying errors tend to accrue. After training a BERT model on the largest premodern Greek dataset used for this purpose to date, we identify and correct previously undetected errors made by scribes in the process of textual transmission, in what is, to our knowledge, the first successful identification of such errors via machine learning. The premodern Greek BERT model we train is available for use at https://huggingface.co/cabrooks/LOGION-base.
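As a rough illustration of the general idea behind masked-language-model error detection (not the Logion team's actual method, which the abstract does not spell out), the sketch below masks each token in turn, asks a predictor for a distribution over replacements, and flags positions where the model strongly prefers a different word to the transmitted one. Everything named here is an assumption made for illustration: `flag_suspect_tokens`, `toy_predictor`, the ratio threshold, and the English toy sentence. The toy predictor is a hard-coded stand-in so the example runs without downloading a model; in practice one would presumably wrap a real masked language model, such as the one published at the URL above, behind the same interface.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface (an assumption, not from the paper): given the tokens
# and the index of the position to mask, return a probability distribution
# over candidate tokens for that position.
MaskedPredictor = Callable[[List[str], int], Dict[str, float]]


def flag_suspect_tokens(
    tokens: List[str],
    predict: MaskedPredictor,
    ratio_threshold: float = 10.0,  # illustrative value, not from the paper
    floor: float = 1e-6,            # avoids division by zero for unseen tokens
) -> List[Tuple[int, str, str, float]]:
    """Return (index, transmitted token, suggested token, probability ratio)
    for every position where the predictor prefers another token by at least
    `ratio_threshold` times the probability of the transmitted one."""
    suspects = []
    for i, original in enumerate(tokens):
        dist = predict(tokens, i)
        if not dist:
            continue
        # Most probable replacement according to the predictor.
        best, p_best = max(dist.items(), key=lambda kv: kv[1])
        p_orig = max(dist.get(original, 0.0), floor)
        ratio = p_best / p_orig
        if best != original and ratio >= ratio_threshold:
            suspects.append((i, original, best, ratio))
    return suspects


def toy_predictor(tokens: List[str], i: int) -> Dict[str, float]:
    """Hard-coded stand-in for a masked language model (illustration only)."""
    left = tokens[i - 1] if i > 0 else None
    if left == "the":
        return {"cat": 0.9, "sat": 0.09, "cut": 0.01}
    # Everywhere else, pretend the model fully agrees with the text.
    return {tokens[i]: 1.0}


if __name__ == "__main__":
    # "cut" plays the role of a scribal slip for "cat".
    for index, transmitted, suggested, ratio in flag_suspect_tokens(
        ["the", "cut", "sat"], toy_predictor
    ):
        print(f"position {index}: {transmitted!r} -> {suggested!r} (ratio {ratio:.1f})")
```

Comparing the probability of the best candidate against that of the transmitted word, rather than thresholding the transmitted word's probability alone, is one simple way to avoid flagging words that are merely rare; whether the published work uses a ratio of this kind is not stated in the abstract.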
