Natural Language Processing and Quantitative Linguistics: the contribution of the research on Latin texts
Prof. Dominique Longrée (Université de Liège / Laboratoire d’Analyse statistique des Langues anciennes)
The lecture takes place on Thursday, 23 February 2023, at 18:00 in room C144 (Celetná 20, Praha 1).
Abstract: Since the Second World War, the processing and statistical handling of digitized Latin texts has been an original and important contribution to quantitative linguistics. The development of quantitative studies devoted to Latin texts is to a large extent a Franco-Belgian achievement, and it rests largely on resources produced, beginning in 1961, by the Laboratory for the Statistical Analysis of Ancient Languages (LASLA) at the University of Liège. A description of LASLA's research and of its epistemological foundations emphasizes how lemmatisation and morpho-syntactic tagging have been keystones not only of Natural Language Processing but also of the statistical analysis of textual data. It shows how these operations of abstraction and regrouping allow other, more or less complex units of analysis to emerge, and how they ground new disciplines and approaches. More recently, the study of Latin literary works has also played a key role in the development of treebanks, of mining software and of artificial intelligence applied to texts. At the same time, this research has led to a better understanding of the parameters taken into account by the mechanisms of AI.