[institut] SCL seminar: Alexandru Nicolin, Thursday, 10 November 2016, 14:00

Antun Balaz antun at ipb.ac.rs
Mon Nov 7 09:41:49 CET 2016


Dear colleagues,

You are cordially invited to the SCL seminar of the Center for the Study of Complex Systems, which will be held on Thursday, 10 November 2016 at 14:00 in the library reading room "Dr. Dragan Popović" of the Institute of Physics Belgrade. The talk entitled

Computer-based statistical description of the Romanian language

will be given by Dr. Alexandru Nicolin ("Horia Hulubei" National Institute for Physics and Nuclear Engineering, Bucharest, Romania). Abstract of the talk:

Motivated by the advent of security solutions that rely on voice biometrics, we will use extensive computer-based investigations to revisit the concept of phonetic balance for Romanian utterances and the distribution of Romanian words. We will show that the standard distribution of phonemes offers only a partial description of the phonetics of the language and that more detailed statistical indicators are needed. To this end, we will introduce a simple indicator that measures vowel-consonant (or consonant-vowel) sequences and analyze the distribution of consonant clusters for Romanian words. Our results will show that the distribution of consonant clusters is scale-free-like (akin to the distribution of words and phrases in large texts) and that large clusters of vowels or consonants are infrequent. This, in turn, indicates that utterances consisting of words which are statistically unrepresentative with respect to the previous indicators are good candidates for benchmarking the efficiency of voice biometrics solutions. For the distribution of Romanian words and word clusters we will show the validity of Zipf's law using a Romanian text corpus of roughly 5 million words. Finally, we will argue that these statistical analyses of text corpora belong to the general field of Big Data, for which there are numerous funding opportunities within Horizon 2020.


Best regards,
Antun Balaž

-----
Antun Balaž
E-mail: antun at ipb.ac.rs
Web: http://www.scl.rs/antun

Phone: +381 11 3713152
Fax: +381 11 3162190

Scientific Computing Laboratory
Institute of Physics Belgrade
Pregrevica 118, 11080 Belgrade, Serbia
-----



