Vladimír Benko

Researcher at the Slovak Academy of Sciences, Ľ. Štúr Institute of Linguistics
Panská 26, SK-81101 Bratislava, Slovakia
phone +421-2-54431762, fax +421-2-54431756
vladimir.benko at juls.savba.sk

Research Interests

Projects

Selected Bibliography

2017

  • Benko, Vladimír. Language Code Switching in Web Corpora. In RASLAN 2017: Recent Advances in Slavonic Natural Language Processing. Ed. Aleš Horák, Pavel Rychlý, Adam Rambousek. Brno: Tribun EÚ, 2017, pp. 97-105. ISBN 978-80-263-1340-3.    BibTeX PDF
  • Benko, Vladimír. Are Web Corpora Inferior? The Case of Czech and Slovak. In Proceedings of the Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing (CMLC-5+BigNLP 2017) including the papers from the Web-as-Corpus (WAC-XI) guest section, Birmingham. Ed. Piotr Bański et al. Mannheim: Institut für Deutsche Sprache, 2017, pp. 43-48.    BibTeX PDF
  • Benko, Vladimír - Butašová, Anna. Teaching corpus linguistics with Aranea web corpora. In Trudy meždunarodnoj konferencii "Korpusnaja lingvistika - 2017". Sankt-Peterburg: Sankt-Peterburgskij gosudarstvennyj universitet - Institut lingvističeskich issledovanij RAN - Rossijskij gosudarstvennyj pedagogičeskij universitet im. A. I. Gercena, 2017, pp. 16-21.    BibTeX PDF

2016

  • Benko, Vladimír. Feeding the "Brno Pipeline": The Case of Araneum Slovacum. In Proceedings of the Eleventh Workshop on Recent Advances in Slavonic Natural Languages Processing (RASLAN 2016). Brno: Tribun, 2016, pp. 19-27. ISSN 2336-4289.    BibTeX PDF
  • Benko, Vladimír. Two Years of Aranea: Increasing Counts and Tuning the Pipeline. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2016). Portorož: European Language Resources Association (ELRA), 2016, pp. 4245-4248. ISBN 978-2-9517408-9-1.    BibTeX PDF
  • Benko, Vladimír - Zakharov, Victor. Very Large Russian Corpora: New Opportunities and New Challenges. In Kompjuternaja lingvistika i intellektuaľnyje technologii: Po materialam meždunarodnoj konferencii «Dialog» (2016), vypusk 15 (22). Moskva: Rossijskij gosudarstvennyj gumanitarnyj universitet, 2016, pp. 79-93. ISSN 2221-7932.    BibTeX PDF

2014

  • Benko, Vladimír. Aranea: Yet Another Family of (Comparable) Web Corpora. In Petr Sojka, Aleš Horák, Ivan Kopeček and Karel Pala (Eds.): Text, Speech and Dialogue. 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings. LNCS 8655. Springer International Publishing Switzerland, 2014, pp. 257-264. ISBN 978-3-319-10815-5 (Print), 978-3-319-10816-2 (Online).    BibTeX PDF
  • Benko, Vladimír. Compatible Sketch Grammars for Comparable Corpora. In Andrea Abel, Chiara Vettori, Natascia Ralli (Eds.): Proceedings of the XVI EURALEX International Congress: The User In Focus. 15–19 July 2014. Bolzano/Bozen: Eurac Research, 2014, pp. 417-430. ISBN 978-88-88906-97-3.    BibTeX PDF

2013

  • Benko, Vladimír. Data Deduplication in Slovak Corpora. In Slovko 2013: Natural Language Processing, Corpus Linguistics, E-learning. Lüdenscheid: RAM-Verlag, 2013, pp. 27-39.    BibTeX PDF
  • Benko, Vladimír. Compatible Sketch Grammar Experiment. In Proceedings of the International Conference «Corpus Linguistics – 2013», June 25–27, 2013, St. Petersburg, 2013, pp. 21-29.    BibTeX PDF

Useful resources

  • Microsoft Windows Keyboard Layouts (Sk)    PDF
  • Microsoft Windows Code Tables (Sk)    PDF
  • Slovak National Corpus Tagset Cheatsheet (Sk)    PDF
  • Penn Tagset, Araneum Universal Tagset & PCRE Cheatsheet    PDF
  • Slovak/Czech-compatible Cyrillic keyboard for Microsoft Windows (XP – Windows 8.1)
    QWERTZ (ЭЮЕРТЗ) layout    ZIP
    QWERTY (ЭЮЕРТЫ) layout    ZIP