Benko, Vladimír. Language Code Switching in Web Corpora.
In RASLAN 2017: Recent Advances in Slavonic Natural Language Processing. Ed. Aleš Horák, Pavel Rychlý, Adam Rambousek. Brno: Tribun EÚ, 2017,
pp. 97-105. ISBN 978-80-263-1340-3.
Benko, Vladimír. Are Web Corpora Inferior? The Case of Czech and Slovak.
In Proceedings of the Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing (CMLC-5+BigNLP 2017)
including the papers from the Web-as-Corpus (WAC-XI) guest section, Birmingham. Ed. Piotr Bański et al.
Mannheim: Institut für Deutsche Sprache, 2017, pp. 43-48.
Benko, Vladimír - Butašová, Anna. Teaching corpus linguistics with Aranea web corpora.
In Trudy meždunarodnoj konferencii "Korpusnaja lingvistika - 2017".
Sankt-Peterburg: Sankt-Peterburgskij gosudarstvennyj universitet - Institut lingvističeskich issledovanij RAN
- Rossijskij gosudarstvennyj pedagogičeskij universitet im. A. I. Gercena, 2017, pp. 16-21.
2016
Benko, Vladimír. Feeding the "Brno Pipeline": The Case of Araneum Slovacum.
In Proceedings of the Eleventh Workshop on Recent Advances in Slavonic Natural Language Processing (RASLAN 2016).
Brno: Tribun, 2016, pp. 19-27. ISSN 2336-4289.
Benko, Vladimír. Two Years of Aranea: Increasing Counts and Tuning the Pipeline.
In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2016).
Portorož: European Language Resources Association (ELRA), 2016, pp. 4245-4248. ISBN 978-2-9517408-9-1.
Benko, Vladimír – Zakharov, Victor. Very Large Russian Corpora: New Opportunities and New Challenges.
In Kompjuternaja lingvistika i intellektuaľnyje technologii: Po materialam meždunarodnoj konferencii «Dialog» (2016),
vypusk 15 (22).
Moskva: Rossijskij gosudarstvennyj gumanitarnyj universitet, 2016, pp. 79-93. ISSN 2221-7932.
2014
Benko, Vladimír. Aranea: Yet Another Family of (Comparable) Web Corpora.
In Petr Sojka, Aleš Horák, Ivan Kopeček and Karel Pala (Eds.):
Text, Speech and Dialogue. 17th International Conference,
TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings.
LNCS 8655.
Springer International Publishing Switzerland, 2014. pp. 257-264.
ISBN: 978-3-319-10815-5 (Print), 978-3-319-10816-2 (Online).
Benko, Vladimír. Compatible Sketch Grammars for Comparable Corpora.
In Andrea Abel, Chiara Vettori, Natascia Ralli
(Eds.): Proceedings of the XVI EURALEX International Congress: The User In Focus. 15–19 July 2014.
Bolzano/Bozen: Eurac Research, 2014. pp. 417-430. ISBN 978-88-88906-97-3.
2013
Benko, Vladimír. Data Deduplication in Slovak Corpora.
In Slovko 2013: Natural Language Processing, Corpus Linguistics, E-learning. Lüdenscheid: RAM-Verlag, 2013, pp. 27-39.
Benko, Vladimír. Compatible Sketch Grammar Experiment.
In Proceedings of the International Conference «Corpus Linguistics – 2013», June 25–27, 2013, St. Petersburg, 2013, pp. 21-29.
Useful resources
Microsoft Windows Keyboard Layouts (Sk)
Microsoft Windows Code Tables (Sk)
Slovak National Corpus Tagset Cheatsheet (Sk)
Penn Tagset, Araneum Universal Tagset & PCRE Cheatsheet
Slovak/Czech-compatible Cyrillic keyboard for Microsoft Windows (XP – Windows 8.1)
QWERTZ (ЭЮЕРТЗ) layout
QWERTY (ЭЮЕРТЫ) layout