Gramatika v korpusu, korpus v gramatice: příspěvek k diskusi o vyhledávání gramatické informace v korpusech

Language: Czech
Title:Gramatika v korpusu, korpus v gramatice: příspěvek k diskusi o vyhledávání gramatické informace v korpusech
Creators:Uhlířová, Ludmila; uhlirova at ujc dot cas dot cz; Ústav pro jazyk český AV ČR, Letenská 4, Praha 1, 118 51
Journal or Publication Title:Slovo a slovesnost, 65, 1, pp. 16-24

Abstract

This article discusses several aspects of searching for grammatical information in corpora. It argues that any search procedure must consist of at least three fundamentally different steps. First, a hypothesis about some grammatical property of the language system must be formulated in terms of the available “tagging” menu. Second, general instructions concerning sample size, relevant context size, etc. must be stated. Only then can the third step be taken: the search itself and the interpretation of the attested data. Examples from the Czech National Corpus show that the boundary between the grammaticality and non-grammaticality of a phenomenon or category is a probability scale with more than two opposing values, and that the corpus can serve as an important tool for locating the most probable (favored) point on that scale. The question of zero versus non-zero occurrence of a phenomenon is discussed in greater detail. The article argues that if no example of a phenomenon is attested in the corpus, it does not necessarily follow that the corpus is too small, nor that it is necessary or worthwhile to call for a larger one.

Official URL: http://www.ceeol.com/aspx/issuedetails.aspx?issueid=F4667DD2-7E45-4379-B8C5-81B00FFE7A90&articleid=ECB1AE55-582F-41C8-BC5F-972FC1672179

Translated title:Grammar in corpus, corpus in grammar: A contribution to the discussion on searching for grammatical information in a corpus
Subjects:P Language and Literature > P Philology. Linguistics
Divisions:Humanities and Social Sciences > 9. Section of Humanities and Philology > Institute of the Czech Language > Slovo a slovesnost
ISSN:0037-7031
Publisher:Czech Language Institute of the Academy of Sciences of the Czech Republic
ID Code:2787
Item Type:Article
Deposited On:04 Sep 2008 20:32
Last Modified:04 Sep 2008 18:32

Citation

Uhlířová, Ludmila (2004) Gramatika v korpusu, korpus v gramatice: příspěvek k diskusi o vyhledávání gramatické informace v korpusech. Slovo a slovesnost, 65 (1). pp. 16-24. ISSN 0037-7031
