
Valency Dictionary of Czech Verbs: Complex Tectogrammatical Annotation

Markéta Straňáková-Lopatková, Zdeněk Žabokrtský
Center for Computational Linguistics, Charles University, Prague
{stranak,zabokrtsky}@ckl.mff.cuni.cz

Theoretical background

  based on Functional Generative Description [Sgall et al., 1986]
  closely related to the tectogrammatical tree structures of the Prague Dependency Treebank [Hajičová et al., 2001]

  Verbal valency frame: a range of syntactic elements (verbal modifiers) either required or specifically permitted by the verb

  Examples of frames:
    ACT PAT ADDR EFF: vyměnit (to exchange)
    ACT MEANS DIR3: jet (to go)
    ACT PAT: čekat na někoho (to wait for sb.)

  Example of the TG-structure: "... od prodejců vybírá poplatky každé ráno správce tržiště." ("... the janitor of a market-place collects fees from the sellers every morning.")

Motivation

  There is no wide-coverage valency lexicon containing functors ("thematic roles").
  Indeed, there is no Czech lexicon where valency phenomena are treated in a sufficiently systematic way!

Goals

  to develop an annotation scheme, methodology and software tools for a tectogrammatically annotated valency dictionary of Czech verbs
  to verify the approach on a small set of verbs
  to keep maximal consistency for all captured phenomena
  emphasis on both human and machine readability of the output

Processing steps

  1. previously existing lexical data resources
  2. automatic preprocessing
  3. manual annotation
  4. manual & automatic consistency checking
  5. automatic postprocessing
  6. resulting valency lexicon (XML)

What should the dictionary ideally capture?

  for each verb:
    set of valency frames

  for each valency frame:
    ordered sequence of frame slots
    synonyms
    examples of usage
    reciprocity
    control (“equi/raising verbs“)
    possible type of reflexivity
    type of passivisation
    lemma of aspectual correlate(s)
    type of usage (prim./second./idiom.)
    semantic class (verba dicendi etc.)
    pointer(s) to EuroWordNet synset(s)
    number of occurrences in a text sample

  for each frame slot:
    functor
    surface realization(s)
    type (obligatory/optional/quasivalency/typical)

Sample from the dictionary

* dodat [to deliver, to supply, to add]
 -aspect: (imp); dodávat (perf)
 =
 + ACT(1;obl) PAT(4;obl) DIR3(;obl) BEN(3,pro+4;typ)
   -synon: dopravit
   -example: dodat (někomu/pro někoho) někam zboží [to deliver goods somewhere]
   -note: alter
   -use: prim
   -freq: 1
 + ACT(1;obl) ADDR(3;obl) PAT(4;obl) DIR3(;typ)
   -synon: dopravit
   -example: dodat někomu zboží (někam); dodali si (jeden druhému) zboží (někam) [to deliver goods to somebody; they delivered goods to each other]
   -note: alter
   -use: prim
   -reciprocity: ACT-ADDR
   -freq: 1
 + ACT(1;obl) PAT(k+3;opt) EFF(4,že;obl)
   -synon: říci
   -example: dodal k tomu své připomínky /vše, co věděl [to add a remark on something]
   -use: secondary
   -class: dicendi
   -freq: 18
 + ACT(1;obl) ADDR(3;obl) PAT(4,2;obl)
   -example: dodat někomu odvahu/odvahy; dodali si odvahy (jeden druhému) [to encourage somebody]
   -reciprocity: ACT-ADDR
   -use: idiom

* předpokládat [to presuppose, to assume, to demand]
 -aspect: (imp.)
 + ACT(1;obl) PAT(o+6;opt) EFF(4,že;obl)
   -synon: brát za dané
   -example: předpokládal o tom, že je to pravda; předpokládali o sobě (navzájem), že nelžou [he presupposed about it that it was true; they assumed (one about the other) that they didn’t tell lies]
   -reciprocity: ACT-PAT
   -use: prim
   -class: dicendi
   -ewn: 1
   -freq: 27
 + ACT(1,inf,že,aby;obl) PAT(4,že;obl)
   -synon: žádat
   -example: tato práce předpokládá jistou zručnost; pracovat zde předpokládá zručnost [this work demands certain skill; it demands skill to work here]
   -control: gen
   -use: secondary
   -ewn: 2
   -freq: 3

Current State

  1000 verbs being annotated (350 finished)
  circa 60% coverage on verbs in running text

Conclusion

  we can build an interesting and important language resource, but ...
  ... creating high-quality data requires a lot of human effort; the task cannot be fully automated

Future work

  finding hard-and-fast criteria for annotator’s decisions
  enlarging the current lexicon
  intensive use of different language resources
  linking to other languages
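
Frame lines in the dictionary sample, such as "+ ACT(1;obl) PAT(4;obl) DIR3(;obl) BEN(3,pro+4;typ)", appear to list one slot per functor in the form FUNCTOR(surface forms;type). The Python sketch below is a minimal illustration of how such a line could be split into slots. It is not the authors' software: the names FrameSlot, SLOT_RE and parse_frame are hypothetical, and the reading of the notation is an assumption, namely that comma-separated items are alternative surface realizations, that an empty list means none is prescribed, and that obl/opt/typ abbreviate obligatory/optional/typical.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed expansion of the type abbreviations seen in the sample entries.
SLOT_TYPES = {"obl": "obligatory", "opt": "optional", "typ": "typical"}


@dataclass
class FrameSlot:
    """One slot of a valency frame (hypothetical container, not from the poster)."""
    functor: str       # e.g. "ACT", "PAT", "DIR3"
    forms: List[str]   # surface realizations, e.g. ["3", "pro+4"]; may be empty
    slot_type: str     # abbreviation as written in the entry, e.g. "obl"


# FUNCTOR(forms;type), where forms is a possibly empty comma-separated list.
SLOT_RE = re.compile(r"(\w+)\(([^;()]*);(\w+)\)")


def parse_frame(line: str) -> List[FrameSlot]:
    """Split a frame line into slots, preserving the order given in the entry."""
    slots = []
    for functor, forms, slot_type in SLOT_RE.findall(line):
        form_list = [form for form in forms.split(",") if form]
        slots.append(FrameSlot(functor, form_list, slot_type))
    return slots


if __name__ == "__main__":
    frame = parse_frame("+ ACT(1;obl) PAT(4;obl) DIR3(;obl) BEN(3,pro+4;typ)")
    for slot in frame:
        label = SLOT_TYPES.get(slot.slot_type, slot.slot_type)
        print(slot.functor, slot.forms, label)
```

Keeping the slots in a list rather than a dictionary keyed by functor preserves the "ordered sequence of frame slots" that the poster names as a property of each valency frame.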

