Words That Matter. Towards a Swedish-Czech Colligational Lexicon of Basic Verbs

Ústav formální a aplikované lingvistiky MFFUK 2009

ISBN 9788090417533

Basic verbs, i.e. very common verbs that typically denote physical movements, locations, states or actions, undergo various semantic shifts and acquire different secondary uses. In extreme cases, the distribution of a secondary use grows so general that the verb is regarded as an auxiliary verb (go, as in to be going to), a phase verb (turn, grow), etc. These uses are usually well documented in grammars and language textbooks, as are idiomatic expressions (phraseologisms) in dictionaries. There is, however, a grey area in between, which is extremely difficult for non-native speakers to learn. It consists of secondary uses with limited collocability, in particular light verb constructions, and of secondary meanings that are activated only under particular morphosyntactic conditions.

The secondary uses and constructions of basic verbs are usually semantically transparent, so they do not pose comprehension problems. They are, however, generally unpredictable and language-specific, so they easily become a problem in non-native text production. This thesis approaches Swedish basic verbs from the contrastive point of view of an advanced Czech learner of Swedish and explores a selection of Swedish constructions with basic verbs.

The observations result in a proposal for the structure of a machine-readable Swedish-Czech lexicon that focuses on basic verbs and their constructions. The lexicon is anchored in the valency theory of the Functional Generative Description, coupled with an analysis of collocations according to the semantically motivated principles of Corpus Pattern Analysis, in order to achieve the level of delicacy needed to make meaning distinctions correctly. The lexicon consists of two interlinked parts: SweVallex, a lexicon of verb frames, and a Predicate Noun Lexicon, which captures predicate nouns (the nominal components of light verb constructions).
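The two-part design can be illustrated with a minimal Python sketch. It is only an assumption about how such interlinking might look in code, not the format proposed in the thesis: the class names, the field names, the identifiers "fatta-1" and "n-beslut", and the slot labels are all hypothetical, and the example pair fatta beslut ("make a decision") is chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VerbFrame:
    """One verb frame, standing in for a SweVallex-style entry (hypothetical layout)."""
    frame_id: str
    lemma: str
    slots: List[str]  # valency slot labels; illustrative only
    noun_ids: List[str] = field(default_factory=list)  # links to predicate nouns


@dataclass
class PredicateNoun:
    """One predicate noun entry (hypothetical layout)."""
    noun_id: str
    lemma: str
    frame_ids: List[str] = field(default_factory=list)  # links back to verb frames


class Lexicon:
    """Holds both parts and keeps the links between them symmetric."""

    def __init__(self) -> None:
        self.frames: Dict[str, VerbFrame] = {}
        self.nouns: Dict[str, PredicateNoun] = {}

    def add_frame(self, frame: VerbFrame) -> None:
        self.frames[frame.frame_id] = frame

    def add_noun(self, noun: PredicateNoun) -> None:
        self.nouns[noun.noun_id] = noun

    def link(self, frame_id: str, noun_id: str) -> None:
        # Record the link on both sides so either part can be the entry point.
        frame = self.frames[frame_id]
        noun = self.nouns[noun_id]
        if noun_id not in frame.noun_ids:
            frame.noun_ids.append(noun_id)
        if frame_id not in noun.frame_ids:
            noun.frame_ids.append(frame_id)

    def verbs_for_noun(self, noun_id: str) -> List[str]:
        # Verb lemmas whose frames are linked to the given predicate noun.
        return [self.frames[f].lemma for f in self.nouns[noun_id].frame_ids]


def build_example() -> Lexicon:
    # Illustrative data only; not taken from the lexicon itself.
    lex = Lexicon()
    lex.add_frame(VerbFrame("fatta-1", "fatta", ["ACT", "CPHR"]))
    lex.add_noun(PredicateNoun("n-beslut", "beslut"))
    lex.link("fatta-1", "n-beslut")
    return lex
```

Storing the link on both entries is one possible design choice: it lets a learner-oriented lookup start either from the verb or from the noun.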

The verb collocates of predicate nouns are sorted according to Mel'čukian Lexical Functions. Features such as telicity, punctuality, and volitionality are described for each light verb construction whenever possible. Special attention is paid to the morphosyntactic behavior of the respective predicate nouns (determiner use and modifier insertion). To facilitate the routine work of building such a lexicon, the 20-million morphosyntactically annotated Swedish corpus PAROLE was lemmatized and loaded into the corpus GUI Bonito, which includes the Word Sketch Engine, a tool for automatic collocation analysis. Word Sketch Definitions for Swedish were created and loaded into the Word Sketch Engine. In addition to the PAROLE corpus, a two-million parallel Swedish-Czech corpus, built within a different project, was used.
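As a rough illustration of the description above, the following Python sketch records a light verb construction together with its Lexical Function and optional features, and groups the verb collocates of one noun by Lexical Function. The record layout, the field names, and the example values are assumptions made for this sketch and are not taken from the lexicon. Features are left as None where no value is asserted, mirroring the "whenever possible" proviso.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LightVerbConstruction:
    """One verb-noun pairing (hypothetical record layout)."""
    noun: str
    verb: str
    lexical_function: str              # e.g. "Oper1" in Mel'čuk's notation
    telic: Optional[bool] = None       # None = not described
    punctual: Optional[bool] = None
    volitional: Optional[bool] = None
    determiner: Optional[str] = None   # determiner use of the predicate noun
    allows_modifier: Optional[bool] = None  # whether a modifier can be inserted


def collocates_by_function(
    entries: List[LightVerbConstruction], noun: str
) -> Dict[str, List[str]]:
    """Group the verb collocates of one predicate noun by Lexical Function."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        if entry.noun == noun:
            grouped[entry.lexical_function].append(entry.verb)
    # Sort the verbs so the output does not depend on input order.
    return {lf: sorted(verbs) for lf, verbs in grouped.items()}


# Illustrative entries only; the feature values are assumptions of this sketch.
EXAMPLES = [
    LightVerbConstruction("beslut", "fatta", "Oper1", volitional=True),
    LightVerbConstruction("beslut", "ta", "Oper1", volitional=True),
]
```

Calling collocates_by_function(EXAMPLES, "beslut") groups both verbs under the single key "Oper1".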