Theoretical framework and computational basis

The model of verb valency used in e-Glava is based on the results of German valency research and their lexicographic application in valency dictionaries, especially the German valency dictionary VALBU (Schumacher, H. et al. 2004) and its online version, E-VALBU. The basic assumption of VALBU’s approach is that valency complements are identified at the level of sublemmas, or meanings, rather than at the level of the verb or lemma.

Hence, valency is understood as the syntactic and semantic description of obligatory and optional complements at the level of a verb’s sense. However, unlike the German dictionary VALBU, e-Glava introduces a morphological layer of analysis (as a realization of syntax) and a classification of verbs into semantic classes. Approximately 900 of the most frequent Croatian verbs were divided into 34 semantic classes and 91 subclasses. The semantic classes are motivated by Beth Levin’s model (Levin, B. 1993: English Verb Classes and Alternations), but we deviate from her model to a great extent, as described in detail in the paper by Ivana Brač and Tomislava Bošnjak Botica (Brač, I. & Bošnjak Botica, T. 2015: Semantička razdioba glagola u Bazi hrvatskih glagolskih valencija. Fluminensia: časopis za filološka istraživanja, 27 (1), 105).

As part of the preparation, the team members had to decide whether to develop their own customized CMS (Content Management System) or to use an existing lexicographic package. For practical reasons, we began developing a three-level linguistic schema for a valency dictionary in TshwaneLex, which we considered the computerisation phase of our lexicographic process. Accordingly, we began writing new lexicographic entries for 57 psychological verbs in the prepared TshwaneLex schema. The IT department attempted to make the process of writing dictionary entries as precise and user-friendly as possible for the researchers and lexicographers, mostly by implementing drop-down menus and controlled multiple choices for all linguistic features.
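The idea of a sense-level entry with controlled choices can be illustrated with a minimal sketch. It assumes that the three levels are the syntactic, morphological, and semantic layers mentioned above; all class names, field names, and labels are hypothetical and do not reproduce the actual TshwaneLex schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Controlled choice, comparable to a drop-down menu (hypothetical values)."""
    OBLIGATORY = "obligatory"
    OPTIONAL = "optional"


@dataclass
class Complement:
    syntactic_function: str  # syntactic level, e.g. "subject"
    morphological_form: str  # morphological level, e.g. "nominative"
    semantic_role: str       # semantic level, e.g. "experiencer"
    status: Status


@dataclass
class Sense:
    definition: str
    semantic_class: str
    complements: list[Complement] = field(default_factory=list)


@dataclass
class Entry:
    lemma: str
    # Complements are attached to senses, not to the lemma as a whole.
    senses: list[Sense] = field(default_factory=list)


def obligatory_forms(sense: Sense) -> list[str]:
    """Return the morphological forms of the obligatory complements of a sense."""
    return [c.morphological_form for c in sense.complements
            if c.status is Status.OBLIGATORY]


# Illustrative entry only; the labels are assumptions, not e-Glava data.
voljeti = Entry("voljeti", [
    Sense("to love", "psychological", [
        Complement("subject", "nominative", "experiencer", Status("obligatory")),
        Complement("object", "accusative", "stimulus", Status("obligatory")),
    ]),
])
```

Because the status is an enumeration, a value outside the controlled list (for example Status("sometimes")) raises a ValueError, which is roughly the guarantee a drop-down menu gives the lexicographer.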

Although the dictionary grammar was developed using the DTD (Document Type Definition) editing module of TshwaneLex and an ODBC connection, and the DTD was automatically transcribed into a PostgreSQL database environment, the project team still had to make some adjustments before the data could be presented on an internet platform. The first version of the operating work interface was completed by 2014, and today's form is a stable working version that we will probably improve further, though only at the level of individual fields rather than the linguistic structure.

We decided to export the native XML file for all verbs within the semantic class that were marked “completed” to an easily accessible SQL database. This process made the part of the dictionary we consider completed automatically browsable through a web-based search engine built with PHP and HTML5. Researchers can thus make verbs currently being described available by exporting an updated XML file, which then goes “live” on the website. In addition to this first version, which is browsable by lemma, an advanced search function is being developed that will enable users to search by specific categories, such as valency complements, morphological forms, or semantic features.
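The export step described above can be sketched as follows. This is only an illustration under stated assumptions: the XML element and attribute names are invented, Python stands in for the PHP used on the website, and an in-memory SQLite database stands in for the project's SQL database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical export format; these names are not the actual e-Glava schema.
SAMPLE_XML = """
<dictionary>
  <entry lemma="voljeti" status="completed" semantic_class="psychological"/>
  <entry lemma="mrziti" status="in_progress" semantic_class="psychological"/>
</dictionary>
"""


def load_completed(xml_text: str, conn: sqlite3.Connection) -> int:
    """Copy entries marked 'completed' into the SQL table and return their count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries "
        "(lemma TEXT PRIMARY KEY, semantic_class TEXT)"
    )
    root = ET.fromstring(xml_text.strip())
    count = 0
    for entry in root.iter("entry"):
        if entry.get("status") != "completed":
            continue  # entries still being described are not published
        # INSERT OR REPLACE lets a fresh export update an existing lemma.
        conn.execute(
            "INSERT OR REPLACE INTO entries VALUES (?, ?)",
            (entry.get("lemma"), entry.get("semantic_class")),
        )
        count += 1
    conn.commit()
    return count


def find_by_lemma(conn: sqlite3.Connection, lemma: str):
    """Look up one entry by lemma; returns a (lemma, semantic_class) row or None."""
    return conn.execute(
        "SELECT lemma, semantic_class FROM entries WHERE lemma = ?", (lemma,)
    ).fetchone()


conn = sqlite3.connect(":memory:")
load_completed(SAMPLE_XML, conn)
```

Running the loader again with an updated XML file replaces existing rows instead of duplicating them, which mirrors the way a new export goes live. A search by category would add further columns or tables to the same database.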