The focus of this week's podcast Številke is on the Sinonimni slovar slovenskega jezika, the thesaurus of the Slovenian language, published recently by the Scientific Research Centre of the Slovenian Academy of Sciences and Arts. We interviewed Jerica Snoj, the co-author.
The Slovenian language corpus consists of two basic parts, Slovar slovenskega knjižnega jezika (the SSKJ dictionary of the Slovenian language) and Pravopis (the Slovenian Orthography). Jerica Snoj is convinced that Sinonimni slovar slovenskega jezika, the thesaurus of the Slovenian language, is an integral part of that corpus. Linguists have long been yearning for such a book, but the concrete idea emerged in 2001, when Slovar slovenskega pravopisa (the dictionary of Slovenian Orthography) was being made: "At that time those authors came together who agreed on the lexicography we considered productive, on which manual would be needed, and believed in our abilities to complete it without insurmountable problems," the lexicographer recalls. In 2002 the thesaurus was officially confirmed as the Institute's project, but due to some logistical problems work began only in 2013. Three years later the book, which weighs more than 3 kilograms, was completed.
We all remember from our school days that synonyms are words with similar meanings, but Jerica Snoj gave a more precise definition: "When speaking of a common meaning, the commonness is limited to the fact that all the synonyms can refer to the same non-linguistic concept, but from the linguistic point of view the most interesting thing is the difference between them. Let's take a simple example: medved (bear) - kosmatinec (shaggy, hairy creature). The words are connected by a non-linguistic object, the animal the words refer to."
The dictionary contains 74,509 entries, including all the synonyms, components of phrase words, and synonym strings. Not all words have synonyms. Snoj explains that "some 5% of the words included in the SSKJ have no established synonyms in dictionary terms. Of course, there are more synonyms than the number included in the thesaurus, as the spoken or dialectal use of language is not included."
The word 'zelo' (very) has the most synonyms
The words without synonyms, or with only a few, are words "denominating concepts which are defined relatively sharply and clearly, mostly objects – e.g. chemical elements...." By contrast, synonyms are most plentiful among "the words referring to concepts denominating properties or actions, or states, where the person using the word wants to express his or her attitude towards the denominated concept," Snoj explained, adding that the adverb zelo (very) is the word with the largest number of synonyms.
The thesaurus of the Slovenian language will first be published as a book, and a digital version is also planned.