yokome.features.dictionary

yokome.features.dictionary.GLOSS_SEPARATOR = '▪'

A character that separates different glosses for the same sense.

Asserted not to occur in the text of any gloss.
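As a minimal sketch of how such a separator can be used, the following joins several glosses into one string and splits them again. The helpers `join_glosses` and `split_glosses` are hypothetical names chosen for illustration and are not part of the module; only the separator value is taken from the documentation above. The source does not say how the library itself stores or joins glosses.

```python
# Separator value as documented for yokome.features.dictionary.GLOSS_SEPARATOR.
GLOSS_SEPARATOR = '▪'


def join_glosses(glosses):
    """Join glosses into one string (hypothetical helper, not library API)."""
    for gloss in glosses:
        # The separator must not occur inside a gloss, or splitting
        # would no longer recover the original glosses.
        if GLOSS_SEPARATOR in gloss:
            raise ValueError('gloss contains the separator: %r' % gloss)
    return GLOSS_SEPARATOR.join(glosses)


def split_glosses(joined):
    """Recover the individual glosses (hypothetical helper, not library API)."""
    return joined.split(GLOSS_SEPARATOR)


glosses = ['to eat', 'to live on (e.g. a salary)']
joined = join_glosses(glosses)
```

The round trip only works because the separator never occurs inside a gloss, which is why the assertion above matters.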

class yokome.features.dictionary.Lexeme(conn, language_code, entry_id, restrictions)

A lexeme (i.e. an entry) in the dictionary.

An entry in this context means a base meaning that may be denoted by any element of a set of highly similar pairs of graphic and phonetic variants. The base meaning may be further refined to one of several connotations of this lexeme; see Sense.

The same lexeme may appear in different grammatical positions, and different connotations of the same lexeme might be restricted to different grammatical usages; see Role.

Furthermore, there might be restrictions as to which graphic and phonetic variants may appear together, as well as which of those variants may appear with which connotations.

On construction, all relevant data is loaded from the database.

Parameters
  • conn – The database connection for the dictionary.

  • language_code (str) – ISO 639-3 language code of the language of interest.

  • entry_id (int) – The ID of the dictionary entry.

  • restrictions (dict) – A dictionary describing the restrictions imposed on the possible structural ways in which the POS tags may interrelate. Necessary in order to provide POS tag trees.

static lookup(conn, language_code, graphic, phonetic, restrictions)

Look up all lexemes that may be represented by the specified combination of a graphic and a phonetic variant.

Parameters
  • conn – The database connection for the dictionary.

  • language_code (str) – ISO 639-3 language code of the language of interest.

  • graphic (str) – The graphic variant.

  • phonetic (str) – The phonetic variant.

  • restrictions (dict) – A dictionary describing the restrictions imposed on the possible structural ways in which the POS tags may interrelate. Necessary in order to provide POS tag trees.

Returns

A tuple of lexemes that contain the specified combination of a graphic variant and a phonetic variant in their list of headwords.
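The matching rule described here can be illustrated with a toy in-memory model. Everything in this sketch is an assumption made for illustration: `ToyLexeme`, `toy_lookup`, and the entry IDs are hypothetical, and the real method queries the dictionary database through `conn` rather than scanning a tuple.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyLexeme:
    """Hypothetical stand-in for a lexeme: an ID plus its headwords."""
    entry_id: int
    # Each headword is a (graphic, phonetic) pair.
    headwords: Tuple[Tuple[str, str], ...]


def toy_lookup(lexemes, graphic, phonetic):
    """Return a tuple of all lexemes listing the given headword pair."""
    return tuple(lexeme for lexeme in lexemes
                 if (graphic, phonetic) in lexeme.headwords)


# Sample data only; the entry IDs are made up.
LEXEMES = (
    ToyLexeme(1, (('橋', 'はし'),)),
    ToyLexeme(2, (('箸', 'はし'),)),
    ToyLexeme(3, (('端', 'はし'), ('端', 'はた'))),
)

found = toy_lookup(LEXEMES, '端', 'はた')
```

A lexeme matches only if the graphic and phonetic variants occur together as one headword; a pair such as ('橋', 'はた') yields an empty tuple in this sketch even though both variants occur separately in the data.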

class yokome.features.dictionary.Role(conn, language_code, entry_id, pos_list_id, sense_ids, restrictions)

A role in the dictionary.

A role in this context means a collection of connotations of a lexeme that have the same grammatical functions in text.

In addition to the connotations, a role has a part-of-speech (POS) list. POS tags in this list may have mutually hierarchical, nonconflicting, and even exclusive relations.

A dictionary entry may contain two roles A and B with the same POS list. This happens when the entry’s connotations are sorted by frequency of use and a third role C, with a different POS list, has connotations that are less frequent than those of A but more frequent than those of B.
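The situation with roles A, B and C can be sketched by grouping a frequency-ordered list of connotations into runs that share a POS list. This is an illustration under assumptions, not the library's implementation: the `(sense_id, pos_list_id)` tuples, the string POS list IDs, and the function `group_into_roles` are all hypothetical.

```python
from itertools import groupby

# Hypothetical connotations of one entry as (sense_id, pos_list_id),
# already sorted from most to least frequently used.
senses = [(1, 'noun'), (2, 'noun'), (3, 'verb'), (4, 'noun')]


def group_into_roles(senses):
    """Group consecutive senses that share a POS list into one role each."""
    return [
        (pos_list_id, [sense_id for sense_id, _ in run])
        # groupby only merges *adjacent* items with equal keys, so the
        # frequency order of the senses is preserved.
        for pos_list_id, run in groupby(senses, key=lambda sense: sense[1])
    ]


roles = group_into_roles(senses)
```

Here the first and last groups play the parts of A and B: they share the POS list 'noun' but stay separate because the 'verb' group, playing the part of C, lies between them in frequency order.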

On construction, all relevant data is loaded from the database.

Parameters
  • conn – The database connection for the dictionary.

  • language_code (str) – ISO 639-3 language code of the language of interest.

  • entry_id (int) – The ID of the dictionary entry to which this role belongs.

  • pos_list_id (int) – The ID of the list of POS tags for this role.

  • sense_ids – An iterable of integer IDs of the connotations of this role.

  • restrictions (dict) – A dictionary describing the restrictions imposed on the possible structural ways in which the POS tags may interrelate. Necessary in order to provide POS tag trees.

normalized_pos_tags()

Translate the list of POS tags as used in the dictionary to a list of POS tags in the representation used internally.

Returns

The list of POS tags associated with this role, in their internal representation.

pos_tree() → yokome.features.tree.TemplateTree

From the POS tags of this role, build a tree structure.

The restrictions of this role are used on tree creation.

Returns

A template tree that represents the list of POS tags associated with this role in a hierarchical fashion.

class yokome.features.dictionary.Sense(conn, language_code, entry_id, sense_id)

A connotation in the dictionary.

A connotation in this context means an abstract word meaning that is limited to a specific lexeme. Multiple lexemes may appear in text conveying the same meaning, and multiple meanings may be denoted by the same lexeme, but each combination of lexeme and sense is a unique connotation.

A connotation may be described by multiple glosses, each of which can be a direct translation, a description or similar.

On construction, all relevant data is loaded from the database.

Parameters
  • conn – The database connection for the dictionary.

  • language_code (str) – ISO 639-3 language code of the language of interest.

  • entry_id (int) – The ID of the dictionary entry to which this connotation belongs.

  • sense_id (int) – The ID of this connotation w.r.t. the entry with ID entry_id.

yokome.features.dictionary.circled_number(number, bold_circle=True)

Provide a Unicode representation of the specified number.

Parameters
  • number (int) – The positive number to convert to a string.

  • bold_circle (bool) – If True, return a white number on a black circle; return a black number on a white circle otherwise.

Returns

A string containing the specified number enclosed in a circle. For integers that have no such representation in Unicode, the number is returned enclosed in parentheses instead.
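A minimal sketch of how such a conversion might work is shown below. The function name `circled_number_sketch` is hypothetical, and the supported ranges are an assumption based on the circled-number code points available in Unicode; the actual function may cover a different range.

```python
def circled_number_sketch(number, bold_circle=True):
    """Return `number` as a circled Unicode character, else '(number)'.

    Hypothetical re-implementation for illustration only.
    """
    if bold_circle:
        # White number on a black circle.
        if 1 <= number <= 10:
            return chr(0x2776 + number - 1)   # U+2776..U+277F
        if 11 <= number <= 20:
            return chr(0x24EB + number - 11)  # U+24EB..U+24F4
    else:
        # Black number on a white circle.
        if 1 <= number <= 20:
            return chr(0x2460 + number - 1)   # U+2460..U+2473
        if 21 <= number <= 35:
            return chr(0x3251 + number - 21)  # U+3251..U+325F
        if 36 <= number <= 50:
            return chr(0x32B1 + number - 36)  # U+32B1..U+32BF
    # No circled form exists in the ranges above: fall back to parentheses.
    return '(%d)' % number
```

In this sketch the parenthesized fallback applies from 21 upward for the bold variant and from 51 upward for the plain variant, because Unicode offers fewer negative circled numbers than plain ones.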