yokome.data.jpn.dictionary_to_rdbms

Import script to transfer entries from a JMdict XML file to an SQLite database.

yokome.data.jpn.dictionary_to_rdbms.DIALECT = {'Hokkaido-ben': 'hob', 'Kansai-ben': 'ksb', 'Kantou-ben': 'ktb', 'Kyoto-ben': 'kyb', 'Kyuushuu-ben': 'kyu', 'Nagano-ben': 'nab', 'Osaka-ben': 'osb', 'Ryuukyuu-ben': 'rkb', 'Tosa-ben': 'tsb', 'Touhoku-ben': 'thb', 'Tsugaru-ben': 'tsug'}

Mapping from JMdict dialect entities to dialect codes.
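
The keys are the expanded entity texts rather than the entity names, because an XML parser replaces a reference such as `&ksb;` with its declared text before the application sees it. The following is a minimal sketch of that lookup, not the import script's actual code: the `DIALECT` dictionary is an excerpt of the mapping above, and the miniature `XML` document is a hypothetical stand-in for a JMdict entry.

```python
import xml.etree.ElementTree as ET

# Excerpt of the DIALECT mapping documented above.
DIALECT = {'Kansai-ben': 'ksb', 'Osaka-ben': 'osb'}

# Hypothetical miniature document; it declares one entity in its internal
# DTD subset and references it inside a <dial/> tag.
XML = """<!DOCTYPE entry [
<!ENTITY ksb "Kansai-ben">
]>
<entry><sense><dial>&ksb;</dial><gloss>thank you</gloss></sense></entry>"""

root = ET.fromstring(XML)

# The parser has already expanded &ksb; to 'Kansai-ben', so the expanded
# text is what gets mapped back to a short dialect code.
dialect_codes = [DIALECT[dial.text] for dial in root.iter('dial')]
print(dialect_codes)
```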

yokome.data.jpn.dictionary_to_rdbms.DOMAINS = {'Buddhist term': 'Buddh.', 'Shinto term': 'Shinto', 'anatomical term': 'anat.', 'architecture term': 'archit.', 'astronomy, etc. term': 'astron.', 'baseball term': 'baseb.', 'biology term': 'biol.', 'botany term': 'bot.', 'business term': 'bus.', 'chemistry term': 'chem.', 'computer terminology': 'comp.', 'economics term': 'econ.', 'engineering term': 'engin.', 'finance term': 'fin.', 'food term': 'food', 'geology, etc. term': 'geol.', 'geometry term': 'geom.', 'law, etc. term': 'law', 'linguistics terminology': 'ling.', 'mahjong term': 'mahj.', 'martial arts term': 'MA', 'mathematics': 'math.', 'medicine, etc. term': 'med.', 'military': 'mil.', 'music term': 'music', 'physics terminology': 'phys.', 'shogi term': 'shogi', 'sports term': 'sports', 'sumo term': 'sumo', 'zoology term': 'zool.'}

Mapping from JMdict domain entities to domain abbreviations.

yokome.data.jpn.dictionary_to_rdbms.GLOSS_TYPES = {'expl': 'i.e.', 'fig': 'fig.', 'lit': 'lit.'}

Mapping from JMdict gloss types to more readable representations.

yokome.data.jpn.dictionary_to_rdbms.POS = {'Godan verb - -aru special class': 'verb:quintigrade::ra column:-aru special class:', 'Godan verb - Iku/Yuku special class': 'verb:quintigrade::ka column:iku/yuku special class:', 'Godan verb - Uru old class verb (old form of Eru)': 'verb:quintigrade::ra column:uru special class:', "Godan verb with `bu' ending": 'verb:quintigrade::ba column:regular:', "Godan verb with `gu' ending": 'verb:quintigrade::ga column:regular:', "Godan verb with `ku' ending": 'verb:quintigrade::ka column:regular:', "Godan verb with `mu' ending": 'verb:quintigrade::ma column:regular:', "Godan verb with `nu' ending": 'verb:quintigrade::na column:regular:', "Godan verb with `ru' ending": 'verb:quintigrade::ra column:regular:', "Godan verb with `ru' ending (irregular verb)": 'verb:quintigrade::ra column:-ru irregular:', "Godan verb with `su' ending": 'verb:quintigrade::sa column:regular:', "Godan verb with `tsu' ending": 'verb:quintigrade::ta column:regular:', "Godan verb with `u' ending": 'verb:quintigrade::a column:regular:', "Godan verb with `u' ending (special class)": "verb:quintigrade::a column:-'u special class:", 'Ichidan verb': 'verb:monograde::ra column:regular:', 'Ichidan verb - kureru special class': 'verb:monograde::ra column:kureru special class:', 'Ichidan verb - zuru verb (alternative form of -jiru verbs)': 'verb:monograde:::-zuru ending:', 'Kuru verb - special class': 'verb:k-irregular::::', "Nidan verb (lower class) with `bu' ending (archaic)": 'verb:bigrade:lower class:ba column:regular:', "Nidan verb (lower class) with `dzu' ending (archaic)": 'verb:bigrade:lower class:da column:regular:', "Nidan verb (lower class) with `gu' ending (archaic)": 'verb:bigrade:lower class:ga column:regular:', "Nidan verb (lower class) with `hu/fu' ending (archaic)": 'verb:bigrade:lower class:ha column:regular:', "Nidan verb (lower class) with `ku' ending (archaic)": 'verb:bigrade:lower class:ka column:regular:', "Nidan verb (lower class) with `mu' ending (archaic)": 'verb:bigrade:lower class:ma column:regular:', "Nidan verb (lower class) with `nu' ending (archaic)": 'verb:bigrade:lower class:na column:regular:', "Nidan verb (lower class) with `ru' ending (archaic)": 'verb:bigrade:lower class:ra column:regular:', "Nidan verb (lower class) with `su' ending (archaic)": 'verb:bigrade:lower class:sa column:regular:', "Nidan verb (lower class) with `tsu' ending (archaic)": 'verb:bigrade:lower class:ta column:regular:', "Nidan verb (lower class) with `u' ending and `we' conjugation (archaic)": 'verb:bigrade:lower class:wa column:regular:', "Nidan verb (lower class) with `yu' ending (archaic)": 'verb:bigrade:lower class:ya column:regular:', "Nidan verb (lower class) with `zu' ending (archaic)": 'verb:bigrade:lower class:za column:regular:', "Nidan verb (upper class) with `bu' ending (archaic)": 'verb:bigrade:upper class:ba column:regular:', "Nidan verb (upper class) with `dzu' ending (archaic)": 'verb:bigrade:upper class:da column:regular:', "Nidan verb (upper class) with `gu' ending (archaic)": 'verb:bigrade:upper class:ga column:regular:', "Nidan verb (upper class) with `hu/fu' ending (archaic)": 'verb:bigrade:upper class:ha column:regular:', "Nidan verb (upper class) with `ku' ending (archaic)": 'verb:bigrade:upper class:ka column:regular:', "Nidan verb (upper class) with `mu' ending (archaic)": 'verb:bigrade:upper class:na column:regular:', "Nidan verb (upper class) with `ru' ending (archaic)": 'verb:bigrade:upper class:ra column:regular:', "Nidan verb (upper class) with `tsu' ending (archaic)": 'verb:bigrade:upper class:ta column:regular:', "Nidan verb (upper class) with `yu' ending (archaic)": 'verb:bigrade:upper class:ya column:regular:', "Nidan verb with 'u' ending (archaic)": 'verb:bigrade:lower class:a column:regular:', "Yodan verb with `bu' ending (archaic)": 'verb:quadrigrade::ba column:regular:', "Yodan verb with `gu' ending (archaic)": 'verb:quadrigrade::ga column:regular:', "Yodan verb with `hu/fu' ending (archaic)": 'verb:quadrigrade::ha column:regular:', "Yodan verb with `ku' ending (archaic)": 'verb:quadrigrade::ka column:regular:', "Yodan verb with `mu' ending (archaic)": 'verb:quadrigrade::ma column:regular:', "Yodan verb with `nu' ending (archaic)": 'verb:quadrigrade::na column:regular:', "Yodan verb with `ru' ending (archaic)": 'verb:quadrigrade::ra column:regular:', "Yodan verb with `su' ending (archaic)": 'verb:quadrigrade::sa column:regular:', "Yodan verb with `tsu' ending (archaic)": 'verb:quadrigrade::ta column:regular:', "`kari' adjective (archaic)": 'kari-adjective', "`ku' adjective (archaic)": 'ku-adjective', "`shiku' adjective (archaic)": 'shiku-adjective', "`taru' adjective": 'taru-adjective', 'adjectival nouns or quasi-adjectives (keiyodoshi)': 'na-adjective', 'adjective (keiyoushi)': 'i-adjective', 'adjective (keiyoushi) - yoi/ii class': 'yoi/ii class', 'adverb (fukushi)': 'adverb', "adverb taking the `to' particle": 'quotable', 'adverbial noun (fukushitekimeishi)': 'adverb:adverbial noun;noun:adverbial noun', 'archaic/formal form of na-adjective': 'nari-adjective', 'auxiliary': 'auxiliary', 'auxiliary adjective': 'auxiliary;adjective', 'auxiliary verb': 'auxiliary;verb', 'conjunction': 'conjunction', 'copula': 'copula', 'counter': 'suffix:counter', 'expressions (phrases, clauses, etc.)': 'multiword', 'interjection (kandoushi)': 'interjection', 'intransitive verb': 'intransitive', 'irregular nu verb': 'verb:n-irregular::::', 'irregular ru verb, plain form ends with -ri': 'verb:r-irregular::::', 'irregular verb': 'verb:irregular::::', 'noun (common) (futsuumeishi)': 'noun', 'noun (temporal) (jisoumeishi)': 'adverb:temporal noun;noun:temporal noun', 'noun or participle which takes the aux. verb suru': 'suru verb', 'noun or verb acting prenominally': 'prenominal', 'noun, used as a prefix': 'prefix;noun', 'noun, used as a suffix': 'suffix;noun', "nouns which may take the genitive case particle `no'": 'no-adjective', 'numeric': 'numeral', 'particle': 'particle', 'pre-noun adjectival (rentaishi)': 'pre-noun adjectival', 'prefix': 'prefix', 'pronoun': 'pronoun', 'proper noun': 'proper noun', 'su verb - precursor to the modern suru': 'verb:s-irregular:::su class:', 'suffix': 'suffix', 'suru verb - included': 'verb:s-irregular:::suru class:suru ending', 'suru verb - special class': 'verb:s-irregular:::suru class:-suru ending', 'transitive verb': 'transitive', 'unclassified': '', 'verb unspecified': 'verb:::::'}

Mapping from JMdict POS entities to POS tags.
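
The syntax of the POS tag values is not spelled out here. Judging from the values alone, `;` appears to separate alternative tags and `:` appears to separate the fields of one tag, with empty fields left blank. Under that assumption, a value could be decomposed roughly as follows; `parse_pos` is a hypothetical helper, not part of the module.

```python
def parse_pos(value):
    """Split a POS value into tags, each a list of colon-separated fields.

    Assumes ';' separates tags and ':' separates fields within a tag,
    which is inferred from the mapping values, not documented behavior.
    """
    if not value:
        # 'unclassified' maps to the empty string, i.e. no tag at all.
        return []
    return [tag.split(':') for tag in value.split(';')]


# A verb tag carries six fields, some of them empty.
verb = parse_pos('verb:quintigrade::ra column:regular:')

# An adverbial noun is tagged both as an adverb and as a noun.
adverbial = parse_pos('adverb:adverbial noun;noun:adverbial noun')
```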

yokome.data.jpn.dictionary_to_rdbms.UK = 'word usually written using kana alone'

JMdict entity marking a word that is usually written using kana only.

Triggers the insertion of a kana-only row into the lemma table of the resulting database. For JMdict entries that do not carry this entity in a <misc/> tag, a kana-only row is inserted only if the entry has no <k_ele/> tag.
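
The rule can be restated as a small predicate. This is only a sketch of the condition as described, not the script's actual implementation; the function name `needs_kana_only_row` and its parameters are hypothetical.

```python
UK = 'word usually written using kana alone'


def needs_kana_only_row(misc_entities, has_k_ele):
    """Decide whether an entry gets a kana-only row in the lemma table.

    misc_entities: expanded texts of the entry's <misc/> tags (hypothetical input).
    has_k_ele: whether the entry contains at least one <k_ele/> tag.
    """
    # Either the entry is marked as usually written in kana,
    # or it has no kanji element at all.
    return UK in misc_entities or not has_k_ele
```

For example, `needs_kana_only_row([UK], True)` and `needs_kana_only_row([], False)` are both true, while an entry with a <k_ele/> tag and no UK marker yields false.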

yokome.data.jpn.dictionary_to_rdbms.USAGE = {'abbreviation': ('POS', 'abbr.'), 'archaism': ('frequency', 'archaic'), "children's language": ('speaker', 'childish'), 'colloquialism': ('meaning', 'coll.'), 'derogatory': ('relationship', 'derog.'), 'exclusively kana': ('spelling', 'exclusively kana'), 'exclusively kanji': ('spelling', 'exclusively kanji'), 'familiar language': ('relationship', 'fam.'), 'female term or language': ('speaker', 'f. language'), 'honorific or respectful (sonkeigo) language': ('relationship', 'hon.'), 'humble (kenjougo) language': ('relationship', 'hum.'), 'idiomatic expression': ('meaning', 'idiom'), 'jocular, humorous term': ('meaning', 'joc.'), 'male slang': ('speaker', 'm. slang'), 'male term or language': ('speaker', 'm. language'), 'manga slang': ('speaker', 'manga slang'), 'obscure term': ('accuracy', 'obscure'), 'obsolete term': ('frequency', 'obsolete'), 'onomatopoeic or mimetic word': ('POS', 'on./mim.'), 'poetical term': ('meaning', 'poet.'), 'polite (teineigo) language': ('relationship', 'pol.'), 'proverb': ('POS', 'proverb'), 'quotation': ('POS', 'quote'), 'rare': ('frequency', 'rare'), 'rude or X-rated term (not displayed in educational software)': ('meaning', 'rude/X-rated'), 'sensitive': ('meaning', 'sensitive'), 'slang': ('speaker', 'slang'), 'vulgar expression or word': ('meaning', 'vulg.'), 'word usually written using kana alone': ('spelling', 'usu. wr. in kana'), 'word usually written using kanji alone': ('spelling', 'usu. wr. in kanji'), 'yojijukugo': ('spelling', 'yojijukugo')}

Mapping from JMdict usage entities to usage types and short descriptions.
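
Each value is a pair of usage type and short description, so the entities attached to a sense can be grouped by type. The sketch below illustrates this with an excerpt of the mapping; `group_usage` is a hypothetical helper, and how the import script itself consumes the pairs is not shown here.

```python
from collections import defaultdict

# Excerpt of the USAGE mapping documented above.
USAGE = {
    'archaism': ('frequency', 'archaic'),
    'rare': ('frequency', 'rare'),
    'slang': ('speaker', 'slang'),
    'word usually written using kana alone': ('spelling', 'usu. wr. in kana'),
}


def group_usage(entities):
    """Group the short descriptions of JMdict usage entities by usage type."""
    grouped = defaultdict(list)
    for entity in entities:
        usage_type, description = USAGE[entity]
        grouped[usage_type].append(description)
    return dict(grouped)


grouped = group_usage(['archaism', 'slang', 'rare'])
```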

yokome.data.jpn.dictionary_to_rdbms.WRITING = {'ateji (phonetic) reading': 'ateji', 'gikun (meaning as reading) or jukujikun (special kanji reading)': 'gikun/jukujikun', 'irregular okurigana usage': 'irregular okurigana usage', 'old or irregular kana form': 'old/irregular kana form', 'out-dated or obsolete kana usage': 'outdated/obsolete kana usage', 'word containing irregular kana usage': 'irregular kana usage', 'word containing irregular kanji usage': 'irregular kanji usage', 'word containing out-dated kanji': 'outdated kanji'}

Mapping from JMdict writing style entities to short descriptions.