Where to go from here
Yokome can be extended and improved in a multitude of ways. Some ideas:
Extend to more languages
Improve the language model
Try out a character-based language model (might work well with Chinese and Japanese, since characters are semantically very rich in those languages)
Provide personalized example sentences based on an estimate of the user's language proficiency
Provide images alongside the glosses
Improve support for sentences spanning multiple HTML elements as well as rotated text
Performance considerations:
Block off stopwords
Denormalize the dictionary database
Make better use of underlying database optimization techniques (esp. caching)
Precompute tokenized sentences / disambiguation results:
Based on recency (starting from the top of the page)
Based on word frequencies in the corpus
Based on the estimated proficiency of the learner, expressed as a word-frequency range
Based on structural elements (headings, links), text size, color, …
Improve mouse pointer localization using a binary search on elements
Trade disambiguation accuracy for faster processing: Use windowed inputs to the language model instead of a recurrent neural network
User interface:
Add loading indicators
Provide better data on entries:
All headwords
More user-friendly presentation of POS tags
Restrictions and notes for glosses
Make the Yokome infobox’s style independent of the webpage it is displayed on
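The precomputation idea from the performance list could be driven by a priority queue. The following is a minimal sketch, not part of Yokome: `precompute_order` and the `frequency` mapping are hypothetical names, and processing sentences with rare words first is an assumption (the list only says the order could depend on word frequencies). Ties are broken by position on the page, which corresponds to the "recency" criterion.

```python
import heapq
from typing import Dict, Iterator, List, Sequence, Tuple


def precompute_order(
    sentences: Sequence[Sequence[str]],
    frequency: Dict[str, int],
) -> Iterator[int]:
    """Yield sentence indices in the order they should be precomputed.

    Assumption: a sentence containing a rare word is more likely to be
    looked up, so it is scheduled earlier. Unknown words count as
    frequency 0. Ties are broken by position (top of the page first).
    """
    heap: List[Tuple[int, int]] = []
    for position, words in enumerate(sentences):
        rarest = min((frequency.get(word, 0) for word in words), default=0)
        heap.append((rarest, position))
    heapq.heapify(heap)
    # Popping lazily lets the caller stop as soon as the user interacts.
    while heap:
        yield heapq.heappop(heap)[1]
```

Because the function is a generator, a background task could pull one index at a time and tokenize that sentence, stopping whenever a real lookup needs to be served.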
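The binary search mentioned for mouse pointer localization could look roughly like the sketch below. It is written in Python for illustration only; `Box` and `LineIndex` are hypothetical names, and the sketch assumes that the candidate elements are sorted by their top edge and do not overlap vertically, which real page layouts do not guarantee.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional, Sequence


class Box(NamedTuple):
    """Hypothetical bounding box of one text element, in page coordinates."""
    top: float
    bottom: float
    label: str


class LineIndex:
    """Locate the element under a vertical pointer position in O(log n)."""

    def __init__(self, boxes: Sequence[Box]) -> None:
        # Assumption: boxes are sorted by `top` and do not overlap.
        self._boxes: List[Box] = list(boxes)
        self._tops: List[float] = [box.top for box in self._boxes]

    def locate(self, y: float) -> Optional[Box]:
        # Index of the last box whose top edge is at or above y.
        index = bisect_right(self._tops, y) - 1
        if index >= 0 and y < self._boxes[index].bottom:
            return self._boxes[index]
        return None  # Pointer is above, below, or between elements.
```

The same search could be repeated on the horizontal axis within the located element to narrow the position down to a single character.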
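For the windowed-input idea, each token would be disambiguated from a fixed number of neighbors instead of the whole preceding sequence. The sketch below only shows how such windows could be built; `context_windows`, `radius`, and the `PAD` symbol are hypothetical and not taken from Yokome's language model.

```python
from typing import List, Sequence

PAD = "<pad>"  # Hypothetical padding symbol for positions outside the sentence.


def context_windows(tokens: Sequence[str], radius: int) -> List[List[str]]:
    """Return one window of 2 * radius + 1 tokens centered on each token."""
    padded = [PAD] * radius + list(tokens) + [PAD] * radius
    return [padded[i:i + 2 * radius + 1] for i in range(len(tokens))]
```

Since every window has the same length and does not depend on the others, a model taking such inputs can process all tokens of a sentence in one batch. The price is that context outside the window is lost.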