Feb 17

Dev Blog For www.OpenKanjiDic.com started

In which I talk about my ideas on discovering Kanji, start my development blog, and give up a third of the way into my endeavour to finish a design document at 8 am.

So for documenting the process of implementing www.openkanjidic.com, here's the new and shiny development blog. I guess there's actually a lot to be said about my plans for the page and how it should work, or not work.

The reason there's already an online version, with the ability to sign up, log in and browse the kanji database, is that I firmly believe in always having a running version of your project. Sadly, this means that some poor souls will be subjected to the unfinished site.

So to get started somewhere, here's a bit of an overview of what's to be expected under those tabs.

The Kanji Tab

I like Kanji, and I like browsing through them. That's what can be done right now, but it's not really useful. What I'd like to have is an interface similar to jlex.org; however, I think it should be possible to reduce the initial selection from the overwhelming matrix seen on their search to 5 to 7 options at a time.

The information given by the grapheme database (kradfile) certainly makes this possible, though it's one of the more interesting problems to approach. The idea is to reduce the number of choices one has to make at each step, since the human mind has problems handling more than 5 to 7 pieces of information at the same time. On the other hand, the number of steps needed to actually find the kanji should still be as small as possible.

In the end it might turn out to not be a good idea, but finding out mistakes is half the fun.
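
To make the selection idea a bit more concrete, here is a minimal sketch of one way it could work; it is not the site's actual implementation. It assumes a kradfile-like mapping from each kanji to its components. The tiny COMPONENTS table, the function names and the "closest to half" ranking are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical toy data in the spirit of kradfile: kanji -> set of components.
# These few entries are illustrative only, not copied from the real file.
COMPONENTS = {
    "明": {"日", "月"},
    "時": {"日", "土", "寸"},
    "語": {"言", "五", "口"},
    "話": {"言", "舌", "口"},
    "読": {"言", "士", "冖", "儿"},
    "林": {"木"},
    "森": {"木"},
    "村": {"木", "寸"},
}


def suggest_components(candidates, chosen, limit=7):
    """Offer at most `limit` components, preferring those that split the
    remaining candidates as close to half/half as possible."""
    counts = Counter()
    for kanji in candidates:
        for comp in COMPONENTS[kanji] - chosen:
            counts[comp] += 1
    total = len(candidates)
    # A component contained in every candidate would not narrow anything down.
    useful = [c for c, n in counts.items() if n < total]
    # Rank by distance from an even split; ties are broken by code point
    # so the result is deterministic.
    useful.sort(key=lambda c: (abs(counts[c] - total / 2), c))
    return useful[:limit]


def narrow(candidates, component):
    """Keep only the candidates that contain the chosen component."""
    return {k for k in candidates if component in COMPONENTS[k]}


if __name__ == "__main__":
    remaining = set(COMPONENTS)
    chosen = set()
    print(suggest_components(remaining, chosen, limit=5))
    remaining = narrow(remaining, "言")
    chosen.add("言")
    print(sorted(remaining), suggest_components(remaining, chosen, limit=5))
```

A real selector would also need a "none of these" choice for the case where the wanted kanji contains none of the offered components, and whether an even split really minimises the number of steps is exactly the kind of thing that would have to be tried out.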

Similar Approaches

Of course I'm not the first one to think of this. For example, the quite interesting tatoeba.org project offers this search: Hanzis - Kanjis search. It seems to be similar in spirit; let me quote:

This tool allows you to find information about kanjis/hanzis, especially when you don't know how to input them directly with IMEs. The main way to use this is by submitting subglyph of the character.

They are using data compiled on the Commons: Chinese characters decomposition page over at wikimedia.org. I think it will be interesting to see whether this can be improved. (Of course, I do think so.)

Dictionary Cross Reference

The list of dictionary entries available on the page for each kanji is mildly useful at most. The results are already sorted by the complexity of the kanji element of the entry, which at least puts a few of the more important entries directly at the top. However, no one's going to read through those lists at all, and honestly, why would they?

Once you have understood the way certain grammatical constructs work, e.g. that the name of a language is formed from the country name plus the kanji for language, those entries aren't that interesting at all ... so what can be done about that?

  1. Nothing at all. Maybe listing those dictionary entries there is just a nice-to-have feature, and I should invest time in other things first.

  2. Add filters for grammatical constructs, e.g. find all transitive (vt) or intransitive (vi) verbs with this kanji. る and れる come to mind. Combined with complexity filters this might be interesting.
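
As a rough sketch of what the second option could look like, assuming each dictionary entry carries EDICT-style part-of-speech tags such as vt and vi: the Entry class, the sample ENTRIES and the filter_entries function are hypothetical names for illustration, not part of the site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    word: str
    reading: str
    pos: frozenset  # part-of-speech tags, e.g. {"v1", "vt"}


# Hypothetical sample entries, only for illustration.
ENTRIES = [
    Entry("食べる", "たべる", frozenset({"v1", "vt"})),
    Entry("見る", "みる", frozenset({"v1", "vt"})),
    Entry("倒れる", "たおれる", frozenset({"v1", "vi"})),
    Entry("食事", "しょくじ", frozenset({"n", "vs"})),
    Entry("流れる", "ながれる", frozenset({"v1", "vi"})),
]


def filter_entries(entries, pos=None, ending=None):
    """Keep entries that carry the given tag and/or end in the given kana.
    A filter that is None is simply not applied."""
    result = []
    for entry in entries:
        if pos is not None and pos not in entry.pos:
            continue
        if ending is not None and not entry.word.endswith(ending):
            continue
        result.append(entry)
    return result


if __name__ == "__main__":
    for entry in filter_entries(ENTRIES, pos="vi", ending="れる"):
        print(entry.word, entry.reading)
```

A complexity filter could be added as one more optional argument in the same style.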

I guess for now I'll not pour too much thought into this, as there are other tabs to fill, the kanji selector to write ... and so on.