Dates and stuff

I’ve been reading the W3C’s internationalization (i18n) page, since Beyond will be a global app almost from the get-go. It’s really important that it support not only other languages, but also the cultural aspects that come with them — date and time formats and script direction, for example.

As far as dates go, don’t forget that genealogy often deals with incomplete dates: just a year (“1775”), or just a month (“March”), or ranges (“before 1933”, “about 1854”). This makes date storage just a wee bit tricky, since the standard SQL DATE type wants a full year, month, and day. Storing the date as a string takes care of the flexibility, but searching by date then becomes harder if you include ranges on the search form (1816 +/- 5 years). I’m still not sure how this’ll work. (Gee, I seem to be saying that a lot lately, don’t I? :))
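One direction that might work, as a rough sketch only: keep each date as a fixed-width digit string plus a qualifier, expand it to an earliest/latest pair, and treat a search like “1816 +/- 5 years” as an overlap test between two intervals. Everything named here (`PartialDate`, `bounds`, `interval`, `matches`, and the two-year slack for “about”) is made up for illustration, and a month with no year (“March”) isn’t handled at all.

```python
import calendar
from dataclasses import dataclass

# Assumption: how many years of slack an "about" date gets. Purely illustrative.
ABOUT_SLACK_YEARS = 2


@dataclass(frozen=True)
class PartialDate:
    key: str                  # "YYYY", "YYYYMM", or "YYYYMMDD"
    qualifier: str = "exact"  # "exact", "about", "before", or "after"


def bounds(key: str) -> tuple[str, str]:
    """Expand a partial key into (earliest, latest) full YYYYMMDD keys."""
    if not key.isdigit() or len(key) not in (4, 6, 8):
        raise ValueError(f"expected YYYY, YYYYMM, or YYYYMMDD, got {key!r}")
    if len(key) == 4:
        return key + "0101", key + "1231"
    if len(key) == 6:
        last_day = calendar.monthrange(int(key[:4]), int(key[4:6]))[1]
        return key + "01", f"{key}{last_day:02d}"
    return key, key


def shift_year(full_key: str, delta: int) -> str:
    """Move a YYYYMMDD key by whole years. The result is only used as a
    comparison bound, so it does not have to be a real calendar date."""
    year = min(9999, max(0, int(full_key[:4]) + delta))
    return f"{year:04d}{full_key[4:]}"


def interval(d: PartialDate) -> tuple[str, str]:
    """Earliest and latest day the record could refer to (inclusive)."""
    lo, hi = bounds(d.key)
    if d.qualifier == "about":
        return shift_year(lo, -ABOUT_SLACK_YEARS), shift_year(hi, ABOUT_SLACK_YEARS)
    if d.qualifier == "before":
        return "00000101", lo   # open-ended into the past
    if d.qualifier == "after":
        return hi, "99991231"   # open-ended into the future
    return lo, hi


def matches(d: PartialDate, year: int, plus_minus: int = 0) -> bool:
    """True if the record's interval overlaps the search window.
    Fixed-width digit strings compare in the same order as the dates."""
    rec_lo, rec_hi = interval(d)
    win_lo = f"{max(0, year - plus_minus):04d}0101"
    win_hi = f"{min(9999, year + plus_minus):04d}1231"
    return rec_lo <= win_hi and win_lo <= rec_hi
```

With this, `matches(PartialDate("181203"), 1816, 5)` is true (March 1812 falls inside 1811 to 1821), while `matches(PartialDate("1933", "before"), 1940, 5)` is false. In a database, the two bounds could live in two indexed columns so the overlap test becomes an ordinary WHERE clause, though I haven’t tried that.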

And back to the translation issue: because different languages take up different amounts of space (Arabic is more compact than English, which is more compact than Finnish and German), designing the page can lead to headaches. It’s something you have to keep in mind the whole time. (Unless you’re only developing for English. But I’m not.)

Finally, as for dealing with the actual translating of the app, I’ll probably put together a simple web interface to the database, which volunteers can then access to translate the various terms. (We do something similar here at work, where our main site — http://immigrants.byu.edu/ — is available in six different languages.)

    Comments on “Dates and stuff”:

  1. Hilton

    I’ve been looking at the date problem as well. I have a scheme that I think will work. Let me know what you think:

    I want to store the dates in two parts: the date, and a qualifier. The date can be just a year, a year and month, or a year, month, and day (time could also be included, but probably won’t be). The point is that it goes from general to specific (although there may be value in recording only the day and month?). Anyhow, the qualifier is then “exact,” “about,” “before,” “after,” or any number of others.

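    Then I’ll store the dates as a string of the form YYYYMMDD (partial dates would be YYYY and YYYYMM). Because these strings sort in date order, finding a date range becomes a simple string comparison, and a partial date such as 1816 is a prefix of every full date in that year. For my implementation I’m using Lucene to index the data, which does dates like this anyhow.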

  2. Hilton

    One thing I’m really not sure about is calendar systems. Should it be canonicalized? Should there be a calendar specification?

  3. Ben

    I like it.

    As for whether there’s value in saying only the day and the month, I’m wondering how often it actually shows up. In theory, yes, it’s important. In practice? I don’t know. It’s possible that you could come across a record saying Great Grandpa Jones’s birthday was 16 October, but without specifying a year. But how often do those kinds of things come up? Part of me says, “If it’s quite uncommon, don’t worry about it. 80% majority rules.” But then the perfectionist in me hollers back, “What?!? It had better be as thorough as you can make it!”

    I’m thinking about the possible issue of negative years (B.C., that is). Granted, most genealogists probably haven’t gotten back that far (and perhaps never will), but then again you have Chinese lines that go back thousands and thousands of years. Hmm…

    Which brings us to the calendar question — it seems like the only way to do it is store dates internally in a single set calendar system, and then translate them to and from whatever the user wants to use (Chinese, Jewish, Revolutionary, etc.). I’ll have to look into it in more depth to see how different the other calendars are. Whatever the solution is, it needs to be easy for the user to use.
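The “single internal calendar” idea from the last comment could look something like the rough sketch below. It assumes the internal value is a Julian Day Number (a plain count of days), which is only one possible choice, and it converts just the Gregorian and Julian calendars using the standard integer formulas; Chinese, Jewish, and Revolutionary dates would each need their own converter. Years use astronomical numbering (year 0 is 1 B.C., year -99 is 100 B.C.), which also covers the negative-year question. The function names `to_jdn`, `from_jdn`, and `bc_to_astronomical` are made up for illustration.

```python
def to_jdn(year: int, month: int, day: int, calendar_name: str = "gregorian") -> int:
    """Convert a calendar date to a Julian Day Number.
    Year is astronomical (0 = 1 B.C.); valid for years from -4800 onward."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a             # shift so the year count is never negative
    m = month + 12 * a - 3          # March = 0 ... February = 11
    days = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if calendar_name == "gregorian":
        return days - y // 100 + y // 400 - 32045
    if calendar_name == "julian":
        return days - 32083
    raise ValueError(f"unsupported calendar: {calendar_name!r}")


def from_jdn(jdn: int, calendar_name: str = "gregorian") -> tuple[int, int, int]:
    """Convert a Julian Day Number back to (year, month, day)."""
    if calendar_name == "gregorian":
        a = jdn + 32044
        b = (4 * a + 3) // 146097   # completed 400-year cycles, in centuries
        c = a - 146097 * b // 4
    elif calendar_name == "julian":
        b = 0
        c = jdn + 32082
    else:
        raise ValueError(f"unsupported calendar: {calendar_name!r}")
    d = (4 * c + 3) // 1461         # completed years within the block
    e = c - 1461 * d // 4           # day of the March-based year
    m = (5 * e + 2) // 153          # March-based month index
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def bc_to_astronomical(bc_year: int) -> int:
    """Map a B.C. year (1, 2, 3, ...) to astronomical numbering (0, -1, -2, ...)."""
    return 1 - bc_year
```

For example, `to_jdn(1582, 10, 5, "julian")` and `to_jdn(1582, 10, 15, "gregorian")` both give 2299161, the day the Gregorian calendar first took effect, so a date entered in either calendar lands on the same stored number and can be shown back in whichever one the user prefers.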
