Archive for the 'Design' Category

Dare mighty things

I’m in the midst of sketching out what objects I’m going to need (user, person, family, message, image, relationship, etc.), along with what methods each will have. It’s elucidating. I’ve also started mapping out a URL schema (which will end up being tightly knit with the methods just mentioned).

You know, this time around, everything feels so much easier. Not that I got very far with my previous attempt, but even so, the path is clear where it was dim and murky before. And I can fit the whole thing in my head, which was something I was unable to do before. More importantly, I’m almost 100% sure now that I can do this. The mammoth project is now feasible.

But of course the doubts come flying at me every few days. That’s why I’ve started reading these quotes every day:

You must do the thing you think you cannot do. – Eleanor Roosevelt

Far better is it to dare mighty things, to win glorious triumphs, even though checkered by failure…than to rank with those poor spirits who neither enjoy much nor suffer much, because they live in a gray twilight that knows not victory nor defeat. – Teddy Roosevelt

It makes a huge difference. :)

A new look

Here’s a sneak peek at what Beyond is morphing into:

[Image: New Beyond Look]

(Just a mockup, by the way. We’re not quite that far yet, nor is this necessarily going to be the final graphic design for the site. :))

Flickr for genealogy

This changes everything.

Tim’s comment about a “Flickr for genealogy” has sparked a flurry of thoughts in my head, and here’s what I’ve decided.

1. Web all the way. Meaning, to pull this off, it has to be a child of the Web, not a desktop app that happens to be on the web. Think social networking. That’s going to be the focus, not so much the data. (But don’t worry, the data will still be important.)

2. Simple. While being able to store anything and everything is a noble ideal, I think it’s only going to work (at this stage) if the fields are set (the core data), with expandable metadata later on.

3. Low barrier to entry. It has to be really easy to sign in and get started. To host this, we come to…

4. A startup. I never really thought I’d do this, but I think it’s time to create a startup to host Beyond. It’s going to be a service rather than a software package. (For now, that is. Later on I’ll wrap the software into an install-on-your-own-server deal, but that’s secondary.) Time to figure out a good business model…

5. Web 2.0. Kind of in line with #1, Beyond will now be even more a part of the Web 2.0 paradigm. For example, individuals and families will have permalinks, so that you can tell people about them. You’ll be able to share whatever parts of your family tree you want to. There’ll be a simple blog built in (to function as a research log, basically), but you’ll also be able to pull in an RSS feed from your existing blog instead if you want, and you’ll be able to post to your blog directly from Beyond (via the various blogging APIs). There’ll be tags, watchlists (RSS feeds, that is), comments (on individuals, families, your profile, etc.), the works.

6. Community. I think a lot of the stuff in #5 will help with this. As for linking people together, I’m thinking about something along the lines of LibraryThing’s “works.” Basically, you have John Doe born in 1801, and user B has a John Doe born in 1801, and the system automatically picks them up as a possible match. You can say, “My John Doe is the same as user B’s John Doe.” And it’ll keep track of how many people say so-and-so is the same as so-and-so. I haven’t figured out all the logistics yet, but this seems like the right way to go. And I hesitate to have the system actually match anyone together; I’d rather leave the logistics of that to the humans.

7. Share. So far I’ve been in the genealogy-is-private mindset, but I think I’ve finally gotten rid of that. In this new world, it’s about sharing. Sure, you’ll be able to mark your data private if you want, but the default will be to share, share, share.

8. People, not pedigree. Up till now, my mindset has been that there’s this structure out there — a pedigree — and you make people fit into the pedigree. It doesn’t always work, though, because people got remarried, etc. When looking at Flickr last night for inspiration, however, I realized that there’s a better way to go about it. (Well, I think it’s better. Only time and a prototype will tell. :)) Individuals are like photos, and families are like groups. Instead of putting things into pedigrees, you add individuals (like adding photos), and then you can sort them into families (like putting photos into groups or sets, with predefined roles like “father,” “mother,” “child”). To link generations together, you just put the linking person into both families. (For example, Hoover Macgillicuddy is the father of Family A. His parents are Wilford and Maretta Pinegar, who are in Family B. You just have to add Hoover as a child in Family B.) And it’ll automatically stitch together the pedigree for you. What this means is that you won’t be doing most of your work from the pedigree. Instead, you’ll work from the individual and group lists (using browse and search). It’s different, to be sure, but I’ve got a gut feeling that it’s a good change. We’ll see.

9. A deadline. I’ve got a feeling that I can make this happen by the end of August. I don’t know yet if that’s utter madness or not, but heck, there’s a rumor floating around that Flickr went up two weeks after the initial idea. If they can do it, so can I. (And if the rumor’s false, well, that’s not going to stop me. :))

10. A name. Somebody’s already got www.beyondproject.com. I do rather like the name Beyond, so it’s time to come up with some variation with a dot-com ending (kind of like how Backpack is at www.backpackit.com). Hmm…
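The “people, not pedigree” idea in item 8 can be sketched in a few lines. This is only a rough illustration in Python (not the actual Beyond code), and the names `Family`, `parents_of`, and `pedigree` are made up for the example. It assumes a family is just a bag of individuals with roles, and that the pedigree is derived by following “child” memberships upward.

```python
from dataclasses import dataclass, field


@dataclass
class Family:
    name: str
    # role -> list of individuals; roles follow the post: father, mother, child
    members: dict = field(default_factory=dict)

    def add(self, person, role):
        self.members.setdefault(role, []).append(person)


def parents_of(person, families):
    """Collect the father and mother of every family where `person` is a child."""
    found = []
    for fam in families:
        if person in fam.members.get("child", []):
            found += fam.members.get("father", []) + fam.members.get("mother", [])
    return found


def pedigree(person, families, depth=3):
    """Stitch ancestors together as nested dicts, up to `depth` generations."""
    if depth == 0:
        return {}
    return {p: pedigree(p, families, depth - 1) for p in parents_of(person, families)}


# The example from item 8: Hoover is the father in Family A and a child in Family B.
family_a = Family("Family A")
family_a.add("Hoover Macgillicuddy", "father")

family_b = Family("Family B")
family_b.add("Wilford Pinegar", "father")
family_b.add("Maretta Pinegar", "mother")
family_b.add("Hoover Macgillicuddy", "child")

tree = pedigree("Hoover Macgillicuddy", [family_a, family_b])
```

Nothing here stores a pedigree. It falls out of the family memberships, which is the point of working from individuals and groups instead.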

On relationships

I’ve been working on Beyond for the past five hours or so. Worked on the database schema, realizing I’d forgotten about translations and a few other things (tags). Split the schema into four separate stages, which roughly parallel development on the program. Then I started work on a little Ruby on Rails app to create the database (via migrations) and populate it (by loading an XML file created from a GEDCOM). So far it’s working okay, and it’s giving me an opportunity to rethink some decisions.

As it stands, the current model has everything (name, gender, UID, etc.) as a characteristic which gets linked to the person via a relationship. So the People table itself only stores the ID, really. This means lots of characteristics and even more relationships. Hmm… The flexibility of the current relationships table means I can relate any two records in the database (two events, or an event and a picture, for example). But is that even a good idea? I guess my main concern is having a huge, unwieldy relationships table. We’ll have to see if the benefits of flexibility outweigh the downsides.
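To make the trade-off concrete, here is a small sketch of that kind of model using Python’s built-in sqlite3 module. The table and column names (`from_table`, `to_table`, and so on) are assumptions for illustration, not the real Beyond schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY);
CREATE TABLE characteristics (id INTEGER PRIMARY KEY, kind TEXT, value TEXT);
-- One generic table that can relate any two records in any two tables.
CREATE TABLE relationships (
    id INTEGER PRIMARY KEY,
    from_table TEXT, from_id INTEGER,
    to_table TEXT, to_id INTEGER
);
""")


def add_characteristic(person_id, kind, value):
    """Store a characteristic, then link it to the person via a relationship row."""
    cur = conn.execute(
        "INSERT INTO characteristics (kind, value) VALUES (?, ?)", (kind, value)
    )
    conn.execute(
        "INSERT INTO relationships (from_table, from_id, to_table, to_id) "
        "VALUES ('people', ?, 'characteristics', ?)",
        (person_id, cur.lastrowid),
    )


def characteristics_of(person_id):
    """Read a person's characteristics back by joining through relationships."""
    rows = conn.execute(
        """
        SELECT c.kind, c.value
        FROM relationships r
        JOIN characteristics c
          ON r.to_table = 'characteristics' AND r.to_id = c.id
        WHERE r.from_table = 'people' AND r.from_id = ?
        ORDER BY c.id
        """,
        (person_id,),
    )
    return dict(rows.fetchall())


# The people row holds nothing but its ID; everything else hangs off relationships.
person_id = conn.execute("INSERT INTO people DEFAULT VALUES").lastrowid
add_characteristic(person_id, "name", "John Doe")
add_characteristic(person_id, "gender", "male")
```

Even in this toy version, each characteristic costs two rows and every read needs a join, which is where the worry about a huge relationships table comes from.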

Anyway, now that I have some real data, I’ll be working on integrating it with the mockups (pedigree and so on). And then, after I figure out a good navigation scheme, you’ll be able to load a GEDCOM and view it online. Small steps. :)

Data model redux

Having thought about it some more, I’ve decided to go with a loose data model. The advantage is flexibility: people will be able to store whatever they need to, in a way that makes sense to them. As far as interoperability goes, yes, I think there will be some problems, but I have a feeling they won’t be too bad, especially with the template system (where there’ll be “expected” fields, like “First Name” and “Gender” and such). Another concern is data analysis, but I have some ideas on that.

Let me see if I can describe this clearly. When you go to the detail page for an individual (or a family, or an event), you’ll be able to organize the information about that person into groups (“Vital Events,” “Other Information,” whatever you want to call them). Each group contains items, which are either metadata (key/value pairs like “Hair color”/“brown” or “Religion”/“Baptist”), events (which in turn contain dates, places, and other metadata about the event), or research items (to-do lists, images, files, notes, etc.). You can order the groups and items any way you want via drag-and-drop.
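As a rough sketch of that structure (in Python, with class names invented for the example, and with placeholder dates that are not real data), a detail page could be an ordered list of groups, each holding an ordered list of items:

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Metadata:
    key: str
    value: str


@dataclass
class Event:
    name: str
    date: str = ""
    place: str = ""


@dataclass
class ResearchItem:
    kind: str  # for example "note", "to-do list", "image"
    body: str


# The three kinds of item a group may contain.
Item = Union[Metadata, Event, ResearchItem]


@dataclass
class Group:
    title: str
    items: List[Item] = field(default_factory=list)

    def move(self, old_index, new_index):
        """Reorder an item, the way a drag-and-drop handler might."""
        self.items.insert(new_index, self.items.pop(old_index))


# Placeholder values, purely for illustration.
vitals = Group("Vital Events", [Event("Birth", "1801"), Event("Death", "1870")])
other = Group(
    "Other Information",
    [Metadata("Hair color", "brown"), Metadata("Religion", "Baptist")],
)
page = [vitals, other]

# Dragging "Religion" above "Hair color":
other.move(1, 0)
```

Order lives in the lists themselves here. In a database it would more likely be a position column on each group and item, but that is a guess at this stage.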

I’m not entirely sure this’ll even work, but I’ll go forward with it and see how the tests go. If it flops, I’ll fix it. Collaboration could get interesting… Granted, I don’t think everyone is going to move things around all the time, and I suspect that most will stick with the standard template (vital events first — birth, christening, death, burial, etc. — and so on). But the flexibility’s there for those who need it. If you want to add a “Got Eagle Award” event, you can. If you want to add a to-do list for a particular ancestor (or a particular family), you can. If you want to add a table with census results from 1830 to 1870 for your ancestor, you can put it right there with the rest of his data, if that helps you. (And if it doesn’t help you, you can put it on a research page instead, keeping things separate.) My goal with Beyond is to set up as loose a framework as possible, just the foundation, and then get out of the way and let users do things the way they want to do them.

A handful of ramblings

First off, I’ve been reading the GENTECH Data Model spec. I read it years ago, when I was working at Ancestry, but time erases a lot of details. :) Anyway, it’s interesting food for thought. I don’t think I’ll end up adopting it (at least not wholesale), but it does have a lot of good ideas. I like the idea of being able to associate dates and places with characteristics (so you can say, “John Smith was a farmer from 1730 to August 1749 in Hartford, Connecticut”).

OpenID caught my interest today, primarily because of the easy sign-in capability. I’m still not entirely sure how it works, or if it’s even desirable for Beyond (genealogy may be a touchy area as far as that goes), but it’s definitely an option. I do plan on having MicroIDs embedded in the header of users’ pages, which’ll make it easy to use claimID to say “This is my genealogy.”

Coding-wise, I took some Ruby code to convert GEDCOM to XML and started writing some classes which convert the XML to Ruby objects. (Eventually the XML will disappear, of course; this is just a temporary hack to get some data to work with.) Once that’s done, I’ll write code to import the Ruby objects into the database, and then it won’t take long before the prototype goes live.

Speaking of the database, I’m almost done drafting the data model (and GENTECH is influencing a few things here and there). I’ve come up with a tentative solution for one issue: sources. Ideally, you should be able to add a source to any bit of data that could reasonably have a source. So, to that end, I think the Sources table is going to be open-ended: instead of having a set list of types, there’ll be an “object_id” field and an “object_table” field, which means I can add a source to anything that shows up in a table.
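Here is a minimal sketch of the open-ended Sources table, again with Python’s sqlite3 module. Only `object_id` and `object_table` come from the description above. The other tables, columns, helper names, and sample citations are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE media (id INTEGER PRIMARY KEY, caption TEXT);
-- A source points at any row in any table via (object_table, object_id).
CREATE TABLE sources (
    id INTEGER PRIMARY KEY,
    citation TEXT,
    object_table TEXT,
    object_id INTEGER
);
""")


def add_source(object_table, object_id, citation):
    """Attach a citation to a row identified by its table name and ID."""
    conn.execute(
        "INSERT INTO sources (citation, object_table, object_id) VALUES (?, ?, ?)",
        (citation, object_table, object_id),
    )


def sources_for(object_table, object_id):
    """List the citations attached to one row, oldest first."""
    rows = conn.execute(
        "SELECT citation FROM sources "
        "WHERE object_table = ? AND object_id = ? ORDER BY id",
        (object_table, object_id),
    )
    return [row[0] for row in rows]


event_id = conn.execute("INSERT INTO events (name) VALUES ('Birth')").lastrowid
media_id = conn.execute("INSERT INTO media (caption) VALUES ('Family photo')").lastrowid

add_source("events", event_id, "Parish register")
add_source("media", media_id, "Scanned from a family album")
```

One cost of this layout is that the database cannot enforce a foreign key on `object_id`, since the target table varies from row to row. Cleaning up orphaned sources would be the application’s job.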

But that’s still not enough, at least not yet. For example, events will include fields for the date and the place, preventing a source from being added specifically for the date (or the place), and instead forcing it to be added for the event as a whole. Hmm. The idea of being able to source everything is really nice, but is it feasible without turning the database into a mess?

One last thing. For storing information about individuals, I’m thinking about using a similarly flexible system: everything gets stored as a key/value pair. So instead of having set fields, you’d just add a “first_name” key and fill in the value. If there’s no middle name, you don’t have to add a middle name. The advantage is that you don’t have to use fields you don’t need, and you can use other fields that you do need (and that I’ve never heard of). The disadvantage comes in displaying the information and ordering it into groups. But maybe I could include a groups table, so you could put all the name information and gender and birth/death information into a “Vitals” group (or whatever you want to call it). Hmm… I’m considering the possibility of using templates to make this kind of thing easier for newbies — a scaffolding with common keys already in place for you to use.
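A quick sketch of how a template could tame free-form key/value pairs, in Python. The `TEMPLATE` contents, the `layout` function, and the catch-all “Other” group are all assumptions for illustration.

```python
# A template maps a group title to the keys it expects, in display order.
TEMPLATE = {"Vitals": ["first_name", "middle_name", "last_name", "gender"]}


def layout(facts, template):
    """Arrange key/value facts into groups.

    Template keys that the person lacks are skipped, and keys the template
    has never heard of land in an "Other" group.
    """
    grouped = {}
    placed = set()
    for group, keys in template.items():
        present = [(k, facts[k]) for k in keys if k in facts]
        if present:
            grouped[group] = present
        placed.update(keys)
    extras = [(k, v) for k, v in facts.items() if k not in placed]
    if extras:
        grouped["Other"] = extras
    return grouped


# No middle name, plus one key the template does not know about.
facts = {
    "first_name": "John",
    "last_name": "Smith",
    "gender": "male",
    "occupation": "farmer",
}
result = layout(facts, TEMPLATE)
```

The template only affects display. The stored data stays a flat set of pairs, so changing or dropping the template later would not touch anyone’s records.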

End brain dump. :)

One small step for man

Last night as I was sitting in front of the Manti temple waiting for the Mormon Miracle Pageant to start, I realized I had a perfect opportunity to work on the database design. And so I did. I’ve got enough of it done that I can start implementing it and get some simple prototypes up and running. And finals are now officially over, so I’ll have much more time to work on this. :D

Coming up with titles for these little updates is hard. ~sigh~

Dates and stuff

I’ve been reading the W3C’s internationalization (i18n) page, since Beyond will be a global app almost from the get-go. It’s really important that it support not only other languages, but also the cultural aspects that come with them — date and time formats and script direction, for example.

As far as dates go, don’t forget that genealogy often deals with incomplete dates: just a year (“1775”), or just a month (“March”), or ranges (“before 1933”, “about 1854”). This makes date storage just a wee bit tricky, since the standard SQL date format wants more detail. Storing as a string takes care of the flexibility, but searching by date then becomes harder if you include ranges on the search form (1816 +/- 5 years). I’m still not sure how this’ll work. (Gee, I seem to be saying that a lot lately, don’t I. :))
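One possible approach, sketched in Python: keep the original string for display, and derive an earliest/latest year pair from it for searching. Everything here is an assumption for illustration, including the five-year margin for “about,” the “after” qualifier, the open-ended bounds, and the function names.

```python
import re

ABOUT_MARGIN = 5  # assumed tolerance for "about" dates
MIN_YEAR, MAX_YEAR = 0, 9999  # assumed bounds for open-ended ranges


def year_range(text):
    """Return (earliest, latest) year for a date string, or None if it has no year."""
    match = re.search(r"\b(\d{4})\b", text)
    if match is None:
        return None  # for example, just "March"
    year = int(match.group(1))
    qualifier = text.lower()
    if qualifier.startswith("before"):
        return (MIN_YEAR, year - 1)
    if qualifier.startswith("after"):
        return (year + 1, MAX_YEAR)
    if qualifier.startswith("about"):
        return (year - ABOUT_MARGIN, year + ABOUT_MARGIN)
    return (year, year)


def matches(text, target, tolerance=0):
    """True when the stored date could fall within target +/- tolerance years."""
    bounds = year_range(text)
    if bounds is None:
        return False
    earliest, latest = bounds
    return earliest <= target + tolerance and latest >= target - tolerance
```

With this, a search for 1816 +/- 5 years becomes an overlap test between two ranges, which also maps onto two plain integer columns in SQL. Month-only dates still have no good answer in this sketch.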

And back to the translation issue: because different languages take up different amounts of space (Arabic is more compact than English, which is more compact than Finnish and German), designing the page can lead to headaches. It’s something you have to keep in mind the whole time. (Unless you’re only developing for English. But I’m not.)

Finally, as for dealing with the actual translating of the app, I’ll probably put together a simple web interface to the database, which volunteers can then access to translate the various terms. (We do something similar here at work, where our main site — http://immigrants.byu.edu/ — is available in six different languages.)
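A tiny sketch of what sits behind such an interface, in Python: terms keyed by locale, a fallback to English when a translation is missing, and a list of what volunteers still need to fill in. The data layout and function names are assumptions, and the dict stands in for a database table.

```python
# (term, locale) -> translated text; stands in for a translations table.
TRANSLATIONS = {
    ("birth", "en"): "Birth",
    ("birth", "de"): "Geburt",
    ("death", "en"): "Death",
}


def translate(term, locale, default_locale="en"):
    """Look up a term, falling back to the default locale, then to the raw term."""
    return TRANSLATIONS.get((term, locale)) or TRANSLATIONS.get(
        (term, default_locale), term
    )


def untranslated(locale):
    """Terms that exist in some locale but not yet in this one."""
    all_terms = {term for term, _ in TRANSLATIONS}
    done = {term for term, loc in TRANSLATIONS if loc == locale}
    return sorted(all_terms - done)
```

The `untranslated` list is what the volunteer-facing page would show for each language.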

Data models

Yesterday Hilton blogged about the Genesis Data Model, which includes the Genealogy Core and Genealogy Provenance ontologies. I hadn’t really looked much at RDF until I read this, but it seems like a pretty good idea for Beyond’s interchange format.

For example, named graphs allow you to say, “All of this information came from this source.” I’m still not entirely clear on how redundant that might get, but it does allow you to stitch together what each source offered.

I checked out Practical RDF from the library and will be reading it once finals are over. Grokking RDF will no doubt help a lot with figuring out just how it’ll be useful in Beyond’s world. I do very much like the idea of the semantic web. More on this later. Much more. :)

I’ve also started sketching out the Beyond data model. Once I’ve established its requirements, I’ll be better able to tell whether any existing models (including Genesis) fit the bill, and if none do, which of them might be similar enough to alter instead of starting from scratch.

Now to get back to studying…

The database skeleton

I’m starting to design the database. Not having studied a lot of database theory, I find it slightly intimidating, especially with possible scalability concerns in the future. But I’m not going to let that paralyze me. If it ends up having problems, then I’ll fix them and move on. Perfection is rarely achieved on the first try. It’s better to get something up and then tweak and iterate until you get it right.

It is tempting to try to read through some database theory books before I get too far, but I’m going to resist, if only because I know that’ll stop me dead in my tracks. Better to do it in parallel. (I’m saying this more to persuade myself than for any other reason. :))

As a brief overview of what’s going to go into the database, here are the objects/entities I’ve come up with so far that need to be represented. I’m guessing each will be its own table:

  • Individuals
  • Families
  • Sources
  • Research pages
  • Media
  • Users
  • Files
  • Translations

And some thoughts on each:

Individuals

Each record has a collection of metadata associated with it. Individuals need to be linked to each other somehow.

Families

Basically the same as individuals (collection of metadata). I’m not sure if families will be linked to each other directly or if we’ll rely on the links between individuals.

Sources

Still not sure if I want to go with an elaborate citation system or if it should be more free-form.

Research pages

Each page can have lists, notes, pictures, other media, etc. Not sure how to store it.

Media

Pictures, etc. Right now there’s overlap between this and the research pages. I’m not sure yet how I’ll resolve that. Research page pictures will probably end up being more pictures of documents and such, whereas media will be pictures of individuals and families. Maybe. Hmm…

Users

List of users on the system. Usernames and passwords, first and last name, last login, that kind of stuff.

Files

A file can be a local Beyond store (on that server) or a link to another Beyond server (or a PhpGedView server or Family Tree, for that matter). (If you want to open a PAF or GEDCOM file on your hard drive, will Beyond convert it to a local Beyond store first? Hmm…) You can have more than one of each, of course.

Translations

Support for internationalization.