Family tree using Graphviz and Ruby 2009-05-23


My dad spent a lot of time putting together a family database. It currently contains about 12,000 people, covering both my parents' ancestors and tracking forward to include a lot of living descendants. Unfortunately, since he started this some 16-17 years ago, it has been managed as a custom dBase III+ app, and the code grew by accretion over at least 7 years (until my father died). Spurred on by an e-mail from a possible distant relative (whom, it turns out, I've even met), I finally dumped the dbf files into an SQLite database and put together a few scripts to generate diagrams from it.

Here's an example (click to enlarge). The birth/death dates are in Norwegian format (day.month.year).

Ancestors of Ole Martin Hokstad

The full SVG diagram (which is zoomable in Safari and Firefox) is here.

This tree shows the known ancestors of Ole Martin Hokstad - the first person amongst my direct ancestors to be born to the Hokstad name (there are one or two other families where the name Hokstad was taken at different times). In our case the name stems from the Hogstad farms in Frosta, near Trondheim, Norway. The farms kept being divided as children inherited parts of them.
At one point the farm Lille-Hogstad was bought by Ola Viktil, and one of his grandsons, Peter Magnus Hokstad Johansen, combined two smaller properties into Hogstad Lille Vestre in 1854, which was then renamed Hokstad (presumably he didn't like the thought of the name Peter Magnus Hogstad Lille Vestre Johansen). His children, including my great-grandfather Ole Martin Hokstad, got the name at birth.

The tree above doesn't show any siblings, and leaves out a few people we don't have any certain information about. I did render a tree of all my known ancestors as well, but it's too huge to be practical to reproduce here (about 20 times the size of the tree above).

To produce this I put together a very quick and dirty little Ruby script:


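    # model.rb defines Person; require_relative finds it next to this script
    # on current Rubies, where "." is no longer on the load path
    require_relative 'model'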
    require 'set'
    
    id = ARGV[0].to_i
    
    # Prevent double inclusion of a node
    $memo = Set.new
    
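    # Skip unknown people and those whose name is recorded only as "?".
    # to_s keeps this from failing if a name field is nil.
    def filter_node(per)
      return nil if !per
      return nil if per.firstname.to_s.strip == "?" ||
             per.lastname.to_s.strip == "?" ||
             per.maidenname.to_s.strip == "?"
      return per
    end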
    
    # Emit a filled box for this person; returns false if already emitted
    def node(per, color)
      return false if $memo.member?(per.pk)
      $memo << per.pk
      name = [per.firstname, per.middlename,
                    per.lastname,
                    per.maidenname]
      # Drop missing or empty name parts before joining
      name = name.collect { |n| n && n != "" ? n : nil }.compact.join(" ")
      label = "#{name}\n#{per.birthdate} - #{per.deathdate}"
      puts "   p#{per.pk}  [ shape = box, style=\"filled\","+
              " fillcolor=\"#{color.to_s}\", label=\"#{label}\" ];"
      return true
    end
    
    def ancestors per
    
      father = filter_node(per.father)
      mother = filter_node(per.mother)
    
      pk = per.pk
      arrowhead = "normal"
      if mother and father
        merge = "m#{mother.pk}and#{father.pk}"
        if !$memo.member?(merge)
          puts " p#{merge} [ shape = point ]"
          puts " p#{merge} -> p#{pk} [ arrowtail=none ]"
          arrowhead = "none"
        else
          $memo << merge
        end
        pk = merge
      end
    
      if father
        if node(father,:green)
          puts "   p#{father.pk} -> p#{pk} [ arrowhead=#{arrowhead} ]"
          ancestors(father)
        end
      end
    
      if mother
        if node(mother,:gold)
          puts "   p#{mother.pk} -> p#{pk} [ arrowhead=#{arrowhead} ]"
          ancestors(mother)
        end
      end
    end
    
    
    def graph per
      puts "digraph ancestors {"
      node(per,:red)
      ancestors(per)
      puts "}"
    end
    
    per = Person[:id => id]
    if !per
      puts "Unable to find #{id}"
      exit
    end
    
    graph(per)
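
One detail that is easy to miss when reading the script: when both parents are known, they are not linked to the child directly. Both parents point at a small shared "point" node, and a single edge runs from that point to the child. The following is a minimal sketch of just that piece, with no database involved. The helper name parent_edges and the integer keys are made up for illustration; the node naming (p plus the key, and pm...and... for the merge point) follows the script above.

```ruby
# Returns the DOT lines that link two parents to a child through a shared
# point node. parent_edges is a hypothetical helper, not part of the script.
def parent_edges(child_pk, father_pk, mother_pk)
  merge = "pm#{mother_pk}and#{father_pk}"
  [
    "#{merge} [ shape = point ]",                    # the junction itself
    "#{merge} -> p#{child_pk} [ arrowtail=none ]",   # junction to child
    "p#{father_pk} -> #{merge} [ arrowhead=none ]",  # father to junction
    "p#{mother_pk} -> #{merge} [ arrowhead=none ]"   # mother to junction
  ]
end

# Print a complete, tiny graph that can be fed straight to dot.
puts "digraph example {"
parent_edges(1, 2, 3).each { |line| puts "  #{line}" }
puts "}"
```

Graphviz creates the p1, p2 and p3 nodes implicitly here; in the full script they are declared by node with a label and a fill colour.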

I'm not going to spend a lot of time going through the script, other than to point out the dependencies if you want to try this for yourself:

You need to create a class with a method #pk that returns a unique key suitable for use in a Graphviz dot-file node name; #father and #mother, which return an equivalent object for the father and mother respectively, or nil if not known; and #firstname, #middlename, #lastname and #maidenname, which return the names as strings. The script also calls #birthdate and #deathdate to build the label. Whether the data comes from a database or not is irrelevant - you can load it all into memory first if you like. In my case it all comes from a Sequel model: as you can see at the end of the script, I retrieve a Person object for the id provided and use it as the root of the tree. 
I don't think I'll put in much effort to make this a generic package, but it should be easy enough to adapt if you know some Ruby. I will probably post a couple of variations to add output of siblings and also to generate an equivalent one for descendants instead of ancestors though.
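
For anyone who wants to try the script without a database, here is a minimal sketch of an in-memory model that provides the methods listed above. It is an assumption about what a stand-in could look like, not the Sequel model used for the diagrams: the constructor signature, the add helper and the placeholder people are all invented for illustration. Name and date fields default to empty strings so the script's string handling has something to work with.

```ruby
# Hypothetical in-memory replacement for model.rb (no database needed).
class Person
  attr_reader :pk, :firstname, :middlename, :lastname, :maidenname,
              :birthdate, :deathdate, :father, :mother

  @people = {}

  # Mimics the Person[:id => n] lookup used at the end of the script.
  def self.[](criteria)
    @people[criteria[:id]]
  end

  # Registers a person under its key so the lookup above can find it.
  def self.add(person)
    @people[person.pk] = person
  end

  def initialize(pk, firstname, lastname, father: nil, mother: nil,
                 middlename: "", maidenname: "", birthdate: "", deathdate: "")
    @pk, @firstname, @lastname = pk, firstname, lastname
    @father, @mother = father, mother
    @middlename, @maidenname = middlename, maidenname
    @birthdate, @deathdate = birthdate, deathdate
    Person.add(self)
  end
end

# Placeholder people, not real records: parents first, then the child.
father = Person.new(2, "Example", "Father")
mother = Person.new(3, "Example", "Mother", maidenname: "Maiden")
Person.new(1, "Example", "Child", father: father, mother: mother)
```

Saved as model.rb next to the script, this should be enough for the script to draw a three-person tree when called with the id 1.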

I then use this little shell script to generate the SVG file (requires xsltproc):


    #!/bin/sh
    
    # $1 is the person id, $2 the output SVG file
    ruby ancestors.rb "$1" >"/tmp/$1.dot"
    dot -Tsvg "/tmp/$1.dot" >"/tmp/$1.svg"
    xsltproc /opt/diagram-tools/notugly.xsl "/tmp/$1.svg" >"$2"

This assumes my diagram-tools Git repository, which is used to pretty up the Graphviz output, has been cloned into /opt/diagram-tools (git clone git://github.com/vidarh/diagram-tools.git /opt/diagram-tools).



