Inline Graphviz 2013-09-24


This post by David King on inline Graphviz graphs inspired me to write my first Rack middleware in a long time. David's code uses PHP on the server side, coupled with JavaScript that parses Graphviz data out of script tags; the server then renders the graphs with the appropriate Graphviz tool and serves up the image. It's a great solution for "graphviz-as-a-service" if you want a really generic way of serving up Graphviz graphs from "anywhere".

However, for my own use, I don't tend to like doing stuff like this client-side, and I don't want to run PHP when the rest of my blog is all Ruby. I also don't need a very generic solution.

My blog runs through a multi-stage pipeline that parses an augmented Markdown variation anyway, so I decided I wanted a solution that does it all server-side and simply caches the output.

So I figured I'd like something that lets me pass in the graph that my filters extract and get back either a URL to a graph image or, optionally, inline SVG. That is combined with a Rack middleware that handles requests for the graph image (when not using the inline option), renders it, and caches it in a file.

First, a couple of examples.

The dot code:


        digraph G {
             graph [
               truecolor=true
               bgcolor="#FFFFFF00"
             ];
    
             edge [  color=blue; ];
             node [  style=filled; fillcolor=blue; ];
    
            this -> that -> more;
        };

And the inline graph:

And another example using a record:


    digraph structs {
    
        node [  style=filled; fillcolor=green; ];
        node [shape=record];
            struct1 [label="<f0> left|<f1> middle|<f2> right"];
            struct2 [label="<f0> one|<f1> two"];
            struct3 [label="hello\nworld |{ b |{c|<here> d|e}| f}| g | h"];
            struct1:f1 -> struct2:f0;
            struct1:f2 -> struct3:here;
        }
                            

And the resulting graph:

Generating the HTML and Graphviz graphs

You can find the code below in this GIST rather than trying to cut and paste it together.

Let's dive straight in:


    require 'digest'
    
    module GViz
    
      CACHE_PATH  = "/tmp/gviz/"
      ENGINES = %w{dot neato twopi circo fdp sfdp}
    
      XSLTPROC = `which xsltproc`.chomp
      notugly = File.dirname(__FILE__)+"/notugly.xsl"
      NOTUGLY  = (File.exist?(notugly) && XSLTPROC != "") ? notugly : nil
      CONVERT = `which convert`.chomp

The CACHE_PATH is where we'll store the cached images. ENGINES is a list of the Graphviz tools to use for layout/rendering.

The rest of this chunk detects the prerequisites for post-processing the Graphviz output with my XSL for prettying up Graphviz SVG output: it requires the notugly.xsl file from those articles, as well as xsltproc. If you also want to convert the cleaned-up SVG to PNG, it requires a convert command equivalent to the one from ImageMagick (if my XSL isn't used, the code gets Graphviz to render directly to PNG).
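
As a rough sketch of the same kind of detection done without shelling out to which, the helper below walks the PATH entries and returns the first executable match. Note that `find_executable` is a hypothetical name introduced here for illustration; it is not part of the code in this post.

```ruby
# Hypothetical helper (not in the original code): look a command up on PATH
# without shelling out to `which`.
def find_executable(name, path = ENV["PATH"].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    # Only accept regular files that the current user may execute.
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Possible usage, keeping the "empty string means missing" convention:
#   XSLTPROC = find_executable("xsltproc").to_s
#   CONVERT  = find_executable("convert").to_s
```

Since `nil.to_s` is the empty string, comparisons such as `XSLTPROC != ""` would keep working unchanged.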


      class Graph
        def initialize(graph)
          @graph = graph
        end
        
        def slug
          # The MD5 hash is not for security, since I'm not accepting posted graphs,
          # nor keeping any private graphs. 
          @slug ||= Digest::MD5.hexdigest(@graph)
        end
    
        def write(fname)
          `mkdir -p #{CACHE_PATH}`
          File.open(fname,"w") do |f|
            f.write(@graph)
          end
        end
    
        def cache
          fname = "#{CACHE_PATH}#{slug}.dot"
          write(fname) if !File.exist?(fname)
          slug
        end
        
            
        def self.dot_file(slug)
          "#{CACHE_PATH}#{slug}.dot"
        end
      end

This small class just contains the code needed to cache the graph somewhere. The graph needs to be available in a file when rendering it, but if you want to store the extracted graph fragments somewhere else, like in a database, replacing this class is an easy way to do it - only the cache and Graph.dot_file methods are called from elsewhere.
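
To make the "store it somewhere else" idea concrete, here is a minimal sketch of a replacement that exposes the same two entry points, cache and dot_file. Everything in it is an assumption for illustration: the class name `StoredGraph`, the Hash standing in for a database table, and the scratch directory are all made up.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Hypothetical replacement for Graph: the dot source lives in a Hash
# (standing in for a database table) and is written to disk on demand.
class StoredGraph
  STORE    = {}                                    # slug => dot source
  WORK_DIR = File.join(Dir.tmpdir, "gviz-sketch")  # assumed scratch directory

  def initialize(graph)
    @graph = graph
  end

  def slug
    @slug ||= Digest::MD5.hexdigest(@graph)
  end

  # Store the source under its slug and return the slug.
  def cache
    STORE[slug] ||= @graph
    slug
  end

  # Return a path the layout engine can read, or nil for an unknown slug.
  def self.dot_file(slug)
    source = STORE[slug]
    return nil unless source
    FileUtils.mkdir_p(WORK_DIR)
    path = File.join(WORK_DIR, "#{slug}.dot")
    File.write(path, source) unless File.exist?(path)
    path
  end
end
```

The file is only materialized when a renderer asks for it, since the layout tools still need the graph on disk.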


      class Cache
        def initialize hash,engine,format
          @hash   = hash
          @engine = engine
          @format = format
    
          raise "Unsupported layout" if !ENGINES.member?(@engine.to_s)
        end
    
        def layout(format, src, dest)
          system("#{@engine.to_s} -T#{format.to_s} #{src} -o #{dest}")
        end
    
        def xsl(src,dest)
          system("#{XSLTPROC} --nonet #{NOTUGLY} #{src} >#{dest}")
        end
    
        def convert(src,dest)
          system("convert #{src} #{dest}")
        end

The above methods layout, xsl and convert cover the conversion. Note: if you expose this via a web server, you want to make sure that format is sanitized - in my Rack code, I limit this to the strings "png" and "svg". For that matter, this applies to src and dest too, but in my code those never come from the client.
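
One way to sketch that sanitizing step is to check the engine and format against whitelists and build the command as an argument list rather than a single shell string. The helper name `layout_argv` and the two constants are assumptions for this example, not part of the code above.

```ruby
ALLOWED_ENGINES = %w{dot neato twopi circo fdp sfdp}
ALLOWED_FORMATS = %w{png svg}

# Build the layout command as an argument list after checking engine and
# format against whitelists. Returns an Array suitable for system(*argv).
def layout_argv(engine, format, src, dest)
  engine = engine.to_s
  format = format.to_s
  raise ArgumentError, "Unsupported layout" unless ALLOWED_ENGINES.include?(engine)
  raise ArgumentError, "Unsupported format" unless ALLOWED_FORMATS.include?(format)
  [engine, "-T#{format}", src, "-o", dest]
end

# With several arguments, Kernel#system runs the program directly instead of
# going through a shell, so spaces or metacharacters in src/dest are not
# interpreted by a shell:
#   system(*layout_argv(:dot, :svg, "/tmp/gviz/abc.dot", "/tmp/gviz/abc.svg"))
```

This does not replace validating where src and dest come from; it only removes the shell from the picture.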

This is just used to determine the target cache filename:


        def target_file(format=nil)
          format ||= @format
          CACHE_PATH+@hash+".#{format.to_s}"
        end

This handles the rendering, including conditionally applying my XSL:


        def render(format=nil)
          format ||= @format
          src  = CACHE_PATH+@hash+".dot"
          dest = target_file(format)
    
          if (format.to_sym == :svg && NOTUGLY)
            layout(format,src,dest +".tmp")
            xsl(dest+".tmp",dest)
          elsif NOTUGLY && CONVERT != ""
            layout(:svg, src, dest+".tmp")
            xsl(dest+".tmp",dest+".tmp.svg")
            convert(dest+".tmp.svg",dest)
          else
            layout(format,src,dest)
          end
          File.read(dest) rescue nil
        end
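
The three branches of render can be summarized as a small decision function. This is only a sketch of the branching, with made-up step names and a hypothetical `render_plan` helper; it runs no tools.

```ruby
# Return the list of steps render would take for a format, given which
# optional tools are available. Step names are illustrative only.
def render_plan(format, notugly_available, convert_available)
  if format.to_sym == :svg && notugly_available
    # SVG requested and the XSL is usable: lay out to SVG, then clean it up.
    [:layout_svg, :xsl]
  elsif notugly_available && convert_available
    # Other format: lay out to SVG, clean it up, then convert to the target.
    [:layout_svg, :xsl, :convert]
  else
    # No XSL (or no convert): let Graphviz render the target format directly.
    [:layout_direct]
  end
end
```

For instance, `render_plan(:png, true, false)` falls through to the direct Graphviz render because there is no way to convert the cleaned-up SVG.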

This is the method to modify if you want to change how the URL is formatted (but note that if you want to use the Rack handler, you need to change a Regexp for that as well):


        def url(url_base)
          "#{url_base}#{@engine}_#{@hash}.#{@format}"
        end

For any external images, we create an image link:


        def img_link(url_base)
          "<img src='#{url(url_base)}' class='gviz' />"
        end

While if the format given is :inline_svg, we render it with :svg, and then read back the generated SVG and strip off the XML declaration and DOCTYPE:


        def to_html(url_base="/images/gviz/")
          return img_link(url_base) if @format != :inline_svg
          svg = render(:svg)
          svg = svg.split("\n")[2..-1] # Strip xml declaration and doctype
          svg.join("\n")
        end
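
Dropping the first two lines assumes that the XML declaration and the DOCTYPE each occupy exactly one line. A sketch of a variant that matches them by pattern instead, so it does not depend on the line layout (the helper name `strip_prolog` is an assumption for this example):

```ruby
# Remove a leading XML declaration and DOCTYPE from an SVG document,
# regardless of how many lines they span. Returns nil for nil input.
def strip_prolog(svg)
  return nil if svg.nil?
  svg.sub(/\A\s*<\?xml.*?\?>\s*/m, "")    # <?xml ... ?>
     .sub(/\A\s*<!DOCTYPE[^>]*>\s*/, "")  # <!DOCTYPE ... > without internal subset
end
```

Input that has no prolog is returned unchanged, which makes the helper safe to apply to already-stripped SVG.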

This simply reads the final cached file, rendering it first if it doesn't exist yet:


        def file
          file = File.read(target_file) rescue nil
          return file if file
          render
        end
    
      end
    
      #
      # Cache the graph, and return either a link, or inline SVG depending
      # on the preferred format
      #
      def self.graph_to_html(engine,graph,preferred_format=:inline_svg, url_base="/images/gviz/")
        hash = Graph.new(graph).cache
        Cache.new(hash,engine,preferred_format).to_html(url_base)
      end
    end
    

Rack Middleware

I put the above in lib/gviz in my blog app - adjust the require accordingly if not:


    
    require 'lib/gviz'
    
    require 'rack'
    
    module GViz
    
      class Controller
        def initialize app, base = "/images/gviz/"
          @app = app
          @base = base
          @uri_regexp = /\A(#{ENGINES.join("|")})_([0-9a-f]{32})\.(png|svg)\z/
        end

The regexp above is used to extract the engine, the MD5 of the cached graph file, and the image format from the URL if you use server-side rendered images.
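
As a standalone sketch of how such a pattern pulls a file name apart, the snippet below uses a variant with the dot escaped and with \A and \z anchoring the whole string (in Ruby, ^ and $ match at line boundaries). `URI_PATTERN` and `parse_graph_name` are names made up for this example.

```ruby
ENGINES = %w{dot neato twopi circo fdp sfdp}

# engine, underscore, 32 hex digits, literal dot, image format.
URI_PATTERN = /\A(#{ENGINES.join("|")})_([0-9a-f]{32})\.(png|svg)\z/

# Return the parts of a graph file name, or nil if it does not match.
def parse_graph_name(name)
  m = URI_PATTERN.match(name)
  m && { engine: m[1], hash: m[2], format: m[3] }
end
```

Anything that is not exactly engine, MD5 and format yields nil, which is what lets the handler reject unexpected paths before they reach the file system or a command line.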

If the URL doesn't start with the specified base, pass the URL on to the next in the chain:


        def call env
          uri = env["REQUEST_URI"]
    
          if (uri[0.. @base.size-1] != @base)
            return @app.call(env) if @app
            return Rack::Response.new("(empty)").finish
          end

If it doesn't match the Regexp, give an error - someone is being naughty:


          match = uri[@base.size .. -1].match(@uri_regexp)
          if !match
            return Rack::Response.new("Invalid URL",403).finish
          end

Otherwise, get the file from the cache:


          engine = match[1]
          hash   = match[2]
          format = match[3]
    
          cache = GViz::Cache.new(hash,engine,format)
          file = cache.file
          return Rack::Response.new("No such graph",404).finish if !file
    
          r = Rack::Response.new
          r["Content-Type"] = format == "png" ? "image/png" : "image/svg+xml"
          r.write(file)
          r.finish
        end
      end
    
    end
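
The pass-through part of the middleware can be tried out without a server or the rack gem, since a Rack app is just an object whose call method takes an env Hash and returns a status, headers and body triple. The sketch below is a stripped-down illustration only: the class name `PrefixGate` and the response bodies are made up, and it reads PATH_INFO (defined by the Rack spec) where the code above reads REQUEST_URI.

```ruby
# Minimal Rack-style middleware illustrating the pass-through check only.
class PrefixGate
  def initialize(app, base = "/images/gviz/")
    @app  = app
    @base = base
  end

  def call(env)
    path = env["PATH_INFO"].to_s
    # Anything outside the base path goes to the next app in the chain.
    return @app.call(env) unless path.start_with?(@base)
    # A real handler would look up and serve the graph here.
    [200, { "content-type" => "text/plain" }, ["graph: #{path[@base.size..-1]}"]]
  end
end

# Exercising it with a hand-built env and a lambda as the downstream app:
downstream = lambda { |env| [200, { "content-type" => "text/plain" }, ["blog page"]] }
gate = PrefixGate.new(downstream)
```

Calling `gate.call("PATH_INFO" => "/about")` hands the request to the downstream lambda, while a path under /images/gviz/ is answered by the middleware itself.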

As an example of how to call it from your code, my Markdown filter calls this:


        GViz.graph_to_html(format, graph, :png)

... to generate the HTML for any Graphviz graphs embedded in my pages. Since I've specified PNG output, it caches the graph to a file if not already present and just outputs an image link.

And my config.ru simply includes this:


       use GViz::Controller
