Like most bloggers, I like to keep an eye on where my traffic is coming from, especially when there are surges. I'm using both Google Analytics and Feedburner for stats, and they work great for trends, but not for seeing what's happening right now.
This morning I needed a distraction and figured I'd just throw together a quick and dirty Rack middleware class to keep track of the latest referrers.
What I ended up doing was keeping a rolling buffer in an array that holds the last N referrers, and generating a histogram from that as needed. I'm not interested in accuracy, since I have the logs + Google Analytics + Feedburner to get the daily totals, so I didn't bother persisting the buffer to disk or anything - if I restart my app the stats will reset. This is just to get a live picture of what's going on right now.
The downside is that this approach does not scale beyond a single process, since each process keeps its own buffer. If you want it to, you really do want to persist the data to a database or something, though that adds a lot of overhead. Maybe I'll do that next - it's easy, but until my blog has a lot more traffic I don't really have the motivation.
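For what it's worth, here is a rough sketch of the "or something" option: a buffer kept in a JSON file behind an exclusive file lock, so several processes on the same host can share it. Everything in it is an assumption made for illustration (the SharedBuffer name, the JSON format, the file path you pass in); it is not what the class below does, and it shows exactly the overhead I mentioned, since every hit now costs a lock plus a file read and write.

```ruby
require "json"

# Hypothetical sketch: a rolling buffer of [referer, request] pairs stored in
# a JSON file so that several processes on one host can share it.
class SharedBuffer
  def initialize(path, limit = 100)
    @path  = path
    @limit = limit
  end

  # Append one pair and drop the oldest entries beyond the limit.
  def add(ref, req)
    with_lock do |f|
      entries = read_entries(f)
      entries << [ref, req]
      entries.shift while entries.size > @limit
      f.rewind
      f.truncate(0)
      f.write(JSON.generate(entries))
      f.flush
    end
  end

  # Current contents as an array of [referer, request] pairs, oldest first.
  def entries
    with_lock { |f| read_entries(f) }
  end

  private

  # Open the file (creating it if needed) and hold an exclusive lock for the
  # duration of the block; the lock is released when the file is closed.
  def with_lock
    File.open(@path, File::RDWR | File::CREAT, 0644) do |f|
      f.flock(File::LOCK_EX)
      yield f
    end
  end

  def read_entries(f)
    f.rewind
    data = f.read
    data.empty? ? [] : JSON.parse(data)
  end
end
```

Two processes that create a SharedBuffer with the same path see the same entries; the histogram could then be built from #entries instead of an in-memory array.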
Here's the class (yes, I know "referer" is misspelled, but it matches the HTTP header):
module LatestReferers
  class Gather
    def initialize app, opts = {}
      @app = app
      @referers = []
      @limit = 100
      @exclude = []
      opts.each do |k,v|
        @limit = v.to_i if k == :limit
        @exclude = v if k == :exclude
      end
    end

    def call env
      ref = env["HTTP_REFERER"] || "-"
      req = env["REQUEST_URI"]
      # Only record the hit if no exclude pattern matches the request or the referrer
      if !@exclude.detect{|pat| req =~ pat || ref =~ pat }
        @referers << [ref,req]
        # Drop the oldest entry once we're over the limit
        @referers.shift if @referers.size > @limit
      end
      env["hokstad.latestreferers"] = self
      @app.call(env)
    end

    def histogram
      h = {}
      @referers.each do |ref,req|
        h[ref] ||= {:total => 0}
        h[ref][req] ||= 0
        h[ref][req] += 1
        h[ref][:total] += 1
      end
      # Most frequent referrers first
      h.sort_by{|ref,pages| -pages[:total]}
    end
  end
end
In turn:
- #initialize takes the next app and a hash of options. Currently it recognizes :limit, which controls how many referrers it will track, and :exclude, which takes an array of regexps to check against both the request URI and the referrer for patterns to reject - I'm not interested in internal referrals within my site, or referrals to the page I use to view the referral stats.
- #call just gets the fields, checks them against the patterns, and if none of them match, it adds the referrer and page to the end of the array and removes the first entry if the array exceeds the limit, to create a FIFO queue.
- #histogram builds a hash of hashes mapping each referrer to page names and the number of times each page has been accessed, plus a total, and returns it sorted by total, highest first (as an array of [referrer, hash] pairs, since that's what sort_by gives you).
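To make the shape of the #histogram result concrete, here is the same counting logic pulled out into a standalone method so it runs without Rack. The sample data is made up for illustration.

```ruby
# Standalone copy of the counting logic from Gather#histogram.
def histogram(referers)
  h = {}
  referers.each do |ref, req|
    h[ref] ||= { :total => 0 }
    h[ref][req] ||= 0
    h[ref][req] += 1
    h[ref][:total] += 1
  end
  # sort_by on a Hash returns an Array of [key, value] pairs.
  h.sort_by { |ref, pages| -pages[:total] }
end

# Made-up sample: [referer, requested page] pairs, oldest first.
sample = [
  ["http://example.com/", "/a"],
  ["-",                   "/a"],
  ["http://example.com/", "/b"],
  ["http://example.com/", "/a"],
]

result = histogram(sample)
result.each { |ref, pages| puts "#{ref}: #{pages.inspect}" }
```

Each element of the result is a pair: the referrer, then a hash with one entry per page plus the :total key. Anything iterating over the pages therefore also sees :total as if it were a page.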
#call also passes the object on in the environment. I do this to reduce coupling - you can render the page in the framework of your choice, as long as it has a Rack adapter and gives you access to the environment, or you can use a simple Rack middleware such as the one I'll show below, or write your own. Since it depends only on Rack, you can put this in front of most Ruby frameworks, including Rails if you so choose.
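As a sketch of what "access to the environment" buys you, here is a bare Rack-style app (just an object responding to #call) that reads the object back out and prints one line per referrer. FakeStats and ReferrerSummary are hypothetical names; FakeStats only stands in for the Gather instance so the example runs without Rack loaded.

```ruby
# Hypothetical stand-in for LatestReferers::Gather; in a real stack the
# middleware puts itself into the environment instead.
FakeStats = Struct.new(:rows) do
  def histogram
    rows
  end
end

# A Rack-style app is anything that responds to #call(env) and returns
# [status, headers, body].
ReferrerSummary = lambda do |env|
  stats = env["hokstad.latestreferers"]
  if stats.nil?
    [500, { "content-type" => "text/plain" }, ["Gather middleware not installed\n"]]
  else
    lines = stats.histogram.map { |ref, pages| "#{pages[:total]} #{ref}" }
    [200, { "content-type" => "text/plain" }, [lines.join("\n") + "\n"]]
  end
end

# Simulate a request that has already passed through the gathering middleware.
env = {
  "hokstad.latestreferers" =>
    FakeStats.new([["http://example.com/", { :total => 2, "/a" => 2 }]])
}
status, headers, body = ReferrerSummary.call(env)
```

The only thing the downstream code needs to know is the environment key and the #histogram method, which is the whole point of passing the object along.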
The class above can be plugged in by requiring the file you put it in, and adding something like this to your config.ru file if you use Rackup, or by adding the class to whatever Rack setup you use:
use LatestReferers::Gather, {:exclude => [ /\/referers/, /http:\/\/www\.hokstad\.com/, /\.xml/, /\/feed/, /\.rdf/ ]}
The above is the config I use for this site.
If you just want a simple table of the results, you can use something like this. I just want the numbers; I don't care how the page looks:
module LatestReferers
  class View
    def initialize app, page
      @app = app
      @page = page
    end

    def show(ref)
      return Rack::Response.new("Missing 'latestreferers' object",500).finish if !ref
      r = Rack::Response.new
      r.write("<html><head/><body>")
      r.write("<table border='1'><tr><th>Referer</th><th>Pages</th></tr>\n")
      ref.histogram.each do |k,v|
        # Referrers and URIs come from the client, so escape them before output
        r.write("<tr><td>#{Rack::Utils.escape_html(k)}</td> <td><table>")
        # v also contains the :total key, so the total shows up as the first row
        v.sort_by{|page,count| -count}.each do |page,count|
          r.write("<tr><td>#{count}</td><td>#{Rack::Utils.escape_html(page.to_s)}</td></tr>")
        end
        r.write("</table></td></tr>\n")
      end
      r.write("</table></body></html>")
      r.finish
    end

    def call env
      # Serve the stats page for an exact match; pass everything else through
      if env["REQUEST_URI"] == @page
        show(env["hokstad.latestreferers"])
      else
        @app.call(env)
      end
    end
  end
end
That serves as a simple example of using Rack::Response too - it's completely optional, and you can stream out any template from your favorite templating system instead of hardcoding the HTML, but for this I just wanted something with no other external dependencies than Rack.
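If you'd rather not hardcode the HTML but still want no extra dependencies, ERB ships with Ruby. This is only a sketch: REFERER_TEMPLATE and TableRenderer are names made up for the example, and the rows it expects are the pairs returned by #histogram.

```ruby
require "erb"

# Hypothetical template. The "-" trim mode lets "<%-" and "-%>" swallow the
# whitespace and newlines around control tags.
REFERER_TEMPLATE = ERB.new(<<~HTML, trim_mode: "-")
  <table border='1'><tr><th>Referer</th><th>Pages</th></tr>
  <%- @rows.each do |ref, pages| -%>
  <tr><td><%= h(ref) %></td><td><table>
  <%- pages.sort_by { |page, count| -count }.each do |page, count| -%>
  <tr><td><%= count %></td><td><%= h(page.to_s) %></td></tr>
  <%- end -%>
  </table></td></tr>
  <%- end -%>
  </table>
HTML

class TableRenderer
  include ERB::Util # provides #h (html_escape) inside the template

  # rows: an array of [referer, {page => count, :total => n}] pairs.
  def initialize(rows)
    @rows = rows
  end

  # Evaluate the template with this object's binding and return the HTML.
  def render
    REFERER_TEMPLATE.result(binding)
  end
end
```

Inside the view, the hardcoded table could then be replaced with something like r.write(TableRenderer.new(ref.histogram).render).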
There are probably a lot of things I could do to the view code, but it's a throwaway hack - I just want to be able to see at a glance if anything interesting is happening. If you want a pretty page, it's easy enough to use the above as a starting point.
You can see the live result of using the above classes here with this config (expect it to be reset quite often, and I only track the last 100, so don't expect it to show a huge list):
use LatestReferers::Gather, {:exclude => [ /\/referers/, /http:\/\/www\.hokstad\.com/, /\.xml/, /\/feed/, /\.rdf/ ]}
use LatestReferers::View, "/referers"