5 min read by Zeke Nierenberg

Hindsight: Contextual Logging at Fraight

A few months ago, we decided to do something a little bit risky. We invested meaningful engineering resources into something far afield from our core product. We built a tool to analyze our application logs. It turned out to be one of the biggest development productivity wins we’ve had. In this post I’ll cover what, why and how we built Hindsight. I hope that publishing this post is the first step to open sourcing Hindsight. Right now, its far too coupled to our stack and our needs. But if you’d like to help us build a more abstract tool, we’re hiring ????.

What

When our software sends a message to a truck driver, what happens? Which queries run? What transaction are those queries on? Who initiated the request in the first place? If an exception occurs, does the transaction rollback correctly? The answers to all these questions are lying in the application logs. Traditionally, developers squint at their process’ stdout to try to understand “what the heck is going on.” Hindsight lets us understand every log line in in its context. An example of a context is an http request, job, or even test — the unit of the application fundamentally caused this log entry to exist. Hindsight doesn’t really care how logs are generated or stored. It just needs a flat file of line delimitated json to analyze. In development, our app is appending log messages to a file. Hindsight watches the file and re-renders when new messages are inserted. This means that it works in real time. Hindsight sits open on a second monitor alongside chrome dev tools. Hindsight isn’t just used in development. Since all it needs is a file, when we have a production emergency, we can analyze a filtered subset of our application logs. We even went so far as to write a Slack bot to allow our ops employees to produce log dump files at the moment a problem occurs and send them to an engineer to look at. The UI also has basic filtering of contexts. I can search for /messages and see logs only pertaining to http requests in the message API.

Why

I’m reading John Ousterhout’s wonderful new book “A Philosophy of Software Design” right now.

The greatest limitation in writing software is our ability to understand the systems we are creating… Over time, complicity accumulates, and it becomes harder and harder for programmers to keep all of the relevant factors in their minds as they modify the system… Good development tools can help us deal with complexity, and many great tools have been created over the last several decades

(He goes on to articulate how good tools aren’t enough, and you must write modular, well designed code… read the rest of the book for that part!)

We built Hindsight so that we could understand the most complex parts of our application as a story. Narrative is one of the most powerful reasoning tools in our cognitive toolbox. It helps us remember, explore, and play with ideas.

How

Making our logs structured

It would have been impossible to build Hindsight without structured json logs. We replaced console.log with Bunyan. We had to do a couple things to make this work, for example we wrote a sequelize instance serializer. We also had to tell sequelize to log every query (something that likely won’t scale, but is incredibly useful for where we are now)

Building contexts

One of the hardest parts of the project, and one of the trickiest to open source was building contexts. Consider the following.

// Route router.post('/messages', asyncWrap(async (req, res) => { const message = await db.model('message') .create(clean(req.body)); res.status(201).json(message); })); // Model
Message.addHook('afterCreate', async (message) => { const mentions = AtMention.parseString(message.content); for (const mention of mentions) { await db.model('mention').create(mention); // ???? will log `INSERT INTO mention ...`
} });

In our route, we create a message from a cleaned request body. In the model, we do some string parsing to detect at mentions. How does the log statement (INSERT INTO… ) get tied to our request context? The tricky thing here is getting any data about the request in the model. Information about the request isn’t available in an after create hook, which is called by sequelize (in this case). When you're passing data between middleware, its a common pattern to stick it on the req object (req.user for example). But, In this case, we don’t have req at all in the model… what do we do? We turned to a technique called Continuation Passing. Specifically we used an npm package called Continuation Local Storage (CLS).

CLS let's us create a memory space that’s shared between disparate functions in node. We wrote some middleware that attaches one of our RequestContext objects to a CLS namespace on each request.

class RequestContext { 
	constructor() { 
    	this.id = uuidv4(); 
      } 
   /** ... */ 
} 
function clsMiddleware(clsNamespace) {
	return function (req, res, next) { 
    		clsNamespace.set('context', new RequestContext()); next(); 
    } 
}

We intercept messages passed to Bunyan and find their context.

function contextSerializer(context) { return { id: context.id, type: context.contructor.name }; } const log = bunyan.createLogger({ /** ... **/ serializers: { /** ... **/ context: contextSerializer } }); module.exports = ['fatal', 'error', 'warn', 'info', 'debug', 'trace'].reduce((moduleExports, method) => { moduleExports[method] = function (obj, message) { obj.context = clsNamespace.get('context'); return log[method](obj, message); }; return moduleExports; }, module.exports);

This makes our log output look like:

{/** ... **/} { "context": { "id": "eb21f34e-aa9a-495a-8133-238644240331", "type": "RequestContext" }, "sql": "INESRT INTO ..."} {/** ... **/}

Viewing the data

We built an electron app with vanilla js to watch the log files. The main process uses fs.watch to observe changes. It tells the render process over RPC about new lines. The render process in a few hundred lines of javscript plus PhotonKit.

Hindsight has been surprisingly stable and versatile. But, its scope has expanded some.

Over the months since we’ve released Hindsight, we’ve added a couple of nice features.

  • Autoformat SQL queries, plus add query time
  • Fade out parts of error stack traces that aren’t in our code
  • Hyperlinks to source code that open in the developers $EDITOR
  • Prod log snapshots and streams

Zeke Nierenberg