I’ve posted the code for a project that I’ve been thinking about for a while: CoI.
At this point the project is a draft, an early work in progress, but I wanted to actually get it started and work on some code, something I haven’t done enough of lately.
The goal with CoI is to have a single place to record and track incident post-mortems. I’ve worked at quite a few places, and most had terrible post-mortem practices that left things unresolved, untracked, and unfixed. It’s driven me crazy.
If you know that something can cause a production outage because it already has, and you’ve identified the fix, should you really accept that fix being thrown into a team’s backlog and just... left there? It’s not a new and exciting feature. It’s not something that is going to move the needle for customer adoption. It’s probably just not all that interesting. That fix can go ignored by the engineering team and project managers for months, and while it waits to be addressed your site is still vulnerable.
The intent with CoI is to surface those action items and make ownership clear: who owns the original incident, and who needs to do the work identified to prevent it from happening again. People have come up with ways to do this using other issue tracking systems, but I’ve seen those attempts fail.
In any case, the draft is up, and I plan on working on it occasionally to build it into something more ready to use.
-Nathan