Engineering excellence is about delivering software you can be proud of. Great software offers features that support well-articulated business and user needs (see Product Hierarchy of Needs: Winning, Keeping and Growing Business). Having great features, however, is not enough. Quality characteristics such as usability, uptime, and performance drive user satisfaction. Well-thought-through processes create an environment in which teams can not only meet but exceed user expectations. This post introduces a continuous improvement process for assessing and driving engineering excellence.

Engineering Excellence Process

The process I have been using with my teams focuses on gradual improvement. Here is how it works. On a quarterly basis, we hold a working session with each team to self-assess the current state of engineering excellence. The process uses the scorecard shown below. The outcome of each session is an action plan to improve in at least one area. The objective is not to address every issue, gap, or risk at once, but to improve gradually while continuing to deliver features and functionality. Here is the process in more detail:

For each engineering excellence topic (see scorecard below)
  Assess the current state (green, yellow, red)
    - Green means the topic requires no attention (e.g. we are world-class)
    - Red means the topic requires immediate attention (e.g. there are issues or risks with significant 
      business impact)
    - Yellow means we could and should do better
If there are any red topics
  Devise a plan to address them
  Execute the plan ASAP
Else if there are any yellow topics
  Pick 1-3 topics to focus on over the next three months
  Develop an action plan for the selected topic(s)
Repeat the entire process next quarter
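The triage logic above can be sketched in a few lines of code. This is a minimal illustration, not part of the original process; the topic names, status strings, and the function name `plan_next_quarter` are all assumptions made for the example.

```python
# Illustrative sketch of the quarterly triage described above.
# Status values mirror the scorecard: "green", "yellow", "red".

def plan_next_quarter(assessment, max_yellow_topics=3):
    """Given a dict of topic -> status, return the topics to act on."""
    reds = [t for t, s in assessment.items() if s == "red"]
    if reds:
        # Red topics have significant business impact: address them all, ASAP.
        return reds
    yellows = [t for t, s in assessment.items() if s == "yellow"]
    # Otherwise pick a small number of yellow topics for the quarter.
    return yellows[:max_yellow_topics]

# Hypothetical assessment from one team's working session.
assessment = {
    "Release": "green",
    "Monitoring": "yellow",
    "Backup": "red",
    "QA": "yellow",
}
print(plan_next_quarter(assessment))  # ['Backup']
```

Note that red topics short-circuit the yellow selection entirely, matching the process: urgent risks are addressed before any gradual-improvement work is planned.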

Engineering Excellence Scorecard

The engineering excellence scorecard has two buckets: engineering processes and software quality attributes:

For each topic, the team fills in three columns during the session: Status, Notes, and Action Items. The sample questions below guide the assessment.

Engineering Processes

Product/Project Management
  - Do we have clearly articulated goals and measures of success for every project?
  - Do we have a clearly articulated scope for every project?

Development
  - Do we have a predictable and repeatable process?
  - Do we have clear doneness criteria?
  - Do we have any impediments to productivity?

Source Code Management
  - Do we have an effective branching model?

Release
  - Do we have a zero-downtime release process?

QA
  - Do we have 100% code review coverage?
  - Do we have 80%+ automated unit test coverage?
  - Do we have 80%+ automated integration test coverage?
  - Do our staging environments adequately represent production environments?

Monitoring
  - Are we alerted to all critical issues?
  - Are we warned before issues become critical?

Exception Handling
  - Are all exceptions handled?

Logging
  - Do we have gaps in logging?

Backup
  - Is all data backed up?
  - Have we tested recovery?

Disaster Recovery
  - Will our applications stay up when an entire data center goes down?

Software Quality Attributes

User Satisfaction
  - Do we have user satisfaction gaps?

Uptime
  - Do we have any single points of failure?
  - Is failover fully automated?
  - Do we comply with our uptime SLA?

Performance
  - Do we have clearly articulated performance SLAs?
  - Do we regularly test performance?

Scalability
  - Do we understand how much load our systems can handle?
  - Do we regularly stress test?
  - How quickly can we scale?

Security
  - Do we have any security exposure?

Regulatory Compliance
  - Do we have any legal exposure?
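To track assessments quarter over quarter, the scorecard can be represented as a simple data structure. The sketch below is one possible shape, assuming illustrative field names (`Topic`, `bucket`, `action_items`) that are not part of the original scorecard.

```python
# Illustrative sketch: the scorecard as data, so each quarter's
# self-assessment can be recorded and compared over time.
from dataclasses import dataclass, field

@dataclass
class Topic:
    name: str
    bucket: str                  # "process" or "quality attribute"
    status: str = "yellow"       # "green", "yellow", or "red"
    notes: str = ""
    action_items: list = field(default_factory=list)

# Hypothetical excerpt of one team's quarterly assessment.
scorecard = [
    Topic("QA", "process", status="green"),
    Topic("Uptime", "quality attribute", status="red",
          notes="Single point of failure in the load balancer",
          action_items=["Add a second load balancer", "Automate failover"]),
]

needs_attention = [t.name for t in scorecard if t.status == "red"]
print(needs_attention)  # ['Uptime']
```

Keeping the Notes and Action Items alongside each status makes the quarterly review concrete: the next session can start by checking whether last quarter's action items were completed.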