Quality metrics: Build sustainable software
Quality metrics track the shifting technical health of an application. The team needs to monitor escaped defects, automated test coverage, test run times, accumulation of technical debt, performance, and downtime. All products and custom applications require constant attention.
New features are released.
Technologies mature or deprecate.
Team members come and go.
The market shifts over time.
Software needs to adapt to changes within the business, market, and customer base. Scrapping the codebase and starting from scratch burns capital and time. To keep a product evergreen, craft clear metrics. Modular design enables teams to maintain, refactor, and improve products incrementally.
Systemic observation and correction of quality metrics prevent software from becoming a legacy application. Treat all products as evergreen.
On the surface, business value metrics and quality metrics appear unrelated. However, failing to maintain a proper testing strategy or tolerating a volatile production environment in the long term impedes business outcomes and customer success. It is thus important to establish a system of metrics instead of looking at each in isolation when prioritizing investment.
Escaped defect reports
Each user story goes through several development and testing stages before meeting the Definition of Done and being released into production. While techniques vary between Scrum, extreme programming, and test-driven development, our recommended best practice is for each engineer to own both implementation and rudimentary testing of a story. In other words, the feature needs to work before a story is handed over to testing engineers. Testing engineers provide an additional layer of strategy with multiple tools and methodologies at their disposal (e.g., end-to-end tests, unit tests, integration tests, interface tests, and manual tests).
To flag potential issues needing repair, track two types of defects:
those caught by testing engineers within the sprint.
those reported in the production environment that escape initial detection.
Testing engineers document both types of defects with a project management tool like Jira or Azure DevOps, which allows the team to report aggregate numbers over time. Detect and resolve defects in the context of continuous improvement.
| Monitor escaped defects across teams and look for outlier teams with an extremely low or high number of defects. | Evaluating the testing strategy and domain complexity in both scenarios empowers the team to make educated adjustments to the overall testing strategy. |
| Review the root cause of each escaped defect in terms of the overarching narrative unearthed in the defect reports. | Understanding the root cause helps the team revise the testing strategy and prevent future defects. |
| Investigate internal defects within the team that stem from a lack of domain knowledge, onboarding, or skill set and cause a significant loss of efficiency. (The recommended ratio of engineers to testers is around 3:1 or 2:1.) | Looking at the ratio of engineers to testers ensures the team operates efficiently. (A team operating with an equal number of engineers and testers may indicate issues with the testing strategy or the team’s competence.) |
| Keep a record of defect metrics before and after introducing a test automation strategy. | These numbers communicate the ROI of a sound strategy that requires investment. |
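As a rough illustration of the first practice in the table, the sketch below computes the share of defects that escaped to production per team and flags teams that sit far from the median. The `DefectCounts` type, the `tolerance` threshold, and the sample numbers are assumptions made for this example; in practice the counts would come from an export of the project management tool.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class DefectCounts:
    team: str
    caught_in_sprint: int  # defects found by testing engineers within the sprint
    escaped: int           # defects first reported in the production environment


def escape_rate(counts: DefectCounts) -> float:
    """Share of all recorded defects that escaped to production (0.0 to 1.0)."""
    total = counts.caught_in_sprint + counts.escaped
    return counts.escaped / total if total else 0.0


def outlier_teams(teams: list[DefectCounts], tolerance: float = 0.15) -> list[str]:
    """Teams whose escape rate differs from the median rate by more than `tolerance`.

    The tolerance is an arbitrary illustrative threshold, not a recommendation.
    """
    mid = median(escape_rate(t) for t in teams)
    return [t.team for t in teams if abs(escape_rate(t) - mid) > tolerance]


# Hypothetical sample data for three teams.
SAMPLE = [
    DefectCounts("Team A", caught_in_sprint=40, escaped=10),
    DefectCounts("Team B", caught_in_sprint=45, escaped=5),
    DefectCounts("Team C", caught_in_sprint=10, escaped=30),
]

if __name__ == "__main__":
    for t in SAMPLE:
        print(f"{t.team}: escape rate {escape_rate(t):.0%}")
    print("Outliers:", outlier_teams(SAMPLE))
```

A flagged team is a prompt to review its testing strategy and domain complexity, not a verdict on the team.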
Automation maturity & test coverage reports
A testing strategy encompasses functional and nonfunctional validation of the product (e.g., manual and automated performance, accessibility, and user interface tests). Test coverage monitors the percentage of functionality covered by unit tests or end-to-end tests. Successful testing strategies lower lifetime QA costs: the number of defects drops sharply, technical debt decreases, and delivery maintains a higher velocity.
A counterproductive industry expectation is for every product to have a large percentage of testing automation (e.g., 80 percent of functionality covered by tests). These standards become inefficient when the feature set being evaluated is low value or non-mission-critical. The spend becomes a financial burden, difficult for the business to justify.
The following are best practices to review when establishing metrics for automation maturity and test coverage:
Agree on automation depth. While automation is preferred, it is often an underinvested area of a testing strategy, especially when time to market is the focus. When it is critical to minimize the number of defects, however, full automation takes precedence over budgetary or schedule restrictions.
Embed elements in the toolkit. Include static and dynamic code analysis, test case management with TestRail™ or other tools, and integration with security monitoring tools.
Establish healthy documentation standards. Choose living documentation over static artifacts for historical testing evidence. Opt for a self-explanatory behavior-driven development (BDD) test to drive the acceptance of stories.
Collaborate with business analysis teams. Ensure test engineers work together with product managers and business stakeholders to expand the story acceptance criteria, as well as define developer checklists to verify when writing code for a story.
Integrate testing into the build pipeline. Include automated tests that run each time code is checked into the source repository, and reject deployment when they detect defects.
Automate all tracked metrics in Jira or a project management tool. Consider tracking the defect resolution time from discovery through release and defect resolution cost. Compare these metrics over time to determine if the testing strategy continuously improves the team’s output and efficiency.
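The comparison over time described in the last practice can be sketched as follows. This is a minimal example that assumes each defect is available as a pair of dates (discovery and release of the fix). The sample data and the choice to group by release month are illustrative assumptions, not a prescribed report format.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample: (date discovered, date the fix was released).
DEFECTS = [
    (date(2024, 1, 3), date(2024, 1, 13)),
    (date(2024, 1, 10), date(2024, 1, 24)),
    (date(2024, 2, 1), date(2024, 2, 7)),
    (date(2024, 2, 12), date(2024, 2, 16)),
]


def mean_resolution_days_by_month(defects):
    """Average days from discovery through release, keyed by (year, month) of release."""
    buckets = defaultdict(list)
    for discovered, released in defects:
        buckets[(released.year, released.month)].append((released - discovered).days)
    return {month: mean(days) for month, days in sorted(buckets.items())}


if __name__ == "__main__":
    for (year, month), days in mean_resolution_days_by_month(DEFECTS).items():
        print(f"{year}-{month:02d}: {days:.1f} days on average")
```

A falling average from one period to the next suggests the testing strategy is improving the team's efficiency. A rising average is a cue to revisit it.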
Technical debt report
Technical debt is a software engineering term that describes the accumulation of undesirable decisions in the codebase. When writing code, the team settles for a quick fix over a better approach that takes longer to execute. While each compromise appears sensible, undesirable decisions in the codebase add up and require rework (aka technical debt). These decisions, intentional or unintentional, result in code that is hard to maintain, inhibits product longevity, and lowers team velocity.
Teams always work within constraints of time, quality, and cost. Consequently, they incur debt to ship a product to market faster. Accumulating some debt, especially when prioritizing market launch speed, is acceptable. However, too much debt has the power to cripple a team and product with performance issues or poorly designed architecture that deteriorates the maintainability of the software. To account for this reality, track the debt so that it is easily addressed in future hardening sprints.
1 - Bad debt
Unprofessional & intentional
Arises when bad code is written intentionally due to laziness, ignorance, or other unethical reasons.
THE FIX: Tackle technical debt and hold the team accountable for following best practices.
2 - Good debt
Intentional & professional
Occurs when the team selects an easier, faster solution intentionally, fully aware of the long-term impediments.
THE FIX: Have a senior team member evaluate the pros and cons to determine whether the benefits of delivering outweigh the compromises.
3 - Lack of experience
Unprofessional & accidental
Stems from decisions that require rework after completing code review.
THE FIX: Use this type of debt as a vehicle for inexperienced team members to improve and learn.
4 - Mistake debt
Accidental & professional
Happens when a mature team with technical skills makes a bad decision with too little context or under time constraints.
THE FIX: Allocate time to pinpoint and resolve mistakes incurred by design decisions or changing requirements.
Similar to product debt, the main objective for a technical debt metric is to capture all known issues, estimate remediation, and continuously prioritize a certain amount of time for resolution. Create a technical debt burndown chart to track the remaining work against time. Beware if the backlog growth outpaces resolution velocity.
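The burndown and the warning sign described above can be expressed in a few lines. The sketch below assumes debt items are estimated in some effort unit (story points, hours) and summarized per sprint. The `SprintDebt` type, the three-sprint window, and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SprintDebt:
    sprint: str
    added: float     # estimated remediation effort of newly logged debt
    resolved: float  # estimated effort of debt paid down in the sprint


def burndown(opening_balance: float, sprints: list[SprintDebt]) -> list[tuple[str, float]]:
    """Remaining debt after each sprint: the data points of a burndown chart."""
    points = []
    balance = opening_balance
    for s in sprints:
        balance += s.added - s.resolved
        points.append((s.sprint, balance))
    return points


def growth_outpaces_resolution(sprints: list[SprintDebt], window: int = 3) -> bool:
    """True when more debt was added than resolved over the last `window` sprints."""
    recent = sprints[-window:]
    return sum(s.added for s in recent) > sum(s.resolved for s in recent)


# Hypothetical sample data.
HISTORY = [
    SprintDebt("Sprint 1", added=5, resolved=8),
    SprintDebt("Sprint 2", added=6, resolved=4),
    SprintDebt("Sprint 3", added=9, resolved=3),
]

if __name__ == "__main__":
    for sprint, remaining in burndown(40, HISTORY):
        print(f"{sprint}: {remaining} remaining")
    if growth_outpaces_resolution(HISTORY):
        print("Warning: debt is growing faster than it is being resolved.")
```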
Application performance report
A performance strategy correctly sets initial targets while also providing the testing framework to monitor and ensure ongoing compliance. Poor performance like a long load time detracts from a positive customer experience. Worse yet, performance issues frustrate users to the point where they leave and never come back.
To track efficacy, run tests and reports on the server side and client side to see how the application functions. Once in flight, analyze the performance requirements of new features proactively (e.g., data design). Then respond to the results from testing activities.
For user-centric metrics
Google’s RAIL performance model is a solid starting point:
Response: User input response occurs in 100 milliseconds or faster
Animation: Display transitions and animations smoothly at 60 frames per second
Idle: Load as little data as possible first, then use idle time to load the rest
Load: Content appears in 5 seconds or faster
Try PerformanceObserver for client-side performance testing to track first contentful paint (FCP), first meaningful paint (FMP), time to interactive (TTI), and more. For custom metrics such as single component initialization, render, or patch times, consider using the User Timing API. To monitor client-side performance, implement client app integration with a monitoring service (e.g., Kibana, Splunk, New Relic), and push client-side performance metrics to the monitoring service for continuous tracking.
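Once client-side measurements reach the monitoring service, they can be compared against the RAIL targets listed above. The following sketch shows one way to do that after collection. The metric names (`input_response_ms`, `frame_time_ms`, `load_ms`) are hypothetical and would need to match whatever the client app actually reports. The Idle guideline is omitted because it has no single numeric target in the list.

```python
# RAIL targets from the list above, expressed in milliseconds.
RAIL_BUDGETS_MS = {
    "input_response_ms": 100.0,      # Response: 100 ms or faster
    "frame_time_ms": 1000.0 / 60.0,  # Animation: 60 frames per second
    "load_ms": 5000.0,               # Load: 5 seconds or faster
}


def rail_violations(measured: dict[str, float]) -> dict[str, float]:
    """Map each over-budget metric to the number of milliseconds it overshoots.

    Metrics without a budget in RAIL_BUDGETS_MS are ignored.
    """
    return {
        name: value - RAIL_BUDGETS_MS[name]
        for name, value in measured.items()
        if name in RAIL_BUDGETS_MS and value > RAIL_BUDGETS_MS[name]
    }


if __name__ == "__main__":
    # Hypothetical measurements pushed by a client app.
    sample = {"input_response_ms": 80.0, "frame_time_ms": 20.0, "load_ms": 6200.0}
    for name, overshoot in rail_violations(sample).items():
        print(f"{name} exceeds its budget by {overshoot:.1f} ms")
```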
For server-side metrics
Track resource/endpoint response time characteristics such as median, average, error rate, and percentiles. Ensure that high percentiles (95th or 99th) fall below five seconds under a high, but typical load.
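To make these characteristics concrete, the sketch below summarizes a list of response times for one endpoint and checks the five-second target. It uses the nearest-rank definition of a percentile, which is one common choice. Monitoring services may interpolate differently, so their numbers can deviate slightly. The function names are illustrative.

```python
import math
from statistics import mean, median


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(durations_ms: list[float], error_count: int) -> dict[str, float]:
    """Response time characteristics for one endpoint.

    Assumes durations_ms holds one entry per request, failed requests included.
    """
    return {
        "median_ms": median(durations_ms),
        "mean_ms": mean(durations_ms),
        "p95_ms": percentile(durations_ms, 95),
        "p99_ms": percentile(durations_ms, 99),
        "error_rate": error_count / len(durations_ms),
    }


def within_target(summary: dict[str, float], limit_ms: float = 5000.0) -> bool:
    """True when both the 95th and 99th percentiles fall below the limit."""
    return summary["p95_ms"] < limit_ms and summary["p99_ms"] < limit_ms


if __name__ == "__main__":
    # Hypothetical load-test result: 100 requests, 2 of which failed.
    durations = [float(ms) for ms in range(50, 5050, 50)]
    report = summarize(durations, error_count=2)
    print(report, "within target:", within_target(report))
```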
JMeter is the gold standard for performance testing. Use monitoring services such as Kibana, Splunk, New Relic, or InfluxDB. The same instance of a server-side monitoring service can track client-side metrics as well.
Forecasting performance needs
Establish base target numbers to forecast future processes for a targeted implementation including the following:
The load hitting a feature. Use existing data to determine patterns and necessary volume. For current products, refer to the latest monitoring to forecast the typical traffic on a specific feature. For new products, make educated guesses about concurrent users and feature usage frequency. Then make corrections once the MVP reaches the market.
The data variations/distribution. Recognizing that many performance issues are data-driven, examine the application in light of datasets that may be used at high volume or concurrency. Include data research as part of exploratory testing. In the majority of cases, data exists and is accessible before implementation. Simply query the required distributions from the respective databases or APIs.
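For a new product, the educated guess about load can start as simple arithmetic. The sketch below converts assumed concurrent users and usage frequency into a request rate. The `peak_factor` multiplier and all sample numbers are assumptions for illustration and should be replaced with monitoring data once the MVP is in the market.

```python
def forecast_requests_per_second(
    concurrent_users: int,
    actions_per_user_per_minute: float,
    peak_factor: float = 1.0,
) -> float:
    """Rough request rate a feature must sustain.

    peak_factor is a hypothetical safety multiplier for traffic spikes.
    """
    return concurrent_users * actions_per_user_per_minute / 60.0 * peak_factor


if __name__ == "__main__":
    # Guess: 2,000 concurrent users, each triggering the feature 3 times a minute.
    typical = forecast_requests_per_second(2000, 3)
    peak = forecast_requests_per_second(2000, 3, peak_factor=1.5)
    print(f"typical load: {typical:.0f} req/s, peak load: {peak:.0f} req/s")
```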
Because of the high cost, performance testing environments often do not have back-end resources equivalent to those of production environments. Take this limitation into account when establishing metrics and the testing strategy.