Amp up app performance with RAIL

Good performance is a requirement for user-facing web applications. In the context of web app performance, the word “good” is well defined by the RAIL model. This model breaks the app life cycle into four aspects (response, animation, idle, and load) and assigns each a strictly defined metric that matters to the user experience. The key targets to consider when designing and building fast, responsive, and intuitive user-focused software are listed below.

  • App responds to user input in under 100 ms
  • Each animation frame is produced in 16 ms or less (60 fps)
  • Idle time is maximized to ensure a quick response to user input
  • Interactive content loads in under 5000 ms

To improve anything meaningfully, we need a way to measure it. The RAIL metrics above can be used both to improve performance and to track its regression.
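
As a minimal sketch, the numeric targets above can be written down as data and checked against a measurement. The names `RAIL_BUDGETS_MS` and `withinBudget` are hypothetical and chosen only for illustration; the idle target has no fixed number in the list, so it is left out.

```typescript
// Budgets in milliseconds, taken from the list above.
// All names here are illustrative and not part of any library.
type BudgetedAspect = "response" | "animationFrame" | "load";

const RAIL_BUDGETS_MS: Record<BudgetedAspect, number> = {
  response: 100,
  animationFrame: 16,
  load: 5000,
};

// Returns true when a measured duration fits the budget for that aspect.
// Response and load are listed as "under", frames as "16 ms or less".
function withinBudget(aspect: BudgetedAspect, measuredMs: number): boolean {
  const budget = RAIL_BUDGETS_MS[aspect];
  return aspect === "animationFrame" ? measuredMs <= budget : measuredMs < budget;
}
```

A check like this could run wherever durations are already collected, for example at the end of an automated UI test.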

Performance regression

Performance regression can be caused by hardware updates, software updates, new logic, system design changes, growing data volumes, changes in user behavior, and more. Any of these can happen without notice. So many factors affect performance that it does not make sense to track metrics at the level of each individual unit. Similarly, when a change touches one or several units, estimating its cost to overall performance quickly gets very complex. Here, simulation works significantly better than imagination: the best thing the development team can do is deploy the code and track changes in the performance metrics.
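
Tracking changes in metrics between deployments can be sketched as a comparison against a stored baseline. This is one possible approach rather than a prescribed one: the types, the function name `findRegressions`, and the 10 percent default tolerance are all assumptions made for the example.

```typescript
// A metric sample: a name plus a duration in milliseconds (lower is better).
interface MetricSample {
  name: string;
  ms: number;
}

interface Regression {
  name: string;
  baselineMs: number;
  currentMs: number;
  changePct: number;
}

// Compares the current build against a baseline and reports every metric
// that got slower by more than tolerancePct. The tolerance absorbs
// run-to-run noise; 10 is an assumed default, not a recommendation.
function findRegressions(
  baseline: MetricSample[],
  current: MetricSample[],
  tolerancePct = 10,
): Regression[] {
  const regressions: Regression[] = [];
  for (const now of current) {
    const before = baseline.filter((b) => b.name === now.name)[0];
    if (before === undefined || before.ms <= 0) continue; // nothing to compare against
    const changePct = ((now.ms - before.ms) / before.ms) * 100;
    if (changePct > tolerancePct) {
      regressions.push({ name: now.name, baselineMs: before.ms, currentMs: now.ms, changePct });
    }
  }
  return regressions;
}
```

Metrics that are new in the current build are skipped here because they have no baseline; a real pipeline would need to decide how to treat them.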

Response

When testing web application performance, I strongly recommend separating client application performance testing from server-side performance testing, because the two areas require different approaches. Server-side performance testing requires mimicking a lot of infrastructure and simulating realistic traffic, whereas client application performance depends on the application code itself and the ability of the user’s machine to execute it.

The client application performance can be tested in every environment, even on a local machine. Because the client application communicates with both the users and the server, there are two types of actions that should be tracked within the client application: the response to the user input and the response to the server output. 

Imagine a user clicks a button to expand an element containing a list of items. On the client application level, we can break this into three parts:

Part 1: Click received (request sent to the server) – element expanded

Part 2: Element expanded – response received from the server

Part 3: Response received – items displayed inside the element

The first part shows how the client application handles user input. The RAIL performance model suggests keeping this response time under 100 milliseconds so that the application feels like it responds immediately. Because the client application has requested data from the server, it then waits for that data to load and signals the waiting state with a loading indicator (e.g. a spinner). For the client application, the second part is nothing but waiting for the resources and appreciating the spinner. The third part shows how the application handles server output.

Note: While the first part is obvious to the user and represents the app’s responsiveness, the third part goes unnoticed. However, its duration adds to the total load time as the user perceives it, so it is also important to keep the third part short.

It is even more important to track situations in which the client-side application handles user input itself, without a request to the server. Client-side manipulation of long data sets or complex data structures might result in an unexpected application crash, an incomplete action, or an action whose duration is never fully measured. It is therefore important to check whether testing or monitoring logs these types of errors.

Although some client application frameworks have similar performance metrics built in, and some are even supported by popular monitoring services, others lack this type of feature. In those cases, the User Timing API allows implementing custom time marks on custom events.
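
The three parts of the expand workflow described earlier could be captured with User Timing marks roughly as follows. This is a sketch under assumptions: the mark names and the helper `measureExpandParts` are hypothetical, and the small `TimingApi` interface exists only so the example does not depend on a browser. It declares the three User Timing calls used here: `mark`, `measure`, and `getEntriesByName`.

```typescript
// The subset of the User Timing API used here. In a browser, the global
// `performance` object satisfies this shape.
interface TimingApi {
  mark(name: string): unknown;
  measure(name: string, startMark: string, endMark: string): unknown;
  getEntriesByName(name: string): ReadonlyArray<{ duration: number }>;
}

// Hypothetical mark names for the "expand list" workflow.
const MARKS = {
  click: "expand:click",
  expanded: "expand:expanded",
  response: "expand:response",
  displayed: "expand:displayed",
};

// Turns the four marks into the three parts and returns their durations in
// milliseconds. Call it after the last mark has been set.
function measureExpandParts(timing: TimingApi): { part1: number; part2: number; part3: number } {
  timing.measure("expand:part1", MARKS.click, MARKS.expanded);
  timing.measure("expand:part2", MARKS.expanded, MARKS.response);
  timing.measure("expand:part3", MARKS.response, MARKS.displayed);
  const latest = (name: string): number => {
    const entries = timing.getEntriesByName(name);
    const entry = entries[entries.length - 1];
    return entry ? entry.duration : NaN; // NaN when the measure is missing
  };
  return {
    part1: latest("expand:part1"),
    part2: latest("expand:part2"),
    part3: latest("expand:part3"),
  };
}
```

In a browser, the application would call `performance.mark(MARKS.click)` in the click handler, set the other three marks at the matching points in the code, and then call `measureExpandParts(performance)`.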

Animations

When we think about animations, we usually imagine them during transitions (e.g. moving from one view to another, opening an element, or showing an animated loader or a splash screen). In these cases, animation fills the gaps in particular workflows. It can explain the gap or transition, which makes the system more intuitive. However, to avoid damaging the user experience, animations should not be displayed for longer than the gap itself, regardless of how beautiful they are.

Example 1

A user is following a workflow for the 10th time, or multiple times in a row. They are probably already familiar with all the animations that appear and want to be taken to the next step as soon as the content is ready. Following RAIL, you should keep gaps in the workflow under one second. Any pause longer than that hurts the user’s focus on the task they are performing.

Example 2

Another great example is the Yeti Login Form by Darin Senneff, in which the animation is shown in parallel with the workflow. In this case, the animations do not block the flow. As a result, users can proceed at their own speed, whether it is their first or 10th time performing the task.

Animations should always look smooth. The usual target for visual smoothness on the web is 60 frames per second, which matches the refresh rate of most displays and leaves roughly 16 milliseconds to produce each frame. This metric can be easily tracked using the browser’s built-in developer tools.

(A demonstration can be found at https://frames-per-second.appspot.com/)
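
Outside the developer tools, the same frame budget can be checked from recorded frame timestamps. The sketch below assumes the timestamps were collected elsewhere (for example, from the values passed to `requestAnimationFrame` callbacks); `analyzeFrames`, `FrameStats`, and the 1 ms jitter tolerance are hypothetical choices for illustration.

```typescript
// Frame budget at 60 fps: 1000 ms / 60, roughly 16.7 ms.
const FRAME_BUDGET_MS = 1000 / 60;

interface FrameStats {
  frames: number;
  slowFrames: number;
  averageFps: number;
}

// Takes frame timestamps in milliseconds and counts the frames that overran
// the budget. toleranceMs absorbs timer jitter; 1 ms is an assumed value.
function analyzeFrames(timestamps: number[], toleranceMs = 1): FrameStats {
  let slowFrames = 0;
  for (let i = 1; i < timestamps.length; i++) {
    const delta = timestamps[i] - timestamps[i - 1];
    if (delta > FRAME_BUDGET_MS + toleranceMs) slowFrames++;
  }
  const frames = Math.max(timestamps.length - 1, 0);
  const elapsed = frames > 0 ? timestamps[timestamps.length - 1] - timestamps[0] : 0;
  const averageFps = elapsed > 0 ? (frames * 1000) / elapsed : 0;
  return { frames, slowFrames, averageFps };
}
```

Counting slow frames alongside the average matters because a single long frame is visible to the user even when the average frame rate still looks acceptable.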

Idle

The word “Idle” in RAIL means minimizing the initial content so that it reaches users as soon as possible and, once it is delivered, using idle time to deliver the rest. A great example of maximizing a system’s idle time is a music platform. You search for and select the desired song, press play, and the music starts instantly. Between the moment you press play and the moment the song starts, the platform loads only the minimum amount of data necessary. As the song keeps playing, the platform continues loading the following chunks of the song. These chunks are small enough for the platform to stay responsive to user input, so if the user decides to fast-forward, the platform can respond immediately.

What’s cool is that similar standards apply to web applications. RAIL suggests using various techniques to minimize the time to the first meaningful paint, allowing users to interact with content as soon as possible. The initial content should also be kept minimal when taking Optimistic UI considerations into account. Running performance audits such as Lighthouse can provide valuable insights into how to improve the application load time and maximize idle time.
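
The idea of doing deferred work in small chunks can be sketched as a function that drains a task queue only while idle time remains. The names `Task`, `runIdleSlice`, and `reserveMs` are hypothetical; the `timeRemaining` parameter is modeled on the `IdleDeadline.timeRemaining()` method that browsers pass to `requestIdleCallback` callbacks, and it is injected here so the example runs without a browser.

```typescript
// A deferred task, e.g. fetching or decoding the next chunk of content.
type Task = () => void;

// Runs queued tasks until the idle slice is nearly used up, then stops so
// the main thread stays free to handle user input. Returns the number of
// tasks still waiting for a later slice.
function runIdleSlice(queue: Task[], timeRemaining: () => number, reserveMs = 1): number {
  while (queue.length > 0 && timeRemaining() > reserveMs) {
    const task = queue.shift();
    if (task) task();
  }
  return queue.length;
}
```

In a browser that supports `requestIdleCallback`, the callback would call `runIdleSlice(queue, () => deadline.timeRemaining())` and schedule another idle callback while the returned count is above zero. Each task has to be short, since a task that has started is not interrupted.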

Load

Tracking server-side performance is essential because resource load time usually takes up the biggest chunk of web page load time. The RAIL performance model suggests keeping interactive content load times under five seconds. The server-side performance and system design are therefore usually the first to blame if resource load times do not fit within five seconds.

Monitoring server-side performance metrics in production sounds like a must for an effective DevOps process. It is a source of truth that no test environment can provide. But is monitoring alone enough? Would we (product creators) be happy with catching performance issues in production? How fast will we respond? How will we make sure that we have fixed these problems if we have never reproduced them? In fact, a production issue might stay hidden until the circumstances that triggered it, which the development team has not yet identified, recur. If the team cannot play with circumstances and conditions, and if they cannot be simulated, the team cannot test for them. Products that can be both monitored and tested for server-side performance in production are the exception rather than the rule.

In most cases, server-side performance should be tracked in test environments. To test server-side performance, we need to establish a production-like test environment. Usually, the most challenging task is simulating production-like traffic, which can be done with JMeter or a similar tool. The design of this simulation is critical: a lack of coverage in one area or another may let performance issues slip through to production. The reliability of the simulation can be validated against production performance metrics or known issues: if the load test is reliable, similar metrics and issues should be reproducible in the test environment. Once a reliable load test is established, tuning it for stress, spike, volume, and other types of tests becomes relatively simple and is worth considering.
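
One way to validate a load test against production is to compare a percentile of response times from both environments. The following is a sketch under stated assumptions, not a method taken from JMeter: the function names, the choice of the 95th percentile, and the 20 percent tolerance are all picked for illustration.

```typescript
// Nearest-rank percentile of response times in milliseconds.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// A load test is treated as representative when its p95 lands within
// tolerancePct of the p95 observed in production. Both the choice of p95
// and the default tolerance of 20 are assumptions for this sketch.
function resemblesProduction(
  testMs: number[],
  productionMs: number[],
  tolerancePct = 20,
): boolean {
  const testP95 = percentile(testMs, 95);
  const prodP95 = percentile(productionMs, 95);
  return Math.abs(testP95 - prodP95) <= (tolerancePct / 100) * prodP95;
}
```

The response times themselves would come from the load test report and from production monitoring; how they are exported depends on the tools in use.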

Example of server-side performance regression report generated using JMeter in Jenkins CI/CD environment

Prevention

Based on the RAIL performance model and my personal experience dealing with performance issues in a continuous delivery environment, the following are key points that help to prevent performance issues.

System design. A lot of performance issues can be identified during the requirements refinement carried out by the development team. In other words, many performance issues can be resolved by simply not introducing them in the design. So, do not forget the performance aspect during refinements, and do not approve designs without a development team review.

Audits. Quality automated audits (such as Lighthouse) can be a powerful and cost-effective tool supporting the performance testing efforts. They provide a report in a minute or two. If your system is fast and responsive, receiving a report with nothing new in it is not a big loss. However, if your system suffers from issues or bad design, you will quickly get several insights. Consider running audits periodically; some of them can be integrated into a CI/CD pipeline.
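
When an audit runs in a pipeline, its result can act as a gate. The sketch below assumes a Lighthouse JSON report has already been produced and parsed; it reads only the performance category score, which Lighthouse reports on a scale from 0 to 1. The names `LighthouseReportSlice` and `passesAudit` and the 0.9 threshold are hypothetical.

```typescript
// The part of a Lighthouse JSON report this sketch reads. The score ranges
// from 0 to 1 and may be null when the category could not be computed.
interface LighthouseReportSlice {
  categories: { performance: { score: number | null } };
}

// Returns true when the performance score meets the agreed minimum.
// A missing score counts as a failure so that broken audits get noticed.
function passesAudit(report: LighthouseReportSlice, minScore = 0.9): boolean {
  const score = report.categories.performance.score;
  return score !== null && score >= minScore;
}
```

A pipeline step could fail the build when `passesAudit` returns false; the right threshold depends on the product and should be agreed on by the team.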

Be realistic. Don’t waste time or money fixing issues that don’t exist, as good performance usually isn’t cheap. We should have a good understanding of what kind of performance is and is not expected. We should know the current state of system performance and address the issues by reproducing and fixing them. We should also be able to estimate future performance at two key times: after the development team introduces changes to the code base and after the marketing team brings in more users (which generates more traffic). Common mistakes: not monitoring the production environment, unclear performance requirements, and inability to simulate and test production-like behavior.

Exploratory test. As there are many tools for testing system performance at various levels, it is easy to fall into the mindset that they will do it all. Testing tools can do a lot, especially catching performance regression, but testing tools and audits only repeat the same inspection. They are not universal and do not cover every aspect of UX. For more robust performance tests, consider exploratory testing techniques such as persona testing. Combine manual exploratory testing with some background load and you will be surprised by the observations you make. The testing team should therefore be reminded to address performance manually as well.

Learn and educate. Even though system performance is very important, metrics are rarely self-explanatory. Each high-level metric results from multiple low-level processes. Exploring those low-level processes helps you both understand the system better and improve the high-level metrics. Low-level improvements are usually treated as technical debt (and, as we know, selling tech debt is difficult), and work on high-level metrics can be treated the same way. Therefore, the development team and product owner should be educated to understand those high-level metrics.

Stay aligned. Finally, make sure your team is aligned. Clearly describe which metrics each specific piece of tech debt work is expected to improve. A few common mistakes that lead to misalignment are approving technical debt tasks without defined metrics, not educating the programmers on how to test for improvement, and not demonstrating the results of technical debt work.

Conclusion

Web application performance should be addressed and tested in multiple layers. The RAIL performance model is a good fit for this because it provides multi-layer, user-centric metrics, which helps as we strive to make our work measurable and to deliver user-focused value.