An automation approach for native mobile end-to-end testing

Developers today most likely can’t imagine working without source control and continuous integration systems. However, one area enterprises may still be struggling with is how to integrate an automated test solution for native mobile applications into a continuous integration workflow. This article provides suggestions on how to embed automated end-to-end (e2e) tests for native mobile devices into the continuous integration environment for greater stability and maintainability. I also include an example of how we did it here at Devbridge.

Before you get started with mobile application end to end testing

You will most likely want to run your e2e tests in conditions close to your live system. As such, it is wise to use a staging environment, or something very similar to production, with as few mocked services as possible to ensure the best coverage and more accurate tests. Additionally, you can treat e2e tests as regression tests to reduce manual testing effort. There are also several requirements to consider before implementing automated e2e tests. These include:

  • A dedicated machine with multiple mobile devices connected to run tests on (at least one per platform)
  • A dedicated and stable test environment
  • A dedicated person to maintain tests, as developers may change text, button labels, etc.

Lastly, before deciding on a test framework, consider the following:

  • What platforms do you want to support? (e.g., Android/iOS/Windows)
  • What type of applications do you need to test? (e.g., Native/Hybrid/Web)
  • Will the applications being tested run on simulators or on physical devices?
  • Will you use device farms?
  • Is the scripting language required to write test cases suitable for you?
  • Do you want to use Gherkin language to specify test cases?
  • Are you allowed to use open source applications?
  • What test result format can it output?
  • Is development of that framework still active?
  • What tools are available to create test cases? (e.g., Visual Studio plugin)

The answers to these questions are organization-specific, so use the list above to compare candidate frameworks and choose the one that best fits your organization’s needs. At Devbridge, we chose Calabash for the following reasons:

  • It supports both iOS and Android
  • It can execute tests on simulators and physical devices
  • Most device farms are compatible with this framework
  • It uses Gherkin
  • It is free, open source, and supported by Xamarin and the community
  • It can produce a wide range of report outputs

Once you have chosen a framework, the next step is to make a proof of concept. Try to implement the framework for your applications, create a simple test case, and carry out any other activities you might need (e.g., generating reports, debugging failed test cases, etc.). If you are satisfied with the results, continue to the next step.

Build a solid ground for test automation

Maintainability is critical, because without it, you may bring frustration to the team instead of confidence and speed. The key components for avoiding maintenance headaches are:

  • A good approach for test data management
  • A thoughtful writing of the test scripts
  • Well-chosen test cases to automate

Now let’s dig more deeply into each component:

Test data management

Managing test data is a very important step for test automation! I recommend that you use a stable environment and preferably clean the data you created or modified after each test run to keep the environment consistent and predictable. Alternatively, if you choose to pollute your database with: user = “test_user” + getTimestamp(), then you should consider the following:

  • You will collect garbage after each run
  • This garbage may impact your system performance over time
  • Some tests may fail if run twice per day (for example, a test that asserts the average transfer amount per day)
  • If a specific test step fails, it may create an unstable environment during the next run
  • Some test cases may not be possible to automate in an unpredictable environment

The point I want to make is that you need to make the right decision about a test data management solution. For example, if your project is capable of building a database in memory and discarding it afterwards, then that is an easy and fast approach to take. Otherwise, you should consider having a dedicated database (or shard) for test automation and using database backups to restore the database to a fixed state before each test suite execution. Keep the test automation database dedicated to this purpose only. Keep it clean and with the minimum amount of data required for your tests to run.
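To illustrate the in-memory option, here is a minimal sketch in Python using the standard sqlite3 module. The schema, the seed data, and the helper names (fresh_database, count_users) are assumptions made for the example, not part of any particular project. It shows that data created by one run disappears with the connection, so the next run starts from the same fixed state.

```python
import sqlite3

# Hypothetical seed data; a real project would load its own schema and fixtures.
SEED_SQL = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO users (name) VALUES ('test_user');
"""


def fresh_database() -> sqlite3.Connection:
    """Build a throwaway in-memory database in a known, fixed state."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SEED_SQL)
    return conn


def count_users(conn: sqlite3.Connection) -> int:
    """Return the number of rows in the users table."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# The first "test run" modifies its own copy of the data...
first = fresh_database()
first.execute("INSERT INTO users (name) VALUES ('created_by_test')")
polluted_count = count_users(first)
first.close()  # ...and the changes are discarded together with the connection.

# The next run starts again from the same fixed state.
second = fresh_database()
clean_count = count_users(second)
second.close()

print(polluted_count, clean_count)
```

When an in-memory database is not an option, the same idea applies to a dedicated database or shard: replace fresh_database with a step that restores a known backup.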

Test scripts

Writing test scripts may be as easy as pressing a record button, executing steps, and then clicking the stop button, or writing, “Given I click "Login" button.” This sounds easy in theory; however, it usually does not work in practice (or at least not without fine tuning the execution scripts). Some of the most common problems are:

  1. Device lag
  2. Long server response time
  3. Test framework issues
  4. Random assert failures
  5. Inconsistent application behavior

These problems can be handled as follows.

The first two problems, device lag and long server response time, could be compensated for with a sleep function. However, a fixed sleep is inefficient and fragile: sometimes the object you want to press is already on screen, yet the test still waits for 10 seconds, or the Wi-Fi is down and the response takes 11 seconds. This is a common problem for almost all automated e2e test frameworks. One solution can be found here. You can also implement a custom solution that meets your specific needs and works with the framework you use, such as checking for the element every half second for up to one minute and performing the click only after confirming that it is on screen. In the best case, the test clicks the button within half a second of its appearance; in the worst case, it waits for one minute and then fails. This could be the default behavior when executing the “Given I click "Login" button” step. Writing automated test cases this way is efficient in terms of both execution time and maintenance.
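The polling approach can be sketched in a few lines of Python. This is an illustration only: wait_for and click_when_visible are hypothetical helpers, and the is_visible and click callables stand in for whatever query and tap functions your test framework provides.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool],
             timeout: float = 60.0,
             interval: float = 0.5) -> bool:
    """Poll condition every `interval` seconds until it is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def click_when_visible(is_visible: Callable[[], bool],
                       click: Callable[[], None],
                       timeout: float = 60.0,
                       interval: float = 0.5) -> None:
    """Click as soon as the element is on screen; fail the step if it never appears."""
    if not wait_for(is_visible, timeout, interval):
        raise TimeoutError(f"element did not appear within {timeout} seconds")
    click()


# Simulated element that becomes visible on the third poll.
polls = {"count": 0}


def appears_on_third_poll() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


clicked = []
click_when_visible(appears_on_third_poll,
                   lambda: clicked.append("Login"),
                   timeout=2.0, interval=0.01)
print(clicked, polls["count"])
```

The short timeout and interval in the demo only keep the example fast; the defaults match the half-second, one-minute values described above.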

The third problem, test framework issues, should be identified while you are choosing a framework to work with. However, as with all open source projects, issues may also appear later in the framework’s lifecycle. If you have established that the problem lies in the framework, you can fall back to the last stable version or wait for an update that fixes the issue.

The fourth problem, random assert failures, can be avoided by choosing your assertions wisely. For example, you may want to assert that a timestamp is exact down to the millisecond. However, that might create more maintenance in the future and make implementation difficult. Therefore, consider checking only the date, hour, and minute to reduce maintenance, increase test stability, and reduce the time required to implement the test case. However, be aware of the risks you introduce by skipping some data assertions.
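As a small illustration of the coarser assertion, the following Python sketch compares two timestamps only down to the minute. The helper name same_minute and the sample values are made up for the example.

```python
from datetime import datetime


def same_minute(expected: datetime, actual: datetime) -> bool:
    """Compare two timestamps by date, hour, and minute only."""
    def truncate(moment: datetime) -> datetime:
        return moment.replace(second=0, microsecond=0)

    return truncate(expected) == truncate(actual)


# Hypothetical values: the app displays a time 41 seconds after the expected one.
expected = datetime(2016, 5, 4, 10, 30, 0)
actual = datetime(2016, 5, 4, 10, 30, 41, 123000)
next_minute = datetime(2016, 5, 4, 10, 31, 0)

print(same_minute(expected, actual))
print(same_minute(expected, next_minute))
```

One risk of this shortcut is that two timestamps a second apart can still fall on either side of a minute boundary, in which case a tolerance-based comparison may be the better trade-off.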

As for inconsistent application behavior (the fifth problem), some applications behave differently depending on the time of day, the amount of time that has passed since the last launch, the number of concurrent users, etc. Try to consider all options while writing test steps, and make the steps flexible enough to compensate for this inconsistency. For example, if your application under test logs users out after 24 hours, the app might launch with the user logged in or logged out, depending on the amount of time that has passed since the last user action in that application.
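One way to make a step tolerant of the logged-in or logged-out case is to write it so that it works from either starting state. The sketch below uses a hypothetical FakeApp class as a stand-in for the application under test; in a real suite the two methods would call into your test framework.

```python
class FakeApp:
    """Hypothetical stand-in for the application under test."""

    def __init__(self, logged_in: bool) -> None:
        self.logged_in = logged_in
        self.login_calls = 0

    def is_logged_in(self) -> bool:
        return self.logged_in

    def log_in(self, user: str) -> None:
        self.login_calls += 1
        self.logged_in = True


def ensure_logged_in(app: FakeApp, user: str) -> None:
    """Test step that works whether the app launched logged in or logged out."""
    if not app.is_logged_in():
        app.log_in(user)


# The session expired: the step logs in.
expired_session = FakeApp(logged_in=False)
ensure_logged_in(expired_session, "test_user")

# The session is still valid: the step does nothing.
active_session = FakeApp(logged_in=True)
ensure_logged_in(active_session, "test_user")

print(expired_session.login_calls, active_session.login_calls)
```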

Test cases to automate

If you are lucky enough to start e2e test automation at the beginning of a project, you can automate test cases for newly created features and build up your test suite while the project is being developed. This might also ease your test case development, as you can ask developers to build the application with testability in focus rather than as an afterthought. A more frustrating situation is having to automate test cases for an already developed project. In the latter scenario, you should start by automating the most important user stories. You can use the Value over Complexity Ranking Formula to rank user stories with the same priority.
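The ranking formula itself is not reproduced in this article, so the following Python sketch rests on an assumption: that each story gets a value score and a complexity score, and that stories are ordered by value divided by complexity. The story names and numbers are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UserStory:
    name: str
    value: int       # assumed scale: higher means more valuable to automate
    complexity: int  # assumed scale: higher means harder to automate, must be > 0


def rank(stories: List[UserStory]) -> List[UserStory]:
    """Order stories by value over complexity, best candidates first (assumed formula)."""
    return sorted(stories, key=lambda story: story.value / story.complexity, reverse=True)


# Hypothetical backlog with made-up scores.
stories = [
    UserStory("Transfer money", value=8, complexity=4),
    UserStory("Log in", value=9, complexity=1),
    UserStory("Edit profile", value=3, complexity=3),
]

ranked_names = [story.name for story in rank(stories)]
print(ranked_names)
```

With these scores, a cheap and valuable story such as logging in rises to the top, while a story that is costly relative to its value moves down the list.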

Another suggestion is to automate test cases for areas where bugs were found. That way, you build test cases around the areas where bugs tend to be introduced, find them earlier the next time, and gain confidence that the same bug is not introduced again.

An example of how to automate e2e tests with continuous integration

If you are considering using Jenkins, then there are already quite a few guides to follow. In general, however, your procedure should look like this:

  • Prepare environment
    • Database restore or deploy
    • Install framework executables and its dependencies
    • Fetch app to use as release candidate
    • Perform any other framework-specific activities
    • Wake up and unlock mobile device
  • Execute test steps
  • Publish results
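The procedure above can be sketched as a small Python runner. This is not a Jenkins or TeamCity configuration; the step names mirror the list, and the step bodies are placeholders you would replace with your own scripts. The sketch publishes results even when an earlier step fails, which is an assumed design choice so that failure reports remain available.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_pipeline(prepare: List[Step], execute: Step, publish: Step) -> List[str]:
    """Run preparation steps, then the tests, and always publish results."""
    log: List[str] = []
    name = ""
    try:
        for name, action in prepare:
            action()
            log.append(f"ok: {name}")
        name, action = execute
        action()
        log.append(f"ok: {name}")
    except Exception as error:
        # Record which step failed; remaining steps before publishing are skipped.
        log.append(f"failed: {name} ({error})")
    finally:
        publish_name, publish_action = publish
        publish_action()
        log.append(f"ok: {publish_name}")
    return log


def noop() -> None:
    """Placeholder for a real script such as a database restore."""


def failing_tests() -> None:
    """Placeholder test run that simulates failed scenarios."""
    raise RuntimeError("2 scenarios failed")


log = run_pipeline(
    prepare=[
        ("restore database", noop),
        ("install framework", noop),
        ("fetch app", noop),
        ("unlock device", noop),
    ],
    execute=("execute tests", failing_tests),
    publish=("publish results", noop),
)
print("\n".join(log))
```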

For Jenkins, there are quite a few extra plugins to parse Cucumber reports. However, I ran into problems with the content security policy not allowing the .html report to render.

Our approach

Devbridge currently uses TeamCity for all Android and iOS builds. For that reason, we chose to integrate mobile e2e test automation in TeamCity as well. To have more control over the testing process, we incorporated our local Mac Mini as a build agent in TeamCity and connected both Android and iOS mobile devices to that Mac. This enabled us to write cross-platform automated test scripts on the same hardware and in a local environment, either by remote connecting or connecting the Mac Mini to a monitor.

Additionally, we configured Calabash to output results in JUnit and .html formats. TeamCity uses the JUnit (XML) output to display results under each build, and the .html report serves as a test coverage report for anyone who is interested. The .html report is also used for debugging, as it stores screenshots and stack traces for all failed test cases.
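To show what a CI server or a custom script can extract from a JUnit-style XML report, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The sample report is hand-written for the example and is not actual Calabash output; real reports may contain additional attributes and nested suites.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hand-written sample in the common JUnit XML shape (not real Calabash output).
SAMPLE_REPORT = """<testsuite name="login" tests="3" failures="1" errors="0">
  <testcase classname="login" name="valid credentials"/>
  <testcase classname="login" name="wrong password">
    <failure message="expected error label"/>
  </testcase>
  <testcase classname="login" name="locked account"/>
</testsuite>"""


def failed_test_names(report_xml: str) -> List[str]:
    """Return the names of test cases that contain a failure or error element."""
    root = ET.fromstring(report_xml)
    failed = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(case.get("name"))
    return failed


failed = failed_test_names(SAMPLE_REPORT)
print(failed)
```

A list like this could feed a notification step, for example to decide whether anyone needs to be informed about the run.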

Mobile end to end testing - Team City Example

We then gave the Calabash build configuration an artifact and snapshot dependency on the appropriate application build. This enabled us to have a build chain and to use the same binaries for running automated tests on physical devices as for manual testing.

The idea of using the same feature files on both platforms was not feasible, because the two applications were not written to have identical workflows. However, we could reuse most of the steps across platforms.

For our database, we mapped a test user to a specific shard, created specifically for test automation purposes, and restored the whole shard to a predefined state before each run on both devices. This gave us confidence in our data quality for each run. If we needed additional data in the backed-up database state, we just had to restore the backup, add the required data, and make a new backup, which was then used for subsequent database restores. This workflow saves us a lot of future maintenance as we expand and create new features, which might be data sensitive.

To reduce data concurrency issues, we configured e2e tests to run on only one build agent. This also helps us avoid restoring the shard while another test run is still in progress.

We then compiled a list of user stories and ranked them using the Value over Complexity Ranking Formula. We calculated value by most used feature (or feature with most bug reports), and started implementing them. That list is updated by adding newly developed features or bugs reported by users.

All source code is stored in Bitbucket, and new features are pushed directly to Bitbucket. Before each test run, TeamCity pulls from the Calabash baseline and executes all current features. This lets multiple test engineers work on the project simultaneously while we track their changes, and it lets TeamCity run up-to-date test cases without anyone copying them there manually.

The workflow we ended up with is as follows:

  • The Android or iOS build configuration produces the application as a build artifact
  • On a successful application build, its artifacts are used in the Calabash configuration, which triggers test execution on the local Mac Mini
  • The executing script restores the backup on the shard dedicated to test automation
  • All tests are executed on physical devices
  • The Calabash configuration produces reports and stores them as TeamCity artifacts for that run
  • If any tests have failed, the responsible person is informed by email


By following the above guidelines, we managed to incorporate automated e2e tests into a continuous integration environment and obtain test results for every new application build. Overall results are positive: we are using the tests to perform smoke tests and plan to use them for regression testing in the future. Currently, there are still some issues with the framework, which introduce some stability problems. I am hopeful, however, that these will be fixed in the next release.

To achieve optimal results, make sure you start developing your application with testability in mind. Otherwise, you will run into various issues. In our case, we decided to automate an application that was developed without considering testability and added automated e2e tests later in the product’s lifecycle. Thus, test automation was not the easiest thing to achieve. However, it helped to uncover some architectural flaws, which have been put in the backlog and are waiting to be addressed.

It is also nice to have unit, integration, and e2e test results in a single place that is accessible and visible to all team members.

The Cucumber framework lets you speed up over time: as you build up a test step library, you can reuse those steps in future tests. You can also create another test case that covers the same test flow with different options by copying and pasting an existing test and editing one test step. It will take less than a minute, and you will have additional tests in your test suite as well as increased code coverage.