DfT Street Manager DOC 0004: Test Strategy
- Delivery Pipeline
- In Sprint Activities
- Exploratory Testing
- Unit testing
- Integration testing
- UI testing
- Accessibility testing
- Security testing
- Performance testing
- Environments
The testing activities described in this document form part of the definition of a feature being 'production ready'.
The different classes of test illustrated in this diagram each exercise functional qualities of the system in different ways.
In this project we aim to include non-functional testing (performance/security/accessibility) early, using automation. This will help avoid the situation where all the effort to do these activities is left until immediately before a release or audit. There will still be some manual exploratory testing required in these areas.
Generally, an effective test plan incorporates a range of testing activities rather than favouring just one approach, such as purely automated or purely manual testing.
The test activities that should occur during each sprint ceremony are listed below.
- Test as many requirements (functional, performance, security, ...) in sprint as practically possible.
- Agile testing is a 'whole-team' approach - the whole team must take responsibility for quality on the project
- Test pyramid: a well-balanced test set will have more unit tests than component tests than whole stack tests. The underlying principle is to detect errors on the smallest possible subsystem, because smaller tests run faster and localize diagnosis of errors.
- Automate wherever possible. This includes unit and integration testing, functional and non-functional testing.
- Clearly repeatable automated tests. We will try to avoid manual set-up and long-lived (mutable, drifting) environments.
- Pipeline: start simple and refactor as necessary. The team should start with all tests in a single phase. As the test set grows and uses more resources (e.g. performance tests), the team will split the pipeline into stages that run serially and/or in parallel.
- Test-driven development: tests should specify desired behavior and drive what is implemented. Testing and coding should proceed in parallel as a single activity throughout development.
- Testing will provide continuous feedback to the team in the form of automated test results, discoveries made during exploratory testing and observations made by business users of the system
- Collaborate with the users/business stakeholders to clarify and prioritise requirements
- Assure that the user stories can be adequately tested; use specification by example for non-trivial business rules.
- Agree the testing approach/requirements for stories between testers/BAs/POs (this ensures a shared understanding for implementation and verification, making sign-off with the PO after testing easier)
- Highlight testing impacts on user story estimation
- Establish clear understanding of the test approach for each story among the group
- Understand how new features/functionality will affect existing functionality and regression testing
- Size stories and provide estimates from a test perspective with a view on whether or not the team have capacity to cover the anticipated test effort
- Establish a clear view and plan of testing tasks that need to be completed for the sprint (e.g., manual tests, automated tests, non-functional tests, test data prep, test environment prep etc.)
- Complete/co-ordinate any test preparation activities that may be required for the sprint (e.g., test data, test environment, deciding on appropriate candidates for automation)
- Conduct exploratory testing to investigate unknown and unexpected behavior within the application
- Design and develop automated functional unit and integration tests for all stories
- Design and prepare non-functional tests – Performance, Security, Accessibility for stories which require them
- Provide feedback to the PO on the functionality so that they can make informed decisions on how to proceed with delivery
Before starting a story, the developer should consult with the BA and tester (three amigos) to agree the implementation and testing approach. This should define any additional testing requirements, such as what browser tests to update and performance tests to write.
After finishing development, the developer should discuss the following points with the tester:
- Manual tests carried out by the developer. A task cannot be accepted if the acceptance criteria have not been verified by the developer
- Test automation coverage
- Any non-functional tests done
- Changes are deployed and confirmed on development environment and ready to go onto test environment
- Additional test ideas for further exploratory testing
- The person identifying a defect should raise it in the first instance with the developer of the feature (face-to-face communication)
- Bugs which are discovered during development of features being delivered in-sprint should be fixed in-sprint
- Bugs discovered later will be added to the backlog
- These will be triaged, and each sprint will have reserved time set aside to fix bugs
- If there is an occasion where a defect has still not been resolved by the end of a sprint and the Product Owner has confirmed acceptance of the related user story, then:
- The user story will be modified by the PO to reflect any deviations from acceptance criteria
- A defect will be formally logged and will follow the normal process described above
Defects should at a minimum capture the following information (this can be used as a template for submission to the backlog):
- Description: Succinct high-level description of the issue
- Environment: (Dev/Test/Staging/Production)
- Build number: Identify the build against which the issue was produced
- Steps to reproduce: Numbered sequence of steps taken to cause the issue (include any data, screenshots etc. which may be required to facilitate this)
- Ensure that the team remains clear on test activities that need to be completed during, and before the end of, the sprint
- Show and Tell is an opportunity to elicit feedback to be incorporated into the test approach for subsequent sprints
- Retrospectives will provide input from a test perspective and look to continuously improve the team's test approach in future sprints
Exploratory testing is effectively manual testing which is unguided by a pre-defined set of scripts.
Manual testing is human-present testing. A human tester comes up with scenarios that will cause software to fail using their experience, creativity and intuition.
This gives a better chance of finding bugs related to the underlying business logic of the application.
Manual testing has some issues. It can be slow, too ad hoc, not repeatable and not reproducible.
To mitigate these issues, we propose the following strategy for any manual exploratory testing activities which are carried out in-sprint.
- Any new feature of the solution completed in-sprint should have exploratory testing carried out against it
- The feature should always be tested by a person other than the one who developed it
- Focus on the goals of the system - think like a user of the system
- Tests are unscripted, but should still be planned and the outcome accurately captured (See below)
A popular model for exploratory testing is the Tourist metaphor. At a minimum, the Landmark and Intellectual tour models listed in the above link are good approaches to use in combination, though we recommend reading through all of the approaches listed.
Exploratory testing generates documentation as tests are being performed instead of ahead of time in the form of a rigid test plan.
We will initially adopt a light-weight approach to capturing the output of an exploratory test in the form of a simple one-page report, which states:
- Name of the feature tested
- Goal of the exploratory test
- Note how the test was conducted
- List any issues identified in enough detail to reproduce (i.e. a sequential, numbered set of steps)
- Any attached test data or screenshots that might be useful
There are many tools which can be used to support exploratory testing, such as screen recorders and loggers; however, we will adopt the simple proposal above and re-assess if need dictates.
Information on the scope of an exploratory test can be found here: Gov Exploratory Testing Tips
As automated testing of mapping functions is extremely difficult, we will use exploratory testing of mapping screens after changes. A mindmap of common functions to test will be created to guide developers and testers on the sort of actions which should be performed.
Tools: mocha, chai, supertest, sinonJS
All application code should be unit tested with the tests run on commit and coverage checked. Ensuring this is the responsibility of the developer and code reviewer.
All complex client-side JavaScript should be unit tested where possible. This excludes mapping JavaScript (see below).
Due to the difficulty of unit testing mapping JavaScript (OpenLayers used on mapping screens), it will not be unit tested as it provides limited value. Mapping screen logic should be checked via browser testing.
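As an illustration only, a unit test in this stack might look like the following minimal sketch; the service and repository modules shown are hypothetical examples, not actual Street Manager components.

```javascript
// Minimal sketch of a unit test using mocha, chai and sinon.
// permit-service and permit-repository are hypothetical modules used for
// illustration; sinon stubs isolate the unit under test from its dependencies.
const { expect } = require('chai');
const sinon = require('sinon');

const permitService = require('../src/services/permit-service');
const permitRepository = require('../src/repositories/permit-repository');

describe('permitService.getPermitSummary', () => {
  afterEach(() => sinon.restore());

  it('flags permits that are past their end date as expired', async () => {
    // Stub the repository so no database access happens at unit level
    sinon.stub(permitRepository, 'getById').resolves({ id: 1, endDate: '2000-01-01' });

    const summary = await permitService.getPermitSummary(1);

    expect(summary.expired).to.equal(true);
  });
});
```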
Tools: mocha, chai, knex
Integration tests are automated tests written by the developer which do not include mocks or stubs for dependencies on other components or external services, but instead test against (ideally) physically deployed components or test harnesses.
The Development environment will act as the integration environment against which these tests will run.
This will include a development database and any other instances of services against which the tests should run.
Integration tests which create data (in the form of DB records, file uploads or similar) should include a teardown
step to ensure that the environment is left in a consistent state comparable to before the test was executed.
Integration tests should include error scenarios and edge cases covering APIs that fail to respond, databases that are down and integration points that are slow. In these cases it is acceptable to mock or re-configure the application to create the correct conditions.
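As a minimal sketch of the setup/teardown pattern described above (the table, column names and knexfile path are illustrative assumptions only):

```javascript
// Sketch of an integration test that writes real records via knex and
// cleans them up afterwards, leaving the environment as it was found.
const { expect } = require('chai');
const knex = require('knex')(require('../knexfile').development);

describe('permits table integration', () => {
  afterEach(async () => {
    // Teardown: remove any data created by this test
    await knex('permits').where({ reference: 'TEST-0001' }).del();
  });

  after(() => knex.destroy());

  it('persists and retrieves a permit record', async () => {
    await knex('permits').insert({ reference: 'TEST-0001', status: 'submitted' });

    const row = await knex('permits').where({ reference: 'TEST-0001' }).first();
    expect(row.status).to.equal('submitted');
  });
});
```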
Tools: mocha, chai, webdriver.io, knex
UI testing is focussed on flow through the application rather than on functionally testing the application. Tests will focus on happy-path flows and exclude negative testing (e.g. validation logic will not be UI tested). This avoids the test ice cream cone problem (see here) of long-running, complex, flaky browser tests which cannot be used by developers for quick verification.
The UI test purely tests:
- Page Content - expected elements are visible
- Forms - user can complete all fields they need to and can click buttons
- Complex JavaScript - page-specific JavaScript functions work as expected (such as mapping, tested to a limited extent)
- Navigation - user brought to the expected page following specific action
IMPORTANT: data, including the values entered on forms and any results returned from the server, is NOT part of a UI test.
Testing of values is the responsibility of integration testing. UI tests should be kept to a minimum so that running them does not require large amounts of time (which would discourage their use as part of development).
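For illustration, a happy-path UI test in webdriver.io might look like the sketch below; the page URL and selectors are hypothetical.

```javascript
// Sketch of a happy-path UI test with webdriver.io: it checks visibility,
// form interaction and navigation only, and makes no assertions on data values.
// Page URLs and selectors are hypothetical.
const { expect } = require('chai');

describe('create permit flow', () => {
  it('lets the user complete the form and reach the confirmation page', async () => {
    await browser.url('/permits/new');

    // Page content: expected elements are visible
    expect(await $('#permit-reference').isDisplayed()).to.equal(true);

    // Forms: the user can complete the fields they need to and click buttons
    await $('#permit-reference').setValue('TEST-0001');
    await $('button[type="submit"]').click();

    // Navigation: the user is brought to the expected page
    expect(await browser.getUrl()).to.include('/permits/confirmation');
  });
});
```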
Full details are documented in the section below on cross-browser testing.
As part of regular builds, the browser tests should be run against a range of browsers and devices (based on user survey results) using a testing platform (SauceLabs/Browserstack).
Based on user survey responses and GOV.UK Service Manual browser requirements.
- Windows
- Internet Explorer 11
- Edge (latest versions)
- Google Chrome (latest versions)
- Mozilla Firefox (latest versions)
- macOS
- Safari 9+
- Google Chrome (latest versions)
- Mozilla Firefox (latest versions)
Earlier versions of IE are excluded due to security concerns and the lack of users reporting using them. Mobile devices are excluded as currently only Planners/HA noticing officers, based in offices using desktops/laptops, use the service; the site will use standard GDS styles and will be responsive should this change later.
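As a sketch only, this matrix might be expressed as webdriver.io capabilities for a cloud platform such as SauceLabs or BrowserStack; the exact capability keys and platform names depend on the platform and webdriver.io version chosen.

```javascript
// Sketch of the browser matrix as webdriver.io capabilities (wdio.conf.js).
// Capability keys and platform names are indicative assumptions and depend
// on the cloud platform (SauceLabs/BrowserStack) and wdio version in use.
exports.config = {
  capabilities: [
    { browserName: 'internet explorer', version: '11.0', platform: 'Windows 10' },
    { browserName: 'MicrosoftEdge', platform: 'Windows 10' },
    { browserName: 'chrome', platform: 'Windows 10' },
    { browserName: 'firefox', platform: 'Windows 10' },
    { browserName: 'safari', platform: 'macOS 10.13' },
    { browserName: 'chrome', platform: 'macOS 10.13' },
    { browserName: 'firefox', platform: 'macOS 10.13' }
  ]
};
```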
As part of non-functional testing, accessibility audits will be conducted on a continuous basis by the team.
It is important to note that Accessibility Auditing and Accessibility Testing are not the same.
Tools: Pa11y, WAVE toolbar
Developers will consider accessibility requirements as they design the screens for the application.
Tools such as Pa11y will be used to help automate auditing the screens for WCAG AA compliance.
For any new screens that are added to the application, developers are expected to confirm that this audit check passes without any errors as part of the review process using browser plugins like WAVE toolbar.
Accessibility auditing is performed both manually by the tester and automatically by tools as part of the continuous integration process.
GDS require that web applications be developed to meet the WCAG 2.0 Level AA accessibility standard.
All web content, including both the internal and external facing sites, must meet all of the criteria defined in the standard that are required for Level AA. The aim of auditing is to highlight any areas of the application screens that may not be compliant with the guidelines.
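A minimal sketch of how such an automated audit could run in the pipeline using Pa11y's Node API is shown below; the page list is illustrative and would normally be driven by the application's routes.

```javascript
// Sketch of an automated WCAG 2.0 AA audit with Pa11y.
// The URLs below are placeholders; a real run would audit every screen.
const pa11y = require('pa11y');

const pages = [
  'http://localhost:3000/',
  'http://localhost:3000/permits/new'
];

(async () => {
  for (const url of pages) {
    const results = await pa11y(url, { standard: 'WCAG2AA' });
    if (results.issues.length > 0) {
      console.error(`${url}: ${results.issues.length} accessibility issue(s) found`);
      process.exitCode = 1; // fail the CI step so the audit blocks the build
    }
  }
})();
```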
An external accessibility audit of the solution may be performed prior to leaving private Beta.
Tools: OWASP Zap, BurpSuite, WatchDog (below)
See Security overview for more details.
Developers are responsible for ensuring that new web application features developed meet the OWASP security guidelines as a minimum standard.
The OWASP Top Ten is a useful awareness document which gives a broad overview of where most web application security flaws will be found.
"Using Components With Known Vulnerabilities" is now part of the OWASP Top 10 - insecure libraries can pose a large risk to web applications.
Using automated tools to receive continuous feedback on the security of our node applications is good practice which mitigates the risk of introducing security flaws early in delivery. Where possible, we should be scanning the different parts of our solution using a tool or service, such as NSP, as part of the automated build process.
We expect an external security audit will be carried out prior to leaving private Beta.
As with any other element of testing strategy - variety is good when it comes to security testing a running application.
Kainos have developed a standalone docker image containing several common security testing tools which can scan a web application to assess a number of different attack vectors.
Included tools are:
- Arachni - scanner and penetration testing framework
- sslyze - identify SSL misconfigurations
- SQLMap - automate SQL injection flaw detection
- Garmr - Inspect responses for basic security requirements
Instructions for how to download, install and run Watchdog are included in the README. A worked example is shown below.
The following steps can be used to run a local manual test against the external web application (for example):
- Run the application locally
- Clone the watchdog repo
```
cd watchdog/vagrant
vagrant up build
vagrant ssh run
sudo -i
docker run --rm -it -e "GAUNTLT_ATTACK_SUBJECT=localhost:3000" -v /attacks:/data moomzni/gauntlt
```
Installing and running the docker image as part of a CI process is possible and this is also documented in the README.
The techniques outlined above focus on the identification of issues with the basic structure of the web application itself.
However, at a higher level there exists the possibility that security flaws will exist within the design of the application, and this type of error is usually both harder to detect and to fix.
This is where security must be considered as part of either automated unit and integration tests or the manual exploratory testing conducted during the sprint. Common issues to look out for include:
- Bypassing required navigation - In a sequence of screens, can the pages be requested out of order? Can certain steps be bypassed? This can be easily tested by examining the appropriate URLs and then using this information to navigate to an out-of-sequence URL on subsequent attempts (see the sketch after this list)
- Attempting privileged operations - Catalog all the links for actions accessible only as an admin user. As a regular user or guest, attempt to access each in turn to check for privilege escalation
- Abusing predictable identifiers - If an ID in a resource URL is easily predictable (e.g. sequential) then it may be possible to access resources which the user should not be able to see
- Abusing repeatability - Any given action should bear the question - what if I do this again? If you can do it again, how many times can you do it and what happens? This type of test is a good candidate for automation.
- Abusing high-load actions - Actions like image upload, loading in files or any other action which could incur a high resource cost are candidates for DoS attacks.
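As a minimal sketch, checks for the first two issues above could be folded into the automated supertest suite; the routes and the loginAs helper shown here are hypothetical.

```javascript
// Sketch of security checks expressed as automated tests with supertest.
// The routes and the loginAs test helper are hypothetical examples.
const request = require('supertest');
const { expect } = require('chai');

const app = require('../src/app');
const { loginAs } = require('./helpers/auth');

describe('security checks', () => {
  it('rejects privileged operations attempted by a regular user', async () => {
    const cookie = await loginAs('regular-user');
    const res = await request(app).delete('/admin/users/42').set('Cookie', cookie);
    expect(res.status).to.be.oneOf([401, 403]);
  });

  it('does not serve the confirmation step when requested out of sequence', async () => {
    const cookie = await loginAs('regular-user');
    const res = await request(app).get('/permits/confirmation').set('Cookie', cookie);
    // Expect a redirect back to the start of the flow rather than the page itself
    expect(res.status).to.equal(302);
  });
});
```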
Common attacks and specific areas of concern are highlighted in the Security Overview.
Tools: Artillery, JMeter
Both web and API components will need to be performance tested to ensure the service meets its expected load. Performance testing will be carried out regularly as part of automated testing, so it is not all left until immediately prior to releases.
TBC Response time targets - required before specific performance testing starts
Since Street Manager is not replacing a centralized system, we do not know what demand we will face. However, we know that organisations will transition over a period of months, so we expect to have time to react. We will focus our end of sprint and release testing on finding our throughput headroom (within response time requirements) and where in the architecture our current constraint lies.
As part of our regular pipeline, we will run regression tests with scaled-down demand and resources.
Developers will be responsible for defining performance tests for new API endpoints they create and for including new screens in web performance tests with existing scripts. The extent required should be decided before starting the story, by the Technical Leads and in Three Amigos sessions.
Prior to major releases there should be a more rigorous performance testing exercise to ensure that all main flows and functional areas have been performance tested to meet expected load.
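Full load tests will use Artillery or JMeter; as a lightweight illustration only, a pipeline smoke check against an agreed response-time target could look like the sketch below (the endpoint and the 500ms value are placeholders until response-time targets are confirmed).

```javascript
// Illustrative smoke check of a single endpoint against a response-time target.
// This does not replace Artillery/JMeter load tests; the endpoint and the
// 500ms target are placeholders until response-time targets are agreed.
const http = require('http');

const TARGET_MS = 500;
const url = 'http://localhost:3000/api/permits';

const started = Date.now();
http.get(url, (res) => {
  res.resume(); // drain the response body
  res.on('end', () => {
    const elapsed = Date.now() - started;
    console.log(`${url} responded in ${elapsed}ms (status ${res.statusCode})`);
    if (elapsed > TARGET_MS) {
      console.error(`Response time exceeded the ${TARGET_MS}ms target`);
      process.exitCode = 1;
    }
  });
}).on('error', (err) => {
  console.error(`Request failed: ${err.message}`);
  process.exitCode = 1;
});
```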
Tools: JMeter, Wraith
Performance testing mapping screens is difficult, as it involves the client browser executing JavaScript and external dependencies such as the WMTS provider for rendering the base map layer. Approaches suggested by other projects include using static data, so the elements on the map are known, and measuring how long the map takes to appear completely. Screenshot comparison of map screens (Wraith) can then be used to check that the map has rendered correctly.
Performance testing should be used to check that maps respond within the expected time for normal mapping scenarios populated with a realistic load of data (sample works data), monitoring the response times of map APIs and render time on client screens.
External mapping integration (exposed WMS/WFS layers) should also be tested, to ensure that the expected load on GeoServer (serves WMS/WFS layer) and database is not too high.
Within the CI pipeline, we will try to keep our tests self-contained. Practically, this means avoiding reuse of stateful resources across different runs. Rather, the CD pipeline will provision resources needed to run tests. See Solution Architecture for details. There are practical limits to this: spinning up databases from scratch would slow down the pipeline.
We expect to use a long-lived pre-prod environment, but this will be managed in a similar way to Production, rather than by our pipeline.
The CD pipeline will save test results. These results will be traceable to specific builds/commits to track changes, and this will include infrastructure provisioning code, and parameters of load generation, where appropriate.
All developer changes will be automatically released to the Development environment. This environment will change rapidly and will frequently be unstable.
It is used for integration tests of developer changes and for manual developer checks before promoting a release to the Test environment.
The Test environment hosts releases promoted from the Development environment. It should be relatively stable, and developers should avoid breaking changes reaching it.
It is used for automated browser testing and for manual testing and verification of stories. Stories should be signed off here.
The Staging (pre-production) environment hosts releases before they are deployed to Production. It should be stable and reflect the production environment as much as possible.
It is used to validate releases against production-like conditions.
It is also used for performance and security testing, which should reflect production conditions.