DfT Street Manager DOC 0004: Test Strategy
- Delivery Pipeline
- In Sprint Activities
- Exploratory Testing
- Unit testing
- Integration testing
- UI testing
- Accessibility testing
- Security testing
- Performance testing
- Environments
The testing activities described in this document form part of the definition of a feature being 'production ready'.
The different classes of test illustrated in this diagram each exercise functional qualities of the system in different ways.
In this project we aim to include non-functional testing (performance/security/accessibility) early, using automation. This will help avoid the situation where all the effort to do these activities is left until immediately before a release or audit. There will still be some manual exploratory testing required in these areas.
Generally, an effective test plan incorporates a range of testing activities rather than favouring just one approach, such as purely automated or purely manual testing.
The test activities that should occur during each sprint ceremony are listed below.
- Test as many requirements (functional, performance, security, ...) in sprint as practically possible.
- Agile testing is a 'whole-team' approach - the whole team must take responsibility for quality on the project
- Test pyramid: a well-balanced test set will have more unit tests than component tests than whole stack tests. The underlying principle is to detect errors on the smallest possible subsystem, because smaller tests run faster and localize diagnosis of errors.
- Automate wherever possible. This includes unit and integration testing, functional and non-functional testing.
- Clearly repeatable automated tests. We will try to avoid manual set-up and long-lived (mutable, drifting) environments.
- Pipeline: start simple and refactor as necessary. The team should start with all tests in a single phase. As the test set grows and uses more resources (e.g. performance tests), the team will split the pipeline into stages that run serially and/or in parallel.
- Test-driven development: tests should specify desired behavior and drive what is implemented. Testing and coding should proceed in parallel as a single activity throughout development.
- Testing will provide continuous feedback to the team in the form of automated test results, discoveries made during exploratory testing and observations made by business users of the system
- Collaborate with the users/business stakeholders to clarify and prioritise requirements
- Assure that the user stories can be adequately tested; use specification by example for non-trivial business rules.
- Agree the testing approach/requirements for stories between testers/BAs/POs (this ensures a shared understanding for implementation and verification, making sign-off with the PO after testing easier)
- Highlight testing impacts on user story estimation
- Establish clear understanding of the test approach for each story among the group
- Understand how new features/functionality will affect existing functionality and regression testing
- Size stories and provide estimates from a test perspective with a view on whether or not the team have capacity to cover the anticipated test effort
- Establish a clear view and plan of testing tasks that need to be completed for the sprint (e.g., manual tests, automated tests, non-functional tests, test data prep, test environment prep etc.)
- Complete/co-ordinate any test preparation activities that may be required for the sprint (e.g., test data, test environment, deciding on appropriate candidates for automation)
- Conduct exploratory testing to investigate unknown and unexpected behavior within the application
- Design and develop automated functional unit and integration tests for all stories
- Design and prepare non-functional tests – Performance, Security, Accessibility for stories which require them
- Provide feedback to the PO on the functionality so that they can make informed decisions on how to proceed with delivery
Before starting a story, the developer should consult with the BA and tester (three amigos) to agree the implementation and testing approach. This should define any additional testing requirements, such as what browser tests to update and performance tests to write.
After finishing development, the developer should discuss the following points with the tester:
- Manual tests carried out by the developer. A task cannot be accepted if the acceptance criteria have not been verified by the developer
- Test automation coverage
- Any non-functional tests done
- Changes are deployed and confirmed on development environment and ready to go onto test environment
- Additional test ideas for further exploratory testing
- The person identifying a defect should raise it in the first instance with the developer of the feature (face-to-face communication)
- Bugs which are discovered during development of features being delivered in-sprint should be fixed in-sprint
- Bugs discovered later will be added to the backlog
- These will be triaged, and each sprint will have reserved time set aside to fix bugs
- If there is an occasion where a defect has still not been resolved by the end of a sprint and the Product Owner has confirmed acceptance of the related user story, then:
- The user story will be modified by the PO to reflect any deviations from acceptance criteria
- A defect will be formally logged and will follow the normal process described above
Defects should at a minimum capture the following information (this can be used as a template for submission to the backlog):
- Description: Succinct high-level description of the issue
- Environment: (Dev/Test/Staging/Production)
- Build number: Identify the build against which the issue was produced
- Steps to reproduce: Numbered sequence of steps taken to cause the issue (include any data, screenshots etc. which may be required to facilitate this)
- Ensure that the team remains clear on test activities that need to be completed during, and before the end of, the sprint
- Show and Tell is an opportunity to elicit feedback to be incorporated into the test approach for subsequent sprints
- Retrospectives will provide input from a test perspective and look to continuously improve the team's test approach in future sprints
Exploratory testing is effectively manual testing which is unguided by a pre-defined set of scripts.
Manual testing is human-present testing. A human tester comes up with scenarios that will cause software to fail using their experience, creativity and intuition.
This gives a better chance of finding bugs related to the underlying business logic of the application.
Manual testing has some issues. It can be slow, too ad hoc, not repeatable and not reproducible.
To mitigate these issues, we propose the following strategy for any manual exploratory testing activities which are carried out in-sprint.
- Any new feature of the solution completed in-sprint should have exploratory testing carried out against it
- The feature should always be tested by a person other than the one who developed it
- Focus on the goals of the system - think like a user of the system
- Tests are unscripted, but should still be planned and the outcome accurately captured (See below)
A popular model for exploratory testing is the Tourist metaphor. At a minimum, the Landmark and Intellectual tour models listed in the above link are good approaches to use in combination, though we recommend reading through all of the approaches listed.
Exploratory testing generates documentation as tests are being performed instead of ahead of time in the form of a rigid test plan.
We will initially adopt a light-weight approach to capturing the output of an exploratory test in the form of a simple one-page report, which states:
- Name of the feature tested
- Goal of the exploratory test
- Note how the test was conducted
- List any issues identified in enough detail to reproduce (i.e. a sequential, numbered set of steps)
- Any attached test data or screenshots that might be useful
There are many tools which can be used to support exploratory testing, such as screen recorders and loggers; however, we will adopt the simple proposal above and re-assess if need dictates.
Information on the scope of an exploratory test can be found here: Gov Exploratory Testing Tips
As automated testing of mapping functions is extremely difficult, we will use exploratory testing of mapping screens after changes. A mindmap of common functions to test will be created to guide developers and testers on the sort of actions which should be performed.
Tools: mocha, chai, supertest, sinonJS
All application code should be unit tested with the tests run on commit and coverage checked. Ensuring this is the responsibility of the developer and code reviewer.
All complex client-side JavaScript should be unit tested where possible. This excludes mapping JavaScript (see below).
Due to the difficulty of unit testing mapping JavaScript (OpenLayers used on mapping screens), it will not be unit tested as it provides limited value. Mapping screen logic should be checked via browser testing.
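As an illustration only, a unit test in this stack might look like the following minimal sketch; the service and repository modules shown are hypothetical examples, not actual Street Manager components.

```javascript
// Minimal sketch of a unit test using mocha, chai and sinon.
// permit-service and permit-repository are hypothetical modules used for
// illustration; sinon stubs isolate the unit under test from its dependencies.
const { expect } = require('chai');
const sinon = require('sinon');

const permitService = require('../src/services/permit-service');
const permitRepository = require('../src/repositories/permit-repository');

describe('permitService.getPermitSummary', () => {
  afterEach(() => sinon.restore());

  it('flags permits that are past their end date as expired', async () => {
    // Stub the repository so no database access happens at unit level
    sinon.stub(permitRepository, 'getById').resolves({ id: 1, endDate: '2000-01-01' });

    const summary = await permitService.getPermitSummary(1);

    expect(summary.expired).to.equal(true);
  });
});
```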
Tools: mocha, chai, knex
Integration tests are automated tests written by the developer which do not include mocks or stubs for dependencies on other components or external services, but instead test against (ideally) physically deployed components or test harnesses.
The Development environment will act as the integration environment against which these tests will run.
This will include a development database and any other instances of services against which the tests should run.
Integration tests which create data (in the form of DB records, file uploads or similar) should include a teardown
step to ensure that the environment is left in a consistent state comparable to before the test was executed.
Integration tests should include error scenarios and edge cases covering APIs that fail to respond, databases that are down and integration points that are slow. In these cases it is acceptable to mock or re-configure the application to create the correct conditions.
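As a minimal sketch of the setup/teardown pattern described above (the table, column names and knexfile path are illustrative assumptions only):

```javascript
// Sketch of an integration test that writes real records via knex and
// cleans them up afterwards, leaving the environment as it was found.
const { expect } = require('chai');
const knex = require('knex')(require('../knexfile').development);

describe('permits table integration', () => {
  afterEach(async () => {
    // Teardown: remove any data created by this test
    await knex('permits').where({ reference: 'TEST-0001' }).del();
  });

  after(() => knex.destroy());

  it('persists and retrieves a permit record', async () => {
    await knex('permits').insert({ reference: 'TEST-0001', status: 'submitted' });

    const row = await knex('permits').where({ reference: 'TEST-0001' }).first();
    expect(row.status).to.equal('submitted');
  });
});
```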
Tools: mocha, chai, webdriver.io, knex
UI testing is focussed on flow through the application rather than on functionally testing the application. Tests will focus on happy-path flows and exclude negative testing (e.g. validation logic will not be UI tested). This avoids the test ice cream cone problem (see here) of long-running, complex, flaky browser tests which cannot be used by developers for quick verification.
The UI test purely tests:
- Page Content - expected elements are visible
- Forms - user can complete all fields they need to and can click buttons
- Complex JavaScript - page-specific JavaScript functions work as expected (such as mapping, tested to a limited extent)
- Navigation - user brought to the expected page following specific action
IMPORTANT: data, including the values entered on forms and any results returned from the server, is NOT part of a UI test.
Testing of values is the responsibility of integration testing. UI tests should be kept to a minimum so that running them does not require large amounts of time (which would discourage their use as part of development).
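For illustration, a happy-path UI test in webdriver.io might look like the sketch below; the page URL and selectors are hypothetical.

```javascript
// Sketch of a happy-path UI test with webdriver.io: it checks visibility,
// form interaction and navigation only, and makes no assertions on data values.
// Page URLs and selectors are hypothetical.
const { expect } = require('chai');

describe('create permit flow', () => {
  it('lets the user complete the form and reach the confirmation page', async () => {
    await browser.url('/permits/new');

    // Page content: expected elements are visible
    expect(await $('#permit-reference').isDisplayed()).to.equal(true);

    // Forms: the user can complete the fields they need to and click buttons
    await $('#permit-reference').setValue('TEST-0001');
    await $('button[type="submit"]').click();

    // Navigation: the user is brought to the expected page
    expect(await browser.getUrl()).to.include('/permits/confirmation');
  });
});
```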
Full details are documented in the section below on cross-browser testing.
As part of regular builds, the browser tests should be run against a range of browsers and devices (based on user survey results) using a testing platform (SauceLabs/Browserstack).
Based on user survey responses and GOV.UK Service Manual browser requirements.
- Windows
- Internet Explorer 11
- Edge (latest versions)
- Google Chrome (latest versions)
- Mozilla Firefox (latest versions)
- macOS
- Safari 9+
- Google Chrome (latest versions)
- Mozilla Firefox (latest versions)
Earlier versions of IE are excluded due to security concerns and the lack of users reporting using them. Mobile devices are excluded as currently only Planners/HA noticing officers, based in offices using desktops/laptops, use the service; the site will use standard GDS styles and will be responsive should this change later.
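As a sketch only, this matrix might be expressed as webdriver.io capabilities for a cloud platform such as SauceLabs or BrowserStack; the exact capability keys and platform names depend on the platform and webdriver.io version chosen.

```javascript
// Sketch of the browser matrix as webdriver.io capabilities (wdio.conf.js).
// Capability keys and platform names are indicative assumptions and depend
// on the cloud platform (SauceLabs/BrowserStack) and wdio version in use.
exports.config = {
  capabilities: [
    { browserName: 'internet explorer', version: '11.0', platform: 'Windows 10' },
    { browserName: 'MicrosoftEdge', platform: 'Windows 10' },
    { browserName: 'chrome', platform: 'Windows 10' },
    { browserName: 'firefox', platform: 'Windows 10' },
    { browserName: 'safari', platform: 'macOS 10.13' },
    { browserName: 'chrome', platform: 'macOS 10.13' },
    { browserName: 'firefox', platform: 'macOS 10.13' }
  ]
};
```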
As part of non-functional testing, accessibility audits will be conducted on a continuous basis by the team.
It is important to note that Accessibility Auditing and Accessibility Testing are not the same.
Tools: Pa11y, WAVE toolbar
Developers will consider accessibility requirements as they design the screens for the application.
Tools such as Pa11y will be used to help automate auditing the screens for WCAG AA compliance.
For any new screens that are added to the application, developers are expected to confirm that this audit check passes without any errors as part of the review process using browser plugins like WAVE toolbar.
Accessibility auditing is performed both manually by the tester and automatically by tools as part of the continuous integration process.
GDS require that web applications be developed to meet the WCAG 2.0 Level AA accessibility standard.
All web content, including both the internal and external facing sites, must meet all of the criteria defined in the standard that are required for Level AA. The aim of auditing is to highlight any areas of the application screens that may not be compliant with the guidelines.
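A minimal sketch of how such an automated audit could run in the pipeline using Pa11y's Node API is shown below; the page list is illustrative and would normally be driven by the application's routes.

```javascript
// Sketch of an automated WCAG 2.0 AA audit with Pa11y.
// The URLs below are placeholders; a real run would audit every screen.
const pa11y = require('pa11y');

const pages = [
  'http://localhost:3000/',
  'http://localhost:3000/permits/new'
];

(async () => {
  for (const url of pages) {
    const results = await pa11y(url, { standard: 'WCAG2AA' });
    if (results.issues.length > 0) {
      console.error(`${url}: ${results.issues.length} accessibility issue(s) found`);
      process.exitCode = 1; // fail the CI step so the audit blocks the build
    }
  }
})();
```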
An external accessibility audit of the solution may be performed prior to leaving private Beta.
Tools: OWASP Zap, BurpSuite, WatchDog (below)
See Security overview for more details.
Developers are responsible for ensuring that new web application features developed meet the OWASP security guidelines as a minimum standard.
The OWASP Top Ten is a useful awareness document which gives a broad overview of where most web application security flaws will be found.
"Using Components With Known Vulnerabilities" is now part of the OWASP Top 10 - insecure libraries can pose a large risk to web applications.
Using automated tools to receive continuous feedback on the security of our node applications is good practice which mitigates the risk of introducing security flaws early in delivery. Where possible, we should be scanning the different parts of our solution using a tool or service, such as NSP, as part of the automated build process.
We expect an external security audit will be carried out prior to leaving private Beta.
As with any other element of testing strategy - variety is good when it comes to security testing a running application.
Kainos have developed a standalone docker image containing several common security testing tools which can scan a web application to assess a number of different attack vectors.
Included tools are:
- Arachni - scanner and penetration testing framework
- sslyze - identify SSL misconfigurations
- SQLMap - automate SQL injection flaw detection
- Garmr - Inspect responses for basic security requirements
Instructions for how to download, install and run Watchdog are included in the README. A worked example is shown below.
The following steps can be used to run a local manual test against the external web application (for example):
- Run the application locally
- Clone the watchdog repo
```
cd watchdog/vagrant
vagrant up build
vagrant ssh run
sudo -i
docker run --rm -it -e "GAUNTLT_ATTACK_SUBJECT=localhost:3000" -v /attacks:/data moomzni/gauntlt
```
Installing and running the docker image as part of a CI process is possible and this is also documented in the README.
The techniques outlined above focus on the identification of issues with the basic structure of the web application itself.
However, at a higher level there exists the possibility that security flaws will exist within the design of the application, and this type of error is usually both harder to detect and to fix.
This is where security must be considered as part of either automated unit and integration tests or the manual exploratory testing conducted during the sprint. Common issues to look out for include:
- Bypassing required navigation - In a sequence of screens, can the pages be requested out of order? Can certain steps be bypassed? This can be easily tested by examining the appropriate URLs and then using this information to navigate to an out-of-sequence URL on subsequent attempts (see the sketch after this list)
- Attempting privileged operations - Catalog all the links for actions accessible only as an admin user. As a regular user or guest, attempt to access each in turn to check for privilege escalation
- Abusing predictable identifiers - If an ID in a resource URL is easily predictable (e.g. sequential) then it may be possible to access resources which the user should not be able to see
- Abusing repeatability - Any given action should bear the question - what if I do this again? If you can do it again, how many times can you do it and what happens? This type of test is a good candidate for automation.
- Abusing high-load actions - Actions like image upload, loading in files or any other action which could incur a high resource cost are candidates for DoS attacks.
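As a minimal sketch, checks for the first two issues above could be folded into the automated supertest suite; the routes and the loginAs helper shown here are hypothetical.

```javascript
// Sketch of security checks expressed as automated tests with supertest.
// The routes and the loginAs test helper are hypothetical examples.
const request = require('supertest');
const { expect } = require('chai');

const app = require('../src/app');
const { loginAs } = require('./helpers/auth');

describe('security checks', () => {
  it('rejects privileged operations attempted by a regular user', async () => {
    const cookie = await loginAs('regular-user');
    const res = await request(app).delete('/admin/users/42').set('Cookie', cookie);
    expect(res.status).to.be.oneOf([401, 403]);
  });

  it('does not serve the confirmation step when requested out of sequence', async () => {
    const cookie = await loginAs('regular-user');
    const res = await request(app).get('/permits/confirmation').set('Cookie', cookie);
    // Expect a redirect back to the start of the flow rather than the page itself
    expect(res.status).to.equal(302);
  });
});
```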
Common attacks and specific areas of concern are highlighted in the Security Overview.
Tools: Artillery, JMeter
Both web and API components will need to be performance tested to ensure the service meets its expected load. Performance testing will be carried out regularly as part of automated testing, so it is not all left until immediately prior to releases.
TBC Response time targets - required before specific performance testing starts
Since Street Manager is not replacing a centralized system, we do not know what demand we will face. However, we know that organisations will transition over a period of months, so we expect to have time to react. We will focus our end of sprint and release testing on finding our throughput headroom (within response time requirements) and where in the architecture our current constraint lies.
As part of our regular pipeline, we will run regression tests with scaled-down demand and resources.
Developers will be responsible for defining performance tests for new API endpoints they create and for including new screens in web performance tests with existing scripts. The extent required should be decided before starting the story, by the Technical Leads and in Three Amigos sessions.
Prior to major releases there should be a more rigorous performance testing exercise to ensure that all main flows and functional areas have been performance tested to meet expected load.
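Full load tests will use Artillery or JMeter; as a lightweight illustration only, a pipeline smoke check against an agreed response-time target could look like the sketch below (the endpoint and the 500ms value are placeholders until response-time targets are confirmed).

```javascript
// Illustrative smoke check of a single endpoint against a response-time target.
// This does not replace Artillery/JMeter load tests; the endpoint and the
// 500ms target are placeholders until response-time targets are agreed.
const http = require('http');

const TARGET_MS = 500;
const url = 'http://localhost:3000/api/permits';

const started = Date.now();
http.get(url, (res) => {
  res.resume(); // drain the response body
  res.on('end', () => {
    const elapsed = Date.now() - started;
    console.log(`${url} responded in ${elapsed}ms (status ${res.statusCode})`);
    if (elapsed > TARGET_MS) {
      console.error(`Response time exceeded the ${TARGET_MS}ms target`);
      process.exitCode = 1;
    }
  });
}).on('error', (err) => {
  console.error(`Request failed: ${err.message}`);
  process.exitCode = 1;
});
```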
Tools: JMeter, Wraith
Performance testing mapping screens is difficult, as it involves the client browser executing JavaScript and external dependencies such as the WMTS provider for rendering the base map layer. Approaches suggested by other projects include using static data, so the elements on the map are known, and measuring how long the map takes to appear completely. Screenshot comparison of map screens (Wraith) can then be used to check that the map has rendered correctly.
Performance testing should be used to check that maps respond within the expected time for normal mapping scenarios populated with a realistic load of data (sample works data), monitoring the response times of map APIs and render time on client screens.
External mapping integration (exposed WMS/WFS layers) should also be tested, to ensure that the expected load on GeoServer (serves WMS/WFS layer) and database is not too high.
Within the CI pipeline, we will try to keep our tests self-contained. Practically, this means avoiding reuse of stateful resources across different runs. Rather, the CD pipeline will provision resources needed to run tests. See Solution Architecture for details. There are practical limits to this: spinning up databases from scratch would slow down the pipeline.
We expect to use a long-lived pre-prod environment, but this will be managed in a similar way to Production, rather than by our pipeline.
The CD pipeline will save test results. These results will be traceable to specific builds/commits to track changes, and this will include infrastructure provisioning code, and parameters of load generation, where appropriate.
All developer changes will be automatically released to the Development environment. This environment will change rapidly and will frequently be unstable.
It is used for integration tests of developer changes and for manual developer checks before promoting a release to the Test environment.
The Test environment hosts releases promoted from the Development environment. It should be relatively stable, and developers should avoid breaking changes reaching it.
It is used for automated browser testing and for manual testing and verification of stories. Stories should be signed off here.
The Staging (pre-production) environment hosts releases before they are deployed to Production. It should be stable and reflect the production environment as much as possible.
It is used to validate releases against production-like conditions.
It is also used for performance and security testing, which should reflect production conditions.