dft street manager doc 0002 solution architecture

Solution Architecture

Author(s) - Alistair Cowan, Phil Allen, Steven Alexander

Introduction

Street Manager is a centralised system for collecting and processing street work information, used by Promoters (utility companies) and Local Authorities.

The vision of the project is:

To transform the planning, management and communication of street works through open data and intelligent services to minimise disruption and improve journeys for the public.

Audience

This document is aimed at Product Delivery, Software Architects, Delivery and Operations teams.

Architectural design approach

In this document, we describe the system from different viewpoints so that all members of the delivery and operations teams have a shared understanding of the system.

We are using a cross-functional agile approach to delivery, so all functions of the team are represented (dev/test/operations).

We are taking a DevOps approach to implementation, including fully automated testing, deployment and security testing. Teams are responsible for pushing their code from development to live, taking all non-functional requirements into consideration when starting and completing a story.

Functional view

We use the C4 model to illustrate the overall system architecture. See draw.io for the source of the diagrams below.

System context

draw.io diagram

NOTES:

Assuming GOV.UK Notify for email/SMS/post notifications.
NSG data will be loaded from regular GeoPlace data dumps, which will be imported into the application database. The data will be used in mapping and business logic, initially directly as data in the database, but may be separated into another service later if necessary.

TODO DRAFT: Phil to review

System Containers

Street Manager system

Street Manager system containers

draw.io diagram

Services and responsibilities

Undertaker web

Separate front end for serving undertaker HTML requests so that it can handle load and scale independently.

Requires common elements: GDS styles, mapping JavaScript, security filters.

Local Highways Authority web

Separate front end for serving local highways authority HTML requests so that it can handle load and scale independently.

Requires common elements: GDS styles, mapping JavaScript, security filters.

Work API

API for handling all updates to works data, persisting into database. Works data will use an event data model approach, so all updates to the works will be recorded as events.

Requires common elements: database change management, security filters, API documentation.
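As a rough illustration of the event data model described above, the sketch below records each update as an appended event and derives the current state of a work by replaying its events. The event types (`WORK_PROPOSED`, `DATES_CHANGED`), the field names and the in-memory array are hypothetical placeholders, not the project's schema; the real API would persist events in the database.

```javascript
'use strict';

// Hypothetical in-memory event store; the real Work API would persist events in the database.
const events = [];

// Record an update as an immutable event rather than overwriting existing state.
function recordEvent(workId, type, payload) {
  const event = Object.freeze({
    workId,
    type,
    payload: Object.freeze(Object.assign({}, payload)),
    sequence: events.length + 1
  });
  events.push(event);
  return event;
}

// Derive the current state of a work by replaying its events in order.
// Returns null when no events exist for the work.
function currentState(workId) {
  return events
    .filter(e => e.workId === workId)
    .reduce((state, e) => Object.assign({}, state, e.payload, { lastEvent: e.type }), null);
}

// Usage: two updates to the same work, both kept as events.
recordEvent('W-1', 'WORK_PROPOSED', { street: 'High Street', startDate: '2018-03-01' });
recordEvent('W-1', 'DATES_CHANGED', { startDate: '2018-03-08' });
console.log(currentState('W-1'));
```

Because nothing is overwritten, the full history of a work stays available alongside the derived summary.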

Party API

API for handling all updates to Person and Organisation data, persisting into database. Data will be modelled based on the Universal Person and Organization Data Model approach. Separate to scale and manage independently as other systems may require party details, such as authentication and registration.

Requires common elements: database change management, security filters, API documentation.

Task API

API for adding/checking tasks. Tasks are regularly scheduled jobs, fragile external calls to integration points, or long-running jobs required by the system. They should record their status (created, in progress, completed, failed) and be capable of being re-run in case of failure. Only internal components should call the Task API.

Requires common elements: database change management, API documentation.
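The task lifecycle above can be sketched as a small state machine. This is a minimal sketch only: the status names come from the text, while the transition table, the `attempts` and `lastError` fields and the function names are assumptions made for illustration.

```javascript
'use strict';

// Allowed status transitions (an assumption); a failed task may return to "created" to be re-run.
const TRANSITIONS = {
  created: ['in progress'],
  'in progress': ['completed', 'failed'],
  failed: ['created'],
  completed: []
};

function createTask(name) {
  return { name, status: 'created', attempts: 0 };
}

// Return a copy of the task in the new status, rejecting transitions that are not allowed.
function transition(task, nextStatus) {
  if (!TRANSITIONS[task.status].includes(nextStatus)) {
    throw new Error(`Cannot move task from "${task.status}" to "${nextStatus}"`);
  }
  return Object.assign({}, task, { status: nextStatus });
}

// Run the task's job once and record the outcome as a status.
function runTask(task, job) {
  const running = Object.assign(transition(task, 'in progress'), { attempts: task.attempts + 1 });
  try {
    job();
    return transition(running, 'completed');
  } catch (err) {
    return Object.assign(transition(running, 'failed'), { lastError: err.message });
  }
}

// Re-queue a failed task so that it can run again.
function retryTask(task) {
  return transition(task, 'created');
}
```

A caller would typically run a task, inspect the returned status, and call `retryTask` followed by `runTask` again when the status is `failed`.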

Questions:

- Mapping Server questions:
  - Do we need one? SA - "Assuming yes"
  - Do we need to expose WMS layers publicly? Phil - "Public net - yes. Call to WMS/WFS could be from client-side tool. Unlike for RESTful APIs, which will always be server with a cert (and TBD white-listed IP address). We need a scenario about securing calls to WMS/WFS."
  - How do we authenticate requests? SA - "GeoServer/MapServer supports using authentication and if necessary we can put a whitelisting load balancer in front of it"
  - GeoServer vs MapServer? SA - "Assuming GeoServer"
- Do we need to expose the Work API? SA - "Assuming all UI components need API"
- Should we design for handling EToN messages now? SA - "Assuming no"
- Do we need a Task queue or can we just use DB with worker approach? SA - "Assuming simple solution for Solution Architecture, queue for estimation"

TODO DRAFT: Phil to review

Scenarios

Scenarios show how the components of the solution collaborate on key behaviours of the solution.

See service design board for details on user scenarios and journeys.

External mapping system

Street Manager provides two ways of accessing mapping data: users may use either of the Street Manager websites, or they may use their local mapping tools, accessing Street Manager via the standard WFS and WMS protocols. Street Manager uses separate resources to address these needs.

Street Manager GIS containers

draw.io diagram

The left hand side of the figure shows the stack for users' own mapping tools. Note that the mapping tool's WFS and WMS requests include a valid basic authorisation header. TBD: management of basic auth credentials.

The Mapping Gateway intercepts WFS / WMS calls from the user mapping tool. It provides TLS termination and simple IP load balancing. TBD: it may also check authorisation headers. TBD: the gateway will either be implemented by nginx deployed as part of the application, or by a cloud service.
GeoServer is a standard component that is configured for particular data sources. It is a Java application and will be containerised. TBD: GeoServer may additionally check for a valid basic auth header. TBD: monitoring; at the very least, we have JMX support.
We use a read only replica to keep the GeoServer workload away from our master DB. For the same reason, replication will be asynchronous.
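To make the left-hand stack more concrete, the sketch below shows roughly what a mapping tool's request could look like: a WMS 1.3.0 `GetMap` URL plus an HTTP Basic `Authorization` header. The host name, layer name, credentials, image size and the choice of `EPSG:27700` (British National Grid) are all assumptions; management of the real credentials is still TBD.

```javascript
'use strict';
const { URL } = require('url');

// Build a WMS 1.3.0 GetMap URL. Base URL, layer and bounding box are supplied by the caller.
function buildGetMapUrl(baseUrl, layer, bbox, width, height) {
  const url = new URL(baseUrl);
  const params = {
    SERVICE: 'WMS',
    VERSION: '1.3.0',
    REQUEST: 'GetMap',
    LAYERS: layer,
    STYLES: '',
    CRS: 'EPSG:27700', // assumed coordinate reference system
    BBOX: bbox.join(','),
    WIDTH: String(width),
    HEIGHT: String(height),
    FORMAT: 'image/png'
  };
  Object.keys(params).forEach(key => url.searchParams.set(key, params[key]));
  return url.toString();
}

// Build the HTTP Basic Authorization header value for the given credentials.
function basicAuthHeader(username, password) {
  const token = Buffer.from(`${username}:${password}`).toString('base64');
  return `Basic ${token}`;
}

// Usage with placeholder values; a client would send these to the Mapping Gateway over TLS.
const requestUrl = buildGetMapUrl('https://maps.example.test/wms', 'works',
  [529000, 180000, 530000, 181000], 512, 512);
const requestHeaders = { Authorization: basicAuthHeader('user', 'pass') };
console.log(requestUrl, requestHeaders);
```

Basic credentials are only base64-encoded, not encrypted, which is one reason the gateway's TLS termination matters for this stack.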

The right hand side of the figure shows the stack for the SM web interface. Note that the browser already has a valid login session and has already loaded a Street Manager mapping page.

The browser triggers JavaScript according to user actions, such as panning the map and selecting / deselecting layers. The bespoke JavaScript makes calls to the RESTful Works API, including the session token in the header.
These calls are intercepted by the Gateway, which terminates TLS, validates the session token (redirecting if invalid) and passes the call through to the Works API.
Works API returns resources that include GeoJSON to represent works geometry.
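As an illustration of the last step, the sketch below wraps works records as a GeoJSON `FeatureCollection` that a mapping library such as OpenLayers could render. The property names (`workId`, `status`) and the sample record are hypothetical; GeoJSON (RFC 7946) itself expects WGS84 coordinates in longitude, latitude order.

```javascript
'use strict';

// Wrap a works record's geometry as a GeoJSON Feature.
// The property names are placeholders, not the real resource schema.
function toFeature(work) {
  return {
    type: 'Feature',
    id: work.id,
    geometry: work.geometry,
    properties: { workId: work.id, status: work.status }
  };
}

// Collect several works into a single FeatureCollection for one map layer.
function toFeatureCollection(works) {
  return { type: 'FeatureCollection', features: works.map(toFeature) };
}

// Usage: one work represented as a line along a street.
const example = toFeatureCollection([
  {
    id: 'W-1',
    status: 'proposed',
    geometry: { type: 'LineString', coordinates: [[-0.1276, 51.5072], [-0.1262, 51.5079]] }
  }
]);
console.log(JSON.stringify(example, null, 2));
```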

Reporting and archive system

Street Manager reporting and archive system containers

draw.io diagram

TODO DRAFT: Phil to review

Authentication and Registration system

TODO Steven - need detail on managed authentication solutions available

Information view

Information principles

Full details on the Data model and standards approach are documented here.

SM will enforce organisation-level access policies on a single DB instance.
For UI access only, SM will support messages (e.g. permit requests) in draft state, which may not yet pass all validation rules. The API made available to promoters will support only fully valid messages in their final state.
SM will preserve all successfully created messages between promoters and authorities immutably: requests, refusals, variations, and so on. Shared entities with mutable state will be summaries of these primitive messages. To that extent, the SM information model will be event-orientated.
SM will support occurrence times that precede insertion (capture) times, allowing SM to "catch up" during disaster recovery (DR) without losing time information.
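The last two principles can be sketched together: messages are stored immutably with both an occurrence time and a capture time, and the mutable summary is derived by replaying them in occurrence order. This is an assumption-laden sketch; the field names (`occurredAt`, `recordedAt`, `permitId`), the message types and the in-memory log are all hypothetical.

```javascript
'use strict';

// Append-only message log. Each message carries two times (hypothetical field names):
//   occurredAt - when the message was actually issued
//   recordedAt - when SM captured it, which may be later during catch-up
const messages = [];

function capture(message, recordedAt) {
  messages.push(Object.freeze(Object.assign({}, message, { recordedAt })));
}

// Summarise a permit by replaying its messages in occurrence order, not capture order.
function summarise(permitId) {
  return messages
    .filter(m => m.permitId === permitId)
    .sort((a, b) => Date.parse(a.occurredAt) - Date.parse(b.occurredAt))
    .reduce((summary, m) => ({ permitId, status: m.type, asOf: m.occurredAt }), null);
}

// Usage: during catch-up the refusal is captured before the request it answers.
capture({ permitId: 'P-1', type: 'refused', occurredAt: '2018-03-02T10:00:00Z' }, '2018-03-05T09:00:00Z');
capture({ permitId: 'P-1', type: 'requested', occurredAt: '2018-03-01T10:00:00Z' }, '2018-03-05T09:00:01Z');
console.log(summarise('P-1'));
```

Ordering by occurrence time means the summary still ends on the refusal even though it was captured first.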

Reporting and Analysis

See the Reporting and archive system containers diagram for details and here for implementation details.

In Street Works industry parlance, reporting includes any operational queries where the user needs to export results for wider distribution.

Analysis includes aggregate queries, whether or not the results are for wider distribution.

Initially, SM will provide a SQL endpoint to a read replica for DfT analytical use. DfT statisticians will access this via VPN from their office network.

Deployment View

An overview of the build pipeline, quality gates, etc. will be outlined here. These will be standard across the project.

CI/CD Principles

TODO Ali

Branching Strategy

The project will use a feature branch workflow:

Developers work on feature branches until their work is thoroughly tested and is signed off by the product owner.
They produce a squashed commit that combines all the changes on that branch.
Merging to master entails commitment to release order. The team tests merge commits and tags them as release candidates.
Merging a feature to master means that other features that are in-flight will have to rebase their code.
Fixes are treated as urgent features. That is, other in-flight features should wait on their branches for the fix to be merged to master, so that the rebasing overhead falls on them rather than on the urgent fix.

See here for a tutorial.

This model is based on the "DVSA MOT Code Workflow - 27/02/2017" whitepaper.

Workflow

TODO diagram Ali

TODO description of gates Ali

Build pipeline

TODO diagram Ali

Operations view

Web analytics

SM will integrate with a web analytics service, and will not deploy a self-hosted solution. On grounds of cost-effectiveness, the most likely candidate is Google Analytics Pro. In that case, SM would be what is termed a property. It may sit within a DfT account, or another government account.

The selected tool will conform to the W3C Customer Experience Digital Data Layer (CEDDL) specification, and we will integrate with our web tier on that basis. This will mitigate lock-in and will keep options open for future tag management by DfT staff who do not have development skills.
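For orientation, a CEDDL-style data layer is a plain `digitalData` object that the web tier renders into each page and that analytics tags read. The sketch below is a minimal example under assumptions: the page identifiers, category and event name are illustrative placeholders, and only a small subset of the CEDDL structure is shown.

```javascript
'use strict';

// Build a page-level data layer object following the CEDDL "digitalData" structure.
// All values passed in are illustrative placeholders.
function buildDigitalData(page) {
  return {
    pageInstanceID: `${page.id}-${page.environment}`,
    page: {
      pageInfo: { pageID: page.id, pageName: page.name },
      category: { primaryCategory: page.category }
    },
    event: [],
    version: '1.0'
  };
}

// Record a user interaction as a data layer event.
function addEvent(digitalData, eventName) {
  digitalData.event.push({ eventInfo: { eventName } });
  return digitalData;
}

// Usage: in a browser this object would be assigned to window.digitalData.
const digitalData = addEvent(
  buildDigitalData({ id: 'works-map', name: 'Works map', category: 'mapping', environment: 'dev' }),
  'map-layer-toggled'
);
console.log(JSON.stringify(digitalData));
```

Keeping page code tied to this neutral object, rather than to one vendor's tracking calls, is what allows the analytics tool to be swapped later.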

Integrating with GOV.UK performance platform

As part of the service standard we must integrate with the GOV.UK performance platform.

This will involve engaging with the performance platform team, agreeing what metrics will be supplied and the technical effort in sending these metrics.

Technically the metrics will be sent as a regular task (daily or monthly) via an API call.

Monitoring and Logging

TODO overview of how we will monitor the components: Steven

System monitoring

TODO - Alistair

Application logging

All components will log important events, such as requests and integration/task events. Logs will use JSON emitters (bunyan) and use common names for data items ("username", "Interaction identifier"). See here for recommendations on data to include and exclude (for security).

As a minimum, components should be logging:

Severity
HTTP method
URL
User identifier
Timing info (timestamp, start/stop duration)
Interaction identifier (session id maintained across common interaction requests)
Source name (component)
Source address
Event type

Error logging should also use a consistent form and allow tracing of errors to code location.
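A minimal sketch of a log line carrying the fields listed above follows. Plain `JSON.stringify` stands in here for a bunyan emitter, and the exact key names (`durationMs`, `interactionId` and so on) are assumptions; the agreed common names would be used in practice.

```javascript
'use strict';

// Build one JSON log line carrying the minimum fields listed above.
// Key names are hypothetical; JSON.stringify stands in for a bunyan emitter.
function logEntry(fields) {
  const entry = {
    severity: fields.severity || 'info',
    method: fields.method,
    url: fields.url,
    username: fields.username,
    timestamp: new Date(fields.startedAt).toISOString(),
    durationMs: fields.finishedAt - fields.startedAt,
    interactionId: fields.interactionId,
    source: fields.source,
    sourceAddress: fields.sourceAddress,
    eventType: fields.eventType
  };
  return JSON.stringify(entry);
}

// Usage with placeholder values.
const line = logEntry({
  method: 'GET',
  url: '/works/W-1',
  username: 'user-123',
  startedAt: 1520000000000,
  finishedAt: 1520000000042,
  interactionId: 'session-abc',
  source: 'work-api',
  sourceAddress: '10.0.0.5',
  eventType: 'request'
});
console.log(line);
```

Emitting one JSON object per line keeps the logs machine-parseable for whichever aggregation service is chosen.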

The application will store important event data in the database, so logging will not be used for auditing; see the data model for more details.

TODO - Alistair, review and detail any managed log/aggregation services

Application monitoring

All components should expose:

/healthcheck endpoint which returns 200 if healthy based on dependency checks (database/API etc.)
/status endpoint which returns 200 if application is alive (used to ping service availability)
/metrics endpoint which returns Node application metrics such as CPU/memory usage (using appmetrics)

Each component should implement these endpoints consistently, so that monitoring works the same way for all existing and new components.
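The three endpoints could be sketched with Node's built-in `http` module as below. Several details are assumptions rather than decisions: the `database` check is a stub, a 503 response for an unhealthy component is our choice (the text only specifies 200 when healthy), and `process.memoryUsage()` / `process.cpuUsage()` stand in for the appmetrics module.

```javascript
'use strict';
const http = require('http');

// Hypothetical dependency checks; each resolves true when the dependency is reachable.
const dependencyChecks = {
  database: () => Promise.resolve(true)
};

// Map a request path to a status code and JSON body.
function handle(path) {
  if (path === '/status') {
    return Promise.resolve({ code: 200, body: { alive: true } });
  }
  if (path === '/metrics') {
    // Built-in process metrics stand in for appmetrics in this sketch.
    return Promise.resolve({ code: 200, body: { memory: process.memoryUsage(), cpu: process.cpuUsage() } });
  }
  if (path === '/healthcheck') {
    const names = Object.keys(dependencyChecks);
    return Promise.all(names.map(name => dependencyChecks[name]().catch(() => false)))
      .then(results => {
        const healthy = results.every(Boolean);
        return { code: healthy ? 200 : 503, body: { healthy } };
      });
  }
  return Promise.resolve({ code: 404, body: { error: 'not found' } });
}

const server = http.createServer((req, res) => {
  handle(req.url).then(({ code, body }) => {
    res.writeHead(code, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
});
// server.listen(3000); // the port is an assumption; left commented so the sketch has no side effects
```

Keeping `/status` free of dependency checks means an availability ping does not fail merely because a downstream service is slow, while `/healthcheck` reports on the dependencies themselves.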

TODO - Alistair to review

Development principles

Development architecture overview

System context

draw.io diagram

Technology Stack

Node (v8.x.x, latest LTS) - JavaScript runtime for server-side web and API logic
Express - Node web framework
OpenLayers - JavaScript mapping framework
PostgreSQL with PostGIS extensions - Relational database with GIS functions

Rationale:

The application will require significant client-side JavaScript, so using Node for web/API logic gives a single consistent language for the application, with good support for including GDS styles.
Express is the most common and flexible web framework for Node.
OpenLayers is a mature JavaScript mapping library, and existing GOV.UK solutions have passed Alpha assessment using it (Land Registry LLC).
PostgreSQL scales extremely well, has good managed RDB support in hosting providers and has mature GIS extensions.

Useful links:

Microservice approach

The solution should be split into small, separate components which can be released independently. This aids rapid development, allows independent scaling and keeps each Node application simple and therefore more manageable.

Useful links:

API design and documentation

The API design principles are documented here.

Definition of done

The definition of done is defined here.

Security overview

The security overview is defined here.

Testing strategy

The test strategy is documented here.

Technical overview of Alpha

The technical overview of Alpha is documented here.
