dft street manager doc 0002 solution architecture
- Introduction
- Functional view
- Information view
- Deployment view
- Operations view
- Development principles
- Security overview
- Testing strategy
- Technical overview of Alpha
Street Manager is a centralised system for collecting and processing street work information, used by Promoters (utility companies) and Local Authorities.
The vision of the project is:
To transform the planning, management and communication of street works through open data and intelligent services to minimise disruption and improve journeys for the public.
This document is aimed at Product Delivery, Software Architects, Delivery and Operations teams.
In this document, we describe the system from different viewpoints so that every member of the delivery/operations team has a shared understanding of the system.
We are using a cross-functional agile approach to delivery, so all functions of the team are represented (dev/test/operations).
We are taking a DevOps approach to implementation, including fully automated testing, deployment and security testing. Teams are responsible for pushing their code from development to live, taking all non-functional requirements into consideration when starting and completing a story.
We use the C4 approach to illustrating the overall system architecture. See draw.io for the source of diagrams below.
NOTES:
- Assuming GOV.UK Notify for email/sms/post notifications.
- NSG data will be loaded from regular data dumps from GeoPlace which will be imported into the application database. The data will be used in mapping and business logic, initially directly as data but may be separated into another service later if necessary.
TODO DRAFT: Phil to review
Undertaker web
Separate front end for serving undertaker HTML requests so that it can handle load and scale independently.
Requires common elements: GDS styles, mapping JavaScript, security filters.
Local Highways Authority web
Separate front end for serving Local Highways Authority HTML requests so that it can handle load and scale independently.
Requires common elements: GDS styles, mapping JavaScript, security filters.
Work API
API for handling all updates to works data, persisting into database. Works data will use an event data model approach, so all updates to the works will be recorded as events.
Requires common elements: database change management, security filters, API documentation.
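The event data model approach described for the Work API could be sketched as follows. This is a minimal illustration, not the agreed design: the event types, field names and the `summarise` helper are all hypothetical, and a real implementation would persist events in the database rather than in an array.

```typescript
// Hypothetical event types; the real works event model is not defined in this document.
type WorkEvent =
  | { type: "WorkCreated"; workId: string; promoter: string; street: string }
  | { type: "DatesChanged"; workId: string; startDate: string; endDate: string }
  | { type: "WorkCompleted"; workId: string };

// Hypothetical read model derived from the events.
interface WorkSummary {
  workId: string;
  promoter: string;
  street: string;
  startDate?: string;
  endDate?: string;
  status: "planned" | "completed";
}

// Updates are only ever appended; existing events are never modified.
function appendEvent(log: ReadonlyArray<WorkEvent>, event: WorkEvent): WorkEvent[] {
  return log.concat([event]);
}

// The current state of a work is derived by replaying its events in order.
function summarise(log: ReadonlyArray<WorkEvent>, workId: string): WorkSummary | undefined {
  let summary: WorkSummary | undefined;
  for (const event of log) {
    if (event.workId !== workId) continue;
    switch (event.type) {
      case "WorkCreated":
        summary = { workId: event.workId, promoter: event.promoter, street: event.street, status: "planned" };
        break;
      case "DatesChanged":
        if (summary) summary = { ...summary, startDate: event.startDate, endDate: event.endDate };
        break;
      case "WorkCompleted":
        if (summary) summary = { ...summary, status: "completed" };
        break;
    }
  }
  return summary;
}

// Usage with made-up data.
let workLog: WorkEvent[] = [];
workLog = appendEvent(workLog, { type: "WorkCreated", workId: "W1", promoter: "Example Utilities", street: "High Street" });
workLog = appendEvent(workLog, { type: "DatesChanged", workId: "W1", startDate: "2018-03-01", endDate: "2018-03-05" });
workLog = appendEvent(workLog, { type: "WorkCompleted", workId: "W1" });
const workSummary = summarise(workLog, "W1");
```

Because the summary is rebuilt from the events, the full history of updates to a work remains available.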
Party API
API for handling all updates to Person and Organisation data, persisting into database. Data will be modelled based on the Universal Person and Organization Data Model approach. It is a separate API so that it can be scaled and managed independently, as other systems may require party details, such as authentication and registration.
Requires common elements: database change management, security filters, API documentation.
Task API
API for adding/checking tasks. Tasks are regularly scheduled, fragile external calls to integration points or long running jobs required by the system. They should record their status (created, in progress, completed, failed) and be capable of re-running in case of failure. Only internal components should call the Task API.
Requires common elements: database change management, API documentation.
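The task statuses and re-run behaviour described above could be modelled as a small state machine. This is a sketch under assumptions: the transition table, the `attempts` counter and the `runTask` helper are hypothetical, and real jobs would be asynchronous and persisted rather than run in memory.

```typescript
type TaskStatus = "created" | "in progress" | "completed" | "failed";

interface Task {
  id: string;
  name: string;
  status: TaskStatus;
  attempts: number; // hypothetical field: how many times the task has been started
}

// Assumed transition table: a failed task may return to "in progress" so it can be re-run.
const allowedTransitions: Record<TaskStatus, TaskStatus[]> = {
  "created": ["in progress"],
  "in progress": ["completed", "failed"],
  "completed": [],
  "failed": ["in progress"],
};

function transition(task: Task, next: TaskStatus): Task {
  if (allowedTransitions[task.status].indexOf(next) === -1) {
    throw new Error(`Task ${task.id}: cannot move from "${task.status}" to "${next}"`);
  }
  const attempts = next === "in progress" ? task.attempts + 1 : task.attempts;
  return { ...task, status: next, attempts };
}

// Runs a job and records the outcome as a status rather than letting the error escape.
function runTask(task: Task, job: () => void): Task {
  const running = transition(task, "in progress");
  try {
    job();
    return transition(running, "completed");
  } catch (err) {
    return transition(running, "failed");
  }
}

// Usage: the first attempt fails, the re-run succeeds.
const createdTask: Task = { id: "t1", name: "example-import", status: "created", attempts: 0 };
const firstRun = runTask(createdTask, () => { throw new Error("upstream timeout"); });
const secondRun = runTask(firstRun, () => { /* succeeds this time */ });
```

Recording the failure as a status, instead of discarding the task, is what makes a later re-run possible.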
Questions:
- Mapping Server questions:
- Do we need one? SA - "Assuming yes"
- Do we need to expose WMS layers publicly? Phil - "Public net - yes. Call to WMS/WFS could be from client-side tool. Unlike for RESTful APIs, which will always be server with a cert (and TBD white-listed IP address). We need a scenario about securing calls to WMS/WFS."
- How do we authenticate requests? SA - "GeoServer/MapServer supports using authentication and if necessary we can put a whitelisting load balancer in front of it"
- GeoServer vs MapServer SA - "Assuming GeoServer"
- Do we need to expose the Work API? SA - "Assuming all UI components need API"
- Should we design for handling EToN messages now? SA - "Assuming no"
- Do we need a Task queue or can we just use DB with worker approach? SA - "Assuming simple solution for Solution Architecture, queue for estimation"
TODO DRAFT: Phil to review
Scenarios show how the components of the solution collaborate on key behaviours of the solution.
See service design board for details on user scenarios and journeys.
Street Manager provides two ways of accessing mapping data. Users may use either of the Street Manager websites, or they may use their local mapping tools, accessing Street Manager via the standard WFS and WMS protocols. Street Manager uses separate resources to address these needs.
The left hand side of the figure shows the stack for users' own mapping tools. Note that the mapping tool's WFS and WMS requests include a valid basic authorisation header. TBD: management of basic auth credentials.
- The Mapping Gateway intercepts WFS / WMS calls from the user mapping tool. It provides TLS termination and simple IP load balancing. TBD: it may also check authorisation headers. TBD: the gateway will either be implemented by nginx deployed as part of the application, or by a cloud service.
- GeoServer is a standard component that is configured for particular data sources. It is a Java application. It will be containerised. TBD: GeoServer may additionally check for a valid basic auth header. TBD: monitoring; at very least, we have JMX support.
- We use a read only replica to keep the GeoServer workload away from our master DB. For the same reason, replication will be asynchronous.
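To illustrate the kind of request a user's mapping tool would send through the Mapping Gateway, the sketch below builds a WMS 1.3.0 GetMap URL. The query parameters are the standard GetMap parameters; the gateway host name, layer name and bounding box are hypothetical, and the basic auth header (still TBD above) is not shown.

```typescript
interface GetMapOptions {
  baseUrl: string;                        // hypothetical Mapping Gateway endpoint
  layers: string[];                       // hypothetical GeoServer layer names
  crs: string;                            // coordinate reference system, e.g. "EPSG:27700"
  bbox: [number, number, number, number]; // minx, miny, maxx, maxy in the CRS axis order
  width: number;                          // image width in pixels
  height: number;                         // image height in pixels
}

// Builds a WMS 1.3.0 GetMap request URL from the options.
function buildGetMapUrl(o: GetMapOptions): string {
  const params: Array<[string, string]> = [
    ["SERVICE", "WMS"],
    ["VERSION", "1.3.0"],
    ["REQUEST", "GetMap"],
    ["LAYERS", o.layers.join(",")],
    ["STYLES", ""], // empty value requests the default style for each layer
    ["CRS", o.crs],
    ["BBOX", o.bbox.join(",")],
    ["WIDTH", String(o.width)],
    ["HEIGHT", String(o.height)],
    ["FORMAT", "image/png"],
  ];
  const query = params.map(([k, v]) => `${k}=${encodeURIComponent(v)}`).join("&");
  return `${o.baseUrl}?${query}`;
}

// Usage with made-up values.
const getMapUrl = buildGetMapUrl({
  baseUrl: "https://maps.example.invalid/geoserver/wms",
  layers: ["streetmanager:works"],
  crs: "EPSG:27700",
  bbox: [529000, 180000, 530000, 181000],
  width: 512,
  height: 512,
});
```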
The right hand side of the figure shows the stack for the SM web interface. Note that the browser already has a valid login session and has already loaded a Street Manager mapping page.
- The browser triggers JavaScript according to user actions, such as panning the map and selecting / deselecting layers. The bespoke JavaScript makes calls to the RESTful Works API, including the session token in the header.
- These calls are intercepted by the Gateway, which terminates TLS, validates the session token (redirecting if invalid) and passes the call through to Works API.
- Works API returns resources that include GeoJSON to represent works geometry.
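A resource that includes GeoJSON for works geometry might look like the sketch below. The GeoJSON shapes follow RFC 7946; the property names (`workId`, `promoter`, `status`) and the sample coordinates are hypothetical, since the actual Works API resource format is not defined here.

```typescript
// Minimal GeoJSON types (RFC 7946), limited to the geometry kinds used in this sketch.
type LonLat = [number, number]; // [longitude, latitude]

type Geometry =
  | { type: "Point"; coordinates: LonLat }
  | { type: "LineString"; coordinates: LonLat[] }
  | { type: "Polygon"; coordinates: LonLat[][] };

interface Feature<P> {
  type: "Feature";
  geometry: Geometry;
  properties: P;
}

interface FeatureCollection<P> {
  type: "FeatureCollection";
  features: Array<Feature<P>>;
}

// Hypothetical works attributes carried alongside the geometry.
interface WorkProperties {
  workId: string;
  promoter: string;
  status: string;
}

// Wraps works records as a GeoJSON FeatureCollection that a mapping library can render.
function toFeatureCollection(
  works: Array<WorkProperties & { geometry: Geometry }>
): FeatureCollection<WorkProperties> {
  return {
    type: "FeatureCollection",
    features: works.map((w): Feature<WorkProperties> => ({
      type: "Feature",
      geometry: w.geometry,
      properties: { workId: w.workId, promoter: w.promoter, status: w.status },
    })),
  };
}

// Usage with made-up data: one work drawn as a line along a street.
const worksCollection = toFeatureCollection([
  {
    workId: "W1",
    promoter: "Example Utilities",
    status: "planned",
    geometry: { type: "LineString", coordinates: [[-0.1280, 51.5070], [-0.1270, 51.5075]] },
  },
]);
const worksJson = JSON.stringify(worksCollection);
```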
TODO DRAFT: Phil to review
TODO Steven - need detail on managed authentication solutions available
Full details on the Data model and standards approach are documented here.
- SM will enforce organisation-level access policies on a single DB instance.
- SM will support messages (e.g. permit requests) in draft state, potentially not passing all validation rules, for UI access only. The API made available to promoters will support only fully valid messages in their final state.
- SM will preserve all successfully created messages between promoters and authorities immutably: requests, refusals, variations, and so on. Shared entities with mutable state will be summaries of these primitive messages. To that extent, the SM information model will be event-orientated.
- SM will support occurrence times that precede insertion (capture) times, allowing SM to "catch up" during DR without losing time information.
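The distinction between occurrence time and capture time could be sketched as below. This is an illustration only: the `StoredMessage` shape and the helper names are hypothetical, and timestamps are assumed to be ISO 8601 UTC strings in one fixed format so that they can be compared as strings.

```typescript
// Hypothetical stored message; fields are readonly to reflect that messages are immutable.
interface StoredMessage {
  readonly id: string;
  readonly kind: string;       // e.g. "permit request", "refusal", "variation"
  readonly occurredAt: string; // when the message was actually sent
  readonly capturedAt: string; // when SM inserted the record
}

// Appends a message, stamping the capture time separately from the occurrence time.
function captureMessage(
  messages: ReadonlyArray<StoredMessage>,
  msg: { id: string; kind: string; occurredAt: string },
  now: string
): StoredMessage[] {
  return messages.concat([{ id: msg.id, kind: msg.kind, occurredAt: msg.occurredAt, capturedAt: now }]);
}

// Orders messages by when they happened, not by when they were inserted.
function byOccurrence(messages: ReadonlyArray<StoredMessage>): StoredMessage[] {
  return messages.slice().sort((a, b) =>
    a.occurredAt < b.occurredAt ? -1 : a.occurredAt > b.occurredAt ? 1 : 0
  );
}

// Usage: m2 happened during an outage and is only captured later, during catch-up.
let capturedMessages: StoredMessage[] = [];
capturedMessages = captureMessage(capturedMessages,
  { id: "m1", kind: "permit request", occurredAt: "2018-03-01T09:00:00Z" }, "2018-03-01T09:00:05Z");
capturedMessages = captureMessage(capturedMessages,
  { id: "m3", kind: "variation", occurredAt: "2018-03-01T11:30:00Z" }, "2018-03-01T11:30:02Z");
capturedMessages = captureMessage(capturedMessages,
  { id: "m2", kind: "refusal", occurredAt: "2018-03-01T10:15:00Z" }, "2018-03-01T12:00:00Z");

const insertionOrder = capturedMessages.map((m) => m.id).join(",");
const occurrenceOrder = byOccurrence(capturedMessages).map((m) => m.id).join(",");
```

Here the insertion order is m1, m3, m2 while the occurrence order is m1, m2, m3, so the late insert does not distort the timeline.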
See the Reporting and archive system containers diagram for details and here for implementation details.
In Street Works industry parlance, reporting includes any operational queries where the user needs to export results for wider distribution.
Analysis includes aggregate queries, whether or not the results are for wider distribution.
Initially, SM will provide a SQL endpoint to a read replica for DfT analytical use. DfT statisticians will access this via VPN from their office network.
An overview of the build pipeline, quality gates etc. will be outlined here. These will be standard across the project.
TODO Ali
The project will use a feature branch workflow:
- Developers work on feature branches until their work is thoroughly tested and is signed off by the product owner.
- They produce a squashed commit that combines all the changes on that branch.
- Merging to master entails commitment to release order. The team tests merge commits and tags them as release candidates
- Merging a feature to master means that other features that are in-flight will have to rebase their code
- Fixes are treated as urgent features. That is, other in-flight features should wait on branch for the fix to be merged to master so that the rebasing overhead falls on them rather than on the urgent fix
See here for a tutorial.
This model is based on the "DVSA MOT Code Workflow - 27/02/2017" whitepaper.
TODO diagram Ali
TODO description of gates Ali
TODO diagram Ali
SM will integrate with a web analytics service, and will not deploy a self-hosted solution. On grounds of cost-effectiveness, the most likely candidate is Google Analytics Pro. In that case, SM would be what is termed a property. It may sit within a DfT account, or another government account.
The selected tool will conform to the W3C CEDDL standard and we will integrate with our web tier on that basis. This will mitigate lock-in and will keep open options for future tag management by DfT staff who do not have development skills.
As part of the service standard we must integrate with the GOV.UK performance platform.
This will involve engaging with the performance platform team, agreeing what metrics will be supplied and the technical effort in sending these metrics.
Technically the metrics will be sent as a regular task (daily or monthly) via an API call.
TODO overview of how we will monitor the components: Steven
TODO - Alistair
All components will log important events, such as requests and integration/task events. Logs will use JSON emitters (bunyan) and use common names for data items ("username", "Interaction identifier"). See here for recommendations on data to include and exclude (for security).
As a minimum, components should be logging:
- Severity
- HTTP method
- URL
- User identifier
- Timing info (timestamp, start/stop duration)
- Interaction identifier (session id maintained across common interaction requests)
- Source name (component)
- Source address
- Event type
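The minimum fields listed above could be captured in a single JSON log line, as in the sketch below. It uses plain `JSON.stringify` rather than the bunyan API, and all field names other than "username" are hypothetical choices for illustration, not agreed common names.

```typescript
type Severity = "debug" | "info" | "warn" | "error";

// One field per item in the minimum logging list.
interface LogEntry {
  severity: Severity;
  method: string;        // HTTP method
  url: string;
  username: string;      // user identifier
  timestamp: string;     // ISO 8601 start time
  durationMs: number;    // stop time minus start time
  interactionId: string; // maintained across common interaction requests
  source: string;        // component name
  sourceAddress: string;
  eventType: string;
}

interface RequestInfo {
  method: string;
  url: string;
  username: string;
  interactionId: string;
  sourceAddress: string;
}

// Builds a log entry for a completed HTTP request.
function requestLogEntry(
  source: string,
  req: RequestInfo,
  startedAt: Date,
  finishedAt: Date,
  severity: Severity
): LogEntry {
  return {
    severity,
    method: req.method,
    url: req.url,
    username: req.username,
    timestamp: startedAt.toISOString(),
    durationMs: finishedAt.getTime() - startedAt.getTime(),
    interactionId: req.interactionId,
    source,
    sourceAddress: req.sourceAddress,
    eventType: "request",
  };
}

// Usage with made-up values; each entry is emitted as one line of JSON.
const logLine = JSON.stringify(
  requestLogEntry(
    "work-api",
    { method: "GET", url: "/works/W1", username: "jbloggs", interactionId: "abc123", sourceAddress: "10.0.0.12" },
    new Date("2018-01-15T10:00:00.000Z"),
    new Date("2018-01-15T10:00:00.042Z"),
    "info"
  )
);
```

Keeping every component on the same field names is what allows log lines from different components to be searched together by interaction identifier.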
Error logging should also use a consistent form and allow tracing of errors to code location.
The application will store important event data in the database, so logging will not be used for auditing; see the data model for more details.
TODO - Alistair, review and detail any managed log/aggregation services
All components should expose:
- /healthcheck endpoint which returns 200 if healthy based on dependency checks (database/API etc.)
- /status endpoint which returns 200 if the application is alive (used to ping service availability)
- /metrics endpoint which returns Node application metrics such as CPU/memory usage (using appmetrics)
Each component should implement these endpoints consistently, so that monitoring works the same way for all existing and new components.
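The response logic behind the health and status endpoints could be shared across components along the lines of the sketch below. It is framework-independent and rests on assumptions: the 503 code for an unhealthy component, the response bodies and the function names are hypothetical, and real dependency checks would be asynchronous calls to the database or downstream APIs.

```typescript
// Result of probing one dependency, e.g. the database or a downstream API.
interface DependencyCheck {
  dependency: string;
  healthy: boolean;
}

interface HttpResult {
  status: number;
  body: { status: string; failing?: string[] };
}

// /healthcheck: 200 only if every dependency check passed.
// Assumption: 503 is returned otherwise; the document only specifies the 200 case.
function healthcheckResponse(checks: ReadonlyArray<DependencyCheck>): HttpResult {
  const failing = checks.filter((c) => !c.healthy).map((c) => c.dependency);
  if (failing.length === 0) {
    return { status: 200, body: { status: "healthy" } };
  }
  return { status: 503, body: { status: "unhealthy", failing } };
}

// /status: 200 whenever the process is able to answer, regardless of dependencies.
function statusResponse(): HttpResult {
  return { status: 200, body: { status: "alive" } };
}

// Usage with made-up check results.
const allUp = healthcheckResponse([
  { dependency: "database", healthy: true },
  { dependency: "party-api", healthy: true },
]);
const databaseDown = healthcheckResponse([
  { dependency: "database", healthy: false },
  { dependency: "party-api", healthy: true },
]);
```

Keeping /status independent of dependency checks means an availability ping still succeeds while /healthcheck reports which dependency is failing.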
TODO - Alistair to review
- Node (v8.x.x, latest LTS) - JavaScript language for server side web and api logic
- Express - Node web framework
- OpenLayers - JavaScript mapping framework
- PostgreSQL with PostGIS extensions - Relational database with GIS functions
Rationale:
- The application will require significant client side JavaScript so using NodeJS for web/api logic means a single consistent language for the application with good support for including GDS styles in the application
- Express is the most common and flexible web framework for Node
- OpenLayers is a mature JavaScript mapping library and existing GOV.UK solutions have passed Alpha assessment using it (Land Registry LLC)
- PostgreSQL scales well, has good managed RDB support in hosting providers and mature GIS extensions
Useful links:
- Node with TypeScript
- APVS ODP - GOV.UK node project and source
- Node API generation with Typescript and Swagger
- turbolinks - preserve map state in browser with back/forward navigation
The solution should be split into small separate components which can be released independently. This aids rapid development, allows independent scaling and keeps each Node application simple, which makes it more manageable.
Useful links:
API design principles are documented here.
The definition of done is defined here.
The security overview is defined here.
The test strategy is documented here.
The technical overview of Alpha is documented here.