| 1 | +# Perseus JSON Parsers |
| 2 | + |
| 3 | +The code in this directory takes raw Perseus JSON and parses it into a |
| 4 | +`PerseusItem` object. If the parse succeeds, the resulting object is guaranteed |
| 5 | +to conform to the `PerseusItem` TypeScript type. |
| 6 | + |
| 7 | +The parser gracefully handles old data formats that don't conform to the TS |
| 8 | +types. It does this by defaulting missing fields and migrating ones that have |
| 9 | +been renamed or restructured. |
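
As a rough illustration of what "defaulting" and "migrating" can mean in practice, here is a minimal sketch. It is not taken from the parser code in this directory: the `Widget` type, the `parseWidget` function, and the assumption that an old `title` field was renamed to `label` are all hypothetical and exist only for this example.

```typescript
// Hypothetical target shape, for illustration only (not a real Perseus type).
type Widget = {
    label: string;
    static: boolean;
};

// Narrow an unknown value to a plain object so its fields can be inspected.
function isRecord(value: unknown): value is Record<string, unknown> {
    return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical parser: accepts both the old and the new data format and
// always returns the new shape, or throws if the data cannot be repaired.
function parseWidget(raw: unknown): Widget {
    if (!isRecord(raw)) {
        throw new Error("expected an object");
    }

    // Migrating: prefer the new field name, fall back to the assumed old one.
    const label = raw["label"] ?? raw["title"];
    if (typeof label !== "string") {
        throw new Error("expected `label` (or legacy `title`) to be a string");
    }

    // Defaulting: old data may omit the field entirely.
    const isStatic = raw["static"] ?? false;
    if (typeof isStatic !== "boolean") {
        throw new Error("expected `static` to be a boolean");
    }

    return {label, static: isStatic};
}

// Old-format input: uses `title` and omits `static`.
const migrated = parseWidget({title: "Old widget"});
```

In this sketch, both old-format and new-format inputs come out in the same shape, so code downstream of the parser only needs to handle the current type.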
| 10 | + |
| 11 | +## Regression testing against old data |
| 12 | + |
| 13 | +The tests in the `regression-tests` directory ensure that the parsing code can |
| 14 | +handle old data formats. **Understand that if you change existing regression |
| 15 | +tests, you risk breaking compatibility with old data.** The regression tests |
| 16 | +were generated from a snapshot of Khan Academy content taken in November 2024. |
| 17 | + |
| 18 | +## Exhaustive testing |
| 19 | + |
| 20 | +You can run an exhaustive test of the parser (testing against every single |
| 21 | +content item) by following the steps documented in |
| 22 | +`exhaustive-test-tool/index.ts`. This test takes about 4 hours to run and |
| 23 | +requires downloading many gigabytes of data, so it does not run as part of our |
| 24 | +normal CI builds. Run this test only if you suspect that the parser has somehow |
| 25 | +drifted out of sync with the production data. |
| 26 | + |
| 27 | +## Architecture |
| 28 | + |
| 29 | +See [ADR #773] for context. [ADR #776] describes why we chose to write our own |
| 30 | +runtime typechecking code (in `general-purpose-parsers/`) rather than use |
| 31 | +a third-party library. |
| 32 | + |
| 33 | +[ADR #773]: https://khanacademy.atlassian.net/wiki/spaces/ENG/pages/3318349891/ADR+773+Validate+widget+data+on+input+in+Perseus |
| 34 | +[ADR #776]: https://khanacademy.atlassian.net/wiki/spaces/ENG/pages/3328147539/ADR+776+Write+our+own+code+to+typecheck+Perseus+data+at+runtime |
| 35 | + |
| 36 | +A good place to start reading this code is `parser-types.ts` and `result.ts`. |
| 37 | +Then you should skim the parsers in `general-purpose-parsers/` to get a sense |
| 38 | +of what's available. The Perseus-specific parsers are all in `perseus-parsers/`. |
| 39 | +The public API is in `index.ts`. |
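
For readers new to this style of runtime typechecking, the sketch below shows the general idea of small parsers that return a result value and can be combined into larger ones. It is an assumption-laden illustration, not a copy of `parser-types.ts` or `result.ts`: the names `Result`, `Parser`, `success`, `failure`, `parseString`, and `arrayOf` are hypothetical and may not match the real definitions.

```typescript
// Hypothetical result type: a parse either succeeds with a value or fails
// with a message. The real type in `result.ts` may look different.
type Result<T> =
    | {type: "success"; value: T}
    | {type: "failure"; message: string};

// Hypothetical parser type: a function from unknown input to a Result.
type Parser<T> = (raw: unknown) => Result<T>;

const success = <T>(value: T): Result<T> => ({type: "success", value});
const failure = (message: string): Result<never> => ({type: "failure", message});

// A primitive parser: accepts only strings.
const parseString: Parser<string> = (raw) =>
    typeof raw === "string" ? success(raw) : failure("expected a string");

// A combinator: builds an array parser out of an element parser.
function arrayOf<T>(element: Parser<T>): Parser<T[]> {
    return (raw) => {
        if (!Array.isArray(raw)) {
            return failure("expected an array");
        }
        const values: T[] = [];
        for (const item of raw) {
            const result = element(item);
            if (result.type === "failure") {
                // Stop at the first element that fails to parse.
                return result;
            }
            values.push(result.value);
        }
        return success(values);
    };
}

const parseStringArray = arrayOf(parseString);
const good = parseStringArray(["a", "b"]);
const bad = parseStringArray(["a", 1]);
```

Returning a result value instead of throwing lets the caller decide how to handle bad data, and the return type of each parser tells TypeScript what shape a successful parse has.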