[praeparātiō ex automatīs] MVP of automation to pre-process external data into the LSF internal file format

- **Related, but not identical, issues:**
  - _**MVP of read access to Wikidata #3**_ 
  - _**Automate SPARQL query generation to Wikidata by items with P #40**_
- Related concepts
  - https://en.wikipedia.org/wiki/Reference_data
  - https://en.wikipedia.org/wiki/Emergency_management#Preparedness
  - https://en.wikipedia.org/wiki/Category:Disaster_preparedness
- **Projects with well-structured scrapers (many sources of external reference data)**
  - https://github.com/OCHA-DAP?q=scraper
  - https://github.com/datasets
    - Beyond the data, most also contain the scripts that process it. Some may use https://github.com/datahq/dataflows

---

This issue is a minimum viable product of one or more "crawlers", "scripts", or "converters" that transform external dictionaries into the working format. These are the dictionaries we would label `origo_per_automata` (origin through automation), as opposed to `origo_per_amanuenses` (origin through amanuenses), which is the path we have mostly optimized for so far.
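As a rough illustration of what such a converter could look like, here is a minimal sketch. It assumes the external source is a CSV file and that the working format is also tabular. This issue does not specify the LSF internal format, so the function `convert`, the mapping `FIELD_MAP`, and the column names `codicem`, `nomen`, and `origo` are hypothetical placeholders, not the project's actual schema.

```python
import csv
import io

# Hypothetical mapping: external column name -> internal field name.
FIELD_MAP = {"iso3": "codicem", "name": "nomen"}


def convert(external_csv, field_map, origin="origo_per_automata"):
    """Convert external CSV text to a (hypothetical) internal tabular format.

    Only columns listed in field_map are kept; every output row is tagged
    with its origin so automated rows can be told apart from manual ones.
    """
    reader = csv.DictReader(io.StringIO(external_csv))
    out = io.StringIO()
    fieldnames = list(field_map.values()) + ["origo"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        converted = {
            internal: row[external].strip()
            for external, internal in field_map.items()
        }
        converted["origo"] = origin
        writer.writerow(converted)
    return out.getvalue()


if __name__ == "__main__":
    sample = "iso3,name,extra\nBRA, Brazil ,x\n"
    print(convert(sample, FIELD_MAP), end="")
```

Keeping the column mapping as plain data, separate from the conversion logic, would let each new external source be added by writing a mapping rather than a new script.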


## Focuses
- The _data_ we're interested in is **already reference data**, which is a smaller subset of what is shared
  - It is more important to have fewer datasets that are actively updated from the primary source and of very high quality than to hoard data and neglect the important ones.
- **We're really interested in reference data whose use we can document**
  - This also means we may intentionally name the data fields in ways that make them easier to document, even if this means automatically generating user documentation
  - The entire idea must allow collaborators to help translate the documentation (not needed in the short term, but it should at least be planned from the very start)
- **Reference data can be public, but most information managers will deal with sensitive data**
  - The best potential end users, the information managers, are likely to ingest all the data as soon as a new emergency happens.
  - Even if information managers have good data proficiency, or know some programming language, they're likely to be overloaded, so we need to make it as easy as possible to avoid human error (on the reference tables)
- **We're interested in reference data useful for disaster preparedness**
  - This makes it even more important to optimize for faster releases and user documentation, to make it less likely that users leak sensitive data, and to make the data schema interoperable at the international level
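The point about naming fields so that user documentation can be generated automatically, with translations contributed later, could be sketched as follows. This is only an illustration under stated assumptions: the `FIELDS` structure, the field names, and the function `render_field_docs` are hypothetical and are not taken from the project.

```python
# Hypothetical field definitions: each field carries its own documentation,
# keyed by language code, so translators only need to add new keys.
FIELDS = [
    {
        "name": "codicem",
        "type": "string",
        "doc": {"en": "Stable code of the item.", "pt": "Código estável do item."},
    },
    {
        "name": "nomen",
        "type": "string",
        "doc": {"en": "Human-readable name."},
    },
]


def render_field_docs(fields, lang="en", fallback="en"):
    """Render a Markdown table documenting each field.

    If a description is missing in the requested language, the fallback
    language is used, so untranslated fields are still documented.
    """
    lines = ["| Field | Type | Description |", "| --- | --- | --- |"]
    for field in fields:
        doc = field["doc"].get(lang) or field["doc"].get(fallback, "")
        lines.append(f"| `{field['name']}` | {field['type']} | {doc} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_field_docs(FIELDS, lang="pt"))
```

Falling back to a default language means documentation can ship before translations are complete, which fits the idea of planning for translation from the start without requiring it in the short term.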


---

### External examples of types of reference data

#### International Federation of Red Cross and Red Crescent Societies | IFRC Data initiatives
- https://data.ifrc.org/

#### Common Operational datasets (overview)
- https://en.wikipedia.org/wiki/Common_Operational_Datasets
- https://emergency.unhcr.org/entry/50306/common-operational-datasets-cods-and-fundamental-operational-datasets-fods
- https://data.humdata.org/cod
- https://storymaps.arcgis.com/stories/dcf6135fc0e943a9b77823bb069e2578
- https://reliefweb.int/sites/reliefweb.int/files/resources/A126E188F0B88383C1257834004858BB-Full_Report.pdf


#### 2007 reference (somewhat outdated)
From https://interagencystandingcommittee.org/system/files/legacy_files/Country%20Level%20OCHA%20and%20HIC%20Minimum%20Common%20Operational%20Datasets%20v1.1.pdf

##### Table One: Minimum Common Operational Datasets

| **Category** | **Data layer** | **Recommended scale of source material** |
| --- | --- | --- |
| Political/Administrative boundaries | Country boundaries, Admin level 1, Admin level 2, Admin level 3, Admin level 4 | 1:250K |
| Populated places (with attributes including: latitude/longitude, alternative names, population figures, classification) | Settlements | 1:100K–1:250K |
| Transportation network | Roads, Railways | 1:250K |
| Transportation infrastructure | Airports/Helipads, Seaports | 1:250K |
| Hydrology | Rivers, Lakes | 1:250K |
| City maps | Scanned city maps | 1:10K |

##### Table Two: Optional Datasets

| **Category** | **Data layer** | **Recommended scale of source material** |
| --- | --- | --- |
| Marine | Coastlines | 1:250K |
| Terrain | Elevation | 1:250K |
| National map series | Scanned topo sheets | 1:50K–1:250K |
| Satellite imagery | Landsat, ASTER, Ikonos, Quickbird imagery | Various |
| Natural hazards | Various | Various |
| Thematic | Various | Various |


