Commit dcbade4

extend readme with details about valid definitions
1 parent be47ff1 commit dcbade4

1 file changed

+31
-0
lines changed

README.md

Lines changed: 31 additions & 0 deletions
@@ -62,3 +62,34 @@ The general format of a mutation code is:
where `gene` is a gene code (or `nuc` for the genomic nucleotide sequence), `ref` is the nucleotide or amino acid(s) in the reference, and `alt` is the specific nucleotide or amino acid for the mutant. Either of `ref` or `alt` can be omitted if no specific state is required.

Rules can either specify [min|max]_[ref|alt|ambig|oth] OR the call required at a mutation e.g. "N:S235F": (not )[ref|alt|ambig|oth]

## Valid Mutation Definitions
The following are valid ways to describe variants of each type. We prefer the definition at the top of each list, but provide alternatives for backwards compatibility.
* definitions are case-insensitive, e.g. `S` vs `s`
* genes can be written in full, e.g. `orf1ab`, `spike`, or shortened, e.g. `1ab`, `s`
* protein-based definitions may be accepted if the reference JSON includes them, but protein names may not be shortened, e.g. `NSP2`
* all coordinates are 1-based
* for amino acid mutations, `ref` can be longer than one amino acid

SNP:
* nuc:[`ref`]`nucleotide_coordinate`[`alt`]
* snp:[`ref`]`nucleotide_coordinate`[`alt`]

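
To illustrate the SNP syntax, here is a minimal Python sketch that accepts both prefixes with optional `ref` and `alt`. The function name `parse_snp`, the regular expression and the restriction to the bases `A`, `C`, `G`, `T` are assumptions made for this example; they are not taken from the project's own parser.

```python
import re

# Hypothetical pattern: optional ref base, 1-based coordinate, optional alt base.
# Both the "nuc:" and "snp:" prefixes are accepted, case-insensitively.
SNP_RE = re.compile(r"^(?:nuc|snp):([ACGT]?)(\d+)([ACGT]?)$", re.IGNORECASE)


def parse_snp(code):
    """Return (ref, position, alt), or None if code is not a SNP definition."""
    m = SNP_RE.match(code.strip())
    if m is None:
        return None
    ref, pos, alt = m.groups()
    # An empty ref or alt means "no specific state required"; report it as None.
    return (ref.upper() or None, int(pos), alt.upper() or None)


print(parse_snp("nuc:C241T"))  # ('C', 241, 'T')
print(parse_snp("snp:241t"))   # (None, 241, 'T')
```
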
Amino acid mutation:
* `gene`:[`ref`]`amino_acid_coordinate_relative_to_gene`[`alt`]
* `protein`:[`ref`]`amino_acid_coordinate_relative_to_protein`[`alt`]
* `gene`:[`ref`]`amino_acid_coordinate_relative_to_gene` - omitting `alt` allows any other amino acid to be called as alt
* aa:`gene`:[`ref`]`amino_acid_coordinate_relative_to_gene`[`alt`]
* aa:`protein`:[`ref`]`amino_acid_coordinate_relative_to_protein`[`alt`]
* aa:`gene`:[`ref`]`amino_acid_coordinate_relative_to_gene` - omitting `alt` allows any other amino acid to be called as alt

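
The amino acid forms can be sketched in the same way. The Python snippet below is only an illustration: `parse_aa`, the regular expression, the list of reserved prefixes and the choice to lowercase the gene name are assumptions for this example, and the sketch does not check the gene or protein name against a reference JSON.

```python
import re

# Hypothetical pattern: optional "aa:" prefix, gene or protein name,
# optional multi-residue ref, 1-based coordinate, optional single-residue alt.
AA_RE = re.compile(r"^(?:aa:)?([A-Za-z0-9]+):([A-Za-z*]*)(\d+)([A-Za-z*]?)$")

# Prefixes that introduce other definition types and so cannot be gene names.
RESERVED = {"nuc", "snp", "del", "aa"}


def parse_aa(code):
    """Return (gene, ref, position, alt), or None if code does not match."""
    m = AA_RE.match(code.strip())
    if m is None:
        return None
    gene, ref, pos, alt = m.groups()
    if gene.lower() in RESERVED:
        return None
    # Definitions are case-insensitive, so normalise before comparing.
    return (gene.lower(), ref.upper() or None, int(pos), alt.upper() or None)


print(parse_aa("N:S235F"))    # ('n', 'S', 235, 'F')
print(parse_aa("aa:s:D614"))  # ('s', 'D', 614, None)
```

In the second call `alt` is `None`, which corresponds to the forms above where any other amino acid may be called as alt.
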
Deletion:
* del:`nucleotide_coordinate`:`nucleotide_length`
* `gene`:[`ref`]`amino_acid_coordinate`-
* `gene`:[`ref`]`amino_acid_coordinate`del

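
Because all coordinates are 1-based, a nucleotide deletion has to be shifted by one before it can index a 0-based sequence. The Python sketch below shows that arithmetic. It assumes that `nucleotide_coordinate` is the first deleted base, which the text above does not state explicitly; `deletion_slice` and the toy `reference` string are hypothetical names used only for illustration.

```python
import re

# Hypothetical pattern for the nucleotide form del:<coordinate>:<length>.
DEL_RE = re.compile(r"^del:(\d+):(\d+)$", re.IGNORECASE)


def deletion_slice(code):
    """Convert a 1-based del:<start>:<length> code to a 0-based Python slice."""
    m = DEL_RE.match(code.strip())
    if m is None:
        raise ValueError(f"not a nucleotide deletion: {code!r}")
    start, length = int(m.group(1)), int(m.group(2))
    if start < 1 or length < 1:
        raise ValueError("coordinate and length must both be at least 1")
    # 1-based inclusive start becomes a 0-based, half-open range.
    return slice(start - 1, start - 1 + length)


reference = "ACGTACGTAC"  # toy sequence, not a real genome
print(reference[deletion_slice("del:3:4")])  # GTAC
```
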
Insertion (currently parsed but not typed):
* nuc:`nucleotide_coordinate`+`inserted_sequence`
* snp:`nucleotide_coordinate`+`inserted_sequence`
* `gene`:`amino_acid_coordinate_relative_to_gene`+`inserted_sequence`
* aa:`gene`:`amino_acid_coordinate_relative_to_gene`+`inserted_sequence`
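
A rough Python sketch of how the four insertion forms could be told apart follows. It only parses the string, in keeping with the note that insertions are not typed. The name `parse_insertion`, the regular expression and the shape of the returned tuple are assumptions for illustration, not the project's implementation.

```python
import re

# Hypothetical pattern: optional "aa:" prefix, then "nuc", "snp" or a gene name,
# a 1-based coordinate, a literal "+", and the inserted sequence.
INS_RE = re.compile(r"^(?:aa:)?([A-Za-z0-9]+):(\d+)\+([A-Za-z]+)$")


def parse_insertion(code):
    """Return (level, gene, position, sequence), or None if code does not match."""
    m = INS_RE.match(code.strip())
    if m is None:
        return None
    target, pos, seq = m.groups()
    target = target.lower()
    if target in ("nuc", "snp"):
        # Nucleotide-level insertion: there is no gene to report.
        return ("nuc", None, int(pos), seq.upper())
    # Anything else is treated as a gene name (not validated in this sketch).
    return ("aa", target, int(pos), seq.upper())


print(parse_insertion("nuc:22204+GAGCCAGAA"))  # ('nuc', None, 22204, 'GAGCCAGAA')
print(parse_insertion("aa:s:214+EPE"))         # ('aa', 's', 214, 'EPE')
```
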
