Author: Roman Scharkov; Version: 0.8.0; Last updated: 2025-06-10
Table of Contents
- Introduction
- Problem
- TIK Syntax Rules
- ICU Encoding
- Configuration Guidelines
- Limitations
- Standards and Conventions
- FAQ
- Special Thanks
"TIK" is an abbreviation for "Textual Internationalization Key". A TIK is simultaneously the source of truth for translation and a unique message identifier within a domain.
TIKs make translation keys human-readable: the key written in the source code closely reflects the actual text shown to end users. This improves context for translators, enables programmatic generation of ICU messages, and supports better automation and CI/CD integration.
TIK enables more efficient workflows by integrating TIK processors with CI and LLMs to give developers immediate feedback on i18n issues before they hit production. It reduces costs by minimizing reliance on human translators and eases pressure on them by offloading routine tasks, allowing experts to focus more on quality assurance.
TIK is designed to be agnostic to both programming languages and natural languages used in application source code.
TIP: Check out the official TIK cheatsheet.
Internationalization (i18n) and localization (l10n) are hard — and most developers avoid them. Supporting multiple languages and regions demands significant effort, expensive tooling, complex error-prone workflows with slow feedback loops, and discipline that many teams are unable to take on.
- Translators often work with vague context, leading to broken translations.
- Messages get over-abstracted for reuse, breaking grammar and structure in many languages.
- Automation is limited by missing metadata and pipelines developers lack control over.
- The feedback loop is slow, brittle, and disconnected from CI/CD.
The result is missing or poor i18n and l10n that signals a lack of polish, undermines user trust, alienates global audiences, and ultimately blocks adoption and growth.
Traditional internationalization relies heavily on key-based systems, where developers assign abstract message identifiers (e.g. "dashboard.report.summary") to translated strings stored in external files.
i18n.ByKey("dashboard.report.summary", numberOfMessages, dateTime)
Keys offer clear benefits, such as:
- Separation of concerns - developers reference keys, while translators manage the actual text.
- Reusability - the same message can be used across different contexts or interfaces.
- Dynamic updates - translation changes go live immediately without redeployment.
- Integration - keys work seamlessly with most existing localization infrastructure.
However, key-based i18n introduces an abstraction layer between the source code and the actual text, making it harder for developers to immediately understand what message is being displayed - and in what form.
Naming is inherently hard - and coming up with meaningful, consistent translation keys can be difficult, especially at scale. Poorly chosen keys often lead to confusion, redundancy, or fragile reuse patterns.
TIKs, by contrast, embed the meaning directly in the code using a naturally readable and self-explanatory format that serves as the source of truth for the i18n pipeline:
reader.String(`You had {# messages} at {time-short}.`, numberOfMessages, dateTime)
ICU messages are a powerful internationalization tool but are too complex, unreadable and error-prone when used directly inside the application source code.
Consider the following example in Go:
i18n.Text(`You had {numberOfMessages, plural,
=0 {no messages}
one {# message}
other {# messages}
} at {time, date, jm}.`, numberOfMessages, dateTime)
With TIK, developers write simple, readable keys and still get the full power of ICU under the hood.
[ignored spaces] [optional context [ignored spaces]] [body] [ignored spaces]
A TIK consists of an optional context and the required text body; surrounding Unicode whitespace is ignored. The text body must not be empty, and neither must the context when one is given.
The TIK context is an optional namespace used to disambiguate message keys.
It is not part of the message’s text body and hence must not be included in the
If a TIK starts with an opening square bracket [ then everything up to the next closing square bracket ] is treated as the context.
// description.
reader.String(`[context] Text.`)
Curly braces { }, square brackets [ ] and the reverse solidus \ are not allowed inside the context:
[{invalid} context] Text.
[[invalid context]] Text.
[invalid\context] Text.
The context must not be empty:
[ ] This context is invalid.
[] This context is invalid.
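The context rules above can be sketched as a small Go function. This is only an illustration, not part of any TIK processor: the name SplitTIK is hypothetical, and two behaviors are assumptions the text does not spell out, namely that whitespace inside the brackets is trimmed before the emptiness check and that an unclosed opening bracket is reported as an error.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SplitTIK is a hypothetical helper that separates the optional context
// from the text body of a TIK, following the rules described above.
func SplitTIK(tik string) (context, body string, err error) {
	// Surrounding Unicode whitespace is ignored.
	s := strings.TrimSpace(tik)
	if strings.HasPrefix(s, "[") {
		end := strings.IndexByte(s, ']')
		if end < 0 {
			// Assumption: an unclosed context is treated as an error.
			return "", "", errors.New("unclosed context")
		}
		// Assumption: whitespace around the context is not significant.
		context = strings.TrimSpace(s[1:end])
		if context == "" {
			return "", "", errors.New("context must not be empty")
		}
		// { } [ and \ are forbidden; ] cannot occur before the first ].
		if strings.ContainsAny(context, `{}[\`) {
			return "", "", errors.New("illegal character in context")
		}
		s = strings.TrimSpace(s[end+1:])
	}
	if s == "" {
		return "", "", errors.New("text body must not be empty")
	}
	return context, s, nil
}

func main() {
	ctx, body, err := SplitTIK("[button.save] Save your planet")
	if err != nil {
		panic(err)
	}
	fmt.Printf("context=%q body=%q\n", ctx, body)
}
```

A real processor would likely also report the position of the offending character so that CI feedback can point at the exact spot in the source.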
TIKs are unique message keys within a domain. However, the same original message text can have different meanings depending on usage. In such cases, context must be added to separate two TIKs that share the same text body.
Example:
<body>
<h1>{
// "save" as in "save from danger". <--- HERE
i18n.Text(`Save your planet`)
}</h1>
<p>{ i18n.Text(`Your planet is in grave danger. Be the hero who saves it!`) }</p>
<dialog>
<p>You're about to exit the simulation.</p>
<form method="dialog">
<button>{
// "save" as in "save to file". <--- HERE
i18n.Text(`Save your planet`)
}</button>
<button>{
// Cancel exiting the simulation.
i18n.Text(`Cancel`)
}</button>
</form>
</dialog>
</body>
In the example above, the web page contains two identical TIKs that result in a single ICU message being produced: Save your planet. In English, the meaning of the word "save" depends on context, which allows this message to be reused across different contexts. But other languages such as German might require two separate messages:
- "Rette deinen Planeten" (as in "save your planet from danger")
- "Speichere deinen Planeten" (as in "save your planet to file")
Since one TIK can never refer to two different messages, a new TIK must be created, yet sometimes the original text must be preserved. In this case, a context can be applied to either (or both) messages to disambiguate them:
// "save" as in "save to file".
i18n.Text(`[button.save] Save your planet`)
The resulting TIK defines the context "button.save" and the text body "Save your planet".
The text body must always be written in the form of the CLDR plural category other, for example {# messages} rather than {# message}. This allows a TIK to avoid branched statements like ICU plural arguments.
Placeholders allow TIKs to be easily readable yet auto-translatable to ICU message format. Below is an example TIK that uses multiple placeholders for different data types:
Today {name} earned {currency} for completing {# tasks} in section '{text}' at {time-short}.
- {text}: Text placeholder
- {name}: Text placeholder with gender information
- {integer}: Integer
- {number}: Number
- {# ...}: Cardinal pluralization
- {ordinal}: Ordinal pluralization
- {date-full}: Date placeholder
- {date-long}: Date placeholder
- {date-medium}: Date placeholder
- {date-short}: Date placeholder
- {time-full}: Time placeholder
- {time-long}: Time placeholder
- {time-medium}: Time placeholder
- {time-short}: Time placeholder
- {currency}: Currency
A pluralization statement {# ...} begins with {# and ends with }. The # is the placeholder for the actual number value (if any). The contents ... may be anything that isn't explicitly forbidden (see invariants) and may include any number of placeholders:
You had {# messages marked as {text} at {time-long}}
You had {# tasks} assigned at {time-short}.
- Plural statement contents must not begin or end with a Unicode whitespace character:
This TIK is illegal: {#  <- two spaces here}
This TIK is illegal: {# space here-> }
- Plural statements cannot be nested:
This TIK is illegal: {# first level {# second level}}
- Plural statement contents cannot start with a placeholder:
This TIK is illegal: {# {integer}}
This TIK is illegal: {# {number}}
This TIK is illegal: {# {currency}}
This TIK is illegal: {# {date-full}}
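As a rough illustration, the three invariants above can be checked with a few string operations. The function name ValidatePluralContents is hypothetical, and it assumes the caller has already stripped the leading "{# " and the trailing "}" from the plural statement.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ValidatePluralContents is a hypothetical check for the contents of a
// {# ...} statement, i.e. the part between "{# " and the closing "}".
func ValidatePluralContents(contents string) error {
	switch {
	case strings.TrimSpace(contents) != contents:
		// strings.TrimSpace removes Unicode whitespace on both ends.
		return errors.New("contents must not begin or end with whitespace")
	case strings.HasPrefix(contents, "{"):
		return errors.New("contents must not start with a placeholder")
	case strings.Contains(contents, "{#"):
		return errors.New("plural statements must not be nested")
	}
	return nil
}

func main() {
	for _, contents := range []string{
		"messages marked as {text} at {time-long}",
		"first level {# second level}",
	} {
		fmt.Printf("%q: %v\n", contents, ValidatePluralContents(contents))
	}
}
```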
String placeholders {text} represent arbitrary text.
You joined group {text}.
All articles from category: {text}.
If the identifier at hand has a gender (like a person's name), consider using a string placeholder with gender instead, because gender can affect the grammar in gender-aware locales.
String placeholders {name} must be infused with gender information. This placeholder still represents arbitrary string values but should be used for names and identifiers to allow correct translation for gender-aware locales.
reader.String(
`The journey began, {name} had embarked onto the ship.`, // TIK
tokibundle.String{ Value: "John", Gender: tokibundle.GenderMale },
)
TIK doesn't define how gender information is attached to the placeholder. This is determined by the TIK processor.
ℹ️ Gender may affect grammar in some languages:
Language | Masculine | Feminine |
---|---|---|
Ukrainian | John готовий | Martha готова |
Italian | John è pronto | Martha è pronta |
French | John est prêt | Martha est prête |
Spanish | John está listo | Martha está lista |
Russian | John готов | Martha готова |
The translated ICU message for locale uk would be:
Розпочалася подорож, {var0_gender, select,
female { {var0} вирушила на корабель. }
male { {var0} вирушив на корабель. }
other { {var0} вирушило на корабель. }
}
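To make the effect of the select argument concrete, the following Go sketch renders the same Ukrainian sentence by hand and picks the verb form from the gender, just as the ICU message does. The Gender and String types are hypothetical stand-ins defined for this example; they are not the tokibundle API, and a real application would let an ICU library evaluate the message instead.

```go
package main

import "fmt"

// Gender and String are hypothetical stand-ins for whatever types a TIK
// processor provides to attach gender information to a {name} value.
type Gender int

const (
	GenderOther Gender = iota
	GenderMale
	GenderFemale
)

type String struct {
	Value  string
	Gender Gender
}

// Embarked mirrors the branches of the ICU select argument shown above.
func Embarked(name String) string {
	verb := "вирушило" // "other" branch
	switch name.Gender {
	case GenderFemale:
		verb = "вирушила"
	case GenderMale:
		verb = "вирушив"
	}
	return fmt.Sprintf("Розпочалася подорож, %s %s на корабель.", name.Value, verb)
}

func main() {
	fmt.Println(Embarked(String{Value: "John", Gender: GenderMale}))
	fmt.Println(Embarked(String{Value: "Martha", Gender: GenderFemale}))
}
```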
TIK placeholder | ICU equivalent |
---|---|
{text} | {var0} |
{name} | {var0, select, other{...}} |
{number} | {var0, number} |
{integer} | {var0, number, integer} |
{# ...} | {var0, plural, other{# ...}} |
{ordinal} | {var0, selectordinal, other{#th}} |
{date-full} | {var0, date, full} |
{date-long} | {var0, date, long} |
{date-medium} | {var0, date, medium} |
{date-short} | {var0, date, short} |
{time-full} | {var0, time, full} |
{time-long} | {var0, time, long} |
{time-medium} | {var0, time, medium} |
{time-short} | {var0, time, short} |
{currency} | {var0, number, ::currency/auto} |
The ... stands for any permitted content, meaning that the following TIK:
{# messages in {text}}
encodes to the following ICU:
`{var0, plural, other{# messages in {var1}}}`
All placeholders are mapped positionally, meaning that the order of occurrence in the TIK is the order expected for argument inputs.
[report] By {time-short}, {name} received {# emails}.
All placeholders use the var prefix followed by a positional index.
Generated ICU:
By { var0, time, short }, { var1_gender, select,
other { {var1} }
} {var1} received {var2, plural,
one {# email}
other {# emails}
}.
Usage example in Go:
reader.String(`[report] By {time-short}, {name} received {# emails}.`,
	time.Now(),
	tokibundle.String{Value: "Max", Gender: tokibundle.GenderMale},
	len(emailsReceived))
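The placeholder table and the positional numbering can be combined into a small encoder. The sketch below is a simplified illustration rather than a reference implementation: EncodeICU is a hypothetical name, {name} and {ordinal} are deliberately left unsupported, the plural invariants are not validated, and no ICU apostrophe escaping is performed. It only emits the other branch, as in the table; the remaining plural categories are expected to be filled in later in the translation pipeline.

```go
package main

import (
	"fmt"
	"strings"
)

// icuSuffix maps simple TIK placeholders to the ICU type and style from
// the table above. {name} and {ordinal} are left out of this sketch.
var icuSuffix = map[string]string{
	"text":        "",
	"number":      ", number",
	"integer":     ", number, integer",
	"date-full":   ", date, full",
	"date-long":   ", date, long",
	"date-medium": ", date, medium",
	"date-short":  ", date, short",
	"time-full":   ", time, full",
	"time-long":   ", time, long",
	"time-medium": ", time, medium",
	"time-short":  ", time, short",
	"currency":    ", number, ::currency/auto",
}

// EncodeICU is a hypothetical helper that converts a TIK text body into an
// ICU message, numbering placeholders by order of occurrence.
func EncodeICU(body string) (string, error) {
	next := 0
	return encode(body, &next)
}

func encode(s string, next *int) (string, error) {
	var out strings.Builder
	for i := 0; i < len(s); {
		if s[i] != '{' {
			out.WriteByte(s[i])
			i++
			continue
		}
		// Find the closing brace that matches the opening brace at i.
		depth, end := 0, -1
		for j := i; j < len(s); j++ {
			if s[j] == '{' {
				depth++
			} else if s[j] == '}' {
				depth--
				if depth == 0 {
					end = j
					break
				}
			}
		}
		if end < 0 {
			return "", fmt.Errorf("unclosed placeholder at byte %d", i)
		}
		inner := s[i+1 : end]
		index := *next // positional index of this placeholder
		*next++
		if strings.HasPrefix(inner, "# ") {
			// Placeholders inside the plural contents continue the numbering.
			contents, err := encode(inner[2:], next)
			if err != nil {
				return "", err
			}
			fmt.Fprintf(&out, "{var%d, plural, other{# %s}}", index, contents)
		} else if suffix, ok := icuSuffix[inner]; ok {
			fmt.Fprintf(&out, "{var%d%s}", index, suffix)
		} else {
			return "", fmt.Errorf("unsupported placeholder {%s}", inner)
		}
		i = end + 1
	}
	return out.String(), nil
}

func main() {
	icu, err := EncodeICU("You had {# messages} at {time-short}.")
	if err != nil {
		panic(err)
	}
	// Prints: You had {var0, plural, other{# messages}} at {var1, time, short}.
	fmt.Println(icu)
}
```

Scanning bytes is safe here because { and } are single-byte ASCII characters and can never appear inside a multi-byte UTF-8 sequence.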
For configuration, the TIK specification defines guidelines only and imposes no strict format or requirements. The exact configuration format is left entirely to the processor implementation.
In large-scale projects with lots of translations it might make sense to group extracted texts into domains by defining the scopes of the domains in the configuration:
{
"domains": {
"domain_A": [
"/domain_a/...",
"/templates/domain_a/_"
],
"domain_B": [
"/domain_b/...",
"/templates/domain_b/_",
"/models/domain_b/_"
]
}
}
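If a processor adopted a JSON layout like the one above, loading it in Go could look as follows. The Config type and ParseConfig function are assumptions made for this sketch, since the specification leaves the configuration format to the processor; how the scope patterns are matched against file paths is also processor-specific and is not modeled here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical type mirroring the example configuration:
// each domain name maps to the list of scope patterns it covers.
type Config struct {
	Domains map[string][]string `json:"domains"`
}

// ParseConfig decodes a JSON configuration document into a Config.
func ParseConfig(data []byte) (Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return Config{}, fmt.Errorf("parsing config: %w", err)
	}
	return c, nil
}

func main() {
	cfg, err := ParseConfig([]byte(`{
		"domains": {
			"domain_A": ["/domain_a/...", "/templates/domain_a/_"]
		}
	}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Domains["domain_A"])
}
```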
As with any technology, TIK introduces both advantages and trade-offs.
- Advantages
- ✅ Readability: TIK keys convey the intent of the message in a clear and human-readable format.
- ✅ Automation: The TIK syntax can be programmatically converted into ICU message structures and translation can mostly be automated through LLMs.
- ✅ Simplicity: The format is relatively straightforward to learn and apply consistently.
- Limitations
- ⚠️ Learning Curve: Developers must become familiar with the TIK syntax conventions.
- ⚠️ Developer Responsibility: Developers must write somewhat meaningful texts and can't fully rely on translators. They can only rely on translators and software to improve those texts later in the translation pipeline.
- ⚠️ Tooling Requirements: A dedicated extractor tool (referred to as the TIK processor throughout this document) is required to parse and process TIK keys to eventually produce ICU messages for translation.
- ⚠️ Source Language Translation: Messages written in the source language (e.g., English) must also be extracted and passed through the translation pipeline.
- Plural categories follow Unicode CLDR
- Language codes follow ISO 639-1
- Currency codes follow ISO 4217
- Timestamps follow RFC3339
- JSON examples follow RFC8259
- Date/time formats follow RFC1123
The answer depends on perspective. While abstract keys like dashboard.newsfeed.summary offer clear benefits, they also come with certain limitations.
It is likely that, for the foreseeable future, code will continue to be written and
maintained primarily by humans. At the same time, large language models are demonstrating
increasing proficiency in translation tasks. The concept behind TIK is to define clear,
human-readable messages directly in the source code, delegating the complexity of
generating accurate ICU messages for various languages to large language models.
To give you some context, only the last sentence of this answer was actually written by a human.
This doesn't address pipeline automation issues but is a theoretically viable solution to opaque abstract keys in source code. However, this approach is inherently limited to IDEs that support such a feature. Additionally, those IDEs/extensions must be compatible with your specific translation file format and message encoding (e.g., ICU, Fluent, ARB). It also breaks down entirely when browsing code outside the IDE - for example, on GitHub - where no plugin can preload or resolve translation keys.
Fluent can be considered a worthy counterpart to the ICU MessageFormat, and technically nothing speaks against using it as an alternative TIK backend, yet ICU was selected due to its wider adoption.
Special thanks to Muthu Kumar (@MKRhere)!