Commit 62d1245

[RFC 0146] Meta.Categories, not Filesystem Directory Trees (#146)
* Meta.Categories, not Filesystem Directory Trees
* Whitespace cleanup
* Add a short answer to the bikeshedding problem
* Add a short line on "Do nothing" alternative
* Extend an answer for the "Ignore/nuke" alternative
* Add "update ci" to future work
* Add a repl-like interaction example
* Add more arguments for categorization and against its nuking
* small rewording
* Add an option of category data structure
* reorder arguments against nuking
* add argument for usefulness of categorization
* add drawback
* rework nuke argument
* update metainfo - shepherd team and leader
* Add prior art section. Also, add extra links
* typo
* Categorization Team
* Remove the optional data structure
* typo
* reword the creation of a team
* Debtags FAQ
* Update rules and duties of categorization team
* The team shall have authority to carry out their duties
* A section for the team
* Appstream as prior art
* Section for code implementation
* Move categorization team to implementation
* Update future work
* A hybrid approach to be considered by the future team
* extra duties for the team
* reword duties from team
* typo
* A semantic detail: treat the first element of meta.categories as most important
* Move hybrid approach to alternatives section
* identify AndersonTorres' tag
* Suggestions from FCP
* remove infinisil from shepherd team
* add an extra reference to the categorization team
1 parent 6499fd8 commit 62d1245

1 file changed

+356
-0
lines changed

rfcs/0146-meta-categories.md

+356
@@ -0,0 +1,356 @@
1+
---
2+
feature: Decouple filesystem from categorization
3+
start-date: 2023-04-23
4+
author: Anderson Torres (@AndersonTorres)
5+
co-authors:
6+
shepherd-team: @7c6f434c @natsukium @fgaz
7+
shepherd-leader: @7c6f434c
8+
related-issues: (will contain links to implementation PRs)
9+
---
10+
11+
# Summary
12+
[summary]: #summary
13+
14+
Deploy a new method of categorization for the packages maintained by Nixpkgs,
15+
not relying on filesystem idiosyncrasies.
16+
17+
# Motivation
18+
[motivation]: #motivation
19+
20+
Currently, Nixpkgs uses the filesystem, or more accurately, the directory tree
21+
layout in order to informally categorize the software it packages, as described
22+
in the [Hierarchy](https://nixos.org/manual/nixpkgs/stable/#sec-hierarchy)
23+
section of the Nixpkgs manual.
24+
25+
This is a simple, easy-to-understand and time-honored method of
26+
categorization, partially employed by many other package managers like GNU Guix
27+
and NetBSD pkgsrc.
28+
29+
However, this system of categorization has serious problems:
30+
31+
1. It is bounded by the constraints imposed by the filesystem.
32+
33+
- Restrictions on filenames, subdirectory tree depth, permissions, inodes,
34+
quotas, and many other things.
35+
- Some of these restrictions are not well documented and are found simply
36+
by "bumping" into them.
37+
- The restrictions can vary between implementations.
38+
- Some filesystems have more restrictions or fewer features than others,
39+
forcing an uncomfortable lowest common denominator.
40+
- Some operating systems can impose additional constraints over otherwise
41+
full-featured filesystems because of backwards compatibility (8.3
42+
filenames, anyone?).
43+
44+
2. It requires a local checkout of the tree.
45+
46+
Certainly this checkout can be "cached" using some form of `find . >
47+
/tmp/pkgs-listing.txt`, or more sophisticated solutions like `locate +
48+
updatedb`. Nonetheless such solutions still require access to a fresh,
49+
updated copy of the Nixpkgs tree.
50+
51+
3. The creation of a new category - and more generally the manipulation of
52+
categories - requires an unpleasant task of renaming and eventually patching
53+
many seemingly unrelated files.
54+
55+
- Moving files around the Nixpkgs codebase requires updating their forward and
56+
backward references.
57+
- Especially in some auxiliary tools like editor plugins, testing suites,
58+
autoupdate scripts and so on.
59+
- Rewriting `all-packages.nix` can be error-prone (even using Metapad) and it
60+
can generate huge, noisy patches.
61+
62+
4. There is no convenient way to use multivalued categorization.
63+
64+
A piece of software can fulfill many categories; e.g.
65+
- an educational game
66+
- a console emulator (vs. a PC emulator)
67+
- and a special-purpose programming language (say, a smart-contracts one).
68+
69+
The current one-size-fits-all restriction is artificial, imposes unreasonable
70+
limitations and results in incomplete and confusing information.
71+
72+
- No, symlinks or hardlinks are not convenient for this purpose; not all
73+
environments support them (the lowest-common-denominator filesystem
74+
problem expressed before) and they convey nothing besides confusion - just
75+
think about writing the corresponding entry in `all-packages.nix`.
76+
77+
5. It burdens the (possibly human) package writer with deciding where to put
78+
the files in the filesystem hierarchy, distracting them from the job of
79+
actually writing the package.
80+
81+
- Or they just take the shortest path and throw it into a folder under `misc`.
82+
83+
6. It "locks" the filesystem, preventing its usage for other, more sensible
84+
purposes.
85+
86+
7. Most importantly: the categorization is not discoverable via Nix language
87+
infrastructure.
88+
89+
Indeed, there is no higher-level way to query such categories besides the one
90+
described in item 2 above.
91+
92+
In light of such a bunch of problems, this RFC proposes a novel alternative to
93+
the above mess: new `meta` attributes.
94+
95+
# Detailed design
96+
[design]: #detailed-design
97+
98+
## Code Implementation
99+
[code-implementation]: #code-implementation
100+
101+
A new attribute, `meta.categories`, will be included for every Nix expression
102+
living inside Nixpkgs.
103+
104+
This attribute will be a list, whose elements are one of the possible elements
105+
of the `lib.categories` set.
106+
107+
A typical snippet of `lib.categories` will be similar to:
108+
109+
```nix
110+
{
111+
assembler = {
112+
name = "Assembler";
113+
description = ''
114+
A program that converts text written in assembly language to binary code.
115+
'';
116+
};
117+
118+
compiler = {
119+
name = "Compiler";
120+
description = ''
121+
A program that converts a source from a language to another, usually from
122+
a higher, human-readable level to a lower, machine level.
123+
'';
124+
};
125+
126+
font = {
127+
name = "Font";
128+
description = ''
129+
A set of files that defines a set of graphically-related glyphs.
130+
'';
131+
};
132+
133+
game = {
134+
name = "Game";
135+
description = ''
136+
A program developed with entertainment in mind.
137+
'';
138+
};
139+
140+
interpreter = {
141+
name = "Interpreter";
142+
description = ''
143+
A program that directly executes instructions written in a programming
144+
language, without requiring compilation into the native machine language.
145+
'';
146+
};
147+
}
148+
```
149+
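The membership rule above (every element of `meta.categories` must come from `lib.categories`) could be checked mechanically. The following is only a sketch under stated assumptions: `categories` is a two-entry stand-in for the proposed `lib.categories`, and `isKnownCategory` and `checkCategories` are hypothetical helper names that this RFC does not define.

```nix
# Sketch only: the RFC does not prescribe a validation mechanism.
# All names below are hypothetical.
let
  # A small stand-in for the proposed `lib.categories` set.
  categories = {
    compiler = {
      name = "Compiler";
      description = "Converts source code from one language to another.";
    };
    interpreter = {
      name = "Interpreter";
      description = "Directly executes instructions written in a programming language.";
    };
  };

  # True when `c` is structurally equal to one of the known categories.
  isKnownCategory = c: builtins.elem c (builtins.attrValues categories);

  # True when every element of the list is a known category.
  checkCategories = cats: builtins.all isKnownCategory cats;
in
{
  # Both elements come from `categories`, so this is true.
  valid = checkCategories [ categories.compiler categories.interpreter ];

  # A hand-written set that is not in `categories`, so this is false.
  invalid = checkCategories [ { name = "Made up"; description = ""; } ];
}
```

Evaluating this file with `nix-instantiate --eval --strict` needs no Nixpkgs checkout, since it uses only `builtins`.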
150+
### Semantic Details
151+
[semantic-details]: #semantic-details
152+
153+
Given that `meta.categories` is implemented as a list, it is reasonable to
154+
treat the first element of this list as the "most important" categorization, the
155+
one that best identifies the software being classified.
156+
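As an illustration of the "first element is the most important" convention, a helper could extract that element. This is a minimal sketch, not part of the proposal: `primaryCategory` is a hypothetical name, and the `emulator` and `debugger` sets are placeholders with made-up descriptions.

```nix
# Sketch only: `primaryCategory` is a hypothetical helper.
let
  # Placeholder categories; their descriptions are not taken from the RFC.
  emulator = { name = "Emulator"; description = "Placeholder description."; };
  debugger = { name = "Debugger"; description = "Placeholder description."; };

  # Stand-ins for packages; only the `meta` attribute matters here.
  bochs = { meta.categories = [ emulator debugger ]; };
  uncategorized = { meta = { }; };

  # Return the first category of a package, or null when it has none.
  primaryCategory = pkg:
    let
      cats = pkg.meta.categories or [ ];
    in
    if cats == [ ] then null else builtins.head cats;
in
{
  bochsPrimary = (primaryCategory bochs).name; # "Emulator"
  missing = primaryCategory uncategorized; # null
}
```

Returning `null` for packages without categories is one possible design choice; a stricter variant could throw instead.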
157+
## Categorization Team
158+
[categorization-team]: #categorization-team
159+
160+
Given the typical complexities that arise from categorization, and given that
161+
regular maintainers cannot be expected to understand its minutiae
162+
(according to the experience of the [Debtags
163+
Team](https://wiki.debian.org/Debtags/FAQ#Why_don.27t_you_just_ask_the_maintainers_to_tag_their_own_packages.3F)),
164+
the creation of a team entrusted with the authority to manage issues related to
165+
categorization and carry out the corresponding duties is strongly recommended.
166+
167+
# Examples and Interactions
168+
[examples-and-interactions]: #examples-and-interactions
169+
170+
In the file `bochs/default.nix`:
171+
172+
```nix
173+
stdenv.mkDerivation {
174+
175+
. . .
176+
177+
meta = {
178+
. . .
179+
categories = with lib.categories; [ emulator debugger ];
180+
. . .
181+
};
182+
183+
}
184+
185+
```
186+
187+
In a `nix repl`:
188+
189+
```
190+
nix-repl> :l <nixpkgs>
191+
Added XXXXXX variables.
192+
193+
nix-repl> pkgs.bochs.meta.categories
194+
[ { ... } { ... } ]
195+
196+
nix-repl> map (z: z.name) pkgs.bochs.meta.categories
197+
[ "Emulator" "Debugger" ]
198+
```
199+
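Once categories are ordinary Nix values, the reverse query (which packages belong to a given category?) becomes expressible in the language itself. The sketch below assumes a tiny stand-in package set; `hasCategory`, `namesInCategory`, `examplegame` and the category descriptions are hypothetical and not part of this RFC.

```nix
# Sketch only: a toy package set queried by category.
let
  # Placeholder categories with made-up descriptions.
  categories = {
    emulator = { name = "Emulator"; description = "Placeholder description."; };
    debugger = { name = "Debugger"; description = "Placeholder description."; };
    game = { name = "Game"; description = "A program developed with entertainment in mind."; };
  };

  # Stand-in for a package set; real packages carry many more attributes.
  pkgs = {
    bochs = { meta.categories = with categories; [ emulator debugger ]; };
    examplegame = { meta.categories = [ categories.game ]; };
    uncategorized = { meta = { }; };
  };

  # True when `pkg` lists `cat` among its categories.
  hasCategory = cat: pkg: builtins.elem cat (pkg.meta.categories or [ ]);

  # Attribute names of all packages in `pkgs` that carry `cat`.
  namesInCategory = cat:
    builtins.filter (name: hasCategory cat pkgs.${name}) (builtins.attrNames pkgs);
in
{
  emulators = namesInCategory categories.emulator; # [ "bochs" ]
  games = namesInCategory categories.game; # [ "examplegame" ]
}
```

Running such a query over the real Nixpkgs set would need extra care (nested package sets, attributes that fail to evaluate), which this sketch deliberately ignores.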
200+
# Drawbacks
201+
[drawbacks]: #drawbacks
202+
203+
The most immediate drawbacks are:
204+
205+
1. A huge treewide edit of Nixpkgs
206+
207+
On the other hand, this is easily sprintable and amenable to automation.
208+
209+
2. Bikeshedding
210+
211+
How many and which categories should we create? Can we expand them later?
212+
213+
To start, we can take inspiration from the many already existing category
214+
sets and add extra ones when the need arises. Indeed, it is far easier to
215+
create such categories using the Nix language than in other software
216+
collections.
217+
218+
Further, the creation of a categorization team can help resolve such disputes.
219+
220+
3. Superfluous
221+
222+
It can be argued that there are other ways to discover similar or related
223+
package sets, like Repology.
224+
225+
However, this argument is a bit circular, because e.g. the classification
226+
shown by Repology effectively replicates the classification done by the many
227+
software collections in its catalog. Therefore, relying on Repology merely
228+
transfers the question to external sources.
229+
230+
Further, this circularity becomes more pronounced when we take into account that
231+
Nixpkgs ranks first in most Repology statistics. The expected outcome, therefore, should
232+
be precisely the opposite: Nixpkgs being _the_ source of structured metainfo
233+
for other software collections.
234+
235+
# Alternatives
236+
[alternatives]: #alternatives
237+
238+
1. Do nothing
239+
240+
This will exacerbate the problems already listed.
241+
242+
2. Ignore/nuke the categorization completely
243+
244+
This is an alternative worthy of some consideration. After all,
245+
categorization is not without its problems, as shown above. Removing or
246+
ignoring classification removes all problems.
247+
248+
However, there are good reasons to keep the categorization:
249+
250+
- The complete removal of categorization is too harsh. A solution that keeps
251+
and enhances the categorization is far preferable to one that nukes
252+
it completely.
253+
254+
- As said before, the categorization is already present; this RFC proposes to
255+
expose it to a higher level, in a structured, more discoverable format.
256+
257+
- Categorization is very traditional among software collections. Many of them
258+
have been doing this just fine for years on end, and Nixpkgs can imitate them
259+
easily - and even surpass them, given the benefits of the Nix language
260+
machinery.
261+
262+
- Categorization is useful in many scenarios and use cases - indeed, it
263+
is ubiquitous in the software world:
264+
- specialized search engines (from Repology to MELPA)
265+
- code forges, from SourceForge to GitLab
266+
- as said above, software collections from pkgsrc to SlackBuilds
267+
- organization and preservation (such as Software Heritage)
268+
269+
3. Debtags/Appstream hybrid approach
270+
271+
A hybrid approach for code implementation would be to implement two meta
272+
attributes, namely
273+
274+
- `meta.categories` for Appstream-based categories
275+
- the corresponding `lib.categories` should follow Appstream closely, with
276+
little room for custom/extra categories
277+
- `meta.tags` for Debtags-like tags
278+
- while being inspired by the venerable Debtags work, the corresponding
279+
`lib.tags` is completely free to modify and even diverge from Debtags,
280+
following its own way
281+
- generally speaking, `lib.tags` should be less bureaucratic than
282+
`lib.categories`
283+
284+
However, this approach arguably increases the complexity of the whole work, and
285+
adds too much redundancy.
286+
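For comparison, a package under this hybrid alternative might carry both attributes. The sketch below is purely illustrative: the contents of `categories` and `tags`, the tag names, and `examplePackage` are all hypothetical, and neither AppStream's nor Debtags' actual vocabularies are reproduced here.

```nix
# Sketch only: what the two attributes of the hybrid approach could look like.
let
  # Stand-in for an AppStream-inspired `lib.categories` (placeholder content).
  categories = {
    emulator = { name = "Emulator"; description = "Placeholder description."; };
  };

  # Stand-in for a Debtags-inspired `lib.tags` (placeholder content).
  tags = {
    commandLine = { name = "command-line"; };
    debugging = { name = "debugging"; };
  };

  # A hypothetical package carrying both kinds of metadata.
  examplePackage = {
    meta = {
      categories = [ categories.emulator ];
      tags = with tags; [ commandLine debugging ];
    };
  };
in
{
  categoryNames = map (c: c.name) examplePackage.meta.categories; # [ "Emulator" ]
  tagNames = map (t: t.name) examplePackage.meta.tags; # [ "command-line" "debugging" ]
}
```

The two parallel lists make the redundancy mentioned above visible: every consumer has to decide which of the two attributes to query.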
287+
# Prior art
288+
[prior-art]: #prior-art
289+
290+
As said above, categorization is very traditional among software collections. It
291+
is not hard to cite examples in this arena; the most interesting ones I have
292+
found are listed below (linked in the [references section](#references)):
293+
294+
- FreeBSD Ports;
295+
- Debtags;
296+
- Appstream Project;
297+
298+
# Unresolved questions
299+
[unresolved]: #unresolved-questions
300+
301+
There are remaining issues to be solved by the categorization team:
302+
303+
- What data structure is suitable to represent a category?
304+
- For now we stick to the most natural: a set `{ name, description }`.
305+
306+
- Should we have a set of primary, "most important" categories with mandatory
307+
status, in the sense each package should set at least one of them?
308+
- The answer is most certainly positive.
309+
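If a mandatory set of primary categories is adopted, the "at least one primary category" rule could be expressed as a small predicate, for instance as the basis of a future CI check. This is a sketch under assumptions: which categories count as primary is undecided, so `primaryCategories` below is an arbitrary hypothetical choice, as are `hasPrimaryCategory` and the two toy packages.

```nix
# Sketch only: the choice of primary categories is hypothetical.
let
  # Stand-in for `lib.categories`, reusing descriptions from the snippet above.
  categories = {
    assembler = {
      name = "Assembler";
      description = "A program that converts text written in assembly language to binary code.";
    };
    font = {
      name = "Font";
      description = "A set of files that defines a set of graphically-related glyphs.";
    };
    game = {
      name = "Game";
      description = "A program developed with entertainment in mind.";
    };
  };

  # Arbitrary example of a mandatory, "most important" subset.
  primaryCategories = with categories; [ game font ];

  # True when the package lists at least one primary category.
  hasPrimaryCategory = pkg:
    builtins.any (c: builtins.elem c primaryCategories) (pkg.meta.categories or [ ]);

  # Toy packages used to exercise the predicate.
  packages = {
    withPrimary = { meta.categories = [ categories.game ]; };
    withoutPrimary = { meta.categories = [ categories.assembler ]; };
  };
in
# Evaluates to { withPrimary = true; withoutPrimary = false; }
builtins.mapAttrs (_name: hasPrimaryCategory) packages
```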
310+
# Future work
311+
[future]: #future-work
312+
313+
- Create the [categorization team](#categorization-team)
314+
- Carry out the duties related to categorization, including but not limited
315+
to:
316+
317+
- Decide between possibilities of implementation;
318+
- Documentation updates;
319+
- Category curation, integration and updates;
320+
- Continuous Integration updates and adaptations;
321+
- Coordination of efforts to import, integrate and update categorization of
322+
packages;
323+
- Disputes:
324+
- Solve them, especially in corner cases;
325+
- Enforce implementation issues
326+
- Decide when a CI check should become blocking
327+
- Grace periods
328+
329+
# References
330+
[references]: #references
331+
332+
- [Desktop Menu
333+
Specification](https://specifications.freedesktop.org/menu-spec/latest/);
334+
specifically,
335+
- [Main
336+
categories](https://specifications.freedesktop.org/menu-spec/latest/apa.html)
337+
- [Additional
338+
categories](https://specifications.freedesktop.org/menu-spec/latest/apas02.html)
339+
- [Reserved
340+
categories](https://specifications.freedesktop.org/menu-spec/latest/apas03.html)
341+
342+
- [Appstream](https://www.freedesktop.org/wiki/Distributions/AppStream/)
343+
344+
- [Debtags](https://wiki.debian.org/Debtags)
345+
346+
- [Debtags FAQ](https://wiki.debian.org/Debtags/FAQ)
347+
348+
- [NetBSD pkgsrc guide](https://www.netbsd.org/docs/pkgsrc/)
349+
- Especially, [Chapter 12, Section
350+
1](https://www.netbsd.org/docs/pkgsrc/components.html#components.Makefile)
351+
contains a short list of CATEGORIES.
352+
353+
- [FreeBSD Porters
354+
Handbook](https://docs.freebsd.org/en/books/porters-handbook/)
355+
- Especially
356+
[Categories](https://docs.freebsd.org/en/books/porters-handbook/makefiles/#porting-categories)
