MSC3574: Marking up resources #3574

Open. Wants to merge 26 commits into base: main.
Showing changes from 4 commits.

Commits (26):
d6a7009
Create xxxx-resource-markup.md
gleachkr Dec 18, 2021
bf4c2ab
Fix msc numbering
gleachkr Dec 18, 2021
187de9e
Replace old reference to threads
gleachkr Dec 18, 2021
767928c
Switch to sha256 for integrity checking
gleachkr Dec 18, 2021
877c010
Fix some event names
gleachkr Dec 19, 2021
c7778d6
First revision
gleachkr Dec 19, 2021
c777a5e
Merge branch 'matrix-org:main' into main
gleachkr Dec 23, 2021
49b6c2b
Create xxxx-resource-markup.md
gleachkr Dec 18, 2021
c95d52e
Fix msc numbering
gleachkr Dec 18, 2021
395c8a9
Replace old reference to threads
gleachkr Dec 18, 2021
1bf60f0
Switch to sha256 for integrity checking
gleachkr Dec 18, 2021
3ba0e0a
Fix some event names
gleachkr Dec 19, 2021
50a4e10
First revision
gleachkr Dec 19, 2021
eeb188f
Link MSC 3592, discuss Web Annotation Data Model
gleachkr Jan 10, 2022
2ff2cc9
Merge branch 'msc3574' (and spellcheck)
gleachkr Jan 10, 2022
3ff7e2a
Clarify hypothes.is support level for w3c standard
gleachkr Jan 10, 2022
e240904
Fix merge artifact
gleachkr Jan 10, 2022
2911bf3
Add w3c disadvantage: overlap in functionality
gleachkr Jan 11, 2022
caf39f4
Fix Markup Typo
gleachkr Jan 13, 2022
3d8aa19
Fix another typo
gleachkr Jan 16, 2022
41c6fc1
Merge branch 'matrix-org:main' into main
gleachkr Feb 18, 2022
e7a2dbc
First pass at w3c WADM serialization
gleachkr Feb 23, 2022
8db1772
Merge branch 'matrix-org:main' into main
gleachkr Mar 9, 2022
89f6571
Add text markup, security considerations
gleachkr Mar 13, 2022
6c238af
Link Audiovisual media markup
gleachkr May 21, 2022
3160aaf
Tiny lints, mention MSC3761
gleachkr Jun 7, 2022
157 changes: 157 additions & 0 deletions proposals/3574-resource-markup.md
@@ -0,0 +1,157 @@
# Marking up resources

This MSC proposes a way to annotate and discuss various resources (web pages, documents, videos, and other files) using Matrix. The general idea is to use [Spaces (MSC1772)](https://github.com/matrix-org/matrix-doc/pull/1772) to represent a general resource to be annotated, and then a combination of child rooms and [Extensible Events (MSC1767)](https://github.com/matrix-org/matrix-doc/blob/matthew/msc1767/proposals/1767-extensible-events.md) to represent annotations and discussion. This MSC specifies:
Contributor

you introduce the msc as relying on spaces, rooms (and formerly threads). if we read a bit further, we learn that a space will represent a "file" that is being discussed and seemingly rooms are the individual annotations. it is not clear to me how threads play into this?

to me it seems more obvious that an annotation would be represented by a thread start, and then the discussion of that annotation is the thread. what is your design intention with rooms instead?

this picture in my mind is a bit influenced by the idea of full screen widgets as we have seen them in twim and matrix live. the pdf and website annotation apps are specialized matrix clients. in my mind's eye I see a mashup of these ideas. the annotations could be backwards compatible by loading "the file" in a widget, and then compatibility of your markup spec with regular clients like element allows people to discuss using the regular ui.

Author

Thanks! I've mentioned the widget idea, and discussed the rationale for not focusing on threads.


* Additional data in the `m.room.create` event to mark a space as describing a resource to be annotated.
* Additional (optional) data in the `m.space.child` and `m.space.parent` events to mark sections of the resource (pages, timestamps, etc.) that are being discussed by the child room. The specific format of the location data is resource-specific, and will be described in further MSCs.
* An annotation event that is used within child rooms. The specific data describing the annotation location is once again resource-specific, and will be described in further MSCs.

# Proposal

## Additional data in `m.room.create`

A space will be considered a *resource* if its creation event includes a key `m.markup.resource`.

The `m.markup.resource` value MUST include either:

1. an `m.file` key, populated according to the `m.file` schema as presented in [Extensible Events - Files (MSC3551)](https://github.com/matrix-org/matrix-doc/blob/travis/msc/extev/files/proposals/3551-extensible-events-files.md), or
2. both a `url` key and a `mimetype` key. This format is preferred for potentially mutable resources (like web pages with dynamic content) or for resources that require multiple network requests to display properly.

Clients should recognize that a `url` subordinate to an `m.markup.resource` (including within an `m.file` value) may contain URI schemes other than `mxc`. It may contain `http(s)`, and may ultimately contain other schemes in the future. Clients handling `m.markup.resource` should be prepared to fail gracefully upon encountering an unrecognized scheme.
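The two accepted shapes of `m.markup.resource`, together with the requirement to fail gracefully on unrecognized schemes, could be checked roughly as follows. This is a minimal Python sketch rather than part of the proposal: the function name `classify_resource`, the `SUPPORTED_SCHEMES` set and the returned labels are all hypothetical, and a real client would validate `m.file` against MSC3551 more thoroughly.

```python
from urllib.parse import urlsplit

# Schemes this hypothetical client knows how to fetch (an assumption).
SUPPORTED_SCHEMES = {"mxc", "http", "https"}


def classify_resource(create_content: dict) -> str:
    """Classify the content of an m.room.create event.

    Returns one of: 'not_a_resource', 'invalid', 'unsupported_scheme',
    'file' (m.file form) or 'url' (url + mimetype form).
    """
    resource = create_content.get("m.markup.resource")
    if resource is None:
        return "not_a_resource"
    if not isinstance(resource, dict):
        return "invalid"

    file_info = resource.get("m.file")
    if isinstance(file_info, dict):
        kind, url = "file", file_info.get("url")
    elif "url" in resource and "mimetype" in resource:
        kind, url = "url", resource["url"]
    else:
        return "invalid"

    if not isinstance(url, str):
        return "invalid"
    try:
        scheme = urlsplit(url).scheme.lower()
    except ValueError:
        return "invalid"
    # Unknown schemes are reported, not raised, so the caller can degrade gracefully.
    if scheme not in SUPPORTED_SCHEMES:
        return "unsupported_scheme"
    return kind
```

Returning a label instead of raising lets a client show the space as an ordinary space when it cannot handle the resource.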

An optional `sha256_hash` key may be included. If present, it should contain a SHA-256 hash of the resource, for file-integrity checking.
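An integrity check against `sha256_hash` might look like the sketch below. The proposal does not say how the hash is encoded or exactly where the key sits, so this example assumes a lowercase or uppercase hexadecimal digest stored directly inside the `m.markup.resource` object; both are assumptions, as is the helper name `resource_matches_hash`.

```python
import hashlib


def resource_matches_hash(resource_bytes: bytes, resource: dict) -> bool:
    """Check downloaded bytes against the optional sha256_hash key.

    `resource` is the m.markup.resource object. Hex encoding of the
    digest is an assumption; the proposal does not specify one.
    """
    expected = resource.get("sha256_hash")
    if expected is None:
        return True  # the key is optional, so there is nothing to verify
    if not isinstance(expected, str):
        return False
    actual = hashlib.sha256(resource_bytes).hexdigest()
    return actual == expected.lower()
```

For mutable resources such as live web pages, a mismatch may simply mean the page has changed, so a client would probably warn rather than refuse to display the resource.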

### Examples

#### A hypothetical web resource

```
{
    "type": "m.room.create",
    "state_key": "",
    "content": {
        "creator": "@example:example.org",
        "m.federate": true,
        "room_version": "7",
        "m.markup.resource": {
            "url": "https://danilafe.com/blog/introducing_highlight/",
            "mimetype": "text/html"
        }
    }
}
```


## Additional data in `m.space.child` and `m.space.parent`

Children of resources will be considered *conversations concerning* the resource. For purposes of discoverability, it may sometimes be helpful to attach additional data to the content of `m.space.child` and `m.space.parent` events, in order to indicate the specific part of the resource that the conversation is based upon. That part is indicated by the value of an `m.markup.location` key within the content of the `m.space.child` and/or `m.space.parent` event.

Different mimetypes will require different notions of "location", and the need for new notions may become evident over time. For example, PDFs might begin with a need to specify highlighted regions and, at a later date, need pindrop locations. One location might also reasonably be presented in two or more different ways. For example, in a PDF, a location might be presented both as coordinates designating a region of a page, and as a tag or set of tags with offsets for use with a screen reader. In an audio file, a location might be presented both as a pair of bounding timestamps and as a pair of offsets within the text of embedded lyrics.

Hence, the `m.markup.location` value MUST be an object whose keys name the different kinds of location occupied by a single annotation. Those names are either formalized in the Matrix spec or namespaced using Java package naming conventions. <!-- Some proposed location types are... ADD RELATED MSCS HERE -->
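Because one annotation can carry several representations of the same location, a client would typically pick the first kind it understands and ignore the rest. The following Python sketch illustrates that selection; the helper name `pick_location` and the idea of passing an ordered preference list are assumptions, not something the proposal mandates.

```python
def pick_location(content: dict, supported_kinds: list):
    """Return (kind, value) for the first supported location kind, or None.

    `content` is the content of an m.space.child / m.space.parent event.
    `supported_kinds` is the client's own ordered preference list.
    """
    locations = content.get("m.markup.location")
    if not isinstance(locations, dict):
        return None  # the value MUST be an object; anything else is ignored
    for kind in supported_kinds:
        if kind in locations:
            return kind, locations[kind]
    return None  # no representation this client understands
```

A screen-reader-oriented client and a visual client could then pass different preference lists over the same event.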

### Examples

#### A hypothetical audio annotation

```
{
    "type": "m.space.child",
    "state_key": "!abcd:example.com",
    "content": {
        "via": ["example.com", "test.org"],
        "m.markup.location": {
            "m.markup.audio_timespan": {
                "begin": 0,
                "end": 31983
            },
            "com.genius.markup.lyrics": {
                "begin": 0,
                "end": 35
            }
        }
    }
}
```

## Annotation Message Events

It may be desirable, within a conversation concerning a resource, to make reference to some part of the resource. Annotation message events make this possible.

An annotation message event will treat `m.markup` as an extensible event schema following [Extensible events (MSC1767)](https://github.com/matrix-org/matrix-doc/pull/1767), but the message will ordinarily include an `m.text` value with text optionally describing the annotation as a fallback. The `m.markup` value will consist of an `m.markup.location`, and an `m.markup.parent` that indicates the room id of the resource with which the annotation message is associated. (The latter is necessary when a room has more than one parent resource.) Until migration to extensible events is complete, annotations will send messages of the type `m.room.message`, for compatibility with non-annotation-aware clients.
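During the migration period a client has to recognize annotations in both event types. A reader for this might be sketched in Python as below. The helper name `extract_markup` is hypothetical, and the sketch assumes that `m.markup.parent` may be absent when a room has only one parent resource, which is one possible reading of the paragraph above.

```python
def extract_markup(event: dict):
    """Return (location, parent_room_id) for an annotation event, else None.

    Accepts both the transitional m.room.message form and the
    post-migration m.markup form. parent_room_id may be None
    (an assumption: the key is treated as optional here).
    """
    if event.get("type") not in ("m.room.message", "m.markup"):
        return None
    content = event.get("content")
    if not isinstance(content, dict):
        return None
    markup = content.get("m.markup")
    if not isinstance(markup, dict):
        return None  # an ordinary message, not an annotation
    location = markup.get("m.markup.location")
    if not isinstance(location, dict):
        return None
    parent = markup.get("m.markup.parent")
    if parent is not None and not isinstance(parent, str):
        return None
    return location, parent
```

Clients that are not annotation-aware never reach this code path and simply render the text fallback.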

### Examples

#### An annotation prior to MSC1767 adoption


```
{
    "type": "m.room.message",
    "content": {
        "msgtype": "m.emote",
        "body": "created an annotation",
        "org.matrix.msc1767.text": "created an annotation",
        "m.markup": {
            "m.markup.location": {..},
            "m.markup.parent": "!WKZqabcAWoDDNZzupv:matrix.org"
        }
    }
}
```

#### An annotation after MSC1767 adoption and migration


```
{
    "type": "m.markup",
    "content": {
        "m.text": "created an annotation",
        "m.emote": {},
        "m.markup": {
            "m.markup.location": {..},
            "m.markup.parent": "!WKZqabcAWoDDNZzupv:matrix.org"
        }
    }
}
```

# Potential Issues

There's no notion of "ownership" for state events: anyone who can send `m.space.parent` events can overwrite `m.space.parent` events sent by others. So anyone who can create a conversation concerning a certain resource can also remove conversations created by others. Clients can partly mitigate this by at least discouraging accidental deletions and encouraging courtesy. A more robust mitigation might be to introduce subspaces of resources, within which less-trusted users could still create conversations concerning a given resource. However, this seems undesirably complicated for an initial implementation. If it turns out to be necessary in practice, it could be added in a future MSC.

# Alternatives

## Greater generality

The idea of attaching conversations to locations might be construed even more broadly, to incorporate spaces representing resources that aren't easily associated with mimetypes and URLs. For example, someone might want to create a space with rooms located at some sort of geospatial region, or located during some time-slice of an event.

However, these more abstract cases can be subsumed under the design here. Geospatial data can be represented using something like [GeoJSON](https://en.wikipedia.org/wiki/GeoJSON) or some other standard, and time-slices of events can be represented as locations within a recording of the event (or locations within some other representation of the event, if no recording is available).

## Resources as a space type or subtype

Resources could be designated as such using an `m.purpose` event, as in [Room subtyping (MSC3088)](https://github.com/matrix-org/matrix-doc/blob/travis/msc/mutable-subtypes/proposals/3088-room-subtyping.md), or with an `m.room.type` event as in [Room Types (MSC1840)](https://github.com/matrix-org/matrix-doc/pull/1840).

However,

1. Indicating an associated resource in the room creation event makes it possible to inspect an invitation to a new space, allowing annotation-oriented clients to ignore irrelevant invitations.
2. If `m.purpose` or `m.room.type` are integrated into the spec and turn out to be useful for, e.g. filtering, then it would be straightforward to designate one or more `m.purpose` values or `m.room.type` values for resource rooms.
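The first point can be illustrated with a short Python sketch that inspects the stripped state delivered with an invitation. It assumes the invite arrives in the usual `/sync` shape (`rooms.invite.<room_id>.invite_state.events`) and that the homeserver includes the `m.room.create` event in that stripped state; the helper name `invite_is_for_resource` is hypothetical.

```python
def invite_is_for_resource(invited_room: dict) -> bool:
    """Report whether an invite's stripped state marks the room as a resource.

    `invited_room` is one entry of rooms.invite from a /sync response.
    Returns False when no m.room.create event is present in the stripped state.
    """
    events = invited_room.get("invite_state", {}).get("events", [])
    for event in events:
        if event.get("type") == "m.room.create":
            content = event.get("content")
            return isinstance(content, dict) and "m.markup.resource" in content
    return False
```

An annotation-oriented client could use this to surface only the invitations it knows how to handle.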

## Standalone `m.annotation.location` state events

Rather than being represented by `m.space.child` events, annotations that open a conversation concerning a part of a resource could be introduced as a new kind of state event. This has the disadvantage of not making relationships between a resource and conversations about its parts visible to clients which are space-aware but not annotation-aware.

# Security Considerations

None.

# Unstable Prefix

| Proposed Final Identifier | Purpose                                                 | Development Identifier                   |
| ------------------------- | ------------------------------------------------------- | ---------------------------------------- |
| `m.markup.location`       | key in `m.space.child`, `m.space.parent` and `m.markup` | `com.open-tower.msc3574.markup.location` |
| `m.markup.resource`       | key in `m.room.create`                                  | `com.open-tower.msc3574.markup.resource` |
| `m.markup`                | extensible event schema                                 | `com.open-tower.msc3574.markup`          |
| `m.markup.parent`         | key in `m.markup`                                       | `com.open-tower.msc3574.markup.parent`   |
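
While the proposal is unstable, a client may want to read either identifier. The Python sketch below shows one way to do that, copying the mapping from the table above. The helper name `get_field` and the policy of preferring the final identifier when both are present are assumptions.

```python
# Final identifier -> development identifier, copied from the table above.
UNSTABLE_IDENTIFIERS = {
    "m.markup.location": "com.open-tower.msc3574.markup.location",
    "m.markup.resource": "com.open-tower.msc3574.markup.resource",
    "m.markup": "com.open-tower.msc3574.markup",
    "m.markup.parent": "com.open-tower.msc3574.markup.parent",
}


def get_field(obj: dict, final_name: str):
    """Look up a key by its final name, falling back to the development name.

    Preferring the final name when both are present is an assumption.
    Returns None when neither key exists.
    """
    if final_name in obj:
        return obj[final_name]
    return obj.get(UNSTABLE_IDENTIFIERS[final_name])
```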