Commit f63a997

hannahhoward, mvdan, acruikshank, Stebalien, and willscott authored
IPLD Prime In IPFS: Target Merge Branch (#7976)
* feat: switch to using go-ipld-prime for codecs, path resolution, and the `dag put/get` commands
* fix: `dag put/get` not roundtripping due to an extra new line being added (#3503)

More detailed information is in the CHANGELOG.md file. Very high level:

* IPLD codecs (and their plugins) must use go-ipld-prime
* Added support for the dag-json codec
* `dag get/put` use IPLD codec names from the multicodec table
* `dag get` defaults to dag-json output instead of json, but may output with other codecs
* Data model pathing can be achieved using the /ipld prefix. For example, you can use `/ipld/QmFoo/Links/0/Hash` to traverse through a DagPB node
* With `dag get/put` the DagPB field names have been changed to match the ones in the protobuf listed in the specification

Co-authored-by: hannahhoward <[email protected]>
Co-authored-by: Daniel Martí <[email protected]>
Co-authored-by: acruikshank <[email protected]>
Co-authored-by: Steven Allen <[email protected]>
Co-authored-by: Will Scott <[email protected]>
Co-authored-by: Will Scott <[email protected]>
Co-authored-by: Rod Vagg <[email protected]>
Co-authored-by: Adin Schmahmann <[email protected]>
Co-authored-by: Eric Myhre <[email protected]>
1 parent 360aff4 commit f63a997

19 files changed: +511 -222 lines changed

CHANGELOG.md

Lines changed: 23 additions & 0 deletions
@@ -1,5 +1,28 @@
 # go-ipfs changelog

+## v0.10.0 TBD
+
+**IPLD Levels Up**
+
+The handling of data serialization as well as many aspects of DAG traversal and pathing have been migrated from older libraries, including [go-merkledag](https://github.com/ipfs/go-merkledag) and [go-ipld-format](https://github.com/ipfs/go-ipld-format), to the new **[go-ipld-prime](https://github.com/ipld/go-ipld-prime)** library and its components. This allows us to use many of the newer tools afforded by go-ipld-prime, stricter and more uniform codec implementations, support for additional (pluggable) codecs, and some minor performance improvements.
+
+This is a significant refactor of a core component that touches many parts of IPFS, and does come with some **breaking changes**:
+
+* **IPLD plugins**:
+  * The `PluginIPLD` interface has been changed to utilize go-ipld-prime. There is a demonstration of the change in the [bundled git plugin](./plugin/plugins/git/).
+* **The semantics of `dag put` and `dag get` change**:
+  * `dag get` now takes the `format` option which accepts a multicodec name used to encode the output. By default this is `dag-json`. Users may notice differences from the previously plain Go JSON output, particularly where bytes are concerned, which are now encoded using a form similar to CIDs: `{"/":{"bytes":"unpadded-base64-bytes"}}` rather than the previous Go-specific plain padded base64 string. See the [dag-json specification](https://ipld.io/specs/codecs/dag-json/spec/) for an explanation of these forms.
+  * `dag get` no longer prints an additional new-line character at the end of the encoded block output. This means that the output presented by `dag get` is the exact bytes of the requested node. A round-trip of such bytes back in through `dag put` using the same codec should result in the same CID.
+  * `dag put` uses the `input-enc` option to specify the multicodec name of the format data is being provided in, and the `format` option to specify the multicodec name of the format the data should be stored in. These formerly defaulted to `json` and `cbor` respectively. They now default to `dag-json` and `dag-cbor` respectively but may be changed to any supported codec (bundled or loaded via plugin) by its [multicodec name](https://github.com/multiformats/multicodec/blob/master/table.csv).
+  * The `json` and `cbor` multicodec names (as used by `input-enc` and `format` options) are no longer aliases for `dag-json` and `dag-cbor` respectively. Instead, they now refer to their proper [multicodec](https://github.com/multiformats/multicodec/blob/master/table.csv) types. `cbor` refers to a plain CBOR format, which will not encode CIDs and does not have strict deterministic encoding rules. `json` is a plain JSON format, which also won't encode CIDs and will encode bytes in the Go-specific padded base64 string format rather than the dag-json method of byte encoding. See https://ipld.io/specs/codecs/ for more information on IPLD codecs.
+* The **dag-pb codec**, which is used to encode UnixFS data for IPFS, is now represented in a form via the `dag` API that mirrors the protobuf schema used to define the binary format and unifies the implementations and specification of dag-pb across the IPLD and IPFS stacks. Previously, additional layers of code within IPFS between protobuf serialization and UnixFS handling for file and directory data obscured the forms that are described by the protobuf representation. Much of this code has now been replaced and there are fewer layers of transformation. This means that interacting with dag-pb data via the `dag` API will use different forms:
+  * Previously, using `dag get` on a dag-pb block would present the block serialized as JSON as `{"data":"padded-base64-bytes","links":[{"Name":"foo","Size":100,"Cid":{"/":"Qm..."}},...]}`.
+  * Using the dag-pb data model specification and the new default dag-json codec for output, this will now be serialized as: `{"Data":{"/":{"bytes":"unpadded-base64-bytes"}},"Links":[{"Name":"foo","Tsize":100,"Hash":{"/":"Qm..."}},...]}`. Aside from the change in byte formatting, most field names have changed: `data` → `Data`, `links` → `Links`, `Size` → `Tsize`, `Cid` → `Hash`. Note that this output can now be changed using the `--format` option to specify an alternative codec.
+  * Using `dag put` and a `format` option of `dag-pb` now requires that the input conform to this dag-pb specified form. Previously, input using `{"data":"...","links":[...]}` was accepted; now it must be `{"Data":"...","Links":[...]}`.
+  * Previously it was not possible to use paths to navigate to any of these properties of a dag-pb node; the only possible paths were named links, e.g. `dag get QmFoo/NamedLink` where `NamedLink` was one of the links whose name was `NamedLink`. This functionality remains the same, but by prefixing the path with `/ipld/` we enter data model pathing semantics and can `dag get /ipld/QmFoo/Links/0/Hash` to navigate to links or `/ipld/QmFoo/Data` to simply retrieve the data section of the node, for example.
+  * See the [dag-pb specification](https://ipld.io/specs/codecs/dag-pb/) for details on the codec and its data model representation.
+  * See this [detailed write-up](https://github.com/ipld/ipld/blob/master/design/tricky-choices/dag-pb-forms-impl-and-use.md) for further background on these changes.
+
 ## v0.9.1 2021-07-20

 This is a small bug fix release resolving the following issues:
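To make the dag-pb form change in the v0.10.0 entry concrete, here is a minimal, standard-library-only sketch that rewrites the old `dag get` JSON shape into the new one: it renames `data`/`links`/`Size`/`Cid` to `Data`/`Links`/`Tsize`/`Hash` and re-encodes the bytes from padded to unpadded base64 inside `{"/":{"bytes":...}}`. The type and function names (`legacyNode`, `convertLegacy`, and so on) are hypothetical and not part of go-ipfs, and the output is not guaranteed to be canonical dag-json (no attempt is made at the codec's key ordering rules).

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// legacyLink and legacyNode mirror the pre-0.10 `dag get` JSON shape
// described in the changelog (hypothetical helper types).
type legacyLink struct {
	Name string            `json:"Name"`
	Size uint64            `json:"Size"`
	Cid  map[string]string `json:"Cid"`
}

type legacyNode struct {
	Data  string       `json:"data"`
	Links []legacyLink `json:"links"`
}

// pbLink and pbNode mirror the new dag-pb data model field names.
type pbLink struct {
	Name  string            `json:"Name"`
	Tsize uint64            `json:"Tsize"`
	Hash  map[string]string `json:"Hash"`
}

type pbNode struct {
	Data  map[string]map[string]string `json:"Data"`
	Links []pbLink                     `json:"Links"`
}

// convertLegacy rewrites old-style output into the new field names and
// the dag-json bytes form {"/":{"bytes":"<unpadded base64>"}}.
func convertLegacy(in []byte) ([]byte, error) {
	var old legacyNode
	if err := json.Unmarshal(in, &old); err != nil {
		return nil, err
	}
	// The old form used padded standard base64.
	raw, err := base64.StdEncoding.DecodeString(old.Data)
	if err != nil {
		return nil, err
	}
	out := pbNode{
		Data: map[string]map[string]string{
			"/": {"bytes": base64.RawStdEncoding.EncodeToString(raw)},
		},
		Links: make([]pbLink, 0, len(old.Links)),
	}
	for _, l := range old.Links {
		out.Links = append(out.Links, pbLink{Name: l.Name, Tsize: l.Size, Hash: l.Cid})
	}
	return json.Marshal(out)
}

func main() {
	in := []byte(`{"data":"aGk=","links":[{"Name":"foo","Size":100,"Cid":{"/":"QmFoo"}}]}`)
	out, err := convertLegacy(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The `QmFoo` value is a placeholder rather than a real CID; the sketch passes it through untouched.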

core/commands/dag/dag.go

Lines changed: 6 additions & 3 deletions
@@ -77,10 +77,10 @@ into an object of the specified format.
 		cmds.FileArg("object data", true, true, "The object to put").EnableStdin(),
 	},
 	Options: []cmds.Option{
-		cmds.StringOption("format", "f", "Format that the object will be added as.").WithDefault("cbor"),
-		cmds.StringOption("input-enc", "Format that the input object will be.").WithDefault("json"),
+		cmds.StringOption("format", "f", "Format that the object will be added as.").WithDefault("dag-cbor"),
+		cmds.StringOption("input-enc", "Format that the input object will be.").WithDefault("dag-json"),
 		cmds.BoolOption("pin", "Pin this object when adding."),
-		cmds.StringOption("hash", "Hash function to use").WithDefault(""),
+		cmds.StringOption("hash", "Hash function to use").WithDefault("sha2-256"),
 	},
 	Run: dagPut,
 	Type: OutputObject{},
@@ -108,6 +108,9 @@ format.
 	Arguments: []cmds.Argument{
 		cmds.StringArg("ref", true, false, "The object to get").EnableStdin(),
 	},
+	Options: []cmds.Option{
+		cmds.StringOption("format", "f", "Format that the object will be serialized as.").WithDefault("dag-json"),
+	},
 	Run: dagGet,
 }
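The new option defaults are multicodec names, so `json` and `dag-json` (or `cbor` and `dag-cbor`) now resolve to different codecs. The sketch below illustrates that with a small hand-written table; it is a stand-in for the name lookup that the commit performs through `mc.Code.Set`, not that library's implementation. The `lookupCodec` function is hypothetical, and the table holds only a handful of entries copied from the multicodec table.

```go
package main

import "fmt"

// codecs is a small, hand-copied subset of the multicodec table.
var codecs = map[string]uint64{
	"raw":      0x55,
	"cbor":     0x51,
	"dag-pb":   0x70,
	"dag-cbor": 0x71,
	"json":     0x0200,
	"dag-json": 0x0129,
}

// lookupCodec resolves a multicodec name to its numeric code, returning
// an error for names that are not in the table.
func lookupCodec(name string) (uint64, error) {
	code, ok := codecs[name]
	if !ok {
		return 0, fmt.Errorf("unknown multicodec name %q", name)
	}
	return code, nil
}

func main() {
	for _, name := range []string{"dag-json", "json", "dag-cbor", "cbor", "protobuf-ish"} {
		code, err := lookupCodec(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s = 0x%x\n", name, code)
	}
}
```

Because the names are resolved strictly, a script that still passes `--format=cbor` to `dag put` would select plain CBOR rather than dag-cbor after this change.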

core/commands/dag/get.go

Lines changed: 39 additions & 6 deletions
@@ -1,11 +1,18 @@
 package dagcmd

 import (
-	"strings"
+	"fmt"
+	"io"

 	"github.com/ipfs/go-ipfs/core/commands/cmdenv"
+	ipldlegacy "github.com/ipfs/go-ipld-legacy"
 	"github.com/ipfs/interface-go-ipfs-core/path"

+	"github.com/ipld/go-ipld-prime"
+	"github.com/ipld/go-ipld-prime/multicodec"
+	"github.com/ipld/go-ipld-prime/traversal"
+	mc "github.com/multiformats/go-multicodec"
+
 	cmds "github.com/ipfs/go-ipfs-cmds"
 )

@@ -15,6 +22,12 @@ func dagGet(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) e
 		return err
 	}

+	format, _ := req.Options["format"].(string)
+	var fCodec mc.Code
+	if err := fCodec.Set(format); err != nil {
+		return err
+	}
+
 	rp, err := api.ResolvePath(req.Context, path.New(req.Arguments[0]))
 	if err != nil {
 		return err
@@ -25,14 +38,34 @@ func dagGet(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) e
 		return err
 	}

-	var out interface{} = obj
+	universal, ok := obj.(ipldlegacy.UniversalNode)
+	if !ok {
+		return fmt.Errorf("%T is not a valid IPLD node", obj)
+	}
+
+	finalNode := universal.(ipld.Node)
+
 	if len(rp.Remainder()) > 0 {
-		rem := strings.Split(rp.Remainder(), "/")
-		final, _, err := obj.Resolve(rem)
+		remainderPath := ipld.ParsePath(rp.Remainder())
+
+		finalNode, err = traversal.Get(finalNode, remainderPath)
 		if err != nil {
 			return err
 		}
-		out = final
 	}
-	return cmds.EmitOnce(res, &out)
+
+	encoder, err := multicodec.LookupEncoder(uint64(fCodec))
+	if err != nil {
+		return fmt.Errorf("invalid encoding: %s - %s", format, err)
+	}
+
+	r, w := io.Pipe()
+	go func() {
+		defer w.Close()
+		if err := encoder(finalNode, w); err != nil {
+			_ = w.CloseWithError(err)
+		}
+	}()
+
+	return res.Emit(r)
 }

core/commands/dag/put.go

Lines changed: 64 additions & 21 deletions
@@ -1,16 +1,28 @@
 package dagcmd

 import (
+	"bytes"
 	"fmt"
-	"math"

+	blocks "github.com/ipfs/go-block-format"
+	"github.com/ipfs/go-cid"
 	"github.com/ipfs/go-ipfs/core/commands/cmdenv"
-	"github.com/ipfs/go-ipfs/core/coredag"
+	ipldlegacy "github.com/ipfs/go-ipld-legacy"
+	"github.com/ipld/go-ipld-prime/multicodec"
+	basicnode "github.com/ipld/go-ipld-prime/node/basic"

 	cmds "github.com/ipfs/go-ipfs-cmds"
 	files "github.com/ipfs/go-ipfs-files"
 	ipld "github.com/ipfs/go-ipld-format"
-	mh "github.com/multiformats/go-multihash"
+	mc "github.com/multiformats/go-multicodec"
+
+	// Expected minimal set of available format/ienc codecs.
+	_ "github.com/ipld/go-codec-dagpb"
+	_ "github.com/ipld/go-ipld-prime/codec/cbor"
+	_ "github.com/ipld/go-ipld-prime/codec/dagcbor"
+	_ "github.com/ipld/go-ipld-prime/codec/dagjson"
+	_ "github.com/ipld/go-ipld-prime/codec/json"
+	_ "github.com/ipld/go-ipld-prime/codec/raw"
 )

 func dagPut(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) error {
@@ -24,16 +36,33 @@ func dagPut(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) e
 	hash, _ := req.Options["hash"].(string)
 	dopin, _ := req.Options["pin"].(bool)

-	// mhType tells inputParser which hash should be used. MaxUint64 means 'use
-	// default hash' (sha256 for cbor, sha1 for git..)
-	mhType := uint64(math.MaxUint64)
+	var icodec mc.Code
+	if err := icodec.Set(ienc); err != nil {
+		return err
+	}
+	var fcodec mc.Code
+	if err := fcodec.Set(format); err != nil {
+		return err
+	}
+	var mhType mc.Code
+	if err := mhType.Set(hash); err != nil {
+		return err
+	}

-	if hash != "" {
-		var ok bool
-		mhType, ok = mh.Names[hash]
-		if !ok {
-			return fmt.Errorf("%s in not a valid multihash name", hash)
-		}
+	cidPrefix := cid.Prefix{
+		Version: 1,
+		Codec: uint64(fcodec),
+		MhType: uint64(mhType),
+		MhLength: -1,
+	}
+
+	decoder, err := multicodec.LookupDecoder(uint64(icodec))
+	if err != nil {
+		return err
+	}
+	encoder, err := multicodec.LookupEncoder(uint64(fcodec))
+	if err != nil {
+		return err
 	}

 	var adder ipld.NodeAdder = api.Dag()
@@ -48,22 +77,36 @@ func dagPut(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) e
 		if file == nil {
 			return fmt.Errorf("expected a regular file")
 		}
-		nds, err := coredag.ParseInputs(ienc, format, file, mhType, -1)
+
+		node := basicnode.Prototype.Any.NewBuilder()
+		if err := decoder(node, file); err != nil {
+			return err
+		}
+		n := node.Build()
+
+		bd := bytes.NewBuffer([]byte{})
+		if err := encoder(n, bd); err != nil {
+			return err
+		}
+
+		blockCid, err := cidPrefix.Sum(bd.Bytes())
+		if err != nil {
+			return err
+		}
+		blk, err := blocks.NewBlockWithCid(bd.Bytes(), blockCid)
 		if err != nil {
 			return err
 		}
-		if len(nds) == 0 {
-			return fmt.Errorf("no node returned from ParseInputs")
+		ln := ipldlegacy.LegacyNode{
+			Block: blk,
+			Node: n,
 		}

-		for _, nd := range nds {
-			err := b.Add(req.Context, nd)
-			if err != nil {
-				return err
-			}
+		if err := b.Add(req.Context, &ln); err != nil {
+			return err
 		}

-		cid := nds[0].Cid()
+		cid := ln.Cid()
 		if err := res.Emit(&OutputObject{Cid: cid}); err != nil {
 			return err
 		}
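In the new `dagPut`, the CID is derived from the re-encoded block bytes via `cidPrefix.Sum` with version 1, the storage codec, and the chosen hash. As a rough illustration of what that produces for the default `sha2-256` hash, the standard-library-only sketch below builds the binary layout of a CIDv1 (version varint, codec varint, then a multihash) and renders it as lowercase base32 with the `b` multibase prefix. The `cidV1` and `appendUvarint` helpers are hypothetical and handle only sha2-256; go-ipfs uses the go-cid library for this step.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"encoding/binary"
	"fmt"
)

const (
	codecDagPB   = 0x70 // multicodec code for dag-pb
	codecDagCBOR = 0x71 // multicodec code for dag-cbor
	hashSHA2_256 = 0x12 // multicodec code for sha2-256
)

// lowerBase32 is RFC 4648 base32 with a lowercase alphabet and no padding.
var lowerBase32 = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").WithPadding(base32.NoPadding)

// appendUvarint appends v as an unsigned varint.
func appendUvarint(b []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(b, tmp[:n]...)
}

// cidV1 returns the base32 string form of a CIDv1 over data,
// using the given codec and a full-length sha2-256 digest.
func cidV1(codec uint64, data []byte) string {
	digest := sha256.Sum256(data)
	buf := appendUvarint(nil, 1)                   // CID version
	buf = appendUvarint(buf, codec)                // content codec
	buf = appendUvarint(buf, hashSHA2_256)         // multihash: hash function
	buf = appendUvarint(buf, uint64(len(digest)))  // multihash: digest length
	buf = append(buf, digest[:]...)                // multihash: digest
	return "b" + lowerBase32.EncodeToString(buf)
}

func main() {
	block := []byte("example block bytes")
	fmt.Println(cidV1(codecDagCBOR, block))
	fmt.Println(cidV1(codecDagPB, block))
}
```

Because the CID depends only on the stored bytes, codec, and hash, feeding the exact output of `dag get` back into `dag put` with the same codec should reproduce the same CID, which is the round-trip property the changelog describes.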

core/core.go

Lines changed: 14 additions & 13 deletions
@@ -17,14 +17,14 @@ import (
 	"github.com/ipfs/go-ipfs-pinner"

 	bserv "github.com/ipfs/go-blockservice"
+	"github.com/ipfs/go-fetcher"
 	"github.com/ipfs/go-graphsync"
 	bstore "github.com/ipfs/go-ipfs-blockstore"
 	exchange "github.com/ipfs/go-ipfs-exchange-interface"
 	"github.com/ipfs/go-ipfs-provider"
 	ipld "github.com/ipfs/go-ipld-format"
 	logging "github.com/ipfs/go-log"
 	mfs "github.com/ipfs/go-mfs"
-	resolver "github.com/ipfs/go-path/resolver"
 	goprocess "github.com/jbenet/goprocess"
 	connmgr "github.com/libp2p/go-libp2p-core/connmgr"
 	ic "github.com/libp2p/go-libp2p-core/crypto"
@@ -70,18 +70,19 @@ type IpfsNode struct {
 	PNetFingerprint libp2p.PNetFingerprint `optional:"true"` // fingerprint of private network

 	// Services
-	Peerstore pstore.Peerstore `optional:"true"` // storage for other Peer instances
-	Blockstore bstore.GCBlockstore // the block store (lower level)
-	Filestore *filestore.Filestore `optional:"true"` // the filestore blockstore
-	BaseBlocks node.BaseBlocks // the raw blockstore, no filestore wrapping
-	GCLocker bstore.GCLocker // the locker used to protect the blockstore during gc
-	Blocks bserv.BlockService // the block service, get/add blocks.
-	DAG ipld.DAGService // the merkle dag service, get/add objects.
-	Resolver *resolver.Resolver // the path resolution system
-	Reporter *metrics.BandwidthCounter `optional:"true"`
-	Discovery discovery.Service `optional:"true"`
-	FilesRoot *mfs.Root
-	RecordValidator record.Validator
+	Peerstore pstore.Peerstore `optional:"true"` // storage for other Peer instances
+	Blockstore bstore.GCBlockstore // the block store (lower level)
+	Filestore *filestore.Filestore `optional:"true"` // the filestore blockstore
+	BaseBlocks node.BaseBlocks // the raw blockstore, no filestore wrapping
+	GCLocker bstore.GCLocker // the locker used to protect the blockstore during gc
+	Blocks bserv.BlockService // the block service, get/add blocks.
+	DAG ipld.DAGService // the merkle dag service, get/add objects.
+	IPLDFetcherFactory fetcher.Factory `name:"ipldFetcher"` // fetcher that paths over the IPLD data model
+	UnixFSFetcherFactory fetcher.Factory `name:"unixfsFetcher"` // fetcher that interprets UnixFS data
+	Reporter *metrics.BandwidthCounter `optional:"true"`
+	Discovery discovery.Service `optional:"true"`
+	FilesRoot *mfs.Root
+	RecordValidator record.Validator

 	// Online
 	PeerHost p2phost.Host `optional:"true"` // the network host (server+client)

core/coreapi/coreapi.go

Lines changed: 17 additions & 13 deletions
@@ -19,11 +19,12 @@ import (
 	"fmt"

 	bserv "github.com/ipfs/go-blockservice"
-	"github.com/ipfs/go-ipfs-blockstore"
-	"github.com/ipfs/go-ipfs-exchange-interface"
+	"github.com/ipfs/go-fetcher"
+	blockstore "github.com/ipfs/go-ipfs-blockstore"
+	exchange "github.com/ipfs/go-ipfs-exchange-interface"
 	offlinexch "github.com/ipfs/go-ipfs-exchange-offline"
-	"github.com/ipfs/go-ipfs-pinner"
-	"github.com/ipfs/go-ipfs-provider"
+	pin "github.com/ipfs/go-ipfs-pinner"
+	provider "github.com/ipfs/go-ipfs-provider"
 	offlineroute "github.com/ipfs/go-ipfs-routing/offline"
 	ipld "github.com/ipfs/go-ipld-format"
 	dag "github.com/ipfs/go-merkledag"
@@ -55,13 +56,14 @@ type CoreAPI struct {
 	baseBlocks blockstore.Blockstore
 	pinning pin.Pinner

-	blocks bserv.BlockService
-	dag ipld.DAGService
-
-	peerstore pstore.Peerstore
-	peerHost p2phost.Host
-	recordValidator record.Validator
-	exchange exchange.Interface
+	blocks bserv.BlockService
+	dag ipld.DAGService
+	ipldFetcherFactory fetcher.Factory
+	unixFSFetcherFactory fetcher.Factory
+	peerstore pstore.Peerstore
+	peerHost p2phost.Host
+	recordValidator record.Validator
+	exchange exchange.Interface

 	namesys namesys.NameSystem
 	routing routing.Routing
@@ -167,8 +169,10 @@ func (api *CoreAPI) WithOptions(opts ...options.ApiOption) (coreiface.CoreAPI, e
 		baseBlocks: n.BaseBlocks,
 		pinning: n.Pinning,

-		blocks: n.Blocks,
-		dag: n.DAG,
+		blocks: n.Blocks,
+		dag: n.DAG,
+		ipldFetcherFactory: n.IPLDFetcherFactory,
+		unixFSFetcherFactory: n.UnixFSFetcherFactory,

 		peerstore: n.Peerstore,
 		peerHost: n.PeerHost,

core/coreapi/path.go

Lines changed: 10 additions & 14 deletions
@@ -8,10 +8,10 @@ import (
 	"github.com/ipfs/go-namesys/resolve"

 	"github.com/ipfs/go-cid"
+	"github.com/ipfs/go-fetcher"
 	ipld "github.com/ipfs/go-ipld-format"
 	ipfspath "github.com/ipfs/go-path"
-	"github.com/ipfs/go-path/resolver"
-	uio "github.com/ipfs/go-unixfs/io"
+	ipfspathresolver "github.com/ipfs/go-path/resolver"
 	coreiface "github.com/ipfs/interface-go-ipfs-core"
 	path "github.com/ipfs/interface-go-ipfs-core/path"
 )
@@ -49,23 +49,19 @@ func (api *CoreAPI) ResolvePath(ctx context.Context, p path.Path) (path.Resolved
 		return nil, err
 	}

-	var resolveOnce resolver.ResolveOnce
-
-	switch ipath.Segments()[0] {
-	case "ipfs":
-		resolveOnce = uio.ResolveUnixfsOnce
-	case "ipld":
-		resolveOnce = resolver.ResolveSingle
-	default:
+	if ipath.Segments()[0] != "ipfs" && ipath.Segments()[0] != "ipld" {
 		return nil, fmt.Errorf("unsupported path namespace: %s", p.Namespace())
 	}

-	r := &resolver.Resolver{
-		DAG: api.dag,
-		ResolveOnce: resolveOnce,
+	var dataFetcher fetcher.Factory
+	if ipath.Segments()[0] == "ipld" {
+		dataFetcher = api.ipldFetcherFactory
+	} else {
+		dataFetcher = api.unixFSFetcherFactory
 	}
+	resolver := ipfspathresolver.NewBasicResolver(dataFetcher)

-	node, rest, err := r.ResolveToLastNode(ctx, ipath)
+	node, rest, err := resolver.ResolveToLastNode(ctx, ipath)
 	if err != nil {
 		return nil, err
 	}
