Skip to content

Commit a9cfd55

Browse files
committed
encoding/xml: replace comments inside directives with a space
A Directive (like <!ENTITY xxx []>) can't have other nodes nested inside it (in our data structure representation), so there is no way to preserve comments. The previous behavior was to just elide them, which however might change the semantic meaning of the surrounding markup. Instead, replace them with a space which hopefully has the same semantic effect of the comment. Directives are not actually a node type in the XML spec, which instead specifies each of them separately (<!ENTITY, <!DOCTYPE, etc.), each with its own grammar. The rules for where and when the comments are allowed are not straightforward, and can't be implemented without implementing custom logic for each of the directives. Simply preserving the comments in the body of the directive would be problematic, as there can be unmatched quotes inside the comment. Whether those quotes are considered meaningful semantically or not, other parsers might disagree and interpret the output differently. This issue was reported by Juho Nurminen of Mattermost as it leads to round-trip mismatches. See #43168. It's not being fixed in a security release because round-trip stability is not a currently supported security property of encoding/xml, and we don't believe these fixes would be sufficient to reliably guarantee it in the future. Fixes CVE-2020-29510 Updates #43168 Change-Id: Icd86c75beff3e1e0689543efebdad10ed5178ce3 Reviewed-on: https://go-review.googlesource.com/c/go/+/277893 Run-TryBot: Filippo Valsorda <[email protected]> TryBot-Result: Go Bot <[email protected]> Trust: Filippo Valsorda <[email protected]> Reviewed-by: Katie Hockman <[email protected]>
1 parent 4d014e7 commit a9cfd55

File tree

2 files changed

+13
-6
lines changed

2 files changed

+13
-6
lines changed

src/encoding/xml/xml.go

+6
Original file line numberDiff line numberDiff line change
@@ -768,6 +768,12 @@ func (d *Decoder) rawToken() (Token, error) {
768768
}
769769
b0, b1 = b1, b
770770
}
771+
772+
// Replace the comment with a space in the returned Directive
773+
// body, so that markup parts that were separated by the comment
774+
// (like a "<" and a "!") don't get joined when re-encoding the
775+
// Directive, taking new semantic meaning.
776+
d.buf.WriteByte(' ')
771777
}
772778
}
773779
return Directive(d.buf.Bytes()), nil

src/encoding/xml/xml_test.go

+7-6
Original file line numberDiff line numberDiff line change
@@ -802,11 +802,11 @@ var directivesWithCommentsInput = `
802802

803803
var directivesWithCommentsTokens = []Token{
804804
CharData("\n"),
805-
Directive(`DOCTYPE [<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">]`),
805+
Directive(`DOCTYPE [ <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">]`),
806806
CharData("\n"),
807-
Directive(`DOCTYPE [<!ENTITY go "Golang">]`),
807+
Directive(`DOCTYPE [<!ENTITY go "Golang"> ]`),
808808
CharData("\n"),
809-
Directive(`DOCTYPE <!-> <!> [<!ENTITY go "Golang">]`),
809+
Directive(`DOCTYPE <!-> <!> [<!ENTITY go "Golang"> ]`),
810810
CharData("\n"),
811811
}
812812

@@ -1051,9 +1051,10 @@ func testRoundTrip(t *testing.T, input string) {
10511051

10521052
func TestRoundTrip(t *testing.T) {
10531053
tests := map[string]string{
1054-
"leading colon": `<::Test ::foo="bar"><:::Hello></:::Hello><Hello></Hello></::Test>`,
1055-
"trailing colon": `<foo abc:="x"></foo>`,
1056-
"double colon": `<x:y:foo></x:y:foo>`,
1054+
"leading colon": `<::Test ::foo="bar"><:::Hello></:::Hello><Hello></Hello></::Test>`,
1055+
"trailing colon": `<foo abc:="x"></foo>`,
1056+
"double colon": `<x:y:foo></x:y:foo>`,
1057+
"comments in directives": `<!ENTITY x<!<!-- c1 [ " -->--x --> > <e></e> <!DOCTYPE xxx [ x<!-- c2 " -->--x ]>`,
10571058
}
10581059
for name, input := range tests {
10591060
t.Run(name, func(t *testing.T) { testRoundTrip(t, input) })

0 commit comments

Comments
 (0)