Skip to content
This repository was archived by the owner on Apr 3, 2020. It is now read-only.

Commit 5469386

Browse files
tyoshinoCommit bot
authored and
Commit bot
committed
[Regression fix] [Data URI parser] Accept data URI with invalid mediatype data
r272022 introduced grammar check on the mediatype part in a data URI. This broke applications that depend on the behavior that our WebURLLoaderImpl and URLRequestDataJob accept data URIs with incomplete mediatype such as "data:image;base64,xxx". Accept this kind of data URI again for backward compatibility. Instead, fallback to the default mediatype when ParseMimeTypeWithoutParameter() returns false in DataURL::Parse() to avoid generating Content-type header with invalid mediatype. BUG=412479 Review URL: https://codereview.chromium.org/555383003 Cr-Commit-Position: refs/heads/master@{#297593}
1 parent 46e7f9b commit 5469386

File tree

4 files changed

+54
-13
lines changed

4 files changed

+54
-13
lines changed

net/base/data_url.cc

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -66,10 +66,16 @@ bool DataURL::Parse(const GURL& url, std::string* mime_type,
6666
}
6767

6868
if (mime_type->empty()) {
69-
// fallback to defaults if nothing specified in the URL:
69+
// Fallback to the default if nothing specified in the mediatype part as
70+
// specified in RFC2045. As specified in RFC2397, we use |charset| even if
71+
// |mime_type| is empty.
7072
mime_type->assign("text/plain");
7173
} else if (!ParseMimeTypeWithoutParameter(*mime_type, NULL, NULL)) {
72-
return false;
74+
// Fallback to the default as recommended in RFC2045 when the mediatype
75+
// value is invalid. For this case, we don't respect |charset| but force it
76+
// set to "US-ASCII".
77+
mime_type->assign("text/plain");
78+
charset->assign("US-ASCII");
7379
}
7480
if (charset->empty())
7581
charset->assign("US-ASCII");

net/base/data_url.h

Lines changed: 17 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,8 +35,23 @@ class NET_EXPORT DataURL {
3535
// decoded data (e.g.., if the data URL specifies base64 encoding, then the
3636
// returned data is base64 decoded, and any %-escaped bytes are unescaped).
3737
//
38-
// If the URL is malformed, then this method will return false, and its
39-
// output variables will remain unchanged. On success, true is returned.
38+
// If the media type value doesn't match the media-type production defined in
39+
// RFC 7231, mime_type will be set to the default value "text/plain". We
40+
// don't simply fail for this grammar violation since Chromium had been
41+
// accepting such invalid values. For example, <img> element with the src
42+
// attribute set to a data URL with an invalid media type "image" (without a
43+
// slash and subtype) had been displayed. However, the value this method will
44+
// store in mime_type argument can be used for generating other headers, etc.
45+
// This could lead to security vulnerability. We don't want to accept
46+
// arbitrary value and ask each caller to validate the return value.
47+
//
48+
// If the charset parameter is specified but its value doesn't match the
49+
// token production defined in RFC 7230, this method simply fails and returns
50+
// false.
51+
//
52+
// If there's any other grammar violation in the URL, then this method will
53+
// return false. Output variables may be changed and contain invalid data. On
54+
// success, true is returned.
4055
//
4156
// OPTIONAL: If |data| is NULL, then the <data> section will not be parsed
4257
// or validated.

net/base/data_url_unittest.cc

Lines changed: 22 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,28 @@ TEST(DataURLTest, Parse) {
6363
"US-ASCII",
6464
"hello world" },
6565

66+
// Allow invalid mediatype for backward compatibility but set mime_type to
67+
// "text/plain" instead of the invalid mediatype.
68+
{ "data:foo,boo",
69+
true,
70+
"text/plain",
71+
"US-ASCII",
72+
"boo" },
73+
74+
// When accepting an invalid mediatype, override charset with "US-ASCII"
75+
{ "data:foo;charset=UTF-8,boo",
76+
true,
77+
"text/plain",
78+
"US-ASCII",
79+
"boo" },
80+
81+
// Invalid mediatype. Includes a slash but the type part is not a token.
82+
{ "data:f(oo/bar;baz=1;charset=kk,boo",
83+
true,
84+
"text/plain",
85+
"US-ASCII",
86+
"boo" },
87+
6688
{ "data:foo/bar;baz=1;charset=kk,boo",
6789
true,
6890
"foo/bar",
@@ -88,13 +110,6 @@ TEST(DataURLTest, Parse) {
88110
"US-ASCII",
89111
"<html><body><b>hello world</b></body></html>" },
90112

91-
// Bad mime type
92-
{ "data:f(oo/bar;baz=1;charset=kk,boo",
93-
false,
94-
"",
95-
"",
96-
"" },
97-
98113
// the comma cannot be url-escaped!
99114
{ "data:%2Cblah",
100115
false,

net/url_request/url_request_data_job_unittest.cc

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -63,12 +63,17 @@ TEST(BuildResponseTest, InvalidMimeType) {
6363
scoped_refptr<net::HttpResponseHeaders> headers(
6464
new net::HttpResponseHeaders(std::string()));
6565

66-
// MIME type contains delimiters. Must be rejected.
66+
// MIME type contains delimiters. Must be accepted but Content-Type header
67+
// should be generated as if the mediatype was text/plain.
6768
EXPECT_EQ(
68-
net::ERR_INVALID_URL,
69+
net::OK,
6970
URLRequestDataJob::BuildResponse(
7071
GURL("data:f(o/b)r,test"),
7172
&mime_type, &charset, &data, headers.get()));
73+
74+
std::string value;
75+
EXPECT_TRUE(headers->GetNormalizedHeader("Content-Type", &value));
76+
EXPECT_EQ(value, "text/plain;charset=US-ASCII");
7277
}
7378

7479
TEST(BuildResponseTest, InvalidCharset) {

0 commit comments

Comments
 (0)