BlobWriteChannel CRC32C and MD5 hash mismatch error #267

Closed
@mziccard

The following code:

// Create the blob with its initial content.
BlobId blobId = BlobId.of("bucket", "blob_name");
BlobInfo blobInfo = BlobInfo.builder(blobId).contentType("text/plain").build();
storage.create(blobInfo, "Hello, Cloud Storage!".getBytes(UTF_8));
// Fetch the blob's metadata back from the service, then write new content through a channel.
BlobInfo updatedBlobInfo = storage.get(blobId);
WritableByteChannel channel = storage.writer(updatedBlobInfo);
channel.write(ByteBuffer.wrap("Updated content".getBytes(UTF_8)));
channel.close();

causes an error:

Caused by: com.google.gcloud.storage.StorageException: 400 Bad Request
{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "invalid",
        "message": "Provided CRC32C \"IgKckQ==\" doesn't match calculated CRC32C \"c5cgUw==\"."
      },
      {
        "domain": "global",
        "reason": "invalid",
        "message": "Provided MD5 hash \"paCw+9t7LhjISSAPBaeazA==\" doesn't match calculated MD5 hash \"SFCAEvYjPzpCjkKQr6MGGg==\"."
      }
    ],
    "code": 400,
    "message": "Provided CRC32C \"IgKckQ==\" doesn't match calculated CRC32C \"c5cgUw==\"."
  }
}

If instead the stale BlobInfo object, the one built locally before the blob was created, is passed to storage.writer(blobInfo):

// ...
BlobInfo blobInfo = BlobInfo.builder(blobId).contentType("text/plain").build();
storage.create(blobInfo, "Hello, Cloud Storage!".getBytes(UTF_8));
WritableByteChannel channel = storage.writer(blobInfo);
channel.write(ByteBuffer.wrap("Updated content".getBytes(UTF_8)));
channel.close();

No error occurs.
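To check which payload each hash in the error message belongs to, the hashes can be recomputed locally. The following is a minimal sketch, assuming Java 9 or later (for java.util.zip.CRC32C) and assuming the service's usual encoding: base64 of the raw MD5 digest, and base64 of the CRC32C value in big-endian byte order. The class and method names (HashSketch, md5Base64, crc32cBase64) are made up for illustration and are not part of the library.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.zip.CRC32C;

// Hypothetical helper, not part of the storage library.
final class HashSketch {

  // Base64 of the 16-byte MD5 digest of the content.
  static String md5Base64(byte[] content) {
    try {
      byte[] digest = MessageDigest.getInstance("MD5").digest(content);
      return Base64.getEncoder().encodeToString(digest);
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  // Base64 of the 4-byte CRC32C checksum, most significant byte first.
  static String crc32cBase64(byte[] content) {
    CRC32C crc = new CRC32C();
    crc.update(content, 0, content.length);
    // ByteBuffer is big-endian by default.
    byte[] bigEndian = ByteBuffer.allocate(4).putInt((int) crc.getValue()).array();
    return Base64.getEncoder().encodeToString(bigEndian);
  }

  public static void main(String[] args) {
    for (String s : new String[] {"Hello, Cloud Storage!", "Updated content"}) {
      byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
      System.out.println(s + " -> md5=" + md5Base64(bytes) + " crc32c=" + crc32cBase64(bytes));
    }
  }
}
```

Comparing the printed values with the "Provided" and "calculated" hashes in the error should show whether the provided ones come from the original content and the calculated ones from the new content.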

It seems to me that if a "complete" StorageObject is passed to DefaultStorageRpc.open, its md5Hash and crc32c fields are sent with the request for the upload id (see this line). At the end of the upload, if those values do not match the hashes of the uploaded data, the whole upload fails.
Do you think this is a desirable default behavior?
I think it is not. I would rather add options to storage.writer(...) that let users explicitly request the md5Hash and crc32c checks when they have already computed the hashes and put them into the BlobInfo object (i.e. if they know what they're doing).
Thoughts?
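One possible shape for the opt-in behavior proposed above, as a rough sketch only. Everything here is hypothetical: the BlobWriteOption enum, the Hashes holder, and the hashesToSend method are invented for illustration and do not exist in the library. The idea is that the hashes stored in the BlobInfo are dropped from the upload request unless the caller explicitly asks for them to be checked.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of opt-in hash validation; none of these names are real library API.
final class WriterOptionSketch {

  // Hypothetical options a caller could pass to storage.writer(...).
  enum BlobWriteOption { MD5_MATCH, CRC32C_MATCH }

  // Stand-in for the hash fields carried by a blob's metadata.
  static final class Hashes {
    final String md5;
    final String crc32c;

    Hashes(String md5, String crc32c) {
      this.md5 = md5;
      this.crc32c = crc32c;
    }
  }

  // Keep a hash in the upload request only if the matching option was given.
  static Hashes hashesToSend(Hashes fromBlobInfo, Set<BlobWriteOption> options) {
    String md5 = options.contains(BlobWriteOption.MD5_MATCH) ? fromBlobInfo.md5 : null;
    String crc32c = options.contains(BlobWriteOption.CRC32C_MATCH) ? fromBlobInfo.crc32c : null;
    return new Hashes(md5, crc32c);
  }

  public static void main(String[] args) {
    // Hash values taken from the error message above, used here only as sample data.
    Hashes stored = new Hashes("paCw+9t7LhjISSAPBaeazA==", "IgKckQ==");

    // Default: no hashes are sent, so writing new content cannot fail on a mismatch.
    Hashes byDefault = hashesToSend(stored, EnumSet.noneOf(BlobWriteOption.class));
    System.out.println(byDefault.md5 + " " + byDefault.crc32c);

    // Explicit opt-in: only the requested hash is sent for server-side validation.
    Hashes optIn = hashesToSend(stored, EnumSet.of(BlobWriteOption.CRC32C_MATCH));
    System.out.println(optIn.md5 + " " + optIn.crc32c);
  }
}
```

With this default, passing a BlobInfo returned by storage.get(blobId) to the writer would behave like passing the stale one, and validation would happen only on request.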

Labels

🚨 (This issue needs some love.)
api: storage (Issues related to the Cloud Storage API.)
triage me (I really want to be triaged.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
