HDFS data transfer encryption support #145

Closed
@mxk1235

I was attempting to use this library against an HDFS cluster that has hadoop.rpc.protection set to privacy in core-site.xml and dfs.encrypt.data.transfer set to true in hdfs-site.xml. I believe both settings apply to the protobuf RPC interface.
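
For context, the three documented values of hadoop.rpc.protection correspond to the SASL quality-of-protection levels auth, auth-int and auth-conf. The sketch below shows that mapping only; `qopForProtection` is a hypothetical helper and not part of this library, and it assumes a single value rather than the comma-separated list Hadoop also accepts.

```go
package main

import (
	"fmt"
	"strings"
)

// qopForProtection is a hypothetical helper (not part of colinmarc/hdfs).
// It maps a single hadoop.rpc.protection value to the SASL
// quality-of-protection token it corresponds to.
func qopForProtection(v string) (string, error) {
	switch strings.ToLower(strings.TrimSpace(v)) {
	case "authentication":
		return "auth", nil // authentication only
	case "integrity":
		return "auth-int", nil // authentication plus integrity checks
	case "privacy":
		return "auth-conf", nil // authentication, integrity and encryption
	default:
		return "", fmt.Errorf("unknown protection level %q", v)
	}
}

func main() {
	for _, v := range []string{"authentication", "integrity", "privacy"} {
		qop, err := qopForProtection(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", v, qop)
	}
}
```

With privacy, every RPC payload after the handshake is expected to be wrapped and encrypted, which is why a client that only negotiates authentication may fail early.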

The error message I received was the following:
no available namenodes: SASL handshake: wrong Token ID. Expected 0504, was 6030

After some debugging, I think it occurs here: https://github.com/colinmarc/hdfs/blob/master/internal/rpc/kerberos.go#L67. I suspect the namenode is replying with an encrypted message, while doKerberosHandshake() expects an unencrypted one.
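
To help read the error above, here is a rough sketch that classifies the first two bytes of a GSS-API token. It assumes the four hex digits in the message are the leading two bytes of the token the namenode sent. In RFC 4121, wrap tokens start with 05 04 and MIC tokens with 04 04, while 0x60 is the ASN.1 [APPLICATION 0] tag used by the generic GSS-API token framing in RFC 2743. `describeToken` is a hypothetical helper, and the output is one possible reading, not a confirmed diagnosis.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// describeToken is a hypothetical helper that inspects the first two
// bytes of a GSS-API token and reports which family it appears to be in.
func describeToken(b []byte) string {
	if len(b) < 2 {
		return "too short to classify"
	}
	id := binary.BigEndian.Uint16(b[:2])
	switch {
	case id == 0x0504:
		return "RFC 4121 wrap token"
	case id == 0x0404:
		return "RFC 4121 MIC token"
	case b[0] == 0x60:
		// The second byte is then part of the ASN.1 length, not a token ID.
		return "ASN.1 [APPLICATION 0] framed GSS-API token (RFC 2743)"
	default:
		return fmt.Sprintf("unrecognized token ID %04x", id)
	}
}

func main() {
	fmt.Println(describeToken([]byte{0x05, 0x04})) // what the client expected
	fmt.Println(describeToken([]byte{0x60, 0x30})) // what the error reports
}
```

If this reading is right, the reply uses a different token framing from the one the handshake code expects, which would be consistent with the suspicion above but does not prove it.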

At first glance, the library just sets the default value of the dfs.encrypt.data.transfer property to false:

optional bool encryptDataTransfer = 6 [default = false];

and there is no way to create a client with that property set to true.

hdfs/client.go

Line 122 in f87e1d6

func NewClient(options ClientOptions) (*Client, error) {

There is a fetchDefaults() function, but it is only invoked by file_writer, not by file_reader (e.g. the Stat(), Readdir(), and Read() methods).

Can you comment on whether I'm digging in the right place, and whether the encrypted part of the protocol also applies to the read functionality?

Here are the relevant properties from core-site.xml:

  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>

and from hdfs-site.xml:

  <property>
    <name>dfs.encrypt.data.transfer.algorithm</name>
    <value>3des</value>
  </property>
  <property>
    <name>dfs.encrypt.data.transfer.cipher.suites</name>
    <value>AES/CTR/NoPadding</value>
  </property>
  <property>
    <name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
    <value>256</value>
  </property>
  <property>
    <name>dfs.namenode.acls.enabled</name>
    <value>true</value>
  </property>
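
As a rough illustration of how a client could detect these settings, the sketch below parses Hadoop-style property XML with the Go standard library and reports the protection level and whether data transfer encryption is switched on. It assumes the properties sit inside the usual `<configuration>` root element; `parseProperties` and the inline `sample` document are hypothetical and are not the library's own configuration loader.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// property and configuration mirror the layout of Hadoop *-site.xml files.
type property struct {
	Name  string `xml:"name"`
	Value string `xml:"value"`
}

type configuration struct {
	Properties []property `xml:"property"`
}

// parseProperties is a hypothetical helper that flattens a Hadoop
// configuration document into a name -> value map.
func parseProperties(doc string) (map[string]string, error) {
	var c configuration
	if err := xml.NewDecoder(strings.NewReader(doc)).Decode(&c); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(c.Properties))
	for _, p := range c.Properties {
		out[strings.TrimSpace(p.Name)] = strings.TrimSpace(p.Value)
	}
	return out, nil
}

// sample is an illustrative document built from properties quoted above.
const sample = `<configuration>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
  <property>
    <name>dfs.encrypt.data.transfer.cipher.suites</name>
    <value>AES/CTR/NoPadding</value>
  </property>
</configuration>`

func main() {
	props, err := parseProperties(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("rpc protection:", props["hadoop.rpc.protection"])
	// A missing key yields "", which matches Hadoop's default of false.
	fmt.Println("encrypt data transfer:", props["dfs.encrypt.data.transfer"] == "true")
}
```

A client that read these values could, in principle, refuse to connect or pick a different handshake path when privacy is required, instead of failing partway through the SASL exchange.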
