Destination S3: Files generated are not accessible using S3fs-fuse #16576

Open
@yousefhosny1

Environment

  • Airbyte version: 0.40.2
  • OS Version / Instance: AWS EC2
  • Deployment: Docker
  • Source Connector and version: Any source
  • Destination Connector and version: S3 0.3.15
  • Step where error happened: Sync job

Current Behavior

When I create an Airbyte connection that extracts data from any source and loads it into an S3 bucket (destination), I am unable to access the resulting files and folders using s3fs-fuse. The directories created by Airbyte in S3 appear as files in the mount, so I cannot ls or cd into them, as shown in the image below.
image (1)

However, in S3 they are folders and they contain files, as shown in the images below.
image
image

Finally, any folder that I create myself and populate manually (not through Airbyte) is accessible through s3fs-fuse.

I am not sure whether this is because Airbyte writes files with key prefixes instead of nested folders. More details here
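One way to test the prefix hypothesis is to look at the raw key listing and see which "folders" exist only as a prefix of other keys, with no object of their own. The sketch below is a diagnostic aid, not part of Airbyte or s3fs: the function name implied_dirs_without_marker is hypothetical, and it assumes one object key per line on stdin.

```shell
# implied_dirs_without_marker (hypothetical helper):
# read object keys, one per line, on stdin and print every "directory"
# prefix that is implied by some key but has no object of its own
# (that is, no key exactly equal to "prefix/").
implied_dirs_without_marker() {
  awk '
    {
      seen[$0] = 1                      # remember every key as listed
      n = split($0, parts, "/")
      prefix = ""
      # Each leading path component followed by "/" is an implied directory.
      for (i = 1; i < n; i++) {
        prefix = prefix parts[i] "/"
        implied[prefix] = 1
      }
    }
    END {
      for (p in implied)
        if (!(p in seen)) print p       # implied, but no marker object
    }
  ' | sort
}
```

Assuming the AWS CLI is installed and configured, the keys could be fed in with something like aws s3api list-objects-v2 --bucket bucket_name --query 'Contents[].Key' --output text | tr '\t' '\n' | implied_dirs_without_marker. Prefixes printed by the helper have no directory object at all; prefixes that are not printed do have one, and its metadata can then be inspected separately.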

Expected Behavior

I expect that when Airbyte loads data into S3, the folders are not corrupted and do not appear as files when using s3fs-fuse.
I think the cause is that the directory objects created by Airbyte do not have the x-amz-meta-mode header.
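The header hypothesis can be checked directly with the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed and has read access to the bucket; check_dir_mode is a hypothetical helper name, and the bucket and key are placeholders. head-object reports user metadata without the x-amz-meta- prefix, so x-amz-meta-mode shows up as Metadata.mode.

```shell
# check_dir_mode BUCKET KEY (hypothetical helper)
# Prints one of:
#   mode=<value>         the object has x-amz-meta-mode
#   no-mode-metadata     the object exists but has no x-amz-meta-mode
#   head-object-failed   the call failed (no such key, no access, ...)
check_dir_mode() {
  local bucket="$1" key="$2" mode
  if ! mode="$(aws s3api head-object --bucket "$bucket" --key "$key" \
        --query 'Metadata.mode' --output text 2>/dev/null)"; then
    echo "head-object-failed"
    return 0
  fi
  # With --output text, a missing field is printed as the literal "None".
  if [ -z "$mode" ] || [ "$mode" = "None" ]; then
    echo "no-mode-metadata"
  else
    echo "mode=$mode"
  fi
}
```

For example, check_dir_mode bucket_name some_airbyte_folder/ could be compared against the same call for a folder created manually through s3fs. If the manual folder reports a mode and the Airbyte folder does not, that would support the explanation above.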

Logs

Uploaded
logs-103.txt

Steps to Reproduce

  1. Create an EL connection between any source and an S3 bucket / directory
  2. Run the connection
  3. Install s3fs-fuse: sudo apt install s3fs (or check this for more installation options)
  4. Create a password file containing your access key ID and secret access key: echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ${HOME}/.passwd-s3fs
  5. Mount the bucket on your local filesystem: s3fs bucket_name local_path -o passwd_file=/home/yousef/.passwd-s3fs, e.g. s3fs landing_zone /home/yousef/landing_zone -o passwd_file=/home/yousef/.passwd-s3fs
  6. cd into the mounted bucket and try to query any folder that was created / populated by Airbyte; you'll find that it is inaccessible and is treated as a file
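Steps 4 to 6 can be sketched as small shell functions. This is an illustrative sketch, not a script from the original report: the function names are hypothetical, and the credentials, bucket name, and paths are placeholders. It also restricts the password file to mode 600, since s3fs expects that file to be readable only by its owner.

```shell
# write_passwd_file PATH ACCESS_KEY_ID SECRET_ACCESS_KEY (step 4)
write_passwd_file() {
  local path="$1" key_id="$2" secret="$3"
  # Create the file with a restrictive umask, then enforce mode 600.
  ( umask 077; printf '%s:%s\n' "$key_id" "$secret" > "$path" )
  chmod 600 "$path"
}

# mount_bucket BUCKET MOUNT_POINT PASSWD_FILE (step 5)
mount_bucket() {
  local bucket="$1" mount_point="$2" passwd_file="$3"
  mkdir -p "$mount_point"
  s3fs "$bucket" "$mount_point" -o passwd_file="$passwd_file"
}

# list_entry_types DIR (step 6)
# Print "dir" or "file" for each entry, to show how the mount presents it.
list_entry_types() {
  local dir="$1" entry
  for entry in "$dir"/*; do
    if [ -d "$entry" ]; then
      echo "dir  $entry"
    elif [ -f "$entry" ]; then
      echo "file $entry"
    fi
  done
}
```

A run might look like write_passwd_file "${HOME}/.passwd-s3fs" ACCESS_KEY_ID SECRET_ACCESS_KEY, then mount_bucket landing_zone /home/yousef/landing_zone "${HOME}/.passwd-s3fs", then list_entry_types /home/yousef/landing_zone. With the behavior described in this issue, folders written by Airbyte would be expected to show up as "file" entries, while manually created folders show up as "dir".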

Are you willing to submit a PR?

No
