Skip to content

[RFC] Add Support for File based Service Binding Information #804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 7, 2024
111 changes: 111 additions & 0 deletions toc/rfc/rfc-draft-add-support-for-file-based-service-binding.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
# Meta
[meta]: #meta
- Name: Add Support for File based Service Binding Information
- Start Date: 2024-03-11
- Author(s): @beyhan
- Contributing Authors: @stephanme, @loewenstein
- Status: Draft
- RFC Pull Request: (fill in with PR link after you submit it)



## Summary

The current contract between the CF platform and applications for service binding information is based on an environment variable. The Linux Kernel defines size limit per environment variable and there are also other limitations with this approach. That is why, CF should add support for an alternative option to provide service binding information which can address the limitations of the current approach.

## Problem

The CF platform provides Service Binding Information to hosted applications via the [VCAP_SERVICES environment variable](https://docs.cloudfoundry.org/devguide/services/application-binding.html). There are following challenges with this approach:
- The environment variable length has a hard limit of `131072` bytes per variable which is controlled by the underlying [Linux kernel](https://github.com/torvalds/linux/blob/master/include/uapi/linux/binfmts.h). The environment variable size is defined by `MAX_ARG_STRLEN`, which is a constant with the value `PAGE_SIZE*32` where page size is `4096` bytes. This means that to change the size limit for the CF platform a recompiled kernel with updated value for `MAX_ARG_STRLEN` is required. This limit could be an issue for applications using many services. If the limit is reached by an application, it will fail to stage as discussed in [this issue](https://github.com/cloudfoundry/garden-runc-release/issues/160).
- Updates of the Service Binding Information require restage. This is not optimal for an eventual support of [Binding rotation](https://github.com/openservicebrokerapi/servicebroker/blob/master/spec.md#binding-rotation) specification from the OSBI API spec.


## Proposal

The CFF community should implement an alternative approach for service binding information based on `tmpfs` file(s). Using a file or files to provide service binding information to applications will address the challenges listed in the problem section because:
- The file size limit can be controlled by the CF platform
- Already today CF uses `tmpfs` for [Instance Identity Credentials](https://docs.cloudfoundry.org/devguide/deploy-apps/instance-identity.html) which are rotated without restarting the application every `24h` by default

The two approaches should be supported in parallel. Users should be able to select which approach Cloud Controller should use to deliver the binding information. Applications with binding information >`130KB` have to go with the file option and adopt it. There are two alternatives regarding service binding file organization:
1. The `VCAP_SERVICES` content is stored in a file which location is specified via the `VCAP_SERVICES_FILE_PATH` env var in the same format as the `VCAP_SERVICES` environment variable
* Advantages:
* Less disruptive for applications consuming the `VCAP_SERVICES` env var
* Less implementation effort for the Cloud Controller
* Disadvantages:
* Can’t make use of tools and libraries from the Cloud Native community because K8s specifies a different file structure and format for the service binding information
2. Implement the K8s service binding specification. The environment variable `SERVICE_BINDING_ROOT` defines the location for the service bindings. There is a file per service binding. The name of the file and the format follow the [K8s specification](https://servicebinding.io/):
* Advantages:
* CF community could re-use service binding libraries from the Cloud Native community
* Moving application between CF or K8s deployments will be easier
* Disadvantages
* Higher implementation efforts for the Cloud Controller


The 2) alternatives offers more than just addressing the issue of this RFC. It suggests an option to evolve the CF platform towards a different service binding specification defined by the Cloud Native community. This means higher implementation efforts for the CF platform and application developers, but possible benefits from the Cloud Native community. This RFC has a light preference for the 2) alternative because of the listed advantages but the feedback of the CF community is wanted here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some app developers are going to want to make the minimum changes possible, and 1 definitely works there. Other app developers are going to porting an app from k8s, and 2 makes their path onto CF easier. Is it possible to offer both 1 and 2, and have developers pick which environment variable they'll use to find the file?

Copy link
Contributor

@dmikusa dmikusa Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My read is that users will be able to pick between using the existing env variable approach, which means zero change for their apps, or they can use the file-based approach which would mean some level of change for their app with either approach. So there is control for the user that doesn't want change.

Both of the implementations above sound like there would be work involved for an app to adopt. Minimally, you're going to need to know to read from a file instead. It's small, but if you've got apps that are not actively being developed, even that may be more work than you're willing to do.

Not sure if it needs to go into this proposal, but I would suggest that we target the common libraries that folks are using to read VCAP_SERVICES, like java-cfenv (I believe there are Python & Node.js libraries also) and get support for both formats, env variable and file, and then it's a lot less relevant how it's implemented. The libraries will smooth things over and possibly even make it so that no code change is required at the app level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My read is that users will be able to pick between using the existing env variable approach, which means zero change for their apps, or they can use the file-based approach which would mean some level of change for their app with either approach. So there is control for the user that doesn't want change.

This is correct. @mogul given this clarification do you still see benefits in supporting both options in parallel?

Not sure if it needs to go into this proposal, but I would suggest that we target the common libraries that folks are using to read VCAP_SERVICES, like java-cfenv (I believe there are Python & Node.js libraries also) and get support for both formats, env variable and file, and then it's a lot less relevant how it's implemented. The libraries will smooth things over and possibly even make it so that no code change is required at the app level.

@dmikusa Do you mean here option 1) or both? Most probably both but I wanted to check?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both. I think regardless of what we do, we can lean on the libraries to help with smoothing things over for hopefully a lot of users.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to comment on the reason for both options: There is a problem with the length of VCAP and running with existing VCAP + K8s service binding would not simplify to use the libraries to get the variables. So, to run both alternatives in parallel, I would propose to store the variables in both formats in two different files, not only keep the current VCAP in memory. This would solve the existing problem and also prepare for future service bindings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2️⃣

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2️⃣

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2️⃣

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2️⃣

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the votes! I didn't set an end date for the voting but most of the RFC reviewers and the TOC already voted and the winner is clear. That is why, I updated the RFC (88f5fb6) with the winning option. Please review the latest state.


Additionally, the application environment is stored in the `CCDB` and `BBS DB` that is why we should define a limit for the size of it, which makes it possible to be stored in the according DBs and doesn’t impact the performance of the communication between Cloud Controller and the Diego API. That is why this RFC suggests a limit of `1MB`, which is roughly ten times higher than the current one of 130KB. This is subject for evaluation during the implation of this RFC.

### Implementation Overview

Cloud Controller should introduce a new app feature for activating the file-based approach. This means that the [App Features API](https://v3-apidocs.cloudfoundry.org/version/3.159.0/index.html#app-features) could be used here and a new feature flag called “file-based-service-bindings" should be introduced.
The [contract](https://github.com/cloudfoundry/bbs/blob/main/doc/actions.md) between Cloud Controller and Diego should be extended so that file name and file content for the application container can be specified. E.g. the run action could look like this when file approach is selected:

```
action := &models.RunAction{
Path: "/path/to/executable",
Args: []string{"some", "args to", "pass in"},
Dir: "/path/to/working/directory",
User: "username",
EnvironmentVariables: []*models.EnvironmentVariable{
{
Name: "ENVNAME",
Value: "ENVVALUE",
},
},
Files: []*models.Files{
{
Name: "/etc/cf-instance-binding",
Value: "VALUE",
},
},
ResourceLimits: &models.ResourceLimits{
Nofile: 1000,
},
LogSource: "some-log-source",
SuppressLogOutput: false,
}
```

## Workstreams

### App Runtime Interfaces WG

Cloud Controller should be extended with an App Feature to activate the new file base Service Binding option. If the file-based service binding feature is active the Cloud Controller should generate a Run action, which configures the Service Bindings to be stored as tmpfs file(s) in the application container instead of `VCAP_SERVICES` environment variable.

Additionally, the suggest limit for the size should be implemented.
Copy link
Contributor

@rkoster rkoster Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add a section describing how the name and tags OSBAPI fields could be mapped to type and provider service binding fields (if these are not specified by the broker), addressing initial broker compatibility. This could for example be done based on well-known tags like postgres and mysql. This would be in line with the service-specific code that is currently used to populate the DATABASE_URL variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In libcnb for Cloud-Native Buildpacks, we implemented a translation from VCAP_SERVICES to CNB bindings. What we did was this:

  1. Credentials map was turned into key/value files (i.e. key is the file name, value is the file contents). We did not go into nested maps, so if a key in Credentials had a nested JSON value, then the JSON is written as the contents of the file.
  2. We used the Label from VCAP_SERVICES as the type.
  3. We used the top-level key from the VCAP_SERVICES map as the provider.

I haven't used it myself, but I know people are using it. @loewenstein - How has this mapping worked out in practice?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We contributed that to libcnb mainly because of the Paketo Dynatrace Buildpack and for that it works - as far as I can tell.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@loewenstein is buildpacks/libcnb#228 you are referring to? The mentioned mapping above isn't directly visible in the change but most probably I have to dive deeper into it.

Copy link
Contributor

@dmikusa dmikusa Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there were two or three PRs to arrive at the present implementation. You can see what we have settled on here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking into the implementation of the CC the top-level key from the VCAP_SERVICES seems to be also the label from the VCAP_SERVICES. The value is taken from the name attribute of the service offering as defined in the OSBAPI spec. I think it make sense to be consistent and have the same mapping as implemented in libcnb and will update the RFC with this if we agree to go with the 2) approach and don't here anything against the suggested mapping.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added details about the VCAP_SERVICES -> K8s bindings mapping with 88f5fb6. Please review and provide feedback.


### App Runtime Platform WG

Diego should add support for the new argument of the Run action to create files with the desired content. Like the [Instance Identity credentials](https://docs.cloudfoundry.org/devguide/deploy-apps/instance-identity.html) implementation, the [Diego Executor](https://github.com/cloudfoundry/executor) should be extended to prepare the `tmpfs` mount and create the required files for an application container. For reference there is a [CredManager](https://github.com/cloudfoundry/executor/blob/db9758c0142ae9c11dad26de672735fb20566105/depot/containerstore/credmanager.go) , InstanceIdentityHandler and the `tmpfs` mount is configured in the [Diego release](https://github.com/cloudfoundry/diego-release/blob/2d7d7c1373f2a61077c74e33a397a5f69b11b131/jobs/rep/templates/setup_mounted_data_dirs.erb#L38-L56) for the current implementation of the Instance Identity Credentials. The files should have security permissions like the Instance Identity Credentilas `644` or even higher `600` owned by the uid running the app if possible.

## Possible Future Work

The App Features API aren’t supported currently in the CF CLI and [app manifest](https://docs.cloudfoundry.org/devguide/deploy-apps/manifest-attributes.html). To make the use of this proposal for CF operators easier this should be addressed.

### App Manifest Attributes Proposal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding app features to the cf app manifest has been requested before for enabling ssh: cloudfoundry/cloud_controller_ng#2996

The proposal should allow to explicitly enable and disable a feature flag (i.e. not just by listing the flag). Otherwise it would be impossible to disable a feature flag because the CF app manifest is not declarative but additive (see https://v3-apidocs.cloudfoundry.org/#apply-a-manifest-to-a-space).

E.g.

applications:
- name: testapp
  features:
  - file-based-service-bindings: true

or, closer to the app-features v3 API,

applications:
- name: testapp
  features:
  - name: file-based-service-bindings
    enabled: true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and tried to address this with 0ae1408. Is that OK?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From concept point of view, it looks ok. But there is one yaml error in the 2nd proposal (see my comment below).


```
---
applications:
- name: test-app
features:
- file-based-service-bindings
```

### CF CLI new Commands Proposal

- `app-feature-flags` Retrieve list of available app feature flags with status
- `app-feature-flag` Retrieve an individual app feature flag with status
- `enable-app-feature-flag` Allow use of an app feature
- `disable-app-feature-flag` Disable use of an app feature