Skip to content

Add a flag to set whether a system index is readable on SystemIndexDescriptor #17296

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

cwperks
Copy link
Member

@cwperks cwperks commented Feb 7, 2025

Description

I'm opening up this PR to start a discussion on this approach to a problem that I am seeing while trying to change over plugins to use the replacement for ThreadContext.stashContext for System index access.

This PR introduces the notion of a readable system index. readable means that GET and SEARCH operations would be permitted on the index, but write operations would be reserved for the system.

This PR is a start to some much needed code hygiene cleanup to start removing the this request accesses system indices: [{}], but in a future major version, direct access to system indices will be prevented by default which have been present since the fork.

From the core's perspective, the only behavior this PR will change is that the message above will not be logged for search operations that cover readable system indices since that would be officially supported.

When the security plugin is installed, there would be a more noticeable difference in behavior between readable system indices and non-readable ones.

  • non-readable system indices - This is the current behavior where security scrubs the results of GET and SEARCH operations on these indices. The reason for this is that the security index can contain secrets and this logic is used to scrub any results
  • readable system index - This would be new behavior that would be officially supported where a plugin dev can specify whether a system index is readable when defining system indices using SystemIndexPlugin.getSystemIndexDescriptors. A couple example use-cases:
  1. The security auditlog when using an internal opensearch index. This is an index that the security plugin writes to, but users should be permitted to query it
  2. Anomaly Detection has a notion of a custom results index where they store results information and users use these indices to make dashboards

This PR is an alternative to #15778

Related Issues

Resolves opensearch-project/security#2487

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added >upgrade Label used when upgrading library dependencies (e.g., Lucene) non-issue bugs / unexpected behaviors that end up non issues; audit trail simple changes that aren't issues v2.0.0 Version 2.0.0 labels Feb 7, 2025
@cwperks
Copy link
Member Author

cwperks commented Feb 7, 2025

@reta I wanted to run this idea by you. Opening this in draft because I haven't written tests yet, but would like to know what you think about this approach as an alternative to #15778

@cwperks cwperks removed v2.0.0 Version 2.0.0 non-issue bugs / unexpected behaviors that end up non issues; audit trail simple changes that aren't issues >upgrade Label used when upgrading library dependencies (e.g., Lucene) labels Feb 7, 2025
@github-actions github-actions bot added >upgrade Label used when upgrading library dependencies (e.g., Lucene) non-issue bugs / unexpected behaviors that end up non issues; audit trail simple changes that aren't issues v2.0.0 Version 2.0.0 labels Feb 7, 2025
Copy link
Contributor

github-actions bot commented Feb 7, 2025

✅ Gradle check result for 777fe07: SUCCESS

Copy link

codecov bot commented Feb 7, 2025

Codecov Report

Attention: Patch coverage is 74.28571% with 9 lines in your changes missing coverage. Please review.

Project coverage is 72.37%. Comparing base (5666982) to head (5a2bc65).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
.../org/opensearch/indices/SystemIndexDescriptor.java 61.53% 3 Missing and 2 partials ⚠️
.../cluster/metadata/IndexNameExpressionResolver.java 83.33% 2 Missing and 1 partial ⚠️
.../example/resthandler/ExampleRestHandlerPlugin.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17296      +/-   ##
============================================
- Coverage     72.49%   72.37%   -0.12%     
+ Complexity    65717    65669      -48     
============================================
  Files          5303     5304       +1     
  Lines        304793   305007     +214     
  Branches      44202    44241      +39     
============================================
- Hits         220966   220757     -209     
- Misses        65693    66166     +473     
+ Partials      18134    18084      -50     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Craig Perkins <[email protected]>
Copy link
Contributor

github-actions bot commented Feb 7, 2025

❌ Gradle check result for 9a54a4e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Craig Perkins <[email protected]>
@cwperks cwperks removed v2.0.0 Version 2.0.0 non-issue bugs / unexpected behaviors that end up non issues; audit trail simple changes that aren't issues >upgrade Label used when upgrading library dependencies (e.g., Lucene) labels Feb 7, 2025
@cwperks cwperks marked this pull request as ready for review February 7, 2025 17:57
Copy link
Contributor

github-actions bot commented Feb 8, 2025

❕ Gradle check result for 00ef4ba: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@reta
Copy link
Contributor

reta commented Feb 9, 2025

@reta I wanted to run this idea by you. Opening this in draft because I haven't written tests yet, but would like to know what you think about this approach as an alternative to #15778

Thanks @cwperks , I am not onboard with yet another flag:

  • it causes conflicts with Context::isSystemIndexAccessAllowed()
  • it does not rely on the permissions model that security plugin has

I think we have touched on this in the past, but we have an opportunity with 3.x release to make hard, breaking decisions regarding sealing system indices, for example:

  • forbid all access to system indices through regular indices APIs
  • introduce dedicated APIs to access system indices for read / write / ... operation

This will give us a clear security boundaries, introducing additional flags is (to me) not solving the root of the problem. I hope it makes sense, thank you

@cwperks
Copy link
Member Author

cwperks commented Feb 9, 2025

I'm a little confused because #15778 was a general attempt to define a new way to supply a list of actions and index patterns that a plugin can perform using its identity - that would be the most general approach to sealing what plugins can perform inside a pluginSubject.runAs(() -> { ... }) block that conforms to the security model.

This will give us a clear security boundaries, introducing additional flags is (to me) not solving the root of the problem.

The new flag would be a flag that the security plugin can use to differentiate system indexes that it should scrub the results from vs. allowing search and get on the index. (This is where the scrubbing logic currently exists in the security plugin).

Introducing this flag would also support all existing plugin use-cases in the default distribution, but is not as generalizable as #15778.

@reta
Copy link
Contributor

reta commented Feb 9, 2025

Introducing this flag would also support all existing plugin use-cases in the default distribution, but is not as generalizable as #15778.

The #15778 and dedicated API to access system indices are not exclusive - those are complementary. I think making the system indices access explicit through APIs, separate from all other indices and not flag(s) driven, would lead to clear and maintainable solution.

@cwperks
Copy link
Member Author

cwperks commented Feb 9, 2025

Maybe I'm just lacking in imagination, but I'm having trouble picture what those APIs may look like.

@dbwiddis
Copy link
Member

dbwiddis commented Feb 9, 2025

Maybe I'm just lacking in imagination, but I'm having trouble picture what those APIs may look like.

When I lack in imagination I ask my favorite LLM.

Question: I currently have a bunch of APIs to access indices, like "PUT foo", "GET foo", and "DELETE foo". Some indices are special (they are called System Indices) and I want them to have a different API so that I can take more actions at the REST layer, such as blocking at a gateway, prior to processing them internally. Can you suggest some alternate APIs?

It suggested:

  • PUT /system-indices/create/{index_name}
  • GET /system-indices/{index_name}
  • DELETE /system-indices/{index_name}

@cwperks
Copy link
Member Author

cwperks commented Feb 9, 2025

The mental model I have for plugins is similar to that of out-of-process extensions. In the security model for out-of-process extensions, an extension could be assigned a service account token that allows an extension to make privileged requests to OpenSearch, such as storing metadata in reserved (system) indices. I use reserved in this to indicate that an extension reserves index patterns that can be given special protections like system indices that prevent regular users from accessing the indices. By default, extensions can perform any action on their own reserved indices.

Service account tokens were also designed in a way that they could be assigned permissions directly to allow an extension to make REST Requests against a cluster "as a daemon". i.e. an enumerated list of actions an extension is allowed to perform with its identity outside of the context of any authenticated user.

In the current plugin architecture, plugins can perform any action without authz checks. With extensions, that security model was deemed too risky assuming that extensions are 3P. I'd like to introduce a similar notion where its enumerated ahead of time what a plugin will perform and present that to a cluster admin on install. It both let's us create better SDKs for plugin devs with clear guidance on how the security model works w/o having to call low level ThreadContext APIs and improves security.

@reta
Copy link
Contributor

reta commented Feb 9, 2025

Maybe I'm just lacking in imagination, but I'm having trouble picture what those APIs may look like.

Let's start from the basic blocks we have now: we have an API for plugins to contribute system indices: SystemIndexPlugin / SystemIndexDescriptor / SystemIndexRegistry. Do we have an API for plugins to access system indices? yes and no. The system indices are not first class citizen - although we do have an API to specifically ask for system index creation, we don't have any API to access them, it is all falls down to regular index APIs. Could we separate access to normal and system indices? Fe instead of dealing with tons of flags, could we only allow accessing system indices through SystemIndicesService fe?

@cwperks
Copy link
Member Author

cwperks commented Feb 10, 2025

Fe instead of dealing with tons of flags

This may be yet another flag in the IndexNameExpressionResolver, but from a system index perspective its a single flag. The system is readable or unreadable. Unreadable means that the deprecation log is still output on read requests (GET + SEARCH) or that results are scrubbed completely in a cluster running with security.

@cwperks
Copy link
Member Author

cwperks commented Feb 11, 2025

For what its worth, I think this introduces a notion that should exist in OpenSearch. An index that's write protected, but readable. The security auditlog using an internal OpenSearch index is a good example of an index that ought to have write protections to only allow writes to it from the system, but allow reads from regular users authorized to read the index.

Copy link
Contributor

❌ Gradle check result for 82ef14b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@cwperks
Copy link
Member Author

cwperks commented Feb 25, 2025

@reta I think your proposal is a good idea, but idk how to push this forward. Any suggestions would be appreciated.

I know that there are already many flags in IndexNameExpressionResolver.Context, but for the purposes of this PR its only a single flag on SystemIndexDescriptor to indicate if the index is readable or not.

Currently, security has no way of making a distinction and it has blanket logic to scrub results for queries on any index marked as a system index. For use cases like using an internal opensearch index for the auditlog, I think it makes sense to introduce the notion of a system index that only the system (or plugins) can write to, but is readable by regular users.

Edit: I can narrow this PR down to only introducing a flag on SystemIndexDescriptor, but I also wanted to showcase how it could be used in the core repo to reduce the deprecation logger statements instead of only being an extension point for plugins to hook into.

I plan to use this flag around here to not use EmptyFilterLeafReader.EmptyDirectoryReader for readable system indices.

Copy link
Contributor

❌ Gradle check result for 5a2bc65: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

✅ Gradle check result for 5a2bc65: SUCCESS

@cwperks
Copy link
Member Author

cwperks commented Feb 26, 2025

I removed the changes in IndexNameExpressionResolver and only included the change in SystemIndexDescriptor to simplify this PR.

@dbwiddis
Copy link
Member

For what its worth, I think this introduces a notion that should exist in OpenSearch. An index that's write protected, but readable.

I'm not sure there's a strong enough use case to have this. My main concerns are:

  • in order to read a read-only index, the end user needs to know its name. This adds unneeded complexity.
  • it breaks API consistency with other APIs doing similar things on more sensitive indices.

Taking Flow Framework as an example as we probably have a range of different "access" types:

  • Config index contains master keys to decrypt credentials. Highly sensitive, users should never see.
  • Template index contains (some) sensitive information, including encrypted credentials. We redact the "credentials" field with our own Get and Search APIs, and we are considering allowing redaction of more fields. In general we just don't let the user directly see this, and have our own custom API to do all the access, which is handled by the existing plugin REST API patterns.
  • Status/State index would fit your new "read only" pattern, so yeah, it's a use case for it, but I think it's cleaner for us just to have our own Get and Search APIs for this index to maintain API consistency with the template APIs, rather than saying "you can search templates with the flow framework APIs but you have to use the regular search on this system index name (that you didn't even know existed) to get the status".

Copy link
Contributor

❌ Gradle check result for 29b9a3e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@dbwiddis
Copy link
Member

  • but I think it's cleaner for us just to have our own Get and Search APIs for this index to maintain API consistency with the template APIs

Continuing on this, it's not hard to layer an external search around this, see here, and looking at that file, I see that in the case of our plugin where we've enabled multitenancy, that automatically makes it (and pretty much all our system indices) needed to be hidden from the users. So any plugin that adopts multitenancy features won't use this.

@cwperks
Copy link
Member Author

cwperks commented Feb 26, 2025

Thank you Dan, you bring up some good and fair points. The impetus for this change is to support existing plugin use cases where a plugin must be able to write to (currently) non-system indices such as the auditlog index when using an internal opensearch index for the auditlog. This is not a system index, but the security plugin needs to be able to write to it.

In the current security model, plugins use try (ThreadContext.StoredContext ctx = threadContext.stashContext) { ... } to run actions without authz checks such as the security plugin indexing a doc to the auditlog index. I created a separate PR to introduce a generic approach that resembled role definitions and prompted cluster admins at plugin installation time, but received pushback and was looking for an alternative that supports existing plugin use-cases and allows an offramp from the current model where plugins can execute transport actions w/o authz checks.

Re-reading the AD docs for custom results index it looks like they are using Roles Injection through common-utils' InjectSecurity to write to a custom results index using the same roles as the user that created a detector. I think AD can adopt the replacement described in opensearch-project/opensearch-plugins#238 which enforces strict rules on plugins where then can only execute transport actions on their own system indices, but other actions would be blocked. #15778 is an effort to declare what additional actions a plugin could perform outside of the context of the authenticated user that a plugin would be authorized to execute. The same model would not work for the auditlog index from the security plugin. That would require a change like the one introduced in this PR or #15778

@reta
Copy link
Contributor

reta commented Feb 27, 2025

@reta I think your proposal is a good idea, but idk how to push this forward. Any suggestions would be appreciated.

Thanks @cwperks , I will try to find the time for it within next couple of weeks, sorry about that

@cwperks
Copy link
Member Author

cwperks commented Feb 27, 2025

@reta I think your proposal is a good idea, but idk how to push this forward. Any suggestions would be appreciated.

Thanks @cwperks , I will try to find the time for it within next couple of weeks, sorry about that

Thank you! :) I will be out of the office the first 2 weeks of March so that's perfect.

@cwperks cwperks closed this Mar 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
non-issue bugs / unexpected behaviors that end up non issues; audit trail simple changes that aren't issues >upgrade Label used when upgrading library dependencies (e.g., Lucene) v2.0.0 Version 2.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Deprecate the usage of ThreadContext.stashContext in plugins
3 participants