Skip to content

ITEP-70317 User provided data sanitization #476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

tybulewicz
Copy link
Contributor

User data sanitization during password reset request.

📝 Description

User provided data (in headers or payload) may contain new line characters which might be used to spoof log messages.
This PR removes such characters from user provided data before processing.

JIRA: ITEP-70317

✨ Type of Change

Select the type of change your PR introduces:

  • 🐞 Bug fix – Non-breaking change which fixes an issue
  • 🚀 New feature – Non-breaking change which adds functionality
  • 🔨 Refactor – Non-breaking change which refactors the code base
  • 💥 Breaking change – Changes that break existing functionality
  • 📚 Documentation update
  • 🔒 Security update
  • 🧪 Tests

🧪 Testing Scenarios

Describe how the changes were tested and how reviewers can test them too:

  • ✅ Tested manually
  • 🤖 Run automated end-to-end tests

✅ Checklist

Before submitting the PR, ensure the following:

  • 🔍 PR title is clear and descriptive
  • 📝 For internal contributors: If applicable, include the JIRA ticket number (e.g., ITEP-123456) in the PR title. Do not include full URLs
  • 💬 I have commented my code, especially in hard-to-understand areas
  • 📄 I have made corresponding changes to the documentation
  • ✅ I have added tests that prove my fix is effective or my feature works

Host header sanitization

Path argument sanitization

User data sanitization
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses a security issue by sanitizing user provided input to prevent log spoofing during password reset requests.

  • Introduces a helper function (_sanitize_input) to remove newline and carriage return characters
  • Applies sanitization on headers, email, and URL paths in both the password reset request and token validity endpoints
Comments suppressed due to low confidence (2)

platform/services/user_directory/app/endpoints/user_management/password_reset.py:52

  • Consider expanding the docstring for _sanitize_input to clarify that only newline and carriage return characters are removed, and outline any limitations of the sanitization approach.
def _sanitize_input(input_data: str) -> str:

platform/services/user_directory/app/endpoints/user_management/password_reset.py:56

  • Add unit tests for the _sanitize_input function to ensure it correctly handles various combinations of newline and carriage return characters, preventing future regressions.
    return input_data.replace("\n", "").replace("\r", "")

Copy link
Contributor

@AlexanderBarabanov AlexanderBarabanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Tomasz!

try:
UsersHandler.is_email_valid(user_data.email)
except InvalidEmail as msg:
logger.error(msg)
sanitized_msg = _sanitize_input(str(msg))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to sanitize the email before checking validity of this email? As I can see - only email included in this message may lead to the error described in the ticket. Thanks this we wouldn't have to have this sanitization twice in the code - once in except clause and then afterwards.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, root cause is logger.error(msg) in password_reset.py - UsersHandler.is_email_valid does not log anything but just return exception with msg in case of invalid email. If we sanitize the whole msg (that contains user supplied email) before output - i think it also should be fine.
It will also protect us from the theoretical case where returned msg will be enriched with some other data that potentially can be untrusted, so I would propose to leave whole msg sanitization here and optionally add email sanitization before is_email_valid invocation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the current state dynamically - data properly sanitized, so please do not remove existing msg sanitization but if necessary add additional email sanitization before is_email_valid invocation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants