ITEP-70317 User provided data sanitization #476

tybulewicz · 2025-06-18T10:14:26Z

User data sanitization during password reset request.

📝 Description

User provided data (in headers or payload) may contain new line characters which might be used to spoof log messages.
This PR removes such characters from user provided data before processing.

JIRA: ITEP-70317

✨ Type of Change

Select the type of change your PR introduces:

🐞 Bug fix – Non-breaking change which fixes an issue
🚀 New feature – Non-breaking change which adds functionality
🔨 Refactor – Non-breaking change which refactors the code base
💥 Breaking change – Changes that break existing functionality
📚 Documentation update
🔒 Security update
🧪 Tests

🧪 Testing Scenarios

Describe how the changes were tested and how reviewers can test them too:

✅ Tested manually
🤖 Run automated end-to-end tests

✅ Checklist

Before submitting the PR, ensure the following:

🔍 PR title is clear and descriptive
📝 For internal contributors: If applicable, include the JIRA ticket number (e.g., ITEP-123456) in the PR title. Do not include full URLs
💬 I have commented my code, especially in hard-to-understand areas
📄 I have made corresponding changes to the documentation
✅ I have added tests that prove my fix is effective or my feature works

Host header sanitization Path argument sanitization User data sanitization

Copilot

Pull Request Overview

This PR addresses a security issue by sanitizing user provided input to prevent log spoofing during password reset requests.

Introduces a helper function (_sanitize_input) to remove newline and carriage return characters
Applies sanitization on headers, email, and URL paths in both the password reset request and token validity endpoints

Comments suppressed due to low confidence (2)

platform/services/user_directory/app/endpoints/user_management/password_reset.py:52

Consider expanding the docstring for _sanitize_input to clarify that only newline and carriage return characters are removed, and outline any limitations of the sanitization approach.

def _sanitize_input(input_data: str) -> str:

platform/services/user_directory/app/endpoints/user_management/password_reset.py:56

Add unit tests for the _sanitize_input function to ensure it correctly handles various combinations of newline and carriage return characters, preventing future regressions.

    return input_data.replace("\n", "").replace("\r", "")

AlexanderBarabanov

Thank you, Tomasz!

jcchr · 2025-06-18T11:04:04Z

platform/services/user_directory/app/endpoints/user_management/password_reset.py

    try:
        UsersHandler.is_email_valid(user_data.email)
    except InvalidEmail as msg:
-        logger.error(msg)
+        sanitized_msg = _sanitize_input(str(msg))


Wouldn't it be better to sanitize the email before checking validity of this email? As I can see - only email included in this message may lead to the error described in the ticket. Thanks this we wouldn't have to have this sanitization twice in the code - once in except clause and then afterwards.

As far as I understand, root cause is logger.error(msg) in password_reset.py - UsersHandler.is_email_valid does not log anything but just return exception with msg in case of invalid email. If we sanitize the whole msg (that contains user supplied email) before output - i think it also should be fine.
It will also protect us from the theoretical case where returned msg will be enriched with some other data that potentially can be untrusted, so I would propose to leave whole msg sanitization here and optionally add email sanitization before is_email_valid invocation.

I tested the current state dynamically - data properly sanitized, so please do not remove existing msg sanitization but if necessary add additional email sanitization before is_email_valid invocation.

ITEP-70317 User provided data sanitization

cb899de

Host header sanitization Path argument sanitization User data sanitization

tybulewicz requested review from jcchr and piotrgrubicki as code owners June 18, 2025 10:14

tybulewicz requested a review from Copilot June 18, 2025 10:14

Copilot AI reviewed Jun 18, 2025

View reviewed changes

github-actions bot added the Platform label Jun 18, 2025

tybulewicz requested a review from AlexanderBarabanov June 18, 2025 10:15

AlexanderBarabanov previously approved these changes Jun 18, 2025

View reviewed changes

jcchr requested changes Jun 18, 2025

View reviewed changes

tybulewicz added 2 commits June 18, 2025 13:47

Sanitize email at the beginning

8e9cab1

Restore log message sanitization

6b322ff

tybulewicz dismissed AlexanderBarabanov’s stale review via 6b322ff June 18, 2025 13:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ITEP-70317 User provided data sanitization #476

ITEP-70317 User provided data sanitization #476

Uh oh!

tybulewicz commented Jun 18, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

AlexanderBarabanov left a comment

Uh oh!

jcchr Jun 18, 2025

Uh oh!

AlexanderBarabanov Jun 18, 2025

Uh oh!

AlexanderBarabanov Jun 18, 2025

Uh oh!

Uh oh!

ITEP-70317 User provided data sanitization #476

Are you sure you want to change the base?

ITEP-70317 User provided data sanitization #476

Uh oh!

Conversation

tybulewicz commented Jun 18, 2025

📝 Description

✨ Type of Change

🧪 Testing Scenarios

✅ Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

AlexanderBarabanov left a comment

Choose a reason for hiding this comment

Uh oh!

jcchr Jun 18, 2025

Choose a reason for hiding this comment

Uh oh!

AlexanderBarabanov Jun 18, 2025

Choose a reason for hiding this comment

Uh oh!

AlexanderBarabanov Jun 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!