Skip to content

Java: Add XSS Sanitizer for HttpServletResponse.setContentType with safe values #18607

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 24, 2025

Conversation

owen-mc
Copy link
Contributor

@owen-mc owen-mc commented Jan 28, 2025

This already existed for Jax and Spring, but not the standard library.

Pull Request checklist

All query authors

Internal query authors only

  • Autofixes generated based on these changes are valid, only needed if this PR makes significant changes to .ql, .qll, or .qhelp files. See the documentation (internal access required).
  • Changes are validated at scale (internal access required).
  • Adding a new query? Consider also adding the query to autofix.

@owen-mc owen-mc force-pushed the java/xss-content-type-sanitizer branch from c14ebe8 to 5b7ebef Compare January 28, 2025 15:33
@owen-mc owen-mc force-pushed the java/xss-content-type-sanitizer branch from 5b7ebef to 2d76466 Compare January 28, 2025 15:35
@owen-mc
Copy link
Contributor Author

owen-mc commented Feb 17, 2025

I ran MRVA to see how many sources this affects. From the Java top 1000 repos, 96 had at least one source removed. In total, 623 sources were removed, with the most in any one repo being 117. I spot-checked the results and they all seemed valid.

I ran MRVA to see how many results this removes. I think I made a mistake in my query because some of the results don't match what I was expecting. From the Java top 1000 repos, 17 repos had at least one alert removed. 151 alerts were removed in total, with the most in any one repo being 64. I spot-checked the results and they mostly seemed to be valid.

@owen-mc
Copy link
Contributor Author

owen-mc commented Feb 17, 2025

This kind of situation seems to come up a bit:

    if (useXml) {
      response.setContentType("text/xml");
      response.setCharacterEncoding(Const.XML_ENCODING);
    }
    if (useJson) {
      response.setContentType("application/json");
      response.setCharacterEncoding(Const.XML_ENCODING);
    } else {
      response.setContentType("text/html;charset=UTF-8");
    }

    PrintWriter out = response.getWriter();

Currently the sanitizer is applied because at least one safe content type could be set. Historically CodeQL has erred on the side of trying to minimize FPs, which is why I chose this behaviour. You could argue that we should not sanitize it, so that we don't risk missing a real vulnerability. I wanted to bring this up in case anyone disagrees and thinks we should not sanitize if an unsafe content type could possibly be used.

@owen-mc
Copy link
Contributor Author

owen-mc commented Feb 18, 2025

I ran DCA. There were some failed workflows, which I don't really understand, but there seem to be enough results. Two alerts were lost (duplicated to 4 because of buildless), which I manually confirmed were FPs we wanted to remove. The overall performance seems to be fine. It's hard to tell what is noise - one db was significantly faster, which is possible as this PR is reducing the number of XSS sinks and if we remove all of them then we don't have to do the data flow calculation any more, which would save some time. Or it may just be noise.

@owen-mc owen-mc marked this pull request as ready for review February 18, 2025 10:50
@Copilot Copilot AI review requested due to automatic review settings February 18, 2025 10:50
@owen-mc owen-mc requested a review from a team as a code owner February 18, 2025 10:50
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Overview

This PR addresses an issue with false positive alerts in the "Cross-site scripting" query for Java by ensuring that non-exploitable content types (e.g. "text/plain") are used when writing responses. Key changes include:

  • Adding a change note documenting the fix.
  • Modifying the SpringXSS test to invert the safe content type condition.
  • Updating the XSS test servlet to demonstrate both safe and unsafe response content types via multiple method calls.

Changes

File Description
java/ql/src/change-notes/2025-01-28-fix-xss-content-type-safe.md Documents the resolution of false positives by specifying safe content types.
java/ql/test/query-tests/security/CWE-079/semmle/tests/SpringXSS.java Inverts the safeContentType condition to trigger unsanitized behavior for testing.
java/ql/test/query-tests/security/CWE-079/semmle/tests/XSS.java Modifies the doGet method signature and adds branches for handling safe and unsafe content types.

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Tip: If you use Visual Studio Code, you can request a review from Copilot before you push from the "Source Control" tab. Learn more

@@ -106,6 +122,11 @@ class XssVulnerableWriterSource extends MethodCall {
}
}

pragma[nomagic]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why nomagic? Inlining this would seem like a good way to make the somewhat-expensive regex test against just those string constants passed to a relevant method, rather than every string constant in the database, which could well get into the millions for big projects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was copying how it is done in other places, as recently updated in #18552

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschackmull can you comment since you're the author there? Wouldn't it be better to actually encourage restricting the strings we're considering (here, to arguments to a particular method) and then regex-match them, rather than match against every string constant value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inlining this would seem like a good way to make the somewhat-expensive regex test against just those string constants passed to a relevant method, rather than every string constant in the database, which could well get into the millions for big projects?

True. At least in theory. IIRC the trouble was that as long as at least one of the use-cases fails to provide a properly restrictive context then inlining/magic'ing becomes actively bad as that just increases the number of regex operations. And the optimiser is too unreliable for ensuring that every use-case has a good restrictive context that's also applied before the regex. That's why I went with the somewhat blunt solution of just evaluating the regex on all the values of all string literals (notice that the string literal column is projected away, which is crucial, as that reduces things a lot). In the cases I looked at this was a considerable performance improvement over the previous situation.

We can of course refactor things to do the minimal number of regex operations, but we shouldn't trust the optimiser too much.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fine with me to copy the established pattern of checking a broad range of possible content-type strings.

@owen-mc owen-mc merged commit 74a2495 into github:main Feb 24, 2025
16 checks passed
@owen-mc owen-mc deleted the java/xss-content-type-sanitizer branch February 24, 2025 23:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants