Java: Add XSS Sanitizer for `HttpServletResponse.setContentType` with safe values #18607

owen-mc · 2025-01-28T11:54:56Z

This already existed for Jax and Spring, but not the standard library.

Pull Request checklist

All query authors

A change note is added if necessary. See the documentation in this repository.
~~All new queries have appropriate .qhelp. See the documentation in this repository.~~
QL tests are added if necessary. See Testing custom queries in the GitHub documentation.
~~New and changed queries have correct query metadata. See the documentation in this repository.~~

Internal query authors only

~~Autofixes generated based on these changes are valid, only needed if this PR makes significant changes to .ql, .qll, or .qhelp files. See the documentation (internal access required).~~
Changes are validated at scale (internal access required).
~~Adding a new query? Consider also adding the query to autofix.~~

owen-mc · 2025-02-17T14:43:01Z

I ran MRVA to see how many sources this affects. From the Java top 1000 repos, 96 had at least one source removed. In total, 623 sources were removed, with the most in any one repo being 117. I spot-checked the results and they all seemed valid.

I ran MRVA to see how many results this removes. I think I made a mistake in my query because some of the results don't match what I was expecting. From the Java top 1000 repos, 17 repos had at least one alert removed. 151 alerts were removed in total, with the most in any one repo being 64. I spot-checked the results and they mostly seemed to be valid.

owen-mc · 2025-02-17T14:53:40Z

This kind of situation seems to come up a bit:

    if (useXml) {
      response.setContentType("text/xml");
      response.setCharacterEncoding(Const.XML_ENCODING);
    }
    if (useJson) {
      response.setContentType("application/json");
      response.setCharacterEncoding(Const.XML_ENCODING);
    } else {
      response.setContentType("text/html;charset=UTF-8");
    }

    PrintWriter out = response.getWriter();

Currently the sanitizer is applied because at least one safe content type could be set. Historically CodeQL has erred on the side of trying to minimize FPs, which is why I chose this behaviour. You could argue that we should not sanitize it, so that we don't risk missing a real vulnerability. I wanted to bring this up in case anyone disagrees and thinks we should not sanitize if an unsafe content type could possibly be used.

owen-mc · 2025-02-18T10:50:20Z

I ran DCA. There were some failed workflows, which I don't really understand, but there seem to be enough results. Two alerts were lost (duplicated to 4 because of buildless), which I manually confirmed were FPs we wanted to remove. The overall performance seems to be fine. It's hard to tell what is noise - one db was significantly faster, which is possible as this PR is reducing the number of XSS sinks and if we remove all of them then we don't have to do the data flow calculation any more, which would save some time. Or it may just be noise.

Copilot

PR Overview

This PR addresses an issue with false positive alerts in the "Cross-site scripting" query for Java by ensuring that non-exploitable content types (e.g. "text/plain") are used when writing responses. Key changes include:

Adding a change note documenting the fix.
Modifying the SpringXSS test to invert the safe content type condition.
Updating the XSS test servlet to demonstrate both safe and unsafe response content types via multiple method calls.

Changes

File	Description
java/ql/src/change-notes/2025-01-28-fix-xss-content-type-safe.md	Documents the resolution of false positives by specifying safe content types.
java/ql/test/query-tests/security/CWE-079/semmle/tests/SpringXSS.java	Inverts the safeContentType condition to trigger unsanitized behavior for testing.
java/ql/test/query-tests/security/CWE-079/semmle/tests/XSS.java	Modifies the doGet method signature and adds branches for handling safe and unsafe content types.

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Tip: If you use Visual Studio Code, you can request a review from Copilot before you push from the "Source Control" tab. Learn more

java/ql/test/query-tests/security/CWE-079/semmle/tests/XSS.java

smowton · 2025-02-21T15:50:00Z

java/ql/lib/semmle/code/java/security/XSS.qll

@@ -106,6 +122,11 @@ class XssVulnerableWriterSource extends MethodCall {
  }
 }

+pragma[nomagic]


Why nomagic? Inlining this would seem like a good way to make the somewhat-expensive regex test against just those string constants passed to a relevant method, rather than every string constant in the database, which could well get into the millions for big projects?

I was copying how it is done in other places, as recently updated in #18552

@aschackmull can you comment since you're the author there? Wouldn't it be better to actually encourage restricting the strings we're considering (here, to arguments to a particular method) and then regex-match them, rather than match against every string constant value?

Inlining this would seem like a good way to make the somewhat-expensive regex test against just those string constants passed to a relevant method, rather than every string constant in the database, which could well get into the millions for big projects?

True. At least in theory. IIRC the trouble was that as long as at least one of the use-cases fails to provide a properly restrictive context then inlining/magic'ing becomes actively bad as that just increases the number of regex operations. And the optimiser is too unreliable for ensuring that every use-case has a good restrictive context that's also applied before the regex. That's why I went with the somewhat blunt solution of just evaluating the regex on all the values of all string literals (notice that the string literal column is projected away, which is crucial, as that reduces things a lot). In the cases I looked at this was a considerable performance improvement over the previous situation.

We can of course refactor things to do the minimal number of regex operations, but we shouldn't trust the optimiser too much.

smowton

OK, fine with me to copy the established pattern of checking a broad range of possible content-type strings.

owen-mc added 2 commits January 27, 2025 14:10

Make test easier to understand

05fb22e

Reformat inline expectations (space after $)

9f3572d

github-actions bot added documentation Java labels Jan 28, 2025

Fix XSS FPs when content type is safe

0ccf4ce

owen-mc force-pushed the java/xss-content-type-sanitizer branch from c14ebe8 to 5b7ebef Compare January 28, 2025 15:33

Add change note

2d76466

owen-mc force-pushed the java/xss-content-type-sanitizer branch from 5b7ebef to 2d76466 Compare January 28, 2025 15:35

owen-mc marked this pull request as ready for review February 18, 2025 10:50

Copilot AI review requested due to automatic review settings February 18, 2025 10:50

owen-mc requested a review from a team as a code owner February 18, 2025 10:50

Copilot AI reviewed Feb 18, 2025

View reviewed changes

java/ql/test/query-tests/security/CWE-079/semmle/tests/XSS.java Show resolved Hide resolved

smowton reviewed Feb 21, 2025

View reviewed changes

smowton approved these changes Feb 24, 2025

View reviewed changes

owen-mc merged commit 74a2495 into github:main Feb 24, 2025
16 checks passed

owen-mc deleted the java/xss-content-type-sanitizer branch February 24, 2025 23:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Java: Add XSS Sanitizer for `HttpServletResponse.setContentType` with safe values #18607

Java: Add XSS Sanitizer for `HttpServletResponse.setContentType` with safe values #18607

owen-mc commented Jan 28, 2025 •

edited

Loading

owen-mc commented Feb 17, 2025 •

edited

Loading

owen-mc commented Feb 17, 2025 •

edited

Loading

owen-mc commented Feb 18, 2025

Copilot AI left a comment

smowton Feb 21, 2025

owen-mc Feb 21, 2025

smowton Feb 21, 2025

aschackmull Feb 24, 2025

smowton left a comment

Java: Add XSS Sanitizer for HttpServletResponse.setContentType with safe values #18607

Java: Add XSS Sanitizer for HttpServletResponse.setContentType with safe values #18607

Conversation

owen-mc commented Jan 28, 2025 • edited Loading

Pull Request checklist

All query authors

Internal query authors only

owen-mc commented Feb 17, 2025 • edited Loading

owen-mc commented Feb 17, 2025 • edited Loading

owen-mc commented Feb 18, 2025

Copilot AI left a comment

Choose a reason for hiding this comment

PR Overview

Changes

smowton Feb 21, 2025

Choose a reason for hiding this comment

owen-mc Feb 21, 2025

Choose a reason for hiding this comment

smowton Feb 21, 2025

Choose a reason for hiding this comment

aschackmull Feb 24, 2025

Choose a reason for hiding this comment

smowton left a comment

Choose a reason for hiding this comment

Java: Add XSS Sanitizer for `HttpServletResponse.setContentType` with safe values #18607

Java: Add XSS Sanitizer for `HttpServletResponse.setContentType` with safe values #18607

owen-mc commented Jan 28, 2025 •

edited

Loading

owen-mc commented Feb 17, 2025 •

edited

Loading

owen-mc commented Feb 17, 2025 •

edited

Loading