-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Java: Add XSS Sanitizer for HttpServletResponse.setContentType
with safe values
#18607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c14ebe8
to
5b7ebef
Compare
5b7ebef
to
2d76466
Compare
I ran MRVA to see how many sources this affects. From the Java top 1000 repos, 96 had at least one source removed. In total, 623 sources were removed, with the most in any one repo being 117. I spot-checked the results and they all seemed valid. I ran MRVA to see how many results this removes. I think I made a mistake in my query because some of the results don't match what I was expecting. From the Java top 1000 repos, 17 repos had at least one alert removed. 151 alerts were removed in total, with the most in any one repo being 64. I spot-checked the results and they mostly seemed to be valid. |
This kind of situation seems to come up a bit:
Currently the sanitizer is applied because at least one safe content type could be set. Historically CodeQL has erred on the side of trying to minimize FPs, which is why I chose this behaviour. You could argue that we should not sanitize it, so that we don't risk missing a real vulnerability. I wanted to bring this up in case anyone disagrees and thinks we should not sanitize if an unsafe content type could possibly be used. |
I ran DCA. There were some failed workflows, which I don't really understand, but there seem to be enough results. Two alerts were lost (duplicated to 4 because of buildless), which I manually confirmed were FPs we wanted to remove. The overall performance seems to be fine. It's hard to tell what is noise - one db was significantly faster, which is possible as this PR is reducing the number of XSS sinks and if we remove all of them then we don't have to do the data flow calculation any more, which would save some time. Or it may just be noise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR addresses an issue with false positive alerts in the "Cross-site scripting" query for Java by ensuring that non-exploitable content types (e.g. "text/plain") are used when writing responses. Key changes include:
- Adding a change note documenting the fix.
- Modifying the SpringXSS test to invert the safe content type condition.
- Updating the XSS test servlet to demonstrate both safe and unsafe response content types via multiple method calls.
Changes
File | Description |
---|---|
java/ql/src/change-notes/2025-01-28-fix-xss-content-type-safe.md | Documents the resolution of false positives by specifying safe content types. |
java/ql/test/query-tests/security/CWE-079/semmle/tests/SpringXSS.java | Inverts the safeContentType condition to trigger unsanitized behavior for testing. |
java/ql/test/query-tests/security/CWE-079/semmle/tests/XSS.java | Modifies the doGet method signature and adds branches for handling safe and unsafe content types. |
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Tip: If you use Visual Studio Code, you can request a review from Copilot before you push from the "Source Control" tab. Learn more
@@ -106,6 +122,11 @@ class XssVulnerableWriterSource extends MethodCall { | |||
} | |||
} | |||
|
|||
pragma[nomagic] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why nomagic? Inlining this would seem like a good way to make the somewhat-expensive regex test against just those string constants passed to a relevant method, rather than every string constant in the database, which could well get into the millions for big projects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was copying how it is done in other places, as recently updated in #18552
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aschackmull can you comment since you're the author there? Wouldn't it be better to actually encourage restricting the strings we're considering (here, to arguments to a particular method) and then regex-match them, rather than match against every string constant value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inlining this would seem like a good way to make the somewhat-expensive regex test against just those string constants passed to a relevant method, rather than every string constant in the database, which could well get into the millions for big projects?
True. At least in theory. IIRC the trouble was that as long as at least one of the use-cases fails to provide a properly restrictive context then inlining/magic'ing becomes actively bad as that just increases the number of regex operations. And the optimiser is too unreliable for ensuring that every use-case has a good restrictive context that's also applied before the regex. That's why I went with the somewhat blunt solution of just evaluating the regex on all the values of all string literals (notice that the string literal column is projected away, which is crucial, as that reduces things a lot). In the cases I looked at this was a considerable performance improvement over the previous situation.
We can of course refactor things to do the minimal number of regex operations, but we shouldn't trust the optimiser too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fine with me to copy the established pattern of checking a broad range of possible content-type strings.
This already existed for Jax and Spring, but not the standard library.
Pull Request checklist
All query authors
All new queries have appropriate.qhelp
. See the documentation in this repository.New and changed queries have correct query metadata. See the documentation in this repository.Internal query authors only
Autofixes generated based on these changes are valid, only needed if this PR makes significant changes to.ql
,.qll
, or.qhelp
files. See the documentation (internal access required).Adding a new query? Consider also adding the query to autofix.