Skip to content

Java: IPA the CFG (second try) #17970

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Dec 10, 2024
Merged

Conversation

owen-mc
Copy link
Contributor

@owen-mc owen-mc commented Nov 12, 2024

This converts the control flow graph to an IPA type to give us some more options in how we model control flow. There should be no change in any query results.

This resurrects this PR from almost six years ago. The CFG code hasn't changed much in the interim, thankfully.

The commit history is pretty garbled, so it's probably best not to review commit-by-commit.

@owen-mc owen-mc requested a review from a team as a code owner November 12, 2024 17:04
@owen-mc owen-mc marked this pull request as draft November 12, 2024 17:04
@github-actions github-actions bot added the Java label Nov 12, 2024
private ControlFlowNode mainBranchSucc(ControlFlowNode n, boolean b) {
result = succ(n, BooleanCompletion(_, b))
}
private Node mainBranchSucc(Node n, boolean b) { result = succ(n, BooleanCompletion(_, b)) }

Check warning

Code scanning / CodeQL

Missing QLDoc for parameter Warning

The QLDoc has no documentation for b, or n, but the QLDoc mentions booleanCompletion
@@ -1487,8 +1555,8 @@
* In the latter case, when `n` occurs as the last node in a finally block, there might be
* multiple different such successors.
*/
private ControlFlowNode otherBranchSucc(ControlFlowNode n, boolean b) {
exists(ControlFlowNode main | main = mainBranchSucc(n, b.booleanNot()) |
private Node otherBranchSucc(Node n, boolean b) {

Check warning

Code scanning / CodeQL

Missing QLDoc for parameter Warning

The QLDoc has no documentation for b, but the QLDoc mentions booleanCompletion, and normalCompletion
result = this.(Expr).getEnclosingStmt()
module ControlFlow {
private predicate hasControlFlow(Expr e) {
not e.getEnclosingStmt() instanceof ConstCase and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a line like case "a" -> null; I think this is trying to exclude the "a" but is accidentally excluding the null as well. I guess this wasn't an issue until switch expressions got added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe it's the arrow syntax that caused the problem, as extra expressions are now part of this statement.

@owen-mc owen-mc marked this pull request as ready for review November 14, 2024 13:11
@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 14, 2024

The language test failure can only be fixed with an internal PR to update a .expected file. I will prepare one when this PR has been approved.

@owen-mc owen-mc requested a review from aschackmull November 14, 2024 13:31
@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 14, 2024

DCA has finished and it shows a mild analysis time increase (1.6% median, 1.8% mean). I ran some queries on apache/flink using before and after commits and looked at the tuple counts and predicate timings and didn't see anything obviously taking much longer. I don't think it's a bad join order or anything like that. But I guess that adding another layer will make analysis take a little longer.

Comment on lines 168 to 170
*
* This is needed for the equivalence relation on basic blocks in range
* analysis.
Copy link
Contributor

@aschackmull aschackmull Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not its only use. Also, it seems reasonable to have predicate like this without such a specific-sounding constraint.

Suggested change
*
* This is needed for the equivalence relation on basic blocks in range
* analysis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to suggest a different wording?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah no, I meant to simply delete it. Edited.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you expect every control flow node to be able to supply this? Or is it optional?

Copy link
Contributor

@aschackmull aschackmull Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, there are two different potentially relevant predicates, one that provides the inverse of the getControlFlowNode on Expr and Stmt, and one that projects every Cfg node to an associated Ast node. I think I'd name the latter something with project, and let this predicate take the semantics of the former. Hence, I'd say that this is optional and that we should postfix its qldoc with the customary "if any".
Edit: Which I see it already has. LGTM then.

@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 21, 2024

@aschackmull This is ready to re-review. I have accepted all your review suggestions. The failing test is because we need an internal PR as well.

@owen-mc owen-mc added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Nov 22, 2024
@owen-mc owen-mc force-pushed the java/lightweight-IR-layer branch from 0df1427 to 42104fd Compare November 22, 2024 10:55
aschackmull
aschackmull previously approved these changes Nov 22, 2024
@owen-mc owen-mc force-pushed the java/lightweight-IR-layer branch from 42104fd to e52f52c Compare December 6, 2024 10:36
@owen-mc owen-mc requested a review from a team as a code owner December 6, 2024 10:36
@github-actions github-actions bot added the Kotlin label Dec 6, 2024
@owen-mc owen-mc force-pushed the java/lightweight-IR-layer branch from 2464d5f to b4009ac Compare December 7, 2024 13:48
aschackmull
aschackmull previously approved these changes Dec 9, 2024
owen-mc and others added 7 commits December 10, 2024 15:26
Co-authored-by: Anders Schack-Mulligen <[email protected]>
The change in results shows that there are now fewer control flow nodes.
We have removed precisely those with no successor or predecessor.
@owen-mc owen-mc force-pushed the java/lightweight-IR-layer branch from 48b2fd2 to ce97813 Compare December 10, 2024 15:26
@owen-mc owen-mc force-pushed the java/lightweight-IR-layer branch from ce97813 to 5b57511 Compare December 10, 2024 15:56
Copy link
Contributor

@egregius313 egregius313 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving

@owen-mc owen-mc merged commit ba9d21e into github:main Dec 10, 2024
15 of 16 checks passed
@owen-mc owen-mc deleted the java/lightweight-IR-layer branch December 10, 2024 23:57
@aschackmull aschackmull mentioned this pull request Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
depends on internal PR This PR should only be merged in sync with an internal Semmle PR documentation Java Kotlin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants