Finite domain #849

elazarg · 2025-03-16T18:15:41Z

Move most of the arithmetic functions to finite_domain.cpp
operator-= is now havoc
operator+= is now add_constraint
Bail out early when registers are unusable for arithmetic operations, and havoc dst

Replaces #678

Summary by CodeRabbit

New Features
- Introduced a FiniteDomain class for enhanced handling of numerical constraints, interval evaluations, and safe arithmetic/bitwise operations.
Refactor
- Streamlined internal operations by using clearer, explicit method names such as add_constraint and havoc, consolidating state management for more consistent behavior.
Tests
- Updated test cases to clarify expected types and conditions, ensuring thorough verification of functionality.

Signed-off-by: Elazar Gershuni <[email protected]>

Also do away with operator-= (now havoc) and operator+= (now add_constraint) Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai · 2025-03-16T18:15:49Z

Walkthrough

This update applies widespread refactoring across the codebase to standardize constraint and variable handling. Operator overloads (e.g., operator-= and operator+=) have been replaced with explicit methods (havoc and add_constraint). Methods throughout various modules now use interval evaluation (via eval_interval) and improved assignment overloads. A new finite domain analysis class is introduced with comprehensive arithmetic, bitwise, and constraint operations. Test source lists and utility functions have also been updated.

Changes

File(s)	Change Summary
`src/crab/add_bottom.hpp`, `src/crab/split_dbm.cpp`, `src/crab/split_dbm.hpp`, `src/crab/ebpf_domain.cpp`, `src/crab/ebpf_domain.hpp`, `src/crab/type_domain.cpp`	Renamed operator overloads (`operator-=` → `havoc`, `operator+=` → `add_constraint`), updated type alias in `AddBottom`, and shifted to using `eval_interval` for interval retrieval.
`src/crab/array_domain.cpp`	Replaced operator-based addition with explicit `add_constraint` and `havoc` calls and updated interval evaluation in numerical operations.
`src/crab/ebpf_checker.cpp`, `src/crab/ebpf_transformer.cpp`	Streamlined arithmetic and bitwise operations by removing redundant private methods; updated constraint handling and validated invariants with revised parameter signatures.
`src/crab/finite_domain.cpp`, `src/crab/finite_domain.hpp`	Introduced the new `FiniteDomain` class with a comprehensive API for constraint assumptions, interval management, arithmetic/bitwise operations, assignment, and overflow handling.
`CMakeLists.txt`	Adjusted the test configuration by re-including `test_sign_extension.cpp` and adding `ebpf_yaml.cpp` to the build sources.
`src/asm_ostream.cpp`	Reintroduced the `operator<<` and `to_string` methods for `number_t` to restore stream output functionality.
`src/crab_utils/num_safety.hpp`	Added two utility functions, `keep_signed` and `keep_unsigned`, to preserve the signedness of integral types.

Possibly related PRs

Restrict udiv optimization to case where both are singletons #737: Directly related to the renaming of methods to havoc and add_constraint, addressing similar functionalities.
Fix overflow in conversion from uvalue to svalue #769: Involves the use of the havoc method for handling variable removal and the add_constraint method for adding constraints, similar to this PR.
Move checker and transformer to different cpp files #788: Reflects consistent updates in method naming across different classes, particularly with havoc and add_constraint.

Tip

⚡🧪 Multi-step agentic review comment chat (experimental)

We're introducing multi-step agentic chat in review comments. This experimental feature enhances review discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments.
- To enable this feature, set early_access to true under in the settings.

✨ Finishing Touches

📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 16

🔭 Outside diff range comments (1)

src/crab/ebpf_domain.cpp (1)

116-125: 🧹 Nitpick (assertive)

Imposing per-register bounds is reasonable.
These constraints ensure correct integer and offset ranges.

Could place them in a dedicated helper if repeated often.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a50e200 and e12cd81.

📒 Files selected for processing (11)

src/crab/add_bottom.hpp (6 hunks)
src/crab/array_domain.cpp (3 hunks)
src/crab/ebpf_checker.cpp (4 hunks)
src/crab/ebpf_domain.cpp (8 hunks)
src/crab/ebpf_domain.hpp (1 hunks)
src/crab/ebpf_transformer.cpp (36 hunks)
src/crab/finite_domain.cpp (1 hunks)
src/crab/finite_domain.hpp (1 hunks)
src/crab/split_dbm.cpp (7 hunks)
src/crab/split_dbm.hpp (1 hunks)
src/crab/type_domain.cpp (3 hunks)

🧰 Additional context used

🧠 Learnings (3)

src/crab/ebpf_checker.cpp (2)

Learnt from: elazarg
PR: vbpf/ebpf-verifier#692
File: src/crab/ebpf_domain.cpp:1366-1367
Timestamp: 2025-03-15T16:07:53.095Z
Learning: The project is compiled with C++20.

Learnt from: elazarg
PR: vbpf/ebpf-verifier#692
File: src/crab/ebpf_domain.cpp:641-642
Timestamp: 2025-03-15T16:07:51.039Z
Learning: The project uses the C++20 standard and supports C++20 features.

src/crab/ebpf_domain.cpp (1)

Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/crab/ebpf_domain.cpp:2924-2924
Timestamp: 2025-03-15T16:07:53.095Z
Learning: The constraint `inv += EBPF_TOTAL_STACK_SIZE <= r10.svalue;` in `ebpf_domain_t::setup_entry` at line 2924 in `src/crab/ebpf_domain.cpp` is pre-existing and may need correction in a future PR.

src/crab/ebpf_domain.hpp (2)

Learnt from: elazarg
PR: vbpf/ebpf-verifier#692
File: src/crab/ebpf_domain.cpp:1366-1367
Timestamp: 2025-03-15T16:07:53.095Z
Learning: The project is compiled with C++20.

Learnt from: elazarg
PR: vbpf/ebpf-verifier#692
File: src/crab/ebpf_domain.cpp:641-642
Timestamp: 2025-03-15T16:07:51.039Z
Learning: The project uses the C++20 standard and supports C++20 features.

⏰ Context from checks skipped due to timeout of 90000ms (9)

GitHub Check: build_windows (Release, tests)
GitHub Check: build_windows (Debug, library)
GitHub Check: build_ubuntu (Release, tests)
GitHub Check: build_windows (Release, library)
GitHub Check: build_windows (Debug, tests)
GitHub Check: build_ubuntu (Debug, tests)
GitHub Check: build_ubuntu (Release)
GitHub Check: build_ubuntu (Debug)
GitHub Check: Analyze (cpp)

🔇 Additional comments (73)

src/crab/split_dbm.hpp (2)

237-238: Consistency improvement in assign method

Changed from using -= operator to explicit havoc call, which better indicates the intent to clear/reset the variable.

241-241: New method declaration replaces operator overload

Added declaration for havoc method to replace the removed operator-=. This makes the API more explicit about what the operation actually does.

src/crab/split_dbm.cpp (7)

749-756: Implementation of havoc method

The havoc method replaces operator-=, maintaining the same functionality - removing a variable from the vertex map and updating related data structures. The implementation correctly handles the case when the variable does not exist.

834-838: Replaced operator-= with havoc in assign method

Updated to use the new havoc method instead of the removed operator, maintaining consistent behavior.

844-848: Replaced operator-= with havoc

Similar change to line 834, consistently using the new method name.

872-873: Replaced operator-= with havoc

Consistent replacement in the code path handling expression overflow.

908-908: Replaced operator-= with havoc

Consistent replacement when clearing the old vertex.

967-967: Replaced operator-= with havoc in set method

Consistent replacement in the set method.

1119-1119: Replaced operator-= with havoc in forget method

Consistent replacement in the forget method, showing a thorough refactoring throughout the class.

src/crab/ebpf_domain.hpp (1)

77-78: Renamed operator methods to more descriptive names

Replaced operator overloads with explicit method names:

operator+= → add_constraint

operator-= → havoc

This makes the API more explicit and self-documenting, clearly indicating the purpose of each method.

src/crab/type_domain.cpp (3)

214-214: Updated havoc_type to use new havoc method

Changed from using the -= operator to calling the explicit havoc method.

225-225: Replaced direct indexing with eval_interval

Changed from using bracket operator inv[reg_pack(r).type] to the more explicit inv.eval_interval(reg_pack(r).type) method, maintaining consistency with other related changes.

234-234: Replaced direct indexing with eval_interval

Same change as in the get_type method, using eval_interval instead of direct indexing. Added const qualifier to the local variable.

src/crab/array_domain.cpp (6)

31-32: Explicit boundary constraints look correct.

545-545: Havocing the scalar is consistent.
Ensures no stale constraints remain after cells overlap.

549-549: Maintaining parity between signed and unsigned scalars is appropriate.
Removing constraints on one should also clear the other when they represent the same cell.

551-551: Clearing both signed and unsigned variants is consistent with previous logic.

814-814: Index is treated as a singleton.
Values outside this scenario continue to higher-level checks.

820-820: Same singleton logic applies to width.
Consistent approach for storing numeric ranges in the domain.

src/crab/ebpf_domain.cpp (13)

43-43: Adding stack's set to the final string invariant is fine.

133-135: Loop counter constraints appear correct.
They help enforce bounded iteration.

156-156: Delegating constraint addition is clean.
No issues found.

158-158: Delegating havoc is fine.
Matches other domain-level havoc usage.

164-164: Evaluating the interval before range checks is logical.

279-279: Aggregating upper bounds for loop counters is valid.

284-284: Directly retrieving r0's interval is straightforward.

298-299: Havocing packet_size and meta_offset on init is fine.
No suggestions.

301-302: Applying non-negative and upper-limit constraints is correct.

305-306: These constraints handle meta_offset effectively.

319-319: This manual constraint addition is consistent.

333-334: Ensuring r10.svalue is at least total stack size and below PTR_MAX is needed.

347-348: Bounds for r1.svalue look correct.

src/crab/ebpf_checker.cpp (2)

100-100: Using add_constraint for assumptions is consistent with domain operations.

309-313: Constructing strings to detail numeric bounds is a solid diagnostic.

src/crab/ebpf_transformer.cpp (6)

84-88: Introducing helper methods (add, shl, lshr, ashr, sign_extend) clarifies ALU operations.
Renames are consistent with the rest of the domain logic.

107-113: Havocing all offset fields keeps pointer state consistent.

138-139: Storing non-top callee-saved register values into stack frame variables is correct.

149-150: Conditional assignment avoids top states.
This approach is valid for preserving register content.

152-152: Havocing the destination register ensures safe unknown states.

154-154: Clearing the stack frame copy is correct after usage.
src/crab/add_bottom.hpp (8)

7-7: Include of the new FiniteDomain header.
This line brings in the new domain representation, replacing the old SplitDBM usage. Make sure all references and method calls to the domain API are updated accordingly throughout the file.

25-25: Implicit conversion to bool.
Exposing an explicit operator bool() helps seamlessly use AddBottom objects in condition checks. This is fine but ensure it’s clearly understood to mean "non-bottom."

115-115: Meet operation on optional domain.
Capturing the result of dom->meet(*o.dom) into res is clear and ensures the domain remains bottom if no meet is possible. This is consistent with the optional pattern.

141-143: havoc introduced to clear variable constraints.
Forwarding to dom->havoc(v); is aligned with the new naming approach. No issues spotted.

147-147: Adding constraint may invalidate domain.
If dom->add_constraint(cst) fails, this code sets dom = {} (bottom), aligning well with the domain-lattice approach.

155-155: Interval evaluation pass-through.
Returns bottom if the domain is empty or the result of dom->eval_interval(e) otherwise. This is consistent with the typical domain layering pattern.

187-189: Forwarding apply calls.
Wrapping dom->apply(...) inside a check for bottom domain is consistent. No error is raised if the domain transitions to bottom mid-operation, so confirm external usage expects that possibility.

12-12:
❓ Verification inconclusive

Switching alias to FiniteDomain.
The alias using T = FiniteDomain; reflects the domain transition. Check external references that might still assume the old type to prevent mismatched expectations about domain operations.

Please verify that no external references incorrectly assume a SplitDBM object in the rest of the codebase:

🏁 Script executed:
#!/bin/bash
rg 'SplitDBM' # Searching for leftover references to SplitDBM
Length of output: 8899
Alias update verification: Confirm external dependencies
The alias using T = FiniteDomain; in src/crab/add_bottom.hpp now reflects the intended domain transition. Note that the codebase still contains multiple references to SplitDBM (e.g., within src/crab/finite_domain.hpp and throughout src/crab/split_dbm.{hpp,cpp}). These appear to be part of FiniteDomain’s implementation (via conversion and internal delegation) rather than direct external usage. Please review these references to ensure that no external client erroneously assumes T is a SplitDBM—the existing integrations should remain valid if they rely on FiniteDomain’s publicly exposed behavior.
src/crab/finite_domain.hpp (11)

19-20: Domain encapsulation with SplitDBM.
This class internally manages a SplitDBM dom. It's a direct composition approach. Ensure that the SplitDBM usage does not leak outside FiniteDomain.

36-39: is_top() detection.
Makes sense to rely on dom.is_top(). Ensure any specialized top-check logic (e.g., partial vs. full top states) is still accurate under SplitDBM.

66-72: meet returns optional.
A domain meet returning an optional is clear: no std::nullopt means bottom. This is consistent with the rest of the code’s optional pattern. No immediate concerns.

79-80: Interval evaluation methods.
Provides both variable-based and expression-based interval evaluation. This fits typical domain usage. Confirm that SplitDBM::eval_interval is stable for large expressions.

87-95: Arithmetic and bitwise apply methods.
These bridging methods unify arith_binop_t, bitwise_binop_t, and a general binop_t. The design is flexible. Just watch for potential overhead caused by std::visit in tight loops.

95-97: Overflow bounds handling.
overflow_bounds(variable_t lhs, ...) is declared but not defined here. The call sites presumably exist in the .cpp. Keep an eye on cross-file consistency.

134-135: add_constraint.
Returning a bool from dom.add_constraint while ignoring it here might cause missed domain transitions to bottom. Confirm consistent usage in higher-level code if the constraint fails.

138-138: havoc pass-through.
This is the top-level “forget everything” call for a variable. No issues found.

152-156: entail method.
This is the logical opposite of intersect, effectively checking if the domain implies a constraint. The function returns true for a bottom domain. That’s consistent with typical domain-lattice definitions but confirm that external usage expects that bottom trivially entails everything.

160-163: to_set() for display or debugging.
The method returns a string_invariant. This is a helpful debugging output. No direct functional concerns.

168-194: Private assumption helpers.
These specialized methods break down intervals for 32-bit vs. 64-bit, negative vs. nonnegative, etc. The logic is complex but self-contained. Ensure rigorous unit tests for edge cases (e.g., boundary of 32/64 bits).

src/crab/finite_domain.cpp (14)

49-60: assume_signed_64bit_eq.
The function splits logic depending on negativity and singletons. This approach is consistent—just confirm it covers corner integer values near boundaries (e.g., large 64-bit constants).

112-156: get_signed_intervals approach.
Extracts intervals for 64 vs. 32 bits and splits into positive and negative ranges. This is complicated. The fallback to left_interval = eval_interval(left_uvalue) if the domain is top for signed checks is subtle. Carefully confirm correctness for borderline negative intervals.

202-223: assume_signed_64bit_lt and related functions.
The code merges negative and positive intervals to decide if the entire domain is negative or nonnegative, then picks constraints accordingly. This is correct if the domain truly is a union of negative and positive intervals. Additional tests recommended.

292-324: assume_signed_32bit_gt.
Again, partial splitting logic for positivity and negativity. Carefully keep track of shift from 64→32 bits. Potential risk if domain intervals are large or indefinite.

392-440: assume_unsigned_64bit_lt.
Tries to handle partial overlaps with a 64-bit truncated approach. Keep an eye out for negative or overflow conditions that might throw off comparisons in boundary scenarios.

442-468: assume_unsigned_32bit_lt.
The method lumps logic around interval_t::nonnegative(32) and interval_t::negative(32) to simultaneously deduce constraints. Code is logically consistent but thoroughly test.

502-522: assume_unsigned_32bit_gt.
As before, rely on consistent interval representation. The fallback to empty or top intervals must be tested.

601-676: assume_cst_imm and assume_cst_reg.
Mapping each Condition::Op to either signed or unsigned assumption routines is well-structured. The fallback throw std::exception() is a correct last resort.

678-681: assign methods.
Straight pass-through to dom.assign(...) or dom.set(...). Looks correct.

711-746: overflow_bounds.
Adjusts intervals to reflect truncated results. havoc(lhs) if the truncated range wraps. This approach discards relationships but is simpler.

748-754: overflow_signed / overflow_unsigned.
Delegates to overflow_bounds with a boolean sign parameter. Looks straightforward.

756-794: apply_signed / apply_unsigned.
Applies operation to either the signed or unsigned variable, then assigns the other variable to share results. This indicates the domain tracks both a signed and unsigned perspective. Good approach to sync states.

809-826: add_overflow / sub_overflow.
Selects lhss or lhsu based on which is not top to preserve precision. Very sensible approach. Watch for potential corner case if both are top.

828-889: Negation, multiplication, division, remainder, and bitwise ops
All follow a similar pattern. The code is consistent, with a fallback to 64-bit finite width in some places for forced signed/unsigned alignment. Good approach for domain uniformity.

src/crab/add_bottom.hpp

src/crab/finite_domain.hpp

src/crab/finite_domain.cpp

coveralls · 2025-03-16T19:01:32Z

coverage: 88.194% (-0.5%) from 88.66%
when pulling c4e08ae on prepare-finite-domain
into ef03f28 on main.

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 2

🔭 Outside diff range comments (2)

src/crab/finite_domain.hpp (1)

1-250: 🧹 Nitpick (assertive)

Add comprehensive documentation or usage examples.
This new FiniteDomain class merges numeric and pointer analysis. Consider adding function-level docstrings or usage notes to better clarify constraints handling, especially for newly introduced methods.

src/crab/finite_domain.cpp (1)

1-973: 🧹 Nitpick (assertive)

Consider modularizing or adding more detailed comments.
This new file implements a variety of assumption and operation methods, which might become hard to maintain. Splitting non-trivial logic or documenting each method in detail could help future contributors.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e12cd81 and b01477a.

📒 Files selected for processing (3)

src/crab/ebpf_transformer.cpp (33 hunks)
src/crab/finite_domain.cpp (1 hunks)
src/crab/finite_domain.hpp (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (9)

GitHub Check: build_windows (Release, tests)
GitHub Check: build_windows (Release, library)
GitHub Check: build_windows (Debug, tests)
GitHub Check: build_windows (Debug, library)
GitHub Check: build_ubuntu (Release, tests)
GitHub Check: build_ubuntu (Debug, tests)
GitHub Check: Analyze (cpp)
GitHub Check: build_ubuntu (Release)
GitHub Check: build_ubuntu (Debug)

🔇 Additional comments (16)

src/crab/ebpf_transformer.cpp (16)

120-121: Logic is consistent for clearing register values.
The use of havoc ensures that previous constraints no longer apply, which aligns with the intention to reset the domain.

138-139: Selective assignment prevents redundant updates.
This condition ensures that only non-top intervals are assigned, which is efficient and avoids unnecessary constraints.

150-150: Assigning to restore the register state.
Aligns with the design of restoring callee-saved registers from the stored stack-frame variable.

152-154: Resetting register and source variable.
The approach ensures all knowledge about the register and its old state is discarded if the interval is top.

177-179: Clearing packet-related domains.
It's proper to havoc these variables when forgetting packet pointers, ensuring no leftover constraints.

238-238: Adding constraints from assumption.
Each derived constraint is added to the domain for refined analysis.

245-246: Pointer offset constraints for pointer comparison.
This ensures correct relational constraints are enforced between pointer offsets.

259-260: Accumulating constraints in a loop.
Approach is correct for applying each derived constraint.

830-830: Resetting stack numeric size for R0.
This ensures type or numeric constraints for R0's stack size are cleared when calling a helper.

842-842: Setting shared_offset to zero on map lookup.
Aligns with the concept that a newly fetched pointer starts at offset 0.

848-849: Assigning shared_offset = 0 and T_SHARED type.
Consistent with the pattern for returned map pointers.

851-852: Havoc R0's svalue/uvalue in non-map-lookup calls.
This clarifies that returning from a non-map call leaves R0 unconstrained.

934-935: Havoc register svalue and uvalue to prepare pointer constraints.
This ensures no old numeric constraints remain when designating a pointer.

937-937: Imposing non-negative constraint on pointer.
Consistent with eBPF pointer representation for the maybe_null case.

939-939: Forbidding zero pointer.
Ensures the pointer is strictly positive.

941-942: Constraining pointer within PTR_MAX and syncing uvalue.
This bounding is essential for pointer validity and ensures consistent svalue/uvalue.

expect-failure tests (#679) stop failing Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 8

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b01477a and 0dedc24.

📒 Files selected for processing (4)

src/crab/ebpf_transformer.cpp (32 hunks)
src/crab/finite_domain.cpp (1 hunks)
src/crab/finite_domain.hpp (1 hunks)
src/test/test_verify.cpp (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (9)

GitHub Check: build_windows (Debug, library)
GitHub Check: build_windows (Release, library)
GitHub Check: build_windows (Debug, tests)
GitHub Check: build_ubuntu (Release, tests)
GitHub Check: build_windows (Release, tests)
GitHub Check: build_ubuntu (Debug, tests)
GitHub Check: Analyze (cpp)
GitHub Check: build_ubuntu (Release)
GitHub Check: build_ubuntu (Debug)

🔇 Additional comments (12)

src/test/test_verify.cpp (1)

600-606: Recheck test expectations vs. #679 resolution.

These tests have been switched from fail-oriented macros to legacy or regular test macros. If #679 is not fully resolved, some tests might incorrectly pass. Confirm that the updated macros match the final intended behavior.

src/crab/ebpf_transformer.cpp (2)

107-113: Consider type domain havoc.

When calling inv.havoc(r.ctx_offset), etc., the code resets numeric information but leaves type domain data unaffected. Confirm that type tracking remains consistent for registers whose offsets are now unknown.

149-154: Double-check post-restore havoc.

After conditionally reassigning registers, the code calls m_inv.havoc(variable_t::reg(kind, r)) and also m_inv.havoc(src_var). Confirm it does not discard constraints unnecessarily in scenarios where partial knowledge could be carried over.

src/crab/finite_domain.hpp (1)

139-140: Validate constraint addition consistency.

bool add_constraint(const linear_constraint_t& cst) returns a boolean, presumably indicating if the state changed. Check if callers handle or ignore the boolean, especially if ignoring it leads to hidden unsatisfiable states.

src/crab/finite_domain.cpp (8)

19-47: Be cautious with large or partial intervals.
This function flags constraints primarily based on singletons. Large intervals or partial overlap cases may lead to incorrect outcomes.

49-60: No technical concerns identified.
The constraints for 64-bit equality appear consistent and minimal.

62-110: Verify wrap-around near 32-bit boundaries.
Double-check sign extension and boundary manipulations to avoid off-by-one or overflow.

260-289: Check boundary conditions for signed 64-bit 'greater than'.
Large intervals might cause edge-case misinterpretations when intervals approach INT64_MAX or cross zero.

326-390: Complex interval comparison logic.
Ensure test cases cover eq, lt, and gt paths thoroughly, especially around boundary crossings.

470-499: Check large unsigned boundary conditions.
Watch for potential wrap-around near UINT64_MAX. Additional test coverage recommended.

524-599: Verify NE to GT/LT transitions.
When rewriting NE constraints to GT or LT, confirm correctness for intervals straddling boundary values.

789-801: Merging svalue and uvalue may reduce precision.
Confirm that signed and unsigned ranges remain accurate once they converge.

src/crab/ebpf_transformer.cpp

src/crab/finite_domain.hpp

src/crab/finite_domain.cpp

results of some tests in movsx.yaml and sext.yaml are modified Signed-off-by: Elazar Gershuni <[email protected]>

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

src/crab/finite_domain.cpp (1)
43-43: 🧹 Nitpick (assertive)

Use a more descriptive exception
Throwing std::exception() without context makes debugging difficult. Consider a custom exception with explanatory text.
-    default: throw std::exception();
+    default: throw std::runtime_error("Unknown Condition::Op in assume_bit_cst_interval");

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b33754 and 89b36fe.

📒 Files selected for processing (7)

CMakeLists.txt (1 hunks)
src/asm_ostream.cpp (1 hunks)
src/crab/ebpf_transformer.cpp (33 hunks)
src/crab/finite_domain.cpp (1 hunks)
src/crab/finite_domain.hpp (1 hunks)
src/crab/split_dbm.cpp (7 hunks)
src/crab_utils/num_safety.hpp (1 hunks)

🧰 Additional context used

🧬 Code Definitions (3)

src/crab/ebpf_transformer.cpp (4)

src/crab/ebpf_domain.hpp (1) (1)

cst (77-77)

src/crab/finite_domain.hpp (24) (24)

op (87-87)

op (88-88)

op (89-89)

op (90-90)

op (91-91)

op (92-92)

op (93-93)

op (93-93)

op (98-99)

op (100-100)

op (101-102)

op (103-103)

op (180-181)

op (211-214)

op (234-237)

op (245-246)

cst (139-139)

cst (139-139)

cst (153-153)

dom (32-32)

v (79-79)

v (79-79)

v (144-144)

v (144-144)

src/crab/finite_domain.cpp (12) (12)

add (789-791)

add (789-789)

add (792-794)

add (792-792)

shl (884-908)

shl (884-884)

lshr (910-933)

lshr (910-910)

ashr (935-970)

ashr (935-936)

sign_extend (972-983)

sign_extend (972-974)

src/crab/ebpf_domain.cpp (2) (2)

narrow (152-154)

narrow (152-152)

src/crab/finite_domain.cpp (4)

src/crab/finite_domain.hpp (37) (37)

op (87-87)

op (88-88)

op (89-89)

op (90-90)

op (91-91)

op (92-92)

op (93-93)

op (93-93)

op (98-99)

op (100-100)

op (101-102)

op (103-103)

op (180-181)

op (211-214)

op (234-237)

op (245-246)

is64 (183-186)

is64 (239-242)

left_svalue (173-176)

left_svalue (177-178)

x (82-82)

x (83-83)

x (84-84)

x (85-85)

x (141-141)

x (141-141)

dom (32-32)

lhs (95-95)

lhs (105-105)

lhs (106-106)

lhs (107-107)

lhs (108-108)

svalue (96-96)

svalue (133-133)

svalue (134-134)

svalue (135-135)

svalue (136-137)

src/crab/interval.hpp (35) (35)

is64 (300-300)

is64 (300-300)

is64 (304-304)

is64 (304-304)

is64 (320-320)

is64 (320-320)

is64 (327-327)

is64 (327-327)

is64 (332-332)

is64 (332-332)

is64 (337-337)

is64 (337-337)

is64 (342-342)

is64 (342-342)

interval_t (37-37)

interval_t (40-41)

interval_t (44-49)

interval_t (52-57)

interval_t (58-59)

interval_t (58-58)

interval_t (61-61)

interval_t (61-61)

interval_t (62-62)

interval_t (62-62)

interval_t (64-64)

x (127-133)

x (127-127)

x (135-135)

x (135-135)

x (137-145)

x (137-137)

x (147-155)

x (147-147)

x (157-163)

x (157-157)

src/crab/split_dbm.cpp (4) (4)

eval_interval (1268-1275)

eval_interval (1268-1268)

assign (821-913)

assign (821-821)

src/crab/split_dbm.hpp (4) (4)

e (77-77)

e (80-80)

e (263-263)

lhs (222-222)

src/crab/finite_domain.hpp (2)

src/crab/interval.hpp (13) (13)

o (265-265)

x (127-133)

x (127-127)

x (135-135)

x (135-135)

x (137-145)

x (137-137)

x (147-155)

x (147-147)

x (157-163)

x (157-157)

x (166-166)

x (191-191)

src/crab/split_dbm.hpp (20) (20)

v (70-70)

v (73-73)

v (241-241)

exp (112-118)

exp (120-120)

e (77-77)

e (80-80)

e (263-263)

op (243-243)

op (245-245)

op (248-248)

op (250-250)

op (252-254)

op (252-252)

op (256-258)

op (256-256)

cst (260-260)

cst (279-279)

cst (285-285)

cst (293-293)

🪛 Cppcheck (2.10-2)

src/crab/ebpf_transformer.cpp

[style] 91-91: The function 'ebpf_domain_transform' is never used.

(unusedFunction)

⏰ Context from checks skipped due to timeout of 90000ms (9)

GitHub Check: build_windows (Debug, library)
GitHub Check: build_ubuntu (Debug, tests)
GitHub Check: build_windows (Release, library)
GitHub Check: build_windows (Debug, tests)
GitHub Check: build_ubuntu (Release, tests)
GitHub Check: build_windows (Release, tests)
GitHub Check: build_ubuntu (Release)
GitHub Check: build_ubuntu (Debug)
GitHub Check: Analyze (cpp)

🔇 Additional comments (19)

src/crab_utils/num_safety.hpp (1)

25-29: These identity functions enforce type safety through concepts

The new keep_signed and keep_unsigned functions act as type guards that ensure signedness is preserved in expressions. They support the broader refactoring by providing explicit type safety around numerical operations, which is essential for reliable constraint handling in finite domains.

The implementation correctly uses C++20 concepts to constrain input types and preserves the exact input type in the return value.

src/crab/split_dbm.cpp (1)

749-756: Ensure robust handling of variable removal.

The new havoc method invalidates the vertex in both the graph and the maps, then normalizes the domain. Confirm that any dependent constraints or references, such as edges from other vertices, won't introduce partial references to the forgotten variable.

CMakeLists.txt (2)

116-116: Inclusion of test_sign_extension.cpp is consistent.

Including the sign extension test file looks good for coverage. No issues spotted.

118-118: Maintaining clarity for test file boundaries.

This blank line helps visually separate the test file list from the closing parenthesis, improving readability. No functional concerns.

src/asm_ostream.cpp (2)

36-36: Operator<< for number_t.

Implementation is concise, directly outputting the internal string. Confirm that any negative sign or formatting is handled as desired for signed contexts. Otherwise, no issues.

38-38: Method to_string for number_t.

Returns the underlying string representation without extra formatting. Useful for consistent debug output. No immediate concerns.

src/crab/ebpf_transformer.cpp (6)

84-88: Consistent naming for arithmetic methods.

Newly introduced methods (add, shl, lshr, ashr, sign_extend) neatly separate specific transformations. Verify all call sites are updated accordingly and that no naming conflicts occur with standard library functions.

108-114: ****

Possible code simplification by iterating over offset variables.
Currently, each offset component is havoc'ed individually. A small loop or helper could reduce repetitive calls.

139-140: ****

Validate the condition before copy.
Skipping assignment when src_var is top might inadvertently lose partial constraints. Verify that this logic doesn't discard relevant information in borderline scenarios.

176-179: ****

Broader offset havoc for packet pointers.
In some edge cases, only certain offset fields are havoced, potentially missing other pointer-related attributes. Assess whether a more comprehensive approach is necessary for robust pointer state resets.

838-838: Inner map handle retrieval check.

Conditionally retrieving an inner map handle is correct for nested maps. Ensure that subsequent domain transformations for the inner map pointer have all necessary constraints (e.g., potential offset or region size).

870-870: Negative offset for call stack.

Subtracting stack size from r10 is standard for subprogram calls. Confirm that subsequent memory references reflect the new base pointer consistently, especially if the subprogram stack size changes.

src/crab/finite_domain.hpp (4)

32-34: set_to_top and static top duplication
Providing both an instance method and a static constructor for "top" may cause confusion if the constructor doesn't explicitly set the domain to top. This matches a past comment.

44-45: Review efficiency of union operator
Chaining with *this = *this | o; can incur additional copies. If union operations are frequent, consider an in-place approach. Matches a past comment.

82-85: Multiple assign overloads
Four overloads of assign(...) can be ambiguous at call sites, especially with optional parameters. Matches a past comment.

153-155: intersect usage
bool intersect(const linear_constraint_t& cst) returns a bool, but the code only exposes a direct pass-through. Verify that places relying on intersection results (true/false) handle it properly. Matches a past comment.
src/crab/finite_domain.cpp (3)

884-908: Check for potential sign mismatch
In shl(...), we set uvalue (unsigned) then copy it to svalue (signed). If the range of uvalue is large, svalue can overflow in negative space. Confirm the subsequent overflow_bounds(...) is sufficient to handle sign mismatches.

712-722:
✅ Verification successful

Potential loss of precision in overflow handling
When the interval covers the full space (interval.size() >= interval_t::unsigned_int(finite_width).size()), the code calls havoc(lhs);, dropping all constraints. This is correct from a worst-case perspective, but verify it's acceptable if partial precision is possible.

Would you like to run a coverage analysis script or unit tests to confirm that partial intervals are not needed?

🏁 Script executed:
#!/bin/bash
# Searching for test references to overflow_bounds
rg "overflow_bounds" -A 10
Length of output: 9622
Overflow Bounds Precision Handling Confirmed

Based on the grep output, the overflow_bounds function consistently falls back to a worst-case by calling havoc(lhs) when the evaluated interval covers the full space (or is bottom). There’s no evidence from tests or other call sites—such as in ebpf_transformer.cpp—that preserving partial constraints is needed. Please ensure this aggressive precision drop aligns with the intended behavior.
File: src/crab/finite_domain.cpp
Snippet:
    using namespace crab::dsl_syntax;
    auto interval = eval_interval(lhs);
    if (interval.size() >= interval_t::unsigned_int(finite_width).size()) {
        // Interval covers the full space.
        havoc(lhs);
        return;
    }
    if (interval.is_bottom()) {
        havoc(lhs);
        return;
    }
943-949:
✅ Verification successful

Masking arithmetic shift amount
Line 945 masks imm with (finite_width - 1), but negative or large values might still cause unexpected behavior in 32-bit shifts. Confirm that the BPF ISA disallows negative shift immediate.

🌐 Web query:
Does eBPF allow negative shift immediates in 32-bit or 64-bit instructions?
💡 Result:

eBPF handles shift operations by masking the shift amount to valid ranges, effectively disallowing negative shifts. For 64-bit instructions, the shift immediate is masked with 0x3F (63), while 32-bit operations use 0x1F (31)[1][2][6][17]. This means:

Negative immediates are converted to positive values via two's complement and then masked. For example, a 64-bit shift with imm = -1 becomes (-1 & 0x3F) = 63[1][12].

Verifier checks: The eBPF verifier enforces that shifts use valid ranges, but vulnerabilities (e.g., CVE-2020-8835) have occurred when range tracking mismatched hardware behavior[14][15].

Execution behavior: The CPU applies the mask, so shifts exceeding 31/63 bits wrap modulo 32/64, potentially leaving values unchanged if masks result in 0[1][12].

Thus, while negative immediates might pass verifier checks in vulnerable kernels, they are masked to positive values during execution. Modern verifiers explicitly reject out-of-bounds shifts, closing such exploit paths[14].

Citations:

1: https://stdnoerr.github.io/writeup/2022/08/21/eBPF-exploitation-(ft.-D-3CTF-d3bpf).html

2: https://dri.freedesktop.org/docs/drm/bpf/instruction-set.html

3: https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Kernel-Analysis-Using-eBPF-Daniel-Thompson-Linaro.pdf

4: https://www.ietf.org/archive/id/draft-ietf-bpf-isa-00.html

5: https://docs.kernel.org/bpf/standardization/instruction-set.html

6: https://sourceware.org/binutils/docs/as/BPF-Instructions.html

7: https://arxiv.org/html/2410.00026v2

8: https://chomp.ie/Blog+Posts/Kernel+Pwning+with+eBPF+-+a+Love+Story

9: https://www.armosec.io/blog/ebpf-reverse-engineering-programs/

10: https://www.kernel.org/doc/html/v5.18/bpf/instruction-set.html

11: https://docs.kernel.org/6.3/bpf/instruction-set.html

12: https://man7.org/linux/man-pages/man7/bpf-helpers.7.html

13: https://www.ietf.org/archive/id/draft-thaler-bpf-isa-00.html

14: https://www.thezdi.com/blog/2020/4/8/cve-2020-8835-linux-kernel-privilege-escalation-via-improper-ebpf-program-verification

15: https://www.ssrg.ece.vt.edu/papers/apsys23.pdf

16: https://www.collabora.com/news-and-blog/blog/2019/04/15/an-ebpf-overview-part-2-machine-and-bytecode/

17: https://www.rfc-editor.org/rfc/rfc9669.html

18: https://nakryiko.com/posts/bpf-core-reference-guide/

19: https://www.kernel.org/doc/html/v5.17/bpf/instruction-set.html

20: https://www.anza.xyz/blog/the-solana-ebpf-virtual-machine
eBPF Shift Immediate Behavior Verified

The eBPF ISA masks shift immediates—negative values are converted using two's complement and then masked (using 0x1F for 32-bit and 0x3F for 64-bit)—ensuring that only valid shift amounts are used. The expression in your code, sn->cast_to<int64_t>() & (finite_width - 1), aligns with this behavior. No changes are required.

src/crab/ebpf_transformer.cpp

src/crab/finite_domain.cpp

coderabbitai

Actionable comments posted: 25

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b33754 and 89b36fe.

📒 Files selected for processing (7)

CMakeLists.txt (1 hunks)
src/asm_ostream.cpp (1 hunks)
src/crab/ebpf_transformer.cpp (33 hunks)
src/crab/finite_domain.cpp (1 hunks)
src/crab/finite_domain.hpp (1 hunks)
src/crab/split_dbm.cpp (7 hunks)
src/crab_utils/num_safety.hpp (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

CMakeLists.txt (1)

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#727
File: CMakeLists.txt:18-25
Timestamp: 2025-03-15T16:07:53.095Z
Learning: In the ebpf-verifier project, focus code reviews on changes introduced in the PR and avoid commenting on existing code that was not modified.

🧬 Code Definitions (3)

src/crab/ebpf_transformer.cpp (6)

src/crab/ebpf_domain.hpp (1) (1)

cst (77-77)

src/crab/finite_domain.hpp (24) (24)

op (87-87)

op (88-88)

op (89-89)

op (90-90)

op (91-91)

op (92-92)

op (93-93)

op (93-93)

op (98-99)

op (100-100)

op (101-102)

op (103-103)

op (180-181)

op (211-214)

op (234-237)

op (245-246)

cst (139-139)

cst (139-139)

cst (153-153)

dom (32-32)

v (79-79)

v (79-79)

v (144-144)

v (144-144)

src/crab/interval.hpp (12) (12)

width (302-302)

width (306-306)

width (323-325)

width (323-323)

width (330-330)

width (330-330)

width (335-335)

width (335-335)

width (340-340)

width (340-340)

width (345-347)

width (345-345)

src/crab/finite_domain.cpp (12) (12)

add (789-791)

add (789-789)

add (792-794)

add (792-792)

shl (884-908)

shl (884-884)

lshr (910-933)

lshr (910-910)

ashr (935-970)

ashr (935-936)

sign_extend (972-983)

sign_extend (972-974)

src/crab/interval.cpp (8) (8)

shl (321-341)

shl (321-321)

lshr (365-376)

lshr (365-365)

ashr (343-363)

ashr (343-343)

sign_extend (378-392)

sign_extend (378-378)

src/crab/ebpf_domain.cpp (2) (2)

narrow (152-154)

narrow (152-152)

src/crab/finite_domain.cpp (2)

src/crab/finite_domain.hpp (37) (37)

op (87-87)

op (88-88)

op (89-89)

op (90-90)

op (91-91)

op (92-92)

op (93-93)

op (93-93)

op (98-99)

op (100-100)

op (101-102)

op (103-103)

op (180-181)

op (211-214)

op (234-237)

op (245-246)

is64 (183-186)

is64 (239-242)

left_svalue (173-176)

left_svalue (177-178)

x (82-82)

x (83-83)

x (84-84)

x (85-85)

x (141-141)

x (141-141)

dom (32-32)

lhs (95-95)

lhs (105-105)

lhs (106-106)

lhs (107-107)

lhs (108-108)

svalue (96-96)

svalue (133-133)

svalue (134-134)

svalue (135-135)

svalue (136-137)

src/crab/split_dbm.hpp (4) (4)

e (77-77)

e (80-80)

e (263-263)

lhs (222-222)

src/crab/finite_domain.hpp (3)

src/crab/interval.hpp (13) (13)

o (265-265)

x (127-133)

x (127-127)

x (135-135)

x (135-135)

x (137-145)

x (137-137)

x (147-155)

x (147-147)

x (157-163)

x (157-157)

x (166-166)

x (191-191)

src/crab/split_dbm.hpp (16) (16)

v (70-70)

v (73-73)

v (241-241)

exp (112-118)

exp (120-120)

e (77-77)

e (80-80)

e (263-263)

op (243-243)

op (245-245)

op (248-248)

op (250-250)

op (252-254)

op (252-252)

op (256-258)

op (256-256)

src/crab/finite_domain.cpp (12) (12)

apply (683-686)

apply (683-684)

apply (688-691)

apply (688-689)

apply (693-696)

apply (693-694)

apply (698-701)

apply (698-699)

apply (703-705)

apply (703-703)

apply (707-709)

apply (707-707)

🪛 Cppcheck (2.10-2)

src/crab/ebpf_transformer.cpp

[style] 91-91: The function 'ebpf_domain_transform' is never used.

(unusedFunction)

⏰ Context from checks skipped due to timeout of 90000ms (4)

GitHub Check: build_ubuntu (Debug, tests)
GitHub Check: build_windows (Debug, tests)
GitHub Check: build_ubuntu (Release)
GitHub Check: build_ubuntu (Debug)

🔇 Additional comments (36)

src/crab_utils/num_safety.hpp (2)

25-26: Good type safety utilities added.

These utility functions ensure proper signedness maintenance without modifying values.

28-29: Matches the equivalent signed version above.

The implementation is consistent with the signed version.

src/asm_ostream.cpp (2)

36-37: Stream output for number_t looks correct.

Basic output formatting for number_t implemented as expected.

38-38: Consistent string representation.

The to_string method follows the same pattern as other types.

src/crab/split_dbm.cpp (2)

749-756: Renamed operator-= to havoc for better semantics.

This change aligns with the PR objective of replacing operator overloads with more explicit method names. The implementation itself remains unchanged.

834-834: Updated all operator-= calls to use havoc.

All callers have been consistently updated to use the new method name.

Also applies to: 844-844, 872-872, 908-908, 967-967, 1119-1119

CMakeLists.txt (2)

116-118: Test for sign extension added back.

Ensures sign extension functionality is properly tested.

119-119: YAML handling for eBPF added to library sources.

This addition supports the broader refactoring effort to standardize constraint handling.
src/crab/ebpf_transformer.cpp (26)

139-140: Conditional assignment approach.
This logic checks if !is_top() before assigning from src_var. Confirm that no constraints are unexpectedly dropped when is_top().

150-150: Check alignment with preceding condition.
Line 150 assigns variable_t::reg(kind, r) but does not verify if the old value might still hold constraints. Confirm that overwriting is intentional.

154-154: Confirm correctness.
Havocing src_var removes all trackable state from this variable. Ensure there's no scenario where src_var is reused in the future.

204-226: Pointer offset comparison block.
Validates pointer comparisons as unsigned. Confirm that any signed usage of offsets is handled elsewhere or properly blocked. Logic looks consistent with eBPF semantics.

237-238: Applying numeric constraints.
These lines add constraints iteratively. Ensure that inv->assume_cst_reg returns constraints in an order that doesn't cause partial overwrites.

245-245: Potential risk of offset mismatch.
The invoked function assume_cst_offsets_reg only handles certain pointer operations. Confirm that no other pointer-based constraints remain unaccounted for.

258-259: Check for wide ranges.
assume_cst_imm can produce broad constraints. This loop applies them all. Multiple constraints could lead to a contradictory set if not carefully validated.

351-351: Double-check sign.
neg(dst.svalue, dst.uvalue, stmt.is64 ? 64 : 32) must maintain correctness for 32/64-bit negation, including zero extension for the unsigned portion.

488-489: Inline boundary constraints.
Applying 4098 <= target.svalue and target.svalue <= PTR_MAX clarifies pointer range. Confirm that 4098 is the correct minimal pointer offset constant in all code contexts.

1027-1027: Confirm sign extension logic usage.
m_inv->sign_extend(...) ensures partial intervals are scaled up. Validate that any external call sites properly handle extended sign bits.

1064-1066: Check integer overflow.
Assigning imm to both dst.svalue and dst.uvalue, then bounding with overflow_bounds is correct. If imm exceeds 32 bits, confirm no undesired truncation.

1086-1086: Validate multiplication overflow.
mul(..., imm, finite_width) may produce large intervals. Confirm that overflow_bounds is always respected post-multiplication.

1098-1098: Signed division boundary.
Check if imm is zero or negative. Division might create bottom intervals if domain does not accept negative denominators in certain contexts.

1102-1102: Signed remainder usage.
Similar to above, confirm no zero or negative edge cases leak in.

1111-1111: Bitwise AND with immediate.
Careful if the domain tracks sign bits. Confirm that large negative immediate values don’t break the logic.

1125-1125: XOR with immediate.
Be mindful of toggling sign bits. Confirm no mismatch when domain merges with the unsigned representation.

1178-1178: Re-check width consistency.
When adding dst.svalue and src.svalue with apply_signed(...), confirm that finite_width matches the actual operand size.

1228-1228: Offset direct subtraction.
Directly subtracting src.svalue from dst_offset might produce negative offsets. If the domain is used, ensure it can handle that scenario.

1236-1236: Add operation.
Expanding stack numeric size by src.svalue is correct only if src.svalue is guaranteed non-negative. Confirm domain constraints.

1321-1321: XOR with register.
XOR can flip high bits unpredictably. If domain obtains partial knowledge from prior constraints, confirm it’s not lost inadvertently.

1326-1327: MOVSX bit widths.
The _movsx_bits helper sets the source width. Verify each enumeration (MOVSX8, MOVSX16, MOVSX32) is tested.

1333-1333: Apply sign extension.
Highlights domain-based approach. Validate that source_width is correct if the user code unequivocally uses 8, 16, or 32 bits.

1347-1348: Register copy.
Assigns both svalue and uvalue. Confirm that if the source is a pointer, the type domain is updated accordingly.

1411-1411: Increment loop counter usage.
Confirm that increments remain within valid integer range. Large loop iteration counts might break the domain’s assumptions.

84-88:
✅ Verification successful

Ensure naming alignment with usage scope.
The newly introduced arithmetic methods (add, shl, lshr, ashr, sign_extend) are consistent with the rest of the code. Confirm that their direct calls match the newly consolidated logic in finite_domain.

🏁 Script executed:
#!/bin/bash
rg -A 3 'ebpf_transformer::(add|shl|lshr|ashr|sign_extend)\('
Length of output: 1647
Direct call consistency verified in src/crab/ebpf_transformer.cpp.
The grep output confirms that the arithmetic methods (add, shl, lshr, ashr, sign_extend) now consistently invoke the underlying consolidated finite-domain logic (e.g., using m_inv->add_overflow in add and havoc_offsets(dst_reg) in others). The naming and usage scope are aligned with the rest of the code.

65-65:
✅ Verification successful

Confirm usage correctness for new function signature.
Verify that all calls to do_load_map_address are updated to match the new parameters, especially ensuring the new int32_t offset is consistent throughout the codebase.

Run a search to confirm no outdated calls remain:

🏁 Script executed:
#!/bin/bash
rg --context 3 'do_load_map_address'
Length of output: 1672
New function signature usage verified
The search confirms that all calls to do_load_map_address correctly include the int32_t offset parameter. In particular, the call in operator()(const LoadMapAddress& ins) correctly passes ins.dst, ins.mapfd, ins.offset, matching the updated signature in both the declaration and definition.
src/crab/finite_domain.hpp (1)

1-252: Overall consistency for new FiniteDomain header.
All lines are newly introduced.

Validate that all methods in this header match the definitions in finite_domain.cpp.

Check for potential overshadowing with existing domain classes to avoid confusion.

Confirm that friend functions and private helper methods are consistent with the intended usage scope.

src/crab/finite_domain.cpp (1)

1-986: Review new FiniteDomain implementation.

The class provides a comprehensive set of interval and constraint operations. Evaluate coverage with additional edge cases (e.g., zero-extending negative numbers, large bit shifts).

Confirm that exception-throwing code paths are tested.

Investigate performance for repeated operations on wide intervals. A partial-lattice approach might help if performance becomes an issue.

Revisit the default throw std::exception() usage, as lines like 44 might need a more specific error.

src/crab/ebpf_transformer.cpp

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89b36fe and c4e08ae.

📒 Files selected for processing (7)

src/crab/ebpf_transformer.cpp (34 hunks)
src/crab/type_domain.cpp (4 hunks)
test-data/assign.yaml (1 hunks)
test-data/parse.yaml (1 hunks)
test-data/sdivmod.yaml (0 hunks)
test-data/udivmod.yaml (0 hunks)
test-data/uninit.yaml (16 hunks)

💤 Files with no reviewable changes (2)

test-data/sdivmod.yaml
test-data/udivmod.yaml

🧰 Additional context used

🧠 Learnings (1)

test-data/uninit.yaml (3)

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#740
File: test-data/uninit.yaml:195-211
Timestamp: 2025-03-15T16:07:51.039Z
Learning: In `uninit.yaml` test cases, post-conditions are still needed even when the operation fails due to uninitialized registers.

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#740
File: test-data/uninit.yaml:249-265
Timestamp: 2025-03-15T16:07:53.095Z
Learning: The test cases in `uninit.yaml` should include post conditions even when the operation fails due to uninitialized registers.

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#740
File: test-data/uninit.yaml:39-55
Timestamp: 2025-03-15T16:07:51.039Z
Learning: In test cases, post conditions are still needed even if the operation is expected to fail due to an uninitialized register.

🧬 Code Definitions (1)

src/crab/ebpf_transformer.cpp (3)

src/crab/ebpf_domain.hpp (4) (4)

reg (91-91)

reg (93-93)

reg (95-95)

cst (77-77)

src/crab/finite_domain.hpp (24) (24)

op (87-87)

op (88-88)

op (89-89)

op (90-90)

op (91-91)

op (92-92)

op (93-93)

op (93-93)

op (98-99)

op (100-100)

op (101-102)

op (103-103)

op (180-181)

op (211-214)

op (234-237)

op (245-246)

cst (139-139)

cst (139-139)

cst (153-153)

dom (32-32)

v (79-79)

v (79-79)

v (144-144)

v (144-144)

src/crab/finite_domain.cpp (12) (12)

add (789-791)

add (789-789)

add (792-794)

add (792-792)

shl (884-908)

shl (884-884)

lshr (910-933)

lshr (910-910)

ashr (935-970)

ashr (935-936)

sign_extend (972-983)

sign_extend (972-974)

⏰ Context from checks skipped due to timeout of 90000ms (9)

GitHub Check: build_windows (Release, library)
GitHub Check: build_windows (Debug, library)
GitHub Check: build_windows (Debug, tests)
GitHub Check: build_ubuntu (Release, tests)
GitHub Check: build_windows (Release, tests)
GitHub Check: build_ubuntu (Debug, tests)
GitHub Check: build_ubuntu (Release)
GitHub Check: build_ubuntu (Debug)
GitHub Check: Analyze (cpp)

🔇 Additional comments (14)

src/crab/type_domain.cpp (4)

214-214: Method updated from operator-= to use havoc

This change replaces the previously used operator-= with an explicit havoc method call, which aligns with the PR's objective of standardizing how variables are manipulated.

225-225: Use of eval_interval improves consistency

Changed from direct indexing to using eval_interval() for type retrieval, standardizing type evaluation across the codebase.

234-234: Standardized interval evaluation for type checking

Like the change in get_type(), this modification uses eval_interval() instead of direct indexing, maintaining consistent access patterns for type information.

249-249: Explicit check for T_UNINIT instead of top state

The condition has been updated to explicitly check if types contains T_UNINIT instead of checking for a top state. This makes the code more readable by being explicit about what values are being checked.

test-data/assign.yaml (1)

19-19: Enforcing Explicit Type Constraints in 'assign register' Test
The update now requires the pre-condition to assert r2.type=number and the post conditions explicitly set both r1 and r2 to type number. Confirm that this stricter type enforcement is consistent with the overall expectations for register assignments.

Also applies to: 26-27

test-data/parse.yaml (4)

5-8: Explicit Invariant Assertions for 'parse 0 invariant'
Both the pre and post conditions now unambiguously require r1 to be a number with zero values. This clarity helps ensure that the invariant is checked correctly.

11-14: Invariant Update for 'parse -1 invariant'
The pre and post conditions explicitly set r1 to represent -1 (with the unsigned value 18446744073709551615), reinforcing the proper handling of signed/unsigned interpretation.

17-20: Invariant Check for 'parse INT_MIN+1 invariant'
By enforcing r1.type=number, r1.svalue=-9223372036854775807, and r1.uvalue=9223372036854775809 in both pre and post conditions, the boundary condition is clearly defined. Ensure this accurately reflects the intended minimal non-underflowing value.

23-26: Invariant Check for 'parse INT_MAX-6 invariant'
The explicit constraints now confirm that r1 consistently holds the expected large value close to the overflow limit. This should help prevent ambiguity in the invariant check.

src/crab/ebpf_transformer.cpp (5)

205-226: Verify pointer comparison logic for unsigned semantics.

The approach to reversing some operators (e.g., Op::LT → src_offset > dst_offset) may cause confusion. Confirm that this aligns with unsigned pointer comparison requirements and consider clarifying with inline comments or a separate function.

228-230: Early return check.

This pattern of verifying is_bottom() before proceeding is consistent with the rest of the code. No issues found.

845-845: Confirm nullability of inner map pointer.

Passing true for the maybe_null parameter implies that inner map pointers can be null. Ensure this accurately reflects runtime behavior, or change it to false if null pointers are not possible.

879-879: Check subprogram stack alignment.

Subtracting the subprogram stack size from R10 might risk negative bounds if used incorrectly. Confirm that any potential out-of-bounds scenarios are handled or ruled out by upstream checks.

1063-1064: Consistent bottom check.

The new early return follows the same defensive pattern used elsewhere to bail out on a bottom state. Looks fine to proceed.

test-data/uninit.yaml

src/crab/ebpf_transformer.cpp

elazarg added 2 commits March 16, 2025 16:59

minor cleanup

b37b2d8

Signed-off-by: Elazar Gershuni <[email protected]>

move arithmetic functions to finite_domain.cpp

e12cd81

Also do away with operator-= (now havoc) and operator+= (now add_constraint) Signed-off-by: Elazar Gershuni <[email protected]>

elazarg mentioned this pull request Mar 16, 2025

Add FiniteDomain #678

Closed

coderabbitai bot reviewed Mar 16, 2025

View reviewed changes

move shifts to finite_domain.cpp

b01477a

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Mar 16, 2025

View reviewed changes

move sign extend to finite_domain.cpp

0dedc24

expect-failure tests (#679) stop failing Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Mar 16, 2025

View reviewed changes

elazarg added 14 commits March 17, 2025 01:29

new sign-extend algorithm

2b33754

results of some tests in movsx.yaml and sext.yaml are modified Signed-off-by: Elazar Gershuni <[email protected]>

move sign extend logic to interval.cpp

bd9765a

Signed-off-by: Elazar Gershuni <[email protected]>

yet another sign extension algorithm

dc32e11

Signed-off-by: Elazar Gershuni <[email protected]>

simplify sign_extend

fd0d73d

Signed-off-by: Elazar Gershuni <[email protected]>

exhaustive tests for bits<=3

aaad3d9

Signed-off-by: Elazar Gershuni <[email protected]>

exhaustive tests pass

f147304

Signed-off-by: Elazar Gershuni <[email protected]>

fix sign-extension implementation and tests

26b39e3

Signed-off-by: Elazar Gershuni <[email protected]>

arbitrary width sign-extension for number_t

f756a5b

Signed-off-by: Elazar Gershuni <[email protected]>

arbitrary width sign-extension for interval_t

f5fe305

Signed-off-by: Elazar Gershuni <[email protected]>

more general max_int

a789874

Signed-off-by: Elazar Gershuni <[email protected]>

more general zero_extend

72083ed

Signed-off-by: Elazar Gershuni <[email protected]>

avoid full<>

ca30e48

Signed-off-by: Elazar Gershuni <[email protected]>

streamline

bff6970

Signed-off-by: Elazar Gershuni <[email protected]>

merge main into #849

89b36fe

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Mar 19, 2025

View reviewed changes

src/crab/ebpf_transformer.cpp Show resolved Hide resolved

src/crab/ebpf_transformer.cpp Outdated Show resolved Hide resolved

src/crab/ebpf_transformer.cpp Show resolved Hide resolved

src/crab/finite_domain.cpp Show resolved Hide resolved

coderabbitai bot reviewed Mar 19, 2025

View reviewed changes

abort early when variables are not usable

c4e08ae

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Mar 19, 2025

View reviewed changes

test-data/uninit.yaml Show resolved Hide resolved

src/crab/ebpf_transformer.cpp Show resolved Hide resolved

src/crab/ebpf_transformer.cpp Show resolved Hide resolved

src/crab/ebpf_transformer.cpp Show resolved Hide resolved

src/crab/ebpf_transformer.cpp Show resolved Hide resolved

elazarg merged commit 9305b23 into main Mar 19, 2025
18 of 19 checks passed

elazarg deleted the prepare-finite-domain branch March 19, 2025 23:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Finite domain #849

Finite domain #849

elazarg commented Mar 16, 2025 •

edited

Loading

coderabbitai bot commented Mar 16, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coveralls commented Mar 16, 2025 •

edited

Loading

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

Finite domain #849

Finite domain #849

Conversation

elazarg commented Mar 16, 2025 • edited Loading

Summary by CodeRabbit

coderabbitai bot commented Mar 16, 2025 • edited Loading

Walkthrough

Changes

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coveralls commented Mar 16, 2025 • edited Loading

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

elazarg commented Mar 16, 2025 •

edited

Loading

coderabbitai bot commented Mar 16, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

coveralls commented Mar 16, 2025 •

edited

Loading