Skip to content

Mixed-size accesses #840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 112 commits into
base: development
Choose a base branch
from
Open

Mixed-size accesses #840

wants to merge 112 commits into from

Conversation

xeren
Copy link
Collaborator

@xeren xeren commented Mar 27, 2025

This PR adds basic support for mixed-size and misaligned accesses, i.e. memory instructions with imperfectly overlapping access ranges. Such an instruction is partitioned into multiple events with distinct access ranges. The events then address properly slotted byte ranges. The generated events are si-related (see below).

Memory model

  • The new base relation si ("Same Instruction") is an equivalence relation. Each equivalent class denotes an instruction.
  • cat/aarch64-mixed.cat: Application-level ARM with mixed-size support. Modified copy fetched from aarch64.cat.

Verification

  • Static analysis of where a memory access should be split in order to properly alias other accesses.
  • Program processor Tearing performs the splitting of events according to the above analysis results. This process is sensisitive to the byte ordering of the program. For litmus tests, little endian is assumed. In LLVM IR, the byte order is explicit.

Configuration

  • Option program.analysis.mixedSize: Set to false to disable the feature, if you are confident, that MSAs do not happen. This is recommended, if you are selecting another field-insensitive alias analysis (via program.analysis.alias) or verifying a program with unsupported pointer operations (currently non-linear pointer arithmetic, like bit fiddling).

Tests

Copied from aarch64-tests. The verdicts were generated using DIY 7.58 with the following command:

herd7 -variant mixed -model cat/aarch64-mixed.cat -I cat $LITMUS_TEST

xeren added 30 commits February 5, 2025 19:14
AliasAnalysis.getMixedSizeAccessSet(AliasAnalysis,List)
Add AliasAnalysis.mayMixedSizeAccesses(MemoryCoreEvent)
Remove AliasAnalysis.mayMix(MemoryCoreEvent,MemoryCoreEvent)
Remove AliasAnalysis.getMayMixedSizeAccessesSet(Collection)
ExpressionFactory.makeIntExtract(Expression,int,int)
Signed-off-by: Hernan Ponce de Leon <[email protected]>
Signed-off-by: Hernan Ponce de Leon <[email protected]>
Copy link
Owner

@hernanponcedeleon hernanponcedeleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass up to the litmus and assertion parsers

Comment on lines +44 to +46
// ==========================================
// Lockref
// ==========================================

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the point of keeping both variants (specially since according to Thomas both generated the same dat3m IR). This just make it more confusing for people trying to navigate over the benchmarks in the repo with no clear benefit.

I added another method for the API, refactor the code and created some more interesting examples. I would simply remove lockref1 and lockrref2.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two versions are very different in terms of LLVM IR. The fact that we create (almost) the same internal IR is not obvious which makes it a valuable benchmark to evaluate our processing pipeline.
I don't think we need the two versions as unit tests, but I would like to keep them around for manual testing, i.e., inspection of logging outputs. We could have a seperate directory for such benchmarks.

@@ -50,6 +50,7 @@ public class OptionNames {
public static final String PROPAGATE_COPY_ASSIGNMENTS = "program.processing.propagateCopyAssignments";
public static final String REMOVE_ASSERTION_OF_TYPE = "program.processing.skipAssertionsOfType";
public static final String NONTERMINATION_INSTRUMENTATION = "program.processing.nonTermination";
public static final String MIXED_SIZE = "program.processing.mixedSize";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option only makes sense when we have modeling.integers=false, right? If so, can we somehow check that the two optiosn match accordingly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, we do no clever encoding in presence of MSAs and encoding.integers=true. The responsible classes are ProcessingManager and EncodingContext, which currently do not directly communicate. We could issue a warning at the first (or k-th) bitwise operation. Otherwise we could add a second declaration of the option, or move one of the options into Program.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantically, the option makes sense independent of whether we do integer encoding or not. The only problem with the combination is that there are going to be more BV operations which will cause slowdown. But the same is true if the program naturally contains such operations.
That being said, I don't think I have implemented extract/concat operations for the integer encoding, so the encoder will likely throw an exception.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also saw a TODO about some missing módulo operations to restrict values (I think in some simplification step) for the case of an integer type. This sounds to me that we can even be unsound if those options are combined

Copy link
Collaborator

@ThomasHaas ThomasHaas May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO you are referring to is in the encoding of BV truncations (e.g. bv64 down to bv32), where we do not reduce the value. I don't think we necessarily need to, because integer encoding is incompatible with overflowing arithmetic anyway. So integer encoding in itself is unsound to begin with.

Comment on lines -205 to 206
final Expression zero = expressions.makeZero(types.getArchType());
for (int offset = 0; offset < size; offset++) {
mem.setInitialValue(offset, zero);
}
mem.setInitialValue(0, expressions.makeZero(types.getIntegerType(8 * size)));
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we need this change? Isn't this the responsible of all the fails in the CI that Thomas mentioned were related to init events?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think this change is because before we have never considered the correct type for initial memory.
A 1-byte sized memory object could hold a 8-byte sized initial value, which is just not correct, but didn't result in problems before.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was necessary. The former snippet produced overlapping initializations, which would always result in worst-case byte-wise Tearing. A compromise would be to produce initial values for every eighth byte, as below.

final Expression zero = expressions.makeZero(types.getArchType());
final int archSize =  types.getMemorySizeInBytes(types.getArchType());
for (int offset = 0; offset + archSize <= size; offset += archSize) {
    mem.setInitialValue(offset, zero);
}
if (size % archSize != 0) {
    final IntegerType remType = types.getIntegerType(size % archSize);
    mem.setInitialValue(archSize * (size / archSize), expressions.makeZero(remType));
}


private final InstructionBoundary begin;

public InstructionBoundary(Void ignore, InstructionBoundary b) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this first parameter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid collision with the copy constructor.

Comment on lines 147 to 154
detectMixedSizeAccesses ? ProgramProcessor.fromFunctionProcessor(
FunctionProcessor.chain(
performAssignmentInlining ? AssignmentInlining.newInstance() : null,
sccp,
dce,
removeDeadJumps
), Target.THREADS, true
) : null,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we run all these passes again?

Copy link
Collaborator

@ThomasHaas ThomasHaas May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Tearing opens up new possibilities to optimize if you had mixed-size accesses on local variables. At least Mem2Reg and SCCP should run again after Tearing. If the set of optimization is identical to what we already had, we can refactor this into a single local variable and reuse it twice.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think MemToReg gets no benefit from Tearing. Trying to make it MSA-aware would probably let it solve the local instances on its own. But the assignments introduced by Tearing can be simplified.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following what you are saying. You say that MemToReg is not MSA-aware and therefore probably pessimistic in regards to MSA accesses, no? If so, then why is Tearing not enabling more optimizations?
What happens if you locally allocate an bv64 but use bv32 accesses to it? What will MemToReg do here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To detect MSAs MemToReg relies on allocationType, which Tearing cannot relyably update without copying the infrastructure of MemToReg. I refered to the other approach, bringing a bit of Tearing into MemToReg instead. This approach is now implemented. It results in it promoting the local accesses on the first try, before Tearing.

xeren added 4 commits May 8, 2025 21:44
…ixed-sized-accesses

# Conflicts:
#	dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/CoreCodeVerification.java
#	dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/Intrinsics.java
#	dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/ProcessingManager.java
#	dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/ThreadCreation.java
xeren and others added 6 commits May 13, 2025 15:41
…ixed-sized-accesses

# Conflicts:
#	dartagnan/src/main/java/com/dat3m/dartagnan/parsers/program/visitors/VisitorLlvm.java
# Conflicts:
#	dartagnan/src/main/java/com/dat3m/dartagnan/encoding/ExpressionEncoder.java
#	dartagnan/src/main/java/com/dat3m/dartagnan/encoding/WmmEncoder.java
#	dartagnan/src/main/java/com/dat3m/dartagnan/expression/ExpressionKind.java
#	dartagnan/src/main/java/com/dat3m/dartagnan/program/Program.java
#	dartagnan/src/main/java/com/dat3m/dartagnan/verification/model/ExecutionModelManager.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants