Double check strings algorithm #1094

Open
mr-tz opened this issue Jan 18, 2025 · 3 comments · May be fixed by #1103

Comments

@mr-tz
Collaborator

mr-tz commented Jan 18, 2025

See mandiant/capa#2555

@0xRavenspar

Hey @mr-tz, I would like to work on this issue if it's still open.

@0xRavenspar

I have a question. In capa, the function takes a bytes buffer and the byte value (ASCII code) of the character:

def buf_filled_with(buf: bytes, character: int) -> bool:
    """Check if the given buffer is filled with the given character, repeatedly.
    Args:
        buf: The bytes buffer to check
        character: The byte value (0-255) to check for
    Returns:
        True if all bytes in the buffer match the character, False otherwise.
        The empty buffer contains no bytes, therefore always returns False.
    """

But the tests in test_buf_filled_with.py call it with str and mmap inputs:

tests = [
    ("A", True),
    ("AB", False),
    ("A" * 10000, True),
    (("A" * 10000) + "B", False),
    ("B" + ("A" * 5000), False),
    (("A" * 5000) + "B" + ("A" * 2000), False),
    (("A" * 5000) + ("B" * 5000), False),
]


def test_str():
    for test, expectation in tests:
        assert buf_filled_with(test, test[0]) == expectation


def test_mmap():
    for test, expectation in tests:
        f = tempfile.NamedTemporaryFile()
        f.write(test.encode("utf-8"))
        f.flush()
        test_mmap = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        assert buf_filled_with(test_mmap, test[0].encode("utf-8")) == expectation

Should the function check for str and mmap and convert them to bytes, or should it only accept a bytes buffer and an int?
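
To make the mismatch concrete, here is a small sketch (not taken from the capa or FLOSS code; the variable names are only illustrative) of how indexing differs between str and bytes in Python 3. It shows why `test[0]` in test_str and `test[0].encode("utf-8")` in test_mmap both pass something other than the int that the type hint asks for:

```python
text = "AAA"
data = b"AAA"

# Indexing a str yields a one-character str.
first_of_str = text[0]

# Indexing a bytes object yields an int in the range 0-255.
first_of_bytes = data[0]

# Encoding a one-character str yields a bytes object of length 1, not an int.
encoded = text[0].encode("utf-8")

# An int never compares equal to a bytes object, even for the "same" character.
int_equals_bytes = first_of_bytes == encoded

print(type(first_of_str).__name__, type(first_of_bytes).__name__, type(encoded).__name__)
print(int_equals_bytes)
```

So a function typed as `(buf: bytes, character: int)` would receive a str in one test and a length-1 bytes object in the other.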

@mr-tz
Collaborator Author

mr-tz commented Mar 24, 2025

It should likely be bytes only. Ideally we add missing type hints as part of this fix and verify all usages.
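
A minimal sketch of what a bytes-only version could look like, assuming the docstring semantics quoted earlier in this thread (an empty buffer returns False). The body based on `bytes.count`, the range check, and the shortened test table are assumptions for illustration, not the existing capa or FLOSS implementation:

```python
def buf_filled_with(buf: bytes, character: int) -> bool:
    """Return True if every byte in buf equals character.

    The empty buffer contains no bytes, so it returns False.
    """
    # Assumption: reject values that cannot be a single byte.
    if not 0 <= character <= 255:
        raise ValueError("character must be a byte value in the range 0-255")
    if not buf:
        return False
    # The buffer is filled with the byte exactly when the byte's
    # occurrence count equals the buffer length.
    return buf.count(character) == len(buf)


# Hypothetical bytes-only test table, mirroring the str cases quoted above.
TESTS = [
    (b"A", True),
    (b"AB", False),
    (b"A" * 10000, True),
    ((b"A" * 10000) + b"B", False),
    (b"B" + (b"A" * 5000), False),
    ((b"A" * 5000) + b"B" + (b"A" * 2000), False),
]


def test_bytes() -> None:
    for buf, expectation in TESTS:
        # buf[0] is already an int, so it matches the `character: int` hint.
        assert buf_filled_with(buf, buf[0]) == expectation


test_bytes()
```

With this shape, callers that hold a str or an mmap would need to convert to bytes before calling, which is the part that "verify all usages" would have to cover.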

@0xRavenspar 0xRavenspar linked a pull request Mar 28, 2025 that will close this issue
2 participants