base64 and widebase64 modifiers #1185

Merged
merged 14 commits into from
Feb 17, 2020
63 changes: 62 additions & 1 deletion docs/writingrules.rst
@@ -66,6 +66,8 @@ keywords are reserved and cannot be used as an identifier:
- uint32be
- wide
- xor
- base64
- base64wide
-

Rules are generally composed of two sections: strings definition and condition.
@@ -362,7 +364,7 @@ The following rule will search for every single byte xor applied to the string
$xor_string = "This program cannot" xor

condition:
$xor_string
$xor_string
}

The above rule is logically equivalent to:
@@ -435,6 +437,65 @@ If you want more control over the range of bytes used with the xor modifier use:
The above example will apply the bytes from 0x01 to 0xff, inclusively, to the
string when searching. The general syntax is ``xor(minimum-maximum)``.
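
The effect of ``xor(minimum-maximum)`` can be illustrated with a short Python sketch. This is only a model of the behaviour described above, not libyara's implementation; the helper name ``xor_variants`` is hypothetical.

```python
def xor_variants(plaintext: bytes, minimum: int = 0x01, maximum: int = 0xFF) -> dict:
    """Return {key: plaintext XORed with key} for every key in the inclusive range.

    Hypothetical helper: it only models which byte patterns a rule using
    xor(minimum-maximum) would have to look for.
    """
    if not 0 <= minimum <= maximum <= 0xFF:
        raise ValueError("range must satisfy 0 <= minimum <= maximum <= 0xff")
    return {
        key: bytes(b ^ key for b in plaintext)
        for key in range(minimum, maximum + 1)  # +1 because the range is inclusive
    }


variants = xor_variants(b"This program cannot", 0x01, 0xFF)
```

Each entry of ``variants`` is one candidate byte sequence; XORing it with its key again yields the original string.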

base64 strings
^^^^^^^^^^^^^^

The ``base64`` modifier can be used to search for strings that have been base64
encoded. A good explanation of the technique is at:

https://www.leeholmes.com/blog/2019/12/10/searching-for-content-in-base-64-strings-2/

The following rule will search for the three base64 permutations of the string
"This program cannot":

.. code-block:: yara

    rule Base64Example1
    {
        strings:
            $a = "This program cannot" base64

        condition:
            $a
    }

This will cause YARA to search for these three permutations:

    VGhpcyBwcm9ncmFtIGNhbm5vd
    RoaXMgcHJvZ3JhbSBjYW5ub3
    UaGlzIHByb2dyYW0gY2Fubm90
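
The three permutations arise because the string can start at byte offset 0, 1, or 2 within a 3-byte base64 group. The following Python sketch reproduces them under the assumption that characters influenced by neighbouring bytes or by padding are trimmed; it is a model of the technique, not libyara's code, and ``base64_permutations`` is a hypothetical name.

```python
import base64


def base64_permutations(plaintext: bytes) -> list:
    """Return the stable base64 substrings of plaintext at offsets 0, 1 and 2."""
    results = []
    # Leading characters that depend on the unknown bytes before the string:
    # none at offset 0, two at offset 1, three at offset 2.
    for offset, lead_drop in enumerate((0, 2, 3)):
        encoded = base64.b64encode(b"\x00" * offset + plaintext).decode("ascii")
        stripped = encoded.rstrip("=")
        if len(stripped) != len(encoded):
            # Padding was present, so the last character also encodes
            # zero-fill bits and would change with whatever follows.
            stripped = stripped[:-1]
        results.append(stripped[lead_drop:])
    return results


permutations = base64_permutations(b"This program cannot")
```

For the example string, ``permutations`` holds the same three values listed above, in the same order.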

The ``base64wide`` modifier works just like the ``base64`` modifier, except that
the base64-encoded results are then converted to wide.

When ``base64`` (or ``base64wide``) is combined with ``wide`` or ``ascii``, the
``wide`` and ``ascii`` modifiers are applied to the string first, and the
``base64`` or ``base64wide`` modifier is then applied to the result. The
plaintext ``ascii`` and ``wide`` versions of the string are never included in
the search. If you also want to match those, put them in a secondary string.
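
The order of operations can be sketched in Python. The helpers below are hypothetical and use full padded encodings for brevity, whereas YARA searches for the trimmed permutations; the point is only to contrast ``wide base64`` (widen, then encode) with ``base64wide`` (encode, then widen).

```python
import base64


def to_wide(data: bytes) -> bytes:
    """Interleave every byte with a zero byte, modelling the wide modifier."""
    return b"".join(bytes((b, 0)) for b in data)


def wide_then_base64(text: bytes) -> bytes:
    """Models `wide base64`: the string is widened first, then base64-encoded."""
    return base64.b64encode(to_wide(text))


def base64_then_wide(text: bytes) -> bytes:
    """Models `base64wide`: the string is base64-encoded, then the result is widened."""
    return to_wide(base64.b64encode(text))


a = wide_then_base64(b"Hi")
b = base64_then_wide(b"Hi")
```

The two results differ: ``a`` is plain base64 text of the widened input, while ``b`` is the base64 text of the original input with a zero byte after every character.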

The ``base64`` and ``base64wide`` modifiers also support a custom alphabet. For
example:

.. code-block:: yara

    rule Base64Example2
    {
        strings:
            $a = "This program cannot" base64("!@#$%^&*(){}[].,|ABCDEFGHIJ\x09LMNOPQRSTUVWXYZabcdefghijklmnopqrstu")

        condition:
            $a
    }

The alphabet must be 64 bytes long.
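
One way to picture a custom alphabet is as a character-for-character substitution of the standard base64 alphabet. The Python sketch below assumes that model; ``encode_with_alphabet`` is a hypothetical helper, not part of YARA or of Python's ``base64`` module.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# The alphabet from Base64Example2; \x09 is a tab character.
DOC_ALPHABET = "!@#$%^&*(){}[].,|ABCDEFGHIJ\x09LMNOPQRSTUVWXYZabcdefghijklmnopqrstu"


def encode_with_alphabet(data: bytes, alphabet: str) -> str:
    """Base64-encode data, then map each standard symbol to the custom one."""
    if len(alphabet) != 64:
        raise ValueError("alphabet must be exactly 64 characters long")
    table = str.maketrans(STANDARD, alphabet)
    # '=' padding is not part of the 64-symbol alphabet and is left unchanged.
    return base64.b64encode(data).decode("ascii").translate(table)


custom = encode_with_alphabet(b"This program cannot", DOC_ALPHABET)
```

The length check mirrors the 64-byte requirement stated above.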

The ``base64`` and ``base64wide`` modifiers are only supported with text
strings. Using these modifiers with a hexadecimal string or a regular expression
will cause a compiler error. Also, the ``xor`` and ``nocase`` modifiers used in
combination with ``base64`` or ``base64wide`` will cause a compiler error.

Searching for full words
^^^^^^^^^^^^^^^^^^^^^^^^

2 changes: 2 additions & 0 deletions libyara/Makefile.am
@@ -95,6 +95,7 @@ yarainclude_HEADERS = \
include/yara/ahocorasick.h \
include/yara/arena.h \
include/yara/atoms.h \
include/yara/base64.h \
include/yara/bitmask.h \
include/yara/compiler.h \
include/yara/error.h \
@@ -158,6 +159,7 @@ libyara_la_SOURCES = \
ahocorasick.c \
arena.c \
atoms.c \
base64.c \
bitmask.c \
compiler.c \
endian.c \

8 changes: 6 additions & 2 deletions libyara/atoms.c
@@ -812,7 +812,7 @@ static int _yr_atoms_case_insensitive(
// _yr_atoms_xor
//
// For a given list of atoms returns another list after a single byte xor
// has been applied to it (0x01 - 0xff).
// has been applied to it.
//

static int _yr_atoms_xor(
@@ -1411,7 +1411,11 @@ int yr_atoms_extract_from_re(
*atoms = NULL;
});

if (modifier.flags & STRING_GFLAGS_WIDE)
// Don't convert atoms to wide here if either base64 modifier is used.
// This is to avoid the situation where we have "base64 wide" because
// the wide has already been applied BEFORE the base64 encoding.
if (modifier.flags & STRING_GFLAGS_WIDE &&
!(modifier.flags & STRING_GFLAGS_BASE64 || modifier.flags & STRING_GFLAGS_BASE64_WIDE))
{
FAIL_ON_ERROR_WITH_CLEANUP(
_yr_atoms_wide(*atoms, &wide_atoms),