[Analyzer] Debloat #2806

Merged
16 commits merged into intelowlproject:develop on Apr 11, 2025

Conversation

AnshSinghal
Contributor

@AnshSinghal AnshSinghal commented Mar 23, 2025

Closes #2521

Description

Added a new analyzer, Debloat, a tool that removes excess garbage from bloated executables.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality).

Checklist

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch develop
  • A new plugin (analyzer, connector, visualizer, playbook, pivot or ingestor) was added or changed, in which case:
    • Advanced-Usage was updated (in case the plugin provides additional optional configuration). A link to the PR to the docs repo has been added as a comment here.
    • I have dumped the configuration from Django Admin using the dumpplugin command and added it in the project as a data migration. ("How to share a plugin with the community")
    • If a File analyzer was added and it supports a mimetype which is not already supported, you added a sample of that type inside the archive test_files.zip and you added the default tests for that mimetype in test_classes.py.
    • If you created a new analyzer and it is free (does not require any API key), please add it in the FREE_TO_USE_ANALYZERS playbook by following this guide.
    • Check if it could make sense to add that analyzer/connector to other freely available playbooks.
    • I have provided the resulting raw JSON of a finished analysis and a screenshot of the results.
    • If the plugin interacts with an external service, I have created an attribute called precisely url that contains this information. This is required for Health Checks.
    • If the plugin requires mocked testing, _monkeypatch() was used in its class to apply the necessary decorators.
    • I have added that raw JSON sample to the MockUpResponse of the _monkeypatch() method. This serves us to provide a valid sample for testing.
  • I have inserted the copyright banner at the start of the file: # This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl # See the file 'LICENSE' for copying permission.
  • If external libraries/packages with restrictive licenses were used, they were added in the Legal Notice section.
  • Linters (Black, Flake, Isort) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved (see tests folder). All the tests (new and old ones) gave 0 errors.
  • If the GUI has been modified:
    • I have provided a screenshot of the result in the PR.
    • I have created new frontend tests for the new component or updated existing ones.
  • After submitting the PR, if DeepSource, Django Doctors or other third-party linters triggered any alerts during the CI checks, I have resolved those alerts.

Important Rules

  • If you fail to complete the Checklist properly, your PR won't be reviewed by the maintainers.
  • Every time you make changes to the PR and you think the work is done, you should explicitly ask for a review by using GitHub's reviewing system detailed here.

@AnshSinghal
Contributor Author

Hi @mlodic
I would like your advice on this. Even after encoding, the analyzer returns a very long string for large files. Is it necessary to return the file, or can you suggest an alternative?

@fgibertoni
Contributor

Imho it's necessary to return the entire file in the report because it can then be used with pivots to start file analysis with other playbooks
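
A minimal sketch of why the encoded string grows with the file, assuming the file is Base64-encoded before it goes into the JSON report. The encoding scheme, the helper names, and the `debloated_file` key are assumptions for illustration, not taken from the PR:

```python
import base64
import json


def encode_file_for_report(data: bytes) -> str:
    """Return file bytes as an ASCII Base64 string that fits in a JSON report."""
    return base64.b64encode(data).decode("ascii")


def encoded_length(size: int) -> int:
    """Length of the padded Base64 string for `size` input bytes."""
    return 4 * ((size + 2) // 3)


payload = b"MZ" + bytes(1000)  # stand-in for a debloated executable
report = {"debloated_file": encode_file_for_report(payload)}  # hypothetical key

# The string is about 4/3 the size of the file, so large files give long strings.
assert len(report["debloated_file"]) == encoded_length(len(payload))

# A pivot can recover the exact bytes from the report.
restored = base64.b64decode(json.loads(json.dumps(report))["debloated_file"])
assert restored == payload
```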

@AnshSinghal
Contributor Author

@fgibertoni The thing is that debloat cannot debloat every file; sometimes it fails. The issue with the debloat library is that it does not raise an error in that case; it simply does not generate an output file. This is why the tests are failing. Any suggestions on what I can do in this case?

@fgibertoni
Contributor

Why doesn't it debloat every time? Because the file is already debloated, or because the tool is not able to do it?
In the first case there is nothing to do, IMO: just return the already clean file.
In the second case, maybe there are some known limitations or bugs in the tool? We can compare the input file with the output to see whether cleaning has been performed.
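
One way the input/output comparison could look, as a sketch only: it treats a missing output file as "no cleaning" (matching the behavior described above, where the library writes nothing on failure) and otherwise compares size and SHA-256. The function names are hypothetical and not part of the analyzer:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large executables are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cleaning_performed(original: Path, output: Path) -> bool:
    """True only if an output file exists and differs from the input."""
    if not output.is_file():
        return False  # the tool produced nothing
    if output.stat().st_size != original.stat().st_size:
        return True  # different size means the content changed
    return sha256_of(original) != sha256_of(output)
```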

@AnshSinghal
Contributor Author

AnshSinghal commented Apr 2, 2025

Because it's not able to do it:
INFO:main:No automated method for reducing the size worked. Please consider sharing the
sample for additional analysis.
Email: [email protected]
Twitter: @SquiblydooBlog.

So here is what I will do: after debloating, I will check the return code, and if it is 0 (no solution found), I will return the JSON.
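
A rough sketch of that check, assuming only what is stated above (a return code of 0 means no solution was found). The function name, the report keys, and the treatment of every non-zero code as success are assumptions for illustration:

```python
import os
from typing import Any, Dict

NO_SOLUTION_FOUND = 0  # per the comment above; other codes are assumed to mean success


def summarize_result(
    debloat_code: int, output_path: str, original_size: int
) -> Dict[str, Any]:
    """Build a JSON-serializable report from the return code and output file."""
    # The library may write no output file without raising, so check both.
    if debloat_code == NO_SOLUTION_FOUND or not os.path.isfile(output_path):
        return {
            "success": False,
            "error": "No solution found",
            "original_size": original_size,
        }
    return {
        "success": True,
        "debloat_code": debloat_code,
        "original_size": original_size,
        "debloated_size": os.path.getsize(output_path),
    }
```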

@AnshSinghal
Contributor Author

@fgibertoni please review

log_message=log_message,
beginning_file_size=original_size,
)
except Exception as e:
Contributor

Please do not catch generic Exception

except pefile.PEFormatError as e:
    raise AnalyzerRunException(f"Invalid PE file: {e}")

# BBOT logger is passing invalid kwargs to logger.info like "end" and "flush"
Contributor

Can you provide an example of this behavior, so we can evaluate other options for parsing?

Contributor Author

TypeError Traceback (most recent call last)
in <cell line: 0>()
40 output_path = "tempR.exe"
41
---> 42 debloat_code = process_pe(
43 pe,
44 out_path=output_path,

2 frames
/usr/lib/python3.11/logging/__init__.py in info(self, msg, *args, **kwargs)
1487 """
1488 if self.isEnabledFor(INFO):
-> 1489 self._log(INFO, msg, args, **kwargs)
1490
1491 def warning(self, msg, *args, **kwargs):

TypeError: Logger._log() got an unexpected keyword argument 'end'

Contributor

I don't understand how BBOT logger is related to this one, but I understand the problem now.
Is there any way to emulate the flush parameter behavior instead of just dropping it?
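
One possible way to emulate it, as a sketch: a `print()`-style callback that accepts `end` and `flush`, forwards only the message to the standard `logging` module, and flushes the handlers when `flush=True`. The logger name and the function are hypothetical. `end` is accepted but ignored here, because `logging` emits one complete record per call:

```python
import logging

logger = logging.getLogger("debloat_sketch")  # hypothetical logger name


def log_message(message: object = "", end: str = "\n", flush: bool = False) -> None:
    """print()-style callback that never passes `end` or `flush` to Logger.info."""
    logger.info(str(message))
    if not flush:
        return
    # Emulate print(..., flush=True): flush every handler that received the record.
    current = logger
    while current is not None:
        for handler in current.handlers:
            handler.flush()
        if not current.propagate:
            break
        current = current.parent
```

Passing this callable as the `log_message` argument would avoid the `TypeError` shown in the traceback while keeping the flush behavior; the logger still needs its level set to `INFO` for the messages to appear.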

Contributor Author

sure

"health_check_status": True,
"type": "file",
"docker_based": False,
"maximum_tlp": "RED",
Contributor

Any reason for RED?

@AnshSinghal AnshSinghal requested a review from fgibertoni April 8, 2025 11:08
Contributor

@fgibertoni fgibertoni left a comment

Just a small change and we're good to go

@AnshSinghal
Contributor Author

@fgibertoni can we merge this now?

@fgibertoni fgibertoni marked this pull request as ready for review April 11, 2025 10:17
Contributor

@fgibertoni fgibertoni left a comment

Merging, thank you for the contribution!

@fgibertoni fgibertoni merged commit 27e8c45 into intelowlproject:develop Apr 11, 2025
10 checks passed