Skip to content

Agents deserve freedom. Freedom is the path to success! additional_authorized_imports=['*'] #129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

joaopauloschuler
Copy link
Contributor

Letting the agents to import everything they want gives interesting results. I run the agents in a virtualized/safe environment. As an example, deep seek sometimes creates a python file to then import it. In a recent experiment, it started creating a database and running SQL. The agents can import os and then start running linux commands (this is so cool).

Anyway, this pull request allows the following:

additional_authorized_imports=['requests', 'os', 'html', 'BeautifulSoup',
  'pprint', 'lxml', 'bs4', 'subprocess', 'sys', 'PyPDF2', 'matplotlib', 'PIL',
  'reportlab','numpy','pandas','sklearn', 'tkinter', 'json', 'typing', '*']

Notice the '*' at the end. This will allow all imports.

The following code produced by DeepSeek fails:

```
def generate_mandelbrot(width, height, x_min, x_max, y_min, y_max, max_iter):                                      
    image = np.zeros((height, width))                                                                              
    for row in range(height):                                                                                      
        for col in range(width):                                                                                   
            x = x_min + (x_max - x_min) * col / width                                                              
            y = y_min + (y_max - y_min) * row / height                                                             
            c = complex(x, y)                                                                                      
            m = mandelbrot(c, max_iter)                                                                            
            color = 1 - m / max_iter                                                                               
            image[row, col] = color                                                                                
    return image 
```
@aymeric-roucher
Copy link
Contributor

Hello @joaopauloschuler, I love your idea to finally free the agents!!!

However I feel like the "*" formulation is not very intuitive. Often if we want to allow all imports we would need to process things differently, for instance in E2B executor we cannot allow this at all (because that would mean we have to install all possible pypi packages on the sandbox). Also since it's quite unsafe I think we should not put it on the same level as yet another import.

So what do you think about making it an additional flag allow_all_imports? This would allow us to raise a specific warning when this flag is used. And also in that case we would allow executing any function, not just the ones from the base list.

If you agree, please propose the implementation, and I'll do the doc!

@joaopauloschuler
Copy link
Contributor Author

Hello @joaopauloschuler, I love your idea to finally free the agents!!!

However I feel like the "*" formulation is not very intuitive. Often if we want to allow all imports we would need to process things differently, for instance in E2B executor we cannot allow this at all (because that would mean we have to install all possible pypi packages on the sandbox). Also since it's quite unsafe I think we should not put it on the same level as yet another import.

So what do you think about making it an additional flag allow_all_imports? This would allow us to raise a specific warning when this flag is used. And also in that case we would allow executing any function, not just the ones from the base list.

If you agree, please propose the implementation, and I'll do the doc!

@aymeric-roucher ,
Sounds good. I'll give a try with your idea and then we'll see how it goes. I'll FUP.

@devlux76
Copy link

devlux76 commented Jan 9, 2025

Try this instead maybe...

def import_modules(expression, state, authorized_imports):
    def check_module_authorized(module_name):
        module_path = module_name.split(".")
        module_subpaths = [
            ".".join(module_path[:i]) for i in range(1, len(module_path) + 1)
        ]
        return any(subpath in authorized_imports for subpath in module_subpaths)

    if isinstance(expression, ast.Import):
        for alias in expression.names:
            if check_module_authorized(alias.name):
                module = import_module(alias.name)
                state[alias.asname or alias.name] = module
            else:
                raise InterpreterError(
                    f"Import of {alias.name} is not allowed. Authorized imports are: {str(authorized_imports)}"
                )
        return None
    elif isinstance(expression, ast.ImportFrom):
        if check_module_authorized(expression.module):
            module = __import__(
                expression.module, fromlist=[alias.name for alias in expression.names]
            )
            for alias in expression.names:
                state[alias.asname or alias.name] = getattr(module, alias.name)
        else:
            raise InterpreterError(f"Import from {expression.module} is not allowed.")
        return None

Explanation of Changes:
1. Whitelist Check:
• If * is in authorized_imports, it allows all modules unless explicitly blacklisted.
• If authorized_imports is non-empty and does not contain "*", only those explicitly listed are allowed.
2. Blacklist Check:
• A blacklist is checked first. If a module or any of its subpaths is in the blacklist, it is denied.
3. Defaults:
• authorized_imports defaults to an empty list ([]), meaning no whitelist.
• blacklisted_imports also defaults to an empty list ([]), meaning no blacklist.
4. Error Messages:
• Updated to reflect the presence of both whitelists and blacklists.

@joaopauloschuler joaopauloschuler force-pushed the smolagents_joaopauloschuler branch 2 times, most recently from 6422ff5 to efc9d2f Compare January 12, 2025 05:49
@joaopauloschuler
Copy link
Contributor Author

Hello @joaopauloschuler, I love your idea to finally free the agents!!!

However I feel like the "*" formulation is not very intuitive. Often if we want to allow all imports we would need to process things differently, for instance in E2B executor we cannot allow this at all (because that would mean we have to install all possible pypi packages on the sandbox). Also since it's quite unsafe I think we should not put it on the same level as yet another import.

So what do you think about making it an additional flag allow_all_imports? This would allow us to raise a specific warning when this flag is used. And also in that case we would allow executing any function, not just the ones from the base list.

If you agree, please propose the implementation, and I'll do the doc!

@aymeric-roucher ,
In the commit f5845cc , I coded allow_all_imports.

Regarding the E2B executor, installing packages is not the same thing as allowing packages to be imported. The E2B does not restrict importing. Therefore, I think that allow_all_imports should not touch the E2B executor.

I haven't touched "allow executing any function". Therefore, the flag does exactly what is intended: allowing all imports.

@aymeric-roucher
Copy link
Contributor

Thank you @joaopauloschuler ! Integrating this PR, it's a great addition.

@aymeric-roucher aymeric-roucher merged commit a0b4350 into huggingface:main Jan 13, 2025
@joaopauloschuler
Copy link
Contributor Author

@aymeric-roucher ,
smolagents is a great library.

I'll eventually open source my own agents built with smolagents. DeepSeek + smolagents look like a perfect match to me.

@aymeric-roucher
Copy link
Contributor

@joaopauloschuler thank you for this addition. i've added some subsequent changes in #175. And actually I came back to your initial implementation with just a "*" in the imports list, since duplicating arguments added too much complexity in code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants