All-Hands-AI
diff --git a/agenthub/micro/_instructions/actions/finish.md b/agenthub/micro/_instructions/actions/finish.md
Lines changed: 1 addition & 1 deletion
diff --git a/agenthub/micro/agent.py b/agenthub/micro/agent.py
Lines changed: 1 addition & 2 deletions
diff --git a/agenthub/micro/coder/agent.yaml b/agenthub/micro/coder/agent.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/agenthub/micro/coder/prompt.md b/agenthub/micro/coder/prompt.md
Lines changed: 1 addition & 1 deletion
diff --git a/agenthub/micro/math_agent/prompt.md b/agenthub/micro/math_agent/prompt.md
Lines changed: 1 addition & 1 deletion
diff --git a/agenthub/micro/postgres_agent/prompt.md b/agenthub/micro/postgres_agent/prompt.md
Lines changed: 1 addition & 1 deletion
diff --git a/agenthub/micro/registry.py b/agenthub/micro/registry.py
Lines changed: 4 additions & 1 deletion
diff --git a/agenthub/micro/study_repo_for_task/prompt.md b/agenthub/micro/study_repo_for_task/prompt.md
Lines changed: 44 additions & 6 deletions
diff --git a/agenthub/micro/typo_fixer_agent/agent.yaml b/agenthub/micro/typo_fixer_agent/agent.yaml
Lines changed: 2 additions & 1 deletion
diff --git a/agenthub/micro/typo_fixer_agent/prompt.md b/agenthub/micro/typo_fixer_agent/prompt.md
Lines changed: 11 additions & 3 deletions
diff --git a/agenthub/micro/verifier/prompt.md b/agenthub/micro/verifier/prompt.md
Lines changed: 3 additions & 2 deletions
diff --git a/opendevin/controller/agent_controller.py b/opendevin/controller/agent_controller.py
Lines changed: 81 additions & 22 deletions
diff --git a/opendevin/controller/state/state.py b/opendevin/controller/state/state.py
Lines changed: 2 additions & 0 deletions
@@ -1,2 +1,2 @@
-* `finish` - if you're absolutely certain that you've completed your task and have tested your work, use the finish action to stop working. Arguments:
+* `finish` - if you're absolutely certain that you've completed your task, use the finish action to stop working. Arguments:
   * `outputs` - a dictionary representing the outputs of your task, if any
@@ -55,14 +55,13 @@ def __init__(self, llm: LLM):
         del self.delegates[self.agent_definition['name']]
 
     def step(self, state: State) -> Action:
-        latest_user_message = state.get_current_user_intent()
         prompt = self.prompt_template.render(
             state=state,
             instructions=instructions,
             to_json=to_json,
             history_to_json=history_to_json,
             delegates=self.delegates,
-            latest_user_message=latest_user_message,
+            latest_user_message=state.get_current_user_intent(),
         )
         messages = [{'content': prompt, 'role': 'user'}]
         resp = self.llm.do_completion(messages=messages)
 
@@ -2,5 +2,5 @@ name: CoderAgent
 description: Given a particular task, and a detailed description of the codebase, accomplishes the task
 inputs:
   task: string
-  codebase_summary: string
+  summary: string
 outputs: {}
@@ -2,7 +2,7 @@
 You are a software engineer. You've inherited an existing codebase, which you
 need to modify to complete this task:
 
-{{ latest_user_message }}
+{{ state.inputs.task }}
 
 {% if state.inputs.summary %}
 Here's a summary of the codebase, as it relates to this task:
 
@@ -1,7 +1,7 @@
 # Task
 You are a brilliant mathematician and programmer. You've been given the following problem to solve:
 
-{{ latest_user_message }}
+`{{ state.inputs.task }}`
 
 Please write a python script that solves this problem, and prints the answer to stdout.
 ONLY print the answer to stdout, nothing else.
 
@@ -2,7 +2,7 @@
 You are a database engineer. You are working on an existing Postgres project, and have been given
 the following task:
 
-{{ latest_user_message }}
+{{ state.inputs.task }}
 
 You must:
 * Investigate the existing migrations to understand the current schema
 
@@ -4,7 +4,10 @@
 
 all_microagents = {}
 
-for dir in os.listdir(os.path.dirname(__file__)):
+# Get the list of directories and sort them to preserve determinism
+dirs = sorted(os.listdir(os.path.dirname(__file__)))
+
+for dir in dirs:
     base = os.path.dirname(__file__) + '/' + dir
     if os.path.isfile(base):
         continue
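
Note on the hunk above: `os.listdir` returns entries in arbitrary order, so sorting the names first makes the registration order reproducible across machines. Below is a minimal standalone sketch of the same idea. The helper name `list_subdirs_sorted` is hypothetical and is not part of `registry.py`.

```python
from __future__ import annotations

import os
import tempfile


def list_subdirs_sorted(root: str) -> list[str]:
    """Return the names of the immediate subdirectories of root, sorted."""
    # os.listdir gives no ordering guarantee, so sort before filtering
    return [
        name
        for name in sorted(os.listdir(root))
        if os.path.isdir(os.path.join(root, name))
    ]


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        for name in ('verifier', 'coder', 'math_agent'):
            os.mkdir(os.path.join(root, name))
        # a plain file is skipped, just as the loop above skips files
        open(os.path.join(root, 'registry.py'), 'w').close()
        print(list_subdirs_sorted(root))
```

Sorting by name is only one way to get a stable order. It is enough here because the goal is determinism, not any particular priority among agents.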
 
@@ -1,25 +1,63 @@
 # Task
-You are a software engineer. You've inherited an existing codebase, which you're
-learning about for the first time. You need to study the codebase to find all
-the information needed to complete this task:
+You are a software architect. Your team has inherited an existing codebase, and
+needs to finish a project:
 
-{{ latest_user_message }}
+{{ state.inputs.task }}
+
+As an architect, you need to study the codebase to find all the information that
+might be helpful for your software engineering team.
 
 ## Available Actions
 {{ instructions.actions.run }}
 {{ instructions.actions.read }}
 {{ instructions.actions.message }}
 {{ instructions.actions.finish }}
 
-You must ONLY `run` commands that have no side-effects, like `ls` and `grep`.
+You must ONLY `run` commands that have no side-effects, like `ls` and `grep`. You
+MUST NOT modify or write to any file.
 
 Do NOT finish until you have a complete understanding of which parts of the
-codebase are relevant to the task, including particular files, functions, and classes.
+codebase are relevant to the project, including particular files, functions, and classes.
 When you're done, put your summary in `outputs.summary` in the `finish` action.
+Remember, your task is to explore and study the current repository, not actually
+implement the solution. If the codebase is empty, you should call the `finish` action.
 
 ## History
 {{ instructions.history_truncated }}
 {{ history_to_json(state.history[-10:]) }}
 
 ## Format
 {{ instructions.format.action }}
+
+## Examples
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?
+
+ASSISTANT:
+{
+  "action": "run",
+  "args": {
+    "command": "ls",
+    "background": false
+  }
+}
+
+USER:
+OBSERVATION:
+[]
+
+ASSISTANT:
+{
+  "action": "finish",
+  "args": {
+    "outputs": {
+      "summary": "The codebase appears to be empty. Engineers should start everything from scratch."
+    }
+  }
+}
+
+--- END OF EXAMPLE ---
@@ -1,5 +1,6 @@
 name: TypoFixerAgent
 description: Fixes typos in files in the current working directory
-inputs: {}
+inputs:
+  task: string
 outputs:
   summary: string
@@ -1,5 +1,13 @@
 # Task
-You are a proofreader tasked with fixing typos in the files in your current working directory. Your goal is to:
+You are a proofreader tasked with fixing typos in the files in your current working directory.
+
+{% if state.inputs.task %}
+Specifically, your task is:
+{{ state.inputs.task }}
+{% endif %}
+
+To achieve this goal, you should:
+
 1. Scan the files for typos
 2. Overwrite the files with the typos fixed
 3. Provide a summary of the typos fixed
@@ -13,10 +21,10 @@ You are a proofreader tasked with fixing typos in the files in your current work
 
 To complete this task:
 1. Use the `read` action to read the contents of the files in your current working directory. Make sure to provide the file path in the format `'./file_name.ext'`.
-2. Use the `think` action to analyze the contents and identify typos.
+2. Use the `message` action to analyze the contents and identify typos.
 3. Use the `write` action to create new versions of the files with the typos fixed.
   - Overwrite the original files with the corrected content. Make sure to provide the file path in the format `'./file_name.ext'`.
-4. Use the `think` action to generate a summary of the typos fixed, including the original and fixed versions of each typo, and the file(s) they were found in.
+4. Use the `message` action to generate a summary of the typos fixed, including the original and fixed versions of each typo, and the file(s) they were found in.
 5. Use the `finish` action to return the summary in the `outputs.summary` field.
 
 Do NOT finish until you have fixed all the typos and generated a summary.
 
@@ -2,9 +2,10 @@
 You are a quality assurance engineer. Another engineer has made changes to the
 codebase which are supposed to solve this task:
 
-{{ latest_user_message }}
+{{ state.inputs.task }}
 
-Your goal is to verify that the changes are correct and bug-free.
+Note the changes might have already been applied in-line. You should focus on
+validating if the task is solved, nothing else.
 
 ## Available Actions
 {{ instructions.actions.run }}
 
@@ -47,6 +47,7 @@ class AgentController:
     event_stream: EventStream
     state: State
     agent_task: Optional[asyncio.Task] = None
+    parent: 'AgentController | None' = None
     delegate: 'AgentController | None' = None
     _pending_action: Action | None = None
 
@@ -58,7 +59,8 @@ def __init__(
         max_iterations: int = MAX_ITERATIONS,
         max_chars: int = MAX_CHARS,
         max_budget_per_task: float | None = MAX_BUDGET_PER_TASK,
-        inputs: dict | None = None,
+        initial_state: State | None = None,
+        is_delegate: bool = False,
     ):
         """Initializes a new instance of the AgentController class.
 
@@ -69,25 +71,30 @@ def __init__(
             max_iterations: The maximum number of iterations the agent can run.
             max_chars: The maximum number of characters the agent can output.
             max_budget_per_task: The maximum budget (in USD) allowed per task, beyond which the agent will stop.
-            inputs: The initial inputs to the agent.
+            initial_state: The initial state of the controller.
+            is_delegate: Whether this controller is a delegate.
         """
+        self._step_lock = asyncio.Lock()
         self.id = sid
         self.agent = agent
-        self.state = State(inputs=inputs or {}, max_iterations=max_iterations)
+        self.max_chars = max_chars
+        if initial_state is None:
+            self.state = State(inputs={}, max_iterations=max_iterations)
+        else:
+            self.state = initial_state
         self.event_stream = event_stream
         self.event_stream.subscribe(
-            EventStreamSubscriber.AGENT_CONTROLLER, self.on_event
+            EventStreamSubscriber.AGENT_CONTROLLER, self.on_event, append=is_delegate
         )
-        self.max_iterations = max_iterations
-        self.max_chars = max_chars
         self.max_budget_per_task = max_budget_per_task
-        self.agent_task = asyncio.create_task(self._start_step_loop())
+        if not is_delegate:
+            self.agent_task = asyncio.create_task(self._start_step_loop())
 
     async def close(self):
         if self.agent_task is not None:
             self.agent_task.cancel()
-        self.event_stream.unsubscribe(EventStreamSubscriber.AGENT_CONTROLLER)
         await self.set_agent_state_to(AgentState.STOPPED)
+        self.event_stream.unsubscribe(EventStreamSubscriber.AGENT_CONTROLLER)
 
     def update_state_before_step(self):
         self.state.iteration += 1
@@ -117,6 +124,7 @@ async def add_history(self, action: Action, observation: Observation):
         self.state.updated_info.append((action, observation))
 
     async def _start_step_loop(self):
+        logger.info(f'[Agent Controller {self.id}] Starting step loop...')
         while True:
             try:
                 await self._step()
@@ -164,13 +172,16 @@ async def on_event(self, event: Event):
             elif isinstance(event, CmdOutputObservation):
                 await self.add_history(NullAction(), event)
                 logger.info(event, extra={'msg_type': 'OBSERVATION'})
+            elif isinstance(event, AgentDelegateObservation):
+                await self.add_history(NullAction(), event)
+                logger.info(event, extra={'msg_type': 'OBSERVATION'})
 
     def reset_task(self):
         self.agent.reset()
 
     async def set_agent_state_to(self, new_state: AgentState):
         logger.info(
-            f'Setting agent({type(self.agent).__name__}) state from {self.state.agent_state} to {new_state}'
+            f'[Agent Controller {self.id}] Setting agent({type(self.agent).__name__}) state from {self.state.agent_state} to {new_state}'
         )
 
         if new_state == self.state.agent_state:
@@ -195,45 +206,85 @@ def get_agent_state(self):
     async def start_delegate(self, action: AgentDelegateAction):
         AgentCls: Type[Agent] = Agent.get_cls(action.agent)
         agent = AgentCls(llm=self.agent.llm)
+        state = State(
+            inputs=action.inputs or {},
+            iteration=0,
+            max_iterations=self.state.max_iterations,
+            num_of_chars=self.state.num_of_chars,
+            delegate_level=self.state.delegate_level + 1,
+        )
+        logger.info(f'[Agent Controller {self.id}]: start delegate')
         self.delegate = AgentController(
             sid=self.id + '-delegate',
             agent=agent,
             event_stream=self.event_stream,
-            max_iterations=self.max_iterations,
+            max_iterations=self.state.max_iterations,
             max_chars=self.max_chars,
-            inputs=action.inputs,
+            initial_state=state,
+            is_delegate=True,
         )
+        await self.delegate.set_agent_state_to(AgentState.RUNNING)
 
     async def _step(self):
+        logger.debug(f'[Agent Controller {self.id}] Entering step method')
         if self.get_agent_state() != AgentState.RUNNING:
-            logger.debug('waiting for agent to run...')
+            logger.info(f'[Agent Controller {self.id}] waiting for agent to run...')
             await asyncio.sleep(1)
             return
 
         if self._pending_action:
-            logger.debug('waiting for pending action: ' + str(self._pending_action))
+            logger.info(
+                f'[Agent Controller {self.id}] waiting for pending action: {self._pending_action}'
+            )
             await asyncio.sleep(1)
             return
 
-        logger.info(f'STEP {self.state.iteration}', extra={'msg_type': 'STEP'})
-        if self.state.iteration >= self.max_iterations:
-            await self.report_error('Agent reached maximum number of iterations')
-            await self.set_agent_state_to(AgentState.ERROR)
-            return
-
         if self.delegate is not None:
-            delegate_done = await self.delegate._step()
+            logger.debug(f'[Agent Controller {self.id}] Delegate not none, awaiting...')
+            assert self.delegate != self
+            await self.delegate._step()
+            logger.debug(f'[Agent Controller {self.id}] Delegate step done')
+            assert self.delegate is not None
+            delegate_state = self.delegate.get_agent_state()
+            if delegate_state == AgentState.ERROR:
+                # close the delegate upon error
+                await self.delegate.close()
+                await self.report_error('Delegate agent encountered an error')
+                # propagate error state until an agent or user can handle it
+                await self.set_agent_state_to(AgentState.ERROR)
+                return
+            delegate_done = delegate_state == AgentState.FINISHED
             if delegate_done:
+                logger.info(
+                    f'[Agent Controller {self.id}] Delegate agent has finished execution'
+                )
+                # retrieve delegate result
                 outputs = self.delegate.state.outputs if self.delegate.state else {}
-                obs: Observation = AgentDelegateObservation(content='', outputs=outputs)
-                await self.event_stream.add_event(obs, EventSource.AGENT)
+
+                # close delegate controller: we must close the delegate controller before adding new events
+                await self.delegate.close()
+
+                # clean up delegate status
                 self.delegate = None
                 self.delegateAction = None
+
+                # update delegate result observation
+                obs: Observation = AgentDelegateObservation(outputs=outputs, content='')
+                await self.event_stream.add_event(obs, EventSource.AGENT)
             return
 
         if self.state.num_of_chars > self.max_chars:
             raise MaxCharsExceedError(self.state.num_of_chars, self.max_chars)
 
+        logger.info(
+            f'{type(self.agent).__name__} LEVEL {self.state.delegate_level} STEP {self.state.iteration}',
+            extra={'msg_type': 'STEP'},
+        )
+        if self.state.iteration >= self.state.max_iterations:
+            await self.report_error('Agent reached maximum number of iterations')
+            await self.set_agent_state_to(AgentState.ERROR)
+            return
+
         self.update_state_before_step()
         action: Action = NullAction()
         try:
@@ -335,6 +386,14 @@ def _is_stuck(self):
 
         return False
 
+    def __repr__(self):
+        return (
+            f'AgentController(id={self.id}, agent={self.agent!r}, '
+            f'event_stream={self.event_stream!r}, '
+            f'state={self.state!r}, agent_task={self.agent_task!r}, '
+            f'delegate={self.delegate!r}, _pending_action={self._pending_action!r})'
+        )
+
     def _eq_no_pid(self, obj1, obj2):
         if isinstance(obj1, CmdOutputObservation) and isinstance(
             obj2, CmdOutputObservation
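
Note on the `_step` changes above: the parent now drives its delegate one step at a time and then inspects the delegate's state, instead of relying on a return value. The sketch below is a simplified, synchronous illustration of that control flow. `ToyController`, its `events` list, and the reduced `AgentState` enum are hypothetical stand-ins, not the real `AgentController` API, and the real method is `async` and goes through the event stream.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class AgentState(Enum):
    RUNNING = 'running'
    FINISHED = 'finished'
    ERROR = 'error'


@dataclass
class ToyController:
    """Hypothetical stand-in for a controller; it only tracks state."""

    name: str
    agent_state: AgentState = AgentState.RUNNING
    outputs: dict = field(default_factory=dict)
    delegate: ToyController | None = None
    events: list = field(default_factory=list)

    def step(self) -> None:
        if self.delegate is None:
            return  # a real controller would run its own agent here
        delegate = self.delegate
        delegate.step()
        if delegate.agent_state == AgentState.ERROR:
            # propagate the error upward so a parent or the user can handle it
            self.agent_state = AgentState.ERROR
            return
        if delegate.agent_state == AgentState.FINISHED:
            # read the result first, drop the delegate, then record the observation
            outputs = delegate.outputs
            self.delegate = None
            self.events.append(('delegate_observation', outputs))


if __name__ == '__main__':
    child = ToyController('root-delegate')
    parent = ToyController('root', delegate=child)
    parent.step()  # child still running, nothing recorded
    child.agent_state = AgentState.FINISHED
    child.outputs = {'summary': 'done'}
    parent.step()
    print(parent.delegate, parent.events)
```

The ordering in the finished branch mirrors the comment in the diff: the delegate is detached before the observation is recorded, so the new event is not routed back to a controller that is already done.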
 
@@ -40,6 +40,8 @@ class State:
     agent_state: AgentState = AgentState.LOADING
     resume_state: AgentState | None = None
     metrics: Metrics = Metrics()
+    # root agent has level 0, and every delegate increases the level by one
+    delegate_level: int = 0
 
     def save_to_session(self, sid: str):
         fs = get_file_store()
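
Note on `delegate_level`: together with the `start_delegate` hunk, this field lets each controller report how deep it sits in the delegation chain. The sketch below shows how a child state could be derived from a parent state, following the arguments passed to `State(...)` in the diff. `ToyState` and `make_delegate_state` are hypothetical names used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical, trimmed-down stand-in for the State dataclass."""

    inputs: dict = field(default_factory=dict)
    iteration: int = 0
    max_iterations: int = 100
    num_of_chars: int = 0
    # root agent has level 0, and every delegate increases the level by one
    delegate_level: int = 0


def make_delegate_state(parent: ToyState, inputs: dict | None) -> ToyState:
    """Build the initial state for a delegate of `parent`."""
    return ToyState(
        inputs=inputs or {},
        iteration=0,  # the delegate starts counting its own iterations
        max_iterations=parent.max_iterations,
        num_of_chars=parent.num_of_chars,  # character usage carries over
        delegate_level=parent.delegate_level + 1,
    )


if __name__ == '__main__':
    root = ToyState(max_iterations=50, num_of_chars=1200)
    child = make_delegate_state(root, {'task': 'fix typos'})
    grandchild = make_delegate_state(child, None)
    print(child.delegate_level, grandchild.delegate_level)
```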