AI Plugin doesn't work: got unexpected status: failed #1489

Open
@Keyruu

Description

I can't get the AI plugin working. I followed the steps in the documentation, but when I type @Botkube ai scan I get the following debug logs:

{"level":"debug","logger":"stdout","msg":"{\"level\":\"info\",\"msg\":\"Handling external request for source: ai-brain\",\"time\":\"2025-05-06T13:07:03Z\"}","plugin":"botkubeExtra/ai-brain","time":"2025-05-06T13:07:03Z"}
{"level":"debug","logger":"stdout","msg":"{\"level\":\"debug\",\"msg\":\"Creating a new thread\",\"time\":\"2025-05-06T13:07:03Z\"}","plugin":"botkubeExtra/ai-brain","time":"2025-05-06T13:07:03Z"}
{"level":"debug","logger":"stdout","msg":"{\"level\":\"info\",\"messageId\":\"1746536820.115239\",\"msg\":\"created a new assistant run\",\"prompt\":\"Scan the Kubernetes cluster for critical issues that could significantly impact the cluster's health, stability, or security.\\nFocus on problems that may not be immediately apparent through events or standard monitoring.\\nUse Kubescape and kubectl to
ols to scan the cluster, and then aggregate the results based on the instructions.\\nPrioritize Kubescape scan results over the kubectl tools results. Include links for Kubescape controls which you got them from Kubescape scan results.\\n\\nProvide a concise overview of the scan results, including the total number of issues found.\\nIf there were no issues found for a specific check, do not include t
hat section in the report.\\nList the Kubernetes objects directly affected by the issue.\\nMake sure that your checks are relevant to the current state of the cluster, do not include resources that no longer exist.\\n\\nSummary section needs to be at the top of the report, followed by specific checks.\\nSummary outlines what are the issues and how many of them were found, and one line sentence about
the overall cluster state based on the results.\\nUse emojis for the severity of the issues in the summary (critical/high/medium/low), and also for the headlines of the checks to distinguish them.\\nUse a separator \\\"\\\\n\\\\n---\\\\n\\\\n\\\" to split the message into TWO logical sections, no more.\\n\\nSpecific checks: \\n\\nPod Health:\\nIdentify pods in a crash-loop backoff state with a high r
estart count.\\nIdentify pods that have been OOMKilled (Out of Memory Killed) multiple times.\\nLook for pods stuck in a pending state for an extended period.\\nResource Utilization:\\nIdentify nodes or pods with critically high CPU or memory usage. By critically high we mean over 90% or more. \\nCheck for critical resource starvation issues affecting multiple pods or namespaces.\\nConfiguration:\\nL
ook for pods running with very insecure capabilities (e.g., ALL, NET_RAW, SYS_ADMIN).\\nIdentify pods using deprecated or insecure container images.\\nCheck for misconfigured network policies that could expose sensitive services.\\nNetworking:\\nIdentify pods or services experiencing significant network latency or packet loss.\\nCheck for network partitions or connectivity issues between critical com
ponents.\\nSecurity\\nUnder this section, include Security posture from Kubescape scan.\\n\\nAdditional Guidance for the LLM Agent:\\n\\nPrioritize issues that pose the most immediate threat to the cluster's stability, performance, or security.\\nSkip the check output if there are no issues found for a given check. Filter out informational issues.\\nBe as specific as possible in the descriptions. Do
not exceed 3000 characters in your response.\\nDon't show kubescape commands.\\nAt the end of the message, add \\\"Feel free to ask me to provide additional details, or help on how to resolve found issues!\\\", without a separator, in any form you like.\",\"runId\":\"run_vc1ihZuXmpIM4c9kuQU9KIO5\",\"threadId\":\"thread_awzEM9y7SQcHbuIb66zxlrZq\",\"time\":\"2025-05-06T13:07:05Z\"}","plugin":"botkubeEx
tra/ai-brain","time":"2025-05-06T13:07:05Z"}
{"level":"debug","logger":"stdout","msg":"{\"level\":\"debug\",\"messageId\":\"1746536820.115239\",\"msg\":\"retrieved assistant thread run\",\"prompt\":\"Scan the Kubernetes cluster for critical issues that could significantly impact the cluster's health, stability, or security.\\nFocus on problems that may not be immediately apparent through events or standard monitoring.\\nUse Kubescape and kubect
l tools to scan the cluster, and then aggregate the results based on the instructions.\\nPrioritize Kubescape scan results over the kubectl tools results. Include links for Kubescape controls which you got them from Kubescape scan results.\\n\\nProvide a concise overview of the scan results, including the total number of issues found.\\nIf there were no issues found for a specific check, do not inclu
de that section in the report.\\nList the Kubernetes objects directly affected by the issue.\\nMake sure that your checks are relevant to the current state of the cluster, do not include resources that no longer exist.\\n\\nSummary section needs to be at the top of the report, followed by specific checks.\\nSummary outlines what are the issues and how many of them were found, and one line sentence ab
out the overall cluster state based on the results.\\nUse emojis for the severity of the issues in the summary (critical/high/medium/low), and also for the headlines of the checks to distinguish them.\\nUse a separator \\\"\\\\n\\\\n---\\\\n\\\\n\\\" to split the message into TWO logical sections, no more.\\n\\nSpecific checks: \\n\\nPod Health:\\nIdentify pods in a crash-loop backoff state with a hi
gh restart count.\\nIdentify pods that have been OOMKilled (Out of Memory Killed) multiple times.\\nLook for pods stuck in a pending state for an extended period.\\nResource Utilization:\\nIdentify nodes or pods with critically high CPU or memory usage. By critically high we mean over 90% or more. \\nCheck for critical resource starvation issues affecting multiple pods or namespaces.\\nConfiguration:
\\nLook for pods running with very insecure capabilities (e.g., ALL, NET_RAW, SYS_ADMIN).\\nIdentify pods using deprecated or insecure container images.\\nCheck for misconfigured network policies that could expose sensitive services.\\nNetworking:\\nIdentify pods or services experiencing significant network latency or packet loss.\\nCheck for network partitions or connectivity issues between critical
 components.\\nSecurity\\nUnder this section, include Security posture from Kubescape scan.\\n\\nAdditional Guidance for the LLM Agent:\\n\\nPrioritize issues that pose the most immediate threat to the cluster's stability, performance, or security.\\nSkip the check output if there are no issues found for a given check. Filter out informational issues.\\nBe as specific as possible in the descriptions.
 Do not exceed 3000 characters in your response.\\nDon't show kubescape commands.\\nAt the end of the message, add \\\"Feel free to ask me to provide additional details, or help on how to resolve found issues!\\\", without a separator, in any form you like.\",\"runId\":\"run_vc1ihZuXmpIM4c9kuQU9KIO5\",\"runStatus\":\"in_progress\",\"threadId\":\"thread_awzEM9y7SQcHbuIb66zxlrZq\",\"time\":\"2025-05-06
T13:07:08Z\"}","plugin":"botkubeExtra/ai-brain","time":"2025-05-06T13:07:08Z"}
{"level":"debug","logger":"stdout","msg":"{\"level\":\"debug\",\"messageId\":\"1746536820.115239\",\"msg\":\"retrieved assistant thread run\",\"prompt\":\"Scan the Kubernetes cluster for critical issues that could significantly impact the cluster's health, stability, or security.\\nFocus on problems that may not be immediately apparent through events or standard monitoring.\\nUse Kubescape and kubect
l tools to scan the cluster, and then aggregate the results based on the instructions.\\nPrioritize Kubescape scan results over the kubectl tools results. Include links for Kubescape controls which you got them from Kubescape scan results.\\n\\nProvide a concise overview of the scan results, including the total number of issues found.\\nIf there were no issues found for a specific check, do not inclu
de that section in the report.\\nList the Kubernetes objects directly affected by the issue.\\nMake sure that your checks are relevant to the current state of the cluster, do not include resources that no longer exist.\\n\\nSummary section needs to be at the top of the report, followed by specific checks.\\nSummary outlines what are the issues and how many of them were found, and one line sentence ab
out the overall cluster state based on the results.\\nUse emojis for the severity of the issues in the summary (critical/high/medium/low), and also for the headlines of the checks to distinguish them.\\nUse a separator \\\"\\\\n\\\\n---\\\\n\\\\n\\\" to split the message into TWO logical sections, no more.\\n\\nSpecific checks: \\n\\nPod Health:\\nIdentify pods in a crash-loop backoff state with a hi
gh restart count.\\nIdentify pods that have been OOMKilled (Out of Memory Killed) multiple times.\\nLook for pods stuck in a pending state for an extended period.\\nResource Utilization:\\nIdentify nodes or pods with critically high CPU or memory usage. By critically high we mean over 90% or more. \\nCheck for critical resource starvation issues affecting multiple pods or namespaces.\\nConfiguration:
\\nLook for pods running with very insecure capabilities (e.g., ALL, NET_RAW, SYS_ADMIN).\\nIdentify pods using deprecated or insecure container images.\\nCheck for misconfigured network policies that could expose sensitive services.\\nNetworking:\\nIdentify pods or services experiencing significant network latency or packet loss.\\nCheck for network partitions or connectivity issues between critical
 components.\\nSecurity\\nUnder this section, include Security posture from Kubescape scan.\\n\\nAdditional Guidance for the LLM Agent:\\n\\nPrioritize issues that pose the most immediate threat to the cluster's stability, performance, or security.\\nSkip the check output if there are no issues found for a given check. Filter out informational issues.\\nBe as specific as possible in the descriptions.
 Do not exceed 3000 characters in your response.\\nDon't show kubescape commands.\\nAt the end of the message, add \\\"Feel free to ask me to provide additional details, or help on how to resolve found issues!\\\", without a separator, in any form you like.\",\"runId\":\"run_vc1ihZuXmpIM4c9kuQU9KIO5\",\"runStatus\":\"in_progress\",\"threadId\":\"thread_awzEM9y7SQcHbuIb66zxlrZq\",\"time\":\"2025-05-06
T13:07:10Z\"}","plugin":"botkubeExtra/ai-brain","time":"2025-05-06T13:07:10Z"}
{"level":"debug","logger":"stdout","msg":"{\"level\":\"debug\",\"messageId\":\"1746536820.115239\",\"msg\":\"retrieved assistant thread run\",\"prompt\":\"Scan the Kubernetes cluster for critical issues that could significantly impact the cluster's health, stability, or security.\\nFocus on problems that may not be immediately apparent through events or standard monitoring.\\nUse Kubescape and kubect
l tools to scan the cluster, and then aggregate the results based on the instructions.\\nPrioritize Kubescape scan results over the kubectl tools results. Include links for Kubescape controls which you got them from Kubescape scan results.\\n\\nProvide a concise overview of the scan results, including the total number of issues found.\\nIf there were no issues found for a specific check, do not inclu
de that section in the report.\\nList the Kubernetes objects directly affected by the issue.\\nMake sure that your checks are relevant to the current state of the cluster, do not include resources that no longer exist.\\n\\nSummary section needs to be at the top of the report, followed by specific checks.\\nSummary outlines what are the issues and how many of them were found, and one line sentence ab
out the overall cluster state based on the results.\\nUse emojis for the severity of the issues in the summary (critical/high/medium/low), and also for the headlines of the checks to distinguish them.\\nUse a separator \\\"\\\\n\\\\n---\\\\n\\\\n\\\" to split the message into TWO logical sections, no more.\\n\\nSpecific checks: \\n\\nPod Health:\\nIdentify pods in a crash-loop backoff state with a hi
gh restart count.\\nIdentify pods that have been OOMKilled (Out of Memory Killed) multiple times.\\nLook for pods stuck in a pending state for an extended period.\\nResource Utilization:\\nIdentify nodes or pods with critically high CPU or memory usage. By critically high we mean over 90% or more. \\nCheck for critical resource starvation issues affecting multiple pods or namespaces.\\nConfiguration:
\\nLook for pods running with very insecure capabilities (e.g., ALL, NET_RAW, SYS_ADMIN).\\nIdentify pods using deprecated or insecure container images.\\nCheck for misconfigured network policies that could expose sensitive services.\\nNetworking:\\nIdentify pods or services experiencing significant network latency or packet loss.\\nCheck for network partitions or connectivity issues between critical
 components.\\nSecurity\\nUnder this section, include Security posture from Kubescape scan.\\n\\nAdditional Guidance for the LLM Agent:\\n\\nPrioritize issues that pose the most immediate threat to the cluster's stability, performance, or security.\\nSkip the check output if there are no issues found for a given check. Filter out informational issues.\\nBe as specific as possible in the descriptions.
 Do not exceed 3000 characters in your response.\\nDon't show kubescape commands.\\nAt the end of the message, add \\\"Feel free to ask me to provide additional details, or help on how to resolve found issues!\\\", without a separator, in any form you like.\",\"runId\":\"run_vc1ihZuXmpIM4c9kuQU9KIO5\",\"runStatus\":\"failed\",\"threadId\":\"thread_awzEM9y7SQcHbuIb66zxlrZq\",\"time\":\"2025-05-06T13:0
7:12Z\"}","plugin":"botkubeExtra/ai-brain","time":"2025-05-06T13:07:12Z"}
{"level":"debug","logger":"stdout","msg":"{\"error\":\"got unexpected status: failed\",\"level\":\"error\",\"messageID\":\"1746536820.115239\",\"msg\":\"Failed to handle user prompt\",\"time\":\"2025-05-06T13:07:12Z\"}","plugin":"botkubeExtra/ai-brain","time":"2025-05-06T13:07:12Z"}

Slack also shows this message: I am sorry, something went wrong, please try again. :pensive:
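
Each plugin log entry above wraps a second JSON document inside its "msg" field, and most of the volume is the repeated prompt. To make the run status sequence easier to read, here is a minimal Python sketch that unwraps the inner JSON and keeps only a few fields. It assumes the raw log has one JSON object per line (the copy pasted above is hard-wrapped, so it would need to be rejoined first). The names summarize and SAMPLE_LINE are hypothetical helpers, not part of Botkube.

```python
import json
import sys

# The final log entry from this report, copied verbatim as one line.
SAMPLE_LINE = r'{"level":"debug","logger":"stdout","msg":"{\"error\":\"got unexpected status: failed\",\"level\":\"error\",\"messageID\":\"1746536820.115239\",\"msg\":\"Failed to handle user prompt\",\"time\":\"2025-05-06T13:07:12Z\"}","plugin":"botkubeExtra/ai-brain","time":"2025-05-06T13:07:12Z"}'


def summarize(log_lines):
    """Return one small dict per plugin log entry, dropping the long prompt."""
    rows = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            outer = json.loads(line)        # entry written by the Botkube agent
            inner = json.loads(outer["msg"])  # entry written by the plugin itself
        except (ValueError, KeyError, TypeError):
            continue  # skip lines that are not nested JSON
        if not isinstance(inner, dict):
            continue
        rows.append({
            "time": inner.get("time"),
            "level": inner.get("level"),
            "msg": inner.get("msg"),
            "runStatus": inner.get("runStatus"),  # absent on some entries
            "error": inner.get("error"),
        })
    return rows


if __name__ == "__main__":
    # Read logs from stdin if piped, otherwise fall back to the sample line.
    lines = sys.stdin if not sys.stdin.isatty() else [SAMPLE_LINE]
    for row in summarize(lines):
        print(row["time"], row["level"], row["msg"], row["runStatus"], row["error"])
```

Run against the full log, this should show the run moving from in_progress to failed without any more specific reason being logged by the plugin.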

Expected behavior

It should output a proper AI scan of the cluster.

Actual behavior

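The assistant run ends with runStatus "failed", the plugin logs the error "got unexpected status: failed", and Slack shows only the generic error message.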

Steps to reproduce

Install Botkube version 1.14.0.
Follow this guide to enable the AI plugin: https://docs.botkube.io/plugins/ai-assistant
Type @Botkube ai scan in Slack.

Labels

bug (Something isn't working)