
Custom Anomaly Detection Model with ML Commons – Real-time Inference in Ingest Pipelines? #3973

Open
@Yash-Patil-2004

Description

Summary

I am developing multiple machine learning models, served behind a Flask-based API, that detect various log anomalies, such as:

  • Unusual status codes
  • Spikes in error rates
  • Database-related error categorization

The Flask app exposes a /predict endpoint, which accepts log data and returns predictions. I want to integrate this setup into OpenSearch to enable real-time anomaly detection.
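
A minimal sketch of such a service follows. The payload shape ({"logs": [...]} in, {"predictions": [...]} out) and the scoring rule are placeholder assumptions, not the real models:

```python
# Minimal sketch of the described Flask /predict service.
# The request/response shape and scoring logic are placeholders.
from flask import Flask, request, jsonify

app = Flask(__name__)

def score(log):
    # Placeholder model: flag server errors (5xx status codes) as anomalies.
    return 1 if int(log.get("status", 200)) >= 500 else 0

@app.route("/predict", methods=["POST"])
def predict():
    logs = request.get_json(force=True).get("logs", [])
    return jsonify({"predictions": [score(log) for log in logs]})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```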


Intended Architecture

I aim to:

  1. Create an OpenSearch ingest pipeline that:

    • Sends incoming log data to the /predict endpoint of my external model.
    • Receives predictions (anomalies), and
    • Routes anomaly logs to a separate index for visualization and dashboarding.
  2. Reduce infrastructure costs by enabling real-time anomaly detection during ingestion, rather than batch processing.
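
As a sketch of step 1, the pipeline body might look like the following, assuming a model registered through ML Commons (the model_id is a placeholder) and assuming the ml_inference processor's input_map/output_map field names match the model's interface; routing to a separate index is done here by conditionally setting the _index metadata field:

```python
# Hedged sketch of the intended ingest pipeline body
# (PUT _ingest/pipeline/log-anomaly-pipeline).
# model_id and all mapped field names are assumptions.
pipeline_body = {
    "description": "Enrich logs with anomaly predictions, reroute anomalies",
    "processors": [
        {
            "ml_inference": {
                "model_id": "<registered-model-id>",
                # model input field name -> document field
                "input_map": [{"logs": "message"}],
                # new document field <- model output field
                "output_map": [{"is_anomaly": "predictions"}],
            }
        },
        {
            # Route flagged documents to a dedicated anomaly index
            "set": {
                "if": "ctx.is_anomaly == 1",
                "field": "_index",
                "value": "log-anomalies",
            }
        },
    ],
}
```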


Current Issue

While exploring ML Commons and the ml_inference ingest processor, I noticed:

  • It supports specific types of models: text embedding models, sparse encoding, cross-encoders, and question-answering.
  • It appears to only work with registered ML Commons models, not arbitrary external APIs like my Flask /predict.
  • There is no clear support for invoking a custom anomaly detection model hosted externally in real-time from within an ingest pipeline.
  • Even after allowing private IPs in the cluster settings, I still hit errors with input/output mapping in the ingest pipeline and with reaching the endpoint itself.
[screenshot of the errors]
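
For what it's worth, ML Commons remote connectors can target arbitrary HTTP endpoints, gated by the plugins.ml_commons.trusted_connector_endpoints_regex setting (and plugins.ml_commons.connector.private_ip_enabled for private addresses). A hedged sketch of a connector body for the Flask endpoint follows; the URL and request-body template are assumptions:

```python
# Sketch of an ML Commons connector body for the external Flask endpoint
# (POST /_plugins/_ml/connectors/_create). URL and body template are
# assumptions; the cluster must allow this endpoint via
# plugins.ml_commons.trusted_connector_endpoints_regex.
import json

connector_body = {
    "name": "flask-anomaly-detector",
    "description": "Remote connector to a custom Flask /predict endpoint",
    "version": 1,
    "protocol": "http",
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "http://model-host:5000/predict",
            "headers": {"Content-Type": "application/json"},
            # ${parameters.*} placeholders are substituted at inference time
            "request_body": '{ "logs": ${parameters.logs} }',
        }
    ],
}

print(json.dumps(connector_body, indent=2))
```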

Questions

  1. Is it currently possible to:

    • Register and invoke a custom anomaly detection model via ML Commons (hosted externally),
    • And use it within an ingest pipeline to enrich incoming documents in real time?
  2. Which architecture is more suitable and cost-effective for this use case:

    • Option A: Deploying the model on OpenSearch and using ML Commons / ingest pipeline,
    • Option B: Hosting the model externally and using a scheduled job to:
      • Fetch logs from the past 5 minutes,
      • Run inference,
      • Index anomalies into a different index.
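
Option B's three steps can be sketched as below; the endpoint URL, index names, and the /predict response shape ({"predictions": [0/1, ...]}) are assumptions, and client/session stand for an OpenSearch client and an HTTP session:

```python
# Sketch of one iteration of the Option B scheduled job:
# fetch recent logs -> call the external model -> index anomalies.
# URL, index names, and response shape are assumptions.

def last_window_query(minutes=5):
    # Range query for log documents from the past N minutes.
    return {
        "query": {
            "range": {"@timestamp": {"gte": f"now-{minutes}m", "lte": "now"}}
        }
    }

def select_anomalies(logs, predictions):
    # Keep only logs the model flagged as anomalous (prediction == 1).
    return [log for log, pred in zip(logs, predictions) if pred == 1]

def run_once(client, session, predict_url="http://model-host:5000/predict"):
    hits = client.search(index="logs-*", body=last_window_query())["hits"]["hits"]
    logs = [h["_source"] for h in hits]
    if not logs:
        return 0
    preds = session.post(predict_url, json={"logs": logs}).json()["predictions"]
    anomalies = select_anomalies(logs, preds)
    for doc in anomalies:
        client.index(index="log-anomalies", body=doc)
    return len(anomalies)
```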

My Goal

To build a robust, low-latency anomaly detection pipeline integrated with OpenSearch for log analysis and dashboarding, while keeping infrastructure cost and maintenance complexity low.

Any guidance on the supported approach or roadmap plans for ML Commons and ingest pipelines would be highly appreciated.

Thanks in advance!
