[doc] update doc, usecase blog and help doc #3286

Merged
merged 14 commits into from
Apr 22, 2025
4 changes: 2 additions & 2 deletions home/docs/help/ai_config.md
@@ -1,7 +1,7 @@
---
id: aiConfig
id: ai_config
title: AI QuickStart
sidebar_label: AI QuickStartr
sidebar_label: AI QuickStart
keywords: [AI]
---

4 changes: 2 additions & 2 deletions home/docs/help/alarm_silence.md
@@ -1,7 +1,7 @@
---
id: alarm_silence
title: Alert Silence
sidebar_label: Alert Silence
title: Alarm Silence
sidebar_label: Alarm Silence
keywords: [ Open Source Monitoring System, Alert Silence ]
---

56 changes: 22 additions & 34 deletions home/docs/help/alert_threshold.md
@@ -1,55 +1,43 @@
---
id: alert_threshold
title: Threshold Alert Configuration
sidebar_label: Threshold Alert Configuration
title: Alarm Threshold Configuration
sidebar_label: Alarm Threshold
---

> Configure alert thresholds for monitoring metrics (warning alert, critical alert, emergency alert). The system triggers alerts based on threshold configuration and collected metric data.
:::tip
Alarm thresholds are the core function of `HertzBeat`: users configure the conditions that trigger an alarm through threshold rules.
Both real-time and scheduled thresholds are supported. A real-time threshold triggers the alarm as soon as monitoring data is collected, while a scheduled threshold evaluates PromQL or other expressions over a specified time period to decide whether to trigger the alarm.
Rules can be configured visually on the page or, more flexibly, as expressions, and support settings such as trigger count, alarm level, notification template and associated monitors.
:::

## Operational Steps
![threshold](/img/docs/help/alert-threshold-1.png)

### 1. Setting Labels for Monitoring Services (Optional)
## Real-time Threshold

If you need to categorize alerts, you can set labels for the monitored targets. For example: If you have multiple Linux systems to monitor, and each system has different monitoring metrics, such as: Server A has available memory greater than 1G, Server B has available memory greater than 2G, then you can set labels for Server A and Server B respectively, and then configure alerts based on these labels.

#### Creating Labels

Navigate to **Label Management -> Add Label**

![threshold](/img/docs/help/alert-threshold-2-en.png)

As shown in the image above, add a new label. Here we set the label as: linux:dev (Linux used in development environment).

#### Configuring Labels

TODO Update image name
![threshold](/img/docs/help/alert-threshold-3-en.png)

As shown in the image above, click on `Add Label`.

![threshold](/img/docs/help/alert-threshold-4-en.png)

Select our label, here demonstrated as selecting the `linux:dev` label.
> Real-time threshold means that the alarm is triggered directly when the monitoring data is collected, which is suitable for scenarios with high real-time requirements.

### Creating Threshold Rules

Navigate to **[Threshold Rules] -> [Add Threshold Rule] -> [Confirm Configuration]**
> HertzBeat Page -> Alerting -> Threshold -> New Threshold -> RealTime Threshold Rule

Configure the threshold. For example: select the SSL certificate metric object and configure the alarm expression to trigger when the metric `expired` is `true`, that is, `equals(expired,"true")`; then set the alarm level, notification template and other information.

![threshold](/img/docs/help/alert-threshold-1-en.png)
![HertzBeat](/img/docs/start/ssl_5.png)

The above image explains the configuration details:
Configuration item details:

- **Rule Name**: Unique name defining this threshold rule
- **Metric Object**: Select the monitoring metric object for which we need to configure the threshold. For example: Under website monitoring type -> under the summary metric set -> responseTime metric.
- **Threshold Rule**: Use this expression to calculate whether to trigger the threshold. Expression variables and operators are provided on the page for reference. For example: Set an alert to trigger if response time is greater than 50, the expression would be `responseTime > 50`. For detailed help on threshold expressions, see [Threshold Expression Help](alert_threshold_expr).
- **Threshold Rule**: Configure the alarm trigger rules for specific indicators, support graphical interface and expression rules. For expression environment variables and operators, see the page prompts. For detailed help on threshold expressions, see [Threshold Expression Help](alert_threshold_expr).
- **Associated Monitors**: Apply this threshold rule to the specified monitoring objects (direct binding and label association are both supported). If not configured, the rule is applied to all monitoring objects that match this threshold type.
- **Alert Level**: The alert level triggered by the threshold, from low to high: warning, critical, emergency.
- **Trigger Count**: Set how many times the threshold must be triggered before the alert is actually triggered.
- **Notification Template**: The template for the notification message sent after the alert is triggered. Template variables are provided on the page. For example: `${app}.${metrics}.${metric} metric value is ${responseTime}, which is greater than 50 triggering the alert`.
- **Bind Label**: Select the label we need to apply. If no label is selected, it will apply to all services corresponding to the set metric object.
- **Apply Globally**: Set whether this threshold applies globally to all such metrics, default is no. After adding a threshold, it needs to be associated with the monitoring object for the threshold to take effect.
- **Recovery Notification**: Whether to send a recovery notification after the alert is triggered, default is not to send.
- **Bind Annotation**: Add annotation information to this threshold rule (the annotation content supports environment variables). When an alarm is generated, this annotation information will be rendered and attached to the alarm.
- **Enable Alert**: Enable or disable this alert threshold configuration.
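
The interplay of the threshold rule, trigger count and notification template can be sketched in a few lines of Python. This is only an illustration, not HertzBeat's implementation: the `ThresholdRule` class and its field names are hypothetical, the rule expression is modelled as a plain Python predicate, and it assumes that "trigger count" means consecutive matching samples. The template string reuses the `${...}` variables quoted in the list above.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable, Dict, Optional


@dataclass
class ThresholdRule:
    """Hypothetical stand-in for a real-time threshold rule (not HertzBeat's data model)."""
    name: str
    condition: Callable[[Dict[str, object]], bool]  # plays the role of the rule expression
    trigger_count: int                               # matches required before alerting
    level: str                                       # "warning", "critical" or "emergency"
    template: str                                    # notification template with ${var} slots
    _hits: int = 0

    def on_collect(self, metrics: Dict[str, object]) -> Optional[str]:
        """Evaluate one freshly collected sample; return the alert text or None."""
        if not self.condition(metrics):
            self._hits = 0  # assumption: a non-matching sample resets the streak
            return None
        self._hits += 1
        if self._hits < self.trigger_count:
            return None
        return f"[{self.level}] " + Template(self.template).safe_substitute(metrics)


rule = ThresholdRule(
    name="slow-website",
    condition=lambda m: m["responseTime"] > 50,  # mirrors `responseTime > 50`
    trigger_count=3,
    level="warning",
    template="${app}.${metrics}.${metric} metric value is ${responseTime}, "
             "which is greater than 50",
)

# Six collection cycles; only the last one completes a streak of three matches.
alerts = [
    rule.on_collect({"app": "website", "metrics": "summary",
                     "metric": "responseTime", "responseTime": value})
    for value in [80, 90, 40, 70, 75, 95]
]
print(alerts[-1])
```

With a trigger count of 3, a single slow response does not raise an alert; the value 40 in the third cycle resets the streak, so the alert is only produced on the sixth sample.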

**The threshold alert configuration is complete, and alerts that have been successfully triggered can be viewed in the [Alert Center].**
**If you need to send alert notifications via email, WeChat, DingTalk, or Feishu, you can configure it in [Alert Notifications].**
**The threshold alert configuration is complete, and alerts that have been successfully triggered can be viewed in the [Alarm Center].**
**If you need to send alert notifications via email, WeChat, DingTalk, or Feishu, you can configure it in [Notification].**

For other issues, you can provide feedback through the community chat group or issue tracker!
## Scheduled Threshold
74 changes: 47 additions & 27 deletions home/docs/introduce.md
@@ -7,15 +7,6 @@ slug: /

> A real-time monitoring system with agentless, performance cluster, prometheus-compatible, custom monitoring and status page building capabilities.

[![Discord](https://img.shields.io/badge/Chat-Discord-7289DA?logo=discord)](https://discord.gg/Fb6M73htGr)
[![Reddit](https://img.shields.io/badge/Reddit-Community-7289DA?logo=reddit)](https://www.reddit.com/r/hertzbeat/)
[![Twitter](https://img.shields.io/twitter/follow/hertzbeat1024?logo=twitter)](https://x.com/hertzbeat1024)
[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/8139/badge)](https://www.bestpractices.dev/projects/8139)
[![Docker Pulls](https://img.shields.io/docker/pulls/apache/hertzbeat?style=%20for-the-badge&logo=docker&label=DockerHub%20Download)](https://hub.docker.com/r/apache/hertzbeat)
[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/hertzbeat)](https://artifacthub.io/packages/search?repo=hertzbeat)
[![QQ](https://img.shields.io/badge/QQ-630061200-orange)](https://qm.qq.com/q/FltGGGIX2m)
[![YouTube Channel Subscribers](https://img.shields.io/youtube/channel/subscribers/UCri75zfWX0GHqJFPENEbLow?logo=youtube&label=YouTube%20Channel)](https://www.youtube.com/channel/UCri75zfWX0GHqJFPENEbLow)

**Home: [hertzbeat.apache.org](https://hertzbeat.apache.org)**

## 🎡 <font color="green">Introduction</font>
@@ -36,7 +27,7 @@ slug: /

---

### Powerful Monitoring Templates
### Powerful Monitoring Template

> Before we discuss the customizable monitoring capabilities of HertzBeat, which we mentioned at the beginning, let's introduce the different monitoring templates of HertzBeat. And it is because of this monitoring template design that the advanced features come later.

@@ -91,7 +82,7 @@ Do you believe that users can just write a monitoring template on the UI page, c
* And More Your Custom Template.
* Notified Support `Discord` `Slack` `Telegram` `Email` `Dingtalk` `WeChat` `FeiShu` `Webhook` `SMS` `ServerChan`.

### Powerful Customization
### Customization

> From the previous introduction of **Monitoring Templates**, it is clear that `HertzBeat` has powerful customization features.
> Each monitor type is considered as a monitor template, no matter it is built-in or user-defined. You can easily add, modify and delete indicators by modifying the monitoring template.
@@ -156,7 +147,7 @@ In an isolated network where multiple networks are not connected, we need to dep

---

## Quickly Start
## 🥐 Experience Now

Just run a single command in a Docker environment: `docker run -d -p 1157:1157 -p 1158:1158 --name hertzbeat apache/hertzbeat`
Browser access `http://localhost:1157` default account password `admin/hertzbeat`
@@ -171,7 +162,7 @@ Browser access `http://localhost:1157` default account password `admin/hertzbeat

* The global overview page shows the distribution of current monitoring categories, users can visualize the current monitoring types and quantities and click to jump to the corresponding monitoring types for maintenance and management.
* Show the status of currently registered collector clusters, including collector on-line status, monitoring tasks, startup time, IP address, name and so on.
* Show the list of recent alarm messages, alarm level distribution and alarm processing rate.
* Show the list of recent alarm messages, alarm level distribution etc.

![HertzBeat](/img/home/1.png)

@@ -222,7 +213,7 @@ Built-in support for monitoring types include:

![HertzBeat](/img/home/2.png)

### Add and Modify Surveillance
### New Monitor

* You can add or modify monitoring instances of a specific monitoring type, configure the IP, port and other parameters of the monitoring on the other end, set the collection period, collection task scheduling method, support detecting availability in advance, etc. The monitoring instances on the page are defined by the corresponding monitoring templates.
* The monitoring parameters configured on the page are defined by the monitoring template of the corresponding monitoring type, and users can modify the configuration parameters on the page by modifying the monitoring template.
@@ -235,7 +226,7 @@ Built-in support for monitoring types include:
* The monitoring data detail page shows the basic parameter information of the current monitoring, and the monitoring indicator data information.
* Monitor Real-time Data Report displays the real-time values of all the currently monitored indicators in the form of a list of small cards, and users can configure alarm threshold rules based on the real-time values for reference.
* Monitor Historical Data Report displays the historical values of the currently monitored metrics in the form of trend charts, supports querying hourly, daily and monthly historical data, and supports configuring the page refresh time.
* ⚠️ Note that the monitoring history charts need to be configured with an external timing database in order to get the full functionality, timing database support: IOTDB, TDengine, InfluxDB, GreptimeDB
* ⚠️ Note that the monitoring history charts need to be configured with an external timing database in order to get the full functionality.

![HertzBeat](/img/home/3.png)

@@ -248,24 +239,34 @@ Built-in support for monitoring types include:

![HertzBeat](/img/home/7.png)

### Threshold Rules
### Alarm Threshold

* Threshold rules can be configured for monitoring the availability status, and alerts can be issued when the value of a particular metric exceeds the expected range.
* There are three levels of alerts: notification alerts, critical alerts, and emergency alerts.
* Threshold rules support visual page configuration or expression rule configuration for more flexibility.
* It supports configuring the number of triggers, alarm levels, notification templates, associated with a specific monitor and so on.
* Alarm thresholds are the core function of `HertzBeat`: users configure the conditions that trigger an alarm through threshold rules.
* Both real-time and scheduled thresholds are supported. A real-time threshold triggers the alarm as soon as monitoring data is collected, while a scheduled threshold evaluates PromQL or other expressions over a specified time period to decide whether to trigger the alarm.
* Rules can be configured visually on the page or, more flexibly, as expressions, and support settings such as trigger count, alarm level, notification template and associated monitors.

![HertzBeat](/img/home/6.png)

![HertzBeat](/img/docs/start/ssl_5.png)

### Alarm Integration

* Integration with third-party monitoring systems such as `Prometheus`, `WebHook`, `Skywalking`, `AlertManager`, etc. to receive alarm messages from these systems and perform alarm processing.

![HertzBeat](/img/home/11.png)

### Alarm Convergence
### Alarm Grouping

* When the alarm is triggered by the threshold rule, it will enter into the alarm convergence, the alarm convergence will be based on the rules of the specific time period of the duplicate alarm message de-emphasis convergence, to avoid a large number of repetitive alarms lead to the receiver alarm numbness.
* Alarm convergence rules support duplicate alarm effective time period, label matching and alarm level matching filter.
* Group convergence supports merging alarms for specified group labels by grouping. It deduplicates and converges the same repeated alarms in a time period.
* When the threshold rule triggers an alarm or an external alarm is reported, it will enter the grouping convergence for alarm grouping and alarm deduplication to avoid alarm storms caused by a large number of alarm messages.
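
As a rough illustration of grouping plus deduplication, the sketch below drops repeats of the same alarm inside a time window and then buckets the survivors by the chosen group labels. It is not HertzBeat's algorithm: the alarm dictionary shape (`labels`, `content`, `ts`), the `group_alarms` function and the 60-second window are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Alarm = Dict[str, object]  # hypothetical shape: {"labels": {...}, "content": str, "ts": float}


def group_alarms(alarms: List[Alarm], group_labels: List[str],
                 window_s: float) -> Dict[Tuple[str, ...], List[Alarm]]:
    """Deduplicate identical alarms within `window_s` seconds, then group by labels."""
    groups = defaultdict(list)
    last_kept = {}  # fingerprint -> timestamp of the last alarm that was kept
    for alarm in sorted(alarms, key=lambda a: a["ts"]):
        labels = alarm["labels"]
        # Two alarms count as "the same" when labels and content are identical.
        fingerprint = (tuple(sorted(labels.items())), alarm["content"])
        previous = last_kept.get(fingerprint)
        if previous is not None and alarm["ts"] - previous < window_s:
            continue  # repeated alarm inside the window: drop it
        last_kept[fingerprint] = alarm["ts"]
        key = tuple(str(labels.get(name, "")) for name in group_labels)
        groups[key].append(alarm)
    return dict(groups)


alarms = [
    {"labels": {"instance": "10.0.0.1", "severity": "critical"}, "content": "cpu high", "ts": 0.0},
    {"labels": {"instance": "10.0.0.1", "severity": "critical"}, "content": "cpu high", "ts": 30.0},
    {"labels": {"instance": "10.0.0.1", "severity": "warning"}, "content": "disk 80%", "ts": 40.0},
    {"labels": {"instance": "10.0.0.2", "severity": "critical"}, "content": "cpu high", "ts": 50.0},
]
grouped = group_alarms(alarms, group_labels=["instance"], window_s=60)
for key, members in grouped.items():
    print(key, [a["content"] for a in members])
```

Here the second "cpu high" alarm for `10.0.0.1` is dropped as a repeat, and the remaining three alarms end up in two groups, one per instance, so a receiver would get two merged notifications instead of four messages.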

![HertzBeat](/img/home/12.png)

### Alarm Inhibition

* Alarm suppression is used to configure the suppression relationship between alarms. For example, high-level alarms suppress low-level alarms under the same instance.
* When an alarm occurs, it can suppress the occurrence of other alarms. For example, when a server crashes, it can suppress all alarms on that server.
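
The "high-level alarms suppress low-level alarms under the same instance" example can be sketched as follows. This is a minimal illustration under assumptions, not HertzBeat's inhibition engine: the `inhibit` function and the alarm fields are hypothetical, and the level ordering (warning < critical < emergency) is taken from the alert levels listed in the threshold documentation.

```python
from typing import Dict, List

# Ordering of alarm levels, from low to high.
LEVEL_RANK = {"warning": 1, "critical": 2, "emergency": 3}


def inhibit(alarms: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Per instance, keep only the alarms at the highest level currently firing."""
    highest: Dict[str, int] = {}
    for alarm in alarms:
        rank = LEVEL_RANK[alarm["level"]]
        highest[alarm["instance"]] = max(highest.get(alarm["instance"], 0), rank)
    return [a for a in alarms if LEVEL_RANK[a["level"]] == highest[a["instance"]]]


firing = [
    {"instance": "db-1", "level": "warning", "content": "slow queries"},
    {"instance": "db-1", "level": "emergency", "content": "instance down"},
    {"instance": "web-1", "level": "warning", "content": "high latency"},
]
delivered = inhibit(firing)
print([a["content"] for a in delivered])
```

The warning on `db-1` is suppressed because an emergency alarm is firing on the same instance, while the warning on `web-1` is unaffected.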

![HertzBeat](/img/home/13.png)

### Alarm Silence
@@ -274,8 +275,6 @@ Built-in support for monitoring types include:
* Typical scenarios: during system maintenance, known alarms do not need to be sent; users only want to receive alarm messages on weekdays; users want to avoid disturbances at night.
* Alarm silence rules support one-time time period or periodic time period, support label matching and alarm level matching.
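
A periodic silence rule with label matching might look like the sketch below. It is a simplified model, not HertzBeat's rule format: the `SilenceRule` class, its fields and the `env` label are hypothetical, and the weekday check is applied to the moment the alarm fires, which is a simplification for windows that cross midnight.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from typing import Dict, Set


@dataclass
class SilenceRule:
    """Hypothetical periodic silence rule: a daily time window on selected weekdays."""
    start: time
    end: time
    weekdays: Set[int]  # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    match_labels: Dict[str, str] = field(default_factory=dict)

    def silences(self, labels: Dict[str, str], when: datetime) -> bool:
        """Return True if an alarm with these labels should be muted at `when`."""
        if when.weekday() not in self.weekdays:
            return False
        now = when.time()
        if self.start <= self.end:
            in_window = self.start <= now < self.end
        else:  # window crosses midnight, e.g. 22:00 -> 06:00
            in_window = now >= self.start or now < self.end
        labels_match = all(labels.get(k) == v for k, v in self.match_labels.items())
        return in_window and labels_match


night = SilenceRule(start=time(22, 0), end=time(6, 0),
                    weekdays={0, 1, 2, 3, 4}, match_labels={"env": "dev"})

dev_alarm = {"env": "dev", "instance": "10.0.0.1"}
muted_at_night = night.silences(dev_alarm, datetime(2025, 4, 22, 23, 30))  # a Tuesday night
muted_at_noon = night.silences(dev_alarm, datetime(2025, 4, 22, 12, 0))
muted_prod = night.silences({"env": "prod"}, datetime(2025, 4, 22, 23, 30))
print(muted_at_night, muted_at_noon, muted_prod)
```

A one-time silence would replace the weekday and time-of-day checks with a single start and end `datetime`; the label matching stays the same.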

![HertzBeat](/img/home/14.png)

![HertzBeat](/img/home/15.png)

### Message Notification
@@ -292,17 +291,38 @@ Built-in support for monitoring types include:

![HertzBeat](/img/home/8.png)

### Monitoring Templates
![HertzBeat](/img/home/14.png)

### Monitoring Template

* HertzBeat makes `Http, Jmx, Ssh, Snmp, Jdbc, Prometheus` and other protocols configurable so that you can customize the metrics you want to collect using these protocols by simply configuring the monitoring template `YML` in your browser. Would you believe that you can instantly adapt a new monitoring type such as `K8s` or `Docker` just by configuring it?
* All our built-in monitoring types (mysql, website, jvm, k8s) are also mapped to corresponding monitoring templates, so you can add and modify monitoring templates to customize your monitoring functions.

![HertzBeat](/img/home/9.png)

### Collector Cluster

* Users can configure collector clusters to achieve distributed collection of large-scale monitoring tasks.
* The collector cluster supports multi-node deployment, automatic load balancing, automatic failover, etc.
* Supports unified management of multiple isolated networks, cloud-edge collaboration.

![HertzBeat](/img/home/18.png)

### Status Page

* Based on HertzBeat, quickly build an external status page for your own product and easily convey the real-time status of your product service to users. For example, the service status page provided by Github <https://www.githubstatus.com>.
* Support synchronization between the status of the status page component and the monitoring status, as well as the fault event maintenance management mechanism, etc. Improve your transparency, professionalism, and user trust, and reduce communication costs.

![HertzBeat](/img/home/19.png)

![HertzBeat](/img/home/status.png)

---

**There's so much more to discover. Have Fun!**
**More functions are welcome to be explored. Have Fun!**

---

**Github: <https://github.com/apache/hertzbeat>**

**Home: <https://hertzbeat.apache.org/>**
8 changes: 6 additions & 2 deletions home/docs/start/update-1.6.0.md
@@ -6,8 +6,12 @@ sidebar_label: Update to 1.6.0 guide

## HertzBeat 1.6.0 Upgrade Guide

**Note: This guide is applicable for upgrading from 1.5.0 to 1.6.0 to version 1.6.0.**
**If you are using an older version, it is recommended to reinstall using the export function, or upgrade to 1.5.0 and then follow this guide to 1.6.0.**
:::note
This guide is applicable for upgrading from 1.5.0 to version 1.6.0.
If you are using an older version, it is recommended to reinstall using the export function, or upgrade to 1.5.0 first and then follow this guide to upgrade to 1.6.0.
:::

Follow the [HertzBeat New Version Upgrade](upgrade)

## Binary Installation Package Upgrade

41 changes: 41 additions & 0 deletions home/docs/start/update-1.7.0.md
@@ -0,0 +1,41 @@
---
id: 1.7.0-update
title: How to update to 1.7.0
sidebar_label: Update to 1.7.0 guide
---

## HertzBeat 1.7.0 Upgrade Guide

:::note
This guide is applicable for upgrading from 1.6.x to version 1.7.0.
If you are using an older version, it is recommended to reinstall using the export function, or upgrade to 1.6.0 and then follow this guide to 1.7.0.
:::

Follow the [HertzBeat New Version Upgrade](upgrade)

## Installation Upgrade

### Upgrade Database

In 1.7.0, `label` replaces `tag`. In some environments, you need to drop the table `hzb_tag_monitor_bind` in the database, or delete all rows from it.

```sql
DELETE FROM hzb_tag_monitor_bind;
```
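
The difference between deleting the rows and dropping the table can be checked safely before touching a real deployment. The sketch below uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in; your HertzBeat database is likely a different engine, and the column names here are invented for the demonstration (only the table name comes from this guide). Backing up the database before running either statement for real is a sensible precaution.

```python
import sqlite3

# In-memory SQLite database as a stand-in; nothing here touches a real HertzBeat database.
conn = sqlite3.connect(":memory:")

# Hypothetical columns, invented for this demo; only the table name is from the guide.
conn.execute(
    "CREATE TABLE hzb_tag_monitor_bind (id INTEGER PRIMARY KEY, tag_id INTEGER, monitor_id INTEGER)"
)
conn.executemany(
    "INSERT INTO hzb_tag_monitor_bind (tag_id, monitor_id) VALUES (?, ?)",
    [(1, 100), (2, 100)],
)


def table_exists(connection: sqlite3.Connection, name: str) -> bool:
    """Check SQLite's catalog for a table with the given name."""
    row = connection.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


# Option 1: DELETE removes the rows but keeps the (now empty) table.
conn.execute("DELETE FROM hzb_tag_monitor_bind")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM hzb_tag_monitor_bind").fetchone()[0]
exists_after_delete = table_exists(conn, "hzb_tag_monitor_bind")

# Option 2: DROP removes the table itself.
conn.execute("DROP TABLE IF EXISTS hzb_tag_monitor_bind")
exists_after_drop = table_exists(conn, "hzb_tag_monitor_bind")

print(rows_after_delete, exists_after_delete, exists_after_drop)
conn.close()
```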

### Upgrade Alarm Threshold

In 1.7.0, the alarm threshold has been redesigned and now includes the Real-Time Threshold and the Scheduled Threshold.
You need to reconfigure the alarm thresholds and alarm groups manually.

:::tip
There are no default built-in threshold rules, such as the previous availability threshold.
So if you find that there is no alarm after the monitoring is down, you need to configure the corresponding availability threshold yourself.
:::

## Upgrade via Export and Import

If you do not want to go through the tedious script upgrade method mentioned above, you can directly export and import the monitoring tasks and threshold information from the old environment.

- Deploy a new environment with the latest version.
- Export the monitoring tasks and threshold information from the old environment on the page