> Configure alert thresholds for monitoring metrics (warning alert, critical alert, emergency alert). The system triggers alerts based on threshold configuration and collected metric data.
7
+
:::tip
8
+
Alarm thresholds are the core function of `HertzBeat`. Users configure the trigger conditions of an alarm through threshold rules.
9
+
Both real-time and scheduled thresholds are supported. A real-time threshold can trigger an alarm directly when monitoring data is collected; a scheduled threshold uses PromQL and other expressions to evaluate whether to trigger an alarm over a specified time period.
10
+
Thresholds can be configured through the visual page or through more flexible expression rules. You can configure the trigger count, alarm level, notification template, associated monitors, and so on.
### 1. Setting Labels for Monitoring Services (Optional)
15
+
## Real-time Threshold
12
16
13
-
If you need to categorize alerts, you can set labels for the monitored targets. For example: If you have multiple Linux systems to monitor, and each system has different monitoring metrics, such as: Server A has available memory greater than 1G, Server B has available memory greater than 2G, then you can set labels for Server A and Server B respectively, and then configure alerts based on these labels.
Select our label, here demonstrated as selecting the `linux:dev` label.
17
+
> Real-time threshold means that the alarm is triggered directly when the monitoring data is collected, which is suitable for scenarios with high real-time requirements.
Configure the threshold. For example: select the SSL certificate metric object and configure the alarm expression to trigger when the metric `expired` is `true`, that is, `equals(expired,"true")`. Then set the alarm level, notification template, and so on.
The above image explains the configuration details:
27
+
Configuration item details:
41
28
29
+
- **Rule Name**: A unique name that identifies this threshold rule.
42
30
- **Metric Object**: Select the monitoring metric object for which we need to configure the threshold. For example: Under website monitoring type -> under the summary metric set -> responseTime metric.
43
-
- **Threshold Rule**: Use this expression to calculate whether to trigger the threshold. Expression variables and operators are provided on the page for reference. For example: Set an alert to trigger if response time is greater than 50, the expression would be `responseTime > 50`. For detailed help on threshold expressions, see [Threshold Expression Help](alert_threshold_expr).
31
+
- **Threshold Rule**: Configure the alarm trigger rule for specific metrics, through either the graphical interface or an expression. For expression environment variables and operators, see the prompts on the page. For detailed help on threshold expressions, see [Threshold Expression Help](alert_threshold_expr).
32
+
- **Associated Monitors**: Apply this threshold rule to specified monitoring objects (direct binding and label association are both supported). If not configured, the rule applies to all monitoring objects that match this threshold type.
44
33
- **Alert Level**: The alert level triggered by the threshold, from low to high: warning, critical, emergency.
45
34
- **Trigger Count**: Set how many times the threshold must be triggered before the alert is actually triggered.
46
35
- **Notification Template**: The template for the notification message sent after the alert is triggered. Template variables are provided on the page. For example: `${app}.${metrics}.${metric} metric value is ${responseTime}, which is greater than 50 triggering the alert`.
47
36
- **Bind Label**: Select the label we need to apply. If no label is selected, it will apply to all services corresponding to the set metric object.
48
-
- **Apply Globally**: Set whether this threshold applies globally to all such metrics, default is no. After adding a threshold, it needs to be associated with the monitoring object for the threshold to take effect.
49
-
- **Recovery Notification**: Whether to send a recovery notification after the alert is triggered, default is not to send.
37
+
- **Bind Annotation**: Add annotation information to this threshold rule (the annotation content supports environment variables). When an alarm is generated, the annotation is rendered and attached to the alarm.
50
38
- **Enable Alert**: Enable or disable this alert threshold configuration.
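
The trigger count and notification template items can be illustrated with a small sketch. This is not HertzBeat's implementation: the function names `should_alert` and `render_notification` are hypothetical, the sketch assumes that the trigger count means consecutive threshold hits, and it uses Python's `string.Template` only because it happens to share the `${name}` placeholder syntax shown in the example template.

```python
from string import Template


def should_alert(samples, threshold=50, trigger_count=3):
    """Return True once `trigger_count` consecutive samples exceed `threshold`.

    Assumption: a sample at or below the threshold resets the counter.
    """
    consecutive = 0
    for value in samples:
        consecutive = consecutive + 1 if value > threshold else 0
        if consecutive >= trigger_count:
            return True
    return False


def render_notification(template, variables):
    """Fill ${name} placeholders; unknown placeholders are left untouched."""
    return Template(template).safe_substitute(variables)


message = render_notification(
    "${app}.${metrics}.${metric} metric value is ${responseTime}, "
    "which is greater than 50 triggering the alert",
    {
        "app": "website",
        "metrics": "summary",
        "metric": "responseTime",
        "responseTime": 72,
    },
)
print(message)
```

With a trigger count greater than one, a single slow response does not raise an alert; only a sustained run of slow responses does.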
51
39
52
-
**The threshold alert configuration is complete, and alerts that have been successfully triggered can be viewed in the [Alert Center].**
53
-
**If you need to send alert notifications via email, WeChat, DingTalk, or Feishu, you can configure it in [Alert Notifications].**
40
+
**The threshold alert configuration is complete, and alerts that have been successfully triggered can be viewed in the [Alarm Center].**
41
+
**If you need to send alert notifications via email, WeChat, DingTalk, or Feishu, you can configure it in [Notification].**
54
42
55
-
For other issues, you can provide feedback through the community chat group or issue tracker!
> Before we discuss the customizable monitoring capabilities of HertzBeat mentioned at the beginning, let's introduce the different monitoring templates of HertzBeat. It is this monitoring template design that makes the advanced features described later possible.
42
33
@@ -91,7 +82,7 @@ Do you believe that users can just write a monitoring template on the UI page, c
91
82
* And More Your Custom Template.
92
83
* Notified Support `Discord` `Slack` `Telegram` `Email` `Dingtalk` `WeChat` `FeiShu` `Webhook` `SMS` `ServerChan`.
93
84
94
-
### Powerful Customization
85
+
### Customization
95
86
96
87
> From the previous introduction of **Monitoring Templates**, it is clear that `HertzBeat` has powerful customization features.
97
88
> Each monitor type is treated as a monitor template, whether it is built-in or user-defined. You can easily add, modify and delete indicators by modifying the monitoring template.
@@ -156,7 +147,7 @@ In an isolated network where multiple networks are not connected, we need to dep
156
147
157
148
---
158
149
159
-
## Quickly Start
150
+
## 🥐 Experience Now
160
151
161
152
Just run a single command in a Docker environment: `docker run -d -p 1157:1157 -p 1158:1158 --name hertzbeat apache/hertzbeat`
* The global overview page shows the distribution of current monitoring categories. Users can see the current monitoring types and quantities, and click through to the corresponding monitoring type for maintenance and management.
173
164
* Show the status of currently registered collector clusters, including collector on-line status, monitoring tasks, startup time, IP address, name and so on.
174
-
* Show the list of recent alarm messages, alarm level distribution and alarm processing rate.
165
+
* Show the list of recent alarm messages, alarm level distribution, etc.
175
166
176
167

177
168
@@ -222,7 +213,7 @@ Built-in support for monitoring types include:
222
213
223
214

224
215
225
-
### Add and Modify Surveillance
216
+
### New Monitor
226
217
227
218
* You can add or modify monitoring instances of a specific monitoring type, configure the IP, port and other parameters of the monitoring on the other end, set the collection period, collection task scheduling method, support detecting availability in advance, etc. The monitoring instances on the page are defined by the corresponding monitoring templates.
228
219
* The monitoring parameters configured on the page are defined by the monitoring template of the corresponding monitoring type, and users can modify the configuration parameters on the page by modifying the monitoring template.
@@ -235,7 +226,7 @@ Built-in support for monitoring types include:
235
226
* The monitoring data detail page shows the basic parameter information of the current monitoring, and the monitoring indicator data information.
236
227
* Monitor Real-time Data Report displays the real-time values of all the currently monitored indicators in the form of a list of small cards, and users can configure alarm threshold rules based on the real-time values for reference.
237
228
* Monitor Historical Data Report displays the historical values of the currently monitored metrics in the form of trend charts, supports querying hourly, daily and monthly historical data, and supports configuring the page refresh time.
238
-
* ⚠️ Note that the monitoring history charts require an external time-series database for full functionality. Supported time-series databases: IoTDB, TDengine, InfluxDB, GreptimeDB.
229
+
* ⚠️ Note that the monitoring history charts require an external time-series database for full functionality.
239
230
240
231

241
232
@@ -248,24 +239,34 @@ Built-in support for monitoring types include:
248
239
249
240

250
241
251
-
### Threshold Rules
242
+
### Alarm Threshold
252
243
253
-
* Threshold rules can be configured for monitoring the availability status, and alerts can be issued when the value of a particular metric exceeds the expected range.
254
-
* There are three levels of alerts: notification alerts, critical alerts, and emergency alerts.
255
-
* Threshold rules support visual page configuration or expression rule configuration for more flexibility.
256
-
* It supports configuring the number of triggers, alarm levels, notification templates, associated with a specific monitor and so on.
244
+
* Alarm thresholds are the core function of `HertzBeat`. Users configure the trigger conditions of an alarm through threshold rules.
245
+
* Both real-time and scheduled thresholds are supported. A real-time threshold can trigger an alarm directly when monitoring data is collected; a scheduled threshold uses PromQL and other expressions to evaluate whether to trigger an alarm over a specified time period.
246
+
* Thresholds can be configured through the visual page or through more flexible expression rules. You can configure the trigger count, alarm level, notification template, associated monitors, and so on.
257
247
258
248

259
249
250
+

251
+
252
+
### Alarm Integration
253
+
254
+
* Integration with third-party monitoring systems such as `Prometheus`, `WebHook`, `Skywalking`, `AlertManager`, etc. to receive alarm messages from these systems and perform alarm processing.
255
+
260
256

261
257
262
-
### Alarm Convergence
258
+
### Alarm Grouping
263
259
264
-
* When an alarm is triggered by a threshold rule, it enters alarm convergence, which deduplicates and converges repeated alarm messages within a specific time period according to the rules, so that a flood of repetitive alarms does not numb the receiver.
265
-
* Alarm convergence rules support an effective time period for duplicate alarms, label matching, and alarm level matching filters.
260
+
* Group convergence merges alarms that share the specified group labels. It deduplicates and converges identical repeated alarms within a time period.
261
+
* When a threshold rule triggers an alarm or an external alarm is reported, it enters grouping convergence for alarm grouping and deduplication, which avoids alarm storms caused by a large number of alarm messages.
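
The grouping and deduplication described above can be sketched roughly as follows. This is an illustration only, not HertzBeat's code: the function `group_alarms`, the alarm dictionary shape (`labels`, `content`, `timestamp`), and the 300-second default window are all assumptions made for the example.

```python
from collections import defaultdict


def group_alarms(alarms, group_labels, window_seconds=300):
    """Group alarms by the values of `group_labels` and drop repeats.

    Each alarm is a dict with 'labels' (dict), 'content' (str) and
    'timestamp' (seconds). An alarm counts as a repeat when an alarm with
    identical labels and content was kept less than `window_seconds` ago.
    """
    groups = defaultdict(list)
    last_kept = {}
    for alarm in sorted(alarms, key=lambda a: a["timestamp"]):
        labels = alarm["labels"]
        group_key = tuple(labels.get(name, "") for name in group_labels)
        fingerprint = (tuple(sorted(labels.items())), alarm["content"])
        previous = last_kept.get(fingerprint)
        if previous is not None and alarm["timestamp"] - previous < window_seconds:
            continue  # repeated alarm inside the window: converge it away
        last_kept[fingerprint] = alarm["timestamp"]
        groups[group_key].append(alarm)
    return dict(groups)


alarms = [
    {"labels": {"instance": "a", "severity": "critical"}, "content": "cpu high", "timestamp": 0},
    {"labels": {"instance": "a", "severity": "critical"}, "content": "cpu high", "timestamp": 60},
    {"labels": {"instance": "b", "severity": "warning"}, "content": "disk full", "timestamp": 10},
    {"labels": {"instance": "a", "severity": "critical"}, "content": "cpu high", "timestamp": 400},
]
grouped = group_alarms(alarms, group_labels=["instance"])
```

Grouping by `instance` here means one notification per server can carry all of that server's distinct alarms, while the repeat at 60 seconds is dropped.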
266
262
267
263

268
264
265
+
### Alarm Inhibition
266
+
267
+
* Alarm inhibition configures suppression relationships between alarms. For example, high-level alarms suppress low-level alarms under the same instance.
268
+
* When an alarm occurs, it can suppress the occurrence of other alarms. For example, when a server crashes, it can suppress all alarms on that server.
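
As a rough sketch of the first example above (high-level alarms suppressing low-level alarms on the same instance), the following uses a hypothetical `inhibit` function and an assumed alarm shape with `instance` and `level` keys. Real inhibition rules are configurable; this hard-codes a single rule for illustration.

```python
# Alert levels from low to high, as used by the threshold configuration.
SEVERITY_RANK = {"warning": 1, "critical": 2, "emergency": 3}


def inhibit(alarms):
    """Keep only the highest-level alarms for each instance.

    Each alarm is a dict with 'instance' and 'level' keys (assumed shape).
    """
    highest = {}
    for alarm in alarms:
        rank = SEVERITY_RANK[alarm["level"]]
        instance = alarm["instance"]
        highest[instance] = max(highest.get(instance, 0), rank)
    return [
        alarm
        for alarm in alarms
        if SEVERITY_RANK[alarm["level"]] == highest[alarm["instance"]]
    ]


remaining = inhibit(
    [
        {"instance": "server-a", "level": "warning"},
        {"instance": "server-a", "level": "emergency"},
        {"instance": "server-b", "level": "warning"},
    ]
)
```

The warning on `server-a` is suppressed by the emergency on the same instance, while the warning on `server-b` is unaffected.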
269
+
269
270

270
271
271
272
### Alarm Silence
@@ -274,8 +275,6 @@ Built-in support for monitoring types include:
274
275
* Typical scenarios: during system maintenance, users do not need known alarms to be sent; users may want to receive alarm messages only on weekdays; or users need to avoid disturbances at night.
275
276
* Alarm silence rules support one-time or periodic time periods, as well as label matching and alarm level matching.
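
A periodic silence rule with label matching might be evaluated along these lines. This is a minimal sketch under stated assumptions: `is_silenced` and the rule fields (`match_labels`, `days`, `start`, `end`) are hypothetical names, and the weekday check applies to the day on which the alarm fired.

```python
from datetime import datetime, time


def is_silenced(alarm_labels, fired_at, rule):
    """Return True if an alarm fired at `fired_at` falls under `rule`.

    `rule` is a dict with 'match_labels' (dict), 'days' (set of weekday
    numbers, Monday = 0), and 'start' / 'end' (datetime.time).
    """
    # Every label in the rule must match the alarm's labels.
    if any(alarm_labels.get(k) != v for k, v in rule["match_labels"].items()):
        return False
    if fired_at.weekday() not in rule["days"]:
        return False
    start, end = rule["start"], rule["end"]
    now = fired_at.time()
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


night_rule = {
    "match_labels": {"env": "dev"},
    "days": {0, 1, 2, 3, 4, 5, 6},
    "start": time(22, 0),
    "end": time(6, 0),
}
silenced = is_silenced({"env": "dev"}, datetime(2024, 1, 1, 23, 30), night_rule)
```

A one-time silence period could reuse the same label check and compare `fired_at` against a fixed start and end `datetime` instead.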
276
277
277
-

278
-
279
278

280
279
281
280
### Message Notification
@@ -292,17 +291,38 @@ Built-in support for monitoring types include:
292
291
293
292

294
293
295
-
### Monitoring Templates
294
+

295
+
296
+
### Monitoring Template
296
297
297
298
* HertzBeat makes `Http, Jmx, Ssh, Snmp, Jdbc, Prometheus` and other protocols configurable so that you can customize the metrics you want to collect using these protocols by simply configuring the monitoring template `YML` in your browser. Would you believe that you can instantly adapt a new monitoring type such as `K8s` or `Docker` just by configuring it?
298
299
* All our built-in monitoring types (mysql, website, jvm, k8s) are also mapped to corresponding monitoring templates, so you can add and modify monitoring templates to customize your monitoring functions.
299
300
300
301

301
302
303
+
### Collector Cluster
304
+
305
+
* Users can configure collector clusters to achieve distributed collection of large-scale monitoring tasks.
306
+
* The collector cluster supports multi-node deployment, automatic load balancing, automatic failover, etc.
307
+
* Supports unified management of multiple isolated networks and cloud-edge collaboration.
308
+
309
+

310
+
311
+
### Status Page
312
+
313
+
* Based on HertzBeat, quickly build an external status page for your own product and easily convey the real-time status of your product service to users. For example, the service status page provided by GitHub <https://www.githubstatus.com>.
314
+
* Support synchronization between the status of the status page component and the monitoring status, as well as the fault event maintenance management mechanism, etc. Improve your transparency, professionalism, and user trust, and reduce communication costs.
315
+
316
+

317
+
318
+

319
+
302
320
---
303
321
304
-
**There's so much more to discover. Have Fun!**
322
+
**There is much more to explore. Have fun!**
home/docs/start/update-1.6.0.md (+6 -2: 6 additions & 2 deletions)
@@ -6,8 +6,12 @@ sidebar_label: Update to 1.6.0 guide
6
6
7
7
## HertzBeat 1.6.0 Upgrade Guide
8
8
9
-
**Note: This guide is applicable for upgrading from 1.5.0 to 1.6.0 to version 1.6.0.**
10
-
**If you are using an older version, it is recommended to reinstall using the export function, or upgrade to 1.5.0 and then follow this guide to 1.6.0.**
9
+
:::note
10
+
This guide is applicable for upgrading from 1.5.0 to version 1.6.0.
11
+
If you are using an older version, it is recommended to reinstall using the export function, or upgrade to 1.5.0 and then follow this guide to 1.6.0.
12
+
:::
13
+
14
+
Follow the [HertzBeat New Version Upgrade](upgrade)
This guide is applicable for upgrading from 1.6.x to version 1.7.0.
11
+
If you are using an older version, it is recommended to reinstall using the export function, or upgrade to 1.6.0 and then follow this guide to 1.7.0.
12
+
:::
13
+
14
+
Follow the [HertzBeat New Version Upgrade](upgrade)
15
+
16
+
## Installation Upgrade
17
+
18
+
### Upgrade Database
19
+
20
+
In 1.7.0, `label` replaces `tag`. In some environments, you need to drop or clear the table `hzb_tag_monitor_bind` in the database.
21
+
22
+
```sql
-- Clear the legacy tag binding table; labels replace tags in 1.7.0.
DELETE FROM hzb_tag_monitor_bind;
```
25
+
26
+
### Upgrade Alarm Threshold
27
+
28
+
In 1.7.0, the alarm threshold has been redesigned and now includes the Real-Time Threshold and the Scheduled Threshold.
29
+
You need to manually reconfigure the alarm thresholds and alarm groups.
30
+
31
+
:::tip
32
+
There are no longer any default built-in threshold rules, such as the previous availability threshold.
33
+
So if you find that there is no alarm after the monitoring is down, you need to configure the corresponding availability threshold yourself.
34
+
:::
35
+
36
+
## Upgrade via Export and Import
37
+
38
+
If you do not want to go through the tedious script upgrade method mentioned above, you can directly export and import the monitoring tasks and threshold information from the old environment.
39
+
40
+
- Deploy a new environment with the latest version.
41
+
- Export the monitoring tasks and threshold information from the old environment on the page