Skip to content

Add Prometheus Metrics Support #505

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 6, 2025

Conversation

Tigger2014
Copy link
Contributor

@Tigger2014 Tigger2014 commented May 30, 2025

Add Prometheus Metrics Support

Closes #497

This PR implements Prometheus metrics support for Mailpit.
The same metrics as exposed on the info API endpoint are exposed here.

Integrated Mode: When --enable-prometheus is set to "true", metrics are served via the existing web server at /metrics. This automatically inherits all the existing server configurations.

Separate Server Mode: When set to an address like ":9090", it runs a dedicated metrics server for users who need that separation.

The flag design is:

--enable-prometheus "true" # Uses existing web server (inherits HTTPS config)
--enable-prometheus ":9090" # Separate server on port 9090
--enable-prometheus "false" # Disabled (default)

New Configuration Option:

  • --enable-prometheus : Enable Prometheus metrics server

New Environment Variable:

  • MP_ENABLE_PROMETHEUS: Enable Prometheus metrics collection

-Prometheus metrics are disabled by default - no impact on existing installations

@axllent
Copy link
Owner

axllent commented May 30, 2025

Thanks again @Tigger2014 ! While I'm away I thought I'd ask a couple of questions to reduce later delays. Clearly you have experience with Prometheus, so I expect it'll be easy to answer. These three questions are linked, I know, but I think it's easier to break down:

  1. Is it normal to have Prometheus listening only on HTTP, not HTTPS (or even provide support for HTTPS)?
  2. Why would we not reuse the existing HTTP(S) server (web UI/ API, ie: default 1025)?
  3. I realise that in some (or maybe all) instances one would want to restrict access to an IP, which could be part of the controller rather than defining a separate port / interface.

To me it would make more sense to conditionally enable the API route to the existing http(s) via the flag, which would then also allow the HTTP and HTTPS (without having to "duplicate" flags), and then optionally limit it by op or subnet. In fact, a single flag could be sufficient with just --enable-prometheus "*" or --enable-prometheus 1.2.3.0/24 etc to enable and define who can connect.

Keen to hear your thoughts?

@Tigger2014
Copy link
Contributor Author

Tigger2014 commented May 30, 2025

Thanks for the feedback! I've updated the implementation since my initial PR description to address exactly these concerns.

The current implementation now supports both approaches:

Integrated Mode (Default): When --enable-prometheus is set to "true", metrics are served via the existing web server at /metrics. This automatically inherits all the existing server configurations.

Separate Server Mode: When set to an address like ":9090", it runs a dedicated metrics server for users who need that separation.

The flag design is:

--enable-prometheus "true"      # Uses existing web server (inherits HTTPS config)
--enable-prometheus ":9090"     # Separate server on port 9090  
--enable-prometheus ""          # Disabled (default)

This addresses your concerns:

  1. HTTPS support: Integrated mode automatically uses the existing TLS configuration
  2. Reusing existing server: The default integrated mode does exactly this
  3. Simplified configuration: Single flag with clear options

The implementation defaults to the integrated approach (reusing existing infrastructure) while maintaining the separate server option for advanced use cases where network isolation is needed.

Regarding IP restrictions - Likely not something Mailpit should handle directly. It's best left to other parts of the infrastructure (reverse proxies, firewalls, network policies) for environments where that's necessary.

It's quite common to have Prometheus exposed over HTTP without auth within Kubernetes, for example, where each node scrapes pods internally. This adjusted approach provides the best of both worlds.

Let me know your thoughts and I can update the docs/PR description accordingly!

@axllent axllent changed the base branch from develop to feature/prometheus June 2, 2025 02:54
@axllent
Copy link
Owner

axllent commented Jun 2, 2025

Hi @Tigger2014. Thanks for the updates/changes. Yes, I think we can get away with using just a single multi-purpose flag --enable-prometheus / MP_PROMETHEUS_LISTEN rather than two separate flags. It's unfortunate that there isn't a dual-purpose boolean/string option (ie: being able to do just --enable-prometheus or --enable-prometheus <string>), however I think the string --enable-prometheus true works too.

A few comments/requests:

  1. Please rebase your branch - there is a conflict with develop (I haven't looked into what exactly, but it'll be something really small, and related to your previous MR + structural changes I made)
  2. I noted an issue where Mailpit's middleWareFunc() adds conditional gzip compression, and so does prometheus.GetHandler().ServeHTTP(w, r) (if my guess is correct). I have no objection to an additional check in Mailpit's middleWareFunc() to bypass gzip if the route matches config.Webroot+"metrics" as that is already being handled (I think) in prometheus.GetHandler().ServeHTTP(w, r).
  3. I assume your current MR still contains some redundant code which needs to be removed or modified (eg: MP_PROMETHEUS_LISTEN) relating to your original proposal?
  4. It would be a good idea to handle a "false" value in --enable-prometheus false / MP_ENABLE_PROMETHEUS=false in the promethius.GetMode() method, as well as potentially the uppercase variant - if config.PrometheusListen == "" || strings.ToLower(config.PrometheusListen) == "false" {. The same for "true".

I intend to add both this feature and your other MR to v1.26.0 once ready (just so you're aware re: docs). I already reviewed, tested and merged your other MR, so that release will depend a bit on how fast we can get this one finalised, tested etc - however I do not want to put any unnecessary pressure on you, so please do not stress. I am really grateful for what you have done! Thanks so much, and I'm looking forward to your revisions.

@axllent
Copy link
Owner

axllent commented Jun 2, 2025

Also I think there may be some issues with some of the metrics (float64):

# HELP mailpit_database_size_bytes Size of the database in bytes
# TYPE mailpit_database_size_bytes gauge
mailpit_database_size_bytes 3.100672e+06
# HELP mailpit_memory_usage_bytes Memory usage in bytes
# TYPE mailpit_memory_usage_bytes gauge
mailpit_memory_usage_bytes 1.43774768e+08
...
# HELP mailpit_smtp_accepted_size_bytes_total Total size of accepted SMTP messages in bytes
# TYPE mailpit_smtp_accepted_size_bytes_total counter
mailpit_smtp_accepted_size_bytes_total 2.9468166e+07
...

@Tigger2014 Tigger2014 force-pushed the prom branch 2 times, most recently from 389c3e9 to b373b9f Compare June 2, 2025 18:19
@Tigger2014
Copy link
Contributor Author

Hi @axllent, thank you for taking the time to review my changes thoroughly.

I've gone ahead and made the suggested changes.

Regarding the float64 issue, I'm not sure that it is in fact, an issue; scientific notation should be valid.

@axllent
Copy link
Owner

axllent commented Jun 3, 2025

Thank you @Tigger2014. I'm still a bit concerned about the gauge format, as it looks to me like the output from a calculator which can't handle a large number :) , This is most likely because I'm simply inexperienced with Prometheus, so I am relying heavily on your feedback & experience here. I assume you use Prometheus and have tested this on a running Mailpit instance? Does it give meaningful values within Prometheus which can be used to monitor / chart? Assuming "yes" to the previous questions, how confident are you that this is "production ready"?

@Tigger2014
Copy link
Contributor Author

Hello @axllent ,

Prometheus is something I use more than I write, so I had to do some digging into the floats. I've spun up a quick Prometheus server and scraped Mailpit below. It appears to me that everything is working as expected.

This is an example of all the metrics as displayed in Prometheus, the /metrics were in scientific notation.
image
This is the memory usage graphed:
image

This data displays in Grafana without difficulty:
image

Simple Docker compose for testing.

networks:
  monitoring:
    driver: bridge

configs:
  prometheus_config:
    content: |
      global:
        scrape_interval: 10s
      scrape_configs:
        - job_name: 'mailpit'
          static_configs:
            - targets: ['192.168.4.67:8026']
          scrape_interval: 30s
          metrics_path: '/metrics'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    configs:
      - source: prometheus_config
        target: /etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    networks:
      - monitoring
    restart: unless-stopped

Do let me know if you have any more questions or if I've been unclear.

Many thanks

@axllent
Copy link
Owner

axllent commented Jun 3, 2025

That's very useful, and reassuring. I have no idea how those scientific notations result in seemingly accurate values, but your screenshots indicate it's working correctly so I won't argue.

I'm afraid I'm going to have to push the release out a few days in order to further test, review and put the release together (I don't like rushing out new features without being as sure as I can be that it works). I hope that's not an issue given all the hard work you've put in?

@axllent axllent merged commit 82d7bdc into axllent:feature/prometheus Jun 6, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feature: Metrics endpoint
2 participants