How to Turn Your Homelab Monitoring Setup Into a Paid Alerting Service for Small Businesses
You have Grafana dashboards, Prometheus exporters, and Alertmanager configured to ping your phone at 2am when your NAS runs out of disk space. That same stack, with some hardening and a bit of packaging, is exactly what a small business owner would pay real money for. They just don't have the time, skills, or staff to build it themselves.
The gap between a homelab hobby and a sellable product is smaller than you think. This article walks you through how to bridge it.
What You'll Learn
- How to assess which parts of your homelab monitoring stack are actually sellable
- How to package your setup so it works for multiple clients without turning into a maintenance nightmare
- What SMBs actually want to be alerted about (it's not what you'd monitor for yourself)
- How to price and deliver a managed alerting service
- How to handle the business side: contracts, SLAs, and support boundaries
Why Small Businesses Are an Underserved Market
Enterprise monitoring is a solved problem; it's just solved expensively. Tools like Datadog, PagerDuty, and New Relic are excellent, but their pricing models assume you have a dedicated ops team and a five-figure monthly budget. A plumbing company with ten employees doesn't have that.
What a small business owner actually needs is simple: tell me when something is broken before my customers notice. They want to know when their website is down, when their server is running hot, when their backup failed, or when disk space is about to run out. They don't need distributed tracing or percentile latency graphs. They need a phone call or a text.
This is where you come in. You already know how to build that. The work now is making it repeatable and trustworthy enough to charge for.
Audit Your Homelab Stack First
Before you pitch anyone, take stock of what you actually have. Most homelabs are held together with good intentions and ad-hoc config. That's fine for personal use, but a paying client is another matter.
Go through your setup and ask these questions for each component:
- Is this running in a way I could replicate in under an hour for a new client?
- What happens if this service crashes? Does it restart automatically?
- Is any sensitive config (passwords, API keys) stored in plaintext on disk?
- Can I update this without taking everything else down?
The components you'll likely rely on are Prometheus for metrics scraping, Alertmanager for routing alerts, Grafana for dashboards, and one or more exporters (Node Exporter for Linux hosts, Blackbox Exporter for HTTP/ICMP checks, and SNMP Exporter for network gear). If you don't have Blackbox Exporter running yet, add it: uptime and HTTP response checks are the most immediately valuable thing for an SMB client.
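If Blackbox Exporter is new to you, the scrape job below is a minimal sketch of an HTTP check. It assumes the exporter runs next to Prometheus on 127.0.0.1:9115 (its default port) and that its config defines an http_2xx module, as the exporter's example config does. The URL https://www.example.com is a stand-in for the client's site.

```yaml
# prometheus.yml (fragment): probe a website through Blackbox Exporter.
# Assumptions: exporter at 127.0.0.1:9115 with an "http_2xx" module;
# https://www.example.com is a placeholder for the client's site.
scrape_configs:
  - job_name: 'blackbox-http'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://www.example.com
    relabel_configs:
      # Pass the listed URL to the exporter as the ?target= query parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label on the resulting series.
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual scrape to the Blackbox Exporter itself.
      - target_label: __address__
        replacement: 127.0.0.1:9115
```

Each probe produces a probe_success series (1 for up, 0 for down) labeled with the URL, which is what an uptime alert would watch.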
Designing a Multi-Tenant Architecture
The biggest architectural mistake you can make early is setting up one big Prometheus instance that scrapes all your clients in a single flat namespace. When client A's server is next to client B's server in the same config file, you're one misconfiguration away from a data leak and a support nightmare.
Isolate each client
The cleanest approach is to run a separate Prometheus instance per client, either as a Docker container or a systemd service. Use a naming convention like prometheus-clientname and store each client's config in its own directory under something like /etc/monitoring/clients/clientname/.
# Directory structure example
/etc/monitoring/
  clients/
    acme-plumbing/
      prometheus.yml
      alert.rules.yml
      alertmanager.yml
    riverside-cafe/
      prometheus.yml
      alert.rules.yml
      alertmanager.yml

Grafana handles multi-tenancy reasonably well through its Organizations feature. Create a separate Organization for each client, add their Prometheus instance as a data source within that org, and they'll never see another client's data if you ever give them dashboard access.
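If you go the Docker route, one Compose file per client keeps the isolation visible. The sketch below assumes the official prom/prometheus and prom/alertmanager images, the directory layout shown above, and a host port (9091) picked arbitrarily for this client. The service and volume names are illustrative.

```yaml
# docker-compose.yml for one client (acme-plumbing). A sketch, not a
# hardened deployment. Assumptions: official images, configs mounted from
# /etc/monitoring/clients/acme-plumbing/, host port 9091 chosen arbitrarily.
services:
  prometheus-acme-plumbing:
    image: prom/prometheus
    restart: unless-stopped   # come back automatically after a crash or reboot
    volumes:
      - /etc/monitoring/clients/acme-plumbing/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - /etc/monitoring/clients/acme-plumbing/alert.rules.yml:/etc/prometheus/alert.rules.yml:ro
      - prometheus-data:/prometheus
    ports:
      # Bind to localhost only; use a different host port for each client.
      - "127.0.0.1:9091:9090"

  alertmanager-acme-plumbing:
    image: prom/alertmanager
    restart: unless-stopped
    volumes:
      - /etc/monitoring/clients/acme-plumbing/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro

volumes:
  prometheus-data:
```

Onboarding a new client then becomes copying a directory, changing the names and the host port, and starting another Compose project.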
Use a VPN for remote scraping
You'll need to scrape metrics from the client's infrastructure. The cleanest way to do this without poking holes in their firewall is to run a lightweight VPN. WireGuard is the right choice here: it's fast, simple to configure, and widely supported. Install a WireGuard peer on a small VM or Raspberry Pi at the client's site, and have it dial back to your monitoring server. Prometheus then scrapes exporters through the tunnel.
# On the client-site peer, /etc/wireguard/wg0.conf
[Interface]
PrivateKey = <client-private-key>
Address = 10.100.1.2/24

[Peer]
PublicKey = <your-server-public-key>
Endpoint = your.monitoring.server:51820
AllowedIPs = 10.100.1.0/24
PersistentKeepalive = 25

This keeps the client's internal network private and gives you a stable, encrypted path to their exporters. It also means you're not asking them to open inbound ports, which removes a common objection from their IT-adjacent person.
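On the Prometheus side, scraping through the tunnel is just a matter of pointing at the peer's tunnel address. The fragment below is a sketch that assumes Node Exporter runs on the client-site peer itself on its default port 9100. The client label is a hypothetical convention, not something Prometheus requires.

```yaml
# prometheus.yml (fragment) for one client: scrape Node Exporter over WireGuard.
# Assumptions: the client-site peer is 10.100.1.2 (from the WireGuard config)
# and runs Node Exporter on port 9100; the "client" label is illustrative.
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['10.100.1.2:9100']
        labels:
          client: 'acme-plumbing'
```

Reaching exporters on other machines at the site would take extra routing through the peer, and the details depend on the client's network.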
What to Monitor for SMB Clients
Resist the urge to replicate your personal homelab monitoring setup for clients. You probably track CPU steal, ZFS ARC hit ratios, and Plex transcoding sessions. None of that is relevant to a client who runs a small e-commerce site.
Start with a standard set of checks that apply to almost every small business:
- Website uptime: HTTP check every 30 seconds with Blackbox Exporter. Alert if down for more than 2 minutes.
- SSL certificate expiry: Alert at 30 days and again at 7 days remaining. Certificate lapses are embarrassingly common and completely preventable.
- Disk usage: Alert at 80% and again at 90%. Node Exporter gives you this out of the box.
- Server CPU and memory: Alert on sustained high usage (e.g., CPU above 90% for 10 minutes), not spikes.
- Backup job success: Write a simple script that touches a file or hits a URL after a successful backup, and alert if that hasn't happened within the expected window.
- Database connectivity: A simple connection check every minute. If the database is unreachable, so is the application.
This list fits on a single page and covers the things that actually cause revenue-impacting downtime for a small business.
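Most of the checks above translate directly into Prometheus alerting rules. The file below is a sketch under a few assumptions: Blackbox Exporter and Node Exporter are already being scraped, the alert names and severity labels are made up for illustration, and backup_last_success_timestamp_seconds is a hypothetical metric your backup script would publish (for example through Node Exporter's textfile collector). The 26-hour backup window assumes a nightly job. The 7-day certificate rule, the 90% disk rule, and the memory rule follow the same pattern with a different threshold or expression. The database check is left out because it depends on which database exporter you use.

```yaml
# alert.rules.yml (sketch). Thresholds follow the checklist; alert names,
# severity labels, and the backup metric are assumptions.
groups:
  - name: smb-baseline
    rules:
      - alert: WebsiteDown
        expr: probe_success == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          description: 'The website {{ $labels.instance }} has been unreachable for 2 minutes.'

      - alert: CertificateExpiresIn30Days
        expr: probe_ssl_earliest_cert_expiry - time() < 30 * 86400
        labels:
          severity: warning
        annotations:
          description: 'The SSL certificate for {{ $labels.instance }} expires in less than 30 days.'

      - alert: DiskUsageAbove80
        expr: >-
          (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
             / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          description: 'A disk on {{ $labels.instance }} is more than 80% full.'

      - alert: CpuBusy
        # Sustained load only: the 10-minute "for" clause filters out spikes.
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          description: 'The server {{ $labels.instance }} has been very busy for 10 minutes.'

      - alert: BackupOverdue
        # Hypothetical metric: Unix time of the last successful backup.
        expr: time() - backup_last_success_timestamp_seconds > 26 * 3600
        labels:
          severity: critical
        annotations:
          description: 'No successful backup has been recorded on {{ $labels.instance }} in over 26 hours.'
```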
Alert Routing: Getting Notifications to the Right Person
Alertmanager's routing config is where most of the delivery logic lives. For each client, you'll configure a receiver that knows how to reach the right person. Email is the baseline. SMS via a gateway like Twilio adds another layer for critical alerts. For clients who are comfortable with it, a Slack or Teams webhook works well.
route:
  receiver: 'acme-email'
  # Group by alert name so .GroupLabels.alertname is populated in the subject
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: 'acme-email'
    email_configs:
      - to: 'owner@acmeplumbing.com'
        from: 'alerts@yourdomain.com'
        smarthost: 'smtp.yourprovider.com:587'
        auth_username: 'alerts@yourdomain.com'
        auth_password: '<password>'
        # Alertmanager sets the subject through headers and the plain-text body through text
        headers:
          Subject: 'Alert: {{ .GroupLabels.alertname }}'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'

Keep the alert wording simple and non-technical. Instead of showing the owner node_filesystem_avail_bytes{mountpoint="/"} < 10737418240, tell them in plain words what is wrong and how urgent it is.
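One way to get there is to keep the technical expression in the rule's expr field and put the owner-facing wording in the description annotation, which is the field the email template above prints. The rule below is a sketch; the alert name, the severity label, the 5-minute delay, and the wording itself are illustrative assumptions.

```yaml
# alert.rules.yml (fragment): technical expression, plain-language message.
# Assumptions: alert name, severity label, "for" duration, and wording are
# illustrative. 10737418240 bytes is 10 GiB, described loosely as "10 GB".
groups:
  - name: plain-language
    rules:
      - alert: DiskAlmostFull
        expr: node_filesystem_avail_bytes{mountpoint="/"} < 10737418240
        for: 5m
        labels:
          severity: warning
        annotations:
          description: >-
            The main disk on {{ $labels.instance }} has less than 10 GB of
            free space left. Nothing is broken yet, but the server may stop
            working if the disk fills up.
```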