
I can share my experience: monitoring and alerting should be calibrated to the number of users you serve. Early on, we run load and stress tests; if the system holds up under them, many ancillary alerts aren't necessary. Reserve alerts for truly critical events, such as outages and other severe incidents. Tune thresholds to the conditions you actually observe in production, and adjust them over time. Hope this helps.
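
To make the threshold-tuning point concrete, here is a rough sketch of one way it could be done: derive the alert threshold from recently observed values (say, request latencies) instead of hard-coding it, and recompute it periodically. The function names, the 99th-percentile default, and the headroom factor are all my own assumptions for illustration, not a recommendation for any particular system.

```python
import math


def tuned_threshold(samples, percentile=99.0, headroom=1.2):
    """Hypothetical helper: a high percentile of observed values, times a headroom factor."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest observed value that has at least
    # `percentile` percent of the samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    return ordered[rank - 1] * headroom


def should_alert(value, threshold):
    """Fire only when the current value exceeds the tuned threshold."""
    return value > threshold
```

Re-running `tuned_threshold` on a fresh window of samples (for example, weekly) is what "adjust over time" would look like here; the window length is again something you would pick for your own traffic.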

