And in the age of pretty good long-distance telemetry, I'm sure that 99% of the time there's not much need for the second pilot in a big jet. It's that 1%.
I've never really thought about it, but I guess I'm a little uncomfortable with noncommercial anesthesiologists, and I would prefer that they be supervised.
For some definition of "long distances" sure. At a certain point it basically becomes more economical to charge batteries and ship them by truck than to build wires :/
log all major logical branches within code (if/for)
if "request" span multiple machine in cloud infrastructure, include request ID in all so logs can be grouped
if possible make log level dynamically controlled, so grug can turn on/off when need debug issue (many!)
if possible make log level per user, so can debug specific user issue
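For the request-ID point, here is a minimal sketch using Python's standard logging plus contextvars; the names (request_id_var, RequestIdFilter, handle_request) are made up for illustration, not anything from the original article.

```python
import contextvars
import logging
import uuid

# Made-up name for this sketch: holds the current request's ID.
request_id_var = contextvars.ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    """Stamp every record with the current request ID so an aggregator
    can group all the lines that belong to one request."""
    def filter(self, record):
        record.request_id = request_id_var.get()
        return True

logger = logging.getLogger("app")
handler = logging.StreamHandler()
handler.addFilter(RequestIdFilter())
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s [%(request_id)s] %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def handle_request(payload):
    # Generate the ID at the edge and forward it to downstream services
    # (e.g. as an X-Request-ID header) so their logs line up with ours.
    request_id_var.set(str(uuid.uuid4()))
    logger.info("received request")
    ...  # real work here
    logger.info("finished request")
```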
The only one I'll add is: If your logs are usually read in a log aggregator like Splunk or Grafana instead of in a console or text file, log as JSON objects instead of lines of text. It makes searches easier.
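A minimal sketch of what that can look like with just the standard library (libraries like python-json-logger or structlog do the same with more features; the field names here are arbitrary):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line so the aggregator can index fields
    instead of regex-parsing free text."""
    def format(self, record):
        doc = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info:
            doc["exc"] = self.formatException(record.exc_info)
        return json.dumps(doc)

root = logging.getLogger()
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
root.addHandler(handler)
root.setLevel(logging.INFO)

root.info("payment accepted")  # -> {"ts": "...", "level": "INFO", ...}
```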
Following this advice, I've seen services where you could say the main function was to produce logs, and the actual response to the user was only a small part of the traffic generated.
What we really need is smart logging: only log the full span when an error is detected, otherwise no need for it. But it's not a very well supported case.
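Python's standard library gets part of the way there: logging.handlers.MemoryHandler buffers records in memory and only passes them on when something at ERROR level (or above) shows up. A real per-request/per-span version needs one buffer per request, but the idea is the same; a minimal sketch:

```python
import logging
from logging.handlers import MemoryHandler

target = logging.StreamHandler()  # the "real" sink: file, aggregator, ...

# Hold up to 1000 records in memory and only pass them to the target when a
# record at ERROR or above arrives (it also flushes if the buffer fills up).
buffered = MemoryHandler(capacity=1000, flushLevel=logging.ERROR,
                         target=target, flushOnClose=False)

log = logging.getLogger("request")
log.addHandler(buffered)
log.setLevel(logging.DEBUG)

log.debug("step 1 ok")        # buffered, nothing written yet
log.debug("step 2 ok")        # buffered
log.error("step 3 exploded")  # flush: all three lines come out together
```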
I worked at a place that banned logging levels. Everything sent to the "log" function gets written to the logs, period. And then the rest was left to the services' maintainers to figure out.
That, combined with a real, genuine devops scheme where the people implementing the system and the people keeping it running in production were on the same team and generally the same actual people, seemed to produce some of the most excellent and usable logging I've ever seen. Without needing a whole bunch of rules to try and force everyone to (still fail to) get there.
One neat thing that I think really facilitated this was the sense of empowerment that came with having exactly one rule (logging isn't configurable) combined with one goal (keep the system up). We did decide we wanted smart logging along those lines. And we did see that existing solutions didn't support this very well. So someone wrote one. And it was so dead simple, and easy to use. The 'user manual' for new hires was basically, "Here's 50 lines of code that you should read."
I wish we could all be so lucky as to only care "when an error is detected." Logging is about creating breadcrumbs that can be searched and cross-referenced to piece together what happened when no error is detected, but the behavior is nevertheless suspect.
My colleagues love to log as little as possible, and most of the projects I’ve seen still treat logs as files instead of event streams that could have search, filtering, categorization, and automated alerting.
It’s kind of unfortunate, because, for example, there’d be pushback against logging branches in code, except as trace logs (which others wouldn’t add anyway) that are also turned off most of the time when problems actually happen. It does help a lot in personal projects, though the limited traffic there minimizes any problems that ample logging might otherwise cause.
At least it’s possible to move in the direction of adding an APM tool like GlitchTip or SkyWalking.
I've had colleagues try this. It rarely works. Logging every if ends up introducing a huge amount of overhead, both in processing power and especially in storage. You almost always end up having to filter based on some sort of log level that you then turn off by default in production.
The problem with that is that you're now required to reproduce the issue after turning on the logging, and if you already have a reproducer, why not just attach a real debugger?
The overlap of "we can reproduce" but "it has to run on the production server" ends up being practically zero.
Probably because there's no direct access to the environments, while there's less organizational resistance to toggling logging levels... assuming that the issue is even easily reproducible, rather than "Okay, this happened once and cost us $X, it was pretty horrible, we don't know exactly why it happened, but fix it before it happens again."
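Toggling doesn't have to mean a config push and a redeploy, either. A minimal sketch of one low-ceremony way to flip verbosity on a running process (assumes a single Unix process; the SIGUSR1 choice is arbitrary):

```python
import logging
import signal

log = logging.getLogger("app")

def toggle_debug(signum, frame):
    """Flip the logger between INFO and DEBUG without a redeploy."""
    new_level = logging.INFO if log.level == logging.DEBUG else logging.DEBUG
    log.setLevel(new_level)
    log.warning("log level is now %s", logging.getLevelName(new_level))

# `kill -USR1 <pid>` on the box (or via the orchestrator) toggles it.
signal.signal(signal.SIGUSR1, toggle_debug)
```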
Something I've done in the past is send some logs to BigQuery for cheap mass storage and others to Grafana for fast querying and use in live dashboards. Basically a filter rule in our logging agent to send different events to different destinations. I think with some more hacking I could get both datasources into the same Grafana frontend...
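In case it helps anyone picture it: the same split can be sketched in-process with stdlib handlers and filters (our real setup did it in the logging agent; DestFilter and the "dest" attribute are made up for this sketch, and the two handlers stand in for the BigQuery and Grafana paths):

```python
import logging

class DestFilter(logging.Filter):
    """Pass only records tagged for this destination."""
    def __init__(self, dest):
        super().__init__()
        self.dest = dest
    def filter(self, record):
        return getattr(record, "dest", "archive") == self.dest

log = logging.getLogger("app")
log.setLevel(logging.INFO)

archive = logging.FileHandler("archive.jsonl")  # stand-in for cheap bulk storage
archive.addFilter(DestFilter("archive"))
dashboard = logging.StreamHandler()             # stand-in for the fast-query path
dashboard.addFilter(DestFilter("dashboard"))
log.addHandler(archive)
log.addHandler(dashboard)

log.info("raw event for the archive")                       # default: archive
log.info("p99 latency spike", extra={"dest": "dashboard"})  # fast path
```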
The problem you'll run into, if you haven't already, is that you're now just programming against the log stream. The log stream has become part of the operational interface of the application. Changes to logs, and the setup around them, must now be managed with the same care as all the code in production.
The log filter is all of a sudden part of the application, but managed somewhere else. Everybody is now scared of touching the log lines because who knows what filters have been configured. You suddenly have to debug your log setup, and who logs the decisions about the logs that were filtered?
We already have a place to put that logic. It's the application.
It's not a question of number of machines, it's a question of how much load any individual program can serve. Requests per second, per process. If that's O(1k) or less, then sure, do whatever you want, it's trivial load.
"customer" is hard to describe since the buyers are very large organizations. Think multinationals, entities which have air forces and space programs, etc.
If anyone is curious to learn more, look up "profile-guided optimization", which observes the running program and feeds that information back into the compiler.
I remember the veggie burgers they're talking about and they weren't black bean patties. The one I remember had potato with peas in it... god, it was delicious
"[ sub by sk cn2 ]"
or
"Anyways, thanks for watching! Please subscribe and like! Thanks for watching! Bye!"
or
"This is the end of the video. Thank you for watching. If you enjoyed this video, please subscribe to the channel. Thank you."