I've noticed this also happens in English Whisper models with the phrases:

"[ sub by sk cn2 ]"

or

"Anyways, thanks for watching! Please subscribe and like! Thanks for watching! Bye!"

or

"This is the end of the video. Thank you for watching. If you enjoyed this video, please subscribe to the channel. Thank you."


Because they train on pirated media and/or YouTube videos. Good method, until you get slop or get caught.


> Should piloting a plane solo be outlawed?

Pretty much every civil aviation authority in the world requires two pilots on commercial flights.


Clearly my reference was to non-commercial flights. See: bush pilot in Alaska.


Many bush pilots charge cash to fly people.

And in the age of pretty good long-distance telemetry, I'm sure that 99% of the time there's not much need for the second pilot in a big jet. It's that 1%.


I've never really thought about it, but I guess I'm a little uncomfortable with noncommercial anesthesiologists and I would prefer that they are supervised.


There’s a lot more to commercial aviation than RPT (Regular Public Transport) operations.

Some of those commercial ops include single pilot IFR in Class G, into dirt airstrips at night.


Certainly something a hell of a lot simpler than X.509 - and without assumptions from the 1990s hardcoded into it


Is it really X.509 that is the big “problem”? If so, I fail to see how.


Electricity can be moved long distances over wires.


For some definition of "long distances" sure. At a certain point it basically becomes more economical to charge batteries and ship them by truck than to build wires :/


HVDC losses are roughly 3.5% per 1000 km.


I think that's referring to the cost of 1-3 million per GW-mile.


True, but the quote was actually in a paragraph about self-driving cars.


Easy to say, hard to do. The cost per mile rises non-linearly.


That's not correct. It's basically cost per distance, unless you're talking huge power over huge distances.


No it doesn't.


Best advice I ever got on logging:

    log all major logical branches within code (if/for)
    if "request" span multiple machine in cloud infrastructure, include request ID in all so logs can be grouped
    if possible make log level dynamically controlled, so grug can turn on/off when need debug issue (many!)
    if possible make log level per user, so can debug specific user issue
- https://grugbrain.dev/

The only one I'll add is: If your logs are usually read in a log aggregator like Splunk or Grafana instead of in a console or text file, log as JSON objects instead of lines of text. It makes searches easier.
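For example, a minimal stdlib-only sketch of the idea in Python (the field names are just illustrative, not any particular standard):

    import json, logging

    class JsonFormatter(logging.Formatter):
        def format(self, record):
            # one JSON object per line, so the aggregator can index the fields
            return json.dumps({
                "timestamp": self.formatTime(record),
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            })

    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logging.basicConfig(handlers=[handler], level=logging.INFO)
    logging.getLogger("auth").info("user-login")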


{ "timestamp": "2025-07-21 11:31:52", "event": "user-login", ... }, { "event": 7521, "event-type": "error-code-2025" }, { "message": Traceback (most recent call last): File "<python-input-1>", line 6, in lumberjack bright_side_of_life() ~~~~~~~~~~~~~~~~~~~^^ File "<python-input-1>", line 10, in bright_side_of_life return t[5] ~^^^ IndexError: tuple index out of range }, { "timestamp": "1753112562", "event": "user-click", ... },


For Python I like structlog which has a feature for including exception info in a readable form: https://www.structlog.org/en/stable/exceptions.html
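A minimal sketch of the pattern, assuming structlog's default configuration (its console renderer pretty-prints the attached traceback):

    import structlog

    log = structlog.get_logger()

    try:
        t = ()
        t[5]
    except Exception:
        # logs at error level with the traceback attached; the default
        # renderer prints it in a readable, multi-line form
        log.exception("tuple index out of range")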


logfmt > json IMO

For Python users, there's a "logfmter" package which is enormously more straightforward than the popular "structlog" one.
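Rough sketch of the usage, assuming the logfmter package's Logfmter formatter class (check its README for the exact output format):

    import logging
    from logfmter import Logfmter

    handler = logging.StreamHandler()
    handler.setFormatter(Logfmter())
    logging.basicConfig(handlers=[handler], level=logging.INFO)

    # extra fields come out as key=value pairs, roughly: at=INFO msg=user-login user_id=42
    logging.info("user-login", extra={"user_id": 42})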


Following this advice, I've seen services where you could say the main function was to produce logs, and the actual response to the user was only a small part of the traffic generated.

What we really need is smart logging: only log the full span when an error is detected, otherwise no need for it. But it's not a very well supported case.
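The closest off-the-shelf thing I know of in Python is the stdlib MemoryHandler, which buffers records and only flushes when something at or above a trigger level arrives; a rough sketch:

    import logging
    from logging.handlers import MemoryHandler

    target = logging.StreamHandler()
    # hold up to 1000 records in memory; flush them all when an ERROR arrives
    # (or the buffer fills up)
    buffered = MemoryHandler(capacity=1000, flushLevel=logging.ERROR, target=target)

    log = logging.getLogger("requests")
    log.setLevel(logging.DEBUG)
    log.addHandler(buffered)

    log.debug("step 1 ok")      # buffered, not yet written
    log.debug("step 2 ok")      # buffered, not yet written
    log.error("step 3 failed")  # flushes everything, with the preceding context

It buffers per process rather than per request/span, so it's only an approximation of the per-span version, but it covers the "only keep the noise when something actually went wrong" case.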


I worked at a place that banned logging levels. Everything sent to the "log" function gets written to the logs, period. And then the rest was left to the services' maintainers to figure out.

That, combined with a real, genuine devops scheme where the people implementing the system and the people keeping it running in production were on the same team and generally the same actual people, seemed to produce some of the most excellent and usable logging I've ever seen. Without needing a whole bunch of rules to try and force everyone to (still fail to) get there.

One neat thing that I think really facilitated this was the sense of empowerment that came with having exactly one rule (logging isn't configurable) combined with one goal (keep the system up). We did decide we wanted smart logging along those lines. And we did see that existing solutions didn't support this very well. So someone wrote one. And it was so dead simple, and easy to use. The 'user manual' for new hires was basically, "Here's 50 lines of code that you should read."


>when an error is detected

I wish we could all be so lucky as to only care "when an error is detected." Logging is about creating breadcrumbs that can be searched and cross-referenced to piece together what happened when no error is detected, but the behavior is nevertheless suspect.


> if possible make log level per user, so can debug specific user issue

That's clever and I'll definitely use it in the future.
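One way to sketch it with the Python stdlib: a Filter that only lets DEBUG records through for user IDs you've flagged. The names here (current_user_id, the admin-toggled set) are made up for illustration:

    import logging
    from contextvars import ContextVar

    # hypothetical: set per request by your auth middleware
    current_user_id = ContextVar("current_user_id", default=None)
    # hypothetical: toggled at runtime, e.g. from an admin endpoint
    debug_users = {"user-123"}

    class PerUserDebug(logging.Filter):
        def filter(self, record):
            if record.levelno >= logging.INFO:
                return True  # normal levels always pass
            return current_user_id.get() in debug_users

    logger = logging.getLogger("app")
    logger.setLevel(logging.DEBUG)          # let DEBUG reach the handler...
    handler = logging.StreamHandler()
    handler.addFilter(PerUserDebug())       # ...and filter it per user there
    logger.addHandler(handler)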


My colleagues love to log as little as possible and most of the projects I’ve seen still treat logs as files instead of event streams that could have some search and filtering and categorization and automated alerting.

It’s kind of unfortunate, because there’d be pushback against, for example, logging branches in code, except as trace logs (which others wouldn’t add anyway) that are also turned off most of the time when problems actually happen. It does help a lot in personal projects though, albeit the limited traffic there kinda minimizes any problems that ample logging might otherwise cause.

At least it’s possible to move in the direction of adding some APM like GlitchTip or Skywalking.


I've had colleagues try this. It rarely works. Logging every if ends up introducing a huge amount of overhead, both in terms of processing power and especially in terms of storage. You almost always end up having to filter based on some sort of log level that you then turn off by default in production.

The problem with that is that you're now required to reproduce the issue after turning on the logging, and if you already have a reproducer, why not just attach a real debugger?

The overlap of "we can reproduce" but "it has to run on the production server" ends up being practically zero.


> why not just attach a real debugger?

Probably due to having no direct access to the environments, while there's less organizational resistance to toggling logging levels... assuming that the issue is even easily reproducible, instead of "Okay, this happened once and cost us X$, it was pretty horrible, we don't know exactly why that happened but fix it before it happens again."

(not an exact case, but a pattern I've seen)


Something I've done in the past is send some logs to BigQuery for cheap mass storage and others to Grafana for fast querying and use in live dashboards. Basically a filter rule in our logging agent to send different events to different destinations. I think with some more hacking I could get both datasources into the same Grafana frontend...
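Not the actual agent config, but the shape of the routing, expressed with stdlib handlers and filters (the file/stream handlers are just stand-ins for the BigQuery and Grafana destinations):

    import logging

    class OnlyBulk(logging.Filter):
        def filter(self, record):
            # records tagged bulk=True go to the cheap mass-storage path
            return getattr(record, "bulk", False)

    class SkipBulk(logging.Filter):
        def filter(self, record):
            return not getattr(record, "bulk", False)

    bulk = logging.FileHandler("bigquery-feed.log")  # stand-in for the BigQuery sink
    live = logging.StreamHandler()                   # stand-in for the Grafana/Loki sink
    bulk.addFilter(OnlyBulk())
    live.addFilter(SkipBulk())

    log = logging.getLogger("app")
    log.setLevel(logging.INFO)
    log.addHandler(bulk)
    log.addHandler(live)

    log.info("user-click", extra={"bulk": True})  # only lands in the bulk feed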


The problem you'll run into, if you haven't already, is that you're now just programming against the log stream. The log stream has become part of the operational interface of the application. Changes to logs, and the setup around them, must now be managed with the same care as all the code in production.

The log filter is all of a sudden part of the application, but managed somewhere else. Everybody is now scared of touching the log lines because who knows what filters have been configured. You suddenly have to debug your log setup, and who logs the decisions about the logs that were filtered?

We already have a place to put that logic. It's the application.


    log all major logical branches within code (if/for)
This certainly does not work for any non-trivial amount of load...


My systems scale to somewhere around 200,000+ machines at peak and seem to do fine.


It's not a question of number of machines, it's a question of how much load any individual program can serve. Requests per second, per process. If that's O(1k) or less, then sure, do whatever you want, it's trivial load.


How many customers is that serving, 20 per machine? (Or whatever a customer might be in that case)


"customer" is hard to describe since the buyers are very large organizations. Think multinationals, entities which have air forces and space programs, etc.


> log as JSON objects instead of lines of text

Or logfmt, which is easier for humans to read, has lower overhead, and is still structured and supported in at least Grafana/Loki for parsing and queries.


Does logfmt allow nesting? I often include data structures like dicts/maps, arrays or complex objects in my JSON logs.


If anyone is curious to learn more, look up "profile-guided optimization" which observes the running program and feeds that information back into the compiler


Copilot has 2.5 Pro in the settings on github.com, along with Claude 4.


I pay for search and have convinced several of my collaborators to do so as well


I think the dev population mostly uses free search, just based on the fact no one has told me to “Kagi it” yet.


When I need a facial tissue I ask for a Kleenex even if the box says Puffs. Because who says "pass me the Puffs"?


I say “tissue” and “web search” so you’re talking to the wrong guy with that. Even though, growing up, everyone around me said Kleenex and Google.


I've been curious about that phenomenon. Why not just ask "pass me a tissue?"


Cuz all the adults around me called it a Kleenex when I was growing up, and I've internalized that the word for that kind of tissue is Kleenex.


Good old American brand loyalty.


I've got an AMD card with 24GB that I bought last year for under $1k. Meanwhile Nvidia is putting out these weird 8GB and 16GB cards in that price range.


I remember the veggie burgers they're talking about and they weren't black bean patties. The one I remember had potato with peas in it... god, it was delicious


It sounds like you're describing aloo tikki. It's really delicious and sometimes used as a vegetarian burger patty.

