The comments so far seem focused on taking a cheap shot, but as somebody working on using AI to help people with hard, long-term tasks, I think it's a valuable piece of writing.
- It's short and to the point
- It's actionable in the short term (make sure the tasks per session aren't too difficult) and useful for researchers in the long term
- It's informative on how these models work, informed by some of the best in the business
- It gives us a specific vector to look at, clearly defined ("coherence", or, more fun, "hot mess")
Sometimes, when stressed, I have used several models to verify each other's work. They usually find problems, too!
This is very useful for things that take time to verify. We have CI stuff that takes 2-3 hours to run, and I hate when it fails because of a syntax error.
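For the syntax-error case specifically, a cheap pre-flight check before kicking off the long CI run can catch it in seconds. A minimal sketch, assuming the codebase is Python (the comment doesn't say which language is involved); `find_syntax_errors` is a hypothetical helper name, not part of any tool mentioned here:

```python
import ast
from pathlib import Path


def find_syntax_errors(root):
    """Parse every .py file under `root` and collect syntax errors.

    Returns a list of (path, line number, message) tuples.
    Nothing is executed; the files are only parsed.
    """
    errors = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            errors.append((str(path), exc.lineno, exc.msg))
    return errors
```

Calling `find_syntax_errors(".")` from a pre-push hook and refusing to push when the list is non-empty would keep this class of failure out of the 2-3 hour run. A model-based review can then focus on the problems a parser can't see.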
There’s not a useful argument here. The article is using current AI to extrapolate future AI failure modes. If future AI models solve the ‘incoherence’ problem, that leaves bias as a primary source of failure (according to the author these are the only two possible failure modes apparently).
If future AI only manages to solve the variance problem, then it will have problems related to bias.
If future AI only manages to solve the bias problem, then it will have problems related to variance.
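For what it's worth, if bias and variance are read in the textbook sense (my assumption; the article may define them differently), the two add up to the total squared error, so removing one leaves the other intact. A small sketch with made-up numbers; `decompose_error` is a hypothetical helper:

```python
from statistics import fmean, pvariance


def decompose_error(predictions, target):
    """Split mean squared error around `target` into bias^2 and variance."""
    mean_pred = fmean(predictions)
    bias_sq = (mean_pred - target) ** 2               # systematic offset
    variance = pvariance(predictions, mu=mean_pred)   # run-to-run scatter
    mse = fmean([(p - target) ** 2 for p in predictions])
    return bias_sq, variance, mse


# Made-up data: two runs answering a question whose true value is 0.
bias_sq, variance, mse = decompose_error([1.0, 3.0], target=0.0)
print(bias_sq, variance, mse)  # 4.0 1.0 5.0
```

Here the bias term (4.0) and the variance term (1.0) sum to the total error (5.0), which is the sense in which solving only one of them still leaves a failure source.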
If problem X is solved, then the system that solved it won't have problem X. That's not very informative without some idea of how likely it is that X can or will be solved, and current AI is a better prior than "something will happen".
Obviously novel problems require novel solutions, but the vast majority of software solutions are remixes of existing methods. I don’t know your work so I may be wrong in this specific case, but there are a vanishingly small number of people pushing forward the envelope of human knowledge on a day-to-day basis.
My company (and others in the same sector) depends on certain proprietary enterprise software that has literally no publicly available API documentation online, anywhere.
Even under NDA, they are willing to provide barely anything that qualifies as documentation, whether for lock-in reasons or out of laziness (it's an ERP-ish sort of thing narrowly designed for the specific sector, and more or less a duopoly).
The difficulty in developing solutions is 95% understanding business processes/requirements. I suspect this kind of thing becomes more common the further you get from a "software company" into specific industry niches.
The reason for this is Rivian and Tesla bet big on software-defined platforms… i.e. every piece of hardware talks to a small number of central computers instead of many independent systems. This gives them a huge leg up in developing software that can actually take all the available input and use it to control all aspects of the vehicle.
Downside is all the buttons are on a screen. But I’ve grudgingly decided it’s worth it for software upgrades.
The current Gen 1s will start beeping at you if they can’t see the lines. If you don’t take over quickly, they will start slowing down and beeping very insistently.
Not only is Rivian betting on an integrated platform being important for their own cars long term, they’ve also essentially sold that portion of their business to VW. They are investing in the software platform for a lot more cars than just the Rivian-branded ones.
A lot of commenters are focusing on the legalities and likelihood of backpay, which is relevant but I tend to agree with you… it’ll get paid because it’s in the interest of both parties to pay their employees what they’re owed.
We’re staring down the barrel of two missed paychecks though. If you’re living paycheck to paycheck, you’re getting desperate. If you’re living with about 1 month of emergency buffer… that buffer is one paycheck away from gone. It’s a cash flow issue.
And the Republicans could just vote to change the rules of the Senate.
The out-of-power party gets a little veto power here. The Republicans know the day will come when they want that, so they won’t change the rules even though they have the power to do so (theoretically… there are Republicans who will never compromise on this). Unfortunately they can’t get on the same page with their lame duck leader.
My interpretation of the parent comment was that they were loading specific curl calls into context so that Claude could properly exercise the endpoints after making changes.
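If that reading is right, the setup might look something like the sketch below: render the curl calls as text and paste them into the context. The helper names, the localhost URLs and the endpoints are all made up for illustration; only the curl flags (`-sS`, `-X`, `-H`, `-d`) are real.

```python
import shlex


def curl_command(method, url, json_body=None):
    """Render one curl invocation as a shell-safe string."""
    parts = ["curl", "-sS", "-X", method, url]
    if json_body is not None:
        parts += ["-H", "Content-Type: application/json", "-d", json_body]
    return shlex.join(parts)


def build_context(calls):
    """Join the rendered calls into a text block to load into the context."""
    lines = ["Use these curl calls to exercise the endpoints after each change:"]
    lines += [curl_command(*call) for call in calls]
    return "\n".join(lines)


# Hypothetical endpoints for a service running locally.
CALLS = [
    ("GET", "http://localhost:8000/health"),
    ("POST", "http://localhost:8000/items", '{"name": "x"}'),
]
context = build_context(CALLS)
```

Keeping the calls as data and rendering them with `shlex.join` means the quoting stays correct when a request body contains spaces or quotes, so whatever runs the commands gets them verbatim.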