Relatedly, I think that success in post-AI business will be about doing more with the same or more people, not the same with less. Like you say, the AI part reduces the value and increases the availability of anything automated, so competing by firing the most people to keep doing the same job is simply not the game for anyone who wants their business to survive.
You have to provide scale and quality that was out of reach before and is now table stakes (whenever now is for a given industry).
Our clandestine services will spend years getting people into the right places. I mean, at a certain point the difference between the two blurs, and the social engineering entirely overlaps.
I took the plunge. I loved my RM2, but about a year ago I fell down the stairs and landed on it.
I’ve ordered a refurbed Paper Pro and a Move.
Things that excited me about the device:
- with significant AI use I feel I need this more than ever: drafting, thinking, note taking, annotating etc.
- it looks wonderful for todos, shopping lists etc.
- the width is designed to work with the Paper Pro (and the landscape-mode experience seems solid from reviews), so I will try the dual-device setup
- I didn’t always have my RM2 with me, and I’m hoping that will now change to genuinely always.
- I learned to love the constraints. For example, I’ve discovered a love of Brandon Sanderson, Liu Cixin, Cory Doctorow, and countless other authors precisely because I went all in on DRM-free ebooks. I want to expand that to graphic novels too, hence the Paper Pro.
- I do get random inspiration, and Obsidian has been my powerhouse for on-the-go notes, but I’m hoping scrybble.ink will now let me bring reMarkable documents into Obsidian.
- it’s very unobtrusive for taking notes in conversations etc.
Sure it’s a complete indulgence, but it helps me enjoy note taking, having my library with me, etc., and I find constraints foster my creativity and exploration, so I lean into them.
I’m one of the main devs of the GitHub MCP server (opinions my own) and I’ve really enjoyed your talks on the subject. I hope we can chat in person some time.
I am personally very happy for our GH MCP server to be your example. The conversations you are inspiring are extremely important. Given that the GH MCP server can trivially be locked down to mitigate the risks of the lethal trifecta, I also hope people realise that and don’t assume it can’t be used safely.
“Unless you can prove otherwise” is definitely the load-bearing phrase above.
I will say The Lethal Trifecta is a very catchy name, but it also directly overlaps with a trifecta of utility: as with all security/privacy trade-offs, you can’t simply exclude any of the three without negatively impacting what the tool can do. Awareness of the risks is incredibly important, but not everyone should (or would) choose complete caution. An example: you’re working on a private codebase and want GH MCP to find an issue about a bug in a library you use. You risk prompt injection by doing so, but your agent cannot easily complete your task otherwise (without manual intervention). It’s not clear to me that all users should choose the manual step to avoid the potential risk; I expect the specific user context matters a lot here.
User comfort level must depend on the level of autonomy/oversight of the agentic tool in question as well as personal risk profile etc.
Here are two contrasting uses of GH MCP with wildly different risk profiles:
- GitHub Coding Agent has high autonomy (albeit with good oversight), and it natively uses the GH MCP in read-only mode, with a token scoped to the individual repo and additional mitigations. The risks are too high otherwise, and finding out after the fact is too late, so it is extremely locked down by default.
In contrast, if you install the GH MCP into Copilot agent mode in VS Code with default settings, you are technically vulnerable to the lethal trifecta as you mention, but the user can scrutinise things effectively in real time, with a human in the loop on every write action by default, etc.
I know I personally feel comfortable using a less restrictive token in the VS Code context, simply inspecting tool-call payloads etc. and keeping the human-in-the-loop setting on.
Users running full yolo mode/fully autonomous contexts should definitely heed your words and lock it down.
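For anyone heading that way, here is a rough sketch of what a locked-down VS Code setup can look like, via a .vscode/mcp.json along these lines (the GITHUB_READ_ONLY env var is from memory, so check the server README; pair it with a fine-grained PAT scoped to only the repos you need, exported in your shell rather than written into the config):

    {
      "servers": {
        "github": {
          "command": "docker",
          "args": [
            "run", "-i", "--rm",
            "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
            "-e", "GITHUB_READ_ONLY=1",
            "ghcr.io/github/github-mcp-server"
          ]
        }
      }
    }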
As it happens, I am also working (at a variety of levels in the agent/MCP stack) on some mitigations for data privacy, token scanning, etc., because we clearly all need to do better, while at the same time trying to preserve more utility than complete avoidance of the lethal trifecta can achieve.
Anyway, as I said above I found your talks super interesting and insightful and I am still reflecting on what this means for MCP.
I've been thinking a lot about this recently. I've started running Claude Code, GitHub Copilot Agent, and Codex CLI in YOLO mode (no approvals needed) because wow, it's so much more productive, but I'm very aware that doing so opens me up to very real prompt injection risks.
So I've been trying to figure out the best shape for running that. I think it comes down to running in a fresh container with source code that I don't mind being stolen (easy for me, most of my stuff is open source) and being very careful about exposing secrets to it.
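The current shape of that for me is roughly this (illustrative only: the npm package and flag are Claude Code's real ones, but SPEND_CAPPED_KEY is a stand-in for a budget-limited API key, and Claude Code may refuse the skip-permissions flag when running as root, hence the non-root user):

    # fresh container, repo mounted, no host credentials beyond one capped key
    docker run --rm -it \
      -v "$PWD":/workspace -w /workspace \
      -e ANTHROPIC_API_KEY="$SPEND_CAPPED_KEY" \
      --user node node:22 bash -c \
      'npm install --prefix "$HOME/.local" -g @anthropic-ai/claude-code \
       && "$HOME/.local/bin/claude" --dangerously-skip-permissions'

Throw the container away afterwards; anything an injected prompt managed to do dies with it, apart from whatever it pushed out over the network, which is the part that still needs thought.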
I'm comfortable sharing a secret with a spending limit: an OpenAI token that can only spend up to $25 is something I'm willing to risk handing to an unsecured coding agent.
Likewise, for Fly.io experiments I created a dedicated scratchpad "Organization" with a spending limit - that way I can have Claude Code fire up Fly Machines to test out different configuration ideas without any risk of it spending too much money or damaging my production infrastructure.
The moment code theft genuinely matters, things get a lot harder. OpenAI's hosted Codex product has a way to lock down internet access to just a specific list of domains to help avoid exfiltration, which is sensible but still somewhat risky (thanks to open proxy risks etc).
I'm taking the position that if we assume that malicious tokens can drive the coding agent to do anything, what's an environment we can run in where the damage is low enough that I don't mind the risk?
> I've started running Claude Code, GitHub Copilot Agent, and Codex CLI in YOLO mode (no approvals needed) because wow, it's so much more productive, but I'm very aware that doing so opens me up to very real prompt injection risks.
In what way do you think the risk is greater in no-approvals mode vs. when approvals are required? In other words, why do you believe that Claude Code can't bypass the approval logic?
I toggle between approvals and no-approvals based on the task that the agent is doing; sometimes I think it'll do a good job and let it run through for a while, and sometimes I think handholding will help. But I also assume that if an agent can do something malicious on-demand, then it can do the same thing on its own (and not even bother telling me) if it so desired.
Depends on how the approvals mode is implemented. If every tool call has to be approved at the harness level, there shouldn't be anything the agent can be tricked into doing that avoids that mechanism.
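To make that concrete, a minimal sketch of what I mean by a harness-level gate (hypothetical code, not any specific product's implementation):

    # The model emits tool calls as data; only the harness executes them.
    # An injected "skip the approval" instruction has nothing to act on,
    # because approval lives outside anything the model controls.
    from dataclasses import dataclass
    from typing import Any, Callable

    @dataclass
    class ToolCall:
        name: str
        args: dict[str, Any]

    TOOLS: dict[str, Callable[..., Any]] = {}   # registered tool implementations
    MUTATING = {"write_file", "run_shell"}      # assumed names for write-capable tools

    def execute(call: ToolCall, ask_user: Callable[[ToolCall], bool]) -> Any:
        # Every call funnels through here; the model cannot reach TOOLS directly.
        if call.name in MUTATING and not ask_user(call):
            return {"error": "rejected by user"}
        return TOOLS[call.name](**call.args)

The remaining attack surface is the approved calls themselves, not the approval mechanism.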
Mobile voice call compression sucks so much that about a decade ago, in order to play a live drum audition remotely, I once had to find a space with a landline and a printer that would also let me play drums loudly.
As a student I had none of those things.
In the end I concocted a successful scheme: I bought a series of phone extension cables, convinced my university bar to let me use their landline for a while, booked a drum practice room, and carefully wired the cables in a long chain to it, using duct tape to keep them safe and above door frames etc.
Then I had to join the call and, when it was sight-reading time, run to the library to print out the sheet music, run back down, and play it down the phone.
It was intense, but I got the gig and flew off and sailed around the Baltic gigging for a few months in the orchestra/show band which was awesome.
I really wish a mobile phone would have worked; it would have saved me a huge amount of stress.
Making publication easy on social media has certainly had an impact on public speech, but private platforms do not offer free speech by design.
Naomi Klein went into this in No Logo, with shopping malls replacing public spaces: there too you don’t have a right to free speech and can be evicted arbitrarily at the owner’s discretion.
You’ll find virtually all social media platforms have moderation, usage policies, and user-banning practices that restrict well beyond the legally protected free speech you are afforded in a public space (in many countries).
If all the spaces that attract the majority of people are private and have homogeneous terms of use, then free speech ends in every way except this technicality.
Full disclosure, I work for GitHub, but push protection from Secret Scanning is awesome for this because your nearly-leaked secret never makes it to the remote, and it gives you instructions on how to fix your local repo!
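For the common case where the secret only landed in your most recent commit, the local fix amounts to something like this (illustrative; follow the actual instructions in the block message):

    # 1. remove the secret from the offending file, then:
    git add path/to/file
    git commit --amend --no-edit
    git push
    # if the secret is buried deeper in history, you're into
    # git rebase -i or git filter-repo territory instead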
Why does GitHub provide no way for a repository administrator to self-service a git gc? I seem to recall reading a blog post suggesting GitHub had invested a bunch of engineering resources in making the cleanup of unreachable objects much more scalable.
And I think it answers the product question (why it’s automatic) well:
> We have used this idea at GitHub with great success, and now treat garbage collection as a hands-off process from start to finish.
GitHub also provides these docs for what to do if there is sensitive data in your repo, which are quite involved, and (given the huge amount of internal knowledge of both GitHub’s systems and git internals) I would trust their advice:
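As I understand those docs, the heavy lifting is a history rewrite, roughly like this (illustrative path; read the real docs first, and note that git filter-repo removes your 'origin' remote as a safety measure, so you re-add it before force-pushing; I believe the docs also have you contact GitHub Support to clear cached views of the data):

    pip install git-filter-repo
    git filter-repo --invert-paths --path path/to/secret-file
    git remote add origin git@github.com:OWNER/REPO.git
    git push origin --force --all
    git push origin --force --tags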
If you feel strongly that a feature you need is missing, adding your voice increases the visibility of the request. I think GitHub does offer solutions to this problem though, including eventual automatic GC.
I noticed long ago that unreferenced commits survive on GitHub for a long time, but I couldn't find a way to discover them.
I know that GitHub stores the objects of many repositories together, but they should have implemented and offered a way to gc them when they came up with that optimization.
Sure, there would still be a chance that someone already obtained the objects by the time you gc them, but that's a much smaller risk than leaving them there indefinitely (and they could provide a log of the last fetches to better assess the impact of an erroneous push).
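To make the risk concrete: anyone who has the SHA (say, from the public events API) can typically still retrieve an unreachable commit (placeholder SHA and repo below):

    git fetch origin 0123456789abcdef0123456789abcdef01234567
    git show FETCH_HEAD
    # or just load https://github.com/OWNER/REPO/commit/<sha> in a browser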
> chance that someone already obtained the objects by the time you gc them
I was under the impression that there are various 'mirror github' projects that listen to the GitHub change event API and immediately crawl some/all commits.
We turned that on about a year ago, and it totally helped reduce the silliness. The new dashboards are nice too, letting you spot which application team needs a phone call. The 'this is still active' warning is fantastic; wish all providers would give you an API to show that.
This is a useful feature but can only provide a degree of protection.
To a certain extent, your approach of treating any mistakenly pushed commit as public is laudable, but it still seems unreasonable to me not to provide an analogue to gc.
Yep, I’ve very much discovered and confirmed this behaviour independently; I was so shocked to learn that it was a known issue.
I know it’s silly but it helped me with imposter syndrome to see such a major OS that prides itself on seamless “it just works” experiences tolerate bugs like this.
That, and the fact that if you AirPlay a movie to an Apple TV, it assumes you also want random advert videos from your web browser cast over the top of it, so you can’t watch and browse.
But a lot of those bugs are in basic functionality; it’s just functionality that’s not on the path Apple wants you to take.
For years now, and across iPhone models and iOS versions, my iPhone randomizes the artwork of the MP3s in the Music app (i.e. displays the wrong album cover). I see people complaining about this in forums, but I guess if you are not using Apple Music they don't care.
I don't seem to be able to copy files from my iPhone onto a Windows desktop. I've tried several cables; the transfer fails after a few GB. I guess if you are not syncing your files through iCloud they don't care.
A bug recently corrupted the iPhone Mail app. It wouldn't sync mail anymore, and in fact displayed old mails from the wrong account (I have multiple set up). I only fixed it by deleting the accounts and recreating them. That's using ActiveSync, but I guess if you don't use an Apple email account they don't care.
Another bug, on my iPad: a USB-C DAC often fails to be recognised. I tried several DACs. I thought it was a hardware problem (a damaged USB port), except that few hardware problems are resolved by rebooting. But if you don't use Apple wireless headphones they don't care.
These are basic functions, but they sit off the main path, which happens to be the one most profitable to Apple. Apple's reputation for quality is inconsistent with my experience of their products.
It has nothing to do with profitability. The problem is that there are too many devices, protocols and applications, so it's impossible to test them all. You'll hit exactly the same type of issues on a Windows or Linux-based system. Microsoft also doesn't seem to care anymore, and Linux distributions are not even tested for desktop use.
If anything, profitability of chip makers cheaping out on USB implementations is a bigger problem.
No. They have a black hole you can toss your bug reports into, and they’ll swear up and down that past the event horizon they’ll see it, but more likely than not you will never hear about it again.
In the worst case you can copy-paste a patch into a new local file in the repo and then apply it with git from there. I’ve had somebody Slack me patches before, and it’s not a big lift.
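Roughly like this (assuming the paste survived intact; whitespace damage is the usual failure mode):

    # dry run first, then apply
    git apply --check incoming.patch
    git apply incoming.patch
    # or, if it's a full format-patch email with authorship and message:
    git am incoming.patch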
Good induction cookers, electric car chargers, heat pumps, and the like really do often require the higher voltage to function at their best, so high-voltage three-phase is also becoming standard for the energy transition.
I had to upgrade my electricity meter and switch box (even though as mentioned three-phase to the house is already standard) recently in order to accommodate planned environmental upgrades.
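To put rough numbers on it (typical European supply, assuming 230 V line-to-neutral / 400 V line-to-line and 16 A circuits):

    single-phase: 230 V x 16 A              = ~3.7 kW
    three-phase:  sqrt(3) x 400 V x 16 A    = ~11 kW

That ~11 kW figure is exactly the rating you see on common three-phase home EV chargers, and why a big induction hob plus a charger really wants three-phase behind it.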