This is exactly the right approach for sensitive data + LLMs. We use a similar pattern for password automation:
- Credentials never flow through LLM context
- Agent triggers actions via callbacks
- Passwords injected at the last mile, invisible to the model
The key insight: you can get all the benefits of AI agents without exposing sensitive data to the model. Client-side execution + careful context isolation makes this possible.
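A minimal sketch of that callback pattern (the placeholder format, VAULT store, and function names are illustrative, not the parent's actual implementation):

```python
# Local, in-memory secret store; never serialized into the prompt.
VAULT = {"github": "hunter2-rotated"}   # illustrative values only

PREFIX, SUFFIX = "{{SECRET:", "}}"

def resolve(placeholder: str) -> str:
    """Map the placeholder the model emitted back to the real value, locally."""
    return VAULT[placeholder[len(PREFIX):-len(SUFFIX)]]

def execute_action(action: dict, fill) -> None:
    """Callback the agent triggers; the model only ever saw the placeholder."""
    value = action["value"]
    if value.startswith(PREFIX) and value.endswith(SUFFIX):
        value = resolve(value)              # last-mile injection, outside LLM context
    fill(action["selector"], value)         # e.g. a Playwright page.fill wrapper

# execute_action({"selector": "#password", "value": "{{SECRET:github}}"}, page.fill)
```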
For anyone building AI agents that handle PII/credentials, this WASM approach is worth studying.
Thanks! The last mile injection idea is exactly how I think about it too.
I realized that for 90% of 'summarize this' or 'debug this' tasks, the LLM doesn't actually need the specific PII or sensitive values; it just needs to know that an entity exists in that position to understand the structure.
That's why I focused on the reversible mapping, so that we can re-inject the real data locally after the LLM does the heavy lifting. Cool to hear you're using a similar pattern for credentials.
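A rough sketch of the reversible-mapping idea, using emails as the example entity (the token format and regex are illustrative):

```python
import re

def pseudonymize(text: str):
    """Replace email addresses with tokens; keep a mapping so it's reversible."""
    mapping = {}

    def repl(match):
        value = match.group(0)
        for token, original in mapping.items():
            if original == value:
                return token                      # reuse the token for repeated values
        token = f"<EMAIL_{len(mapping)}>"
        mapping[token] = value
        return token

    masked = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", repl, text)
    return masked, mapping

def restore(text: str, mapping: dict) -> str:
    """Re-inject the real values locally after the LLM has done the heavy lifting."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

masked, mapping = pseudonymize("Ping alice@example.com about the failing cron job.")
# ... send `masked` to the LLM, get back e.g. "Follow up with <EMAIL_0>." ...
print(restore("Follow up with <EMAIL_0>.", mapping))
```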
Formal verification for AI is fascinating, but the real challenge is runtime verification of AI agents.
Example: AI browser agents can be exploited via prompt injection (even Google's new "User Alignment Critic" only catches 90% of attacks).
For password management, we solved this with a zero-knowledge architecture: the AI navigates websites but never sees credentials. Credentials stay in the local Keychain; the AI just clicks buttons.
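Not our actual code, but the shape is roughly this (keyring here stands in for the Keychain access layer, and the page handle and selectors are illustrative):

```python
import keyring  # reads from the macOS Keychain when run on a Mac

def change_password(page, site: str, username: str) -> None:
    """The AI decides where to click; the credential is fetched locally at fill time
    and goes straight into the browser, never into the prompt or the model's output."""
    secret = keyring.get_password(f"rotation:{site}", username)
    page.fill("#new-password", secret)        # page is a Playwright-style handle
    page.click("button[type=submit]")
```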
Formal verification would be amazing for proving these isolation guarantees. Has anyone worked on verifying AI agent sandboxes?
This is why I'm skeptical of any app claiming "super secure" without open-source verification.
The real lesson: assume every service will eventually leak something. Use unique passwords everywhere, enable 2FA, and rotate credentials after breaches.
The tedious part is the rotation. I've seen people skip it because manually changing 50+ passwords is brutal. Automation helps but needs to be done securely (local-only, zero-knowledge).
This is a great example of why AI automation needs careful oversight. When AI gets things wrong in high-visibility contexts like this, it erodes trust.
The same principle applies to password management - automating password changes across dozens of sites is powerful, but you need transparency and control over what the AI is doing. Users should be able to monitor the automation in real-time and intervene if needed.
Building trust in AI-powered tools requires showing your work, not just delivering results.
The concurrency aspect is interesting - we're building password automation and one of the pain points is that most sites have rate limiting / bot detection that gets triggered if you try to parallelize password changes too aggressively.
Sequential execution with realistic timing delays is actually necessary for our use case. But I can see how other agent applications would benefit from true concurrency.
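Concretely, our pacing looks something like this sketch (the delay range and account shape are illustrative):

```python
import asyncio
import random

async def rotate_sequentially(accounts, rotate_one):
    """One password change at a time, with jittered pauses between sites so we
    stay under rate limits and look less like a bot."""
    results = {}
    for account in accounts:
        results[account["site"]] = await rotate_one(account)
        await asyncio.sleep(random.uniform(20, 60))   # delay range is illustrative
    return results
```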
Are you handling session isolation between concurrent agents? That seems like it would be critical for avoiding state pollution.
Nice work on the human-in-the-loop approach. We're using a similar pattern for password automation - the AI handles the tedious clicking through password change flows, but the user explicitly selects which accounts to update.
The challenge we've found is balancing automation with user control. Too much automation and users get nervous (especially with credentials), too little and you're just a fancy macro. The "approve then execute" pattern seems to hit the sweet spot.
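Roughly the shape of our "approve then execute" gate, with a CLI prompt standing in for the real UI (a sketch, not the actual code):

```python
def approve_then_execute(accounts, run_agent):
    """The user explicitly opts accounts in; the agent only touches approved ones."""
    for i, acct in enumerate(accounts):
        print(f"[{i}] {acct['site']} ({acct['username']})")
    picked = input("Indexes to update (comma-separated, empty for none): ").strip()
    approved = [accounts[int(i)] for i in picked.split(",") if i.strip()] if picked else []
    for acct in approved:
        run_agent(acct)            # automation runs only after explicit approval
```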
Curious how you're handling error recovery when the agent gets stuck?
Working on The Password App (https://thepassword.app) - an AI-powered macOS desktop app that automatically rotates your passwords across websites.
The problem: most people have 100+ accounts with weak/reused passwords. Changing them manually is tedious, so nobody does it.
The solution: import a CSV from your existing password manager (1Password, LastPass, Bitwarden), select which accounts to update, and the app uses browser automation with Gemini 2.5 Flash to navigate to each site's password change page and update them in parallel. It then exports a CSV with the new passwords to import back into your manager.
Key technical choices:
- browser-use library for AI-driven browser automation (handles dynamic sites better than Selenium)
- Local-only architecture: passwords never leave your machine, no cloud sync, everything stays in memory and is cleared after use
- Electron + Python: React frontend with a Python agent for browser automation via stdio IPC
- OpenRouter for LLM access (Gemini for navigation, Grok for validation)
Security was the most important and hardest constraint. Passwords can't be logged, can't be sent to the LLM context, and can't persist on disk. We use a custom fork of browser-use to inject credentials via secure parameters that are invisible to the AI agent.
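Roughly, the placeholder mechanism looks like the sketch below, based on upstream browser-use's sensitive_data parameter (exact signatures vary by version, the model id is illustrative, and the fork's internals differ):

```python
import asyncio
import os

from browser_use import Agent
from langchain_openai import ChatOpenAI  # OpenAI-compatible client pointed at OpenRouter

async def rotate(site_url: str, username: str, new_password: str) -> None:
    llm = ChatOpenAI(
        model="google/gemini-2.5-flash",
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )
    agent = Agent(
        task=f"Open {site_url}, sign in as x_username, and change the password to x_new_password.",
        llm=llm,
        # The model only ever sees the placeholder keys; the real values are swapped in
        # locally when form fields are filled.
        sensitive_data={"x_username": username, "x_new_password": new_password},
    )
    await agent.run()
```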
Currently at v0.38 with code signing and notarization for macOS. Working on improving success rates - the main challenges are 2FA requirements and anti-bot detection (Cloudflare, reCAPTCHA).
Would love feedback from anyone in the security/password management space.
This is a great approach. We took a similar philosophy building password automation - the AI agent never sees actual passwords.
Credentials are injected through a separate secure channel while the agent only sees placeholders like "[PASSWORD]". The AI handles navigation and form detection, but sensitive data flows through an isolated path.
For anyone building AI tools that touch PII: separating the "thinking" layer from the "data" layer is essential. Your LLM should never need to see the actual sensitive values to do its job.