Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I wonder if you could make a luhn-like check that would require an additional approval step to post if it comes back positive. Something like "It looks like you may be posting a secret *****. Do you wish to continue?


If vendors agreed to a common prefix on all secret key values then it'd be easy for everyone to add checks, to everything. Something like "_SECRET88_".

Of course, then your secret key checker would need to build that string by concatenating so that it wouldn't set off itself.


More and more providers have been adding unique prefixes to their tokens and access keys which makes detection much easier. Ex, GitLab adds `glpat-` to their PAT.

A project I maintain, Gitleaks, can easily detect "unique" secrets and does a pretty good job at detecting "generic" secrets too. In this case, the generic gitleaks rule would have caught the secrets [1]. You can see the full rule definition here [2] and how the rule is constructed here [3].

[1] https://regex101.com/r/CLg9TK/1

[2] https://github.com/zricethezav/gitleaks/blob/master/config/g...

[3] https://github.com/zricethezav/gitleaks/blob/master/cmd/gene...


RFC 8959 registered the 'secret-token:' prefix / URI scheme.

https://www.rfc-editor.org/rfc/rfc8959.html


How about scanning for any string with high entropy? Might be easier to get buy-in if we don’t all have to bike-shed over what the prefix is.


That’s helpful but the token prefixes are also helpful. You might be interested in GitHub’s reasoning at https://github.blog/2021-04-05-behind-githubs-new-authentica...


Unfortunately, it's not as simple as that. Lots of secrets are "generic" (think of a DB user/password combination), meaning that you need to take into account the surrounding source code context to be able to determine if they are a "real" secret.

Here is a full explanation if you are interested: https://blog.gitguardian.com/why-detecting-generic-credentia...


I was thinking about that too, but it's actually tricky, even the example given, they use the var `accessId` but you could filter for all that, even the standard ones, but you couldn't have enough confidence in it so that if someone did post with a typo or even a random var name, they would think "Okay, no warning so must be okay".

Something like giving false confidence to the user. Not the best idea.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: