demurgos's comments

Here is mine: https://demurgos.net

There's not much, but I keep a few articles and games there.


The "unexpected" part is that the browser automatically fills in some headers on behalf of the user, headers that the (malicious) origin server does not have access to. For most headers it's not a problem, but cookies are more sensitive.

The core idea behind the token-based defense is to prove that the origin server had access to the value in the first place such that it could have sent it if the browser didn't add it automatically.

I tend to agree that the inclusion of cookies in cross-site requests is the wrong default. Using `SameSite` cookies fixes the problem at the root.

The general recommendation I saw is to have two cookies: one without strict same-site enforcement for read operations, which lets you gracefully handle users navigating to your site, and a second, same-site cookie for state-changing operations.
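
A minimal sketch of that two-cookie setup, as the `Set-Cookie` header values a server might send. The cookie names (`session_read`, `session_write`) and the `set_cookie` helper are made up for illustration, and using `SameSite=Lax` for the read cookie is an assumption: it is one common way to keep the cookie on top-level navigations into the site.

```rust
/// Formats a `Set-Cookie` header value (hypothetical helper, not a library API).
fn set_cookie(name: &str, value: &str, same_site: &str) -> String {
    format!(
        "{}={}; Path=/; Secure; HttpOnly; SameSite={}",
        name, value, same_site
    )
}

/// Returns the two header values: a read cookie that survives navigations
/// coming from other sites, and a write cookie that is never sent cross-site.
fn session_cookies(read_token: &str, write_token: &str) -> [String; 2] {
    [
        // Assumption: Lax for reads, so links into the site still see a session.
        set_cookie("session_read", read_token, "Lax"),
        // Strict for state-changing endpoints: omitted on every cross-site request.
        set_cookie("session_write", write_token, "Strict"),
    ]
}

fn main() {
    for header in session_cookies("r-123", "w-456").iter() {
        println!("Set-Cookie: {}", header);
    }
}
```

State-changing handlers would then only accept the `session_write` cookie, while plain page loads can rely on `session_read`.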


I feel like there's a parallel with SQL where you want to discourage manual interpolation. Taking inspiration from it may help: you may not fully solve it but there are some API ideas and patterns.

A logging framework may have the equivalent of prepared statements. You may also nudge usage where the raw string API is `log.traceRaw(String rawMessage)` while the parametrized one has the nicer naming `log.trace(Template t, param1, param2)`.


You can have 0 parameters and the template is a string...


The point of my message is that you should avoid the `log(string)` signature. Even if it's appealing, it's an easy perf trap.

There are many ideas if you look at SQL libs. In my example I used a different type but there are other solutions. Be creative.

    logger.log(new Template("foo"))
    logger.log("foo", [])
    logger.prepare("foo").log()


And none of those solve the issue.

You pass "foo" to Template. The Template will be instantiated before log ever sees it. You conveniently left out the case where the "foo" string is computed from something that actually needs computation.

Like both:

    new Template("doing X to " + thingBeingOperatedOn)

    new Template("doing " + expensiveDebugThing(thingBeingOperatedOn))

You just complicated everything to get the same class of error.

Heck even the existing good way of doing it, which is less complicated than your way, still isn't safe from it.

    logger("doing {}", expensiveDebugThing(thingBeingOperatedOn))

All your examples have the same issue, both with just string concatenation and more expensive calls. You can only get around an unknowing or lazy programmer if the compiler can be smart enough to entirely skip these (JIT or not - a JIT would need to see that these calls never amount to anything and decide to skip them after a while. Not deterministically useful of course).


Yeah, it's hard to prevent a sufficiently motivated dev from shooting themselves in the foot, but these still help.

> You conveniently left out where the Foo string is computed from something that actually need computation.

I left it out because the comment I was replying to was pointing out that some logs don't have params.

For the approach using a `Template` class, the expectation would be that the doc calls out why this class exists in the first place: to enable lazy computation. Doing string concatenation inside a template constructor should raise a few eyebrows when writing or reviewing code.

I wrote `logger.log(new Template("foo"))` in my previous comment for brevity as it's merely an internet comment and not a real framework. In real code I would not even use stringy logs but structured data attached to a unique code. But since this thread discusses performance of stringy logs, I would expect log templates to be defined as statics/constants that don't contain any runtime value. You could also integrate them with metadata such as log levels, schemas, translations, codes, etc.

Regarding args themselves, you're right that they can also be expensive to compute in the first place. You may then design the args to be passed through a callback, which lets you defer the param computation until the log is actually emitted.

A possible example would be:

    const OPERATION_TIMEOUT = new Template("the operation $operationId timed-out after $duration seconds", {level: "error", code: "E_TIMEOUT"});
    // ...
    function handler(...) {
      // ..
      logger.emit(OPERATION_TIMEOUT, () => ({operationId: "foo", duration: someExpensiveOperationToRetrieveTheDuration()}))
    }

This is still not perfect as you may need to compute some data before the log "just in case" you need it for the log. For example you may want to record the current time, do the operation. If the operation times out, you use the time recorded before the op to compute for how long it ran. If you did not time out and don't log, then getting the current system time is "wasted".

All I'm saying is that `logger.log(str)` is not the only possible API; and that splitting the definition of the log from the actual "emit" is a good pattern.
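
To make the deferral concrete, here is the same idea as a small Rust sketch. Everything in it (`Template`, `Logger`, `emit`, the `Level` enum, the `CACHE_MISS` template) is hypothetical and only mirrors the pseudocode above; it is not the API of any real logging crate. The point it shows is that the argument closure is only called once the level check has passed.

```rust
use std::cell::Cell;

#[derive(Clone, Copy, PartialEq, PartialOrd)]
enum Level {
    Debug,
    Error,
}

/// A log definition: static text plus metadata, with no runtime values inside.
struct Template {
    text: &'static str,
    level: Level,
    code: &'static str,
}

const OPERATION_TIMEOUT: Template = Template {
    text: "the operation $operationId timed out after $duration seconds",
    level: Level::Error,
    code: "E_TIMEOUT",
};

const CACHE_MISS: Template = Template {
    text: "cache miss for $key",
    level: Level::Debug,
    code: "D_CACHE_MISS",
};

struct Logger {
    min_level: Level,
}

impl Logger {
    /// Returns the rendered line, or `None` when the template is filtered out.
    /// `args` is only invoked when the log is actually emitted.
    fn emit<F>(&self, template: &Template, args: F) -> Option<String>
    where
        F: FnOnce() -> Vec<(&'static str, String)>,
    {
        if template.level < self.min_level {
            return None;
        }
        let mut line = format!("[{}] {}", template.code, template.text);
        // Naive substitution, good enough for a sketch (no escaping, and it
        // assumes no parameter name is a prefix of another).
        for (name, value) in args() {
            line = line.replace(&format!("${}", name), &value);
        }
        Some(line)
    }
}

fn main() {
    let expensive_calls = Cell::new(0);
    let logger = Logger { min_level: Level::Error };

    // Filtered out: the closure, and the expensive work in it, never runs.
    let skipped = logger.emit(&CACHE_MISS, || {
        expensive_calls.set(expensive_calls.get() + 1);
        vec![("key", String::from("user:42"))]
    });
    println!("{:?}, expensive calls: {}", skipped, expensive_calls.get());

    // Emitted: the closure runs exactly once.
    let emitted = logger.emit(&OPERATION_TIMEOUT, || {
        expensive_calls.set(expensive_calls.get() + 1);
        vec![("operationId", String::from("foo")), ("duration", String::from("30"))]
    });
    println!("{:?}, expensive calls: {}", emitted, expensive_calls.get());
}
```

Nothing here stops a caller from doing expensive work outside the closure, which is the objection raised earlier in the thread; the API only makes the cheap path the obvious one.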


Unless log() is a macro of some sort that expands to if(logEnabled){internalLog(string)} - which a good optimizer will see through and not expand the string when logging is disabled.


What are TV brands/OSes that complain the least when not connected to the internet?


My LG and Samsung TVs have never been connected to wifi. They don't complain at all.


I looked into it for work at some point as we wanted to support SVG uploads. Stripping <script> is not enough to have an inert file. Scripts can also be attached as attributes. If you want to prevent external resources it gets more complex.

The only reliable solution would be an allowlist of safe elements and attributes, but it would quickly cause compat issues unless you spend time curating the rules. I did not find an existing lib doing it at the time, and it was too much effort to maintain it ourselves.

The solution I ended up implementing was having a sandboxed Chromium instance and communicating with it through the dev tools to load the SVG and rasterize it. This allowed uploading SVG files, but it was then served as rasterized PNGs to other users.


Shouldn't the ignoring of scripting be done at the user agent level? Maybe some kind of HTTP header to allow sites to disable scripts in SVG ala CORS?


It's definitely a possible solution if you control how the files are displayed. In my case I preferred the files to be safe regardless of the mechanism used to view them (less risk of misconfiguration).


Content-Security-Policy: default-src 'none'


It is intentional, to prevent non-free projects from building on top of gcc components.

I am not familiar enough with gcc to know how it impacts out-of-tree free projects or internal development.

The decision was taken a long time ago; it may be worth revisiting.


Over the years several frontends for languages that used to be out-of-tree for years have been integrated. So both working in-tree & outside are definitely possible.


Internal means "not exposed outside some boundary". For most people, this boundary encompasses something larger than a single database, and this boundary can change.


You are talking about forward compatibility.

JS is backwards compatible: new engines support code using old features.

JS is not forward compatible: old engines don't support code using new features.

Regarding your iPad woes, the problem is not the engine but websites breaking compat with it.

The distinction matters as it means that once a website is published it will keep working. Usually, the only way to break an existing website is to publish a new version. The XSLT situation is noteworthy as it's an exception to this rule.


No it doesn't. A global flag is a no-go as it breaks modularity. A local opt-in through dedicated types or methods is being designed but it's not stable.


I believe that you are describing `Vec::with_capacity`, which lets you set the initial reserved memory on construction.

`reserve` and `reserve_exact` are used when mutating an existing vec. What you provide is not the total wanted capacity but the additional wanted capacity.

`reserve` lets you avoid intermediate allocations.

Let's say that you have a vec with 50 items already and plan to run a loop to add 100 more (so 150 in total). The initial internal capacity is most likely 64. If you just do regular `push` calls without anything else, there will be two reallocations: one from 64 to 128 and one from 128 to 256.

If you call `reserve(100)`, you'll be able to skip the intermediate 64 to 128 reallocation: it will do a single reallocation from 64 to 256 (the docs only guarantee a capacity of at least 150; how much extra is reserved is an implementation detail) and it will be able to handle the 100 pushes without any reallocation.

If you call `reserve_exact(100)`, you'll get a single reallocation from 64 to 150 capacity, and also guarantee no reallocation during the processing loop.

The difference is that `reserve_exact` is better if these 100 items were the last ones you intended to push as you get a full vec of capacity 150 and containing 150 items. However, if you intend to push more items later, maybe 100 more, then you'd need to reallocate and break the amortized cost guarantees. With `reserve`, you don't break the amortized cost if there are follow-up inserts; at the price of not being at 100% usage all the time. In the `reserve` case, the capacity of 256 would be enough and let you go from 150 to 250 items without any reallocation.

In short, a rule of thumb could be:

- If creating a vec and you know the total count, prefer `Vec::with_capacity`

- If appending a final chunk of items and then no longer adding items, prefer `Vec::reserve_exact`

- If appending a chunk of items which may not be final, prefer `Vec::reserve`
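
For anyone who wants to check this on their own toolchain, here is a small sketch that measures it. `append_with` is a helper made up for this comment, and it counts a change in `capacity()` as a reallocation, which is a proxy rather than a direct measurement. The standard library only promises a capacity of at least `len + additional` after either call, so the exact numbers printed may differ from the ones in the walkthrough above.

```rust
/// Builds a vec of `initial` items with plain `push` calls, applies the given
/// reservation strategy, then pushes `extra` more items.
/// Returns (capacity right after reserving, capacity changes while pushing).
fn append_with(
    initial: usize,
    extra: usize,
    reserve: impl Fn(&mut Vec<u32>, usize),
) -> (usize, usize) {
    let mut v: Vec<u32> = Vec::new();
    for i in 0..initial {
        v.push(i as u32);
    }

    reserve(&mut v, extra);
    let capacity_after_reserve = v.capacity();

    let mut reallocations = 0;
    for i in 0..extra {
        let before = v.capacity();
        v.push(i as u32);
        if v.capacity() != before {
            reallocations += 1;
        }
    }
    (capacity_after_reserve, reallocations)
}

fn main() {
    let (cap, n) = append_with(50, 100, |_, _| {});
    println!("no reserve:    capacity {}, {} reallocation(s) while pushing", cap, n);

    let (cap, n) = append_with(50, 100, |v, extra| v.reserve(extra));
    println!("reserve:       capacity {}, {} reallocation(s) while pushing", cap, n);

    let (cap, n) = append_with(50, 100, |v, extra| v.reserve_exact(extra));
    println!("reserve_exact: capacity {}, {} reallocation(s) while pushing", cap, n);
}
```

Both reserving variants should report zero reallocations during the loop; what differs is how much headroom is left for later pushes.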


This should be in the docs or a blog post somewhere. Very clear explanation.


That's a nice idea, thank you. I have a personal blog; I'll try to clean it up a bit and provide performance measurements so it's worth posting.

Regarding the official documentation, I went back and reread it. I agree that the docs would benefit from more discussion about when to use each method. In particular, the code examples are currently exactly the same, which is not great. Still, the most critical piece of information is there [0]:

> Prefer `reserve` if future insertions are expected.

If anyone wants to reuse my explanation above, feel free to do it; no need to credit.

[0]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.res...


It is, though not worded as nicely as the GP comment.

docs: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.res...

