This isn't exactly what I laid out, of course, but I think it achieves the goals I was looking for, which is the real issue. In particular, having the default string interpolation be prefixed with "STR." is enough for linters and scanners to chew on (it's easy by scanner standards to track that a STR.-interpolated string got fed into a database query), and for code reviewers to develop an instinct to look at such interpolations more closely than they need to for a DB. interpolation. An STR annotation is not technically necessary for the former, if it were simply the default, but it is a big deal for the latter. I want people to have a chance to notice and think about their use of bare string interpolation for at least a fraction of a second as they type "STR." (or autocomplete it or whatever).
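To make the STR. / DB. distinction concrete, here's a rough sketch using the JEP 430 preview syntax (JDK 21 with preview features enabled); the DB processor is a hypothetical stand-in for whatever a database library might expose:

    String user = "Robert'); DROP TABLE students;--";   // attacker-controlled in real life

    // Bare interpolation: easy for a scanner to flag if the result reaches a query.
    String s = STR."SELECT * FROM users WHERE name = '\{user}'";

    // Hypothetical library-defined processor: it receives the literal fragments and the
    // values separately, so it can bind parameters instead of splicing text.
    // PreparedStatement ps = DB."SELECT * FROM users WHERE name = \{user}";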
This does put a heavy burden on the libraries to implement it correctly, but if they do it's even safer than what I was thinking.
One thing though: Please for the love of the internet, for those of you writing interpolators, DO NOT write an interpolator that picks apart the values passed in through a \{ ... } and starts instantiating arbitrary classes via Java reflection. Just stay away from that entirely, OK?
Scala has had all of that for years. This JEP literally copies the design of Scala string interpolators. Except that Scala has macros so, if desired, a string interpolator can perform validation at compile time, making it even more secure than what is being proposed here.
Back in 1997, when James Gosling outlined his vision for Java, he said it should be a conservative language (wrapping a very innovative runtime) that would ideally only adopt features that have proven themselves, for some time, in other languages. Being a last mover is at the very core of Java's evolution strategy. It's not playing catch-up because we're not trying to adopt all features other, more "adventurous", languages have, but rather to selectively pick the fewest features that would have the biggest impact. That's the aspiration, at least.
And honestly, in my opinion, that approach is just really proving itself right now. I am so happy to see Java seemingly on the right path again, adopting solid capabilities, innovating, but letting other languages take some punches first. We have had some dark days (maybe a dark decade), but full steam ahead now. Thank you pron and team!
It is not really innovating if it has been battle-tested by other languages, is it?
While modern Java sometimes makes me think about reviving and modernizing an old project of mine, if Java is simply always behind by design of its evolution process, that makes it less likely that I want to spend time with it.
Java innovates a lot in the runtime, where it's ahead of virtually all other languages in its combination of performance and observability. But the language very much tries to be conservative and not to innovate (compared to others, that is) for the simple reason that the vast majority of programmers prefer it that way, and Java is a mass-market language. The language itself is not supposed to excite or to challenge but to inspire confidence that you can build a 100KLOC-10MLOC piece of software in it and your investment would be safe 10, 15, 20 years from now.
This strategy has worked really well for Java, and it's worked really well for those who choose Java. Those companies who 10, 15, 20 years ago picked more exciting languages like PHP and Ruby are generally not as happy with their choices now as those who picked Java.
We fully understand that a minority of developers want more feature-rich, adventurous languages, and we're happy that the platform offers them such choices.
Perhaps it's not fair to blame the language, but the javax -> jakarta EE transition and Oracle's license changes have broken a lot of faith in stability, which must be frustrating for a language researcher with that goal.
Oracle's "license change" was to open source the entire JDK for the first time in Java's history (although many people tried to make sure this was misunderstood, which wasn't helped by Oracle's poor communication -- now that's frustrating), and we had no control over Jakarta's insistence to leave the JCP, Java's standards body that controls the javax namespace. Even Oracle only uses the java/javax namespaces for JCP APIs. They claimed it was too inefficient, but Java SE churns out two specification versions a year through the JCP, despite them being bigger and more complex than Jakarta EE.
> Oracle's "license change" was to open source the entire JDK for the first time in Java's history
The parent comment is most likely not talking about that. The "Oracle's license changes" (plural) in question are things like the recent change to count all employees (even the ones who do not use Java) when calculating the price for an Oracle JDK license (see for instance https://houseofbrick.com/blog/oracle-java-pricing/), or IIRC an earlier change to require a paid license for newer releases of Oracle JDK 8 (which AFAIK is the version most companies use). It's true that it's easy to avoid all of that by exclusively using OpenJDK instead of the Oracle JDK (but then you find out that you have to download from AdoptOpenJDK instead of downloading directly from OpenJDK, and then you find out that it's no longer called AdoptOpenJDK, and now has an even weirder name), but it's also true that it's easy for an employee to end up downloading the Oracle JDK (especially when it's someone who learned Java back when the recommendation was to prefer the Sun JDK instead of an open alternative), which could expose the company to huge licensing costs. It can be less risky to just forbid the use of Java outside of carefully curated environments.
> and we had no control over Jakarta's insistence to leave the JCP, Java's standards body that controls the javax namespace.
It doesn't matter who is at fault; all the users see is a pointless namespace change, which breaks both source and binary compatibility for something which had been part of the JDK for many releases. To make things worse, it's not something which can easily be worked around; when providing a library which can run with either the old or the new namespace, you have to provide two separately compiled JARs, and I've seen several popular projects take that path. It's very similar to the Python 2 to Python 3 transition, except that Python allows you to dynamically import a module at runtime (so you can easily create a compatibility shim), while with Java, the package names are fixed in the bytecode. The best workaround I've seen so far (though I haven't played with it yet) manipulates the class bytecode at load time to change the package names (https://github.com/eclipse/transformer).
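For those who haven't hit it, the break is as mundane as it is invasive: the package prefix changes and nothing else, e.g.:

    // Before (Java EE 8 and earlier):
    import javax.servlet.http.HttpServlet;

    // After (Jakarta EE 9+): same class, new package, so old bytecode that
    // references javax.servlet.* no longer links against the new jars.
    import jakarta.servlet.http.HttpServlet;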
> things like the recent change to count all employees (even the ones who do not use Java) when calculating the price for an Oracle JDK license
AFAIK, the change to the pricing model for support was done at the request of Oracle's support customers, as it's easier for them to track. I don't know why non support customers would care one way or another.
> but then you find out that you have to download from AdoptOpenJDK instead of downloading directly from OpenJDK
Of course, you can download any of the other builds if you prefer.
> which could expose the company to huge licensing costs
That's incorrect. The Oracle JDK is free to use.
> all the users see is a pointless namespace change
That's perfectly understandable, and users who are bothered by this are welcome to stop trusting the libraries that have chosen to do that to them. But the Java ecosystem is decentralised and Oracle has no control over third-party libraries. Allowing a library that chooses not to be an official Java standard to pretend that it is one is also unfair. The maintainers of that library have decided that it would be better to inflict that change on their users rather than remain a Java standard, and maybe they know what's best for their users.
> That's perfectly understandable, and users who are bothered by this are welcome to stop trusting the libraries that have chosen to do that to them. But the Java ecosystem is decentralised and Oracle has no control over third-party libraries.
The libraries which did the namespace change are things like JAXB and JAX-RS and JAX-WS, which have been part of Java since Java 6. It's not some random third-party library, and Oracle did have control over them, until one day they decided they didn't want them anymore. As a part of Java, users did expect them to keep working (other than small changes like adding or removing a couple of methods, which became even less of a risk of breakage once Java 8 introduced default methods). The use of these libraries is pervasive all over the Java ecosystem, which leads to the complaint mentioned in the comment above: by first removing them in Java 11, and then forcing their new maintainer to rename the packages (instead of grandfathering them as an exception, or even better, providing an aliasing mechanism so that the renaming wouldn't break binary compatibility), Oracle broke some of the faith people had in Java's stability and backwards compatibility.
> and then forcing their new maintainer to rename the packages
Their new maintainers were not forced to rename the package; they chose to do it. That the java/javax namespaces are governed by the JCP (which will turn 25 years old this year) is well-known to anyone familiar with Java's governance, and I can't see a reason why an exception would be warranted if the maintainers had the choice of staying in the JCP; they weren't compelled to leave -- they wanted to leave. Why would a standards body lend its seal of approval to someone who no longer wishes to work in that standards body? "I don't want to follow the rule" is not a good reason to grant an exception from a rule.
We should probably also not blow this "problem" out of proportion. If you actively maintain your things, you can probably handle swapping out a few strings that are guaranteed to give build-time errors. If you don't maintain your application and run on antiquated runtimes? Well, good news, it stays totally unchanged...
edit: okay, I guess if you do some funky stuff like reflection or have things in various config files, it might not give compile-time errors... still though, a recursive search and you'll find the spots in practically all cases
It's more involved than just "swapping a few strings". You also need to make sure that all the libraries you depend on are on the same side of the divide, that is, you have to "swap a few strings" and update all your dependencies at the same time (and these updates could bring unrelated breaking changes with them). And some of these dependency changes might be unexpected; you might not expect a general-purpose logging library to depend on that, yet one popular logging library has to be upgraded from 1.3.x to 1.4.x when changing the namespace from javax to jakarta.
It's even worse when your software can load plugins written by third parties, and your API allows these plugins to expose servlets or similar. You'd have to not only "swap a few strings" and update the dependencies in your software, but also make all these third parties update all plugins at the same time, with no chance of a gradual transition. It's a large amount of pain, for a completely unnecessary naming change.
well sure, I have been through this myself, but I have exercised some restraint in my dependency chain, and so should everyone else. Dependencies can be wonderful, or they can be liabilities.
Nevertheless, I stand by my point: sure, you may have a little bit of a crappy time, but that's it. If you don't maintain your application? Well, you're every bit as fine as before. If you do maintain it? Well, then you had better be updating dependencies at some point, and sure, you might have chosen a dependency that breaks compat and such, but then you have this problem ANYWAY, and are probably running old code with at best just irrelevant bugs, but more likely various security implications (as the dependencies are probably not trivial, right? and those bugs might be like the insanity going on in a certain popular logging library...)
Your mileage may vary, but I went from liking simple languages like Java, to very expressive, complex ones (Scala, Haskell, Rust) back to Java.
I mean, I like all those languages (and many more), but the thing is, most programs really don’t need all that many language features — an okayish type system, OOP for encapsulating private vs public state, and preferring immutable public APIs have served me well in almost every problem thrown at me.
I would much rather take a “boring” language with an absolutely huge ecosystem and just get my work done. Especially since, after going high-level/managed, there is no feature that would give any significant productivity boost (as per Brooks).
Smalltalk should be a no-brainer then. Arguably one of the simplest languages with merely 6 keywords and immense power. The ecosystem is of course a question.
I think I can understand the journey from Java -> other -> Java, because as you learn more about programming, you become more capable of solving problems properly in a simpler language. But there might be a point when you get fed up again with writing the same old stuff over and over, and journey on into languages which allow you to abstract away from syntax, making programming more pleasant. And then maybe back again, for type safety or another reason. It is very understandable to move between those preferences over time, or even to have an inner conflict about what one wants to use for one's next project.
I really like Java. Also Go, C, LISP. I find most of the languages I like are very pragmatic. I'm much more interested in getting stuff done than being clever. I used to quite like C++ but it seems like a minefield of features and gotchas these days. I'm very much a fan of the principle of least astonishment. I feel like Java manages that well and not being a kitchen sink language aids in that.
Java is a simple language even if people build gigantic piles of abstractions in it. It doesn't have multiple inheritance, it only has like eight base types, only a few well-worn ways to control flow (while, do while, for, if, switch, throw, return).
It doesn't have operator overloading, or array access overloading. Attributes and methods are instantly discernible by the presence or lack of parentheses. It doesn't have templates or macros.
I don't know many languages that are as simple as Java. If anything Java being so simple syntactically is a major reason why it's so wordy.
It has quite a few keywords; you have primitives and objects, and you call methods on those. Compare it to C#, Swift, or C++: it is insanely tiny next to those.
I would consider neither of them small or simple, but you are correct that Java has far fewer features than C++, and I would definitely agree that it is the simpler language.
Right, and that's why Java introduced checked exceptions in 1.0. A feature which wasn't proven back then and remains unproven now.
Another example being overengineered streams. Parallel streams look really cool as a demo, but have been proven dangerous in their default configuration (they can exhaust the default thread pool).
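A minimal sketch of that failure mode, assuming the work inside the stream blocks (the class and method names here are just illustrative):

    import java.util.List;

    class CommonPoolStarvation {
        public static void main(String[] args) {
            // parallelStream() schedules work on the shared ForkJoinPool.commonPool(),
            // which is sized to roughly the CPU count. If the mapper blocks on I/O,
            // those shared workers are tied up, starving every other parallel stream
            // (and default CompletableFuture executor) in the same JVM.
            List.of("a", "b", "c", "d").parallelStream()
                    .map(CommonPoolStarvation::slowFetch)
                    .forEach(System.out::println);
        }

        static String slowFetch(String id) {
            try { Thread.sleep(1_000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return id;
        }
    }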
Checked exceptions are exact analogs to Result types, but better (auto-unwrap, bubble up, stack traces). The feature is unfortunately far from flawless (inheritance is not a great combo with it), but it has been part of the language since the beginning.
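Roughly, the analogy works like this (the names below are just illustrative):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    class ConfigReader {
        // The checked exception is part of the signature, like the Err variant of a Result.
        static String readConfig(Path path) throws IOException {
            return Files.readString(path);   // the success value is "auto-unwrapped"
        }

        // Handling it is the analogue of matching on the error case...
        static String readOrDefault(Path path) {
            try {
                return readConfig(path);
            } catch (IOException e) {
                return "";
            }
        }
        // ...and declaring "throws IOException" on the caller makes it bubble up,
        // much like `?` does for a Result, but with a stack trace attached.
    }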
Streams are not a language feature; they are only a library. And I honestly don’t find them over-engineered; they really make a lot of logic very readable.
Exceptions have implied control flow which makes them strictly worse than the Result types which are, as their name suggests, just types.
Imagine if some other common types had unrelated features like this baked into them. Want a string? Sorry in my new language all the strings need their own separate thread for some reason.
Actually sorry, I forgot, we're in a Java topic, Java actually did have features baked into types. All of Java's user defined types insisted on living on the heap. Just want to make a pair of integers? Sorry, that's an Object now and so it lives on the heap, whereas just one integer is fine, those are just local stack variables. Did they fix that yet?
>Exceptions have implied control flow which makes them strictly worse than the Result types which are, as their name suggests, just types.
If they're checked exceptions, then the control flow is hardly "implied". If anything, it's explicit: this method potentially throws X, so if X is raised, expect this control flow consequence.
I think the "implied" part is in the callsite ambiguity: a Java method marked with "throws X" can raise X at any callsite in its method body, whereas a Rust function of type `Result<...>` has each `Err` variant marked either directly with a return or with the `?` sugar.
The ideal is probably in the middle. Being able to see where exceptions can be thrown is very helpful if you're manipulating state which can be seen from outside the method, because you need to make sure you leave it in a valid state if an exception is thrown. But in a pure function it's mostly noise.
Checked exceptions are explicit. They are no weirder than do-notation in an Either monad; the syntax could be more ergonomic, but they are clear as day.
> Exceptions have implied control flow which makes them strictly worse than the Result types which are, as their name suggests, just types.
There’s nothing wrong with implied control flow.
> All of Java's user defined types insisted on living on the heap. Just want to make a pair of integers? Sorry, that's an Object now and so it lives on the heap, whereas just one integer is fine, those are just local stack variables.
That’s not how Java works. The runtime is free to store things wherever it likes, with whatever representation it likes. It needs to follow the semantics of the standard, but whatever gets executed is what the JIT felt like emitting.
Checked exceptions were a fundamental mistake which basically makes it impossible to actually use (non-RuntimeException derived) exceptions reliably for their intended purpose. As soon as you introduce any higher-order programming (e.g. functions that take other functions as parameters) you start to have to wrap e.g. IOException... but that means that calling code can no longer catch that wrapped IOException via a simple "catch (IOException e) { ... }"... it instead has to catch whatever you wrapped that exception in... and manually look inside that Throwable's suppressedExceptions list. And so you end up with stuff like Guava's Throwables utility methods.
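A sketch of the wrapping problem being described, using the java.util.function/stream APIs (which declare no checked exceptions):

    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    class Wrapping {
        static List<String> readAll(List<Path> paths) {
            // Stream.map takes a Function, which can't "throws IOException", so the
            // lambda has to smuggle the exception out in an unchecked wrapper.
            return paths.stream()
                    .map(p -> {
                        try {
                            return Files.readString(p);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e); // callers now catch this, not IOException
                        }
                    })
                    .toList();
        }
    }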
Of course, this probably happened because Java basically didn't have much in the way of higher-order constructs at the start, so they didn't notice that they were painting themselves into a corner... If they had a more powerful type system something usable might be possible, but as of Java right now, it's just hopelessly broken.
I'd suggest that a fix to the language would be to simply change all exceptions to unchecked, but unfortunately that might break a lot of code where the above workaround has already been implemented once consumed libraries switch away from wrapping. (Note that it's only Java-the-language which has checked exceptions. The JVM doesn't and Scala/Kotlin are much the better for it.)
EDIT: The JVM is a marvel of engineering. Java... not so much.
> Of course, this probably happened because Java basically didn't have much in the way of higher-order constructs at the start, so they didn't notice that they were painting themselves into a corner...
The problem is generifying over checked exceptions (i.e. parametric polymorphism over checked exceptions), and there's actually an elegant solution that's been long known -- so the "corner" was well-noticed, as well as the means to get out of it -- but that requires laying some groundwork first which wasn't done because the ground wasn't ready, but it's getting there, so stay tuned.
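For what it's worth, Java's generics can already abstract over a single checked exception type, which hints at the shape of such a solution; the sketch below works today (names are illustrative), but it falls apart once you need to combine or omit exception types, which is the part that needs groundwork:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    interface ThrowingSupplier<T, E extends Exception> {
        T get() throws E;
    }

    class Timed {
        // The checked exception type flows through the generic throws clause, so callers
        // catch IOException directly rather than some wrapper.
        static <T, E extends Exception> T timed(ThrowingSupplier<T, E> body) throws E {
            long start = System.nanoTime();
            try {
                return body.get();
            } finally {
                System.out.println((System.nanoTime() - start) + " ns");
            }
        }

        static String read(Path p) throws IOException {
            return timed(() -> Files.readString(p));   // E is inferred as IOException
        }
    }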
> The JVM is a marvel of engineering. Java... not so much.
This was James Gosling's stated strategy and one of the secrets of Java's exceptionally-long-lasting success. While many developers like feature-rich, sophisticated languages, the vast majority of them really, really don't (although they are underrepresented on HN, certainly when it comes to language feature discussions). Many of the things people need can be hidden in the runtime, wrapped by a conservative, non-threatening language. Gosling called that "a wolf in sheep's clothing."
> The problem is generifying over checked exceptions (i.e. parametric polymorphism over checked exceptions), and there's actually an elegant solution that's been long known -- so the "corner" was well-noticed, as well as the means to get out of it -- but that requires laying some groundwork first which wasn't done because the ground wasn't ready, but it's getting there, so stay tuned.
Happy to hear something might be getting done about it, but users have suffered for 20+ years. I'm probably not going back to Java anyway :).
Any technical details you can share about a potential redesign? AFAICT checked-exception-like-things will always be at odds with the variance requirements on method signatures, so I'm a bit curious. (Outside of the natural solutions where variance isn't a problem, e.g. polymorphic variants or row types.)
Regardless, I'm waaay more interested in the JVM-level improvements that you & co. have been doing! Kudos! Btw, any inside tips on TCO? ;)
> [Gosling]
I have seen that talk, and I largely agree with the philosophy (assuming the goal is correct), but in this particular case, I think it was a mistake to let the problems fester for 20+ years rather than just admitting defeat and removing the checked-ness. If it's going to take that long to implement a solution to a constant pain point... the solution might as well not exist. (Perfect = enemy of good and all that.)
EDIT: Just to add: I definitely don't mean to cast aspersions on anybody in the Java language design team or anything like that -- if that's how my original comment came across. Everybody makes mistakes and predicting the future is hard at the best of times :). I just wish they'd reevaluated some things along the way...
It's nice to see from a theoretical perspective. But my impression is that the ecosystem is moving towards having no checked exceptions. What will happen to the existing code when generics work with union-typed checked exceptions?
It runs counter to variance rules for method signatures; see pron's sibling comment and my reply to that. I'm not exactly sure what he's alluding to in terms of an actual solution, but he is an insider, so there may be hope for Java programmers :).
Anyway, the key point is that any implementation of a method will (almost by definition) need to be able to throw more types of errors than the writer of the original interface anticipated. That is the fundamental problem and unless you get rid of overriding there's no solution for that. The only solution is to force everything into a single super-type... but that just leads to "throws Exception" and we've been there before.
(I mean, there's approximations to solutions as in OCaml's polymorphic variants, row types, intersection types, etc. .. but those only work when variance isn't involved... or do away with most of the touted benefits of the "checked" part of checked exceptions, so...)
> One thing though: Please for the love of the internet, for those of you writing interpolators, DO NOT write an interpolator that picks apart the values passed in through a \{ ... } and starts instantiating arbitrary classes via Java reflection. Just stay away from that entirely, OK?
I think I’ve discussed this on and off with people for almost six years at this point. I’m so glad to see it’s finally happening.
Now we just need to combine this, the constants work that has been done over the last few years, and regular expressions, to make the regexp API so much nicer.
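Something along those lines is already expressible with the processor API as proposed. Purely a hypothetical sketch, assuming the StringTemplate.Processor / fragments() / values() shape described in the JEP (preview, so subject to change):

    import java.util.regex.Pattern;

    class Regex {
        // Hypothetical processor that quotes each embedded value, so user input can't
        // inject regex metacharacters.
        static final StringTemplate.Processor<Pattern, RuntimeException> REGEX = st -> {
            StringBuilder sb = new StringBuilder();
            var fragments = st.fragments().iterator();   // literal parts; always one more than values()
            sb.append(fragments.next());
            for (Object value : st.values()) {           // results of the \{...} expressions
                sb.append(Pattern.quote(String.valueOf(value)));
                sb.append(fragments.next());
            }
            return Pattern.compile(sb.toString());
        };
    }

    // Usage (hypothetical): Pattern p = Regex.REGEX."^\{userInput}-\\d+$";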
Of the 9 languages mentioned: 5 languages (inc. JVM ones like Groovy and Kotlin) are using `$`, and 1 language (Swift) is using `\`.
`\` is a pain to type on many keyboard layouts -- actually most but the US one.
It seemed to me that `$` would have been a much more "conventional" choice.
This really makes me sad. It looks like the choice was made on purpose to be different.
I assume because `\{` was not a valid escape sequence, which means any use of this character pair can be identified as a template without changing the semantics of existing string literals.
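Concretely: javac today rejects \{ as an illegal escape, so giving it a meaning cannot silently change any literal that currently compiles, whereas ${...} carries no such guarantee (sketch):

    String a = "price: ${amount}";   // legal today; its meaning must not change
    String b = "price: \{amount}";   // today: compile-time error (illegal escape),
                                     // so it is free to become template syntax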
But you need the `FOO.` prefix anyways. So there was never any ambiguity. It seems that it would be good to either make the prefix optional (presumably something equivalent to the `STR.` processor would be default) or keep the interpolation syntax clean. (IMHO just `STR."Hello {name}!"` would have been ideal). But instead they require both the prefix and the ugly interpretation syntax.
PHP never stood out as a language with very clean syntax. It is very PHPesque to put the burden on the user of the language instead of going the extra mile and implementing something that might be harder to parse but would be more consistent. Inconsistency in general is one of PHP's issues.
If you ever had the misfortune of seeing the code for the parser and lexer of earlier versions of PHP, you'd see that it wasn't due to parsing simplicity but rather the author making poor syntax decisions due to a lack of understanding.
I don’t mind the typing ergonomics. It's more that it is visually ugly and counter-intuitive, because in every other language a backslash would mean you want to keep the braces instead of interpolating the variable inside. They even have a comparison table at the top of the document, where they didn’t dare to put the Java option next to the others because it would stand out as being too different. Why not just follow the established conventions?
You just made me get my MacBook out of the closet :) (I stopped using it two months ago and moved to an X1 Carbon right before the pandemic)
The ergonomics of \{X} are exactly the same on it as on my X1 Carbon, my Keychron TKLs, and four other keyboards I happen to have in my drawer.
Are you using a modified Apple II like Rebecca Heineman, or some kind of super-small keyboard that looks like someone forgot to put all the keys in? :)
Either way it's on you, not Java.
(Don't take this the wrong way - this is an honest question - here in Poland almost all keyboards are backslash-friendly and I would love to know where this is not the case)
It's a standard full sized external Mac keyboard for Norway bought directly from the Apple store. We share the keyboard with Denmark too I think. The Swedes and Finns have different keyboards, but I think they use a similar combination for backslash.
So the Mac issue applies to 4 countries in the Nordics at least. Please note that traditional PC keyboards use a different layout for some characters and may in some cases have a dedicated key for backslash too, in case you Google Norwegian keyboards. It's one of those Mac vs PC things we are used to.
Before anyone starts suggesting connecting a PC keyboard to my MacBook Pro, keep in mind that I use the built-in keyboard almost 50% of the time, and having two different key mappings is a hassle.
Regarding backslashes and other special characters: I never said it's on Java specifically, but the US-centric culture in general. K&R used curly brackets when they designed C because they fit the US teletype character set of the 60s/70s, but that doesn't mean it's the best choice today.
I'm not suggesting we use special unicode symbols like arrows and emojis, but just pause for a minute and look beyond the US keyboard and see if we can find a solution that also works ergonomically for at least some non-US countries.
I press ⌥ + 8 for square brackets and SHIFT + 8 for parentheses.
It's probably not possible to find better alternatives for Mac keyboards that could be used for a language based on Lisp syntax.
The Mac keyboard simply doesn't have any dedicated keys that are not used for normal text, except for "@". For some unknown reason, they have a dedicated key for umlauts ("¨") that we don't use in our language. I guess I could map it to brackets, but SHIFT + UMLAUT returns a "^" that is often used in regexps and I don't want to lose that.
Windows keyboards have a different layout that is slightly more programmer-friendly if you have a full-sized keyboard. They do have a backslash key but lack the dedicated "@" key. Windows needs a dedicated backslash due to its file paths, unlike Mac. It is an acceptable tradeoff for Mac users, as non-developers are more likely to use "@" than "\".
Let's just say that there is a reason why I love my IDE that will automatically add closing curly brackets when it detects a block.
> For the syntax of embedded expressions we considered using ${...}, but that would require a tag on string templates (either a prefix or a delimiter other than ") to avoid conflicts with legacy code.
Can't the template processor expression itself function as the tag? Is STR."..." already legal now?
They want a template expression without a processor, such as `String info = "My name is \{name}.";`, to be a compile-time error because it is missing the template processor (e.g. the `STR.` prefix). Since existing code like
String info = "My name is ${name}.";
is valid, they can’t use that syntax, or any other syntax that is currently allowed, as otherwise they would lose the ability to make it an error.
———
However, what they could have done instead is to use a syntax like
String info = "My name is "(name)".";
i.e. place the interpolated expressions outside of string literals. Slightly longer, but maybe more readable and typable, and currently invalid syntax. (The parentheses would be mandatory.)
It makes perfect sense to have the expressions outside of the string literals, exactly because they are expressions and not literal text. Quotes express literalness, the opposite of evaluation.
This is simply replacing the existing
"My name is " + name + "."
by
"My name is " (name) "."
by eliminating the plusses, and adding parentheses to make expressions like
"My name is " ("John") "."
unambiguous (a string template with one parameter that happens to be a string literal). The parentheses also indicate that this is a parameter, like for a function call, that is immediately evaluated. (The whole string template feature is really just syntactic sugar for a function call.)
That way you don’t have to “interrupt” string literals with an expression. Instead you end the string literal normally and then comes an expression.
A string template would be defined as any sequence of string literals and parenthesized expressions.
I'm not a fan either. Java seems to be declining in prevalence in my corner of industry, I'm sure these changes are made by wiser minds than mine, but I'm sceptical about whether such a choice is really right for users.
I remain to be convinced, and I have programmed daily in Java for a long time.
It seems just as difficult to remember/useless as lambda expressions that look fine as vanity one-line expressions and then get difficult to write for most programmers.
We don't really need vanity innovation in the Java world. I would have asked for priority on the FX replacement to Swing, or support for running on ESP32 IoT devices as replacement to the Arduino platform.
Instead we get some weird looking syntax for what is basically a non-problem.
JavaScript basically has the same functionality but with ${}. Anyone who uses named parameters in HQL or SQL queries would be very grateful for this feature as it means you no longer have to type out the same name three times for each parameter.
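The "three times" in question, with JPA-style named parameters (sketch only: em is assumed to be an EntityManager, Customer an entity, and the DB processor in the comment is a hypothetical stand-in):

    // Today: the name shows up in the query text, in the setParameter call, and as the variable.
    var q = em.createQuery("SELECT c FROM Customer c WHERE c.name = :name", Customer.class);
    q.setParameter("name", name);

    // With a template processor the variable is mentioned exactly once:
    // var q = DB."SELECT c FROM Customer c WHERE c.name = \{name}";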
You could always use the + operator to join strings. Even an external library would do just fine when you want to insert some kind of variable name inside strings for that purpose.
Don't really get the need for this to be a core feature with such a Java-unlike grammar.
> We don't really need vanity innovation in the Java world. I would have asked for priority on the FX replacement to Swing, or support for running on ESP32 IoT devices as replacement to the Arduino platform.
Which are entirely unrelated projects to this one with very different engineers working on it. Those are easily parallelized, if you will. I don’t see why having progress completely elsewhere by different people is detrimental to these goals at all.
Because this thing here adds complexity to the language.
I'll be fine and ignore it completely like I've done with lambdas. I just pity the fools who enter Java world to learn the language and will suffer trying to learn what they believe is something normal programmers use.
Seriously, yes. Great thought process behind the design, marred by the good-for-nothing (subjective opinion) backslash.
If any of you design a language or a DSL, please, please - avoid the backslash - for the reasons stated above. Hard to type, introduces unseen problems, most will hate it. It (the backslash) is unbecoming of anything elegant.
But if you do follow that advice, consider whether it is worth just using a different escape character for all interpolations.
That way you could still make use of this very sensible syntax of reusing the escape for interpolation whilst avoiding the backslash issues.
It’s not as bad as people make it out to be. Swift without any backwards compatibility constraints chose “\(val)”. It does need a slight getting used to but it is not any worse than ${}.
For another n=1 opinion, I like Swift’s choice, even though it’s different from convention. It’s lighter, typographically. ‘${‘ draws more attention to the delimiters than ‘\(‘.
And yes, the delimiters could be made less obtrusive using syntax coloring, but not all tools will do that (e.g. when using grep on a code base)
Swift’s choice has the advantage of visually unifying the two cases where text in a string literal is interpreted as something other than the text itself. One is backslash escapes like \n and \\. The other is string interpolation.
Slightly unrelated, but I only buy ISO keyboards and have my native tongue layout (Hungarian) and English easily switched. The former absolutely sucked for any programming work, so I think we just sort of have to accept that programming is done with an en-us layout, the same way that (hopefully) all identifiers/comments are also in English.
It's interesting how C# is always far ahead of Java: they introduced it way earlier, the syntax is simpler, and you can make it safe by using FormattableString as the param type. For example, in EF you can pass an interpolated query without worrying about SQL injection.
A major shortcoming with FormattableString is that the C# compiler is hard-coded to always prefer implicitly converting an interpolated string to a string, which makes it impossible to write extension methods for FormattableString objects.
…and they also always default to formatting with CurrentCulture instead of InvariantCulture. Apparently this was by design, as interpolated strings were never originally intended for generating machine-readable strings.
Finally, there’s no way to perform common “mini-templating” with a FormattableString, such as repeating regions, show/hide regions within a string, or little things like inflection (e.g. rendering “{0:N0} item” when arg0 is 1 and “{0:N} items” otherwise).
I’m happy to see Java (finally) gain a similar feature, but it, like C#’s, seems… limited in its abilities.
C# 10 introduced interpolated string handlers. They let you address some of your points (you can handle format strings however you want, and you could also choose invariant culture by default, without using FormattableString.InvariantCulture) and at the same time avoid allocations.
Whether or not a hoarder is "ahead" of you at having stuff is a matter of perspective. C# strives to be a very feature-rich language, while Java strives to be minimalistic. So it's pretty certain that any feature Java does end up adding will have been in C# first (although this one is not quite the same), but Java certainly doesn't want to go in the same direction.
On the other hand, Java's GCs, JIT compilers, and observability features are more sophisticated than .NET's. We like adding stuff to our runtime, while C# likes adding stuff to the language.
It doesn't eventually add the same stuff (even in this case, the feature is quite different from C#'s) as it will never add most of the stuff C# has because we want to keep Java as minimal as possible. But if your point is that Java usually only adds features that other languages already have, that is absolutely true and we'd like to keep it that way. We believe most programmers generally prefer languages with fewer features over languages with many features. As to it having worse usability, that's usually a matter of personal aesthetic preference, and there's no evidence to suggest that's the case by some objective metric.
Could a diet people would say "works well" be determined by looking at the dishes people say they love most? Topping the list would be foods that are too unhealthy or too expensive, foods that most would agree wouldn’t "work well" as a sustainable diet.
When you've followed such polls for many, many years, you see that programmers report they "love" languages that they pick more than those that are picked for them, that they tend to use for hobby more than work, that they tend to use in newer/greenfield projects, that they tend to use in smaller projects, and that they tend to use alone or in a small team. But the languages that end up working out well -- for the kinds of projects where Java is used -- are those that gain wide adoption in industry, that scale to large codebases and large teams, and yield programs that are maintained for many years. The languages that have managed to achieve that best are Java and C, and, to a lesser degree, C++.
If you've lived through the BASIC and later VisualBasic craze of the eighties and nineties, the SmallTalk craze of roughly the same era, the Haskell craze in academia in the mid to late nineties, the PHP craze of the early oughts, and the Ruby craze of the mid oughts, you see that the most "loved" languages are rarely the ones that end up working well.
Aside from a short ~5 year period (its first five years), Java was never a particularly "loved" language. Hell, James Gosling designed it to be a boring, conservative, "blue collar" language for work, and working in teams on large, important projects has been the main focus of Java's evolution. It's hard to argue that this hasn't worked out really well for Java. Even if Java isn't your cup of tea, you should appreciate the need for a language you can safely bet on to build a very important codebase that should be maintained for twenty years or more.
Having said that, I believe we should work on making Java more suitable for smaller, more fun projects without giving up its scalability. That's a difficult challenge, and, as far as I know, no language has ever achieved it, but perhaps Python is a reasonable source of inspiration.
Spears seem unsexy and rather limited compared to swords at an individual level; even a skilled soldier fighting with a long spear would probably lose 1v1 against a soldier who primarily fights with a sword. Put a few dozen of them in phalanx formation, though, and the swordsmen don't stand a chance.
I feel most programmers judge tools primarily by how they function at an individual level, while managers and CTOs pick based on how they perform at scale, and neither side understands the other's priorities much, leading to so many debates on this aspect. So it may be unloved by individual programmers (I don't agree with most such people), but people who're responsible for delivering software absolutely love it, imho.
Golang too fits in the same category: every time it comes up, half the crowd here likes shitting on it for one reason or another, but quality software is being written in Go in many places.
You-the-user can make C#'s safe, you-the-library-developer can make Java's safe. All you need is for one silly person to assign the C# query to a `var` beforehand and then the behavior's wrong. I'd usually be the first to point out how C# did it earlier and better, but in this case Java's manages to surpass even Rust's, which I didn't think was possible.
> All you need is for one silly person to assign the C# query to a `var` beforehand and then the behavior's wrong.
That would cause a compilation error. `DbSet<TEntity>.FromSql(FormattableString sql)` [0] does not have a string overload. Therefore, if a user assigned the interpolated SQL query string to a `var` beforehand, then there would be an implicit conversion to a `string` [1], which would be a type mismatch for the `FromSql` call. The user would have to assign the interpolated string to a `FormattableString` instead of `var` for the program to compile. There is also no implicit conversion from a `string` to a `FormattableString`.
There is a `DbSet<TEntity>.FromSqlRaw(string sql, params object[] parameters)` [2] method that explicitly documents that it is vulnerable to SQL injection.
> You-the-user can make C#'s safe, you-the-library-developer can make Java's safe.
But anyway, Roslyn Analyzers are built into the language. If you throw them away then you might as well call these new template processors off limits too.
It's interesting that this feature is essentially a clone of how string interpolation has been made extensible in Scala. However, the JEP only mentions Scala once.... in the form of an example which has been made strangely more complex than it has to be (it would be much shorter if they had used the s interpolator instead of the f interpolator).
The one major difference is that, because Scala has macros and typeclasses, it can turn malformed interpolations for things like JSON or SQL into compile-time errors instead of runtime errors.
> Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.[0]
So we can have this super powerful, extensible string interpolation system using never-before-seen syntax as a language feature, not a standard library extension, but not unsigned integers?
I'll echo what other commenters have said: This looks powerful but ugly. I'll add to it by pointing out that "powerful but ugly" is what we've already got. So we gain the ability to move format arguments inline with the string contents if we're willing to use the `STR."\{}"` syntax? That seems like a lateral move.
We could have unsigned integers but we're choosing not to because on the whole their disadvantages outweigh their advantages. On the other hand, once we have user-defined value types, you'll be able to define unsigned integers in a library if you want.
The extensibility and power of string templates were a requirement; security experts quite simply vetoed adding string interpolation as it's just too dangerous, especially in server-side software, where Java is mostly used.
> We could have unsigned integers but we're choosing not to because on the whole their disadvantages outweigh their advantages.
You (collectively) have this wrong; C, C++, C#, Rust, Go, et al. have it right. As long as we're just stating our beliefs outright instead of justifying them, that is. (That's not a request to justify your opinion; I'm sure it's been argued to death already.)
> security experts quite simply vetoed adding string interpolation as it's just too dangerous
Laughable. In an alternate universe where it didn't already exist, this could have been a JEP for StringBuilder and they would have vetoed that too.
> As to `STR.` -- which you won't need to use most of the time
Except for when you're trying to... make a String? Which is what ~everybody who ever asked for string interpolation in Java wanted to do?
What year would you estimate it will be when, at the average Java job, I can expect that most of the library APIs available to me which were often invoked with the form `func(String.format(...))` will have similar APIs that behave as JEP 430 template processors? Maybe after it's been a full release for 3 years? 2, optimistically? Because during those years, in order to use this thing at all, people will be using `STR."\{}"`.
Maybe you're right, but I think it's not so much a matter of what's right or wrong, but what's appropriate for languages with different audience size. Our experience is at evolving a language for a certain audience size, and perhaps if Java's popularity declines to the more modest market shares of the languages you mentioned, their choices may become appropriate for Java, too.
As to your other points, like I explained to someone else here, even three decades of experience evolving a language that's achieved a measure of success is still no guarantee that every feature will work well, which is why most big new features, string templates included, are first released as preview (i.e. a feature that may change). That allows us to test our hypotheses against the problems people actually encounter rather than those we or others may speculate they'd encounter, and adjust the feature accordingly before it is finalised. To the best of my knowledge, all preview features have undergone some adjustment before becoming permanent, but usually the changes required were relatively small.
Also, we try to think more long-term, because in ten years no one is going to care or even remember if a feature arrived five or seven years prior.
The problem with unsigned is the ridiculous underflow behavior. Unsigned integers are inherently unsafe without underflow checking at runtime. When you are using signed integers for a positive number, then seeing a negative number tells you everything you need to know (the program is wrong and it will most likely crash soon). However, when you are using unsigned to mean a number that can never be negative, then you can't just expect people to never underflow it even just to -1 because it will turn into a large positive number by accident. There is a semantic gap between the meaning of the type and what invariants it guarantees at runtime and that means having no unsigned integers is better for most code unless the unsigned Integer is fully underflow checked at runtime.
I think you are confused. Unsigned arithmetic is modular. Unsigned integers are both positive and negative.
0b1111_1111 is congruent to both -1 and 255 (mod 256). We simply choose the value 255 as the principal value for coercions, but you could equally well choose -1.
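Same bits, both readings:

    byte b = (byte) 0b1111_1111;                 // the bit pattern 11111111
    System.out.println(b);                       // -1   (signed reading)
    System.out.println(Byte.toUnsignedInt(b));   // 255  (unsigned reading of the same bits)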
My favorite example of this was the major outage at a Swedish futures exchange in 2012 where the system wound up treating an unsigned order quantity as "negative" and placed an order for 4 billion futures contracts: https://www.reuters.com/article/markets-sweden-bug-idUSL5E8M...
> unless the unsigned Integer is fully underflow checked at runtime.
Sure. That doesn't mean the datatype wouldn't be nice to have even with such checks. There are languages out there that fully check over- and underflow. Java could too.
Java still sits very close to JVM bytecode, which is typed - adding unsigned ints would duplicate a great number of instructions.
So this feature has to wait for primitive classes, where it can be as simple as
primitive class unsigned {
int value;
}
(Though it would need operator overloading to be used as a+b, which is not really liked in Java designer niches (with many good reasons against it) :D )
Just take the operator overloading from Ceylon which has been designed to only allow mathematical use cases and harder to abuse for anything else. When you see an expression a + b it might not be two numbers but it will most likely be associative, distributive and commutative.
A lot of seemingly primitive Java syntax already expands to method calls: autoboxing, string conversions, string concatenation. Similarly, primitive arithmetic operators on unsigned integers could expand to calls to the corresponding Integer.{compare,divide,remainder,...}Unsigned methods. The compiler would turn these back into single machine instructions anyway. This would not need any new bytecode instructions.
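For instance, treating two ints as unsigned today already means writing exactly the calls such a desugaring would target (these Integer methods exist since JDK 8):

    int a = 0xFFFF_FFF0;                               // as unsigned: 4294967280
    int b = 16;

    int quotient = Integer.divideUnsigned(a, b);       // what unsigned a / b could expand to
    int remainder = Integer.remainderUnsigned(a, b);   // unsigned a % b
    boolean less = Integer.compareUnsigned(a, b) < 0;  // unsigned a < b
    String text = Integer.toUnsignedString(a);         // "4294967280"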
Sure, though I do think that the currently existing rift between the primitive and object world should be healed first, to not leave behind another construct that will only be a backwards compatibility hindrance later.
(If I’m not mistaken, the existing unsigned functions already compile down to efficient machine code, so in the meanwhile that works)
Arbitrary overloading can be problematic - something like !#< is quite hard to look up (Scala, Haskell) - but I agree that implementing an interface/trait that gives you that syntactic sugar is mostly okay (like Rust, where e.g. implementing Add gives you the + operator).
> We could have unsigned integers but we're choosing not to because on the whole their disadvantages outweigh their advantages. On the other hand, once we have user-defined value types, you'll be able to define unsigned integers in a library if you want.
Underflows are much more common than overflows (most numbers are small) and people again and again fall under the impression that unsigned integers are a good way to represent positive numbers, which is not at all what they do. E.g. C++ is very unhappy with the mistake they made of representing sizes with unsigned types [1].
Unsigned types are also primarily useful when the types have very few bits -- like in bitfields -- but Java doesn't have those, and the uses of unsigned types in Java would be mostly restricted to interaction with native code and hardware, that not many do and for which we have other solutions. So some very dangerous disadvantages and not many advantages for a language like Java.
> and the uses of unsigned types in Java would be mostly restricted to interaction with native code and hardware
And with network protocols and file formats, which also often use unsigned types. As an example, one of the many pain points when implementing JGit was the lack of unsigned types (https://marc.info/?l=git&m=124111702609723&w=2). Another heavy user of unsigned types would be cryptographic algorithms, but I expect these to be mostly implemented in other languages (like C, C++, or Rust), with a sprinkling of assembly or SIMD intrinsics in performance-critical loops, and called through JNI.
Java does allow you to work with unsigned types efficiently (look at the methods with unsigned in their names in the Integer or Long classes: https://docs.oracle.com/en/java/javase/19/docs/api/java.base...). But those uses are relatively specialised and the vast majority of Java users don't need that. So it's best to keep them as APIs rather than in the core language.
> E.g. C++ is very unhappy with the mistake they made of representing sizes with unsigned types [1].
95% of that unhappiness seems to be due to mixed signed/unsigned expressions with implicit conversions. Java wouldn't be forced to repeat the mistake of making conversions implicit.
The other 5% of the unhappiness is "machine integers are not mathematical integers". But that's the case for signed integers anyway. Overflowing your unsigned integer is almost certainly wrong, but overflowing your signed integer is almost certainly wrong as well.
> 95% of that unhappiness seems to be due to mixed signed/unsigned expressions with implicit conversions. Java wouldn't be forced to repeat the mistake of making conversions implicit.
Yes and no. As Stroustrup's document says, this is an inevitable result of allowing arithmetic on unsigned integers (i.e. subtracting an unsigned int variable with the value 2 from an unsigned variable with the value 1 has the same effect as explicitly mixing it with a signed integer by adding a -2). The only way to avoid that is to add runtime underflow checks to unsigned arithmetic. Once we have user-defined value types, people can choose to do just that, but there seems to be little point in adding it to the core language; the cost/benefit just isn't there.
> overflowing your signed integer is almost certainly wrong as well.
True yet less likely because most integer values appearing in programs are small, and so underflows are more common than overflows.
> True yet less likely because most integer values appearing in programs are small, and so underflows are more common than overflows.
That's a dangerous argument. The values might be small, until an attacker provides a large one. Your code has to work properly with all possible values, regardless of which ones are more "likely" or "common". In my experience, unsigned makes this simpler, since there are only two cases to consider (zero or positive) instead of four (zero, positive, negative, and INT_MIN); I've expanded on this in a comment here an year ago (https://news.ycombinator.com/item?id=29770689).
That's one way of looking at it; another is that the common bugs due to underflow with small numbers themselves create more exploitable vulnerabilities. Anyway, I think your point of view is valid, but it will have to become much closer to a consensus before Java considers changing the status quo and adding unsigned types to the core language.
> As Stroustrup's document says, this is an inevitable result of allowing arithmetic on unsigned integers (i.e. subtracting an unsigned int variable with the value 2 from an unsigned variable with the value 1 has the same effect as explicitly mixing it with a signed integer by adding a -2).
This is true for C++. It is a meaningless statement in a language where such mixing is not allowed. The semantics of unsigned modular arithmetic can be defined in its own terms, without dragging signed arithmetic and forced implicit conversions into it.
The semantics of "subtracting" (unsigned, modularly) 2 from 1 is UINT_MAX. This may or may not be what the user wanted. The semantics of subtracting signed 2 from a signed variable called `arrayIndex` with the value 1 is -1. This is almost certainly not what the user wanted. Why should the language police one case and not the other? It doesn't know the context.
> The semantics of subtracting signed 2 from a signed variable called `arrayIndex` with the value 1 is -1. This is almost certainly not what the user wanted. Why should the language police one case and not the other? It doesn't know the context.
Because in many situations it does know the context or figures it out quickly. Unlike the former case, if you compute a size and get a negative result you can tell it's wrong, and most uses would result in an immediate exception. It's much harder to detect a bad unsigned value. You can think of the sign bit as a bad value bit for values that are supposed to be positive. An alternative would be to add runtime checks that throw an exception on underflow, but that's already a much bigger change in the service of a feature that very few use to begin with.
Those few who wish to work with unsigned types can do so efficiently in Java with the unsigned methods in Integer and Long that were added in JDK 8 (https://docs.oracle.com/en/java/javase/19/docs/api/java.base...). Those relatively rare uses, however, don't merit support in the core language, especially when they're known to bring danger with them. A popular library may choose to return sizes as unsigned values -- like C and C++ did with size_t, something many consider a mistake (although there are many more valuable uses of unsigned types in C/C++ thanks to bitfields) -- and now unsuspecting users who otherwise have no need for unsigned types, the vast majority of users, are susceptible to a new kind of common bug. It's just not worth it.
> It's much harder to detect a bad unsigned value. You can think of the sign bit as a bad value bit for values that are supposed to be positive.
You can also think of NaNs and infinities as bad value markers for values that are almost certainly supposed to be finite values. You can also think of the CPU's overflow flag as a bad value marker for signed arithmetic that wrapped around. In these cases Java happily allows me to compute logically incorrect values (I'm aware of Math.addExact etc.). Such bad value bugs happen, and sure they can often lead to exceptions, but not always, and often not immediately so that they are easy to track down. I still don't think unsigned values are somehow different in this regard and must be checked.
> Those few who wish to work with unsigned types can do so efficiently in Java with the unsigned methods in Integer and Long that were added in JDK 8
Those are helpful because they allow one to get things done, but from the point of view of usability they are the worst of both worlds. They are verbose and unchecked. Porting some tricky crypto or math intrinsic from a specification or a reference C implementation becomes unnecessarily hard: The verbose method names obscure the similarity in the code, and you have no help from the type system to check whether you used the unsigned variants in all the right places. I had the joy of fixing such a bug just a week or two ago.
If I'm doing unsigned arithmetic I need to keep track in my head which values to treat as unsigned. Any later comparison on such values should use Integer.compareUnsigned, but the language doesn't help enforce this. This is exactly what type systems are for. Sure I might still compute wrong unsigned values, bugs happen. But at least I would be applying the intended operations. Telling users to go ahead and compute unsigned values but then allowing them to apply logically incorrect operations on them is actually dangerous. Here you're suddenly not so convinced that the language should be strict?
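To make the kind of bug described above concrete, here is a hypothetical snippet (the names are made up; the point is that the wrong, signed operator compiles without complaint):

int len = Integer.parseUnsignedInt("3000000000");  // a uint32 length read from a wire format
if (len < 1024) {
    // taken! 3000000000 wraps to a negative signed int, so the "small buffer" path runs
}
if (Integer.compareUnsigned(len, 1024) < 0) {
    // what was intended -- not taken, but nothing in the type system steers you here
}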
> A popular library may choose to return sizes as unsigned values
Unlikely. If there are no implicit conversions between signed and unsigned, there's not much you could actually do with such values without lots of casts, unlike in C and C++. Users would rightly complain about making the library harder to use for no benefit. I don't see this happening.
In fact, almost the only thing you could easily do with such a size is do arithmetic on it and then pass the result back into the library as an index. If you get things wrong, the library will throw an IndexOutOfBoundsException. That is exactly the reason you gave for why it's OK for signed index calculations to become negative -- "most uses would result in an immediate exception". This would be just as easy or hard to debug as an AIOOBE due to signed arithmetic going wrong.
Anyway. I know I won't change your mind. I get that the feature is not important enough to change the language spec. It's just not worth it? OK, sure. But I don't buy the safety argument.
Python is also widely used server-side, and they introduced f-strings with simple and friendly syntax a few years ago. JS added template literals in 2015’s ES6, when Node.js was very much a thing. Why is Java special here?
Java's syntax is just as friendly and simple -- see my other comments on the subject -- it just requires the receiver to define a policy, which is essential for security. You only need to use STR when the receiver does not define a template processor and works with strings. I have no idea what Python's or JS's security experts advised, but that code injection is one of the most common vulnerabilities in memory-safe languages is a fact reported by all security advisories. String interpolation is one of the most dangerous features a language can have.
I’m not a fan of the special `log.info."x: \{x}";` syntax, it looks like a weird mix of field access and a string literal.
And even with the fancy new syntax, I’m sure there will be people passing STR."SELECT * FROM x WHERE y = \{y}" to their database. You can educate people all you want, but not all developers will read the docs telling them this is dangerous. Even if all the docs do the right thing, people might end up reading old tutorials, and will then notice the “inefficiency” of the prepared statement syntax and will just do a STR."" format. Or they might consider the database.executeQuery."SELECT \{x}"; syntax “invalid” and try to fix it.
Well, you know, even thirty years of experience as the custodians of one of the most successful programming languages is no guarantee of success with every feature, but that's why we have the preview process now that lets us figure out what problems people actually encounter -- which may well be different from the problems they, or we, think they would encounter -- before finalising the feature.
Just remember that you can't pass a string to an API that doesn't take strings, and even with existing APIs, methods that take strings can be deprecated to cause compiler warnings. So sometimes there are stronger ways than just documentation to discourage dangerous code.
If I had a cent for every time pron appealed to Java’s age and “popularity” in technical arguments, I would have at least enough cents to use a dollar-sign in string template syntax.
That's cute, but that appeal is crucial because technical arguments that are divorced from empirical results in the field are a great way to spend a lot of time solving the wrong problems. For example, you can make technical arguments in favour of either VHS or Betamax, but while they both had superior solutions to two separate technical problems, one of those problems (recording time) ended up being a much more important one for more people than the other. The purpose of engineering is not to enjoy intellectual puzzles, but to solve problems for people. You can only insist that the team that's at or near the top of the charts for so many years don't know what they're doing so many times. After a while, you may want to reconsider your beliefs about what the right approach is.
Those who misunderstand that will forever be puzzled by why it is that the VHSes usually win over the Betamaxes, and will continue betting on the wrong horses.
The relatively new concept of preview features helps us gather information on what problems people actually encounter when working with a new feature as opposed to what problems some speculate they'll run into. The best way to influence Java's direction is not to speculate, but to report problems you've actually encountered to the relevant OpenJDK mailing list (amber-dev in this case). If the use of the \ or $ character turns out to be a real problem, we expect to find that out during the preview process (which will begin in JDK 21, out this September) and we'll reconsider, only we'll do it based on reports, not speculation.
Well, I’ll agree that one shouldn’t be naive about the “technically superior” arguments. One has to look at why certain things succeeded. In an honest and non-ideological way (e.g. don’t blind yourself on the “technically superior” talking points, whatever that means).
I don’t really have a problem with a backslash over a dollar-sign. I just couldn’t resist...
When I want to write a query, I will use a prepared statement, or read the docs of the library I am using to interact with the database to see whether it automatically creates a prepared statement. If I use a prepared statement and later read that code, I never again have to ask myself whether it is safe. Quite frankly, anyone who does not know about injection and prepared statements should not be working with databases.
Isn't that better solved by not allowing templates to be passed around and blanks filled in later, but instead forcing them to be executed immediately? This is the case in Ruby, and while there have been plenty of cases of code injection, I've never seen one using the basic string interpolation to do that.
It's not a better solution but one part of this solution (which also requires templates to originate in the program rather than as input by requiring literals). The rest is done to ensure proper, context-specific quoting/escaping for machine-interpretable constructs, such as JSON and SQL.
We want to make the safe choice the easiest choice, and certainly not the trickier choice. We do that by adding a small tax to the dangerous choice -- i.e. string interpolation. When this feature previews we'll know more about how well this works and can then adjust based on people's actual experience.
Are they trying to make Scala look more complicated than the other languages on purpose? The Scala example f"$x%d plus $y%d equals ${x + y}%d" could be written simply as s"$x plus $y equals ${x + y}"
It's funny too because the Scala ecosystem has had safe string interpolation for years. e.g. with Slick sql"SELECT * FROM Person p WHERE p.last_name = $lastName" will produce a parameterized query/prepared statement that you can run, and it does this at compile time so runtime injection like log4shell is impossible.
One of the best applications I've seen is structured and contextual logging [1]. Combined with implicits, this allows you to require the right renderer and/or automatically capture the current state of an execution context.
String name = "Joan";
String info = STR."My name is \{name}";
I don't love it, to be honest. `STR` is a bit much, as is requiring the \ before the opening brace but not the closing one. I want string templates in Java, but this feels ugly.
STR is not some special syntax, just a receiver for a method call. You would only need STR to interpolate a template into a String, but any receiver can specify that interpretation of the template -- or a different one. So, for example, a logger could use:
log.info."x: \{x}";
and similarly for other uses that don't produce strings but JSON, SQL etc. So STR would only be used -- hopefully rarely -- for old APIs that have not determined their own template interpretation strategy and only accept String.
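As a sketch of what such a receiver could look like under the preview API described in the JEP (the Log class, its level check, and emit are hypothetical; only StringTemplate and StringTemplate.Processor come from the JEP):

class Log {
    final StringTemplate.Processor<Void, RuntimeException> info = template -> {
        // The embedded \{...} expressions have already been evaluated at the call site;
        // what the processor controls is how (and whether) they are turned into output.
        if (infoEnabled()) {
            emit(template.interpolate());   // or build structured output from fragments()/values()
        }
        return null;
    };
    boolean infoEnabled() { return true; }
    void emit(String line) { System.out.println("INFO " + line); }
}
// usage, as in the example above: log.info."x: \{x}";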
The use of the backslash is important to distinguish between a string literal and a template literal because "x: \{x}" is not a valid string literal today. Swift uses \(...), BTW.
Because it doesn't hurt, and because it will take some time until all relevant libraries expose their chosen template processor, reducing the need for STR, the tax on less restrained uses needs to be neither too high nor too low.
As far as I understand, the backslash is a mark of a variable. The problem is that it looks like the escape symbol, and this is really confusing. What about another symbol? #, %, |, <...
Of course, $ is ideal because it's already everywhere and familiar.
I guess a typographer or a font designer would be helpful to find a good solution.
> As far as I understand the backslash is a mark of a variable. The problem that it looks like the escape symbol and this is really confusing.
No, it is the escape symbol and works just like the escape symbol because it is the escape symbol.
The escape symbol \ means that whatever follows should not be treated literally but interpreted in some special way. So "\n" means "not the letter n but rather a newline", and "\u1234" means "not the letter u followed by the digits 1, 2, 3, 4, but rather the unicode character U+1234". And in exactly the same way, "\{...}" means "not the opening curly brace followed by some stuff and then a closing curly brace, but rather special semantics for the expression (not variable) within the curly braces".
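A few lines make the parallel concrete (the first two are long-standing escapes; the third assumes the template syntax from the JEP):

String a = "line1\nline2";            // \n: not the letter n, but a newline
String b = "\u0041";                  // \u0041: not the letter u and four digits, but 'A'
String c = STR."1 + 1 = \{1 + 1}";    // \{...}: not literal braces, but the value of the expression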
> Of course, $ is ideal because it's already everywhere and familiar.
It is already everywhere, including in existing Java strings in existing Java code, and the semantics of that existing code must not change. The very fact that it is already everywhere means that it is not ideal in the context of retrofitting this feature onto Java.
Yes, it is a very tiny new syntax so that templates can be distinguished from string literals (so string templates can never stand without a template processor, hence the RAW)
It is indeed ugly. Having switched full-time to Kotlin a little less than a year ago, I honestly prefer their approach to String interpolation using `${var}`.
I understand Java has a backwards compatibility issue that Kotlin does not though, which would make breaking changes to all the hard-coded strings containing `$` that would now need to be escaped. This is discussed in the JEP.
Backwards compatibility - simultaneously Java's strongest and weakest asset.
I'm guessing that the second backslash was deemed unnecessary (and in my opinion it is -- no need for additional keystrokes in the name of some arbitrary consistency).
Besides, you are probably looking at this wrong -- the backslash here is the equivalent of a hash or dollar sign; it denotes the beginning of an expression (just like most languages do). And they literally could not use another one because of backward compatibility, if they want to keep the "no STR"/simple version of interpolation possible. They will just add a new escape sequence \{ -- it's kinda brilliant if you think about it.
I tell you, if the longest lasting super popularity of any programming language in the history of computing (with the possible exception of C) is what failure to get anything right looks like, I hope we keep getting things wrong just like that for decades to come.
But to those who ever wish to have another language that's as successful as Java, I would suggest studying what it is that Java values, which might explain why it is that almost twenty years after having died at the hands of PHP, and some years later being finished off again for good measure by Ruby, and then having had its coffin nailed shut by Node, Java is still more lively than its supposed heirs.
My fear is that Java is now only popular in a professional setting, and is no longer of interest in the broader community. This is sad to me because I personally love the features that you all have been delivering in recent releases, and am very excited about the ones coming down the pike.
A few circumstantial examples. If you look at the Matrix SDKs and implementations (https://matrix.org/docs/projects/try-matrix-now/), Java's presence is sorely lacking. The initial server implementation was written in Python of all things and the rewrite in Go. This is baffling as a Java implementation seems like a no-brainer to me.
In addition to working for a variety of corporations, I have also done some work for academic institutions, specifically in the library space. They have a lot of old tools from the 2000-2010ish range that are predominantly in Java. However, now, these institutions are primarily using Python and Javascript, and are even struggling to find Java developers to maintain their old infrastructure.
fwiw, the team that created Matrix almost exclusively used Java serverside from 2003-2014 (when we switched to creating Matrix). The last gen of Java servers we wrote were super efficient and nice to maintain thanks to netty.
The only reason we switched to Python and Twisted for the first gen Matrix server (synapse) was for rapid prototyping using a platform that we reasoned the open source and selfhosting community would already have installed and be comfortable with. Java felt way too enterprisey and non-open-source-friendly, making quick tweaks to the codebase a huge pain, not to mention the verbosity of the language. The team agreed that expecting casual folks to install and use a JVM just for a chat server would be a major turn-off, and we continue to feel that was the right call.
I think this perfectly demonstrates my point. Thanks for sharing. Java was perceived as "enterprisey and non-open-source-friendly".
It's also interesting to hear that a Java deployment was thought to be more difficult than a Python deployment. My biased opinion is the exact opposite. Java services are generally incredibly simple to run, especially if you make a fat jar executable.
There is no longer any need to download a JDK (which, BTW, also no longer requires any installation) to use Java. Applications are now encouraged to bundle a custom runtime which, thanks to jlink, can be quite small; usually smaller than a Python runtime (~40MB for a runtime suitable to many or most servers).
You have a point, which is why making Java easier to learn, and making it easier to write smaller, less "serious" software, is one of the areas we're focusing on at the moment. You'll see some of the relevant features appearing very soon. Some of the relevant enhancements are on the language side (e.g. https://openjdk.org/jeps/8302326 with more in that area to follow soon) and others on the tooling side. Stay tuned!
Yeah, the choice of \{ to create an expression instead of printing a literal { character is an odd one when you compare how you output a double-quote character:
> To aid refactoring, double-quote characters can be used inside embedded expressions without escaping them as \".
I would expect for consistency that \"{expression}\" outputs an expression and \"\{expression}\" would not.
I maintain the lit-html template library which uses JavaScript tagged template literals to embed HTML in JS.
A key feature of JS template literals is that the tag receives the string fragments and values separately, and can return non-string results. This means we can escape values and validate template structure to prevent XSS attacks.
This approach in Java should lead to a lot more lightweight but safe DSL embeddings.
(I don't love the syntax, but... it's Java)
edit: the other critical feature of JS tagged template literals, that I'm not sure this has, is referential equality of the template strings passed to the tag function across multiple invocations.
This is required to be able to do one-time preparation work on a template and re-use that with different sets of values. Think a `SQL.` template processor that turns the strings into a prepared statement once and re-uses that for each query with different parameters.
edit 2: awesome, it does:
The fragments() of a StringTemplate are constant across all evaluations of a template expression, while values() is computed fresh for each evaluation. For example:
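A rough illustration of that guarantee, using the RAW processor from the preview API (the fruit array and the loop are mine, not the JEP's own example):

String[] fruit = { "apples", "oranges", "pears" };
for (int i = 0; i < 2; i++) {
    StringTemplate st = StringTemplate.RAW."\{fruit[i]} and \{fruit[i + 1]}";
    System.out.println(st.fragments());  // ["", " and ", ""] -- the same constant list on every evaluation
    System.out.println(st.values());     // [apples, oranges] on the first pass, [oranges, pears] on the second
}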
One thing I'm not sure is possible in this proposal, is there a way to make the interpolation "lazy", such that the string (and the evaluation of its interpolated components) can be skipped if the string isn't ultimately used?
In Swift, there's some nice quasi-laziness you can add to function parameters, so that (say) a logging function can fully skip evaluating a string sent to it, thanks to the `@autoclosure` syntax:
log(message: "User is \(someExpensiveFunction(user))", level: .debug)
And if the configured log level does not include debug messages, `someExpensiveFunction(user)` doesn't get called.
This works because @autoclosure lets you take a parameter that is "function returning String", but callers can just pretend they're passing it a String, without having to decorate it in a function. The compiled code will turn it into a closure behind the scenes, and thus it'll be evaluated lazily.
Not sure if there's any way to do something like this in Java with this proposal...
That would require explicit support by each template processor implementation. The Object#toString approach works independently of the template processor (assuming they all call toString() on the parameters).
Since the string template interface is untyped (parameters are arbitrary Objects), making case distinctions based on a parameter's runtime type is fiddly. What if you pass an instance of a class that has a genuine toString() but also happens to implement Supplier<Foo> for some reason? Or if you pass a Supplier<Supplier<Foo>>, will it recurse? What if you pass a Callable or a Future? Will those work as well? This only creates the opportunity for opaque “magic” behavior.
The question was whether string templates will support laziness, not whether whatever the string template is eventually used for supports laziness. String templates supporting laziness would make you independent from what the point-of-use supports.
I think that's the wrong question. String templates don't need to support laziness. It's a well-established pattern in Java that laziness is done via Suppliers, and that directly solves the OP's question about logging where the level isn't enabled. This is exactly how logback and other logging implementations handle it now.
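For instance, java.util.logging has accepted Supplier-based messages since Java 8; someExpensiveFunction and user are the placeholders from the comment above:

java.util.logging.Logger log = java.util.logging.Logger.getLogger("app");
// The lambda is invoked only if FINE-level logging is enabled for this logger.
log.fine(() -> "User is " + someExpensiveFunction(user));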
I thought the choice of STR was ugly at first, but as you continue reading and see how it allows multiple options that provide more than generic string joining, it really starts to make sense.
Unlike many here I do like the choice of \{}, although personally I would have preferred \() like Swift.
STR seems to function like a class name, where you’re calling a static method. In fact that’s literally what you’re doing, but it seems to be syntactic sugar that you don’t have to write the method name and the parentheses.
Since class names start with a capital it fits. At least I think that was the reasoning they stated.
Why not Str? Perhaps they thought this would be less likely to appear in existing code. I’m not sure; I don’t think they mentioned the reason.
They could have also just put it on the String class. Again I didn’t see a discussion as to why that wasn’t suggested.
String."Hello, \{name}."
Seems like it would have been reasonable. But this is a proposal so it may change before officially joining the JDK.
It would be more readable to have all the values in their respective place in the SQL string, rather than have a bunch of question marks followed by all the values bunched together at the end. e.g. this:
PreparedStatement query = SQL."SELECT * FROM users WHERE firstname=\{firstName} and lastname=\{lastName} and email=\{email}";
rather than:
query = client.prepare("SELECT * FROM users WHERE firstname=? and lastname=? and email=?",
firstName, lastName, email);
This literally even has examples in the JEP: one can create custom template processors that don't even have to return a String but can return a PreparedStatement instead.
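Roughly along the lines of the JEP's example, such a processor could look something like this (a sketch assuming the preview API, java.sql imports, and a JDBC Connection named conn in scope):

StringTemplate.Processor<PreparedStatement, SQLException> SQL = template -> {
    // Join the literal fragments with ? placeholders; the embedded values never become SQL text.
    PreparedStatement ps = conn.prepareStatement(String.join("?", template.fragments()));
    int index = 1;
    for (Object value : template.values()) {
        ps.setObject(index++, value);   // every \{...} value is passed as a bind parameter
    }
    return ps;
};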
That database stuff looks horrible. Why do they feel the need to introduce DB query templating into string templates? No matter what you do on the client side, the database engine itself should escape/validate the data. Didn't we learn that lesson with PHP?
In addition to that, not every database needs prepared statements for safe queries, e.g. "parameterized queries" in PostgreSQL (available in libpq as PQexecParams and exposed in many other higher-level languages).
That is exactly the point: you should not use a general string templating system for SQL queries, together with "roll your own" escape and validation mechanisms. I really don't see why they included that part, if not to show how to shoot yourself in the foot.
I am very confused by your comments. PHP developers thought "sanitizing" strings aka escaping and validating strings is enough to get rid of SQL injections and that is how they ended up with multiple iterations of escaping functions. The problem, which is the separation of code from data, has not been solved and that is why it is a bad idea. The SQL example template in the article uses positional parameters via JDBC and is therefore completely safe to use. It is impossible to get it wrong except by using STR which is obviously the wrong template processor.
>No matter what you do on the client side, the database engine itself should escape/validate the data. Didn't we learn that lesson with PHP?
You apparently didn't learn any lesson from PHP. The database engine's inability to distinguish code characters from data characters is what led to SQL injections in the first place.
It doesn't matter whether you replace the template expression with a ? or with $1. The database receives the parameters outside the SQL query and treats them as user input either way.
You have to escape things before they reach the database engine. Prepared statements do this via ? placeholders, but those are hard to read. With this you get named placeholders, which are much nicer to read.
All strings are templatable in Kotlin. No need to have different types of syntax for that. What purpose does this s or STR. prefix serve? It seems completely redundant to me.
It helps to actually read the article. There will be multiple template processors, not just one. This proposal goes beyond simple string interpolation and adds security measures against injection attacks.
I'm mostly developing in Kotlin btw., but usually when Java introduces its take on a feature that has existed for a while in other languages, it improves on it. One such example: handling whitespace/indentation in multi-line strings in Kotlin vs. Java. In Kotlin you usually have to call .trimIndent(); in Java the defaults are sane, so you don't have to.
The article in this case is a longish specification. I skimmed through it. It looks over-engineered to me. That's my honest first impression. A dead giveaway is that it needs a longish specification.
And I indeed missed the bit about a pluggable template engine. Just one question related to that: why? I don't see the big use case for this. Maybe a bit less flexibility and a more sane syntax would be better?
As long as it is technically solid, ugly is just fine in the places where Java is used most. Even if it had the loveliest syntax, Python/JS devs are not gonna jump to code in Java.
I love it. The design choices are great, given the number of constraints. Looking forward to IDE and Maven plugin support of this feature, which can add a lot of juice at pre-compile time, enabling very well integrated DSLs.
The template instantiation itself is done by the language, not libraries, and the entire mechanism was designed for security (read the JEP); for example, templates are (virtually) limited to literals and can't come from user input.
As to boxing, the built-in template processors, STR and FMT, don't do boxing (they use MethodHandle mechanisms similar to those used by lambdas) -- FMT is ~40x faster than String.format, I'm told -- but that capability hasn't been exposed as an API yet. As with other features, we try exposing basic usages first, and more sophisticated ones later.
> the entire mechanism was designed for security ...
I don't have the same definition of security (I've read the JEP). Unlike in TypeScript, you cannot type the template values.
> built-in template processors, STR and FMT, don't do boxing ...
but all the other user-defined template processors (think a logger) are second-class citizens and will do boxing.
Efficient String interpolation is hard without macros; that's why it's a macro in Rust.
This JEP tries to solve a harder problem, user-defined template processors, but does not provide the tools to make them efficient and never will. Are you suggesting that there is a JEP about macros in the making?
> Efficient String interpolation is hard without macros
Not quite. It is hard without some compile-time computation, but what's compile-time for languages relying on AOT compilation (and requires macros or compile-time introspection as in Zig), happens at runtime in Java because that's when the optimising compilation happens, and Java has a user-exposed mechanism for defining call-sites (https://docs.oracle.com/en/java/javase/19/docs/api/java.base...). It's just that, as with most of our new features, we expose a simple API first and a sophisticated API later. No need for macros.