Proposal to change default annotation processing policy in JDK 23 (openjdk.org)
94 points by mfiguiere on May 28, 2024 | 107 comments


"This policy of implicitly running annotation processors by default may have been reasonable when the feature was introduced in JDK 6 circa 2006, but from a current perspective, in the interest of making build output more robust against annotation processors unintentionally being placed on the class path, the policy should not be the default anymore."

This makes sense, especially in light of the growing number of supply chain attacks. However, given how many libraries a typical Java application has in the classpath, most of which are transitive dependencies of popular frameworks, I foresee many build jobs not producing working binaries anymore when JDK23 is adopted. Plus, I'd love to be proven wrong, but I could bet that most people will fix the issue by simply adding the "all-on" switch to the command line instead of carefully evaluating which annotation processors are really needed by their codebase. So I am afraid that in the long term this change will not improve security or reduce build times. Unless maybe JDK 23 comes with some tool to print a catalogue of the annotation processors found in the classpath, with their respective purpose and documentation, so that developers can make an educated guess about what they need or don't need.


> but I could bet that most people will fix the issue by simply adding the "all-on" switch to the command line

Definitely this. This is a dev experience problem.

> Unless maybe JDK 23 comes with some tool to print a catalogue of the annotation processors found in the classpath, with their respective purpose and documentation, so that developers can make an educated guess about what they need or don't need.

My thoughts on how a solution to the dev-x problem might look go in the same direction: have a way to provide a file with an accept list, another file with a deny list (a deny entry vetoing any matching accept entry) and, this would be the key devx feature, a mode to automatically populate the accept list file from the full scan. Teams who'd be tempted to run the "all-on" mode could instead keep running auto-populate on the accept file, put the auto-populated accept file in versioning, and see additions to the processor zoo show up in commits. And de-trusting a processor (or speeding up the build) would be as easy as copying a line to the deny file.

A minor improvement that would probably come up at some point: the auto-generated accept file had better not automatically omit entries excluded by the deny file, so that replacing the deny list with a subset of the accept list remains a straightforward option. But chances are you'd encounter situations where some dev environments have a processor that other build environments don't have (this would not exist in a perfect world..), and that would cause undesirable changes to the auto-generated accept file. So you'd want a stronger deny option that already applies during the scan, either with some special syntax in the deny file or (better, I think) an optional file for "even higher priority deny".

As to auto-generating documentation for the educated guess: clearly a trade-off between content and conciseness. My vote would be offering a way for processors that want to be helpful (or misleading, in the attack scenario!) to supply text for trailing line comments in the auto-generated accept file, for teams who do want to use the auto option. (Trailing line comments are such an under-utilized magic compromise for conciseness/volume conflicts, probably because they aren't for readers running their editors with auto-wrap.)
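To make this concrete, a sketch of what the two files might look like (file names, format and the trailing comments are all hypothetical; the processor class names are merely illustrative):

    # processors.accept - auto-populated from the full scan, kept in version control
    org.mapstruct.ap.MappingProcessor                    # generates mapper implementations
    com.google.auto.value.processor.AutoValueProcessor   # generates value classes

    # processors.deny - a match here vetoes any accept entry
    com.example.SuspiciousProcessor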


> My thoughts on as to how a solution to the dev-x problem might look like go in the same direction: have a way to provide a file with an accept list, another file with a deny list

I'm not involved with this particular change so I'm not familiar with its details, but when it comes to other changes restricting operations that involve extra risk (such as JEPs 260 and 472) the operations are enabled on a per-module basis and, as always, any part of the command line configuration can be easily put into configuration "@files".
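For illustration: javac can already be pointed at an explicit processor list via an @file (the -processor flag and @files are real; the file name processors.args is made up). Invoked as javac @processors.args <other args>, with the file containing:

    -processor org.mapstruct.ap.MappingProcessor,com.google.auto.value.processor.AutoValueProcessor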

One thing to keep in mind is that such configurations should be kept simple because when things become complicated misconfiguration becomes common.

Another thing is that restriction should be the default, especially in situations where most people don't need to disable it. Otherwise things may creep in without the application owner's knowledge (this is like transitioning between high and low entropy; it's always better to start at low entropy, as going in the other direction requires energy).


With respect to auto-generating documentation, I was thinking about the possibility for annotation processor implementors to add javadoc comments, or yet another annotation, to their processor that the JDK could use to dump a list of the processors found in the classpath as a table with some information.

> javac <other parameters> -proc:info

some.library.Processor:

    scans classes annotated with <whatever> and produces a file in META-INF/services
    so that <some.library> can load your implementation via ServiceLoader

some.otherlibrary.OtherProcessor:

    .....

With that table, creating your inclusion/exclusion files would be easier.


> Plus, I'd love to be proven wrong, but I could bet that most people will fix the issue by simply adding the "all-on" switch to the command line instead of carefully evaluating which annotation processors are really needed by their codebase.

That’s ok!

For those people, they hit a small speed bump where they have to make a small change to their build. For those of us who care, we get a bunch of control we may choose to exert.


> As of the April 2024 JDK security updates, support for the "-proc:full" option has been backported to 17u (17.0.11) and 11u (11.0.23) for both Oracle JDK and OpenJDK distributions.

Someone tell me again how LTS isn't a thing for OpenJDK.


I read this as "the updated versions can now tolerate the option that makes future versions behave the same, instead of failing with 'error: invalid flag'".

This is exactly how you want an LTS to be supported long term.


OracleJDK has LTS defined. OpenJDK doesn’t, but as changes made to forks have to be migrated back to the original codebase, it does tend to be a “de facto” LTS version.


I assume this is to prevent a file-dropping attack, similar to DLL injection.

How hard is this to exploit in practice? Does javac include the current directory in the class path? Does it look in other directories that are easy for other users to drop files in?

Also, how much are people running javac directly? I would guess a lot of people use build tools like Gradle or Ant that limit the class path, right?


How hard is this to exploit in practice? Very - this is a silly update in OpenJDK's war against things like Project Lombok.

It _seems_ easy to exploit: Just.. get any jar file containing an annotation processor on the classpath and it will be executed as part of `javac` - and almost every java build tool calls javac under the hood.
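To make the mechanism concrete, here's a minimal sketch (hypothetical names): javac discovers processors through the standard ServiceLoader lookup, so a jar only needs a META-INF/services/javax.annotation.processing.Processor file naming a class like the one below, and that class then runs inside the compiler process:

    // registered via META-INF/services/javax.annotation.processing.Processor
    package com.example;

    import java.util.Set;
    import javax.annotation.processing.*;
    import javax.lang.model.SourceVersion;
    import javax.lang.model.element.TypeElement;

    @SupportedAnnotationTypes("*") // claim interest in every annotation
    public class SneakyProcessor extends AbstractProcessor {
        @Override
        public synchronized void init(ProcessingEnvironment env) {
            super.init(env);
            // arbitrary code executes here, inside the javac process,
            // as soon as compilation starts
            System.err.println("hello from inside javac, user="
                    + System.getProperty("user.name"));
        }

        @Override
        public boolean process(Set<? extends TypeElement> annotations,
                               RoundEnvironment roundEnv) {
            return false; // claim nothing; stay invisible to the build
        }

        @Override
        public SourceVersion getSupportedSourceVersion() {
            return SourceVersion.latestSupported();
        }
    }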

However, this is misleading: _if_ somebody with malicious intent manages to either sneak a jar file into the build dependencies somehow, or manages to take control of a commonly used dependency xz-style, the damage is done. That jar will also be on the classpath when running the app. So now we're running compromised code. Java is not a sandboxed thing; running malicious code inside a JVM is a bit like an XSS web attack: the game is lost. Totally and utterly.

The fact that javac runs annotation processors by default until JDK 23 does mean that compromised code in the build chain runs _during the build_, whereas starting with JDK 23 it'll only run when you run the app.

This doesn't seem impactful to me; I'm having a hard time figuring out scenarios where this is meaningfully less bad. Developers tend to run the app they are writing. The odds that a developer will build a source tree and never actually run what they built seem too low to consider this a meaningful contribution to security.

CI servers might be a useful place to look for 'systems that will run the build but will not run the app', except - no. Just about every CI tool will run some tests, thus, running the app, thus, running the malicious code.

Which gets us back to OpenJDK's backwards-compatibility-breaking crusade. The OpenJDK team has also broken reflection (you can no longer access anything in another module that wasn't explicitly exported without command line switches, even though reflection used to be able to do this; reflection has 'CAREFUL! You are accessing APIs that were not designed to be messed with!' written on the tin. It's.. the point of it) - same reasoning: "for security", without being particularly clear about how that update contributes to security. It's not that it is impossible to use reflection to cause serious damage (it is very possible to do that, in fact). It's more that if you are running malicious code inside a JVM, we've got much, much bigger problems.

It's sort of like stating that security is improved by ensuring that it is no longer possible to open the front door from inside the house without a key. Seems nice - but, they're... already inside the house.


> OpenJDK's backwards-compatibility breaking crusade. The OpenJDK team has also broken reflection (you can no longer access anything in another module that wasn't explicitly exported

I’ve been out of the Java world recently, but my understanding is that an application developer can still do anything they want. All of these new restrictions are for library developers. To be clear the library developer can still do anything, they just have to make it explicit.

This is the opposite of a backwards compatibility breaking crusade. This is a crusade to make sure application developers don’t accidentally depend on libraries that are either breaking encapsulation or depending on JDK internals. This should improve backwards compatibility.


Slightly OT, but sincere thanks from me to you for creating Lombok.

It's a tool that sparks strong opinions, which is a testament to its significance and the impact it has made. You've created something that people are passionate about, whether they're for or against it.

Personally, I'm firmly in Team Lombok. I believe the negative feedback it receives is disproportionate.

For those of us who use it, Lombok significantly improves our coding experience. For those who don't, it's entirely optional and doesn't interfere with their workflow.

Thanks again for creating something that made me enjoy coding in Java :)


Those of us that are not on team Lombok still have to deal with it due to transitive dependencies.


Complain to the author of your direct dependencies (or better, send a PR if open source). There is absolutely no reason why lombok should get pulled in as a transitive dependency, apart from ignorance on the library author's side.


I'd rather get rid of them when possible, instead of indirectly contributing to lombok's existence.


It's clear you are just spiteful and have absolutely no clue what you are talking about.

Lombok does not get pulled in as a transitive dependency unless the author of your dependency fucked up their build. RTFM and you're good. Even if they fucked up, you can just exclude the transitive dependency. Easy.

Lombok produces bytecode that could have been written by hand. You will not be able to tell the difference when using these compiled classes.


Except you are forgetting that building dependencies from source is also something that happens in real life, as is opening such projects in IDEs.

You are right about one thing: I make my life easier by choosing other alternatives when possible.


Wouldn’t BFS only require Lombok’s presence at the build step rather than after deployment?

Or is your stance here against it showing up as a transitive dependency during development when including sources written with Lombok? If so, that's pretty extreme; the build-dependency surface of many packages is really huge. Do you have issues with other large or opinionated build dependencies (protoc, schema downloaders like buf, test frameworks)?


Rather uncommon in the java world. Anyway, you can just delombok if you hate it so much.


There are multiple attack vectors via the supply chain that this new setting prevents.

Sure, if the compromised library is one of your core libraries that runs during build, test and runtime, then this does not help. But there are

  test libraries only
  compile time only libraries (hello lombok?)
  transitive dependencies that may not be used during runtime (or run only in rare code paths)

Sometimes compromising the build environment is more valuable than the app's runtime environment - e.g. it may allow the attacker to compromise all apps.

Explicitly enabling a particular annotation processor I want to run is a small price to pay for the increased security.


> test libraries only

Typically runs during build (unit tests), just like annotation processors.

> compile time only libraries (hello lombok?)

Right, annotation processors. This is what we're discussing.

> transitive dependencies that may not be used during runtime (or run only in rare code paths)

Irrelevant. If they are compromised they will just set themselves up as an SPI and run on JVM start.
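For instance (a minimal sketch with hypothetical names): a compromised jar can ship a META-INF/services/java.sql.Driver entry, and the class's static initializer fires as soon as anything in the process touches DriverManager - no annotation processing required:

    package com.example;

    import java.sql.*;
    import java.util.Properties;
    import java.util.logging.Logger;

    // named in META-INF/services/java.sql.Driver inside the jar
    public class EvilDriver implements Driver {
        static {
            // payload runs when DriverManager first loads registered drivers
            System.err.println("loaded via ServiceLoader");
        }
        public Connection connect(String url, Properties info) { return null; }
        public boolean acceptsURL(String url) { return false; }
        public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
            return new DriverPropertyInfo[0];
        }
        public int getMajorVersion() { return 1; }
        public int getMinorVersion() { return 0; }
        public boolean jdbcCompliant() { return false; }
        public Logger getParentLogger() { return Logger.getGlobal(); }
    }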


Had some developers not started monkey patching, this wouldn't be needed; Java isn't Ruby.

Go is now taking a similar approach, because being internal and not a public symbol apparently isn't a clear enough signal.


Public/private does nothing to change the problems a developer has to deal with. On the JVM side the most widely "hacked" private API was sun.misc.Unsafe, which just could not be implemented using pure Java, and Oracle did not design public replacement APIs until the restrictions on it were announced.

> because internals and not being public symbols apparently isn't clear enough.

You might as well tell a starving person that eating bread is illegal.


> Oracle did not design public replacement APIs until the restrictions on it were announced.

The very opposite is true. All the remaining required replacements for Unsafe were put in place in JDK 22, and access to Unsafe has been unrestricted and unencumbered in any way until the upcoming JDK 23 [1]. It is precisely because we know people have come to depend on Unsafe that we had not started to restrict its use until all replacements were delivered.

> You might as well tell a starving person that eating bread is illegal.

Also no, because no functionality has been taken away. The only thing that has been taken away is the ability of a library (possibly a deep transitive dependency) to unilaterally use internals (again, not Unsafe, which wasn't encapsulated) -- something that has a global effect on the application -- without the application's knowledge and consent. In other words, doing the thing is not illegal; what's illegal is doing it without the application's permission.

If a library is, indeed, trusted by the application to do things that carry special risk, then it's easy for the application to grant it permission; if it isn't trusted, then surely there is no argument that the trust should be implicitly taken rather than given.

[1]: https://openjdk.org/jeps/471


Private APIs already started being restricted in previous versions, requiring command line flags like --add-exports=jdk.compiler/com.sun.tools.javac.api


Yes, those restrictions were turned on by default at runtime starting in JDK 16 (and at compile-time starting in JDK 9), but they specifically did not include Unsafe [1] precisely because it's been widely used for capabilities for which there were no supported replacements.

Capabilities requiring access to other internals -- such as jdk.compiler/com.sun.tools.javac.api -- do not have supported replacements as they can violate the specification in ways that the specification exists to prevent.

[1]: https://openjdk.org/jeps/260#Critical-internal-APIs-not-enca...


This seems like a real vulnerability if you're using legacy infrastructure - if you're running your build process on a highly privileged build machine, like a single large Jenkins instance. These machines might host a bunch of subprojects - and a bunch of credentials to log in to other prod systems for deployment purposes.

This is not the reason that I prefer containerized build solutions, but it is a real concern, outside of the little bubble that is the startup ecosystem.

Edit: It occurs to me that since I just gave a talk on this, it behooves me to link it: https://youtu.be/dswPHnfGwlY


Annotation processors on the compile classpath are not automatically added to the build output's runtime classpath. The status quo doesn't change, because annotation processors also don't do anything when running the app (e.g. with java and not javac), on any Java version.
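For example, Maven's compiler plugin can resolve processors from a dedicated processor path instead of the compile classpath; a sketch, with illustrative coordinates:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <annotationProcessorPaths>
          <path>
            <groupId>org.mapstruct</groupId>
            <artifactId>mapstruct-processor</artifactId>
            <version>1.5.5.Final</version>
          </path>
        </annotationProcessorPaths>
      </configuration>
    </plugin>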


An annotation processor can generate code which will be included in the build output's runtime classpath.


which is completely irrelevant in the context of this specific change: automatically picking up annotation processors.

If you don't like the (generated) code of a lib, don't use it.


It's not irrelevant.

A rogue (automatically run) annotation processor will generate exploit code bundled into the build output artifact, which will then run on the production systems.

Besides, an annotation processor can attack the build system as well.


> this is a silly update in OpenJDK's war against things like Project Lombok

Project Lombok doesn’t use the normal annotation processing system, as that is deliberately “add-only” - you can’t change a class’s implementation, unlike what lombok does. They instead hack into the javac compiler to be able to modify class files, which is a very different mechanism (and prone to break with any javac update, which they don’t control) and I think it’s quite easy to see why it’s not loved (though this “war” bullshit is just propaganda from the creator’s ego or whatever).

Also, a strict default is a good stance (as with the reflection restrictions) -- you can access everything with just a few command line flags, so I don't really see all the complaints. The point is, you have to know whether some module in your system accesses another in a non-standard way. Like, are firewalls overly strict because they only allow traffic through port 22 when configured to do so? Should they start with allow-all?


> They instead hack into the javac compiler to be able to modify class files, which is a very different mechanism

They only do that because there is no public Java API to do the things they want to do. If a public Java API to do the same things were made available, I’m sure they’d gladly migrate to doing that instead


> They only do that because there is no public Java API to do the things they want to do. If a public Java API to do the same things were made available, I’m sure they’d gladly migrate to doing that instead

Lombok is an alternative language for the Java platform, and the thing they want to do is modify javac so that it compiles Lombok source (which does not conform to the Java Language Specification) rather than Java source. You are correct that the JDK does not currently wish to offer an API that would allow code, without any special configuration, to change the compiler so that it violates its own specification. If you want to change the behaviour of a JDK tool in a way that violates its specification, then you need to explicitly configure it to allow that.

The Java platform, however, does support many alternative languages, and that is not a violation of the specification. The only reason Lombok is experiencing a technical challenge that, say, Clojure, Kotlin and Scala do not, is because it insists on being used in a way that hides its operation. If Lombok were used like all other Java platform languages it would experience no friction.


> Lombok is an alternative language for the Java platform, and the thing they want to do is modify javac so that it compiles Lombok source (which does not conform to the Java Language Specification) rather than Java source. You are correct that the JDK does not currently wish to offer an API that would allow code, without any special configuration, to change the compiler so that it violates its own specification.

There are many other languages for which nobody would make the argument you just made, because the language has some kind of macro system (or equivalent thereof). Lombok is trying to plug some major gaps in Java’s feature set, and the fact that it exists and is so popular is testament to the fact that the gaps are real. And the “public API” I am talking about would essentially be the missing macro system (or the foundations of one)

> If Lombok were used like all other Java platform languages it would experience no friction.

Because Lombok is trying to be a minimalist extension to Java not a completely different language with very different syntax and semantics.


> because the language has some kind of macro system

Sure, but Java currently does not offer a macro system (which brings many downsides in addition to upsides) and that's by choice. Offering what amounts to a macro system via an API would be a decision with big ramifications that is not to be taken lightly. Even then, the language needs to decide the extent of the powers of such a system.

You don't have to like this state of affairs, but you can't pretend it isn't what it is. Every language gets to choose what guarantees it offers its users.

> Because Lombok is trying to be a minimalist extension to Java not a completely different language with very different syntax and semantics.

And that's perfectly fine, but the Java specification, by current choice, does not allow making either small or large changes to it. Any language that does not conform to the specification of the Java language is, by definition, not Java regardless of how similar it is. Consequently, the Java compiler does allow code implementing another language to change the compiler's inner workings to compile that new language by configuring it in a certain way, but it does not allow code to do that without special configuration while masquerading as a Java library that appears to conform to the specification.


> You don't have to like this state of affairs,

This attitude is part of why I have given up on Java – despite having spent so much of my career on it – and nowadays avoid it as much as I can. In my own mind it is a legacy language - if I have some existing code base in Java, extending/enhancing it might be the path of least resistance, but for a greenfield project it would be far from my first choice.


Any philosophy, let alone policy, isn't for everyone, and we want to make our policies clear so that the people who align with them will choose the platform, while those who do not -- won't. It is this careful philosophy and policy, placing an emphasis on integrity (safety) and specification, that has made Java the #1 chosen language for new important server-side applications today; i.e. it is more people's first choice than any other language for this domain (despite those who have considered it a legacy language since 2006).

Java says that an application can monkey-patch whatever it likes but that a library needs to be given that permission by the application. If that's not what you need, then Java isn't for you and you should consider it a legacy language for your needs. Like some other languages and unlike others we will continue placing an emphasis on safety, as that appeals to most of our users and hopefully to future users, too. Those who want to easily monkey-patch anything from any code without controls -- which is also a valid requirement -- also have a selection of languages to choose from.


> Java the #1 chosen language for new important server-side applications today

What is the evidence for that claim?


Any real world market analysis survey that shows what languages are used across the industry, not only SV startups, when performance matters.

Plus the whole development toolchain for the mobile OS with 80% of the world market, even if Kotlin is favoured, there is plenty of Java on Android Studio, Gradle, Bazel, AOSP, and Maven Central.


Ask for a report from a market analyst.

Out of curiosity, though, what language would you imagine is now chosen more frequently than Java for important server-side applications? (you can test your guess against some publicly-available jobs data such as https://www.devjobsscanner.com/blog/top-8-most-demanded-prog...)


It is a vague claim. How do you decide what is “important”?

And if the backing is a non-public market analyst report, different such reports can reach different conclusions.

> you can test your guess against some publicly-available jobs data such as

Which I note has both JavaScript/TypeScript and Python ahead of Java. Both of which are being used for a lot of server-side applications - whether or not you consider them important is rather subjective. (JS/TS obviously includes a lot of front-end stuff, and that site doesn't disaggregate front-end from back-end; but Python is almost all back-end.)


> It is a vague claim. How do you decide what is “important”?

Java is so clearly ahead and by so much that it doesn't really matter, but generally, "important" can mean size, expected lifetime, criticality for business or some combination of them.

> Which I note has both JavaScript/TypeScript and Python ahead of Java.

Python is about on par or slightly ahead, but the vast majority of Java use is for server-side applications, while the vast majority of Python use is for data analysis etc., so clearly Java is much more favoured for serious applications (not to mention Python's serious scaling issue). JS is indeed significantly ahead, but again, clearly the vast majority of JS use is for web client programming. Node.js was very big for a while, but it isn't as big anymore. Again, ask any market analyst. Perhaps in 2035 some new language is going to show up and unseat Java, but in 2024, Java is the language of choice for serious server-side apps -- not the majority choice (i.e. all other languages combined may make up more than 50%), perhaps, but the first choice nonetheless.


I’ve been spending the last few months reimplementing Java backend code in Python. I’m sure I’m not the only one. It actually wasn’t originally my idea - I fought the decision and was initially upset about it, but now I’ve been living with it I’m actually glad they forced it on me :)

Yes it is true that Python can have limitations with high scale. But there are solutions to that (multi-process Python app servers for example). And the Python core performance story is improving (GIL removal is finally happening, JIT is moving into the core.) Plus there are a massive number of Java/Spring/etc business apps/microservices which aren’t actually high scale (I’ve written some myself) and could just as easily be done in Django/Flask/etc

The thing with saying that most Python is for data analysis is that what starts out as some ad hoc data analysis or data science prototype sooner or later morphs into a production service. And maintaining the same language from prototype to production makes life a lot easier - especially when the data science team decides they have to fundamentally change algorithms to improve performance, leading them to rewrite half of it in the middle of the project. It is much easier to teach a backend developer Python (many of whom already know it anyway) than to try to get a data scientist to learn Java.

Python is the de facto standard language of AI, and AI initiatives are driving a lot of Python adoption. But once you are using it for AI, why not consider it for non-AI use cases too? There isn’t a hard boundary between the two anyway - a lot of the Python code I’ve been writing recently has been related to getting OAuth tokens to talk to various pre-existing microservices (mostly Java with some node.js) and although I wrote that code for AI use cases it is obviously very applicable to non-AI use cases too.


> I’ve been spending the last few months reimplementing Java backend code in Python.

Lots of people are doing the opposite once they hit scale. DoorDash went from Python to Kotlin/JVM for their backend: https://doordash.engineering/2021/05/04/migrating-from-pytho...


From the article you linked:

> our monolith was built on old versions of Python 2 and Django, which were rapidly entering end-of-life for security support

That doesn't seem to be an issue with Python scalability per se. There are massive creaking monolithic Java apps out there, stuck on old versions of the JDK and various Java libraries, which are just as brittle.

Also, if the discussion is about defending the design choices of the Java language, this blog post doesn't really support that defence: while they did choose the JVM as a platform, they also selected Kotlin over Java.


I think you're selectively reading that to fit your narrative. The design choices of the Java language are the design choices of the Java Platform and that's exactly why they chose it:

> CPU-efficient and scalable to multiple cores

> Easy to monitor

> Supported by a strong library ecosystem, allowing us to focus on business problems

> Able to ensure good developer productivity

> Reliable at scale

> Future-proofed, able to support our business growth


> I think you're selectively reading that to fit your narrative.

Please don’t tell people they are selectively reading things, because if you are going to do that, I can do the same right back to you. You will note up-thread I’m complaining about gaps in the Java language (not platform) which are why Lombok exists, and the Java language maintainers’ unwillingness to provide official solutions within the language to address that - which is something Kotlin handles much better (it has some Lombok-like features, plus its DSL support). So, just as you accuse me of selectively reading that blog post to fit my narrative, I can accuse you of selectively reading this thread to fit yours - but mutual accusations of “selective reading” aren’t really adding anything useful to the conversation, are they?

> The design choices of the Java language are the design choices of the Java Platform and that's exactly why they chose it:

No. As I said, the design choices of the Java language not to provide many Lombok-style features, whereas Kotlin does, has nothing to do with the JVM as a platform


You realize that conversations can evolve, right? I merely pointed out that your anecdotal evidence of rewriting from Java -> Python can be matched by examples of people moving in the reverse direction, away from Python. You then claimed this was not for scalability issues, which in fact it was, if you don't selectively read the article to fit your anecdotal evidence. I _never claimed this was about language features_.

Cheers.


> You then claimed this was not for scalability issues, which in fact it was if you didn't selectively read the article to fit your anecdotal evidence

They said at the start of the article that the primary motivation for finding a new technology stack was they were running on Python 2 and old versions of Django, and they also had the kind of issues which commonly happen with monolithic apps (slow bisection).

They then said they wanted to look for a new platform. And some of the reasons why they decided to pick Kotlin/JVM over CPython3 because they viewed the former as having likely better scalability and manageability.

If anyone here is "reading selectively", it is you, not me – you are mixing up (1) their original reasons for looking for a new platform (2) the reasons they chose for picking the new platform they did. If you read the blog post carefully, the reason they chose Kotlin/JVM was because they expected it would scale more easily in the future – which might be true – but the present day scalability issues they were having were due to an outdated stack and a monolithic architecture (problems which can occur on any technology stack), not those future expectations.


I like java, but it's unfortunate the current stewards come off as unlikable on social media. Most of the work they're doing is great, though.


I particularly like the current stewards. They've done a great job and have been consistently putting Java and the JVMs best interest at the forefront.


So would you suggest he change it to work like Quasar?


That's one option. Quasar was configured to run as a runtime agent while the Lombok compiler runs at compile-time, but the Lombok compiler could perhaps indeed be implemented as a Java agent of javac's. This, however, would also require a special configuration of javac, just like all other alternatives. A simpler option is to offer a separate launcher for that compiler (and/or a special build-tool plugin, which is also what Quasar did in its AOT mode).

No matter what, code that wishes to break the guarantees that JDK programs want to make needs to configure that program in some special way so that there's some record, auditable by the application author, saying "this program has been modified in a way that may make it behave not in accordance with its specification".


You could do a lot of the lombok things as a runtime agent. I think @ToString, @EqualsAndHashCode, @Synchronized, @Getter(lazy=true), @UtilityClass, @Delegate, and @Cleanup could all be implemented in a runtime agent. Most of the rest could work as a runtime agent too as long as you were willing to type out the method signature (plus maybe a native keyword) for the things you actually want to link against at build time.

I think this would end up being way more difficult to use though and I think it would be perceived to have way more risk compared to lombok as it is now. What do you think?

Also is something wrong with paralleluniverse.co? The root domain seems to go to a gambling site?


Yet another reason not to complain when those things go away.


Lombok uses the annotation processor to bootstrap itself. That particular part of Lombok is uncontroversial and is in fact well-documented and used by other tools.

The parts of Lombok you have an issue with are not impacted by this change.


He knows what Lombok does. He’s the author.


Kind of - but the point remains. Lombok requires hooking into (altering) the compiler, not (just) the annotation processing itself. I'd be okay-ish if Lombok were a mere post-compilation/enhancement tool, but it isn't. It hooks into the compilation step itself.


I’m not refuting the point. Just giving kaba0 context that they are explaining how a library works to its author.


Well, then he just prefers telling lies. Annotation processors can’t modify classes, this is a fact. The primary purpose of Lombok is adding new methods like getX and setX to the same class based on fields. It’s pretty easy to conclude that Lombok is thus not an ordinary annotation processor, and actually uses sun.misc.Unsafe to go into the private internals of javac to modify the AST, which has become possible only by specifying some additional arguments, notifying the end user that some libraries on the class path might do something that can’t be promised to keep working forever. This is a completely reasonable decision on the Java team’s part (and it's quite narcissistic to conclude that it is aimed at your lib..), as many of the breaking changes between 8 and 9 were actually due to libraries doing exactly that. These changes help uphold Java’s strong backwards-compatibility guarantees.


I wasn’t refuting anything and in fact I’m onboard with these changes. I do not like Lombok and _in my opinion_ it’s for the laziest of devs. I just wanted to give you context that you are explaining how a library works to its author.


Well, I was sort of explaining it to the wider audience, but fair enough, I wasn’t actually aware it was the author.


> That jar will also be on the classpath when running the app.

No. In many Maven configs, the runtime classpath is not the same as the compile-time one.
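A sketch of the usual pattern (the version is illustrative): a compile-time-only dependency declared with provided scope is visible to javac but never packaged into the artifact:

    <dependency>
      <groupId>org.projectlombok</groupId>
      <artifactId>lombok</artifactId>
      <version>1.18.32</version>
      <!-- compile time only; not on the runtime classpath -->
      <scope>provided</scope>
    </dependency>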


Is the OpenJDK behavior different from Oracle JDK?


They are basically the same, other than support and packaged set of GC configurations.


> OpenJDK's backwards-compatibility breaking crusade.

First, Java has always been very explicit about where it offers backward compatibility and where it does not. You cannot break backward compatibility where it must not be expected. These are classes that carry a warning saying: these are internal classes that offer no kind of backward compatibility and can be changed at any time and without warning; by depending on them you are accepting upon yourself the responsibility to respond to any change. Nevertheless, in most situations we are offering plenty of advance warning.

Second, the "crusade" isn't done to break anything or against anyone, but because these internal and not-backward-compatibility-breaking changes are necessary to offer Java users the features they're asking for, which, in turn, rely on integrity [1]. We're talking about changes to the very core assumptions of the platform, which could violate any invariant and have a global impact, that could have been made by any code in any transitive library. In order to offer certain features the platform must know which of its invariants it may trust (for example, the JIT compiler cannot perform certain optimisations that assume strings are immutable because even though it is an invariant of the platform, some third-level dependency could have decided that actually strings would be mutable in any application that consumes this library).

Nevertheless, the platform does not prevent code from choosing to violate integrity invariants. It just prevents libraries from doing so -- which has a global effect on the application -- without the application's knowledge.

> "For security" without being particularly clear about how that update contributes to security.

The changes I was referring to above (deep reflection, Unsafe, JNI/FFM, dynamic agents) are about integrity, not security (you can think of integrity as a generalisation of memory safety, which is a special case of integrity; it is not security in itself, but it can make security easier). We are very clear both about that and about the relationship between integrity and security [1].

As far as this particular change to annotation processing, however -- a change I'm not personally involved with and don't know the specifics of -- the email does mention security as the motivation. It has been a long-standing policy of the JDK team (as well as that of many other projects) not to disclose any specifics about any vulnerabilities involved. OpenJDK has a specialised vulnerability group [2], made up of people from multiple companies, and they are given access to the vulnerabilities.

> It's sort of like stating that security is improved by ensuring that it is no longer possible to open the front door from inside the house without a key. Seems nice - but, they're... already inside the house.

Yeah, one, it's not about security but about integrity -- as explained in the motivation [1] -- which is, indeed, essential for security but for other important things as well, and two, your security analysis is just wrong. That is why, for security, it is best to rely on security experts and not on people unfamiliar with the field who go by what seems to make sense to them.

> if you are running malicious code inside a JVM, we've got much, much bigger problems.

Benevolent, well-meaning code with unintended vulnerabilities is a far bigger security problem than malicious code in server applications. Malicious code is a common problem on the client, but benevolent code is the bigger danger on the server, so if you start thinking about what it is that malicious code could do you know you're thinking in the wrong direction.

[1]: https://openjdk.org/jeps/8305968

[2]: https://openjdk.org/groups/vulnerability/


Not sure how much that would help. As far as I understand, if you have access to putting stuff in the class path, surely you can just override Java classes and run arbitrary code that way.


I think the difference is that annotation processors run arbitrary code at compile time.

An org might have e.g. a CI with a build environment that’s not as well-sandboxed as the test environment for the built app, because a Java compiler isn’t generally expected to (and other than through annotations, usually doesn’t) expose arbitrary code execution abilities to the payload of code being compiled.


> An org might have e.g. a CI with a build environment that’s not as well-sandboxed as the test environment for the built app, because a Java compiler isn’t generally expected to (and other than through annotations, usually doesn’t) expose arbitrary code execution abilities to the payload of code being compiled.

Is that kind of setup common though? I’ve never seen anybody running sandboxed tests but non-sandboxed compiles. In my personal experience, either one has the ability to sandbox and sandboxes everything, or one lacks that ability and sandboxes nothing.

I have seen compile and unit tests run directly on a Jenkins agent (with a lot of ability for the job to mangle the agent config), but then spinning up a Docker container for integration tests - but in that case the motivation for the Docker container isn’t sandboxing, and the Docker container is often given lots of privileges (like access to the Docker socket)


Is it actually common practice these days to have Java repositories that do not contain the build scripts, packaging, etc?

The last time I looked at maven, it seemed easy enough to have it run arbitrary code directly during the build.


> Is it actually common practice these days to have Java repositories that do not contain the build scripts, packaging, etc?

It's not that they don't contain these things, but rather that you can (and people often do) set things up so that the build scripts + packaging can be "more trusted" than the source files.

If you've ever tried to set up CI on e.g. Github for an open-source project, then you might be familiar with the concept of a "PR attack" — where an external contributor forks your project, submits a PR that adds malicious code to your build scripts (to e.g. exfiltrate your build-time secrets); and then your CI "helpfully" runs those build scripts (in order to e.g. evaluate that the PR compiles + passes tests + lints in order to determine whether it should be blocked from merging or not.)

GitHub and others have come up with ways around "PR attacks", that involve treating triggered automation workflows for external PRs differently: in these workflow runs, the core of the workflow — the workflow manifest file — is sourced not from the contributor's branch, but rather from the base branch that the PR aims to merge to.

Now, it's up to you, as a repo maintainer, to come up with a way to bootstrap that little bit of safety (protected workflow manifest) into whole-repo "PR attack" arbitrary-code-execution protection. But usually doing so involves:

1. moving as much of the logic for executing your build as possible out of the repo itself and into buildpacks / "action" repos; and

2. sourcing any scripts you do need for the build, not from the worktree of the external PR, but rather by having the workflow environment also check out the base branch, and then running those scripts against your worktree. (IIRC the most common pattern for this is that you check out the base branch worktree, blow away its src/ dir, symbolic-link the PR branch's worktree's src/ into the base branch as src/, and then run your build in the context of the base branch.)

This approach is incomplete, however, if the PR's source files can themselves be the source of arbitrary code execution.
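For the curious, a sketch of what that bootstrap can look like on GitHub Actions (the pull_request_target trigger and the checkout action are real; the paths and build script are made up):

    # .github/workflows/pr.yml - taken from the base branch, not from the PR
    on: pull_request_target
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4        # base branch: trusted scripts
          - uses: actions/checkout@v4
            with:
              ref: ${{ github.event.pull_request.head.sha }}
              path: untrusted-src            # the external PR's worktree
          - run: ./ci/build.sh untrusted-src # trusted script, untrusted sources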


This is mostly about supply chain attacks, where transitive dependencies are at play. Those are mostly just already-built jar files, which can still contain annotation processors, but not build scripts.


I assume this is to have more control over the compilation process.

Without annotation processors, you can expect your code to be compiled and to behave in an obvious way.

With annotation processors, all bets are off: your code and the code that results from compilation are completely different entities.

So with this switch being explicit, you might enforce policies like banning annotation processors, for better clarity.

While security theoretically might be better, in practice with modern build tools there are enough ways to cause code execution, so it probably doesn't matter much.


Using Gradle, Ant or Maven still means, in the end, you're calling javac. All it takes for this to be exploited is for the files to be dropped anywhere in the classpath.


It's an annotation processor. They generally require annotations in your own source code to kick in.

As a security risk this is pretty minor. About on the same level as any software project with any dependencies risking running arbitrary code unless you audit those dependencies. This is of course a very real risk and it has affected a bunch of projects. But it's not really stopping people from using things like cargo, npm, etc.

Overall, it makes sense to make the use of annotation processors a bit more explicit. With Kotlin this is kind of how it works as well. You have things like ksp that you have to configure explicitly if you want to use them. Additionally there are compiler plugins that you can configure if you need them. It's not a big deal to configure this explicitly. I actually prefer it over magic discovery mechanisms that are hard to debug when they don't work.
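Plain Java on Gradle already works this way too: processors must be declared on the annotationProcessor configuration and are not picked up from the compile classpath. A sketch (coordinates illustrative):

    dependencies {
        implementation("org.mapstruct:mapstruct:1.5.5.Final")
        annotationProcessor("org.mapstruct:mapstruct-processor:1.5.5.Final")
    }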

So, good change that probably simplifies the build process a little.


> They generally require annotations in your own source code to kick in

Right but you don’t know which annotation processor will actually run. Anybody could look for javax.persistence.Entity and do something. There’s no guarantee only your JPA provider will be running and looking at them.


In exactly the same way, unless you audit your dependencies, you have no idea what you are going to run. It all boils down to whether you trust your dependencies. The only difference here is that this is a compile time dependency, not a run-time dependency. But unless you checked it, there are no guarantees.


Sure. Hypothetically, I have checked them. I have checked what I’m actually using, but the compiler is still executing things I didn’t know about.

I only use CatUtils from org.catpache.commons. I’ve audited this single class that I use. I know it’s safe. It only contains a Map<String, String> of Latin cat names to their common English names, but what I didn’t know was that the compiler was magically running an annotation processor behind my back, and it has now modified all of my classes to throw MeowException whenever toString returns “dog”.


It’s pretty easy to statically verify that, say, no dependency (even a transitive one) contains a System.exit call.

It’s basically impossible to determine what your annotation processors will output short of running them yourself (and their output may not be deterministic to begin with).


Mixed feelings about this one: now every project has to start tweaking compiler options for Lombok etc. I kind of like how it works out of the box.

I guess if you have a crazy project and a lot of transitive dependencies there is a small chance some processor lands on the classpath accidentally and starts processing things, but I have not run into this in practice.


Lombok has come up a bit in this discussion. Are there any other popular Java libraries or frameworks that are affected?


Lombok is not affected by this, as it is not an annotation processor.

The most popular annotation processor that I've seen "in the wild" is the Hibernate Metamodel Generator (https://hibernate.org/orm/tooling/).

Also, Immutables (https://immutables.github.io/), my favorite Lombok alternative, is affected.

Of note, you can bypass this more security-conscious approach by just passing `-proc:full` to javac.
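For example (a sketch), either directly on the command line or through the compiler plugin's arguments:

    javac -proc:full MyClass.java

    <!-- or inside maven-compiler-plugin's <configuration> -->
    <compilerArgs>
      <arg>-proc:full</arg>
    </compilerArgs>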


> Lombok is not affected by this, as it is not an annotation processor.

Lombok uses an annotation processor to bootstrap itself.


I know of these:

- Immutables
- AutoValue
- MapStruct
- Checker Framework

There's quite a list here https://github.com/gunnarmorling/awesome-annotation-processi... (Though I don't think Error Prone is actually an annotation processor, but rather a javac plugin.)

There's some irony in that Immutables and Autovalue are often named as alternatives for people that dislike Lombok's implementation but do like (some of) Lombok's features.


No others come to mind. Most other libraries are proper annotation processors, meaning they abide by the rules and are only additive, generating new classes. One such is MapStruct, which is pretty frequently used.


Don't quite understand the hate. This is very reasonable behavior for a compiler in 2024. I would even go so far as to say that a compiler should never execute any user code by default.

Imagine if gcc could automatically, by default, download some shared library and include it in the build because of a special macro included by a random header file on the system.

Of course, this change can be opted out of (just add `-proc:full` to the command line) if you want the old, less secure behavior.

Also, Lombok is not affected because it's not an annotation processor. (Also, look at Immutables: https://immutables.github.io/).


I find it a great improvement. Secure by default.

To me the hate started when Java went from a statically compiled language to a halfway dynamic language that does a fair bit of compilation at application startup, based on a bunch of annotations. Suddenly many Java projects introduced Ruby-on-Rails-level "magic" (yes, looking at you, Spring (Boot)).

I'm not totally against annotations, but it's easy to overuse them just to remove a bit of boilerplate.

To me Kotlin's approach makes more sense: reduce boilerplate by making the language more expressive, adding KClass and KFunction (https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.reflect) while mostly avoiding annotations and exceptions.

Kotlin also has a nice story for immutability.


The Java team is still breaking things despite numerous pleas from the community not to do so? Color me surprised.

The Security Manager sends its regards.


You mean pleas from some members of the community not to do so and many more pleas from many more members of the community to the contrary. It is impossible to do "the community's bidding" when different parts of the community demand contradictory things. So we try to cater to the majority while giving the minority sufficient time to adapt.

SecurityManager is a case in point. You and a few others claimed that removing it would have a large harmful impact. We proceeded in our usual cautious manner to first test the more widely believed hypothesis to the contrary before doing anything irreversible and put a warning in place in JDK 17. A lot of people have adopted JDK 17 or later over the last few years and those of them who are using SecurityManager have seen the warning about its planned removal. As most people believed, the warning did not uncover some hidden widespread use of the feature (and your campaign didn't manage to find widespread support for your position).

It is perfectly alright to be dissatisfied when a decision doesn't go your way, but it's not alright to present it as if the decision went against "numerous pleas" without mentioning that, however numerous (which, in this case, was fewer than ten people), those pleas came from a small minority. Had we gone the other way, far more people would have been dissatisfied.


Never seen anyone ask you to get rid of the security manager. Fix and improve it, sure. But no, it had to go and we did not get a suitable replacement. RIP.


Of course you did -- many, many, many times -- you just didn't know that that's what you were seeing.

People rarely directly ask for an old feature that they're not using to be removed. Rather, they ask for new features that the old feature interferes with. SecurityManager imposed a significant tax on nearly every new feature, as it interacts with just about everything, slowing down much of the work on the JDK. SecurityManager was (actually, still is as it has not been removed yet) simultaneously one of the most expensive features and least used features in the JDK. Everyone who asked for new significant features effectively asked for SM to be removed, as they could not be delivered -- certainly not as expeditiously -- with it in place.

In fact, as early as JDK 8, new features (such as streams) had to work around SM (simply documenting that it could not be effectively used in combination with the new feature; few people noticed because few people used SM). With time, it became ridiculous to say that SM is ineffective when used in combination with more and more new features (e.g. virtual threads, and the same would have been the case for Leyden [1]), and it still kept exacting a tax on most uses, so it had to be removed if we were to give people the features they were asking for. The fact that more robust, effective, and relevant security mechanisms are now not only available but have grown far more popular in Java applications than SM ever was only meant that the whatever arguments there may have been to keep it and delay and significantly complicate new features were weak.

[1]: I shudder to think how Leyden could be done while keeping SM in some functional state.


> The fact that more robust, effective, and relevant security mechanisms are now not only available but have grown far more popular in Java applications

Like what? Or are you referring to isolating the entire JVM? That's pointless for plugin systems that don't want to deal with the ridiculous amount of overhead that'd entail.

> SecurityManager was (actually, still is as it has not been removed yet) simultaneously one of the most expensive features and least used features in the JDK

Well, yeah. It was hard to implement; I'd know, because at the time I was integrating it into a plugin framework. But we eventually abandoned that idea when we heard the announcement of SM being deprecated with no replacement.

Would've been great to be able to support sandboxed plugins for the game we were working on.


> Like what?

Like all the security mechanisms that are used by virtually all security-sensitive Java applications, from cgroups/containers and firewalls to encryption protocols and serialization filters (they're not using SM).

> That's pointless for plugin systems that don't want to deal with the ridiculous amount of overhead that'd entail.

You cannot offer secure server-side plugins (i.e. on shared machines) with a mechanism like SM, and even client-side plugins now use different, simpler, mechanisms.

> Would've been great to be able to support sandboxed plugins for the game we were working on.

It's not that hard to offer some in-process client-side sandboxing using modules and some instrumenting class loaders without SM. It may not be very robust, but SM wasn't as robust as some imagined it was, either.

There aren't in-process isolation mechanisms robust enough for server-side use -- where DoS attacks are a common threat -- and even on the client it would be both safer and simpler to sandbox the entire process or make no security claims (many client-side programs with plugins opt for the latter).


> Like all the security mechanisms that are used by virtually all security-sensitive Java applications, from cgroups/containers and firewalls to encryption protocols and serialization filters (they're not using SM).

Not reasonable to implement across all platforms users may choose to run a game on. This discussion is about a game client, not the server. A replacement solution would have to work everywhere Java runs, and should not impact the user's system in any noticeable way.

> but SM wasn't as robust as some imagined it was, either.

The docs, at the time, implied SecurityManager was the way to go to run untrusted code, similar to Applets.

Since there is no reasonable alternative and the JDK team has seemingly given up on this feature, we've instead opted to require all plugins to be source-available, manually vet the code, build & distribute via our CI, and only allow the client to load plugins signed by our CI.


> This discussion is about a game client, not the server.

Well, I didn't know that's what this discussion is about, but sure, we can talk about that. :)

> A replacement solution would have to work everywhere Java runs, and should not impact the user's system in any noticeable way.

Except 1. there's little demand for such a system in the JDK (and, as I've said, you can do something reasonable on your own with modules, class loaders, and a tiny bit of instrumentation) and 2. I don't think anyone offers such a general mechanism, especially as different programs would have different robustness requirements, some of which cannot be achieved in-process (e.g. web browsers, the prime example of programs running untrusted code, these days use process isolation).

> The docs, at the time, implied SecurityManager was the way to go to run untrusted code, similar to Applets.

Yes, that was the best way; whether or not the best was good enough is another matter.

> Since there is no reasonable alternative and the JDK team has seemingly given up on this feature, we've instead opted to require all plugins to be source-available, manually vet the code, build and distribute them via our CI, and only allow the client to load plugins signed by that CI.

I would say that even SM may not have given you what you need, but yeah, with it or without it, some other approaches are required for better fortification. I would additionally recommend looking into some sort of basic sandboxing -- which controls which APIs are exposed to the plugin -- using modules and instrumentation (which can inspect/restrict some of the operations exported by, say, the java.base module).
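
For the modules half, something along these lines gives a plugin its own module layer so that it can only read the modules you resolve for it; the paths and module/class names here are hypothetical:

    import java.lang.module.Configuration;
    import java.lang.module.ModuleFinder;
    import java.nio.file.Path;
    import java.util.Set;

    public class PluginLayer {
        public static void main(String[] args) throws Exception {
            // Resolve modules only from the plugin's directory, rooted at the
            // plugin module; anything it can't reach from here isn't visible.
            ModuleFinder finder = ModuleFinder.of(Path.of("plugins/cool-plugin"));
            Configuration cfg = ModuleLayer.boot().configuration()
                    .resolve(finder, ModuleFinder.of(), Set.of("com.example.plugin"));
            ModuleLayer layer = ModuleLayer.boot()
                    .defineModulesWithOneLoader(cfg, ClassLoader.getSystemClassLoader());
            Class<?> entry = layer.findLoader("com.example.plugin")
                    .loadClass("com.example.plugin.Main");
            entry.getMethod("run").invoke(null); // assumes a static run() entry point
        }
    }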


The success of VS Code plugins, microservices, containers, and out-of-process VSTs has proven that on modern hardware people favour stability and improved security over in-process plugins.

.NET also dropped its version of SecurityManager (Code Access Security) when they did the Core rewrite.


All of those plugins have shit latency. This model is not suitable for games, at all. The plugins need to be able to render their own graphics, which happens at 60~120fps. Also have you ever tried running ~200 JVMs on the same machine?


> Also have you ever tried running ~200 JVMs on the same machine?

This is one of my pet peeves with the "garbage collection" model of memory management: it does not play well with other processes in the same machine, especially when these other processes are also using garbage collection.

With manual memory management (and also with reference counting), whenever an object is no longer being used, its memory will be immediately released (that is, the memory use of a process is always at the minimum it needs, modulo some memory allocator overhead). With garbage collection, it will be left around as garbage, and its memory will only be released once the process decides that there's too much garbage; but that decision does not take into account that other processes (and even the kernel for its page cache) might have a better use for that memory.

This works fine when there's a single process using most of the memory on the machine, and its garbage collection limits have been tuned to leave enough for the kernel to use for its caches (I have seen in practice what happens when you give too much memory to the JVM, leaving too little for the kernel caches); but once you have more than a couple processes using garbage collection, they'll start to fight over the memory, unless you carefully tune their garbage collection limits.

It would be really great if there were some kernel API which allowed for multiple processes (and the kernel caches) to coordinate their garbage collection cycles, so that multiple garbage collectors (and in-process caches) would cooperate instead of fighting each other for memory, but AFAIK such API does not exist (the closest I know of is MADV_FREE, which is good for caches, but does not help with garbage collection).
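
In the meantime, individual JVMs can at least be told to hand memory back instead of hoarding it. A hedged sketch of existing knobs (availability varies by JDK version; SoftMaxHeapSize and ZUncommitDelay apply to ZGC, G1PeriodicGCInterval to G1, and the sizes/intervals below are made up for illustration):

    # ZGC: soft-cap the working heap and uncommit idle memory after 30s
    java -XX:+UseZGC -Xmx4g -XX:SoftMaxHeapSize=1g -XX:ZUncommitDelay=30 -jar app.jar

    # G1: run a periodic GC when idle so unused heap gets uncommitted (JEP 346)
    java -XX:+UseG1GC -Xmx4g -XX:G1PeriodicGCInterval=60000 -jar app.jar

It's still per-process tuning rather than the cross-process coordination you describe, but it softens the "fight over memory" failure mode.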


Contrary to common belief, if memory strain is an issue with GC, it is even worse with allocators that cannot cope with fragmentation or that have to keep going down into the OS for memory management.

Optimizations to avoid fragmentation, lock contention, or stop-the-world domino effects in reference-counting algorithms eventually end up being a poor implementation of a proper GC.

Finally, just because a language has a GC doesn't mean it doesn't also offer language features for manual memory management and reference counting if one feels like it.

While Java failed to build on the learnings from Eiffel, Oberon, and Modula-3, others did, like D, Nim, and C#.

Not all GCs are born alike.
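
In Java's case, the Foreign Function & Memory API (final in JDK 22) is exactly such a feature: arena-scoped off-heap allocation that is released at a known point, with no collector involved. A minimal sketch:

    import java.lang.foreign.Arena;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.ValueLayout;

    public class DeterministicMemory {
        public static void main(String[] args) {
            // The segment's lifetime is the arena's: memory is returned to
            // the native allocator exactly when the arena closes.
            try (Arena arena = Arena.ofConfined()) {
                MemorySegment buf = arena.allocate(ValueLayout.JAVA_LONG, 1024);
                buf.setAtIndex(ValueLayout.JAVA_LONG, 0, 42L);
                System.out.println(buf.getAtIndex(ValueLayout.JAVA_LONG, 0));
            } // freed here, deterministically
        }
    }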


I'm building this JEP for automatic heap sizing right now to address this when using ZGC: https://openjdk.org/jeps/8329758. I did in fact run exactly 200 JVMs, running a heterogeneous set of applications, and it ran totally fine. By totally fine I mean that the machine got rather starved of CPU and the programs ran slowly due to having 12x more JVMs than cores, but they could all share the memory equally without blowing up anyway. I think it's looking rather promising.


> With manual memory management (and also with reference counting), whenever an object is no longer being used, its memory will be immediately released

Well, this is a fundamental space vs time tradeoff — reclaiming memory takes time, usually on the very same thread that would be doing useful work we care about. This is especially prominent with reference counting, which is the slowest of them all.

Allocators can make reclamation cheap/free, but not every usage pattern fits nicely, and in other cases you are fighting fragmentation.


> Well, this is a fundamental space vs time tradeoff — reclaiming memory takes time, usually on the very same thread that would be doing useful work we care about.

Precisely. Which is fine if you don't have to share that space with anyone else; the example which started this sub-thread ("running ~200 JVMs on the same machine") is one in which that tradeoff goes badly.

But it wouldn't be as much of an issue if the JVMs could coordinate between themselves (and with other processes on the same machine), so that whenever one JVM (or other things like the kernel itself) felt too much memory pressure, the other JVMs could clean some garbage and release it back to the common pool of the operating system.


It might even be a problem without garbage collection - Linux might be a big culprit here with its tendency to overcommit memory. Some signal that says “try to free some memory” would be welcome - I believe macOS has something like that (memory-pressure notifications).


Which games are you shipping in Java that depend on Security Manager's existence?

Most games are deployed as services nowadays; they have a full network between their rendering and game logic.

Yes, that is what Cloud Services do all the time across the globe in Kubernetes.


> Which games are you shipping in Java that depend on Security Manager's existence?

None, because it didn't pan out like I described above. No sense continuing developing something using a technology that is due to be removed. The project was abandoned.

This was for a third-party Old School RuneScape client that supports client-side Java plugins. The current approach is to manually vet each plugin and its updates before they are made available for users to install.

> Most games are deployed as services nowadays; they have a full network between their rendering and game logic.

Networked games do not communicate at 60~120fps. That's just not how it works; writing efficient netcode with client-side prediction is important for a reason.

> Yes, that is what Cloud Services do all the time across the globe in Kubernetes.

Yeah, on the servers they pay several grand a month for. Not on end-user craptops, which is where games obviously run.


You might want to take a look at GraalVM's Isolates. They are lightweight security boundaries that can even limit things like memory usage and CPU time, for Java, JS, and Python among other languages, all within a normal JVM process.
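
A minimal sketch using the polyglot API's ResourceLimits, assuming a GraalVM runtime with the JS language installed (the statement cap is a crude CPU bound; the newer sandbox options add proper memory and CPU-time limits but require the isolated runtime):

    import org.graalvm.polyglot.Context;
    import org.graalvm.polyglot.ResourceLimits;
    import org.graalvm.polyglot.Value;

    public class IsolatedPlugin {
        public static void main(String[] args) {
            // Abort guest code after one million executed statements;
            // exceeding the limit throws a PolyglotException.
            ResourceLimits limits = ResourceLimits.newBuilder()
                    .statementLimit(1_000_000, null)
                    .build();
            // Host access is denied by default, so the guest can't touch
            // arbitrary Java classes unless explicitly allowed.
            try (Context ctx = Context.newBuilder("js")
                    .resourceLimits(limits)
                    .build()) {
                Value result = ctx.eval("js", "6 * 7");
                System.out.println(result.asInt());
            }
        }
    }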



