2017-05-05

Is it time to fork Guava? Or rush towards Java 9?

Lost Crew WiP

Guava problems have surfaced again.

Hadoop 2.x has long shipped Guava 14, though we have worked to ensure it runs against later versions, primarily by implementing our own copies of classes that were removed or moved between Guava releases.


Hadoop trunk has moved up to Guava 21.0 (HADOOP-10101). This version has gone and overloaded the Preconditions.checkState() method, such that if you compile against Guava 21, your code doesn't link against older versions of Guava. I am so happy about this I could drink some more coffee.
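To make the linkage problem concrete, here is a minimal sketch that uses no Guava at all. OldPreconditions and NewPreconditions are hypothetical stand-ins I've made up for the two shapes of the class: one with only a varargs checkState(), one with an extra fixed-arity overload. The same source-level call binds to a different method descriptor depending on which one javac sees, and that descriptor is what the JVM looks for at link time.

```java
public class OverloadLinkageDemo {

  /** Hypothetical stand-in for an older Preconditions: varargs overload only. */
  public static final class OldPreconditions {
    public static String checkState(boolean expression, String template, Object... args) {
      return "(boolean, String, Object...)";
    }
  }

  /** Hypothetical stand-in for a newer Preconditions: adds a fixed-arity overload. */
  public static final class NewPreconditions {
    public static String checkState(boolean expression, String template, Object... args) {
      return "(boolean, String, Object...)";
    }

    public static String checkState(boolean expression, String template, Object p1) {
      return "(boolean, String, Object)";
    }
  }

  /** Identical call site; javac can only pick the varargs method here. */
  public static String boundAgainstOld() {
    return OldPreconditions.checkState(true, "state %s", "x");
  }

  /** Identical call site; javac prefers the fixed-arity overload over varargs. */
  public static String boundAgainstNew() {
    return NewPreconditions.checkState(true, "state %s", "x");
  }

  /** Does the class declare the exact signature a "new" call site was compiled to? */
  public static boolean hasFixedArityOverload(Class<?> c) {
    try {
      c.getDeclaredMethod("checkState", boolean.class, String.class, Object.class);
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("compiled against old: " + boundAgainstOld());
    System.out.println("compiled against new: " + boundAgainstNew());
    System.out.println("old class has the new signature: "
        + hasFixedArityOverload(OldPreconditions.class));
  }
}
```

With the real libraries, the missing signature would surface as a NoSuchMethodError at runtime rather than a tidy boolean, and only on the code path that makes the call.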

Classpaths are the gift that keeps on giving, and any bug report with the word "Guava" in it is inevitably going to be a mess. In contrast, Jackson is far more backwards compatible; the main problem there is getting every JAR in sync.

What to do?

Shade Guava Everywhere
This is going to be tricky to pull off. Andrew Wang has taken on this task. This is one of those low-level engineering projects which doesn't have press-release benefits but which has the long-term potential to reduce pain. I'm glad someone else is doing it & will keep an eye on it.

Rush to use Java 9
I am so looking forward to this from an engineering perspective:

Pull Guava out
We could do our own Preconditions, our own VisibleForTesting attribute. More troublesome are the various cache classes, which do some nice things...hence they get used. That's a lot of engineering.
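As a rough idea of how small the easy part is, here is a sketch of a home-grown replacement. LocalPreconditions and its nested VisibleForTesting annotation are hypothetical names, not existing Hadoop classes, and the message formatting uses String.format, which is stricter than Guava's lenient %s substitution.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Hypothetical in-house replacement for the Guava classes used most often. */
public final class LocalPreconditions {

  /** Marker only: documents that visibility was widened for tests. */
  @Documented
  @Retention(RetentionPolicy.CLASS)
  public @interface VisibleForTesting {
  }

  private LocalPreconditions() {
  }

  /** Throws IllegalStateException if the expression is false. */
  public static void checkState(boolean expression, String template, Object... args) {
    if (!expression) {
      // Assumption: String.format semantics; too few args will itself throw.
      throw new IllegalStateException(String.format(template, args));
    }
  }

  /** Throws IllegalArgumentException if the expression is false. */
  public static void checkArgument(boolean expression, String template, Object... args) {
    if (!expression) {
      throw new IllegalArgumentException(String.format(template, args));
    }
  }

  /** Returns the reference if non-null, otherwise throws NullPointerException. */
  public static <T> T checkNotNull(T reference, String message) {
    if (reference == null) {
      throw new NullPointerException(message);
    }
    return reference;
  }
}
```

Keeping only the varargs form means there is exactly one signature per method to stay binary compatible with, at the cost of an array allocation per call. The cache classes have no equivalent shortcut.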

Fork Guava
We'd have to keep up to date with all new Guava features, while reinstating the bits they took away. The goal: stuff built with old Guava versions still works.

I'm starting to look at option four. Biggest issue: cost of maintenance.

There's also the fact that once we use our own naming, "org.apache.hadoop/hadoop-guava-fork", Maven and Ivy won't detect conflicting versions, so we end up with more than one version of the Guava JARs on the classpath, and we've just introduced a new failure mode.
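One way to at least notice that failure mode at runtime is to ask the classloader how many copies of a class file it can see. This is a sketch only: DuplicateClassFinder is a hypothetical helper, and it assumes the fork keeps the original com.google.common package names, so a relocated (shaded) copy would not show up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic: list every classpath location providing a resource. */
public class DuplicateClassFinder {

  /** Returns the URLs of all copies of the resource visible to this classloader. */
  public static List<URL> findCopies(String resourcePath) {
    ClassLoader loader = DuplicateClassFinder.class.getClassLoader();
    if (loader == null) {
      loader = ClassLoader.getSystemClassLoader();
    }
    try {
      return Collections.list(loader.getResources(resourcePath));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    List<URL> copies = findCopies("com/google/common/base/Preconditions.class");
    System.out.println("copies found: " + copies.size());
    for (URL url : copies) {
      System.out.println("  " + url);
    }
    if (copies.size() > 1) {
      System.out.println("warning: more than one Guava-like JAR on the classpath");
    }
  }
}
```

Which copy actually gets loaded still depends on classpath order, so this tells you there is a problem, not which version won.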

Java 9 is the one that has the best long-term potential, but at the same time, the time it's taken to move production clusters onto Java 8 makes it 18-24 months out at a minimum. Is that so bad though?

I actually created the "Move to Java 9" JIRA in 2014. It's been lurking there, with Akira Ajisaka doing the equally unappreciated step-by-step movement towards it.

Maybe I should just focus some spare-review-time onto Java 9; see what's going on, review those patches and get them in. That would set things up for early adopters to move to Java 9, which, for in-cloud deployments, is something where people can be more agile and experimental.

(photo: someone painting down in Stokes Croft. Lost Crew tag)
