Introduce guava-libraries as a GeoServer core dependency and provide some general guidelines on when, why, and how to use them
I've been using some of the Guava utilities for most of last year in other GeoServer-related projects. On the mailing list we decided a GSIP would be worthwhile as an introduction to its benefits and as a reference for other GeoServer developers.
This proposal aims to introduce the Google guava-libraries as a core GeoServer dependency and to provide some guidelines and material for the progressive adoption of its utility classes, ranging from collection utilities to I/O, concurrency, primitive and String operations, cache facilities, and more.
In a nutshell, an excerpt from the Guava Explained wiki:
- Basic utilities: Make using the Java language more pleasant
- Collections: Guava's extensions to the JDK collections ecosystem. These are some of the most mature and popular parts of Guava.
- Caches: Local caching, done right, and supporting a wide variety of expiration behaviors.
- Functional idioms: Used sparingly, Guava's functional idioms can significantly simplify code.
- Concurrency: Powerful, simple abstractions to make it easier to write correct concurrent code.
- Strings: A few extremely useful string utilities: splitting, joining, padding, and more.
- Primitives: operations on primitive types, like int and char, not provided by the JDK, including unsigned variants for some types.
- Ranges: Guava's powerful API for dealing with ranges on Comparable types, both continuous and discrete.
- I/O: Simplified I/O operations, especially on whole I/O streams and files, for Java 5 and 6.
- Hashing: Tools for more sophisticated hashes than what's provided by Object.hashCode(), including Bloom filters.
- EventBus: Publish-subscribe-style communication between components without requiring the components to explicitly register with one another.
- Math: Optimized, thoroughly tested math utilities not provided by the JDK.
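To give a taste of a few of the items above (basic utilities, collections and strings), here is a minimal sketch. The class and method names (`GuavaTasteSketch`, `parseLayerNames`, `describe`) are made up for illustration and are not GeoServer code. Only the Guava classes themselves (`Splitter`, `Joiner`, `Preconditions`, `Ordering`, `ImmutableList`) are real.

```java
import com.google.common.base.Joiner;
import com.google.common.base.Preconditions;
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Ordering;

import java.util.List;

// Hypothetical example class, not part of GeoServer.
public class GuavaTasteSketch {

    /** Parses a comma separated list of names, dropping padding and empty entries. */
    static List<String> parseLayerNames(String raw) {
        // Fail fast with a clear message instead of a late NullPointerException
        Preconditions.checkNotNull(raw, "raw layer list is null");

        Iterable<String> names =
                Splitter.on(',').trimResults().omitEmptyStrings().split(raw);

        // sortedCopy returns a new sorted list; ImmutableList.copyOf makes it read-only
        return ImmutableList.copyOf(Ordering.<String>natural().sortedCopy(names));
    }

    /** Joins the names back into a single human readable string. */
    static String describe(List<String> names) {
        return Joiner.on(", ").join(names);
    }
}
```

The returned list is immutable, so callers cannot accidentally modify what may be shared state, which is one of the design choices Guava encourages.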
It's on Maven Central, so just:
It's a single but sizable jar, around 1.5 MB. To avoid increasing the size of our downloads too much, it looks like we could at least get rid of the following libraries (thanks Andrea):
The following are just some small concrete examples of using Guava utilities in GeoServer, and focus only on the bits that I got to use so far.
We use a lot of caches, especially in core classes like CatalogImpl and ResourcePool.
Some are plain HashMaps, others are custom-crafted specializations of SoftValueHashMap. Some need to do additional clean-up when a resource is evicted from the cache.
So in ResourcePool we have all these cases. Replacing those HashMaps and custom classes with Guava Cache lets us do more with less code:
- Set cache capacity bound;
- Entry expiration based on last access time or last write time;
- Ability to use weak keys and/or soft value references
- Concurrency hints (the table is internally partitioned to try to permit the indicated number of concurrent updates without thread contention).
- For the cases where resource clean-up needs to be done upon entry eviction, it encapsulates the cache population logic and the entry eviction hooks in a single object, so related logic stays close together.
- Eliminates the need for the "double-checked locking anti-pattern", so that every get method on cacheable contents becomes basically:
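A minimal sketch of the points in the list above, assuming a Guava version that provides `LoadingCache` (10.0 or later). The `ResourceCacheSketch` class, its nested `Resource` type, and the `newCache` and `getResource` methods are hypothetical stand-ins, not the actual ResourcePool code, and the size, concurrency and expiration values are arbitrary.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;

import java.util.concurrent.TimeUnit;

// Hypothetical example class, not part of GeoServer.
public class ResourceCacheSketch {

    /** Stand-in for an expensive resource that needs clean-up, such as a data store. */
    static final class Resource {
        final String id;
        volatile boolean disposed;

        Resource(String id) {
            this.id = id;
        }

        void dispose() {
            disposed = true;
        }
    }

    static LoadingCache<String, Resource> newCache(long maxSize) {
        // Eviction hook: called when an entry is evicted, expired or invalidated
        RemovalListener<String, Resource> cleanUp = new RemovalListener<String, Resource>() {
            @Override
            public void onRemoval(RemovalNotification<String, Resource> notification) {
                Resource resource = notification.getValue();
                // The value may be null if a soft reference was already collected
                if (resource != null) {
                    resource.dispose();
                }
            }
        };

        // Population logic: called at most once per missing key, even under contention
        CacheLoader<String, Resource> loader = new CacheLoader<String, Resource>() {
            @Override
            public Resource load(String id) {
                return new Resource(id);
            }
        };

        return CacheBuilder.newBuilder()
                .maximumSize(maxSize)                     // capacity bound
                .expireAfterAccess(10, TimeUnit.MINUTES)  // expiration since last access
                .softValues()                             // soft value references
                .concurrencyLevel(4)                      // concurrency hint
                .removalListener(cleanUp)
                .build(loader);
    }

    /** The whole "get or create" logic, with no explicit locking or null checks. */
    static Resource getResource(LoadingCache<String, Resource> cache, String id) {
        return cache.getUnchecked(id);
    }
}
```

Here `getUnchecked` is used because the loader declares no checked exceptions. A loader that can fail with a checked exception would use `get` and handle `ExecutionException` instead.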
Although that patch is not strictly part of this proposal, it would be a good thing to have once/if this proposal is accepted.
No! And maybe. There are lots of things that (IMHO) can be done better with Guava than with commons-collections. But Guava is way more than the collection utilities, and so is Apache commons-*. Each has utilities not present in the other, and there is some overlap. My personal preference is to use Guava from now on for all collection utility needs, as it's more modern, well designed, faithfully respects the Java collection contracts, promotes immutability and code clarity, is under active development, and is well supported. But Apache Commons is going to stay around for sure, as there is a lot more to Commons than collections.
Also, the point of this proposal is to present Guava to you and to recommend that you take your own tour, not only of the collection utilities, but also of I/O, net, primitives, concurrent, etc.
Googling gives, as usual, thousands of links. Here are some of the ones that seemed most appealing to me:
- Some points on the Guava philosophy, explained.
- The "Guava Explained" wiki
- Presentation slides focusing on base, primitives, and io
- Presentation slides focusing on cache
- Presentation slides focusing on util.concurrent
- What are the big improvements between guava and apache equivalent libraries?
- Writing more elegant comparison logic with Guava's Ordering
- Creating a fluent interface for Google Collections
- Google's guava java: the easy parts
- Beautiful code with Google Collections, Guava and static imports
...Well, I really like that set of capabilities. While it would mean a steeper learning curve for working on GeoServer, it would be a win if we could remove a few more dependencies. We may need to duck back into GeoTools to make that happen, but that would perhaps not be a bad thing.
...We are welcome to peruse this library for GeoServer prior to that point. I also have some uDig code that used the earlier google collections library that I can fix up (and get some experience).
So you are getting two bits of feedback:
- Yes - but not for GeoTools until after 8.0
- A good trade if we cut down or out the other dependencies (coming from GeoTools)
The Guava library looks beautiful, no question there, and there is a lot of hype around it at the moment on all the Java blogs. But as I mentioned before, and as Jody mentioned, I don't love the idea of just lumping on another utility library. Obviously it leads to much nicer code and has some functionality we don't have now, but without a concrete problem that it solves I don't see that alone as justification enough. It is already enough of a maze trying to look up the right utility class when you have to do something; this will make it worse.
I would actually be more in favor of a lower-level effort at the GeoTools level to replace commons with Guava. Obviously, though, that is a larger effort and is by no means meant to block the proposal.
I feel the same, but at the same time I'm worried the code will turn into COBOL pretty soon if we don't make some effort to modernize it.
The situation with scripting languages and the various "Java successors" seems like a grand royal mess that is not going to give us a clear successor to Java anytime soon, so we had better try to move to more compact, modern code and prolong the life of the code base as much as possible.
Of course, once we adopt Guava we must make an effort to use it instead of commons wherever possible and sensible, to get some uniformity back.
As the proposal aims at adding a new set of utilities to the class path for progressive adoption, no backwards compatibility issues are foreseen.