Hibernate based catalog

Introduction

The purpose of this work is to replace the current GeoServer file-based catalog with a DB-based one, using Hibernate as the persistence framework.

We want to develop the geoserver-hibernate module so that it works without breaking or changing the existing GeoServer code: it provides new endpoints (factories, catalogs) that replace the standard ones purely through Spring injection.

Aside from the persistence layer itself, we do not want the full configuration to be cached in memory: all of the dynamic data should be handled - and cached - at DB level.

We started working on the 1.7.x stable branch, but the legacy code was really hard to overcome. We then moved to the trunk (2.0 RC) and many problems magically went away (nice clean up, folks!).

For further info, please also refer to this entry in the GeoServer Roadmap and to this blog entry.

How to test it

Warning

Please note that the Hibernate-based catalog is a work in progress, so things might change between checkouts, and such a change may require a schema drop-create!

By default the hibernate community module runs with an embedded H2 database which resides within the data directory of GeoServer.
It is however possible to use a PostgreSQL database by tweaking the module configuration (notice that so far we have tested only the H2 and PostgreSQL databases).

The easiest way to customize the hibernate module configuration to use, for instance, a PostgreSQL database is to create a file called gs-db-config.properties as follows:

dataSource.driverClassName=org.postgresql.Driver
dataSource.url=jdbc:postgresql://localhost/gscatalog
dataSource.username=your_username
dataSource.password=your_password

entityManagerFactory.jpaVendorAdapter.databasePlatform=org.hibernate.dialect.PostgreSQLDialect
entityManagerFactory.jpaVendorAdapter.database=POSTGRESQL

This file can be either:

  1. put inside the GeoServer classpath (i.e. inside the war, next to the applicationContext.xml). This is good for people who build their own GeoServer from source code.
  2. put inside the current user's home directory
  3. put at a location specified by the GeoServerDBConfigPropertiesFile system property; for instance, you can configure Tomcat by adding something like -DGeoServerDBConfigPropertiesFile=/home/simone/gt-renderer/postgis.properties to the JAVA_OPTS

Hint

Please note that the default settings in the file above refer to a database called gscatalog, which must be created before running GeoServer. The name can be changed by editing the database part of the JDBC URL above.
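
As a rough illustration of how the file locations listed above could be resolved, here is a minimal Java sketch. It is not the module's actual lookup code: the class DbConfigLocator is hypothetical, and the precedence it uses (system property first, then the user home directory, with the classpath copy as the caller's fallback) is an assumption, since the order is not stated here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper, NOT part of the hibernate module.
final class DbConfigLocator {
    static final String FILE_NAME = "gs-db-config.properties";
    static final String SYSTEM_PROPERTY = "GeoServerDBConfigPropertiesFile";

    // Returns the first candidate accepted by `exists`.
    // ASSUMPTION: the system property wins over the user home directory.
    static Optional<Path> locate(String systemPropertyValue, Path userHome,
                                 Predicate<Path> exists) {
        if (systemPropertyValue != null && !systemPropertyValue.isEmpty()) {
            Path explicit = Paths.get(systemPropertyValue);
            if (exists.test(explicit)) {
                return Optional.of(explicit);
            }
        }
        Path inHome = userHome.resolve(FILE_NAME);
        if (exists.test(inHome)) {
            return Optional.of(inHome);
        }
        // Nothing found on disk: the caller would fall back to a
        // classpath copy or to the embedded H2 defaults.
        return Optional.empty();
    }
}
```

In a real deployment the arguments would typically be System.getProperty("GeoServerDBConfigPropertiesFile"), Paths.get(System.getProperty("user.home")) and java.nio.file.Files::isRegularFile; the existence check is passed in here only to keep the sketch easy to exercise.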

Digging into the code

Where's the code?

On 20090907 the code was committed to the GeoServer trunk as a community module called 'hibernate'.

Replacing beans

Given the aforementioned constraints, one of the first issues we encountered is that many of the POJOs to be persisted are not Hibernate friendly: some of them lack an empty constructor, and many others lack getters or setters for their internal attributes. These accessor methods are, of course, missing from the interface declarations as well.

As a first step, new beans have been created in order to allow the GeoServer data to be persisted by Hibernate. Interfaces extended with the missing accessors have been named like the original ones, adding an "Hb" suffix (e.g.: StoreInfo -> StoreInfoHb). Implementation classes have been copied and extended with the missing methods, following the same schema (e.g.: ResourceInfoImpl -> ResourceInfoImplHb).

The copy-and-extend approach has been chosen because it lets us put the new implementations wherever we need them to be. Extending the original implementations is a little uncomfortable, because many of them declare package-private fields, which would force the extending classes to live in the very same package. This choice is arbitrary and can be easily refactored.
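
The naming and copy-and-extend scheme described above can be sketched as follows. The types are simplified stand-ins rather than the real GeoServer classes: the members shown on StoreInfo, and the choice of setId as the "missing" accessor, are assumptions made only for illustration.

```java
// Simplified stand-in for an original catalog interface.
// ASSUMPTION: the id is exposed read-only, with no setter.
interface StoreInfo {
    String getId();
    String getName();
    void setName(String name);
}

// "Hb" interface: same contract, plus the accessor Hibernate needs.
interface StoreInfoHb extends StoreInfo {
    void setId(String id);
}

// Copied-and-extended implementation: it has an empty constructor and
// a getter/setter pair for every persisted attribute.
class StoreInfoImplHb implements StoreInfoHb {
    private String id;
    private String name;

    // Empty constructor, required for reflective instantiation.
    StoreInfoImplHb() {
    }

    @Override
    public String getId() {
        return id;
    }

    @Override
    public void setId(String id) {
        this.id = id;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public void setName(String name) {
        this.name = name;
    }
}
```

Code that only knows the original StoreInfo interface keeps working unchanged, while the persistence layer can rely on the richer StoreInfoHb contract.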

Replacing logic

In order for the new persistence layer to work, we need to use these new *ImplHb classes when creating new instances. The HibCatalogFactoryImpl and HibGeoserverFactoryImpl classes provide the way to create the needed instances.
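
A minimal sketch of the idea, assuming a factory interface with a single creation method: the names CatalogFactory, WorkspaceInfo, WorkspaceInfoImplHb and HibCatalogFactorySketch are simplified stand-ins, not the real signatures of HibCatalogFactoryImpl.

```java
// Stand-in for a catalog object interface (members are assumptions).
interface WorkspaceInfo {
    String getName();
    void setName(String name);
}

// Stand-in for the factory contract that callers already depend on.
interface CatalogFactory {
    WorkspaceInfo createWorkspace();
}

// Hibernate-friendly implementation, following the *ImplHb naming scheme.
class WorkspaceInfoImplHb implements WorkspaceInfo {
    private String name;

    @Override
    public String getName() {
        return name;
    }

    @Override
    public void setName(String name) {
        this.name = name;
    }
}

// Replacement factory: callers still see CatalogFactory and WorkspaceInfo,
// but every new instance is an *ImplHb one.
class HibCatalogFactorySketch implements CatalogFactory {
    @Override
    public WorkspaceInfo createWorkspace() {
        return new WorkspaceInfoImplHb();
    }
}
```

Because callers obtain instances only through the factory, swapping the factory bean is enough to switch the whole object graph to the persistable classes.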

Initializing and injecting beans

The following diagram shows the beans declared in the hibernate module's applicationContext.xml.

The orange entities are beans that replace existing instances with the same name.

We force the loading of these implementations by using the "primary" attribute:

<bean id="configTarget"
      class="org.geoserver.config.hibernate.HibGeoServerImpl"
      lazy-init="false" primary="true">
    <!-- nested properties, if any, omitted -->
</bean>

This diagram is a bit out of date, but we leave it here because it is a good first approximation of the whole structure.

This structure has been changed for two reasons. The first one is that the move from plain Hibernate to JPA slightly changed the transaction/interceptor part.

The second and main reason concerns bean initialization. As soon as HibGeoServerImpl is loaded and initialized, the GeoServerLoader class (in the main module) post-processes it and starts an inner initialization. This initialization involves loading some resources from the catalog; at this point the transactionManager may not have been properly initialized yet, so entities will not be loaded and the persistence layer will throw an exception.

Here is the new structure:


The changes concerning JPA usage are on the right side of the image.

HibGeoserverImpl no longer explicitly declares that it implements the GeoServer interface, even though it still provides all of its methods. In this way, the GeoserverResourceLoader will not start its initialization once this class is loaded. HibGeoServerLoader will use a dynamic proxy on HibGeoserverImpl.
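
The dynamic proxy technique mentioned above can be sketched with java.lang.reflect.Proxy. This is only an illustration of the general approach, not the actual HibGeoServerLoader code: GeoServerLike, HibGeoServerLikeImpl and DuckProxy are hypothetical names, and the two methods on the interface are assumptions.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Stand-in for the GeoServer interface (members are assumptions).
interface GeoServerLike {
    String getTitle();
    void setTitle(String title);
}

// Provides matching public methods but deliberately does NOT declare
// "implements GeoServerLike", so type-based post-processing skips it.
class HibGeoServerLikeImpl {
    private String title;

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }
}

// Hypothetical helper: exposes `target` through `iface` by forwarding each
// call to the target method with the same name and parameter types.
final class DuckProxy {
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, Object target) {
        InvocationHandler handler = (proxy, method, args) -> {
            Method match = target.getClass()
                    .getMethod(method.getName(), method.getParameterTypes());
            try {
                return match.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Rethrow the exception raised by the target method itself.
                throw e.getCause();
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

With this in place, DuckProxy.wrap(GeoServerLike.class, new HibGeoServerLikeImpl()) returns an object that can be handed to code expecting the interface, while the wrapped class itself is never seen as an implementation of it.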

The new geoServer2 bean is now instantiated by HibGeoServerWrapper. The dependencies on HibGeoServerLoader and on transactionManager are declared via Spring's depends-on attribute, which forces these beans to be instantiated before geoServer2 is. In this way, when geoServer2 is post-processed, it is able to access the DB resources.

The catalogInterceptor is now used a little differently than in the plain Hibernate configuration. The entityManagerFactory wants the name of the class, so we also declare it as a Spring bean with the sole purpose of having some static fields initialized.

These diagrams show the relevant dependencies for the main applicationContext, one for the 1.7.x branch, the other for the trunk. We need this info in order to spot the beans we have to replace.

Status

20090909 - Updated codebase: by default we can now run without touching anything, using an embedded H2 db which resides inside the data directory.


20090907 04:42AM CEST - OK, the whole stuff has just been committed into the trunk. Have fun.


20090903 - Almost there. Stores, resources, workspaces, namespaces, layers and many other objects are properly persisted and retrieved. getMap and getFeatureInfo work. The patches proposed for small fixes in the trunk (GEOS-3416, GEOS-3313) have been accepted and committed (thanks to aaime and jdeolive).


20090731 - Moving the hibernate module into the trunk. Beans have been reviewed. A jira ticket has been created in order to make the catalog bean more hibernate friendly. Mappings have been partially fixed. Not all tests are green at the moment.


20090724 - GS boots with the new JPA structure. There are still many issues dealing with saving resources into the DB.


20090723 - The hibernate transaction manager is giving us a little headache. Switching the structure to a more standard JPA design.


As of today (20090722), the hibernate module compiles and passes all the tests (with a 43% coverage reported by cobertura).

Compiling the whole geoserver also works -- that means that the web module passes its tests using the hibernate catalog.

Running geoserver in tomcat fails at startup, because the transaction manager seems not to be initialized properly. We are working on this problem.

Added by Emanuele Tajariol, last edited by Simone Giannecchini on Oct 16, 2009