GSIP 31 - Use DataAccess API

Update codebase to use DataAccess API

Overview

[GEOS-2568] Update the codebase to use the DataAccess API, which relaxes the previous assumption that all features conform to a very limited subset of GML Simple Features Profile 0 (the simplest available official GML profile).

Proposed By

Gabriel Roldán
Jody Garnett
Ben Caradoc-Davies

Assigned to Release

GeoServer 2.0.

State

Accepted and completed.

Motivation

GeoServer presently obtains data via the GeoAPI DataStore/SimpleFeature/SimpleFeatureType (DSSFSFT) API, which limits features to a "flat" list of simple data types in a fixed order. This is a very rudimentary subset of the ISO Feature Model defined in ISO 19101.

Many organisations have extensive repositories of data with complex structure, and have agreed to share this data using application schemas (ISO 19109), such as GeoSciML, that make full use of the ISO Feature Model. The limitations of DSSFSFT make it inadequate for this purpose, so GeoServer as it stands cannot be used to deliver this data in complex feature form.

GeoTools incorporates substantial support for the GeoAPI DataAccess/Feature/FeatureType (DAFFT) API, which is the culmination of many attempts to provide complex feature support in GeoServer. Indeed, the DSSFSFT API is a specialisation of DAFFT, so there is already a GeoAPI relationship between these two interfaces.

Proposal

Goals

  • Support detection of any DataAccessFactory and loading into ResourcePool.
  • Update core resource management infrastructure to support DAFFT.
    • Once this work is in place, service modules can be incrementally improved to support DataAccess providers. This is outside the scope of this proposal.
  • Retain the performance and simplicity benefits of DSSFSFT by supporting it as a special case where necessary.
  • Preserve all existing functionality.

Scope

  • This proposal aims to rework the global GeoServer plumbing to facilitate ongoing development of DataAccess-based providers and their integration into GeoServer.
  • This proposal does not aim to prove that any useful work can be done with GeoServer DataAccess support, merely to fix the infrastructure so that future incremental development can be made.
  • Getting DataAccess implementations to work with service modules is outside the scope of this proposal. Please see Gabriel's email linked below.
  • No DataAccess support will be provided at this time for:
    • Locking
    • Retyping
    • Versioning
    • Reprojection

Approach

It is proposed that GeoServer be migrated to use the DAFFT API. Migration consists principally of widening interfaces:

  • DataStore becomes DataAccess
  • SimpleFeature becomes Feature
  • SimpleFeatureType becomes FeatureType

Note that in each case, the refactored type is the supertype of the original type.
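
The following minimal sketch illustrates why this kind of widening is safe for existing callers. It does not use the real GeoTools or GeoAPI types: the Feature and SimpleFeature interfaces below are hypothetical stand-ins that reproduce only the supertype relationship, and the class name WideningSketch and its helper methods are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the real interfaces; only the subtype relationship matters here.
interface Feature {
    String getId();
}

interface SimpleFeature extends Feature {
    List<Object> getAttributes();
}

class WideningSketch {

    // Written against the widened type: accepts any Feature, simple or complex.
    static List<String> ids(List<? extends Feature> features) {
        List<String> result = new ArrayList<String>();
        for (Feature f : features) {
            result.add(f.getId());
        }
        return result;
    }

    // Helper that builds a trivial SimpleFeature for the demonstration.
    static SimpleFeature simple(final String id, final Object... values) {
        return new SimpleFeature() {
            public String getId() { return id; }
            public List<Object> getAttributes() { return Arrays.asList(values); }
        };
    }

    public static void main(String[] args) {
        List<SimpleFeature> simpleOnly =
                Arrays.asList(simple("roads.1", "A1"), simple("roads.2", "B7"));
        // A List<SimpleFeature> is accepted where List<? extends Feature> is expected.
        System.out.println(ids(simpleOnly));
    }
}
```

Code that already holds simple features can keep passing them to widened methods unchanged. Only code that relies on members specific to the simple subtype needs further attention.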

Modules affected

Because the proposed API changes affect global resource management, they have global impact, and require changes to the following modules.

  • main
  • data
  • web
  • wfs
  • wms

There are also minor changes required to GeoTools, but these are outside the scope of this proposal.

Changes to org.geoserver.catalog.ResourceInfo interface

DAFFT makes use of qualified names throughout, so the ResourceInfo interface will be extended to include qualified Name forms of the published and native names:

interface ResourceInfo {
    Name getQualifiedName();
    Name getQualifiedNativeName();
    ...
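
As a rough illustration of why qualified names matter once application schemas are involved, the sketch below shows two types that share a local name but belong to different namespaces. The QualifiedName class is a hypothetical stand-in written for this example (it is not the Name type returned by the methods above), and the namespace URIs and type names are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a qualified name: a namespace URI plus a local part.
final class QualifiedName {
    private final String namespaceURI;
    private final String localPart;

    QualifiedName(String namespaceURI, String localPart) {
        this.namespaceURI = namespaceURI;
        this.localPart = localPart;
    }

    String getNamespaceURI() { return namespaceURI; }

    String getLocalPart() { return localPart; }

    // Two names are equal only if both the namespace and the local part match.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof QualifiedName)) {
            return false;
        }
        QualifiedName that = (QualifiedName) other;
        return Objects.equals(namespaceURI, that.namespaceURI)
                && Objects.equals(localPart, that.localPart);
    }

    @Override
    public int hashCode() {
        return Objects.hash(namespaceURI, localPart);
    }
}

class QualifiedNameSketch {

    // Registers two types that share a local name but live in different namespaces.
    static Map<QualifiedName, String> registry() {
        Map<QualifiedName, String> types = new HashMap<QualifiedName, String>();
        types.put(new QualifiedName("urn:example:geology", "Borehole"), "geology borehole type");
        types.put(new QualifiedName("urn:example:water", "Borehole"), "water borehole type");
        return types;
    }
}
```

A lookup keyed only on the local name "Borehole" would be ambiguous here, whereas the qualified form keeps the two entries distinct.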

Changes to org.geoserver.catalog.FeatureTypeInfo interface

Because features are no longer simple, this interface must be widened. Note that all implementations in the GeoServer codebase will also be updated.

BEFORE:

interface FeatureTypeInfo extends ResourceInfo {
    SimpleFeatureType getFeatureType() throws IOException;
    FeatureSource getFeatureSource( ProgressListener listener, Hints hints ) throws IOException;
    ...

AFTER:

interface FeatureTypeInfo extends ResourceInfo {
    FeatureType getFeatureType() throws IOException;
    FeatureSource<? extends FeatureType, ? extends Feature> getFeatureSource( ProgressListener listener, Hints hints ) throws IOException;
    ...

Changes to org.geoserver.catalog.DataStoreInfo interface

Because we will support DataAccess, not just the DataStore subset, this interface must be widened. All implementations in the GeoServer codebase will likewise be modified.

BEFORE:

interface DataStoreInfo extends StoreInfo {
    DataStore getDataStore(ProgressListener listener) throws IOException;
    ...

AFTER:

interface DataStoreInfo extends StoreInfo {
    DataAccess<? extends FeatureType, ? extends Feature> getDataStore(ProgressListener listener) throws IOException;
    ...

Changes to org.geoserver.catalog.ResourcePool interface

We also widen the ResourcePool interface:

BEFORE:

class ResourcePool {
    public DataStore getDataStore( DataStoreInfo info ) throws IOException {
    ...
    public SimpleFeatureType getFeatureType( FeatureTypeInfo info ) throws IOException {
    ...
    public FeatureSource getFeatureSource( FeatureTypeInfo info, Hints hints ) throws IOException {
    ...

AFTER:

class ResourcePool {
    public DataAccess<? extends FeatureType, ? extends Feature> getDataStore( DataStoreInfo info ) throws IOException {
    ...
    public FeatureType getFeatureType( FeatureTypeInfo info ) throws IOException {
    ...
    public FeatureSource<? extends FeatureType, ? extends Feature> getFeatureSource( FeatureTypeInfo info, Hints hints ) throws IOException {

Changes to org.geoserver.catalog.ResourcePool implementation

  • ResourcePool will be updated to use DataAccessFinder so that both DataStoreFactorySpi and DataAccessFactory implementations are detected via SPI.
  • ResourcePool caching of feature types will be updated to support FeatureType.
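
The lookup idea behind the first point can be sketched as follows. This is not the GeoTools DataAccessFinder: the interfaces are hypothetical stand-ins, the factories are passed in as an explicit list rather than discovered via SPI, the "dbtype" key and its values are only illustrative, and returning null when nothing matches is a choice made for this sketch.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; the real interfaces are generic and considerably richer.
interface DataAccess {
    String describe();
}

interface DataStore extends DataAccess {
}

interface DataAccessFactory {
    boolean canProcess(Map<String, String> params);

    DataAccess createDataStore(Map<String, String> params);
}

class FinderSketch {

    // Asks each known factory in turn and uses the first one that accepts the parameters.
    static DataAccess find(List<DataAccessFactory> factories, Map<String, String> params) {
        for (DataAccessFactory factory : factories) {
            if (factory.canProcess(params)) {
                return factory.createDataStore(params);
            }
        }
        return null; // sketch-only convention for "no factory matched"
    }

    // Builds a toy factory that matches on "dbtype" and yields either a DataStore or a plain DataAccess.
    static DataAccessFactory factoryFor(final String dbtype, final boolean simple) {
        return new DataAccessFactory() {
            public boolean canProcess(Map<String, String> params) {
                return dbtype.equals(params.get("dbtype"));
            }

            public DataAccess createDataStore(Map<String, String> params) {
                final String description = dbtype + (simple ? " (DataStore)" : " (DataAccess)");
                if (simple) {
                    return new DataStore() {
                        public String describe() { return description; }
                    };
                }
                return new DataAccess() {
                    public String describe() { return description; }
                };
            }
        };
    }
}
```

Because the simple store type is a subtype of the general one in this sketch, a single lookup path serves both kinds of provider, and callers that need the simple case can still detect it with instanceof.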

Consequential changes throughout the codebase

  • Because of the widespread use of FeatureTypeInfo, DataStoreInfo, and ResourcePool, many classes will change. Most of these changes will be small type widenings. Where widening cannot be accommodated, judicious use of instanceof will be made to preserve existing functionality.
  • The main change between Feature and SimpleFeature is the style of iteration of descriptors. Code may be updated from the SimpleFeature API (often found with pre-foreach [i.e. Java 1.4] iteration):
    SimpleFeatureType sft = ...
    List<AttributeDescriptor> descriptors = sft.getAttributeDescriptors();
    for (int i = 0; i < descriptors.size(); i++) {
        AttributeDescriptor ad = descriptors.get(i);
        ... do something with ad ...
    }

    to use the Feature API:

    FeatureType ft = ...
    Collection<PropertyDescriptor> descriptors = ft.getDescriptors();
    for (PropertyDescriptor pd : descriptors) {
        // A general FeatureType may also hold properties that are not attributes.
        if (pd instanceof AttributeDescriptor) {
            AttributeDescriptor ad = (AttributeDescriptor) pd;
            ... do something with ad ...
        }
    }
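
The two points above can be combined into one self-contained sketch: iterate descriptors in the general style, and use instanceof to keep the simple case on its existing path. The interfaces here are simplified, hypothetical stand-ins for the real descriptor and type interfaces (for example, getLocalName is placed on the property descriptor purely for brevity), and DescriptorSketch and its helper methods are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical, simplified stand-ins for the real descriptor and type interfaces.
interface PropertyDescriptor {
    String getLocalName();
}

interface AttributeDescriptor extends PropertyDescriptor {
}

interface FeatureType {
    Collection<PropertyDescriptor> getDescriptors();
}

interface SimpleFeatureType extends FeatureType {
    List<AttributeDescriptor> getAttributeDescriptors();
}

class DescriptorSketch {

    // Collects attribute names, treating the simple type as a special case.
    static List<String> attributeNames(FeatureType type) {
        List<String> names = new ArrayList<String>();
        if (type instanceof SimpleFeatureType) {
            // Simple case: every descriptor is already known to be an attribute.
            for (AttributeDescriptor ad : ((SimpleFeatureType) type).getAttributeDescriptors()) {
                names.add(ad.getLocalName());
            }
        } else {
            // General case: skip properties that are not attributes.
            for (PropertyDescriptor pd : type.getDescriptors()) {
                if (pd instanceof AttributeDescriptor) {
                    names.add(pd.getLocalName());
                }
            }
        }
        return names;
    }

    static AttributeDescriptor attribute(final String name) {
        return new AttributeDescriptor() {
            public String getLocalName() { return name; }
        };
    }

    // A property that is deliberately not an attribute, e.g. a link to another feature.
    static PropertyDescriptor association(final String name) {
        return new PropertyDescriptor() {
            public String getLocalName() { return name; }
        };
    }

    static FeatureType complexType(final PropertyDescriptor... descriptors) {
        return new FeatureType() {
            public Collection<PropertyDescriptor> getDescriptors() {
                return Arrays.asList(descriptors);
            }
        };
    }

    static SimpleFeatureType simpleType(String... names) {
        final List<AttributeDescriptor> attributes = new ArrayList<AttributeDescriptor>();
        for (String name : names) {
            attributes.add(attribute(name));
        }
        return new SimpleFeatureType() {
            public List<AttributeDescriptor> getAttributeDescriptors() { return attributes; }
            public Collection<PropertyDescriptor> getDescriptors() {
                return new ArrayList<PropertyDescriptor>(attributes);
            }
        };
    }
}
```

In this sketch both branches produce the same result for a simple type; the special case exists only to show where existing simple-feature code can be left untouched.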

Tasks

  • Implement a test DataAccess so that unit tests can be written. (Ben)
  • Consequential GeoTools changes. (Ben)
  • GeoTools patches submitted. (Ben)
  • GeoTools patches reviewed, tested, and submitted to svn. (Jody has volunteered to assist as needed.)
  • Port main to DAFFT. (Ben)
  • Port data to DAFFT. (Ben)
  • Port web to DAFFT. (Ben)
  • Port wfs to DAFFT. (Ben)
  • Port wms to DAFFT. (Ben)
  • Get all GeoServer unit tests to pass. (Ben)
  • Submit changes as patches, one per module. (Ben)
  • Code review of patches. (Justin Deoliveira)
    • Note that because the API changes are global, all patches must be applied before GeoServer will build.
  • CITE tests run? (TOPP)
  • Performance test?
  • GeoServer patches committed to svn. (Ben)

Discussion

INSPIRE propaganda

Deployed public data access services need to use defined GML schemas, in particular the several hundred WFS deployments required to deliver data under the pan-European INSPIRE legislation.

"Community Schemas" - aka ISO "Application Schemas" known to a community - are supported by the GeoTools DataAccess API. Gradually relaxing strict assumptions within GeoServer to allow such data to be delivered by GeoServer services makes GeoServer a relevant technology in emerging Spatial Data Infrastructures. This is a big piece of work, and needs to be done in a way that does not disrupt the ability of people to play with GeoServer in the meantime.

Gabriel's advice

This proposal captures an email Gabriel sent out in response to Ben's question about the use of DataAccess. Apparently Gabriel has a plan; we need it captured as a change proposal, so here it is.

Here is Gabriel's email on the subject:

Feedback

This section should contain feedback provided by PSC members who may have a problem with the proposal.

Major feedback from email concerns project scope and staffing.

Backwards Compatibility

State here any backwards compatibility issues.

Onwards to victory! But seriously, we are not planning on breaking anything that works now.

Voting

Alessio Fabiani:
Andrea Aime: +1
Chris Holmes:
Jody Garnett: +1
Justin Deoliveira: +1
Rob Atkinson: +1
Simone Giannecchini:

Links

JIRA GEOS-2568
Email discussion: GSIP 31 - Use DataAccess API - Revised and Implemented
Email discussion: review of patches for GSIP-31

Added by Jody Garnett, last edited by Ben Caradoc-Davies on Jul 13, 2009