Vector Super Overlays

Caching KML

This part should be relatively easy. It simply involves caching the KML file for a particular region. Perhaps just storing the file itself compressed on disk should do it.

Building Vector Pyramids

This is a hard problem. The goal is to come up with a way to "rank" features in a dataset so that they only show up a particular tile resolution, and every resolution below that. Consider the following dataset:

Building a pyramid involves determining which features are visible at which scale.

At each level of the pyramid, a new set of features must be added to the visible set. There are a number of algorithms that could be used to realize this. They all follow more or less the same pattern:

request = ...;
features = get features in request.bbox;

for f in features

    if show( f,request ) then
         output.add( f );
    end

end

Where show is a function which does the actual work to determine if a feature should be visible at a particular scale / resolution.

Attribute Weighting

This algorithm involves using a particular attribute in the dataset to determine if a feature is visible or not. The inputs to the algorithm are:

  • the attribute
  • the minimum and maximum values of the attribute
  • a max number of resolutions to support
min = min value of "foo"
max = max value of "foo"

function show(feature,request) {
   v = feature.foo;
   res = request.resolution;
   vres = v / (max - min) * numberOfResolutions;

   return vres > res;
}

This algorithm has the benefit of being simple to implement. However the implementation above is somewhat naive, in that it assumes an even distribution of the value of an attribute between its minimum and maximum values. A better version of the algorithm would calculate the actual distribution.

It would be possible to approximate the distribution function of the values with a histogram. Using the histogram a better mapping function could be used to map values to resolutions. This would lead to a better streaming effect.

SLD ScaleDenominator

This problem is the same as the one users who want to do scale dependent mapping face. SLD has the ability to specify a set of scales, and for each scale a filter specifying which features in a dataset are visible at that scale.

This algorithm is already implemented and is what our kml outout does by default. The downside is that it puts the burden on the user to set up the SLD properly, which without a decent editor, is not easy chore.

Geometry Properties

One possible method could be to use properties of the geometry (area,length,etc...) to determine if should be visible at a particular resolution. This algorithm would have the benefit of placing no burden on the user, but it would be a heuristic, and not always accurate.

function show(feature,request) {
   a = feature.area;
   return  a / request.area > threshold;
}

Quad Tree + Attribute Weighting

This method uses a quad tree index to determine when features should become "visible". Attribute weighting is taken into account while building the index to determine which level in the quad tree a feature should be assigned.

Each node in the quad tree maintains the following properties:

  • A bounding box
  • A sorted set of features

Each node is constrained by a constant N, which is the maximum number of features that can be stored at any node. The index building process works as follows.

The index starts with a single empty root node. The bounding box of the root node is the bounding box of the world.

When a feature is added to the index it is added to the highest node in the tree in which its bounding box can "fit".

When a node exceeds its maximum number of features, it is "split", and 4 child nodes are created. Each of the 4 children are assigned a quadrant of the parents bounding box. An exception to this is the root node, which is split it 2 rather than 4.

When a node is full, one of its features must be "demoted" to a child. The feature which is demoted is the one with the smallest value of the ranking attribute. Only features whose bounds fit fully withing the bounds of a child are demoted.

The child node in which a feature is demoted to is the one in which its bounding box is encompassed. Features which span quadrant boundaries can arbitrarily be assigned.

Implementation

Specifying an Overlay Method

Overlay methods are something we are going to want to make pluggable, and probably something the client should be able to request. A simple "format_options" parameter should do the trick:

http://.../wms/reflect?layers=foo&format=kml&format_options=overlayMethod:[sld|attribute|geometry]

The value will be available via:

GetMapRequest req = ...;
String overlayMethod = (String) req.getFormatOptions().get( "overlayMethod" );

Overlay Interface

It makes sense to hide the overlay or filtering algorithm behind an interface. Something like the following:

interface OverlayFilter {

  /**
   * Name of the filter, used to identify the method in request.
   */
  String getName();

  /**
   * Determines if the feature should be filtered.
   */
  boolean filter( Feature feature, GetMapRequest request );

}

The implementations then become:

class AttributeOverlayFilter implements OverlayFilter {

   String getName() {
      return "attribute";
   }

   ...
}

class SLDOverlayFilter implements OverlayFilter {

   String getName() {
      return "sld";
   }

   ...
}

class GeometryLengthOverlayFilter implements OverlayFilter {

   String getName() {
      return "length";
   }

   ...
}

class GeometryAreaOverlayFilter implements OverlayFilter {

   String getName() {
      return "area";
   }

   ...
}

Refactoring KMLVectorTransformer

Currently our KML output uses only the "sld" method to filter features. It should be simple enough to filter this logic out into SLDOverlayFilter and have it be the default. The experimental transformer can then override and load the appropriate overlay filter algorithm.

Algorithm Implementation Details

Proposed Speed Improvement

The current implementation is pretty slow for large feature sets (which is bad, since regionating is mostly useful on... large feature sets).  A possible solution is to fill in the index lazily, which would let us generate tiles sooner but take longer to regionate the entire dataset.  More details on regionating improvements.

SLD-based Regionating (regionateBy:sld)

The implementation of this method is fairly straightforward, it simply delegates the inclusion decision to the rendering code by querying the Style to see if any rules exist that apply to a particular feature at the current zoom level.  See SLDRegionatingStrategy.java

Attribute-based regionating (regionateBy:data)

The implementation of this regionating method closely follows the description above, however some minor tweaks are made.  Instead of creating a smart tree that inspects the attribute values during insertion, we assume the features are inserted in order of prominence, with most prominent features first.  The sorting is left to the datastore.  After the tree is generated, it is stored in an embedded database in a table indexed by x,y, and z (the coordinates (x,y) of a tile in the grid at zoomlevel z) containing the fid's of the features for each tile.  When a tile is requested, we query the database, keep the resulting collection of strings in a set, and check the set for the fid of each feature when considering them for inclusion.

Geometry-based regionating (regionateBy:geo) [Highly experimental]

The current strategy for this is as follows:

Sort by zoomLevel(f.getGeometry()), the highest zoomlevel at which a feature fits into a single tile.  Then build the tree based on that.  Features that occupy large swaths of the map and features that span multiple tiles will automatically get assigned to higher zoom levels, minimal code will need to be added.

GeoServer + GeoWebCache Collaboration

Some features are being added to GeoWebCache to support regionated KML as well.  

  • Generate NetworkLinks to the tile hierarchy.  This is fairly straightforward, GeoWebCache generates small KML files that link to the tiles generated by GeoServer.  The regionated KML itself is also cached by GeoWebCache to improve performance.
  • Truncate the tile links when no features are present. Since data points are often clustered, including all the tiles at a zoomlevel has a decent chance of including many tiles that contain no features, especially as a client zooms in.  To alleviate the bandwidth and client-side performance costs of this, GeoWebCache implements a smart link generating scheme where the link generator checks child tiles before sending a link and only directs the client to download tiles that actually contain feature data.  GeoServer facilitates this by setting a "204 No Content" status code for tiles containing no features.
Added by Justin Deoliveira, last edited by David Winslow on Jul 03, 2008  (view change)
View Attachments (6) Info