Communication of geospatial data from server to client is what Geoserver is all about. The data in question may be rendered onto a map with many different layers (Geoserver/WMS) or may be published as "raw" data (Geoserver/WFS and Geoserver/WCS). Regardless of the particular data being communicated, all data must be accompanied by an unambiguous description of the coordinate reference system (CRS) used by the data. In order for the client to make sense of the data, client and server must agree on the CRS used by the data! This article attempts to address the many ways in which this is not actually occurring.
The current suite of OGC/ISO standards concerning the publication of geospatial data on the web provides for communication of CRS information by reference. That is, client and server use CRS definitions which are provided by external authorities. The most common external authority is the European Petroleum Survey Group (populating the "EPSG" namespace), followed by OGC/ISO themselves (populating the "CRS" and "AUTO" namespaces). Authorities are "universally recognized" only if they are specified in the standard. The standards do not prohibit alternate authority namespaces, but also do not provide any means for a server to communicate what authority is associated with a non-standard namespace. In practice, only the namespaces defined in the standards are interoperable.
There are currently three separate ways in which this situation (communicating CRS only by reference to predefined authorities) causes ambiguity:
- Authorities not assigned a namespace in the suite of standards (e.g. ESRI, Cubewerks) tend to "borrow" the predefined EPSG namespace, using a "user-defined" code. This does not communicate a CRS from server to client (or vice versa).
- Common practice conflicts with compliance with the OGC/ISO suite of standards when it comes to axis ordering. Consequently, merely specifying "EPSG:4326" requires both client and server to guess whether "the other guy" is swapping axes or obeying the standard. (See [GEOTOOLS:The axis order issue].)
- Under this system of "communication by reference", servers cannot provide a complete coordinate reference system definition to a client. In essence, this means that servers cannot publish data sets which use coordinate systems not blessed by an external authority. Such data sets must be reprojected into a "blessed" CRS before the server can indicate the CRS to the client.
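To make the axis ordering ambiguity concrete, here is a minimal sketch of how the same region, tagged with the same "EPSG:4326" identifier, ends up encoded two different ways. The function name `bbox_param` and the sample coordinates are made up for illustration; the only fact relied on is that the EPSG definition of 4326 lists latitude before longitude, while common practice puts longitude (x) first.

```python
def bbox_param(min_lon, min_lat, max_lon, max_lat, authority_order):
    """Format a bounding box value for a CRS identified as EPSG:4326.

    authority_order=True  -> latitude first, as EPSG defines 4326
    authority_order=False -> longitude first, the common x,y practice
    """
    if authority_order:
        values = (min_lat, min_lon, max_lat, max_lon)
    else:
        values = (min_lon, min_lat, max_lon, max_lat)
    return ",".join(str(v) for v in values)


# Hypothetical region: one degree square, longitudes -105..-104, latitudes 39..40.
region = dict(min_lon=-105.0, min_lat=39.0, max_lon=-104.0, max_lat=40.0)

strict = bbox_param(**region, authority_order=True)
common = bbox_param(**region, authority_order=False)

print(strict)  # 39.0,-105.0,40.0,-104.0
print(common)  # -105.0,39.0,-104.0,40.0
```

Nothing in either string tells the receiver which convention produced it, which is exactly why both sides end up guessing.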
Every month, some joker on either the Geotools or GeoServer list points out one or more of these ambiguities and a new debate begins. The common thread to all these discussions is that "our" software can't be sure what the "other" software means when it says "X". Every time it comes up, some tentative, "hacky" solution is arrived at which resolves the problem for a specific situation, or a specific combination of client and server or client and data store. Unfortunately, there is no single good solution which works with all combinations of clients and servers (or applications and data stores). Resolving the problem for one case will break a different case. We can flip-flop back and forth between opposite conventions forever, but this is not productive.
This page is the start of an attempt to unify the varied discourse on this topic. Specific circumstances will not be considered except as illustrations of a category of problem. It is hoped that abstracting the problem in this way may lead us to a single fix (or a single approach) which is applicable to an entire suite of specific troubles. As such, this article considers "reference" to a CRS identifier to be the root cause of all these problems. It follows that communicating CRS by explicit definition resolves ambiguity, because then each participant in the communication is only required to assume that the other guy is being self-consistent.
This section considers current strategies and proposed solutions to the problem of CRS ambiguity. Not all solutions attempt to address the overall problem of communicating CRS by reference. I attempt to present solutions in the context of the larger problem to give a sense of perspective.
The OGC OWS-3 Interoperability Experiment suffered from the same problems that plague the GeoTools community. As an experiment, the participants decided to adopt the following text from the WFS 1.1 specification:
Any valid URI value can be assigned to the srsName attribute. However, in order to enhance interoperability, a web feature service must be able to process srsName attribute values with the following format models:
In these format models, the values <EPSG code> are placeholders for actual EPSG code values. Here is an example of the srsName where the assigned value follows one of the required format models:
This experiment (2004) was a success, and it is expected the OGC will revise a future WMS specification with this in mind (NOTE: they may have already done so with WMS 1.3.0 (2006), see next section).
The current OGC WMS specification requires that CRS communication by reference assume the format "authority:code". As such, this is the form used to communicate CRS by reference whether the implementations obey the EPSG axis ordering or not. The crux of problem two is the question: When I see the code "EPSG:xxxx", what do I assume about axis ordering? Using a URN dodges this question by referring to a CRS with the format "urn:ogc:def:crs:EPSG:xxxx". Using this alternate format is essentially a contract that both parties will adhere to the axis ordering defined by the external authority, and hence resolves the ambiguity.
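A minimal sketch of how a client might act on this contract follows. It is not taken from GeoServer or GeoTools; the names `CrsRef`, `parse_crs_ref` and `axis_order_known` are hypothetical. It also assumes that any fields between the authority and the code in the URN (for example a version, or an empty field) can be skipped, which is a simplification.

```python
from typing import NamedTuple, Optional


class CrsRef(NamedTuple):
    authority: str
    code: str
    # True only when the identifier form promises the authority's axis order.
    axis_order_known: bool


URN_PREFIX = "urn:ogc:def:crs:"


def parse_crs_ref(identifier: str) -> Optional[CrsRef]:
    """Classify a CRS reference as URN form, short form, or unrecognized."""
    if identifier.lower().startswith(URN_PREFIX):
        parts = identifier[len(URN_PREFIX):].split(":")
        # Assumption: first field is the authority, last field is the code.
        if len(parts) >= 2 and parts[0] and parts[-1]:
            return CrsRef(parts[0], parts[-1], axis_order_known=True)
        return None

    parts = identifier.split(":")
    if len(parts) == 2 and all(parts):
        # "authority:code" says nothing about which axis convention is in use.
        return CrsRef(parts[0], parts[1], axis_order_known=False)
    return None


print(parse_crs_ref("EPSG:4326"))
print(parse_crs_ref("urn:ogc:def:crs:EPSG:4326"))
```

The point of the extra flag is that the two forms name the same CRS but carry different promises: only the URN form lets the receiver stop guessing about axis order.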
Note that this usage is not allowed by the current WMS standard. One may not legally include a "CRS=urn:ogc:def:crs:EPSG:xxxx" parameter in a GetMap request as of WMS 1.3.0. It may be allowed by WFS and WCS standards, but I have not investigated this. As such, the adoption of this convention permits interoperability with all implementations which adopt this same convention. Implementations which have elected to solve this problem in a different nonstandard way will remain ambiguous.
In order to be effective, this proposed solution must really be accepted by a much larger audience than just one server and one client implementation (Geoserver and GeoTools/Udig). The only way this convention will resolve any of our persistent troubles is if it becomes a de facto or real standard. However, if you must round up a large body of people to agree to follow a new convention, why not invest your efforts convincing them to agree to follow conventions we already have?
Exploring this solution has revealed that a convention can indeed solve the problem theoretically without solving it practically. Actually, solving practical problems theoretically is the easy part. Our URN solution will indeed solve the problem if we wave a magic pixie wand and brainwash all development teams on all implementations of all relevant specifications to adopt it. However, if we did have such a magic pixie wand, we could just brainwash everyone into conforming to the existing spec.
|OGC URN Summary|
The two most important lessons learned from exploration of the URN solution are:
A unique feature of the WMS 1.3.0 specification which I hope to see promulgated to the WCS and WFS specifications is the ability to communicate the definition of a CRS between client and server. This is accomplished by providing a URL to a publicly accessible file which explicitly defines the CRS. Close examination of this specification (which was published in 2006) reveals that it may well have been informed by the result of the previously mentioned OWS-3 experiment. OGC's choice to use a URL to refer to a definition certainly seems to be "informed by" the option to use a URL in OWS-3. However, the new WMS specification seems to have refined the concept by not requiring a particular URL (e.g. http://www.opengis.net/gml/srs/epsg.xml#<EPSG code>), but allowing users to specify any URL which contains a CRS definition. Is there a way to use this to solve all three problems?
Communicating the definition of CRSes defined by alternate authorities eliminates the need for client and server to agree on a namespace for that authority. The new CRS is fully identified by the publicly accessible file. This includes the authority name and code as well as the complete definition of all CRS components (ellipsoid, projection, projection parameters...). Clients do not even need to be aware of the authority's existence to use the CRS.
Communicating the CRS definition instead of just referring to the authority and code may seem unnecessarily verbose, but it certainly qualifies as something different. As this feature was not available until recently, it has not had time to grow a culture of misuse. One may reasonably assume that any CRS provided by this mechanism is what is meant, and by now one hopes that people realize that axis order is important.
Communicating the CRS definition makes possible the publishing of data in nonstandard CRSes. No longer is the Weather Research and Forecasting model required to spit out data on a grid framed by a CRS blessed by the EPSG. (This is good, because it doesn't.)
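As an illustration of why an explicit definition removes the guesswork, here is a small sketch that reads the axis order straight out of a WKT definition. The WKT string is a hand-written example for WGS 84 in the older "WKT 1" style, and `declared_axes` is a hypothetical helper built on a regular expression, not a real WKT parser.

```python
import re

# Hand-written WKT 1 style definition; the AXIS entries state the order outright.
WGS84_WKT = (
    'GEOGCS["WGS 84",'
    'DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],'
    'UNIT["degree",0.0174532925199433],'
    'AXIS["Latitude",NORTH],AXIS["Longitude",EAST]]'
)

AXIS_PATTERN = re.compile(r'AXIS\["([^"]+)",\s*([A-Z]+)\]')


def declared_axes(wkt):
    """Return (name, direction) pairs in the order the definition lists them."""
    return AXIS_PATTERN.findall(wkt)


print(declared_axes(WGS84_WKT))
# [('Latitude', 'NORTH'), ('Longitude', 'EAST')]
```

With the definition in hand, the receiver only has to trust that the sender is self-consistent; no outside convention about "EPSG:4326" is involved.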
So this solution at least addresses all three problems. Revisiting the lesson from our URN solution: is it widely accepted? On the surface, it would surely seem so. It is not only vetted by the OGC but also by ISO. The problem comes when one attempts to discern the format of the CRS description. The contents of the file pointed to by the URL are not defined. It could be GML. It could also be WKT. An Arc/Info projection file may also qualify (but might not, because it is missing some metadata needed to make it ISO 19111 compliant...). But wait: the file could also be in Microsoft Word format! It doesn't need to be a text file! So now we are back to "every implementation for itself". Whatever convention we adopt (in terms of file format) may not be compatible with the arbitrary choices made by other implementations.
Having said this, this provision in the current WMS spec is certainly a step in the right direction. Clients can (and should) support as many formats as possible, and servers can (and should) try to offer CRS definitions in a few commonly understood formats (e.g., WKT and GML). Both clients and servers should view this alternate means of expressing CRS as a chance to start over and wipe the slate clean. There should no longer be any nonconforming axis swapping with this new system.
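Since the file format is left open, a client has to look at what it received before it can parse anything. The sketch below shows one rough way to do that; `sniff_crs_format` and the keyword list are assumptions for illustration, and the checks are heuristics only (an "xml" result still has to be inspected to see whether it is really a GML CRS definition).

```python
# Top-level keywords used by the older "WKT 1" style of CRS definition.
WKT_KEYWORDS = {"GEOGCS", "PROJCS", "GEOCCS", "COMPD_CS", "VERT_CS", "LOCAL_CS"}


def sniff_crs_format(payload: bytes) -> str:
    """Guess the format of a fetched CRS definition: 'xml', 'wkt' or 'unknown'."""
    try:
        # Drop a byte order mark and leading whitespace if present.
        text = payload.decode("utf-8").lstrip("\ufeff \t\r\n")
    except UnicodeDecodeError:
        # Binary content (a word processor file, say) is not something we can read.
        return "unknown"

    if text.startswith("<"):
        return "xml"

    keyword, bracket, _ = text.partition("[")
    if bracket and keyword.strip().upper() in WKT_KEYWORDS:
        return "wkt"
    return "unknown"


print(sniff_crs_format(b'PROJCS["example"]'))            # wkt
print(sniff_crs_format(b'<?xml version="1.0"?><root/>'))  # xml
```

A client that gets "unknown" back has no choice but to give up on that definition, which is the "every implementation for itself" problem in miniature.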
|Communicate CRS Definition Summary|
This solution is being offered by OGC and ISO, not by Geoserver or GeoTools implementors. The solution is not total, in that implementations are free to choose the file format used to express the CRS definition. However, there is a well-constrained process used by the server to indicate a desire to communicate a CRS definition instead of an identifier.
This solution involves educating the providers of data about this interoperability issue. Think of it this way: if you had a bunch of documents you needed to publish to the web in PDF format, would you produce these PDF documents with software which made the text come out sideways? What if some clients/customers/taxpayers had viewers which rotated the text and some did not? Do you serve both rotated and non-rotated files? Do you even intend to support the quirks of all possible PDF readers, or only the ones which implement Adobe's spec? Hopefully, you decided that supporting all possible options is beyond the scope of your task. All you need to do is ensure that the data which leaves your building is correctly encoded into PDF and then it becomes the reader's problem. After all, if one PDF reader isn't working, your customers can always download a different one which is known to work.
This issue with geospatial data is exactly the same. You as a data provider just need to ensure that your data is encoded correctly as it leaves your building. After that, the client is responsible for making sense of it. If necessary, you can post links to clients which are known to work. If you have an old server which incorrectly encodes data, please please please correct the problem when you upgrade.
One final example: traffic lights. You may adopt the convention that red-means-stop and green-means-go. Or you may adopt the convention red-means-go and green-means-stop. As long as everyone is doing the same thing there is no problem because both conventions are workable technical solutions. However, a society where some people adopt one convention and others adopt the opposite has a built-in method of population control.
|Education of Data Providers Summary|
In some respects, this controversy is not a technical problem. There is no shortage of viable technical solutions. The main problem is that some data providers are using tools which believe "red-means-stop" and others are using tools which believe "red-means-go". Convincing data providers to select and configure tools such that they offer conforming data will indeed force the issue for clients. A successful campaign will phase out the nonconforming installations over time, leading to a much brighter day for interoperability.