Tending the OpenStreetMap Garden


Yesterday I was investigating the OpenStreetMap elevation tag, when I was surprised to find that the third most common value is '0.00000000000'! Now I have my suspicions about any value below ten, but there are 13,832 features in OpenStreetMap that have their elevation mapped to within 10 picometres - roughly one sixth of the diameter of a helium atom - of sea level. Seems unlikely, to be blunt.
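For anyone who wants to check the arithmetic, here is a rough sketch of how the implied precision falls out of the number of decimal places. The function name implied_precision_m is my own invention for illustration, and it assumes the tag value is a plain decimal number in metres with no unit suffix.

```python
from decimal import Decimal


def implied_precision_m(ele):
    """Size, in metres, of the last decimal place written in an ele value.

    Assumes a plain decimal string such as "12.5"; values with unit
    suffixes or other text are not handled in this sketch.
    """
    exponent = Decimal(ele.strip()).as_tuple().exponent
    # A negative exponent means digits after the decimal point.
    return Decimal(1).scaleb(exponent) if exponent < 0 else Decimal(1)


# The suspicious value: a zero followed by eleven decimal places.
suspect_value = "0." + "0" * 11

precision_m = implied_precision_m(suspect_value)
precision_pm = precision_m * Decimal(10) ** 12  # 1 m = 10^12 pm

print(f"{suspect_value} implies a precision of {precision_pm.normalize()} pm")
```

Eleven decimal places of a metre is 10^-11 m, which is where the 10 picometres comes from.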

It could of course be hundreds of hyper-accurate volunteer mappers, but I immediately suspect an import. Given the spurious accuracy and the tendency to cluster around sea level, I also suspect it's a broken import where these picometre-accurate readings are more likely to mean "we don't know" than "exactly at sea level". Curious, I spent a few minutes using the Overpass API and found an example - Almonesson Lake. The NHD prefix on the tags suggests it came from the National Hydrography Dataset and so, as suspected, not real people with ultra-accurate micrometers.
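A search along these lines can be sketched in a few lines of Python. This is not the exact query I ran, just a minimal illustration: the endpoint is one of the public Overpass instances, the bounding box is an arbitrary placeholder, and the names build_query and fetch are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a public Overpass instance; substitute whichever you prefer.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# The suspicious value: a zero followed by eleven decimal places.
SUSPECT_VALUE = "0." + "0" * 11


def build_query(ele_value=SUSPECT_VALUE, bbox=(39.7, -75.2, 39.9, -75.0)):
    """Build an Overpass QL query for features with an exact ele value.

    bbox is (south, west, north, east); the default is an arbitrary
    placeholder area, not a claim about where the examples are.
    """
    south, west, north, east = bbox
    return (
        "[out:json][timeout:25];\n"
        f'nwr["ele"="{ele_value}"]({south},{west},{north},{east});\n'
        "out tags center 20;"
    )


def fetch(query, url=OVERPASS_URL):
    """POST the query to Overpass and return the list of matching elements."""
    data = urllib.parse.urlencode({"data": query}).encode()
    with urllib.request.urlopen(url, data=data, timeout=60) as resp:
        return json.load(resp)["elements"]


# Example use (needs network access, so it is left commented out):
# for element in fetch(build_query()):
#     tags = element.get("tags", {})
#     print(element["type"], element["id"], tags.get("name"))
```

Keeping the bounding box small and capping the output at 20 elements is deliberate: the public servers are a shared resource, and a handful of examples is enough to see which tags an import left behind.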

But what concerns me most is when I had a quick look at the data layer for that lake - it turns out that there are three separate and overlapping lakes! We have the NHD import from June 2011. We have an "ArcGIS Exporter lake" from October that year, both of which simply ignore the original lake created way back in Feb 2009, almost 5 years ago. There's no point in having 3 slightly different lakes, and if anyone were to try to fix a misspelling, add an attribute or tweak the outline they would have an unexpectedly difficult task. There is, sadly, a continual stream of imports that are often poorly executed and, worse, rarely revisited and fixed, and this is just one case among many.

Mistakes are made, of course, but it's clear that data problems like these aren't being noticed and/or fixed in a reasonable timescale - even just this one small example throws up a score of problems that all need to be addressed. Most of my own editing is now focussed on tending and fixing the data that we already have, rather than adding new information from surveys. And as the size of the OpenStreetMap dataset increases, along with the seemingly perpetual and often troublesome importing of huge numbers of features, OpenStreetMap will need to adjust to the increasing priority for such data-gardening.


This post was posted on 10 December 2013 and tagged imports, OpenStreetMap