OpenStreetMap, final part: filling in the address hierarchy

In the very first part, we cut a sample city out of a large data set and kept only the objects that carry an address. Those addresses were treated as belonging to that city, i.e. we knew exactly which country, region and so on they were in. But what if we need addresses not for one locality but for a whole region, or even several countries? How do we tell where each address belongs?

Although OpenStreetMap lets you state on every house which country, region and lower levels of the hierarchy it belongs to, in Russia the short form is used, i.e. only the street and house number. The computer will do all the monkey work of structuring the address for us. It will do it faster and more accurately, provided, of course, that it has all the necessary data at its disposal.

Preparation

I will experiment on Saransk, or rather on its urban district, cutting it out with a rectangle whose lower-left corner is (45, 54) and whose upper-right corner is (45.5, 54.3), given as longitude and latitude. I save the extract from the dump in pbf format, because the next tool works with it:

osmconvert -b=45,54,45.5,54.3 RU-local.o5m -o=SaranskGO.pbf
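The order of the numbers in the -b option is easy to get wrong: longitude comes first, then latitude, for the south-west corner and then for the north-east one. A minimal Python sketch that builds the same command line from the two corners; the helper name bbox_arg is made up for this illustration, and the command is only assembled here, not executed.

```python
from typing import Tuple

Corner = Tuple[float, float]  # (longitude, latitude)

def bbox_arg(south_west: Corner, north_east: Corner) -> str:
    """Build the -b= value: lon,lat of the SW corner, then lon,lat of the NE corner."""
    (min_lon, min_lat), (max_lon, max_lat) = south_west, north_east
    if not (min_lon < max_lon and min_lat < max_lat):
        raise ValueError("south-west corner must be below and left of north-east corner")
    return f"-b={min_lon:g},{min_lat:g},{max_lon:g},{max_lat:g}"

# The same rectangle as in the article: Saransk urban district
command = ["osmconvert", bbox_arg((45, 54), (45.5, 54.3)),
           "RU-local.o5m", "-o=SaranskGO.pbf"]
print(" ".join(command))
```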

The whole idea now is to add to every building that has an address a tag saying which settlement it is in. This is determined by whether the geometry of the house falls inside the contour of the settlement. For this we need the OsmAreaTag plugin for osmosis (a more detailed description of the plugin from the author). The author posted the compiled version of the plugin here. Osmosis itself can be downloaded from GitHub. It is a Java application, so it will not run without Java installed.
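The containment test behind this idea can be illustrated with the classic ray casting algorithm. This is only a sketch of the principle, not the plugin's actual implementation, which is not described here; the function names, the rule "all corners inside" and the sample coordinates are assumptions made for the illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)

def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count the polygon edges crossed by a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def building_in_settlement(building: List[Point], settlement: List[Point]) -> bool:
    """Assumption: a building counts as inside when all of its corners are inside."""
    return all(point_in_polygon(p, settlement) for p in building)

# Hypothetical rectangular "settlement" and two hypothetical buildings
settlement = [(45.0, 54.0), (45.5, 54.0), (45.5, 54.3), (45.0, 54.3)]
house_in = [(45.10, 54.10), (45.11, 54.10), (45.11, 54.11), (45.10, 54.11)]
house_out = [(45.60, 54.10), (45.61, 54.10), (45.61, 54.11), (45.60, 54.11)]
```

Treating longitude and latitude as plane coordinates is acceptable for contours the size of a settlement; real contours also have holes and multiple outer rings, which this sketch ignores.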

Installing the plugin

Osmosis picks up osmareatag from its plugins directory. On Windows this is c:\Users\<>\.openstreetmap\osmosis\plugins or c:\Users\<>\AppData\Roaming\openstreetmap\osmosis\plugins. The osmareatag-1.3.zip archive goes into plugins.

The plugin is controlled by a configuration file. An example:

<?xml version="1.0" encoding="UTF-8"?>
<tag-processing>  
  <area id="national-boundary" cache-file="national-boundary.idx">
    <match type="relation">
      <tag k="boundary" v="administrative"/>
      <tag k="admin_level" v="2"/>
    </match>
  </area>

  <transform>
    <name>Country</name>
    <match>
      <tag k="building" v=".*"/>
      <tag k="addr:housenumber" v=".*"/>
      <inside area="national-boundary"/>
    </match>
    <output>
      <add-tag k="addr:country" v="${ISO3166-1}" context-area="national-boundary"/>
    </output>
  </transform>
</tag-processing>

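The first part defines the areas. The area id is the name used to refer to the area later. The match element selects the OSM objects that make up the area; here these are relations tagged as administrative boundaries with admin_level 2, i.e. country borders. cache-file names the file in which the areas built from the OSM data are cached.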

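The second part is the transform. Its match selects the objects to be tagged, here buildings with a house number. The inside condition requires the object to lie inside the area with the given id.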

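Finally, output says what to do with a matched object: here it adds the addr:country tag, taking the value from the ISO3166-1 tag of the enclosing national-boundary area.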
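How such a rule behaves can be modelled in a few lines of Python. This is a toy model of the Country transform from the configuration above, not the plugin's code: it assumes that the v attributes are regular expressions that must match the whole tag value, and that ${...} is replaced with the corresponding tag of the enclosing area. All function names are invented for the illustration.

```python
import re
from typing import Dict, Optional

Tags = Dict[str, str]

def matches(tags: Tags, patterns: Tags) -> bool:
    """True when every listed key is present and its value fully matches the regex."""
    return all(
        key in tags and re.fullmatch(pattern, tags[key]) is not None
        for key, pattern in patterns.items()
    )

def expand(template: str, area_tags: Tags) -> str:
    """Replace each ${key} in the template with that tag of the enclosing area."""
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: area_tags.get(m.group(1), ""), template)

def apply_transform(tags: Tags, area_tags: Optional[Tags]) -> Tags:
    """Model of the Country transform; area_tags is None when the building
    lies inside no national-boundary area."""
    wanted = {"building": ".*", "addr:housenumber": ".*"}
    if area_tags is None or not matches(tags, wanted):
        return dict(tags)  # rule does not apply, tags stay as they were
    result = dict(tags)
    result["addr:country"] = expand("${ISO3166-1}", area_tags)
    return result

# Hypothetical building and the tags of the country relation that contains it
house = {"building": "yes", "addr:housenumber": "5"}
country = {"boundary": "administrative", "admin_level": "2", "ISO3166-1": "RU"}
tagged = apply_transform(house, country)
```

The geometric part (deciding which area, if any, contains the building) is deliberately left out of this model.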

For our extract we need settlements rather than countries, so the configuration becomes:

<?xml version="1.0" encoding="UTF-8"?>
<tag-processing>
  <area id="place">
    <match>
      <tag k="place" v="city|town|village|hamlet|isolated_dwelling|allotments"/>
    </match>
  </area>

  <transform>
    <name>Place</name>
    <match>
      <tag k="building" v=".*"/>
      <tag k="addr:housenumber" v=".*"/>
      <inside area="place"/>
    </match>
    <output>
      <add-tag k="addr:city-auto" v="${name}" context-area="place"/>
    </output>
  </transform>
</tag-processing>

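The result goes into a separate tag, addr:city-auto, so that it can be compared with the addr:city value already entered in OSM. The output is written as osm-xml. The command: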

call osmosis-0.48.3\bin\osmosis.bat --read-pbf SaranskGO.pbf --lp --tag-area-content file=tag-building-addr-place.xml --write-xml SaranskGO.place.osm

tag-building-addr-place.xml is the configuration file shown above.

A building in the result looks like this:

  <way id="103738775" version="2" timestamp="2019-09-20T18:28:15Z" uid="10124028" user="MarinaAR" changeset="74731679">
    <nd ref="1197639591"/>
    <nd ref="1197639690"/>
    <nd ref="1197639206"/>
    <nd ref="1197639237"/>
    <nd ref="1197639591"/>
    <tag k="building" v="yes"/>
    <tag k="addr:city" v=""/>
    <tag k="addr:street" v=" "/>
    <tag k="addr:housenumber" v="5"/>
    <tag k="addr:city-auto" v=""/>
  </way>

To check the result, I converted it to CSV and opened it in QGIS.
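One possible way to get from the osm-xml output to a CSV that QGIS can load as a point layer is sketched below in Python. It is not necessarily the conversion used for the figure. The function names, the column set and the sample data (including the settlement names) are assumptions; the position of a building is taken as the plain mean of its node coordinates, and buildings without addr:city are flagged as mismatches too.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Dict, List

FIELDS = ["id", "lon", "lat", "addr:city", "addr:city-auto", "mismatch"]

def buildings_to_rows(osm_xml: str) -> List[Dict[str, object]]:
    """Collect addressed ways with both city tags and a rough centre point."""
    root = ET.fromstring(osm_xml)
    nodes = {n.get("id"): (float(n.get("lon")), float(n.get("lat")))
             for n in root.iter("node")}
    rows = []
    for way in root.iter("way"):
        tags = {t.get("k"): t.get("v") for t in way.iter("tag")}
        if "addr:housenumber" not in tags:
            continue
        pts = [nodes[nd.get("ref")] for nd in way.iter("nd")
               if nd.get("ref") in nodes]
        if not pts:
            continue  # way without node coordinates in this file
        city = tags.get("addr:city", "")
        auto = tags.get("addr:city-auto", "")
        rows.append({
            "id": way.get("id"),
            "lon": round(sum(p[0] for p in pts) / len(pts), 7),
            "lat": round(sum(p[1] for p in pts) / len(pts), 7),
            "addr:city": city,
            "addr:city-auto": auto,
            "mismatch": city != auto,
        })
    return rows

def rows_to_csv(rows: List[Dict[str, object]]) -> str:
    """Serialise the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical two-building sample in the same shape as the osmosis output
SAMPLE = """<osm version="0.6">
  <node id="1" lat="54.10" lon="45.10"/>
  <node id="2" lat="54.10" lon="45.12"/>
  <node id="3" lat="54.20" lon="45.30"/>
  <node id="4" lat="54.20" lon="45.32"/>
  <way id="10"><nd ref="1"/><nd ref="2"/>
    <tag k="building" v="yes"/><tag k="addr:housenumber" v="5"/>
    <tag k="addr:city" v="Saransk"/><tag k="addr:city-auto" v="Saransk"/></way>
  <way id="11"><nd ref="3"/><nd ref="4"/>
    <tag k="building" v="yes"/><tag k="addr:housenumber" v="7"/>
    <tag k="addr:city" v="Saransk"/><tag k="addr:city-auto" v="Example village"/></way>
</osm>"""

rows = buildings_to_rows(SAMPLE)
csv_text = rows_to_csv(rows)
```

In QGIS such a file can be added as a delimited text layer with lon and lat as the X and Y fields, and the mismatch column can then be used to filter or style the points.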

Figure 1. addr:city is not the same as addr:city-auto

You can see that entire villages are labelled incorrectly. Sometimes it is simply garbage in the settlement name. Sometimes it is confusion between the city of Saransk and the municipality of the same name, which includes several settlements. Or, the other way round, the name of the rural settlement is entered in place of the village name. Within the city itself there are several dozen points with misprints in the name. As I said before: leave this job to machines; wherever a mistake is possible, a person will make it.

So far only the settlement name has been assigned. The same can be done by analogy for the higher levels of the hierarchy: municipalities and the regions of countries.



