PostGis. How do I find an error in a spatial query?

image


Good afternoon! I am Victor, a developer at Gems development. Every day, our team works with spatial data of varying complexity and quality. When performing a spatial intersection operation with Postgis in Postgresql, we encountered the following error:



XX000: GEOSIntersects: TopologyException: side location conflict at 10398.659 3844.9200000000001



The request leading to the error looks like this:



select q1.key,st_asGeoJson(geoloc)
    from usahalinsk.V_GEO_OOPT q1 
        where ST_Intersects(geoloc,
                ST_GeomFromGeoJSON('{"type":"Polygon","coordinates":
                    [[[11165.15,2087.5],[11112,2066.6],[11127.6,2022.5],
                    [11122.6,2020.7],
                    [11122.25,2021.2],[11107.07,2015.7],
                    [11121,1947],[11123.48,1922.99],[11128.42,1874.4],
                    [11131.5,1875],[11140.96,1876.81],[11160.73,1880.59],
                    [11201.04,1888.3],[11194.2,1908],[11221.93,1916.57],
                    [11223.3,1917],[11165.15,2087.5]]]}'))



The solution to this problem blocks the work of users, since it does not allow building reports on the data and slows down the work of providing services. Many actions in the system we are developing, such as: preparing a layout for a land plot, preparing an urban planning plan for a land plot, and others, use spatial operations like this.



Let's assume that the problem is incorrect geometry. Often this error is generated by the intersection operation if the objects involved in the query have self-intersections or duplicate points. An example of these geometry errors can be seen below. (The polygon border intersects itself and there are two identical coordinates in the line)





We have conducted our own investigation to find the causes of the error and want to tell you about it.

We are currently using Postgis 2.4 and Postgresql 9.6. Let's go straight to practice. Let's check the constant geometry for validity and find that everything works correctly.





We can assume that the matter is in the table (view) usahalinsk.V_GEO_OOPT in which we are looking for intersections. In order to confirm the hypothesis, we will check these data too.





But we don't find errors here either. In addition, the data was not included in the sample at all. If they were, then the task would be solved by correcting the found entries through the Postgis st_makeValid function.



But there are no errors in the view, and the request is not executed. We suggest looking at his plan.





Note: in the real model we use three columns for geometry (for polygons, lines and points), but for brevity we will call this the geoloc field - it stores the geometry and displays it in the view.



Our view usahalinsk.V_GEO_OOPT is built as a selection from the table with spatial data usahalinsk.d_geometry and a spatial index is created on the field with geometry.



This means that when executing a query, the index is being read and somewhere in the table, not getting into our selection, there is invalid spatial data that was included in the index, since it is built across the entire table.



Let's try deleting the index:



DROP INDEX usahalinsk.d_geometry_cs1_all_sx;


And let's try to fulfill the problematic request.





It ran without errors. We confirm that the problem is in the index. You can return the index, but with the condition for the correct geometry:



CREATE INDEX d_geometry_cs1_all_sx
  ON usahalinsk.d_geometry
  USING gist(geoloc)
  where st_isvalid(geoloc)=true;


Let's check the implementation and see the plan.





The request ran without errors, and the index in the plan is also used. The disadvantage of such a solution may be the slowdown of the insert / update, tk. additionally, the condition will be checked when rebuilding the index.



Let's return this change back and still try to find what objects in the index are causing our query to fail.



DROP INDEX usahalinsk.d_geometry_cs1_all_sx;
 
CREATE INDEX d_geometry_cs1_all_sx
  ON usahalinsk.d_geometry
  USING gist
  (geoloc);


Let me remind you that we have the coordinates of the error location:



XX000: GEOSIntersects: TopologyException: side location conflict at 10398.659 3844.9200000000001



But if we search in the data or as a result of the IsValidReason function, which returns the reason for the error, then we will not find anything similar.



select key,ST_IsValidReason(geoloc)
from usahalinsk.d_geometry 
    where st_isvalid(geoloc)!=true
        and ST_AsText(geoloc) like '%3844.9200000000001%';
        
select key,ST_IsValidReason(geoloc)
from usahalinsk.d_geometry 
    where st_isvalid(geoloc)!=true
        and ST_IsValidReason(geoloc) like '%3844.9200000000001%';


You can use the following script to find objects that affect the query. We will check each object in the table and intersect it with the desired constant. During execution, we catch exceptions and check their contents. If the error contains the coordinates we need, then this is our problem geometry.



do
$$
declare
    tKey bigint;
    rec record;
    error_text text;
    -- 
    error_info text:='GEOSIntersects: TopologyException: side location conflict at 10398.659 3844.9200000000001';
begin
    --    
    for rec in(select key from usahalinsk.d_geometry)
    loop
        begin
            select key into tKey
            from (select * from usahalinsk.d_geometry q1 
                                --   
                        where q1.key=rec.key
                            and ST_Intersects(geoloc,
                                    -- 
                                    ST_GeomFromGeoJSON('{"type":"Polygon","coordinates":[[[11165.15,2087.5],
                                    [11112,2066.6],[11127.6,2022.5],[11122.6,2020.7],
                                    [11122.25,2021.2],[11107.07,2015.7],[11121,1947],                                                    [11123.48,1922.99],[11128.42,1874.4],
[11131.5,1875],[11140.96,1876.81],                                    [11160.73,1880.59],[11201.04,1888.3],
[11194.2,1908],[11221.93,1916.57],[11223.3,1917],
                                    [11165.15,2087.5]]]}'))) geoQ;        
        exception when others then
                --    
              GET STACKED DIAGNOSTICS error_text = MESSAGE_TEXT;
            --    ,     
            if error_text=error_info then
                raise info '%',rec.key;    
            end if;                  
        end;
    end loop;
end$$;


As a result, we get three geometry keys that are easy to fix:



update usahalinsk.d_geometry 
set cs1_geometry_polygone=st_collectionextract(st_makevalid(geoloc),3)
where key in(
1000010001988961,
1000010001989399,
1000010004293508);


I will answer the question that arises: "why is it impossible to correct all erroneous geometry in the table, so as not to selectively search for the reasons?" ...



The fact is that spatial data comes to our system from various sources (including from Rosreestr) and we cannot perform correction (as a rule, it is accompanied by distortion) of all data. Having received the necessary keys, we analyze what data they represent and whether it is possible to correct them.



The trivial task of finding the cause of the error can turn into a whole investigation with a correction script at the end.



A more complex version of the problem: what if the intersection is performed not with a constant, but with another table? Alternatively, intersect each of the participating objects in the first table with every object in the second. And catch exceptions.



How often do you run into geometry problems and how do you ensure the quality of your spatial data?



All Articles