This article was first published on the Nebula Graph Community official account.
This article introduces geospatial data and how to work with it in Nebula Graph.
Geospatial Data in Practice in Nebula Graph
What is Geospatial Data
Geospatial Data is data that contains information about simple geospatial features, such as points, linestrings, polygons, or other more complex shapes.
Nebula Graph introduced full support for Geospatial Data in version 2.6, including storage, computation, and indexing of geospatial data. Nebula Graph currently supports geospatial data of type Geography, which models geographic location information represented by pairs of latitude and longitude coordinates on the earth's spatial coordinate system.
Geospatial data usage
Create Schema
The examples here use Tags only, but the Geography type can equally be used as a property on an Edge type.
Nebula currently supports three spatial data types: point, linestring, and polygon. Here's how to create a Geography type property and insert geospatial data into Nebula.
CREATE TAG any_shape(geo geography);
CREATE TAG only_point(geo geography(point));
CREATE TAG only_linestring(geo geography(linestring));
CREATE TAG only_polygon(geo geography(polygon));
When no shape is specified after the geography type, the column can store data of any geographic shape. When a shape is specified, the column can store only data of that shape. For example, geography(point) means that the column can store only the geographic data of point shapes.
Insert data
Insert data into the geo column of Tag any_shape:
INSERT VERTEX any_shape(geo) VALUES "101":(ST_GeogFromText("POINT(120.12 30.16)"));
INSERT VERTEX any_shape(geo) VALUES "102":(ST_GeogFromText("LINESTRING(3 8, 4.7 73.23)"));
INSERT VERTEX any_shape(geo) VALUES "103":(ST_GeogFromText("POLYGON((75.3 45.4, 112.5 53.6, 122.7 25.5, 93.9 28.6, 75.3 45.4))"));
Insert data into the geo column of Tag only_point:
INSERT VERTEX only_point(geo) VALUES "201":(ST_Point(120.12, 30.16));
Insert data into the geo column of Tag only_linestring:
INSERT VERTEX only_linestring(geo) VALUES "302":(ST_GeogFromText("LINESTRING(3 8, 4.7 73.23)"));
Insert data into the geo column of Tag only_polygon:
INSERT VERTEX only_polygon(geo) VALUES "403":(ST_GeogFromText("POLYGON((75.3 45.4, 112.5 53.6, 122.7 25.5, 93.9 28.6, 75.3 45.4))"));
When the shape of the inserted geographic data does not match the shape required by the column, an error is reported and the data is not inserted:
(root@nebula) [geo]> INSERT VERTEX only_polygon(geo) VALUES "404":(ST_GeogFromText("POINT((75.3 45.4))"));
[ERROR (-1005)]: Wrong value type: ST_GeogFromText("POINT((75.3 45.4))")
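The shape restriction can be pictured with a small Python sketch. This is not Nebula's implementation: `wkt_shape` and `check_column` are hypothetical helpers that only look at the leading WKT keyword, and `None` is used here to model a plain `geography` column.

```python
from typing import Optional

# The three shapes the article lists as supported.
SHAPES = {"POINT", "LINESTRING", "POLYGON"}

def wkt_shape(wkt: str) -> str:
    """Return the leading WKT keyword, e.g. 'POINT' for 'POINT(1 2)'."""
    keyword = wkt.split("(", 1)[0].strip().upper()
    if keyword not in SHAPES:
        raise ValueError(f"unsupported shape: {keyword!r}")
    return keyword

def check_column(column_shape: Optional[str], wkt: str) -> bool:
    """Return True if the WKT value may be stored in the column.

    column_shape=None models geography (any shape);
    column_shape='point' models geography(point), and so on.
    """
    shape = wkt_shape(wkt)
    return column_shape is None or column_shape.upper() == shape
```

With this model, `check_column(None, "POINT(120.12 30.16)")` is accepted, while `check_column("polygon", "POINT(75.3 45.4)")` is rejected, mirroring the failed insert above.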
We can see that the geospatial data insertion method is rather peculiar, and it is very different from the insertion of basic types such as int, string, and bool.
Take ST_GeogFromText("POINT(120.12 30.16)") as an example. ST_GeogFromText is a parsing function that accepts a string of geographic data in the WKT (Well-Known Text) standard format:
POINT(120.12 30.16)
represents a geographic point at 120.12° east longitude and 30.16° north latitude (decimal degrees, longitude first). The ST_GeogFromText function parses the WKT argument and constructs a geography object, which the INSERT statement then stores in Nebula in the WKB (Well-Known Binary) standard format.
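To make the WKT-to-WKB step concrete, here is a minimal Python sketch for the point case only. It follows the OGC WKB layout for a 2D point (byte-order flag, geometry type, then x and y as doubles). The helper names `parse_wkt_point` and `point_to_wkb` are hypothetical, and the sketch says nothing about how Nebula lays out the stored bytes internally.

```python
import re
import struct
from typing import Tuple

# Matches "POINT(lon lat)" with optional whitespace; coordinates may not contain parentheses.
_POINT_RE = re.compile(r"\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*", re.IGNORECASE)

def parse_wkt_point(wkt: str) -> Tuple[float, float]:
    """Parse 'POINT(lon lat)' into (longitude, latitude) in decimal degrees."""
    match = _POINT_RE.fullmatch(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))

def point_to_wkb(lon: float, lat: float) -> bytes:
    """Encode a 2D point as little-endian WKB.

    Layout: 1 byte order flag (1 = little-endian), uint32 geometry type
    (1 = Point), then x (longitude) and y (latitude) as 8-byte doubles.
    """
    return struct.pack("<BIdd", 1, 1, lon, lat)

lon, lat = parse_wkt_point("POINT(120.12 30.16)")
wkb = point_to_wkb(lon, lat)  # 21 bytes: 1 + 4 + 8 + 8
```

Note that WKT puts longitude before latitude, which is the same argument order ST_Point uses.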
Geospatial functions
The geospatial functions supported by Nebula can be divided into the following broad categories:
Constructor functions
- ST_Point(longitude, latitude): constructs a geography point object from a longitude and latitude pair
Parsing functions
- ST_GeogFromText(wkt_string): parses a geography object from WKT text
- ST_GeogFromWKB(wkb_string): parses a geography object from WKB binary data # Not yet officially supported, because Nebula does not yet support binary strings
Formatting functions
- ST_AsText(geography): outputs the geography object in WKT text format
- ST_AsBinary(geography): outputs the geography object in WKB binary format # Not yet officially supported, because Nebula does not yet support binary strings
Conversion functions
- ST_Centroid(geography): calculates the centroid of the geography object; the result is a geography point object
Predicate functions
- ST_Intersects(geography_1, geography_2): determines whether two geography objects intersect
- ST_Covers(geography_1, geography_2): determines whether the first geography object completely covers the second
- ST_CoveredBy(geography_1, geography_2): the inverse of ST_Covers; determines whether the first geography object is completely covered by the second
- ST_DWithin(geography_1, geography_2, distance_in_meters): determines whether the shortest distance between two geography objects is within the given distance
Measurement functions
- ST_Distance(geography_1, geography_2): calculates the distance between two geography objects
These function interfaces follow the OpenGIS Simple Feature Access and ISO SQL/MM standards. For specific usage, see the Nebula documentation.
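As a rough illustration of what the measurement and distance-predicate functions compute for two points, the following Python sketch uses the haversine formula. It rests on several assumptions: a spherical earth, a mean radius of 6371010 m, and a boundary case that counts as "within". The names `point_distance_m` and `point_dwithin` are hypothetical, and Nebula's own results may differ slightly.

```python
import math

EARTH_RADIUS_M = 6371010.0  # assumption: mean radius of a spherical earth, in meters

def point_distance_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in meters between two lon/lat points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

def point_dwithin(lon1: float, lat1: float, lon2: float, lat2: float,
                  distance_m: float) -> bool:
    """Distance predicate for two points; treating equality as 'within' is an assumption."""
    return point_distance_m(lon1, lat1, lon2, lat2) <= distance_m
```

For example, one degree of longitude along the equator comes out at roughly 111 km under these assumptions, so `point_dwithin(0, 0, 1, 0, 120000)` holds while a 100 km threshold does not.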
Geospatial index
What is a Geospatial Index?
Geospatial indexes are used for fast filtering of geographic shapes based on spatial predicate functions, such as ST_Intersects, ST_Covers, etc.
Nebula uses the Google S2 library for spatial indexing.
The S2 library projects the earth's surface onto a circumscribed cube, then recursively subdivides each square face of the cube into four smaller squares n times, and finally uses a space-filling curve, the Hilbert curve, to connect the centers of these small squares.
As n approaches infinity, the Hilbert curve comes arbitrarily close to every point of the square.
The S2 library uses a Hilbert curve of order 30.
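The idea of a space-filling curve is easier to see on a small grid. The sketch below is the generic textbook construction that maps a position `d` along a Hilbert curve to a cell `(x, y)` on an n × n grid, where n is a power of two. It is not the S2 library's code, S2's orientation and numbering may differ, and `hilbert_d2xy` is a name chosen for this example.

```python
from typing import Tuple

def hilbert_d2xy(n: int, d: int) -> Tuple[int, int]:
    """Map distance d along the Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two; d ranges over 0 .. n*n - 1.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant so the sub-curves join end to end.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate the quadrant
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# An order-3 curve covers an 8 x 8 grid; the article's order-30 curve
# would correspond to a 2**30 x 2**30 grid on each cube face.
path = [hilbert_d2xy(8, d) for d in range(64)]
```

Two properties matter for indexing: every cell is visited exactly once, and consecutive positions on the curve are always neighbouring cells, so nearby positions along the curve stay close together in space.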
The following figure is a schematic diagram of filling the earth's surface with the Hilbert curve:
It can be seen that the surface of the earth is finally divided into cells along these Hilbert curves. For any geographic shape on the earth's surface, such as the location of a city, a river, or a person, we can use several such cells to completely cover the shape.
Each cell is identified by a unique 64-bit integer CellID. The spatial index of a geographic object is therefore built from a collection of S2 cells that completely covers its shape.
When building an index of a geospatial object, a collection of distinct S2 cells is constructed that completely covers the object being indexed. Index queries based on spatial predicate functions quickly filter out large numbers of irrelevant geographic objects by finding the intersection between the set of S2 cells covering the queried object and the S2 cells covering the indexed object.
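The cell-intersection test can be sketched in Python with a deliberately simplified cell model. Here a cell is a string whose first character is the cube face and whose remaining digits (0 to 3) are the child chosen at each subdivision, so a parent cell is a prefix of its descendants. This string model, the sample coverings, and the function names are all assumptions made for illustration; real S2 CellIDs are 64-bit integers.

```python
from typing import Dict, List, Set

def cells_intersect(a: str, b: str) -> bool:
    """Two quadtree cells intersect only if one contains the other (prefix test)."""
    return a.startswith(b) or b.startswith(a)

def coverings_intersect(cov_a: Set[str], cov_b: Set[str]) -> bool:
    """True if any cell of one covering intersects any cell of the other."""
    return any(cells_intersect(a, b) for a in cov_a for b in cov_b)

def candidates(index: Dict[str, Set[str]], query_cov: Set[str]) -> List[str]:
    """Return vertex IDs whose covering intersects the query covering."""
    return sorted(vid for vid, cov in index.items()
                  if coverings_intersect(cov, query_cov))

# Hypothetical coverings: "101" is a single deep cell (a point),
# "102" needs two cells, "103" lies on a different cube face.
index = {"101": {"30121"}, "102": {"3012", "3013"}, "103": {"41"}}
result = candidates(index, {"301"})  # ["101", "102"]
```

Objects whose coverings share no cell with the query covering, such as "103" here, are discarded without ever evaluating the exact geometric predicate.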
Create a geography index
CREATE TAG INDEX any_shape_geo_index ON any_shape(geo);
Geographic data whose shape is a point can be represented by a single S2 cell at level 30, so a point corresponds to one index entry. Geographic data whose shape is a linestring or polygon is covered by multiple S2 cells of different levels, so it corresponds to multiple index entries.
Spatial indexes are used to speed up lookups based on any geo predicate, such as the following statement:
LOOKUP ON any_shape WHERE ST_Intersects(any_shape.geo, ST_GeogFromText("LINESTRING(3 8, 4.7 73.23)"));
When there is no spatial index on the geo column of any_shape, this statement first reads all the data of any_shape into memory and then checks each value for intersection with the line LINESTRING(3 8, 4.7 73.23), which is generally expensive. When the data volume of any_shape is large, the computational overhead becomes unacceptable.
When the geo column of any_shape has a spatial index, the statement first uses the spatial index to filter out most of the data that cannot intersect the line. Only the remaining data, which may intersect, is read into memory and checked again with the exact calculation. In this way, the spatial index discards most non-intersecting data at a small cost, and only a small part needs exact filtering, which greatly reduces the computational cost.
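This two-phase "filter, then refine" flow can be sketched in Python. To keep the example short, it makes strong simplifications that are not how Nebula works internally: a flat 10-degree lon/lat grid stands in for S2 cells, the indexed objects are points, and the query shape is a lon/lat box rather than a line. All names are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

CELL_DEG = 10.0  # assumption: coarse grid cell size in degrees, standing in for S2 cells

Point = Tuple[float, float]                # (lon, lat)
Box = Tuple[float, float, float, float]    # (min_lon, min_lat, max_lon, max_lat)

def cell_of(lon: float, lat: float) -> Tuple[int, int]:
    """Grid cell containing the given point."""
    return math.floor(lon / CELL_DEG), math.floor(lat / CELL_DEG)

def build_index(points: Dict[str, Point]) -> Dict[Tuple[int, int], Set[str]]:
    """Map each grid cell to the IDs of the points inside it."""
    index: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
    for vid, (lon, lat) in points.items():
        index[cell_of(lon, lat)].add(vid)
    return index

def lookup(points: Dict[str, Point], index, box: Box) -> Tuple[List[str], int]:
    """Return (matching IDs, number of exact checks performed)."""
    min_lon, min_lat, max_lon, max_lat = box
    x0, y0 = cell_of(min_lon, min_lat)
    x1, y1 = cell_of(max_lon, max_lat)
    # Filter phase: collect only points in cells that touch the query box.
    cand: Set[str] = set()
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            cand |= index.get((x, y), set())
    # Refine phase: run the exact predicate on the surviving candidates only.
    hits = sorted(v for v in cand
                  if min_lon <= points[v][0] <= max_lon
                  and min_lat <= points[v][1] <= max_lat)
    return hits, len(cand)

points = {"a": (120.12, 30.16), "b": (121.5, 38.9), "c": (3.0, 8.0), "d": (-70.0, -20.0)}
hits, checked = lookup(points, build_index(points), (119.0, 29.0, 122.0, 31.0))
```

In this toy data set the index narrows four stored points down to two candidates, and the exact check then keeps only "a". A full scan would have run the exact check on all four.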