I have some polygons that I am trying to upload to BigQuery. The polygons are created with shapely from a Point (lat & long) and a radius, i.e., polygon = shapely.geometry.Point(lng, lat).buffer(r). They are first converted into a dataframe of GeoJSON strings using geojson.dumps(mapping(polygon)), then uploaded to BigQuery, and finally converted to a GEOGRAPHY type using SELECT SAFE.st_geogfromgeojson(polygon, make_valid => TRUE) FROM table_name.

However, as part of this process, some entries became NULL (there were no missing values in the uploaded dataframe). I suspect that BigQuery does not fully recognize shapely polygons and that the invalid ones were automatically removed.

Is there a way to make sure this does not happen?

Asked Nov 19, 2024 at 18:28 by Winston Li

1 Answer

The query you used has the SAFE. prefix, which is designed to do exactly this: it converts anything BigQuery cannot accept into NULL instead of raising an error.

Remove the SAFE. prefix and the query will fail with an error describing why a value cannot be converted to the GEOGRAPHY type.

What I usually do is create a table with failures:

CREATE TABLE tmp.failures AS
SELECT geoid, polygon
FROM table_name
WHERE polygon IS NOT NULL 
 AND SAFE.st_geogfromgeojson(polygon, make_valid => TRUE) IS NULL

This creates a table containing only the failures: the rows where the original polygon is not NULL but the result is NULL. You can then check the cause of each failure row one by one, by removing the SAFE. prefix and checking the error message, with queries like

SELECT st_geogfromgeojson(polygon, make_valid => TRUE)
FROM tmp.failures
WHERE geoid = 12345
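
If there are many failure rows, the one-by-one check can be scripted. The following is only a rough Python sketch, not part of the original answer: `collect_errors`, `run_check` and `CHECK_SQL` are hypothetical names, `geoid` is assumed to be an integer key, and the function that actually runs the query is passed in so the loop itself does not depend on any particular client library.

```python
from typing import Callable, Dict, Iterable

# Same per-row check as above, deliberately without the SAFE. prefix so that
# BigQuery raises an error instead of returning NULL.
# Assumptions: tmp.failures and geoid are the example names used in this answer;
# @geoid is a named query parameter supplied by the caller.
CHECK_SQL = (
    "SELECT st_geogfromgeojson(polygon, make_valid => TRUE) "
    "FROM tmp.failures WHERE geoid = @geoid"
)


def collect_errors(
    run_check: Callable[[int], None], geoids: Iterable[int]
) -> Dict[int, str]:
    """Call run_check for each geoid and map failing geoids to their error text."""
    errors: Dict[int, str] = {}
    for geoid in geoids:
        try:
            run_check(geoid)  # expected to raise if the conversion fails
        except Exception as exc:
            errors[geoid] = str(exc)
    return errors
```

With the google-cloud-bigquery package, `run_check` could wrap something like `client.query(CHECK_SQL, job_config=bigquery.QueryJobConfig(query_parameters=[bigquery.ScalarQueryParameter("geoid", "INT64", geoid)])).result()`; the INT64 type is an assumption about the `geoid` column.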

Tags: python | BigQuery does not recognize Shapely Polygons | Stack Overflow