admin管理员组

文章数量:1402939

I've a Snowflake table with a GEOGRAPHY datatype (polygons), which I try to convert to e GeoPandas DataFrame. The Snowpark Dataframe holds the SHAPE column as (GeographyType)

Calling

gpd.GeoDataFrame(session.table(table_name).to_pandas(),geometry="SHAPE",crs=4326)

results in an exception

  TypeError: Input must be valid geometry objects: {
  "coordinates": [
    [
      [
        [
          6.706410552876440e+00,
          4.658134676959030e+01,
          8.919489999999932e+02
        ],

I came up with an ugly workaround based on Convert pandas dataframe with a geojson-like column to Geopandas dataframe

from ast import literal_eval
from shapely.geometry import shape

df_pd = df_sp.to_pandas()

geom = df_pd.pop("SHAPE").apply(lambda x: shape(literal_eval(x)))
gpd.GeoDataFrame(df_pd, geometry=geom,crs=4326)

Is there a better option?

I've a Snowflake table with a GEOGRAPHY datatype (polygons), which I try to convert to e GeoPandas DataFrame. The Snowpark Dataframe holds the SHAPE column as (GeographyType)

Calling

gpd.GeoDataFrame(session.table(table_name).to_pandas(),geometry="SHAPE",crs=4326)

results in an exception

  TypeError: Input must be valid geometry objects: {
  "coordinates": [
    [
      [
        [
          6.706410552876440e+00,
          4.658134676959030e+01,
          8.919489999999932e+02
        ],

I came up with an ugly workaround based on Convert pandas dataframe with a geojson-like column to Geopandas dataframe

from ast import literal_eval
from shapely.geometry import shape

df_pd = df_sp.to_pandas()

geom = df_pd.pop("SHAPE").apply(lambda x: shape(literal_eval(x)))
gpd.GeoDataFrame(df_pd, geometry=geom,crs=4326)

Is there a better option?

Share Improve this question asked Mar 21 at 10:06 Raphael RothRaphael Roth 27.2k19 gold badges97 silver badges150 bronze badges
Add a comment  | 

1 Answer 1

Reset to default 2 +200

The idea is to setup GEOGRAPHY_OUTPUT_FORMAT and use shapely.wkt.loads:

For sample data:

CREATE OR REPLACE TABLE geospatial_table (id INTEGER, SHAPE GEOGRAPHY);
INSERT INTO geospatial_table VALUES
  (1, 'POINT(-122.35 37.55)'),
  (2, 'LINESTRING(-124.20 42.00, -120.01 41.99)');
SELECT * FROM geospatial_table;

Code:

import geopandas
import shapely

from snowflake.snowpark.context import get_active_session
session = get_active_session()

session.sql("ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';").collect()

pd = session.table('geospatial_table').to_pandas()
pd['SHAPE'] = pd['SHAPE'].apply(shapely.wkt.loads)

geopandas.GeoDataFrame(pd,geometry="SHAPE",crs=4326)

Output:

EDIT:

This works unless you have Z-Coordinates, in this case we get MULTIPOLYGONZ from Snowflake which cannot be parsed by shapely.wkt.loads

In that case a WKB may be a better choice:

CREATE OR REPLACE TABLE geospatial_table (id INTEGER, SHAPE GEOGRAPHY);
INSERT INTO geospatial_table VALUES
  (1, 'POINTZ(-122.35 37.55 0)'),
  (2, 'LINESTRINGZ(-124.20 42.00 0.5, -120.01 41.99 0.5)');
SELECT * FROM geospatial_table;


import geopandas
import shapely

from snowflake.snowpark.context import get_active_session
session = get_active_session()

session.sql("ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKB';").collect()

pd = session.table('geospatial_table').to_pandas()
pd['SHAPE'] = pd['SHAPE'].apply(shapely.wkb.loads)

geopandas.GeoDataFrame(pd,geometry="SHAPE",crs=4326)

Output:

本文标签: