admin管理员组

文章数量:1312966

I've started learning Azure Synapse (and other Azure data engineering related stuff). I wanted to run a sample queries from Azure Synapse gallery, unfortunatelly in majority of cases I am getting errors:

"cannot be opened because it does not exist or it is used by another process"

It works when I am querying files on my storage account, it works when I am adding datasets from Synapse gallery and querying it.

It works when I am trying to query, for example file (from synapse gallery):

.csv

but it fails when I am trying to query files from

for example:

.csv

I am watching some tutorial, and on video it works without any other additional settings.

/* Explicitly specify schema */
select top 10 *
from openrowset(
        bulk '.csv',
        format = 'csv',
        parser_version ='2.0',
        firstrow = 2
    ) with (
        date_rep date 1,
        cases int 5,
        geo_id varchar(6) 8
    ) as rows

/* Windows style new line */
SELECT *
FROM OPENROWSET(
        BULK '.csv',
        FORMAT = 'CSV', PARSER_VERSION = '2.0',
        FIELDTERMINATOR =',',
        ROWTERMINATOR = '
'
    )
WITH (
    [country_code] VARCHAR (5) COLLATE Latin1_General_BIN2,
    [country_name] VARCHAR (100) COLLATE Latin1_General_BIN2,
    [year] smallint,
    [population] bigint
) AS [r]
WHERE
    country_name = 'Luxembourg'
    AND year = 2017;

1st is running ok, 2nd ends with error. Both from synapse gallery.

Any ideas how to run it and/or why is it not running? Thank you in advance

I've started learning Azure Synapse (and other Azure data engineering related stuff). I wanted to run a sample queries from Azure Synapse gallery, unfortunatelly in majority of cases I am getting errors:

"cannot be opened because it does not exist or it is used by another process"

It works when I am querying files on my storage account, it works when I am adding datasets from Synapse gallery and querying it.

It works when I am trying to query, for example file (from synapse gallery):

https://pandemicdatalake.blob.core.windows/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.csv

but it fails when I am trying to query files from

https://sqlondemandstorage.blob.core.windows

for example:

https://sqlondemandstorage.blob.core.windows/public-csv/population/population.csv

I am watching some tutorial, and on video it works without any other additional settings.

/* Explicitly specify schema */
select top 10 *
from openrowset(
        bulk 'https://pandemicdatalake.blob.core.windows/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.csv',
        format = 'csv',
        parser_version ='2.0',
        firstrow = 2
    ) with (
        date_rep date 1,
        cases int 5,
        geo_id varchar(6) 8
    ) as rows

/* Windows style new line */
SELECT *
FROM OPENROWSET(
        BULK 'https://sqlondemandstorage.blob.core.windows/public-csv/population/population.csv',
        FORMAT = 'CSV', PARSER_VERSION = '2.0',
        FIELDTERMINATOR =',',
        ROWTERMINATOR = '
'
    )
WITH (
    [country_code] VARCHAR (5) COLLATE Latin1_General_BIN2,
    [country_name] VARCHAR (100) COLLATE Latin1_General_BIN2,
    [year] smallint,
    [population] bigint
) AS [r]
WHERE
    country_name = 'Luxembourg'
    AND year = 2017;

1st is running ok, 2nd ends with error. Both from synapse gallery.

Any ideas how to run it and/or why is it not running? Thank you in advance

Share Improve this question asked Feb 2 at 15:44 buksbuks 4351 gold badge7 silver badges22 bronze badges 1
  • can you confirm if you have created an external location ? – Dileep Raj Narayan Thumula Commented Feb 10 at 11:52
Add a comment  | 

1 Answer 1

Reset to default 0

As you mentioned you are able to query the file from the ADLS.

The below are the steps that help you to query using the OPENROWSET Function. You need to create an external table on top of the file that is sitting in the ADLS.

Because at the time of external table creation you will be creating DATA SOURCE.

IF NO DATA SOURCE: Using OPENROWSET can read file contents directly from the URL specified in the BULK option without requiring a DATA_SOURCE.

USING DATA SOURCE: OPENROWSET with DATA_SOURCE allows access to files stored in a specified storage account.

Know more about how to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics

SELECT TOP 100 * 
FROM OPENROWSET(
    BULK 'new/ecdc_cases.csv',  
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS [file];

Results:

Below is the script to create an external table:

CREATE DATABASE <Your Serverless DB NAME>
Use <Your Serverless DB NAME>
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<YOUR MASTER KEY PASSWORD>'
CREATE DATABASE SCOPED CREDENTIAL <YOUR DATABASE SCOPED CREDENTIALS>
WITH IDENTITY ='SHARED ACCESS SIGNATURE',
SECRET ='Your SAS Token'
GO;
CREATE EXTERNAL DATA SOURCE <YOUR DATA SOURCE NAME>
WITH( LOCATION     = 'https://<YOUR STORAGE ACCOUNT>.dfs.core.windows/'
      , CREDENTIAL = credentialobject20

);

CREATE EXTERNAL FILE FORMAT <YOUR FILE FORMAT>  
WITH (  
    FORMAT_TYPE = DELIMITEDTEXT,  
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',', STRING_DELIMITER = '"', FIRST_ROW = 2)  
)
CREATE EXTERNAL TABLE  <YOUR EXTERNAL TABLE> (
    [dateID] INT,
    [product_id] varchar(500),
    [location_id] varchar(500),
    [visits] INT
)
WITH(
    LOCATION ='</FILE_PATH/TO.CSV>',
    DATA_SOURCE = <YOUR DATA SOURCE NAME>,
    FILE_FORMAT = <YOUR FILE FORMAT>  
)
GO

You can also refer to the Create and use native external tables using SQL pools in Azure Synapse Analytics

You can also create an external table using the file as well.

below are the steps:

Once you create the external table you will able to see the entire script which is auto genrated by the system and when you run the select statement it gives results.

Results:

本文标签: azure synapsehow to run queries from galleryStack Overflow