I’m facing an issue while trying to create an MLTable YAML file for a dataset in Azure ML.

I have a default datastore in my workspace containing two folders (OK and NOK) with images. My goal is to read all images and use the folder name as the label for each image.

Here’s what I’ve tried so far:

import os

from azure.ai.ml import Input

mltable_yaml = """
type: mltable
paths:
  - file: ./OK  
  - file: ./NOK 
transformations:
  - read_from_directory:
      image_column: image_url  
      folder_column: label  
      recursive: true         
"""

# Create directory and save MLTable
mltable_dir = "image_data"
os.makedirs(mltable_dir, exist_ok=True)
with open(os.path.join(mltable_dir, "MLTable"), "w") as f:
    f.write(mltable_yaml)

training_data = Input(
    type="mltable",
    path=mltable_dir
)

However, when I run the experiment, I encounter the following error:

MLTable input is invalid. UserErrorException:
    Message: Encountered user error while fetching data from Dataset. Error: UserErrorException:
    Message: MLTable yaml schema is invalid: 
Error Code: ScriptExecution.Validation
Validation Error Code: Invalid
Validation Target: Script
Native error: Dataflow script error: InvalidScriptElement("read_from_directory")
    ScriptError(InvalidScriptElement("read_from_directory"))
=> Invalid script element "read_from_directory"
    InvalidScriptElement("read_from_directory")
Error Message: Yaml script is invalid: InvalidScriptElement("read_from_directory").| session_id=1a30b15a-7e85-498b-b735-2348bfe0625b
    InnerException None
    ErrorResponse 
{
    "error": {
        "code": "UserError",
        "message": "MLTable yaml schema is invalid: \nError Code: ScriptExecution.Validation\nValidation Error Code: Invalid\nValidation Target: Script\nNative error: Dataflow script error: InvalidScriptElement(\"read_from_directory\")\n\tScriptError(InvalidScriptElement(\"read_from_directory\"))\n=> Invalid script element \"read_from_directory\"\n\tInvalidScriptElement(\"read_from_directory\")\nError Message: Yaml script is invalid: InvalidScriptElement(\"read_from_directory\").| session_id=1a30b15a-7e85-498b-b735-2348bfe0625b"
    }
}
    InnerException UserErrorException:
    Message: MLTable yaml schema is invalid: 
Error Code: ScriptExecution.Validation
Validation Error Code: Invalid
Validation Target: Script
Native error: Dataflow script error: InvalidScriptElement("read_from_directory")
    ScriptError(InvalidScriptElement("read_from_directory"))
=> Invalid script element "read_from_directory"
    InvalidScriptElement("read_from_directory")
Error Message: Yaml script is invalid: InvalidScriptElement("read_from_directory").| session_id=1a30b15a-7e85-498b-b735-2348bfe0625b
    InnerException None
    ErrorResponse 
{
    "error": {
        "code": "UserError",
        "message": "MLTable yaml schema is invalid: \nError Code: ScriptExecution.Validation\nValidation Error Code: Invalid\nValidation Target: Script\nNative error: Dataflow script error: InvalidScriptElement(\"read_from_directory\")\n\tScriptError(InvalidScriptElement(\"read_from_directory\"))\n=> Invalid script element \"read_from_directory\"\n\tInvalidScriptElement(\"read_from_directory\")\nError Message: Yaml script is invalid: InvalidScriptElement(\"read_from_directory\").| session_id=1a30b15a-7e85-498b-b735-2348bfe0625b"
    }
}
    ErrorResponse 
{
    "error": {
        "code": "UserError",
        "message": "Encountered user error while fetching data from Dataset. Error: UserErrorException:\n\tMessage: MLTable yaml schema is invalid: \nError Code: ScriptExecution.Validation\nValidation Error Code: Invalid\nValidation Target: Script\nNative error: Dataflow script error: InvalidScriptElement(\"read_from_directory\")\n\tScriptError(InvalidScriptElement(\"read_from_directory\"))\n=> Invalid script element \"read_from_directory\"\n\tInvalidScriptElement(\"read_from_directory\")\nError M

From the error details, it seems like the read_from_directory element is not recognized, but I’m unsure how to structure the YAML to correctly map the folder name to the label.

How can I resolve this?

Asked Jan 9 at 2:14 by Luis Gustavo
  • Check whether read_from_directory is a valid schema in the MLTable file. – JayashankarGS, commented Jan 9 at 3:35

1 Answer

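MLTable has no read_from_directory transformation, which is why the schema validation fails; check the MLTable documentation for the supported transformations.

For AutoML image classification, the data must be provided as a .jsonl file in which each line is a record with the fields below; check the documentation on preparing image data.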

{
   "image_url":"azureml://subscriptions/<my-subscription-id>/resourcegroups/<my-resource-group>/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image>",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":"class_name"
}

image_url and label are the required fields, and image_url must be the complete datastore path to the image.
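
These two requirements can be checked locally before a job is submitted. The following is a minimal sketch using only the standard library; validate_record is a hypothetical helper, not part of the Azure ML SDK, and the prefix check assumes every image is referenced through the azureml:// datastore scheme shown in the record above.

```python
import json
from typing import List

REQUIRED_FIELDS = ("image_url", "label")


def validate_record(line: str) -> List[str]:
    """Return the problems found in one JSONL line (empty list if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing required field: {field}")

    url = record.get("image_url")
    # Assumption: images are referenced by a full azureml:// datastore URI.
    if isinstance(url, str) and url and not url.startswith("azureml://"):
        problems.append("image_url is not a full azureml:// datastore path")
    return problems
```

Running it over every line of the annotation file and printing any non-empty result is usually enough to catch relative paths or a missing label early.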

Follow the steps below to create the .jsonl file.

First, you need the datastore path to each image, so create a new data asset from the image folder and take its path.

from azure.ai.ml import Input, MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

# Connect to the workspace described by the local config.json
ml_client = MLClient.from_config(credential=credential)

# Register the local images folder as a uri_folder data asset;
# the folder is uploaded to the workspace's default datastore
my_data = Data(
    path="./images",
    type=AssetTypes.URI_FOLDER,
    description="Fridge-items images",
    name="items-images",
)

uri_folder_data_asset = ml_client.data.create_or_update(my_data)

Here, the images folder contains the OK and NOK subfolders.

The datastore path of the uploaded folder is available in uri_folder_data_asset.path.

Next, create the .jsonl file using the code below.

import os
import json

# Local folders to scan, keyed by the label assigned to their images
folders = {
    "OK": "./images/OK",
    "NOK": "./images/NOK"
}

mltable_dir = "image_data_mltable"
os.makedirs(mltable_dir, exist_ok=True)

output_file = os.path.join(mltable_dir, "image_data.jsonl")

# Datastore URI of the uploaded images folder, without a trailing slash
base_uri = uri_folder_data_asset.path.rstrip("/")

with open(output_file, "w") as jsonl_file:
    for label, folder_path in folders.items():
        for file_name in sorted(os.listdir(folder_path)):
            if file_name.lower().endswith((".jpg", ".jpeg", ".png", ".bmp", ".gif")):
                record = {
                    # The subfolder name on the datastore matches the local one
                    "image_url": f"{base_uri}/{os.path.basename(folder_path)}/{file_name}",
                    "label": label
                }
                jsonl_file.write(json.dumps(record) + "\n")

print(f"JSONL file created: {output_file}")
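
Since the goal in the question is to use the folder name as the label, the folder-to-label mapping can also be derived from the directory layout instead of being listed by hand. This is a rough sketch under two assumptions: every immediate subfolder of the local root is one class, and the uploaded folder keeps the same subfolder names. iter_records and write_jsonl are hypothetical helper names, not SDK functions.

```python
import json
import os
from typing import Iterator

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".gif")


def iter_records(local_root: str, base_uri: str) -> Iterator[dict]:
    """Yield one record per image, labelled with its parent folder name."""
    base_uri = base_uri.rstrip("/")
    for label in sorted(os.listdir(local_root)):
        folder = os.path.join(local_root, label)
        if not os.path.isdir(folder):
            continue  # skip stray files next to the class folders
        for file_name in sorted(os.listdir(folder)):
            if file_name.lower().endswith(IMAGE_EXTENSIONS):
                yield {
                    "image_url": f"{base_uri}/{label}/{file_name}",
                    "label": label,
                }


def write_jsonl(local_root: str, base_uri: str, output_file: str) -> int:
    """Write all records to output_file and return how many were written."""
    count = 0
    with open(output_file, "w", encoding="utf-8") as handle:
        for record in iter_records(local_root, base_uri):
            handle.write(json.dumps(record) + "\n")
            count += 1
    return count
```

With the names used in this answer, the call would look like write_jsonl("./images", uri_folder_data_asset.path, "./image_data_mltable/image_data.jsonl"). Adding a third class then only requires adding a folder.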

Then create the MLTable file.

mltable_yaml = """
paths:
  - file: ./image_data.jsonl
transformations:
  - read_json_lines:
        encoding: utf8
        invalid_lines: error
        include_path_column: false
  - convert_column_types:
      - columns: image_url
        column_type: stream_info       
"""

with open(os.path.join(mltable_dir, "MLTable"), "w") as f:
    f.write(mltable_yaml)

Use the read_json_lines transformation to read the file; check the documentation on how to prepare image data for details.

Now use it as input.

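import mltable
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

# Pass the folder containing the MLTable file as the training input
training_data = Input(type=AssetTypes.MLTABLE, path="./image_data_mltable")

# Optional: load the table locally to confirm that it parses
tbl = mltable.load(uri="./image_data_mltable")
tbl.to_pandas_dataframe()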
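
Before submitting the job, it can also help to confirm that both classes actually made it into the annotation file. A small sketch using only the standard library; count_labels is a hypothetical helper, and it assumes each non-empty line is a JSON object with a label field as described above.

```python
import json
from collections import Counter


def count_labels(jsonl_path: str) -> Counter:
    """Count how many records each label has in a JSONL annotation file."""
    counts = Counter()
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # ignore blank lines
                counts[json.loads(line)["label"]] += 1
    return counts
```

For the file created earlier, count_labels("./image_data_mltable/image_data.jsonl") should report one entry for OK and one for NOK; a missing class usually points to a wrong folder path or an unexpected file extension.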

You can refer to the sample GitHub documentation for AutoML image classification to learn more.
