I have two components in Azure Machine Learning. The first component (called prep) produces two dataframes, which I want to pass to the next component (called middle) for further processing.
In the prep code, I have tried saving the dataframe to the component's outputs folder, to a datastore, and to the args location passed in as an input parameter, as shown below:
print((Path(args.Y_df) / "Y_df.csv"))
df1.to_csv("./outputs/Y_df.csv")
df1.to_csv(args.Y_df.path)
df1.to_csv("azureml://subscriptions/subscription_id/resourcegroups/rg_group/workspaces/workspace_name/datastores/datastore_name/paths/azureml/forecast/testing/y_df.csv")
Of these, only the first method works. Now I want to pass the result to the next component, so in the pipeline definition code I have written this:
def data_pipeline(
    compute_train_node: str,
):
    prep_node = prep()
    transform_node = middle(Y_df=prep_node.outputs.Y_df,
                            S_df=prep_node.outputs.S_df)
I am trying to run some basic code in the middle component, but it does not even start. It fails with the following error:
Below are the YAMLs for prep and middle:
middle:
name: middle4
display_name: middle4
inputs:
  Y_df:
    type: uri_file
  S_df:
    type: uri_file
code: ./middle
environment: azureml:environment_name:4
command: >-
  python middle_script.py
  --Y_df ${{inputs.Y_df}}
  --S_df ${{inputs.S_df}}
prep:
name: preprocessing24
display_name: preprocessing24
outputs:
  Y_df:
    type: uri_file
  S_df:
    type: uri_file
code: ./preprocessing
environment: azureml:environment_name:4
command: >-
  python preprocessing_script.py
  --Y_df ${{outputs.Y_df}}
  --S_df ${{outputs.S_df}}
What am I doing wrong? How do I pass a file from one component to the other?
Edit after trying out the method in the answer:
As of now, args.Y_df points to some random (probably default) file path instead of the one I gave in the Output() object, as suggested in the answer. The run then fails with this error:
OSError: Cannot save file into a non-existent directory: '/mnt/azureml/cr/j/32h438dshj537dj284ndhs630e1/cap/data-capability/wd/Y_df/testing'
Below is the code I have written for getting the path into the prep code. This path is used to save the dataframes as csv.
import argparse

parser = argparse.ArgumentParser("prep")
parser.add_argument("--Y_df", type=str, help="Path of prepped data")
parser.add_argument("--S_df", type=str, help="Path of prepped data")
parser.add_argument("--clinical_actuals_path", type=str, help="Path of prepped data")
args = parser.parse_args()
Asked by Ameya Bhave
1 Answer
You have to give a datastore path to the outputs of prep_node, like below.
from azure.ai.ml import MLClient, Input, Output

def data_pipeline(
    compute_train_node: str,
):
    prep_node = prep()
    prep_node.outputs.Y_df = Output(type="uri_folder", path="azureml://datastores/<datastore_name>/paths/csvs/Y_df/")
    prep_node.outputs.S_df = Output(type="uri_folder", path="azureml://datastores/<datastore_name>/paths/csvs/S_df/")
    transform_node = middle(Y_df=prep_node.outputs.Y_df,
                            S_df=prep_node.outputs.S_df)
Here, I am giving an Output object with a datastore path to both Y_df and S_df.
Next, save the CSV files in the prep component like below.
df1.to_csv(Path(args.Y_df) / "Y_df.csv")
df2.to_csv(Path(args.S_df) / "S_df.csv")
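The OSError quoted in the question's edit is the message pandas raises when the parent directory of the target path does not exist. Below is a minimal sketch of one way to guard against that, assuming args.Y_df and args.S_df are meant to be folder paths (as with the uri_folder outputs above). ensure_output_file is a hypothetical helper name, not part of the Azure ML SDK:

```python
from pathlib import Path


def ensure_output_file(folder_arg: str, filename: str) -> Path:
    """Create the output folder if needed and return the full file path.

    Hypothetical helper: folder_arg is whatever the component received
    for --Y_df or --S_df, treated here as a directory.
    """
    folder = Path(folder_arg)
    # parents=True creates missing intermediate directories;
    # exist_ok=True makes the call a no-op when the folder already exists.
    folder.mkdir(parents=True, exist_ok=True)
    return folder / filename
```

In the prep script this could be used as df1.to_csv(ensure_output_file(args.Y_df, "Y_df.csv")).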
Alternatively, you can save both files in a single folder by giving the prep component a single output, and then access them from that folder in the next component.
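On the consuming side, here is a sketch of how the middle script might locate a file, assuming the input arrives as a folder path and the file names match those written by prep. find_input_file is a hypothetical helper. It also accepts a direct file path, since the YAML in the question declares uri_file inputs:

```python
from pathlib import Path


def find_input_file(path_arg: str, filename: str) -> Path:
    """Return the file to read for an input passed as a folder or a file."""
    path = Path(path_arg)
    # Folder-style input: the file lives inside the folder.
    # File-style input: the argument already points at the file itself.
    candidate = path / filename if path.is_dir() else path
    if not candidate.is_file():
        raise FileNotFoundError(f"Expected input file at {candidate}")
    return candidate
```

In the middle script this could be used as pd.read_csv(find_input_file(args.Y_df, "Y_df.csv")), with pandas imported as pd.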
Tags: Move data from one component to the next in Azure Machine Learning, Stack Overflow
Comments:
prep_node.outputs.Y_df = "path" before passing into further component. – JayashankarGS, commented 21 hours ago
prep_node = prep(), add prep_node.outputs.Y_df = "path1", prep_node.outputs.S_df = "path2" and pass further: transform_node = middle(Y_df=prep_node.outputs.Y_df, S_df=prep_node.outputs.S_df) – JayashankarGS, commented 20 hours ago