Azure Data Factory: field doesn't exist in all of the JSON files

I have several JSON files and use Dataflow, but the issue is that the field 'buildingCode' does not exist in every record. In other JSON files, the field is either named differently ("controlCode") or is missing entirely.

The Dataflow Filter error message is: No field named "buildingCode" in the hierarchical structure.

In the Filter:

notEquals(coalesce(redactableData.buildingCode,''),'1') ||
notEquals(coalesce(redactableData.controlCode,''),'1')

json:

{
    "tableData": {
        "ns": "meta",
        "distribution": {
            "ns": "meta",
            "code": "A",
            "description": "Building only"
        },
        "distributionReasons": {
            "ns": "meta",
            "distributionReason": [
                {
                    "num": "1",
                    "ns": "meta",
                    "forceList": "true",
                    "description": "Permit"
                }
            ]
        },
        "buildingCode": "10"
    }
}

asked Nov 21, 2024 at 15:49 by thichxai
  • Are you getting any error? – Rakesh Govindula, Nov 21, 2024 at 15:50
  • Yes, the error message "No field named 'buildingCode' in the hierarchical structure" appears after I save the Filter activity. – thichxai, Nov 21, 2024 at 16:06

1 Answer


Based on how you want to get the required JSON files, you can follow either of the below approaches. You need to use a Delimited text dataset for the source of the dataflow.

  1. If you want to check the required condition for each file and copy the files within the dataflow itself, you can try this approach. I tried it with normal JSON datasets, and based on my observation you can check whether a particular hierarchy is present or not, but extracting its value dynamically without knowing the schema might not be possible with JSON datasets. So, I used csv (delimited text) datasets for this use case. Create the source delimited text dataset for the JSON file with the below configurations (a rough sketch of such a dataset follows).
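    A minimal sketch of the kind of delimited text dataset this relies on. The dataset name, the blob storage location and the \u0001 column delimiter are my assumptions, not values from the question; the point is only that the file is read as plain text with no header, so it arrives as the single string column _col0_. If your JSON files are pretty-printed across several lines, you may also need a row delimiter that never occurs in the data so that each file comes through as one row.

    {
        "name": "JsonAsText",
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": "<your storage linked service>",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": "<container>",
                    "folderPath": "<folder>"
                },
                "columnDelimiter": "\u0001",
                "firstRowAsHeader": false,
                "quoteChar": ""
            },
            "schema": []
        }
    }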

    The whole JSON will be read into a single string column _col0_ in the dataflow, and you can use the below condition in the filter transformation.

    instr(replace({_col0_},'controlCode','buildingCode'),'"buildingCode": "1"')==0
    

    This expression first replaces the key name controlCode with buildingCode and then checks for the string "buildingCode": "1". If that string does not exist, the condition is true and the row passes; otherwise it is false and the output for that file will be empty.
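    As a quick sketch of how this evaluates against the sample in the question (note that instr() does a plain substring match, so this assumes the files keep the same "key": "value" spacing as the sample):

    replace({_col0_},'controlCode','buildingCode')    normalizes both key names to buildingCode
    instr(<that string>,'"buildingCode": "1"')        returns the 1-based position of the needle, or 0 if it is absent

    The sample file contains "buildingCode": "10", which does not contain the needle '"buildingCode": "1"' (the closing quote does not match), so instr() returns 0, the ==0 comparison is true and the file passes the filter. A file with "buildingCode": "1" or "controlCode": "1" would return a non-zero position and be dropped.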

    Next, use another delimited text dataset for the sink of the dataflow with the below configurations.

    In the sink settings, set the file name to <target_filename>.json.

    Here, if the JSON file satisfies the required condition, it will be copied to the target; if it does not, an empty JSON file will be created in the target.

  2. If you want to check all files and get a list of the matching file names, you can make use of a cache sink. Don't give any file name in the source dataset; instead, give a wildcard file path in the dataflow source settings. Also, specify a column name to store the file path.

    The filter transformation will then keep only the required JSON files.

    Use the cache sink to return this file list from the dataflow activity to the pipeline; you can then loop through that list and use a Copy activity to copy these files to the required target, as sketched below.
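    A rough sketch of the pipeline side, assuming the cache sink is named sinkCache with "Write to activity output" enabled, the file path column is named filePath, and the dataflow activity is named Data flow1 (all of these names are my assumptions): feed the cached rows into a ForEach and pass each file path to a parameterized Copy activity.

    ForEach items:                                @activity('Data flow1').output.runStatus.output.sinkCache.value
    Copy source dataset file path parameter:      @item().filePath

    The exact shape of the activity output can vary, so it is worth confirming the path against the dataflow activity's output JSON from a debug run.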
