admin管理员组

文章数量:1122846

I'm trying to insert in a DF a JSON which is located in a key named "data" inside another JSON file by using Python, I'm reading this JSON from an API.

The original JSON is this one, and you can tell the JSON that I'm actually trying to insert in the DataFrame inside the "data" key:

{
  "header": {
    "responseId": "230e8981-636e-40bc-b64d-535d7ba27256",
    "responseTime": "2024-11-21 12:29:49.301 +0000",
    "version": "4.2.4",
    "url": "",
    "requestInfo": {
      "requestTime": "2024-11-21 12:29:49.079 +0000",
      "requestId": "1234",
      "domainName": "BR_NEGATIVE_CREDIT",
      "matchKey": "325",
      "format": "json",
      "subdomainName": "COMM_TRADE"
    }
  },
  "messages": [
    {
      "id": "325.Dspl1mJr/2gmmuGDpxiuR8tuzIt0z9bqXCpOgXnb4efYB2GdA8aPDoG1w7IJxCRHcmHz91FtHqOFUPe+Gm5JHw==.XX.2022-04-18.000000802405423.1730897105532.e2e902af7a6efac86de4a656c2bdd88872a98721eb25322259975612fcc8d9e5",
      "code": "OK",
      "message": "Record for \"325.Dspl1mJr/2gmmuGDpxiuR8tuzIt0z9bqXCpOgXnb4efYB2GdA8aPDoG1w7IJxCRHcmHz91FtHqOFUPe+Gm5JHw==.XX.2022-04-18.000000802405423.1730897105532.e2e902af7a6efac86de4a656c2bdd88872a98721eb25322259975612fcc8d9e5\" found.",
      "detail": null,
      "type": "GET",
      "level": "INFO",
      "transactionId": "230e8981-636e-40bc-b64d-535d7ba27256",
      "additionalInfo": null
    }
  ],
  "data": {
    "325.Dspl1mJr/2gmmuGDpxiuR8tuzIt0z9bqXCpOgXnb4efYB2GdA8aPDoG1w7IJxCRHcmHz91FtHqOFUPe+Gm5JHw==.XX.2022-04-18.000000802405423.1730897105532.e2e902af7a6efac86de4a656c2bdd88872a98721eb25322259975612fcc8d9e5": "{\"ingestionHeader\": {\"geo\": \"BR\", \"dataDomain\": \"BR_NEGATIVE_CREDIT\", \"dataSubDomain\": \"COMM_TRADE\", \"dataViewName\": \"NPUR_BR_NEGATIVE_CREDIT_COMM_TRADE_VIEW\", \"dataViewVersion\": \"1.0.0\", \"correlationId\": \"b8703cc5-3785-4cc7-a9ba-d5bd2214c17a\", \"sourceId\": \"211949fb-64da-4b7b-8e71-ee0a88339d89\", \"regulatory\": false, \"sourceTimestamp\": 1730897571270, \"recordUUID\": \"1b165b5c-f2cd-4e4f-8dc9-16b28110cd7c\", \"sourceRecordUUID\": \"1b165b5c-f2cd-4e4f-8dc9-16b28110cd7c\", \"recordNumber\": 8, \"entityKey\": \"325\"}}"
  },
  "histories": null,
  "schemaKey": "NPUR_BR_NEGATIVE_CREDIT_COMM_TRADE_VIEW-1.0.0-Draft-9"
}

I'm trying to use the following function to insert the data into the DataFrame:

df = pd.json_normalize(data["data"])

But instead of returning the data, what is being returned is a hash value:

T2JqAQIWYXZyby5zY2hlbWHw5gF7InR5cGUiOiJyZWNvcmQiLCJuYW1lIjoiQlZTQ25zbXJUcmFkZSIsIm5hbWVzcGFjZSI6ImNvbS5lcXVpZmF4LmJ2cy5uZWdhdGl2ZS5tb2RlbCIsImZpZWxkcyI6W3sibmFtZSI6ImluZ2VzdGlvbkhlYWRlciIsInR5cGUiOnsidHlwZSI6InJlY29yZCIsIm5hbWUiOiJJbmdlc3Rpb25IZWFkZXIiLCJuYW1lc3BhY2UiOiJjb20uZXF1aWZheC5mYWJyaWMuaW5nZXN0aW9uLmNvbW1vbnMuZG9tYWluIiwiZmllbGRzIjpbeyJuYW1lIjoiZ2VvIiwidHlwZSI6eyJ0eXBlIjoic3RyaW5nIiwiYXZyby5qYXZhLnN0cmluZyI6IlN0cmluZyJ9fSx7Im5hbWUiOiJkYXRhRG9tYWluIiwidHlwZSI6eyJ0eXBlIj

When I try to use the Bruno Application to test it, the data is returned completly fine.

Is there a way to load this JSON located in this data key in a DF ?

I'm trying to load a JSON file from an API using Python, but instead of returning the actual data, it has been returning a hash value.

I'm trying to insert in a DF a JSON which is located in a key named "data" inside another JSON file by using Python, I'm reading this JSON from an API.

The original JSON is this one, and you can tell the JSON that I'm actually trying to insert in the DataFrame inside the "data" key:

{
  "header": {
    "responseId": "230e8981-636e-40bc-b64d-535d7ba27256",
    "responseTime": "2024-11-21 12:29:49.301 +0000",
    "version": "4.2.4",
    "url": "",
    "requestInfo": {
      "requestTime": "2024-11-21 12:29:49.079 +0000",
      "requestId": "1234",
      "domainName": "BR_NEGATIVE_CREDIT",
      "matchKey": "325",
      "format": "json",
      "subdomainName": "COMM_TRADE"
    }
  },
  "messages": [
    {
      "id": "325.Dspl1mJr/2gmmuGDpxiuR8tuzIt0z9bqXCpOgXnb4efYB2GdA8aPDoG1w7IJxCRHcmHz91FtHqOFUPe+Gm5JHw==.XX.2022-04-18.000000802405423.1730897105532.e2e902af7a6efac86de4a656c2bdd88872a98721eb25322259975612fcc8d9e5",
      "code": "OK",
      "message": "Record for \"325.Dspl1mJr/2gmmuGDpxiuR8tuzIt0z9bqXCpOgXnb4efYB2GdA8aPDoG1w7IJxCRHcmHz91FtHqOFUPe+Gm5JHw==.XX.2022-04-18.000000802405423.1730897105532.e2e902af7a6efac86de4a656c2bdd88872a98721eb25322259975612fcc8d9e5\" found.",
      "detail": null,
      "type": "GET",
      "level": "INFO",
      "transactionId": "230e8981-636e-40bc-b64d-535d7ba27256",
      "additionalInfo": null
    }
  ],
  "data": {
    "325.Dspl1mJr/2gmmuGDpxiuR8tuzIt0z9bqXCpOgXnb4efYB2GdA8aPDoG1w7IJxCRHcmHz91FtHqOFUPe+Gm5JHw==.XX.2022-04-18.000000802405423.1730897105532.e2e902af7a6efac86de4a656c2bdd88872a98721eb25322259975612fcc8d9e5": "{\"ingestionHeader\": {\"geo\": \"BR\", \"dataDomain\": \"BR_NEGATIVE_CREDIT\", \"dataSubDomain\": \"COMM_TRADE\", \"dataViewName\": \"NPUR_BR_NEGATIVE_CREDIT_COMM_TRADE_VIEW\", \"dataViewVersion\": \"1.0.0\", \"correlationId\": \"b8703cc5-3785-4cc7-a9ba-d5bd2214c17a\", \"sourceId\": \"211949fb-64da-4b7b-8e71-ee0a88339d89\", \"regulatory\": false, \"sourceTimestamp\": 1730897571270, \"recordUUID\": \"1b165b5c-f2cd-4e4f-8dc9-16b28110cd7c\", \"sourceRecordUUID\": \"1b165b5c-f2cd-4e4f-8dc9-16b28110cd7c\", \"recordNumber\": 8, \"entityKey\": \"325\"}}"
  },
  "histories": null,
  "schemaKey": "NPUR_BR_NEGATIVE_CREDIT_COMM_TRADE_VIEW-1.0.0-Draft-9"
}

I'm trying to use the following function to insert the data into the DataFrame:

df = pd.json_normalize(data["data"])

But instead of returning the data, what is being returned is a hash value:

T2JqAQIWYXZyby5zY2hlbWHw5gF7InR5cGUiOiJyZWNvcmQiLCJuYW1lIjoiQlZTQ25zbXJUcmFkZSIsIm5hbWVzcGFjZSI6ImNvbS5lcXVpZmF4LmJ2cy5uZWdhdGl2ZS5tb2RlbCIsImZpZWxkcyI6W3sibmFtZSI6ImluZ2VzdGlvbkhlYWRlciIsInR5cGUiOnsidHlwZSI6InJlY29yZCIsIm5hbWUiOiJJbmdlc3Rpb25IZWFkZXIiLCJuYW1lc3BhY2UiOiJjb20uZXF1aWZheC5mYWJyaWMuaW5nZXN0aW9uLmNvbW1vbnMuZG9tYWluIiwiZmllbGRzIjpbeyJuYW1lIjoiZ2VvIiwidHlwZSI6eyJ0eXBlIjoic3RyaW5nIiwiYXZyby5qYXZhLnN0cmluZyI6IlN0cmluZyJ9fSx7Im5hbWUiOiJkYXRhRG9tYWluIiwidHlwZSI6eyJ0eXBlIj

When I try to use the Bruno Application to test it, the data is returned completly fine.

Is there a way to load this JSON located in this data key in a DF ?

I'm trying to load a JSON file from an API using Python, but instead of returning the actual data, it has been returning a hash value.

Share Improve this question edited Nov 21, 2024 at 20:44 ouroboros1 13.6k7 gold badges35 silver badges53 bronze badges asked Nov 21, 2024 at 19:11 Tw1sterTw1ster 12 bronze badges 2
  • 1 Is this how you get it from the API? That's problematic, because the value of the sub key has an invalid JSON string. I.e., it ends with \"325}}", but it should end with \"325\"}}". – ouroboros1 Commented Nov 21, 2024 at 20:15
  • 1 My mistake, it does end with \"325\"}}". Do you think Bruno has some kind of background setup to automatically convert from this base64 encryption to JSON ? – Tw1ster Commented Nov 21, 2024 at 20:32
Add a comment  | 

1 Answer 1

Reset to default 1

Here's one approach:

import pandas as pd
import json

json_file = 'data.json'

with open(json_file, 'r') as file:
    json_data = json.load(file)

data = json.loads(*json_data['data'].values())

df = pd.DataFrame(data)

Output:

                                          ingestionHeader
geo                                                    BR
dataDomain                             BR_NEGATIVE_CREDIT
dataSubDomain                                  COMM_TRADE
dataViewName      NPUR_BR_NEGATIVE_CREDIT_COMM_TRADE_VIEW
dataViewVersion                                     1.0.0
correlationId        b8703cc5-3785-4cc7-a9ba-d5bd2214c17a
sourceId             211949fb-64da-4b7b-8e71-ee0a88339d89
regulatory                                          False
sourceTimestamp                             1730897571270
recordUUID           1b165b5c-f2cd-4e4f-8dc9-16b28110cd7c
sourceRecordUUID     1b165b5c-f2cd-4e4f-8dc9-16b28110cd7c
recordNumber                                            8
entityKey                                             325

Explanation

  • Use json.load to load the JSON file.
  • The value for the sub key is a JSON string. Access it via json_data['data'].values() and unpack (*) for use in json.loads.
  • Pass the result to pd.DataFrame.

Edit:

@juanpa-arrivillaga correctly points out that unpacking with * is a bit of an anti-pattern, as this will work only because "data" has exactly one sub key. It will be cleaner to get the value for that key from "messages":

data = json.loads(json_data['data'][json_data['messages'][0]['id']])

Or to turn dict.values into a list and access the first element:

data = json.loads(list(json_data['data'].values())[0])

本文标签: