I thought I was making my life easy and being future-conscious by saving some content as bits of JSON in custom post_meta fields. Unfortunately, WordPress doesn't agree and is making my life incredibly difficult.

I have a JSON string that looks essentially like this. This is just one bit, and the comment string is just some dummy unicode entities. The whole thing is generated w/ json_encode.

{
    "0": {
        "name": "Chris",
        "url": "testdomain",
        "comment": "\u00a5 \u00b7 \u00a3 \u00b7 \u20ac \u00b7 \u00b7 \u00a2 \u00b7 \u20a1 \u00b7 \u20a2 \u00b7 \u20a3 \u00b7 \u20a4 \u00b7 \u20a5 \u00b7 \u20a6 \u00b7 \u20a7 \u00b7 \u20a8 \u00b7 \u20a9 \u00b7 \u20aa \u00b7 \u20ab \u00b7 \u20ad \u00b7 \u20ae \u00b7 \u20af \u00b7 \u20b9"
    }
}

Unfortunately after I save it with update_post_meta, it comes out looking like this:

{
    "0": {
        "name": "Chris",
        "url": "testdomain",
        "comment": "u00a5 u00b7 u00a3 u00b7 u20ac u00b7 u00b7 u00a2 u00b7 u20a1 u00b7 u20a2 u00b7 u20a3 u00b7 u20a4 u00b7 u20a5 u00b7 u20a6 u00b7 u20a7 u00b7 u20a8 u00b7 u20a9 u00b7 u20aa u00b7 u20ab u00b7 u20ad u00b7 u20ae u00b7 u20af u00b7 u20b9"
    }
}

And with the slashes stripped, it can't be json_decoded back into useful content.

Any ideas why WordPress might be doing this, and if there is a way to avoid it? I can't use the JSON_UNESCAPED_UNICODE flag because this is a PHP 5.3.x install, and I've already tried encoding with htmlentities before the content is passed to json_encode, but that only captures a small subset of UTF-8 entities.

Thanks in advance!

(EDIT: FWIW, I know I could just save an array directly to post_meta and it'd be serialized and magic would happen but I just like the idea of having the data stored as JSON. If there isn't an easy, elegant solution I'll cave, but I'm very much hoping there is an easy, elegant solution!)

Asked May 25, 2012 at 20:39 by Chris Van Patten; edited May 25, 2012 at 20:48.

8 Answers

Doesn't look like there's any way to avoid it.

The update_metadata() function, which is ultimately responsible for saving the meta, explicitly runs stripslashes_deep() on the meta value. If the value is an array, this function strips slashes from its elements too.

There's a filter called sanitize_meta that runs AFTER that, which you could hook into. But at that point your slashes have already been stripped, so you can't reliably determine where they need to be added back in (or at least, I don't know how you would tell the difference between quoting legitimate JSON delimiters vs bits of values).

I can't speak to why it does this, but it does. Probably because the value is eventually passed to $wpdb->update(), which needs the strings unescaped.

As you feared, you're probably better off just storing the value as an array, which'll get serialized (as you said). If you want it as JSON later, you can just run it through json_encode().
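
To see the effect described above outside WordPress, here is a minimal plain-PHP sketch. It assumes that PHP's own stripslashes() behaves like the stripslashes_deep() call on a single string; the sample data is made up for illustration.

```php
<?php
// Plain-PHP sketch: stripslashes() stands in for the stripslashes_deep()
// call described above. No WordPress functions are used here.

$data = array('comment' => "\xC2\xA5 \xC2\xB7 \xE2\x82\xAC"); // "¥ · €" as UTF-8 bytes

$json     = json_encode($data);  // non-ASCII characters become \uXXXX escapes
$stripped = stripslashes($json); // what is left once the backslashes are removed

$decoded = json_decode($stripped, true);

// The stripped string is still valid JSON, but the escapes are now
// literal text ("u00a5 ..."), so the original characters are lost.
var_dump($decoded['comment'] === $data['comment']); // bool(false)

// With an escaped quote in the value, the result is not even valid JSON.
$quoted = stripslashes(json_encode(array('comment' => 'say "hi"')));
var_dump(json_decode($quoted, true)); // NULL
```

So the damage ranges from silently wrong text to a string that json_decode() rejects outright, depending on what the values contain.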

There is an elegant way to handle this!

Pass the JSON-encoded string through wp_slash(). That function escapes the leading backslash of each encoded Unicode character, which prevents update_metadata() from stripping them.
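
A rough sketch of how that might look in practice. The helper name my_save_json_meta() is hypothetical (it is not part of WordPress), and the round-trip check at the bottom uses addslashes()/stripslashes() as plain-PHP stand-ins, on the assumption that wp_slash() applies addslashes() to string values.

```php
<?php
// Hypothetical helper (not part of WordPress): encode, slash, then save.
// It is only defined here; calling it requires a loaded WordPress install.
function my_save_json_meta($post_id, $meta_key, array $data) {
    $json = json_encode($data);
    // wp_slash() adds a backslash before each backslash and quote, so the
    // slash stripping inside update_metadata() gives back $json unchanged.
    return update_post_meta($post_id, $meta_key, wp_slash($json));
}

// Outside WordPress, the same round trip can be checked with plain PHP.
$json      = json_encode(array('comment' => "\xC2\xA5 \xE2\x82\xAC")); // "¥ €"
$slashed   = addslashes($json);      // stand-in for wp_slash($json)
$roundtrip = stripslashes($slashed); // stand-in for the stripping on save

var_dump($roundtrip === $json); // bool(true)
```

Reading the value back would then just be json_decode( get_post_meta( $post_id, $meta_key, true ), true ).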

You can trick WordPress with something like this:

// Double every backslash so that one copy survives the slash stripping on save.
$cleandata = str_replace('\\', '\\\\', json_encode($customfield_data));

This is that easy *elegant solution*...

This function does the transformation using preg_replace:

function preg_replace_add_slash_json($value) {
    return preg_replace('/(u[0-9a-fA-F]{4})/i', '\\\$1', $value);
}

It adds a backslash before each "uXXXX" sequence (where each X is a hexadecimal digit, 0-F). Call this function on the value before submitting it to the DB.
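
A quick plain-PHP check of that function, again using stripslashes() as an assumed stand-in for what WordPress does on save. The sample values are made up. It shows the Unicode escapes surviving, and also one limitation: other JSON escapes such as \" are not protected by this approach.

```php
<?php
// The function from the answer above, unchanged.
function preg_replace_add_slash_json($value) {
    return preg_replace('/(u[0-9a-fA-F]{4})/i', '\\\$1', $value);
}

// Unicode escapes: the extra backslash is the one that gets stripped.
$json     = json_encode(array('comment' => "\xC2\xA5")); // {"comment":"\u00a5"}
$prepared = preg_replace_add_slash_json($json);          // {"comment":"\\u00a5"}
var_dump(stripslashes($prepared) === $json);             // bool(true)

// Limitation: an escaped quote has no "uXXXX" sequence, so nothing is
// added and the backslash in front of the quote is lost on save.
$json2     = json_encode(array('comment' => 'say "hi"'));
$prepared2 = preg_replace_add_slash_json($json2);
var_dump(stripslashes($prepared2) === $json2);           // bool(false)
```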

For anyone still struggling to save a JSON-encoded Unicode string via wp_update_post, the following worked for me. I found it in class-wp-rest-posts-controller.php:

// convert the post object to an array, otherwise wp_update_post will expect non-escaped input.
wp_update_post( wp_slash( (array) $my_post ) ); 

Here's an example:

$objectToEncodeToJson = array(
  'my_custom_key' => '<div>Here is HTML that will be converted to Unicode in the db.</div>'
);

$postContent = json_encode($objectToEncodeToJson,JSON_HEX_TAG|JSON_HEX_QUOT);

$my_post = array(
  'ID'           => $yourPostId,
  'post_content' => $postContent
);

wp_update_post( wp_slash( (array) $my_post ) );

An interesting way around this is to encode the JSON as base64. See the example below.

$data = array(0 => array('name' => 'chris', 'URL' => 'hello.com'));

$to_json = json_encode($data);

echo $to_json . "<br />";
// echoes [{"name":"chris","URL":"hello.com"}]

$to_base64 = base64_encode($to_json);

echo $to_base64 . "<br />";
// echoes W3sibmFtZSI6ImNocmlzIiwiVVJMIjoiaGVsbG8uY29tIn1d

$back_to_json = base64_decode($to_base64);

echo $back_to_json . "<br />";
// echoes [{"name":"chris","URL":"hello.com"}]

$back_to_array = json_decode($back_to_json);

print_r($back_to_array);
// prints Array ( [0] => stdClass Object ( [name] => chris [URL] => hello.com ) )
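
Why this survives the save: base64 output contains no backslashes, so there is nothing for the slash stripping to remove. A small plain-PHP sketch of that, with stripslashes() once more standing in (as an assumption) for what WordPress does to the meta value, and made-up sample data:

```php
<?php
// JSON containing both a \uXXXX escape and escaped quotes.
$json    = json_encode(array('comment' => "\xC2\xA5 \"quoted\""));
$encoded = base64_encode($json);

// The base64 alphabet is A-Z, a-z, 0-9, "+", "/" and "=": no backslashes.
var_dump(strpos($encoded, '\\') === false);    // bool(true)
var_dump(stripslashes($encoded) === $encoded); // bool(true)

// Decoding after the simulated save gives back the original data.
$decoded = json_decode(base64_decode(stripslashes($encoded)), true);
var_dump($decoded['comment'] === "\xC2\xA5 \"quoted\""); // bool(true)
```

The trade-off is that the stored value is no longer readable or searchable as JSON in the database.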

I know this is an old question, but it's still an issue impacting developers today. So as a reference, here's a good thread from the WordPress core issue tracker that I found informative: https://core.trac.wordpress.org/ticket/21767

You can use the WordPress stripslashes_deep() function.

<?php stripslashes_deep($your_json);?>

For reference visit here

Tags: post meta. Title: WordPress is stripping escape backslashes from JSON strings in postmeta