I'm working on a WordPress theme using Sage 10, and I'm having trouble parsing the content created with the Classic Editor (without blocks) in the single.blade.php file.

Here’s how I’m retrieving the post content:

    $content = get_post_field('post_content', get_the_ID());

Then, I pass the $content to a parsing function:

    function parse_classic_content($content) {
        $dom = new DOMDocument();
        libxml_use_internal_errors(true);
        $dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'));
        libxml_clear_errors();

        $body = $dom->getElementsByTagName('body')->item(0);
        $children = iterator_to_array($body->childNodes);
        $newContent = '';
        $count = 0;

        foreach ($children as $child) {
            $newContent .= $dom->saveHTML($child);
            $count++;

            if ($count % 10 === 0) {
                $newContent .= '<p>separator every 10 items</p>';
            }
        }

        return preg_replace('~<\?xml.*?~', '', $newContent);
    }

The issues I’m facing:

  1. Element Removal: Sometimes some elements are missing or not rendered as expected.
  2. Separator Logic: I don't understand why the separator element is not being rendered every 10 elements. When I inspect the DOM, I see it placed every 5, 6, or 7 elements instead.

Am I making a mistake somewhere in my logic? Could there be an issue with how I'm cycling through the elements or modifying the DOM?

Nice to have: inside the loop, I would also like to check whether a paragraph element contains any anchor tags. How can I do that?

Any advice or insights would be appreciated. Thanks!

Asked Mar 7 at 15:10 by user16469315

1 Answer

> Am I making a mistake somewhere in my logic?

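I guess you want to count elements, but your loop counts nodes. The `childNodes` list contains every kind of child node (text and comment nodes as well as elements), so the counter advances on things you would not call an "item".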
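To make the difference visible, here is a minimal sketch using the same `DOMDocument` API as the question. The sample markup is made up for illustration; it mixes elements with loose text directly inside the body, which is one way the two counts can drift apart.

```php
<?php
// Hypothetical sample markup: three elements (<p>, <br>, <p>)
// with loose text between them.
$sample = '<p>one</p>text<br>more text<p>two</p>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($sample);
libxml_clear_errors();

$body = $dom->getElementsByTagName('body')->item(0);

$nodes = 0;     // every child node, whatever its type
$elements = 0;  // element nodes only

foreach ($body->childNodes as $child) {
    $nodes++;

    if ($child->nodeType === XML_ELEMENT_NODE) {
        $elements++;
    }
}

printf("%d child nodes, %d of them elements\n", $nodes, $elements);
```

The text runs are counted as nodes but not as elements, so a counter that is incremented for every child node reaches a multiple of ten sooner than a count of tags would suggest.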

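Let's say we create an HTML document with an implied body and a headline (using the `Dom\HTMLDocument` API, available since PHP 8.4):

    // $title is not defined in the original snippet; this value matches the output shown further down.
    $title = 'Untitled Document';

    $html = Dom\HTMLDocument::createFromString
    (
        source: "<title>{$title}</title><h1>{$title}</h1>",
        options: LIBXML_NOERROR,
    );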

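Then we insert the numbers from one to twenty-five after the headline, each separated by a line break:

    // $f is not defined in the original snippet; a spell-out
    // NumberFormatter (intl extension) is assumed here.
    $f = new NumberFormatter('en', NumberFormatter::SPELLOUT);

    $append = [];
    foreach (range(1, 25) as $index => $number)
    {
        // Put a <br> before every number except the first.
        if ($index)
        {
            $append[] = $html->createElement('br');
        }

        // Plain strings are appended as text nodes.
        $append[] = "\n{$f->format($number)}\n";
    }

    $body = $html->querySelector('body');
    $body->append(...$append);

The body now contains:

    <h1>Untitled Document</h1>

    one

    <br>

    two

    <br>

    three

    <br>

    ...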

Now we buffer all child nodes of the body and add the separator after every tenth entry:

    // Copy the live node list into a plain array before iterating.
    $cached = iterator_to_array($body->childNodes);

    $buffer = [];
    $count = 0;
    foreach ($cached as $child)
    {
        $buffer[] = $html->saveHtml($child);

        // Counts every node, text nodes included.
        if (++$count % 10 === 0)
        {
            $buffer[] = "<p>separator every {$f->format(10)} items: {$f->format($count)}</p>";
        }
    }

This similar, but perhaps clearer, example shows that the "every tenth" separator is actually inserted after every fifth number:

    four

    <br>

    five

    <p>separator every ten items: ten</p>
    <br>

    six

    <br>

    seven

As the output shows, if you mean to count elements, you should count elements, not child nodes, because child nodes can include text nodes as well. Here the written-out numbers are text nodes and the `<br>` tags are elements. Each counts as one node, so ten nodes correspond to only five numbers.
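Applied to the function from the question, that could look like the sketch below. It is one possible approach, not a drop-in fix: the names `parse_classic_content_by_element` and `paragraph_has_link` are hypothetical, and `mb_encode_numericentity()` is swapped in for the `'HTML-ENTITIES'` conversion, which is deprecated as of PHP 8.2. The second helper covers the "nice to have" by checking whether a `<p>` element has an `<a>` descendant.

```php
<?php
/**
 * Sketch: add a separator after every $every element nodes.
 * Text and comment nodes are kept in the output but not counted.
 */
function parse_classic_content_by_element(string $content, int $every = 10): string
{
    // loadHTML() rejects an empty string, so bail out early.
    if (trim($content) === '') {
        return '';
    }

    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    // Encode non-ASCII characters as numeric entities so that
    // loadHTML() does not misread UTF-8 input.
    $dom->loadHTML(mb_encode_numericentity($content, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8'));
    libxml_clear_errors();

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $newContent = '';
    $count = 0;

    foreach (iterator_to_array($body->childNodes) as $child) {
        $newContent .= $dom->saveHTML($child);

        // Only element nodes (<p>, <h2>, <ul>, ...) advance the counter.
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }

        if (++$count % $every === 0) {
            $newContent .= '<p>separator every ' . $every . ' items</p>';
        }
    }

    return $newContent;
}

/**
 * Sketch: true if $node is a <p> element with at least one <a> descendant.
 */
function paragraph_has_link(DOMNode $node): bool
{
    return $node instanceof DOMElement
        && $node->nodeName === 'p'
        && $node->getElementsByTagName('a')->length > 0;
}
```

Inside the loop, `paragraph_has_link($child)` can be called on each child before or after it is serialized. If only paragraphs should count as items, the `nodeType` check could be narrowed to `$child->nodeName === 'p'`.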
