I'm updating certain data in 1000+ WordPress blog posts from two categories. From each post I need to extract specific HTML content from two p tags and replace one of the p tags with new content. The elements I'm targeting are the link and image, and the price and discount price. Any assistance would be very welcome.
I am attempting to extract the discount price and the original price from this element. In the other category, the discount price in this element is wrapped in del tags.
<p><span style="font-weight:bold;font-style: italic"><a target="_blank" href="/
" rel="nofollow sponsored noopener">117 EUR</a></span> instead of 296 EUR</p>
<p><span style="font-weight:bold;font-style: italic"><a target="_blank" href="/
" rel="nofollow sponsored noopener"><del>117 EUR</del></a></span> instead of 296 EUR</p>
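For reference, here is a minimal sketch of how both price variants could be read with one code path. It assumes the price paragraph always contains the words "instead of" and that the discount sits inside the link in the span. The function name `extract_prices` and the example.com URL are hypothetical, since the real links are cut off in the samples above.

```php
<?php
// Hypothetical helper (not part of the original attempt): read both prices
// from the price paragraph, whether or not the discount is wrapped in <del>.
function extract_prices(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    // Wrap the fragment in a single root element and declare UTF-8 up front.
    $dom->loadHTML('<?xml encoding="UTF-8"><div id="root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath  = new DOMXPath($dom);
    $result = ['discount' => '', 'original' => ''];

    foreach ($xpath->query("//p[contains(., 'instead of')]") as $p) {
        // textContent of the link also covers text inside a nested <del>.
        $link = $xpath->query('.//span/a', $p)->item(0);
        if ($link && preg_match('/\d+(?:[.,]\d+)?/', $link->textContent, $m)) {
            $result['discount'] = $m[0];
        }
        if (preg_match('/instead of\s+(\d+(?:[.,]\d+)?)/', $p->textContent, $m)) {
            $result['original'] = $m[1];
        }
        break; // only the first matching paragraph is used
    }

    return $result;
}

// Placeholder markup modelled on the second sample above.
$prices = extract_prices(
    '<p><span><a href="https://example.com/p"><del>117 EUR</del></a></span> instead of 296 EUR</p>'
);
```

Reading `textContent` from the link, rather than querying `text()` nodes separately for the del and non-del cases, keeps the two categories on the same path.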
I am also trying to extract the URL and image link from this element, and then replace the element with a new HTML structure.
<p><a style="font-size: 26px;text-decoration: none" target="_blank" href="/
" rel="nofollow sponsored noopener">Go to: <img decoding="async" width="130" style="border-radius:20px" src=".png"></a> </p>
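A possible way to read the link and image from this element is sketched below. It matches on the visible "Go to:" text and the child img rather than on the exact style string, on the assumption that the style attribute may not be byte-for-byte identical in every post. The function name `extract_link_and_image` and the example.com URLs are hypothetical placeholders.

```php
<?php
// Hypothetical helper: locate the "Go to:" link and return its href and image src.
function extract_link_and_image(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom->loadHTML('<?xml encoding="UTF-8"><div id="root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Match on the link text and the presence of an <img> child,
    // not on the exact value of the style attribute.
    $link = $xpath->query("//p/a[img and contains(normalize-space(.), 'Go to:')]")->item(0);
    if (!$link) {
        return ['url' => '', 'image' => ''];
    }

    $img = $xpath->query('./img', $link)->item(0);

    return [
        'url'   => trim($link->getAttribute('href')),
        'image' => $img ? trim($img->getAttribute('src')) : '',
    ];
}

// Placeholder markup modelled on the sample above; note the differently spaced style.
$found = extract_link_and_image(
    '<p><a style="font-size: 26px; text-decoration: none;" target="_blank" '
    . 'href="https://example.com/product" rel="nofollow sponsored noopener">Go to: '
    . '<img decoding="async" width="130" src="https://example.com/logo.png"></a> </p>'
);
```

Running every query relative to the matched link also avoids picking up an href or src from some other anchor elsewhere in the post.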
The new replacement structure for the second HTML element is:
This is my attempt at achieving the desired result. For some reason, the first 6 or 7 posts get replaced properly, and then the other posts are replaced but broken. The content is the same in every post except for the prices. I have tried with and without the date_query and tax_query, and the result was the same. I am not that good at regex patterns, so I am not sure whether they or something else is causing this behavior.
$args = [
'post_type' => 'post',
'posts_per_page' => $limit,
'offset' => $offset,
'date_query' => [['after' => '1 month ago']],
'tax_query' => [[
'relation' => 'OR',
[ 'taxonomy' => 'category', 'field' => 'slug', 'terms' => ['category-1', 'category-2']],
]],
];
$query = new WP_Query($args);
if ($query->have_posts()) {
$processed_posts = 0;
while ($query->have_posts()) {
$query->the_post();
$post_id = get_the_ID();
$post_content = get_post_field('post_content', $post_id);
$dom = new DOMDocument();
@$dom->loadHTML(mb_convert_encoding($post_content, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
error_log("Processing post ID: " . $post_id);
$oldStructure = $xpath->query("//a[contains(text(), 'Go to:') and @target='_blank' and @rel='nofollow sponsored noopener']")->item(0);
error_log("XPath query result for post ID: " . $post_id . ": " . ($oldStructure ? "Found" : "Not found"));
if ($oldStructure) {
error_log("Old structure found!");
$productUrlNode = $xpath->query('//a[@style="font-size: 26px;text-decoration: none"]/@href')->item(0);
$productUrl = $productUrlNode ? $productUrlNode->nodeValue : '';
$imageSrcNode = $xpath->query('//p/a[@style="font-size: 26px;text-decoration: none"]/img/@src')->item(0);
$imageSrc = $imageSrcNode ? $imageSrcNode->nodeValue : '';
$discountPriceNodeQuery = "//p/span/a/del/text() | //p/span/a[not(del)]/text()";
$discountPriceNodes = $xpath->query($discountPriceNodeQuery);
$discountPrice = '';
foreach ($discountPriceNodes as $node) {
$textContent = $node->nodeValue;
if (preg_match('/(\d+)/', $textContent, $matches)) {
$discountPrice = $matches[1];
break;
}
}
$originalPrice = '';
$pNodeTexts = $xpath->query("//p[contains(., 'instead of')]");
foreach ($pNodeTexts as $textNode) {
if (preg_match('/instead of (\d+)/', $textNode->nodeValue, $matches)) {
$originalPrice = $matches[1];
break;
}
}
$pElement = $dom->createElement('p');
$anchor = $dom->createElement('a');
$anchor->setAttribute('href', esc_url($productUrl));
$anchor->setAttribute('class', 'product-link_wrap');
$anchor->setAttribute('target', '_blank');
$buttonBlock = $dom->createElement('div');
$buttonBlock->setAttribute('class', 'button-block');
$anchor->appendChild($buttonBlock);
$buttonImage = $dom->createElement('div');
$buttonImage->setAttribute('class', 'button-image');
$img = $dom->createElement('img');
$img->setAttribute('src', esc_url($imageSrc));
$img->setAttribute('alt', 'Product Image');
$img->setAttribute('class', 'webpexpress-processed');
$buttonImage->appendChild($img);
$buttonBlock->appendChild($buttonImage);
$productTitle = $dom->createElement('div');
$productTitle->setAttribute('class', 'product-title');
$productName = $dom->createElement('span', esc_html(get_the_title($post_id)));
$productName->setAttribute('class', 'product-name');
$productTitle->appendChild($productName);
$buttonBlock->appendChild($productTitle);
$productPrices = $dom->createElement('div');
$productPrices->setAttribute('class', 'prices-container');
$productOriginalPrice = $dom->createElement('span', esc_html($originalPrice));
$productDiscountPrice = $dom->createElement('span', esc_html($discountPrice));
$productPrices->appendChild($productOriginalPrice);
$productPrices->appendChild($productDiscountPrice);
$buttonBlock->appendChild($productPrices);
$buttonContainer = $dom->createElement('div');
$buttonContainer->setAttribute('class', 'button-container');
$button = $dom->createElement('button');
$button->setAttribute('class', 'product-button');
$span = $dom->createElement('span', 'Go to Product');
$button->appendChild($span);
$svg = $dom->createElement('svg');
$svg->setAttribute('xmlns', '');
$svg->setAttribute('enable-background', 'new 0 0 24 24');
$svg->setAttribute('viewBox', '0 0 24 24');
$path = $dom->createElement('path');
$path->setAttribute('d', 'M15.5,11.3L9.9,5.6c-0.4-0.4-1-0.4-1.4,0s-0.4,1,0,1.4l4.9,4.9l-4.9,4.9c-0.2,0.2-0.3,0.4-0.3,0.7c0,0.6,0.4,1,1,1c0.3,0,0.5-0.1,0.7-0.3l5.7-5.7c0,0,0,0,0,0C15.9,12.3,15.9,11.7,15.5,11.3z');
$svg->appendChild($path);
$button->appendChild($svg);
$buttonContainer->appendChild($button);
$buttonBlock->appendChild($buttonContainer);
$pElement->appendChild($anchor);
$oldStructure->parentNode->replaceChild($anchor, $oldStructure);
$updated_content = $dom->saveHTML();
error_log("New content: " . $updated_content);
$result = wp_update_post([
'ID' => $post_id,
'post_content' => $updated_content,
]);
if ($result === 0 || $result === false) {
error_log("Failed to update post ID: " . $post_id);
} else {
error_log("Successfully updated post ID: " . $post_id);
}
}
}
wp_reset_postdata();
}
This is the log output for a correctly replaced element:
[29-Feb-2024 11:26:21 UTC] Processing post ID: 56526
[29-Feb-2024 11:26:21 UTC] XPath query result for post ID: 56526: Found
[29-Feb-2024 11:26:21 UTC] Old structure found!
[29-Feb-2024 11:26:21 UTC] Img: .png
[29-Feb-2024 11:26:21 UTC] URL:
[29-Feb-2024 11:26:21 UTC] discount: 12
[29-Feb-2024 11:26:21 UTC] original: 133
This is the log output for an incorrectly replaced element:
[29-Feb-2024 11:26:21 UTC] Processing post ID: 56510
[29-Feb-2024 11:26:21 UTC] XPath query result for post ID: 56510: Found
[29-Feb-2024 11:26:21 UTC] Old structure found!
[29-Feb-2024 11:26:21 UTC] Img:
[29-Feb-2024 11:26:21 UTC] URL:
[29-Feb-2024 11:26:21 UTC] discount:
[29-Feb-2024 11:26:21 UTC] original: 340
The incorrectly replaced element is being stripped of its p tag, which I assume is what causes it to break:
<span style="font-weight: bold; font-style: italic;"><a href="; target="_blank" rel="nofollow sponsored noopener">28 EUR</a></span> instead of 340 EUR
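One thing that may be worth ruling out is the way the fragment is loaded and saved. `LIBXML_HTML_NOIMPLIED` is known to behave oddly when the content has several top-level elements, and a div placed inside a p is not valid HTML, so the markup can be rearranged on the way out. The sketch below is only an illustration under those assumptions: it wraps the content in a single container, replaces the whole p around the "Go to:" link, and serializes only the container's children. The function name `replace_goto_paragraph` and the example.com URLs are hypothetical.

```php
<?php
// Hypothetical helper: swap the whole <p> around the "Go to:" link for new
// markup and return the fragment without any html/body wrapper.
function replace_goto_paragraph(string $html, string $replacementHtml): string
{
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them

    $dom = new DOMDocument();
    // A single wrapper element gives the parser exactly one root to work with.
    $dom->loadHTML('<?xml encoding="UTF-8"><div id="root">' . $html . '</div>');

    $xpath = new DOMXPath($dom);
    $root  = $xpath->query("//div[@id='root']")->item(0);
    $link  = $xpath->query(".//p/a[contains(normalize-space(.), 'Go to:')]", $root)->item(0);

    if ($link) {
        // Parse the replacement separately, then import its nodes.
        $tmp = new DOMDocument();
        $tmp->loadHTML('<?xml encoding="UTF-8"><div>' . $replacementHtml . '</div>');
        $wrapper = $tmp->getElementsByTagName('div')->item(0);

        $oldP = $link->parentNode;
        foreach (iterator_to_array($wrapper->childNodes) as $child) {
            $oldP->parentNode->insertBefore($dom->importNode($child, true), $oldP);
        }
        // Remove the old paragraph itself, so no block element ends up inside a <p>.
        $oldP->parentNode->removeChild($oldP);
    }
    libxml_clear_errors();

    // Serialize only the children of the wrapper, not the wrapper or the document.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}

// Placeholder content with a price paragraph followed by the "Go to:" paragraph.
$updated = replace_goto_paragraph(
    '<p><span><a href="https://example.com/p">28 EUR</a></span> instead of 340 EUR</p>'
    . '<p><a href="https://example.com/p">Go to: <img src="https://example.com/logo.png"></a> </p>',
    '<div class="button-block"><a class="product-link_wrap" href="https://example.com/p">Go to Product</a></div>'
);
```

With this approach the price paragraph is never touched, so it should keep its p tag. Whether this matches the cause of the broken posts would still need to be checked against the actual stored content.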
Tags: Extracting and Replacing HTML Post content with PHP DOM