DOMDocument::loadHTML(): warning - htmlParseEntityRef: no name in Entity

2022-08-30 18:54:36

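I have found several similar questions, but so far none of them has helped me.

I'm trying to output the "src" of every image in a block of HTML, so I'm using DOMDocument(). The method does work, but I get a warning on some pages and I can't figure out why. Some posts suggest suppressing the warning, but I would much rather find out why it is being generated.

Warning: DOMDocument::loadHTML(): htmlParseEntityRef: no name in Entity, line: 10

One example of post->post_content that generates the error is -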

On Wednesday 21st November specialist rights of way solicitor Jonathan Cheal of Dyne Drewett will be speaking at the Annual Briefing for Rural Practice Surveyors and Agricultural Valuers in Petersfield.
<br>
Jonathan is one of many speakers during the day and he is specifically addressing issues of public rights of way and village greens.
<br>
Other speakers include:-
<br>
<ul>
<li>James Atrrill, Chairman of the Agricultural Valuers Associates of Hants, Wilts and Dorset;</li>
<li>Martin Lowry, Chairman of the RICS Countryside Policies Panel;</li>
<li>Angus Burnett, Director at Martin & Company;</li>
<li>Esther Smith, Partner at Thomas Eggar;</li>
<li>Jeremy Barrell, Barrell Tree Consultancy;</li>
<li>Robin Satow, Chairman of the RICS Surrey Local Association;</li>
<li>James Cooper, Stnsted Oark Foundation;</li>
<li>Fenella Collins, Head of Planning at the CLA; and</li>
<li>Tom Bodley, Partner at Batcheller Monkhouse</li>
</ul>

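I can post more examples of what post->post_content can contain, if that would help?

I have allowed temporary access to the development site so you can see some examples [NOTE - the links are no longer accessible because the question has been answered] -

Any tips on how to resolve this? Thanks.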

$images = array(); // Initialise so the array exists even when no <img> is found
$dom = new DOMDocument();
$dom->loadHTML(apply_filters('the_content', $post->post_content)); // Have tried stripping all tags but <img>, still generates warning
$nodes = $dom->getElementsByTagName('img');
foreach($nodes as $img) :
    $images[] = $img->getAttribute('src');
endforeach;

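Answer 1

This correct answer comes from a comment by @lonesomeday.

My best guess is that there is an unescaped ampersand (&) somewhere in the HTML. This makes the parser think it is inside an entity reference (e.g. &copy;). When it reaches the ;, it assumes the entity has ended. It then realises that what it has collected does not conform to an entity, so it emits a warning and returns the content as plain text. In the sample content above, the likely culprit is "Martin & Company;".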
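
To find the cause without simply hiding the warning, one option is to collect the parser messages with libxml_use_internal_errors() and then escape stray ampersands before parsing. The sketch below is only an illustration under assumptions: escape_bare_ampersands() is a hypothetical helper that is not part of the original post, its regular expression is a simple approximation of "looks like an entity", and the exact messages reported depend on the libxml2 version PHP was built against.

```php
<?php
// Hypothetical helper (not from the original post): escape every "&" that does
// not start a named (&copy;), decimal (&#38;) or hexadecimal (&#x26;) reference.
function escape_bare_ampersands(string $html): string
{
    return preg_replace(
        '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/',
        '&amp;',
        $html
    );
}

// Shortened stand-in for the post content, with an <img> added for illustration.
$html = '<ul><li>Angus Burnett, Director at Martin & Company;</li>'
      . '<li><img src="a.png"> &copy; 2012</li></ul>';

// Collect parser messages instead of emitting PHP warnings, so they can be inspected.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
foreach (libxml_get_errors() as $error) {
    // Each LibXMLError carries the message and the line it refers to.
    printf("line %d: %s\n", $error->line, trim($error->message));
}
libxml_clear_errors();

// Parse again with bare ampersands escaped, then read the image sources.
$dom = new DOMDocument();
$dom->loadHTML(escape_bare_ampersands($html));
libxml_clear_errors();

$images = array();
foreach ($dom->getElementsByTagName('img') as $img) {
    $images[] = $img->getAttribute('src');
}

// Restore the previous libxml error-handling mode.
libxml_use_internal_errors($previous);
```

Escaping the ampersand where the content is authored (writing "Martin &amp; Company") addresses the root cause; the helper above is only a fallback for content that cannot be corrected at the source.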

