Parsing RSS Feed For Images

August 1, 2017

I’m trying to parse an RSS feed for images. My original code grabbed the title, description, link, and pubDate just fine. I read a tutorial online on how to get the images out as well, but after updating my code to pull them I get this error:
Fatal error: Call to a member function getAttribute() on a non-object in /home/ode/public_html/dev/cron-insights2.php on line 10

Here is my original code, with the image lookup added:

<?php

$rss = new DOMDocument();
$rss->load('http://feeds.feedburner.com/feedburner/Dvzy?format=xml');
$feed = array();

//FOR EACH ITEM RETURN RSS
foreach ($rss->getElementsByTagName('item') as $node) {

    $img = $rss->getElementsByTagName('img')->item(0)->getAttribute('src');

    $item = array(
        'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
        'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue,
        'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
        'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue,
        'image' => $img,
    );
    array_push($feed, $item);
}

echo '<pre>';
var_dump($feed);
echo '</pre>';
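
A likely cause of the error: item(0) returns null when getElementsByTagName('img') finds nothing, and calling getAttribute() on null is fatal. In feeds like this one the images usually sit inside the HTML text of content:encoded (often wrapped in CDATA), so the XML parser never sees them as elements. The sketch below stays with DOMDocument, reads that text, parses it as HTML, and checks for null before reading src. The sample feed, the namespace URI being bound to the content prefix, and the helper name first_image_src() are assumptions for illustration, not taken from the real feed.

```php
<?php
// Hypothetical sample feed standing in for the real one; its markup is an assumption.
$sampleXml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>First post</title>
      <content:encoded><![CDATA[<p>Hello <img src="http://example.com/a.jpg" alt=""></p>]]></content:encoded>
    </item>
    <item>
      <title>No image here</title>
      <content:encoded><![CDATA[<p>Text only</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: returns the src of the first <img> in an HTML fragment, or null.
function first_image_src($html)
{
    if (trim($html) === '') {
        return null;
    }
    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world HTML is rarely clean.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $img = $doc->getElementsByTagName('img')->item(0);
    // item(0) is null when nothing matched, so check before calling getAttribute().
    return $img ? $img->getAttribute('src') : null;
}

$rss = new DOMDocument();
$rss->loadXML($sampleXml);

// Namespace URI commonly bound to the "content" prefix in RSS feeds (assumed here).
$contentNs = 'http://purl.org/rss/1.0/modules/content/';
$images = array();
foreach ($rss->getElementsByTagName('item') as $node) {
    // Search within the current item ($node), not the whole document ($rss).
    $encoded = $node->getElementsByTagNameNS($contentNs, 'encoded')->item(0);
    $images[] = $encoded ? first_image_src($encoded->nodeValue) : null;
}

var_dump($images);
```

Items without an image end up with null instead of stopping the script, so the rest of the feed can still be processed.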

I was able to fix the issue by switching to the code below:

<?php 

error_reporting(E_ALL);
echo '<meta charset="utf-8" />';
echo '<pre>';

$url = 'http://feeds.feedburner.com/feedburner/Dvzy?format=xml';
$xml = file_get_contents($url);
$xml = str_replace('<content:encoded>', '<content_encoded>', $xml);
$xml = str_replace('</content:encoded>', '</content_encoded>', $xml);
$obj = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);

// ACTIVATE THIS TO SEE THE XML
// echo htmlentities($xml);

// ACTIVATE THIS TO SEE THE OBJECT
// var_dump($obj);

// COLLECT INFORMATION HERE
$feed = [];

// A REGEX THAT FINDS IMAGE TAGS SRC ATTRIBUTE
$rgx
= '~'                  // REGEX DELIMITER
. 'src="'              // SRC INSIDE IMAGE TAG
. '('                  // START CAPTURE GROUP
. '[^"]*'              // ANYTHING UP TO THE CLOSING QUOTE
. ')'                  // END OF CAPTURE GROUP
. '"'                  // END OF SRC ATTRIBUTE
. '~'                  // REGEX DELIMITER
. 'i'                  // FLAG: CASE INSENSITIVE
. 's'                  // FLAG: SINGLE LINE
;

foreach ($obj->channel->item as $item)
{
    $node['title'] = (string)$item->title;
    $node['desc']  = (string)$item->description;
    $node['link']  = (string)$item->link;
    $node['date']  = (string)$item->pubDate;

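    // CAST TO STRING SO preg_match_all() RECEIVES TEXT, NOT AN OBJECT
    $content       = (string)$item->content_encoded;
    preg_match_all($rgx, $content, $match);
    // $match[1] HOLDS THE CAPTURED URL; $match[0] WOULD INCLUDE THE src="..." WRAPPER
    $node['image'] = isset($match[1][0]) ? $match[1][0] : null;
    $feed[] = $node;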
}
var_dump($feed);
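
The str_replace() calls above rename content:encoded so that SimpleXML can reach it as a plain property. A possible alternative, sketched here under assumptions, is to leave the XML untouched and ask SimpleXML for the namespaced children directly. The sample feed and the namespace URI are illustrative, and the pattern is a narrower variant of the one above that only looks at src inside img tags.

```php
<?php
// Hypothetical sample feed; the real feed's markup may differ.
$sampleXml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>First post</title>
      <content:encoded><![CDATA[<p>Hello <img src="http://example.com/a.jpg" alt=""></p>]]></content:encoded>
    </item>
    <item>
      <title>No image here</title>
      <content:encoded><![CDATA[<p>Text only</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$obj = simplexml_load_string($sampleXml, 'SimpleXMLElement', LIBXML_NOCDATA);

// Namespace URI commonly bound to the "content" prefix in RSS feeds (assumed here).
$contentNs = 'http://purl.org/rss/1.0/modules/content/';

$feed = array();
foreach ($obj->channel->item as $item) {
    // children() with a namespace URI exposes <content:encoded> without renaming it.
    $content = (string) $item->children($contentNs)->encoded;

    // preg_match() returns 1 on a match and puts the first capture group in $m[1].
    $found = preg_match('~<img[^>]*\ssrc="([^"]*)"~i', $content, $m);

    $feed[] = array(
        'title' => (string) $item->title,
        'image' => $found ? $m[1] : null,
    );
}

var_dump($feed);
```

Because preg_match() stops at the first hit, this keeps one image per item. preg_match_all() remains the better fit if every image in the post body is wanted.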
