Monday, July 5, 2010

Fetch full posts content of FeedWordPress feeds

Hi :)

I use WordPress and the FeedWordPress plugin to create a planet. It's a great plugin. Some bloggers don't show the full post content in their feeds. If you'd like to get the full content of posts, you can contact the blogger and ask him/her to enable full content in the feed, or continue reading this article.
I wrote some functions to fetch the full content of posts.

Requirements

  • PHP with cURL support (Client URL Library)

  • Permissions to modify theme files

  • Text editor

  • Basic PHP programming skills


Step 1 - Where is the post content?
It's easy: just open the web page and view the page source.
For example, open http://zebardast.ir/en/linux-and-unix-bash-shell-aliases/ (a single post with full content) and view the page source.
In the page source you can see that the content starts with the code below:
<div  class="postBody">

and ends with:
			</div> 

<div class="postFooter">

* The end marker is not just </div> because there are other divs inside the post content. So I added the HTML that follows the closing </div>, which makes the marker unique.
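
To see why the end marker needs the extra `<div class="postFooter">`, here is a minimal sketch that cuts a post body out of a page with plain `strpos`/`substr`. The sample HTML and the helper name `extract_between` are made up for illustration; they are not part of FeedWordPress or of the theme.

```php
<?php
// Hypothetical helper: return the text between two markers,
// or false if either marker is missing.
function extract_between($html, $start, $end) {
    $from = strpos($html, $start);
    if ($from === false) {
        return false;
    }
    $from += strlen($start);
    $to = strpos($html, $end, $from);
    if ($to === false) {
        return false;
    }
    return substr($html, $from, $to - $from);
}

// Made-up page: the post body contains a nested <div>.
$page = '<div class="postBody"><p>Intro</p><div class="note">Tip</div><p>Outro</p></div>'
      . '<div class="postFooter">Footer</div>';

// Ending on a bare </div> stops at the nested div and truncates the post.
$short = extract_between($page, '<div class="postBody">', '</div>');

// Ending on </div> plus the footer's opening tag captures the whole body.
$full = extract_between($page, '<div class="postBody">', '</div><div class="postFooter">');

echo $short, "\n"; // <p>Intro</p><div class="note">Tip
echo $full, "\n";  // <p>Intro</p><div class="note">Tip</div><p>Outro</p>
```

On a real page the markers have to match the page source character for character, including any whitespace or line breaks between the two tags.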

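Step 2 - Add the start and end markers to `Custom Feed Settings`
Open the WordPress administration panel and go to the `Feed and Update Settings` page. Select the feed from the drop-down menu (here, `Saeid Zebardast's Blog`).
Add the start and end markers to `Custom Feed Settings` as two custom settings: the key `start_content` with the start marker from Step 1 as its value, and the key `end_content` with the end marker as its value. These are the names the code in Step 3 reads with `get_feed_meta()`.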


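Step 3 - Fetch the full content from the source and update the post in WordPress
Open functions.php in a text editor and add the code below to the end of it. If the file is still in PHP mode at that point (it does not end with `?>`), leave out the opening `<?php` line, otherwise PHP reports a parse error: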

<?php
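// Return true if $link looks like an absolute http(s) URL.
function validLink($link) {
    // The dot between host labels is escaped so it only matches a literal ".".
    return preg_match('|^https?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $link) === 1;
}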


/**
 * Get a web file (HTML, XHTML, XML, image, etc.) from a URL. Return the
 * transfer info array from curl_getinfo() with three extra keys:
 * 'errno', 'errmsg' and 'content' (the response body, or false on failure).
 */
function get_web_page( $url )
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,            // return the page instead of printing it
        CURLOPT_HEADER         => false,           // don't include headers in the content
        CURLOPT_FOLLOWLOCATION => false,           // don't follow redirects (set to true to follow them)
        CURLOPT_ENCODING       => "",              // handle all encodings
        CURLOPT_USERAGENT      => "ayy.ir spider", // who am i
        CURLOPT_AUTOREFERER    => true,            // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,             // timeout on connect
        CURLOPT_TIMEOUT        => 120,             // timeout on response
        CURLOPT_MAXREDIRS      => 10,              // stop after 10 redirects (only used when following redirects)
    );

    $ch = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err     = curl_errno( $ch );
    $errmsg  = curl_error( $ch );
    $header  = curl_getinfo( $ch );
    curl_close( $ch );

    $header['errno']   = $err;
    $header['errmsg']  = $errmsg;
    $header['content'] = $content;
    return $header;
}


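// Text of $inthat before the first occurrence of $needle.
// ($this cannot be used as a parameter name in current PHP versions.)
function before ($needle, $inthat)
{
    return substr($inthat, 0, strpos($inthat, $needle));
}

// Text of $inthat after the first occurrence of $needle, or false if not found.
function after ($needle, $inthat)
{
    $pos = strpos($inthat, $needle);
    if ($pos === false) {
        return false;
    }
    return substr($inthat, $pos + strlen($needle));
}

// Collect every piece of $inthat that sits between $start and $end.
function multi_between($start, $end, $inthat)
{
    $elements = array();
    $counter = 0;
    while ($inthat)
    {
        $counter++;
        $elements[$counter] = before($end, $inthat);
        $elements[$counter] = after($start, $elements[$counter]);
        $inthat = after($end, $inthat);
    }
    return $elements;
}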

// Return the text between the first $delimiterLeft and the next $delimiterRight,
// or false if either delimiter is missing.
function strbet($inputStr, $delimiterLeft, $delimiterRight, $debug=false) {
    $posLeft = strpos($inputStr, $delimiterLeft);

    if ( $debug ) {
        echo $posLeft;
    }

    if ( $posLeft === false ) {
        if ( $debug ) {
            echo "Warning: left delimiter '{$delimiterLeft}' not found";
        }
        return false;
    }

    // Start searching for the right delimiter just after the left one.
    $posLeft += strlen($delimiterLeft);
    $posRight = strpos($inputStr, $delimiterRight, $posLeft);
    if ( $posRight === false ) {
        if ( $debug ) {
            echo "Warning: right delimiter '{$delimiterRight}' not found";
        }
        return false;
    }

    if ( $debug ) {
        echo $posLeft;
        echo $posRight;
    }

    return substr($inputStr, $posLeft, $posRight - $posLeft);
}

?>
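
A quick way to see how the link check behaves outside WordPress. This is only a sketch: it uses a standalone copy of the pattern with the dot escaped, and the function name `is_absolute_http_url` is hypothetical.

```php
<?php
// Hypothetical standalone copy of the link check. The dot between host
// labels is escaped, so it matches a literal "." rather than any character.
function is_absolute_http_url($link) {
    return preg_match('|^https?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $link) === 1;
}

// Absolute URL: accepted.
var_dump(is_absolute_http_url('http://zebardast.ir/en/linux-and-unix-bash-shell-aliases/')); // bool(true)

// Relative path, as some feeds deliver it: rejected.
var_dump(is_absolute_http_url('/en/linux-and-unix-bash-shell-aliases/')); // bool(false)

// Space inside the host name: rejected only because the dot is escaped.
var_dump(is_absolute_http_url('http://exa mple.com/')); // bool(false)
```

With an unescaped dot, the third case would pass, because `.` would match the space.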

Close functions.php and open single.php in a text editor. Add the code below after `<?php if (have_posts()) : while (have_posts()) : the_post(); ?>`:

<?php
$my_content = get_the_content();
// Only touch posts that FeedWordPress syndicated (and only if the plugin is active).
if (function_exists('is_syndicated') && is_syndicated()) :

    $syndication_permalink  = get_post_meta(get_the_ID(), "syndication_permalink", true);
    $syndication_source     = get_post_meta(get_the_ID(), "syndication_source", true);
    $syndication_source_uri = get_post_meta(get_the_ID(), "syndication_source_uri", true);

    // If the feed gave a relative permalink, resolve it against the source site.
    if (!validLink($syndication_permalink) && validLink($syndication_source_uri)) {
        $syndication_permalink = rtrim($syndication_source_uri, "/") . "/" . ltrim($syndication_permalink, "/");
    }

    // Fetch the full content only once per post.
    $post_updated = get_post_meta(get_the_ID(), "post_updated", true);
    if (empty($post_updated)) {

        // Markers stored in the feed's `Custom Feed Settings` (Step 2).
        $start_content = get_feed_meta('start_content');
        $end_content   = get_feed_meta('end_content');

        if (!empty($start_content) && !empty($end_content)) {
            $result  = get_web_page($syndication_permalink);
            $my_page = $result['content'];

            if (!empty($my_page)) {
                // strbet() returns the text between the markers, or false if one is missing.
                $valid_texts = strbet($my_page, $start_content, $end_content);

                if (!empty($valid_texts)) {
                    $my_post = array();
                    $my_post['ID'] = get_the_ID();
                    $my_post['post_content'] = $valid_texts;
                    $my_content = $valid_texts;
                    wp_update_post($my_post);
                    update_post_meta(get_the_ID(), 'post_updated', true);
                }
            }
        }
    }

endif; //is_syndicated()
?>
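
The snippet above writes whatever was extracted straight into the post. A more defensive variation might first check the cURL error code and the HTTP status in the array returned by `get_web_page()`. The sketch below is an assumption about how that could look, not part of the original code: `fetch_looks_ok` is a hypothetical name, and the sample arrays only mimic the `errno`, `http_code` and `content` keys.

```php
<?php
// Hypothetical guard: accept a fetch result only if cURL reported no error,
// the server answered 200, and the body is a non-empty string.
function fetch_looks_ok(array $result) {
    $errno  = isset($result['errno']) ? (int) $result['errno'] : -1;
    $status = isset($result['http_code']) ? (int) $result['http_code'] : 0;
    $body   = isset($result['content']) ? $result['content'] : '';

    return $errno === 0 && $status === 200 && is_string($body) && $body !== '';
}

// Sample arrays shaped like the return value of get_web_page().
$good    = array('errno' => 0,  'http_code' => 200, 'content' => '<html>...</html>');
$missing = array('errno' => 0,  'http_code' => 404, 'content' => 'Not found');
$failed  = array('errno' => 28, 'http_code' => 0,   'content' => false); // 28 = cURL timeout

var_dump(fetch_looks_ok($good));    // bool(true)
var_dump(fetch_looks_ok($missing)); // bool(false)
var_dump(fetch_looks_ok($failed));  // bool(false)
```

Such a check would sit just before the `strbet()` call, so that an error page is never saved as the post content.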

After that, replace the `the_content()` call in single.php (keeping it inside its existing `<?php ... ?>` tags) with:
 echo $my_content; 

Close the text editor and upload functions.php and single.php to your theme folder. Now open a single post to see the full content.
Just try it!

See also
How do I get FeedWordPress to include the full content of posts, instead of just a short summary or excerpt of the text?

External links
WordPress
FeedWordPress (Homepage)
FeedWordPress (WordPress plugin directory)
Client URL Library

Good luck :)

22 comments:

  1. Hi, I was searching for something like this because I wanted to try FeedWordPress with my 2 blogs' feeds (1 blog is the one which should be replicated and the other one is the testing blog). I did everything you explained but it doesn't work for me; I still get [...] and the content is cut. When I uploaded the files, functions.php gave me an error on line 1072, which is the line where your code begins, so I deleted the "<?php" line and the error went away. But it still doesn't work. What can I do? Can you help me? If you want, I can upload the PHP files and give them to you if that would be useful.

    PS I know that it would work if I make the RSS show the full content, but I don't do that because I want it to work with other blogs' feeds which are cut like mine.

    PPS Sorry for my bad English, I'm Italian :)

    PPPS Feel free to mail me :D

    Have a nice day!

  2. Hi Richard,

    You can upload the `single.php` and `functions.php`. I will fix them :)

  3. Thank you very much Saeid, i compressed them in a .rar archive and I uploaded them on megaupload, here is the link: http://www.megaupload.com/?d=39FT002N
    In the archive i also put my main blog “single.php” and “functions.php” file so next time when I will try it on my main blog i won’t bother you again

    Again Thanks
    Bye bye!

  4. hey richard,

    please check your mail :)

  5. Gonna try it now ;)

    Thanks Saeid!

  6. np, please send all files of theme to me :)

  7. Sorry if i didn't send you the file, I didn't receive a notification of your reply.

    Sending it right now :D

  8. Hey Saeid,
    did you receive my mail?

  9. hey richard, I got your theme files but I don't have time now. I will work on the theme very soon.

  10. Hey ok don't worry, i just wanted to know if you received it or not ;)
    See you soon, bye bye!

  11. Hey Richard,

    please check your mail ;)

  12. I did the steps but still its fetching before tag.
    The start value is starting with and end is like this

    i hope i have selected the correct Div, edited single.php and functions.php but it does not work.

    Any help ?

  13. Hi,

    I tried this code and I think it needs to be altered for wp 3.1. I would really like to see how one would change this up for 3.1. If you have time, I know I would truly appreciate it.

    Thanks for the great information.

    John

  14. hi saeid, same over here, i tried to get this to work but failed miserably. can i hire you to make this available as an add-on for fwp?

  15. hi saeid
    i can't find what you mentioned in single.php


    my single.php only have this :

    can you please make this easier for us and provide ready-to-use files, or make this for the default theme? Thanks in advance.

  16. I've tried to get this working with the latest version of Wordpress, but I get an error after I change the "echo $my_content;" bit. Please help.

  17. Sorry, I don't use FeedWordPress plugin any more.

  18. Hi, please, do you have any alternatives for getting the full post content?

  19. I personally still use FeedWordpress quite a bit. But recently I found the need to get some more fine grained tools for things here and there. I wound up getting http://codecanyon.net/item/web-grabber-wordpress-plugin/239696&ref=schirpich

    You can use it all by itself via short codes, or you can even integrate it directly with FeedWordpress via the key/value custom fields in combination with the Web Grabber short codes.

    I've found it to work really well for the simple fact that anyone using FeedWordpress isn't "only" trying to syndicate from other wordpress sites. And if there's anything you've learned from playing with FWP is that no two RSS feeds are alike, hehe

  20. Parse error: syntax error, unexpected T_VARIABLE in /home/tazehneg/public_html/wp-content/themes/Tazehnegar/single.php on line 34

  21. [...] Need to be aware that FeedWordPress can’t get content for partial rss feeds but plugins are available [...]
