Monday, May 11, 2009

Use cURL in case allow_url_fopen is off in PHP


In the previous post I explained inbound links (backlinks).

When I tried to use the Yahoo Site Explorer API from PHP code, I ran into some issues.

The sample PHP code uses file_get_contents to fetch the contents of a web page from a remote site (i.e. from the Yahoo server).

But it didn't work on our server. After some analysis, I found that allow_url_fopen is off on our server, i.e. in the php.ini file the value of allow_url_fopen is set to 0.
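
A quick way to confirm this on a given server is to read the setting at runtime. This is only a small sketch; the variable name $allowed is my own choice for illustration.

```php
<?php
// ini_get() returns the current value of a php.ini setting as a string.
// FILTER_VALIDATE_BOOLEAN maps "1"/"On" to true and "0"/""/"Off" to false.
$allowed = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);

echo $allowed
    ? "allow_url_fopen is on\n"
    : "allow_url_fopen is off\n";
```

If this prints "off", file_get_contents will refuse to open http:// URLs on that server.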


Since the server is on shared hosting, we couldn't change the value in the php.ini file.

As a workaround, we can normally use ini_set to override php.ini values in the PHP code itself.

So I tried adding ini_set("allow_url_fopen", 1); to the PHP code.
But it didn't work. I found that some php.ini settings cannot be overridden from PHP code. allow_url_fopen is one of them, because it can only be set at the system level (php.ini or the server configuration).
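
The failure can be seen directly from the return value of ini_set, which is false when a setting cannot be changed. The snippet below is a minimal sketch of that check; the variable names are only for illustration.

```php
<?php
// allow_url_fopen is a system-level setting, so ini_set() refuses to
// change it at runtime and returns false instead of the old value.
$before = ini_get('allow_url_fopen');
$result = ini_set('allow_url_fopen', '1');
$after  = ini_get('allow_url_fopen');

if ($result === false) {
    echo "ini_set() could not change allow_url_fopen\n";
}
// $after still holds the same value as $before.
```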

Finally, after searching the net, I found an alternative way to resolve this issue.

cURL in PHP does not need allow_url_fopen to be on in order to get contents from remote websites.

So I created the function getContents below, using cURL as an alternative to file_get_contents.


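// Fetch the body of $url with cURL. Returns the contents as a string, or FALSE on failure.
function getContents($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_HEADER, 0);         // do not include HTTP headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the contents instead of printing them
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0); // 0 = do not set an explicit connection timeout
    $Rec_Data = curl_exec($ch);
    return $Rec_Data;
}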
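
Since curl_exec only returns FALSE on failure, it can be useful to know why a request failed. The following is a sketch of one possible variant, not part of the original sample. The function name getContentsOrFail and the 10-second default timeout are my own assumptions.

```php
<?php
// Hypothetical variant of getContents() that reports the cURL error
// instead of silently returning FALSE. The timeout value is an assumption.
function getContentsOrFail($url, $timeoutSeconds = 10)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds); // limit for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);        // limit for the whole request

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_error() gives a human-readable reason for the last failure.
        throw new RuntimeException("cURL request failed: " . curl_error($ch));
    }
    return $body;
}
```

A caller can wrap the call in try/catch and print the exception message while debugging.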



Apart from the above change, we need to do the following to make the Yahoo API work.


  • Get an appId from Yahoo.

  • Enhance the sample code to show the results across multiple pages. This is required because Yahoo returns a maximum of 100 links per request.
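
The paging itself is simple arithmetic: each page asks the API for a fixed number of links, beginning at a 1-based start position. Here is a small sketch of that calculation, assuming 30 links per request. The helper name startPosition is hypothetical.

```php
<?php
// Hypothetical helper: 1-based start position of the first result on a page.
function startPosition($pageNo, $linksPerRequest)
{
    return ($pageNo - 1) * $linksPerRequest + 1;
}

// With 30 links per request, pages 1, 2 and 3 start at results 1, 31 and 61.
foreach (array(1, 2, 3) as $page) {
    echo "page $page starts at result " . startPosition($page, 30) . "\n";
}
```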



After making the above changes, the PHP code looks like this.



define('BACKLINK_LIMIT', 2);
define('BACKLINK_TRUNCATE', 1);
define('BACKLINK_ALL', 0);

// Print up to $num links as list items; $num = 0 (BACKLINK_ALL) prints all of them.
function print_backlinks($domain, $links, $num) {
    $limit = $num ? min($num, count($links)) : count($links);
    for ($i = 0; $i < $limit; $i++) {
        list($url, $title) = $links[$i];
        echo '<li><a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($title) . '</a></li>';
    }
}

function backLink($input_url) {
    $api_service_url = "http://search.yahooapis.com/SiteExplorerService/V1/inlinkData";
    $apiid = "get yours from yahoo";
    $query = $input_url;
    $entire_site = "";
    $omit_inlinks = "domain";
    $linksperrequest = 30;
    $startposition = 1;
    $totallinks = 0;
    $backlinks = array();
    $request_url = $api_service_url . "?appid=" . $apiid . "&query=" . urlencode($query)
        . "&entire_site=" . $entire_site . "&omit_inlinks=" . $omit_inlinks . "&output=php";

    // The page number comes from the query string (1-based); default to page 1.
    $pageno = isset($_GET['pageno']) ? max(1, (int) $_GET['pageno']) : 1;

    // Only one page of results is fetched per request, so no loop is needed here.
    $requrl = sprintf("%s&start=%s&results=%s", $request_url,
        ($pageno - 1) * $linksperrequest + $startposition, $linksperrequest);
    if (($content = getContents($requrl)) === FALSE) {
        echo "HTTP error: " . htmlspecialchars($requrl);
        exit;
    }

    // With output=php the API returns serialized PHP data.
    // Note: unserialize() should only be used on responses from a trusted server.
    $data = unserialize($content);
    if (is_array($data) && isset($data["ResultSet"]["Result"])) {
        $results = $data["ResultSet"]["Result"];
        $totallinks = sizeof($results);
        for ($i = 0; $i < $totallinks; $i++) {
            $url = $results[$i]["Url"];
            $title = $results[$i]["Title"];
            $domain = 'http://' . parse_url($url, PHP_URL_HOST);
            $backlinks[$domain][] = array($url, $title);
        }
    } else {
        echo "Error: Bad response from server";
    }

    // For domains with more than BACKLINK_LIMIT links keep only the first
    // BACKLINK_TRUNCATE, then sort the links of each domain.
    foreach ($backlinks as $domain => $links) {
        if (count($links) > BACKLINK_LIMIT) $backlinks[$domain] = array_slice($links, 0, BACKLINK_TRUNCATE);
        sort($backlinks[$domain]);
    }

    echo '<ul>';
    foreach ($backlinks as $domain => $links) {
        if (count($links) > BACKLINK_LIMIT) {
            print_backlinks($domain, $links, BACKLINK_TRUNCATE);
        } else {
            print_backlinks($domain, $links, BACKLINK_ALL);
        }
    }
    echo '</ul>';

    // A full page suggests that more results exist, so offer a link to the next page.
    if ($totallinks == $linksperrequest) {
        $nextpage = $pageno + 1;
        echo "   <a href='" . htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES) . "?pageno=" . $nextpage
            . "&amp;inputurl=" . urlencode($input_url) . "'><h3><font color='maroon'>Next Page</font></h3></a>";
    }
}
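
The "Next Page" link carries the input URL as a query parameter, so that URL has to be encoded or its own "?" and "&" characters will break the link. Below is a sketch of one way to build such a link. The helper name nextPageLink and the script path /backlinks.php are hypothetical.

```php
<?php
// Hypothetical helper: build the "Next Page" link with the query parameters
// URL-encoded and the finished href escaped for use inside HTML.
function nextPageLink($script, $nextPage, $inputUrl)
{
    $params = array('pageno' => $nextPage, 'inputurl' => $inputUrl);
    $query = http_build_query($params, '', '&');
    $href = htmlspecialchars($script . '?' . $query, ENT_QUOTES, 'UTF-8');
    return '<a href="' . $href . '">Next Page</a>';
}

echo nextPageLink('/backlinks.php', 2, 'http://example.com/?a=1&b=2');
```

On the receiving page, $_GET['inputurl'] then contains the original URL again, because PHP decodes query parameters automatically.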


For one of our clients, we built a flight search website using cURL.

We do software development and testing projects cost-effectively and with quality.

For your software development needs, you can contact us at qualitypointmail@gmail.com


3 comments:

Unknown said...

Fatal error: Call to undefined function getcontents() in C:\xampp\htdocs\program crawler\inbound\fff2.php on line 22

Unknown said...

I have tried your code, but it ended like this. Please tell me how to fix it.
Fatal error: Call to undefined function getcontents() in C:\xampp\htdocs\program crawler\inbound\fff2.php on line 22

Rajamanickam Antonimuthu said...

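PHP function names are not case-sensitive, so the lower case "c" in getcontents() is not the cause.
This error usually means that the definition of the getContents() function is missing from your script. Make sure the getContents() function from this post is included in the same file (or in a file that you include).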

Let me know if you still see this issue.

Thanks,
Rajamanickam
