Foros del Web - View Single Post - Simulating a browser PERFECTLY with PHP and reading the content of any site

ASCENDEDMASTERS · #1 · 02/10/2006, 06:21

I currently use the following function (built on the cURL library) to fetch the content of a web page:

PHP code:

<?php

function GetHTML($d, $method, $vars, $ref = '')
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $d);
    curl_setopt($ch, CURLOPT_REFERER, $ref);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
    curl_setopt($ch, CURLOPT_VERBOSE, 0);          // 0 turns curl's verbose diagnostic output off; 1 turns it on
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);   // on/off switch; the redirect limit is CURLOPT_MAXREDIRS
    if ($method == 'POST')
    {
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $vars);
    }
    $buffer = curl_exec($ch);   // response body as a string, or false on failure
    curl_close($ch);
    unset($ch);
    return $buffer;
}

The problem is that Google, for example, notices and replies with something like "unable to process your request", and other sites do the same. What do I need to add to simulate a browser perfectly so that I am not rejected?
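
For reference, one difference between this request and a real browser's is that a browser sends more headers (Accept, Accept-Language and so on) and keeps cookies between requests. The sketch below assumes that is what the sites are reacting to, which may not be the case, since some sites reject automated requests whatever headers they carry. The names `browserHeaders` and `fetchLikeBrowser`, the header values and the cookie file path are all hypothetical.

```php
<?php
// Hypothetical helper: header lines similar to those a desktop browser sends.
function browserHeaders(string $language = 'en-US,en;q=0.8'): array
{
    return [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: ' . $language,
        'Connection: keep-alive',
    ];
}

// Hypothetical wrapper: same idea as GetHTML(), plus extra headers and a cookie jar.
function fetchLikeBrowser(string $url, string $cookieFile, string $referer = '')
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, browserHeaders());
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_ENCODING, '');             // accept every encoding curl supports
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);   // cookies are written here on close
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);  // cookies are read from here and sent back
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $html = curl_exec($ch);   // string on success, false on failure
    curl_close($ch);
    return $html;
}
```

A call would look like `fetchLikeBrowser('http://dominio.algo/', '/tmp/cookies.txt')`, where the cookie path is again only a placeholder.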

Another thing: some sites, when you try to enter with the address "http://dominio.algo", automatically redirect your browser to the domain with its subdomain (if it has one), that is, http://www.dominio.algo/
How do I make it so that, when I read the content of such a site, I do not get the server's notice instead:

Code:

Found
The document has moved here.

Apache/1.3.29 Server at dominio.algo Port XX

What is it missing?
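
For context, the "Found" page above is the body of an HTTP 302 response, and the new address travels in its Location header. If CURLOPT_FOLLOWLOCATION has no effect (older PHP versions are known to refuse it under safe_mode or open_basedir, which is only an assumption about this setup), the redirect can be followed by hand. This is a minimal sketch; `redirectTarget` and `fetchFollowing` are hypothetical names, and it assumes the Location header holds an absolute URL.

```php
<?php
// Hypothetical helper: return the URL named by a Location header, or null if there is none.
function redirectTarget(string $rawHeaders): ?string
{
    if (preg_match('/^Location:\s*(\S+)/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null;
}

// Hypothetical loop: follow up to $max redirects manually.
function fetchFollowing(string $url, int $max = 3)
{
    for ($i = 0; $i <= $max; $i++) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_HEADER, true);   // include response headers in the output
        $response   = curl_exec($ch);
        $status     = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
        curl_close($ch);
        if ($response === false) {
            return false;                          // transfer failed
        }
        $next = redirectTarget(substr($response, 0, $headerSize));
        if ($status < 300 || $status >= 400 || $next === null) {
            return substr($response, $headerSize); // not a redirect: return the body
        }
        $url = $next;                              // assumes an absolute URL in Location
    }
    return false;                                  // more than $max redirects
}
```

With the example from this post, `fetchFollowing('http://dominio.algo')` would request http://www.dominio.algo/ on the second pass and return that page's body rather than the "Found" notice.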