Foros del Web - View Single Post

jahman · #1 · 03/05/2010, 08:01

Hi everyone, I have a function that does HTML scraping in C# with regex. Given a string[] of URLs and a string[] of regex patterns, it goes through each page looking for the data I specify.

In general it works: with .*?( (\\d+)?(?: ?\\d){8,10}) I get the phone number. But with the regex <title>([^<]+)</title>, which should get the page title, nothing is shown. I know it matches, because I can see the match in the debugger, but it is not printed.
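To make the behaviour of the title pattern concrete, here is a minimal sketch of what that regex returns. It assumes a made-up sample page (not one of the URLs in the post), and `TitleMatchDemo` and `Extract` are hypothetical names used only for illustration. It shows that `match.Value` holds the whole match, tags included, while `match.Groups[1].Value` holds only the text captured by the parentheses.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleMatchDemo
{
    // Returns { whole match, first capture group } for the title pattern,
    // or an empty array when the pattern does not match.
    public static string[] Extract(string html)
    {
        Match m = Regex.Match(html, "<title>([^<]+)</title>");
        if (!m.Success)
            return new string[0];
        return new[] { m.Value, m.Groups[1].Value };
    }

    public static void Main()
    {
        // Hypothetical sample input, assumed for this example only.
        string[] parts = Extract("<html><head><title>Cafe Kontakt</title></head></html>");
        Console.WriteLine(parts[0]); // <title>Cafe Kontakt</title>
        Console.WriteLine(parts[1]); // Cafe Kontakt
    }
}
```

If only the title text is wanted, reading `Groups[1].Value` instead of `Value` would be one way to get it without the surrounding tags.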

The part in red in the code is where I receive the matches and should display them, but when they come wrapped in an HTML tag (<>) nothing is shown, even though the match was found...
I hope you can help me. Thanks.
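One possible explanation for this symptom, assuming `ResultRegex` is a control whose `Text` is rendered as HTML (the `<br />` separators suggest so, but the post does not confirm it): the matched `<title>...</title>` is written out as real markup, so the browser treats it as a tag instead of showing it. A minimal sketch of encoding the match before output follows; `EncodeMatchDemo` and `FormatMatch` are hypothetical names, and `WebUtility.HtmlEncode` requires .NET Framework 4.0 or later.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class EncodeMatchDemo
{
    // Builds one output line, escaping the match so any tags in it
    // are displayed as text rather than interpreted as markup.
    public static string FormatMatch(Match match)
    {
        return "-" + WebUtility.HtmlEncode(match.Value) + "<br />";
    }

    public static void Main()
    {
        // Hypothetical sample input, assumed for this example only.
        Match m = Regex.Match("<title>Cafe Kontakt</title>", "<title>([^<]+)</title>");
        Console.WriteLine(FormatMatch(m));
        // -&lt;title&gt;Cafe Kontakt&lt;/title&gt;<br />
    }
}
```

Phone-number matches contain no angle brackets, so encoding leaves them unchanged, which would be consistent with the first pattern appearing to work.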

Code:

    public string[] GetUrl = new string[] { "http://www.cafesor.no/kontakt/","http://www.asylet.no/","http://www.cafekaos.no/info.html","http://jekylls.no/html/kontakt.html" };
    public string[] RegexString = new string[] { ".*?( (\\d+)?(?: ?\\d){8,10})","<title>([^<]+)</title>" };



    public void regex_Click(object sender, EventArgs e)
    {
        ResultRegex.Text = "";
        for (int i = 0; i < GetUrl.Length; i++)
        {
            // Download the page and print its URL as a heading line.
            string pagesource = getHtml(GetUrl[i]);
            ResultRegex.Text += GetUrl[i] + " <br />";
            for (int j = 0; j < RegexString.Length; j++)
            {
                Regex objNotNaturalPattern = new Regex(RegexString[j]);
                MatchCollection matches = objNotNaturalPattern.Matches(pagesource);

                // The matches are received and displayed here.
                foreach (Match match in matches)
                    ResultRegex.Text += "-" + match.Value + "<br />";
            }
        }