  1. #1
    Regular Coder
    Join Date
    Jun 2007
    Posts
    310
    Thanks
    86
    Thanked 3 Times in 3 Posts

    Help with a parser script please?

    Hello,

    I'm in the middle of writing a parser that skips all the junk at the start of a file I want to read. Once it finds a certain tag, <HEAD>, I want everything from that point on output to the page, ignoring everything before the tag. I'm having problems, though, and nothing is output.

    Can anyone kindly advise? I think I'm nearly there.

    Thanks

    PHP Code:
    <?php

    //* Open the file
    $lines = file('file.html');

    //* Need to ditch the junk at the start of the file so initially set $find_line = false
    $find_line = false;

    //* Loop through the lines
    foreach ( $lines as $line ) {

        //* Find the line where the <HEAD> tag is and then set $find_line = true
        if (strpos($line, '<HEAD>')) {
            $find_line = true;
        }

        //* Now it's 'true', echo out details, ignoring everything before <HEAD>
        if ($find_line) {
            echo $val = trim($line);
        }

    } //End foreach

    ?>
    Thanks

    Chris

  • #2
    Master Coder
    Join Date
    Jun 2003
    Location
    Cottage Grove, Minnesota
    Posts
    9,502
    Thanks
    8
    Thanked 1,089 Times in 1,080 Posts
    Not sure if this helps or not, but I found this using Google.

    It's for getting the <title>, but I suppose it could just as well be <head>:

    $data = file_get_contents("SOME URL");
    $pattern = "/<title>([^<]*)<\/title>/";
    preg_match($pattern, $data, $matches);
    // $matches is an array;
    // $matches[1] holds the data captured by the first () group, i.e. the title
    $title = $matches[1];
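
One thing the snippet above leaves out is a check that the pattern matched at all, since reading $matches[1] after a failed match raises a warning. A minimal sketch of that check follows. The function name extract_title and the inline sample string are made up for illustration, and the i flag is an addition so that <TITLE> also matches.

```php
<?php
// Hypothetical helper (not from the thread): returns the page title,
// or null when no <title>...</title> pair is found.
function extract_title(string $html): ?string
{
    // Same pattern as in the post, plus the i flag for case-insensitivity.
    $pattern = "/<title>([^<]*)<\/title>/i";

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $html, $matches) === 1) {
        // $matches[1] is the text captured by the first () group.
        return trim($matches[1]);
    }
    return null;
}

// Inline sample input, standing in for file_get_contents("SOME URL").
$sample = "<html><head><title> My page </title></head><body></body></html>";
var_dump(extract_title($sample));
```

Returning null keeps the "no title found" case separate from an empty title, which would come back as an empty string.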

  • #3
    Senior Coder
    Join Date
    Jul 2009
    Location
    South Yorkshire, England
    Posts
    2,318
    Thanks
    6
    Thanked 304 Times in 303 Posts
    Using that concept mlseim posted:

    Code:
    $data = file_get_contents("SOME URL");
    $pattern = "~<head>(.*)~is";
    preg_match($pattern, $data, $matches);
    print($matches[1]);
    Use $matches[0] instead of $matches[1] if you want to print the <head> tag too.
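
Going back to the loop in the first post, one likely reason nothing was printed is that strpos() returns the offset of the match. When <HEAD> starts the line, that offset is 0, which PHP treats as false inside an if. strpos() is also case-sensitive, so a lowercase <head> would never match. Below is a rough sketch of the line-by-line approach with both points addressed. The function name lines_from_head and the sample lines are assumptions for illustration only.

```php
<?php
// Hypothetical helper (not from the thread): returns the trimmed lines
// from the first line containing <head> onward, in any letter case.
function lines_from_head(array $lines): array
{
    $found = false;
    $kept = [];
    foreach ($lines as $line) {
        // stripos() is the case-insensitive variant of strpos().
        // Offset 0 is falsy, so compare against false explicitly.
        if (!$found && stripos($line, '<head>') !== false) {
            $found = true;
        }
        if ($found) {
            $kept[] = trim($line);
        }
    }
    return $kept;
}

// Inline sample standing in for file('file.html').
$lines = ["junk line\n", "<HEAD>\n", "<title>Test</title>\n", "</HEAD>\n"];
echo implode("\n", lines_from_head($lines)), "\n";
```

Note that searching for the literal string <head> will miss a tag that carries attributes, such as <head profile="...">. The regex version above has the same limitation.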

