  1. #1
    New to the CF scene
    Join Date
    Feb 2007
    Posts
    7
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Perl regex question

    A quick question about a regex problem that is stumping me. Hopefully someone has some thoughts.

    Code:
    open(FDATA, $fullPath) or die "Can't open $fullPath : $!";
    
    while(<FDATA>) {
        if($_ =~ /<BODY>(.*)<\/BODY>/gis) {
            print "Match: $1\n";
        }
    close(FDATA);
    }
    I'm basically just running this code against an HTML or XML file with a body tag somewhere in it. The regex looks good from what I can tell: I ran it through an online regex tester from RegexBuddy and it seems to match properly. But for some reason it isn't returning anything here. Any thoughts?

  • #2
    Master Coder
    Join Date
    Apr 2003
    Location
    in my house
    Posts
    5,211
    Thanks
    39
    Thanked 201 Times in 197 Posts
    Not tested but I would try:

    Code:
      if($_ =~ /<BODY>|<\/BODY>/i ){
          print qq(do summat);
      }
    You need to escape the / in the closing tag, since / is also the regex delimiter, but you shouldn't need to escape the < and >.

    bazz
    "The day you stop learning is the day you become obsolete"! - my late Dad.

    Why do some people say "I don't know for sure"? If they don't know for sure then, they don't know!

  • #3
    New to the CF scene
    Join Date
    Feb 2007
    Posts
    7
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Sorry, I'm a little rusty with regexes, so while that may work, I'm not sure how to use it. I should have been more explicit. Am I mistaken that your code is simply matching <BODY> or </BODY>? I'm trying to return everything between the BODY tags. I'm using (.*) because I need to be able to reference it somehow to use it elsewhere, such as with $1. If your code is returning what's between the body tags, how would I reference a match using your syntax?

    Regardless, thanks for the input. It is appreciated.

  • #4
    Super Moderator
    Join Date
    May 2005
    Location
    Southern tip of Silicon Valley
    Posts
    2,874
    Thanks
    2
    Thanked 164 Times in 159 Posts
    In almost all cases, using a regex to parse HTML is the wrong approach. Normally you'd want to use one of the HTML parsers on CPAN. However, in this case, you might get by with a regex (actually two regexes and the flip-flop operator).

    Code:
    my $fullPath = '/some/path/file';
    open( my $FDATA, '<', $fullPath)
      or die "Can't open $fullPath : $!";
    
    my $body;
    while(<$FDATA>) {
        if(/<BODY>/i .. /<\/BODY>/i ) {
            $body .= $_;
        }
    }
    close $FDATA;
    
    print $body;
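
    For reference, the original loop likely failed for two reasons: `while(<FDATA>)` reads one line at a time, so the pattern never sees <BODY> and </BODY> in the same string unless they share a line, and close(FDATA) sits inside the loop, so only the first line is ever read. If you still want the text between the tags in $1, as asked in post #3, one option is to read the whole file into a single scalar first. This is only a rough sketch, not tested against your file: the sample string and the in-memory filehandle are stand-ins I made up so it runs on its own, and it assumes a single, attribute-free <BODY> tag.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the contents of $fullPath.
my $html = "<html>\n<BODY>\nfirst line\nsecond line\n</BODY>\n</html>\n";

# Open the string as an in-memory filehandle so the sketch runs without a file.
# With a real file this would be: open( my $FDATA, '<', $fullPath )
open( my $FDATA, '<', \$html )
  or die "Can't open in-memory file : $!";

# Slurp mode: with $/ locally undefined, <$FDATA> returns the whole input
# as one string instead of one line per read.
my $content = do { local $/; <$FDATA> };
close $FDATA;

# /i ignores case, /s lets . match newlines, and the non-greedy .*? stops
# at the first </BODY> rather than the last one.
my $body;
if ( $content =~ /<BODY>(.*?)<\/BODY>/is ) {
    $body = $1;
    print "Match: $body\n";
}
```

    Unlike the flip-flop version, this captures only what lies between the tags, not the tag lines themselves. The /g modifier from the original is dropped because a single match is all that's needed here.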

