  1. #1
    New Coder
    Join Date
    Dec 2010
    Posts
    36
    Thanks
    4
    Thanked 0 Times in 0 Posts

    Parsing a webpage, regex issue

    Hi, I have a webpage that I need to parse some information from. The information is in a table with four fields.

    This is the HTML of the page:
    Code:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    	<title>Account Listing</title>
    	<link rel="STYLESHEET" type="text/css" href="pop.css">
    	<link rel="STYLESHEET" type="text/css" href="account.css">
    </head>
    
    <body style="color:#FFFFFF;" bgcolor="#000000" bottommargin="0" leftmargin="0" rightmargin="0" topmargin="0">
    <div style="font-family:Verdana,sans-serif;font-size:8pt;padding:5px;">	
    	You have <b>1260.65</b> points<br>
    
    	<b>15/25</b> characters on this server
    </div>
    <table width="437" border="0" cellspacing="0" cellpadding="0">
      <tr>
        <td align="center"><font color="#666666" face="Verdana, Arial, Helvetica, sans-serif" size="1"><b>PP</b></font></td>
        <td><font color="#666666" face="Verdana, Arial, Helvetica, sans-serif" size="1"><b>Name</b></font></td>
        <td align="center"><font color="#666666" face="Verdana, Arial, Helvetica, sans-serif" size="1"><b>Level</b></font></td>
    
        <td><font color="#666666" face="Verdana, Arial, Helvetica, sans-serif" size="1"><b>Crew</b></font></td>
        <td></td>
      </tr>
       	 	<tr>
    	  	<td align="center" style="background-color:#333333">
    	  			  			<img src="images/ppnostar.jpg" border="0">
    	  			  	</td>
      	 	<td style="background-color:#333333"><font color="#FFFF00" face="Verdana, Arial, Helvetica, sans-serif" size="1">
    
      	 		<b>Pimpa</b>
      	 	</font></td>
    			<td align="center" style="background-color:#333333"><font color="#FFFFFF" face="Verdana, Arial, Helvetica, sans-serif" size="1">
    				<b>75</b>
    			</font></td>
        	<td style="background-color:#333333"><font color="#999999" face="Verdana, Arial, Helvetica, sans-serif" size="1">
        		<b> •Ou†war Immor†als•</b>
    
        	</font></td>
        	<td style="background-color:#333333">
        		<a target="_top" href="http://sigil.***********/world.php?suid=2198627&serverid=1"><font color="#00FF00" face="Verdana, Arial, Helvetica, sans-serif" size="1"><b>PLAY!</b></font></a>
        	</td>
      	</tr>
      	 	 	<tr>
    	  	<td align="center" style="background-color:#000000">
    	  			  			<img src="images/ppnostar.jpg" border="0">
    
    	  			  	</td>
      	 	<td style="background-color:#000000"><font color="#FFFF00" face="Verdana, Arial, Helvetica, sans-serif" size="1">
      	 		<b>Eag1e</b>
      	 	</font></td>
    			<td align="center" style="background-color:#000000"><font color="#FFFFFF" face="Verdana, Arial, Helvetica, sans-serif" size="1">
    				<b>72</b>
    			</font></td>
        	<td style="background-color:#000000"><font color="#999999" face="Verdana, Arial, Helvetica, sans-serif" size="1">
    
        		<b>• Freedom Figh†ers •</b>
        	</font></td>
        	<td style="background-color:#000000">
        		<a target="_top" href="http://sigil.***********/world.php?suid=2236250&serverid=1"><font color="#00FF00" face="Verdana, Arial, Helvetica, sans-serif" size="1"><b>PLAY!</b></font></a>
        	</td>
      	</tr>
      	 	 	<tr>
    	  	<td align="center" style="background-color:#000000">
    
    	  			  			<img src="images/ppnostar.jpg" border="0">
    	  			  	</td>
      	 	<td style="background-color:#000000"><font color="#FFFF00" face="Verdana, Arial, Helvetica, sans-serif" size="1">
      	 		<b>K1NGBILLY</b>
      	 	</font></td>
    			<td align="center" style="background-color:#000000"><font color="#FFFFFF" face="Verdana, Arial, Helvetica, sans-serif" size="1">
    				<b>66</b>
    			</font></td>
    
        	<td style="background-color:#000000"><font color="#999999" face="Verdana, Arial, Helvetica, sans-serif" size="1">
        		<b>• Freedom Figh†ers •</b>
        	</font></td>
        	<td style="background-color:#000000">
        		<a target="_top" href="http://sigil.***********/world.php?suid=2462246&serverid=1"><font color="#00FF00" face="Verdana, Arial, Helvetica, sans-serif" size="1"><b>PLAY!</b></font></a>
        	</td>
      	</tr>
    A friend wrote a regex for this, and it still works in my VB.NET application. However, I have tried to convert it to PHP without success.

    Here is the working regex that was used in the VB app:
    Code:
    </tr>.*?<tr>.*?<td.*?>.*?<td.*?>.*?<b>(?'Name'.+?)</b>.*?<td.*?>.*?<b>(?'Level'\d+?)</b>.*?<td.*?>.*?<b>(?'Crew'.*?)</b>.*?<td.*?>.*?<a .*?href=.*?suid=(?'CharacterId'\d+?)&.*?</td>
    I don't know regex well, so I can't figure out what to change to make it work in PHP.

    Any help converting this would be appreciated.

    Thanks,
    David.

  • #2
    New to the CF scene
    Join Date
    May 2011
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I'm not much of a programmer, but I may be able to help. Please send me the URL you need to fetch the data from and I will see what I can do.

    I cannot promise anything; as I said, I'm not much of a programmer.

  • #3
    New Coder
    Join Date
    Dec 2010
    Posts
    36
    Thanks
    4
    Thanked 0 Times in 0 Posts
    The page is password protected, which is why I posted a sample of it. I understand what the regular expression does in VB, but I don't know how to use the groups in PHP. That will be my task for the next few hours.
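
    On using the groups in PHP: a minimal sketch, assuming a simplified one-row sample rather than the real page. The variable names `$html` and `$m` and the shortened pattern are made up for illustration. PCRE accepts the `(?'Name'...)` form from the VB regex as well as `(?<Name>...)`, and `preg_match` stores each named group under its name in the matches array.

```php
<?php
// Hypothetical one-row sample, simplified from the page in post #1.
$html = '<td><b>Pimpa</b></td><td><b>75</b></td>';

// (?<Name>...) is a named group; PCRE also accepts (?'Name'...) and (?P<Name>...).
$pattern = '#<b>(?<Name>.+?)</b>.*?<b>(?<Level>\d+)</b>#s';

if (preg_match($pattern, $html, $m)) {
    // Each named group is available by name and by numeric index.
    echo $m['Name'], ' is level ', $m['Level'], "\n"; // Pimpa is level 75
}
```

    The same values are also available as `$m[1]` and `$m[2]`, so existing index-based code keeps working.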

  • #4
    Regular Coder
    Join Date
    May 2011
    Posts
    241
    Thanks
    1
    Thanked 57 Times in 56 Posts
    Try the following

    PHP Code:
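    $data = "Page data....."; // the fetched HTML goes here
    // "#" is the delimiter; "s" lets "." match newlines, "i" ignores case
    $pattern = "#</tr>.*?<tr>.*?<td.*?>.*?<td.*?>.*?<b>(?'Name'.+?)</b>.*?<td.*?>.*?<b>(?'Level'\d+?)</b>.*?<td.*?>.*?<b>(?'Crew'.*?)</b>.*?<td.*?>.*?<a .*?href=.*?suid=(?'CharacterId'\d+?)&.*?</td>#si";
    preg_match($pattern, $data, $matches); // finds the first matching row only
    print_r($matches);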
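
    The `s` modifier matters because the table cells span several lines. A small sketch on a hypothetical two-line fragment (`$cell` and the shortened pattern are made up for illustration) shows the difference:

```php
<?php
// Hypothetical fragment shaped like one cell of the table, with line breaks.
$cell = "<td>\n<b>75</b>\n</td>";

// Without "s", "." does not match a newline, so ".*?" cannot reach <b>.
$withoutS = preg_match('#<td>.*?<b>(\d+)</b>#', $cell, $a);

// With "s", "." also matches newlines and the pattern can span lines.
$withS = preg_match('#<td>.*?<b>(\d+)</b>#s', $cell, $b);

var_dump($withoutS, $withS); // int(0), then int(1)
```

    Using `#` as the delimiter also avoids having to escape the `/` in closing tags such as `</tr>`.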

  • Users who have thanked gvre for this post:

    david56connor (05-31-2011)

  • #5
    New Coder
    Join Date
    Dec 2010
    Posts
    36
    Thanks
    4
    Thanked 0 Times in 0 Posts
    Originally posted by gvre:
    Try the following

    PHP Code:
    $data = "Page data....."; // the fetched HTML goes here
    // "#" is the delimiter; "s" lets "." match newlines, "i" ignores case
    $pattern = "#</tr>.*?<tr>.*?<td.*?>.*?<td.*?>.*?<b>(?'Name'.+?)</b>.*?<td.*?>.*?<b>(?'Level'\d+?)</b>.*?<td.*?>.*?<b>(?'Crew'.*?)</b>.*?<td.*?>.*?<a .*?href=.*?suid=(?'CharacterId'\d+?)&.*?</td>#si";
    preg_match($pattern, $data, $matches); // finds the first matching row only
    print_r($matches);
    It works! Thanks very much!

    Now to figure out how to iterate over the results.
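
    One possible way to iterate, sketched under assumptions: `preg_match` stops at the first row, while `preg_match_all` with `PREG_SET_ORDER` returns one array per matched row. The two-row `$data` below is a hypothetical, cut-down version of the table in post #1 (crew names and markup simplified); the pattern is the one from the thread.

```php
<?php
// Hypothetical two-row sample, reduced from the table in post #1.
$data = <<<'HTML'
<tr><td><b>PP</b></td><td><b>Name</b></td></tr>
<tr>
  <td><img src="images/ppnostar.jpg"></td>
  <td><b>Pimpa</b></td>
  <td><b>75</b></td>
  <td><b>Crew A</b></td>
  <td><a href="world.php?suid=2198627&serverid=1">PLAY!</a></td>
</tr>
<tr>
  <td><img src="images/ppnostar.jpg"></td>
  <td><b>Eag1e</b></td>
  <td><b>72</b></td>
  <td><b>Crew B</b></td>
  <td><a href="world.php?suid=2236250&serverid=1">PLAY!</a></td>
</tr>
HTML;

// Pattern taken unchanged from the thread.
$pattern = "#</tr>.*?<tr>.*?<td.*?>.*?<td.*?>.*?<b>(?'Name'.+?)</b>.*?<td.*?>.*?<b>(?'Level'\d+?)</b>.*?<td.*?>.*?<b>(?'Crew'.*?)</b>.*?<td.*?>.*?<a .*?href=.*?suid=(?'CharacterId'\d+?)&.*?</td>#si";

// PREG_SET_ORDER groups the results row by row instead of group by group.
preg_match_all($pattern, $data, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    // Named groups are available as string keys on each row.
    printf("%s (level %d, crew %s, id %d)\n",
        $row['Name'], $row['Level'], $row['Crew'], $row['CharacterId']);
}
```

    Each `$row` also carries the numeric keys 0 to 4, so the named keys are a convenience rather than a requirement.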

