
Thread: XPath Values

  1. #1
    Regular Coder
    Join Date
    Jul 2007
    Location
    Scotland
    Posts
    134
    Thanks
    12
    Thanked 0 Times in 0 Posts

    XPath Values

    Hi Guys,

    I'm having trouble extracting certain pieces of data from a webpage using DOM.

    HTML:

    PHP Code:
    <div align="center" id="content"><link href="css/style.css" rel="stylesheet" type="text/css"><style type="text/css">
    .style1hhh {color: #FF0000}
    </style><acronym title="Affiliate Info: Pays  % on Level1  Seller Accepts   PAYPAL "><table width="95%" border="0" align="center" cellpadding="0" cellspacing="3"><tr align="left"><td colspan="2" class="acat" >How to produce methane gas from manure <font color="#567faf"> $35.00</font></td>
    </tr><tr align="left"><td width="26">&nbsp;</td>
    <td class="subtitle_s"><em><font color="#333333"><span>Produce methane gas at home for cooking, heating, and making electricity.</span></font></em></td>
    </tr><tr align="left"><td>&nbsp;</td>
    <td><a href=a.page.php?id=41008&u=sspence >Promote</a> |  <a href=http://site.com/r/41008/XXXXX/ target=_blank>Visit site</a><acronym> | [ APS: <span onClick="window.open('aps.php','','width=500, height=300');" style="color:#0000FF; text-decoration:underline; cursor:pointer;">0.81</span>* ]</acronym></td>
    </tr><tr><td colspan="2"><hr size="1" noshade></td></tr></table></acronym>


    <link href="css/style.css" rel="stylesheet" type="text/css"><style type="text/css">
    .style1hhh {color: #FF0000}
    </style><acronym title="Affiliate Info: Pays  % on Level1  Seller Accepts   PAYPAL "><table width="95%" border="0" align="center" cellpadding="0" cellspacing="3"><tr align="left"><td colspan="2" class="acat" >Profitable Recipes e-Book Package <font color="#567faf"> $7.00</font></td>
    </tr><tr align="left"><td width="26">&nbsp;</td>
    <td class="subtitle_s"><em><font color="#333333"><span class="subtitle_s" onmouseover="DivSetVisible(true,'description2', 500);" onmouseout="DivSetVisible(false, 'description2', 500);">Instantly OWN Master Resale Rights To The Hottest 100 Profitable Cooking E-books On The Web. Every item in this monster collection  comes complete with individual sales pages. Am...</span><div id='description2' style='position:absolute; width:500px; padding:4px; display:none; z-index:100; font-family:Verdana, Arial, Helvetica, sans-serif; font-size:13px;font-weight:normal;' class="cattable">Instantly OWN Master Resale Rights To The Hottest 100 Profitable Cooking E-books On The Web. Every item in this monster collection  comes complete with individual sales pages. Amazing Collection of Fast Selling cooking e-books That People Will Be Literally Throwing Money At You To Buy From Your Web Site.</div></font></em></td>
    </tr><tr align="left"><td>&nbsp;</td>
    <td><a href=a.page.php?id=80024&u=revenue >Promote</a> |  <a href=http://site.com/r/80024/XXXXX/ target=_blank>Visit site</a><acronym> | [ APS: <span onClick="window.open('aps.php','','width=500, height=300');" style="color:#0000FF; text-decoration:underline; cursor:pointer;">0.59</span>* ]</acronym></td>
    </tr><tr><td colspan="2"><hr size="1" noshade></td></tr></table></acronym>
    <link href="css/style.css" rel="stylesheet" type="text/css"><style type="text/css">
    .style1hhh {color: #FF0000}
    </style><acronym title="Affiliate Info: Pays  % on Level1  Seller Accepts   PAYPAL "><table width="95%" border="0" align="center" cellpadding="0" cellspacing="3"><tr align="left"><td colspan="2" class="acat" >Cookin' Kids <font color="#567faf"> $17.00</font></td>
    </tr><tr align="left"><td width="26">&nbsp;</td>
    <td class="subtitle_s"><em><font color="#333333"><span class="subtitle_s" onmouseover="DivSetVisible(true,'description3', 500);" onmouseout="DivSetVisible(false, 'description3', 500);">Cookin' Kids ebook is for kids who like to cook! Very original and unique ebook with themes, recipes, fun facts, games, jokes, cooking definitions, safety info, and more.  It also ...</span><div id='description3' style='position:absolute; width:500px; padding:4px; display:none; z-index:100; font-family:Verdana, Arial, Helvetica, sans-serif; font-size:13px;font-weight:normal;' class="cattable">Cookin' Kids ebook is for kids who like to cook! Very original and unique ebook with themes, recipes, fun facts, games, jokes, cooking definitions, safety info, and more.  It also makes a great present for your favorite kid!</div></font></em></td>
    </tr><tr align="left"><td>&nbsp;</td>
    <td><a href=a.page.php?id=77957&u=Margret >Promote</a> |  <a href=http://site.com/r/77957/XXXXX/ target=_blank>Visit site</a><acronym> | [ APS: <span onClick="window.open('aps.php','','width=500, height=300');" style="color:#0000FF; text-decoration:underline; cursor:pointer;">0.15</span>* ]</acronym></td>
    </tr><tr><td colspan="2"><hr size="1" noshade></td></tr></table></acronym>
    <link href="css/style.css" rel="stylesheet" type="text/css"><style type="text/css">
    .style1hhh {color: #FF0000}
    </style><acronym title="Affiliate Info: Pays  % on Level1  Seller Accepts   PAYPAL "><table width="95%" border="0" align="center" cellpadding="0" cellspacing="3"><tr align="left"><td colspan="2" class="acat" >Guide to Organic Cooking! - The Healthy Way of Living! - eBook only <font color="#567faf"> $19.97</font></td>
    </tr><tr align="left"><td width="26">&nbsp;</td>
    <td class="subtitle_s"><em><font color="#333333"><span class="subtitle_s" onmouseover="DivSetVisible(true,'description4', 500);" onmouseout="DivSetVisible(false, 'description4', 500);">Pays 70% - Health, Hobby and Fitness Guide about Organic Cooking including Shopping and Gardening Tips and Recipes. If you want to cook and eat healthier and do your part to protec...</span><div id='description4' style='position:absolute; width:500px; padding:4px; display:none; z-index:100; font-family:Verdana, Arial, Helvetica, sans-serif; font-size:13px;font-weight:normal;' class="cattable">Pays 70% - Health, Hobby and Fitness Guide about Organic Cooking including Shopping and Gardening Tips and Recipes. If you want to cook and eat healthier and do your part to protect your family and help the environment... or you are interested in growing your own organic foods in your garden... then this eBook was written just for you. High quality 98 page PDF eBook for immediate download.</div></font></em></td>
    </tr><tr align="left"><td>&nbsp;</td>
    <td><a href=a.page.php?id=57416&u=dts >Promote</a> |  <a href=http://site.com/r/57416/XXXXX/ target=_blank>Visit site</a><acronym> | [ APS: <span onClick="window.open('aps.php','','width=500, height=300');" style="color:#0000FF; text-decoration:underline; cursor:pointer;">0.04</span>* ]</acronym></td>
    </tr><tr><td colspan="2"><hr size="1" noshade></td></tr></table></acronym>
    </div>

    So far I have:

    PHP Code:
    // Parse the HTML into a DOMDocument
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $xpath = new DOMXPath($dom);
    $results = $xpath->query("//*[@class='acat']");

    //$results = $xpath->getElementsByTagName('a');
    // /html/body/div[3]/center/table/tbody/tr/td/table/tbody/tr/td[2]/div[2]/table/tbody/tr/td
    // /html/body/div[3]/center/table/tbody/tr/td/table/tbody/tr/td[2]/div[2]/table[2]/tbody/tr/td
    // /html/body/div[3]/center/table/tbody/tr/td/table/tbody/tr/td[2]/div[2]/table[3]/tbody/tr/td
    //$results = $xpath->query("/html/body/div/center/table/tr/td/table/tr/td/div[@id='content']/table/tr/td");

    foreach ($results as $result) {
        // Title
        $title = $result->nodeValue;

        print $title;
        print "<br /><br />";
    }



    If I change $results = $xpath->query("//*[@class='acat']"); to $results = $xpath->query("//*[@class='subtitle_s']"); I get a different set of nodes.

    The first query returns the titles (which is correct), and the second query returns the descriptions (also correct).

    I can't seem to retrieve both at the same time.

    Any help would be appreciated.

    Thanks guys,

    Graham

  • #2
    Senior Coder Dormilich's Avatar
    Join Date
    Jan 2010
    Location
    Behind the Wall
    Posts
    3,469
    Thanks
    13
    Thanked 361 Times in 357 Posts
    That’s right, you can’t: an XPath query returns a single list containing all matches, and you process that list item by item, so each of your queries gives you only one kind of node.
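
    One way around this, sketched below under some assumptions, is to loop over the title cells and run further queries relative to each one, so the title and description of the same product stay together. The sample markup is a trimmed, hypothetical version of the HTML in post #1, and the $items array and its keys are names made up for this illustration. The second argument to DOMXPath::query() and DOMXPath::evaluate() is the context node that a relative expression starts from.

```php
<?php
// Trimmed, hypothetical sample modelled on the markup in post #1.
$html = <<<'HTML'
<div id="content">
<table><tr><td colspan="2" class="acat">How to produce methane gas from manure <font color="#567faf"> $35.00</font></td></tr>
<tr><td class="subtitle_s"><em><span>Produce methane gas at home.</span></em></td></tr></table>
<table><tr><td colspan="2" class="acat">Cookin' Kids <font color="#567faf"> $17.00</font></td></tr>
<tr><td class="subtitle_s"><em><span class="subtitle_s">Cookin' Kids ebook is for kids who like to cook!</span></em></td></tr></table>
</div>
HTML;

// The real page is not valid HTML, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$items = [];

// One iteration per product: start from each title cell.
foreach ($xpath->query("//td[@class='acat']") as $titleCell) {
    // Go back up the tree to the nearest enclosing table.
    $table = $xpath->query("ancestor::table[1]", $titleCell)->item(0);

    // normalize-space() collapses runs of whitespace and returns a plain string.
    $title = $xpath->evaluate("normalize-space(.)", $titleCell);

    // The leading '.' keeps this search inside the current product's table.
    $description = $xpath->evaluate(
        "normalize-space(.//td[@class='subtitle_s']//span)",
        $table
    );

    $items[] = ['title' => $title, 'description' => $description];
}

foreach ($items as $item) {
    print $item['title'] . ': ' . $item['description'] . "<br /><br />\n";
}
```

    Note that the title cell also contains the price in a font element, so the title string here includes it. If a product has no matching description, evaluate() gives back an empty string rather than failing.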
    The computer is always right. The computer is always right. The computer is always right. Take it from someone who has programmed for over ten years: not once has the computational mechanism of the machine malfunctioned.
    André Behrens, NY Times Software Developer

  • #3
    Regular Coder
    Join Date
    Jul 2007
    Location
    Scotland
    Posts
    134
    Thanks
    12
    Thanked 0 Times in 0 Posts
    Hi Dorm,

    Ah, because I specify exactly what to look for with $xpath->query("//*[@class='acat']"); it returns only those node values, so I would need to go back up the tree a bit? I have tried finding the exact XPath with Firebug, but from what I've read, a lot of people say it gives you the wrong path and adds a tbody tag. Is there a fairly accurate way to find the correct XPaths?

    cheers mate

    Graham
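
    On the tbody question: Firebug shows the tree the browser built, and browsers insert a tbody element into tables that lack one. DOMDocument builds its own tree through libxml2, which as far as I know does not add tbody, so a path copied from the browser may not match. A rough sketch of one alternative is to ask PHP for the path of a node it has already found, using DOMNode::getNodePath(). The sample markup and the $paths array are made up for this illustration.

```php
<?php
// Trimmed, hypothetical sample: two product tables without tbody elements.
$html = <<<'HTML'
<div id="content">
<table><tr><td class="acat">Cookin' Kids</td></tr></table>
<table><tr><td class="acat">Profitable Recipes e-Book Package</td></tr></table>
</div>
HTML;

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$paths = [];

// Find the nodes with a loose query first, then ask each node where it lives.
foreach ($xpath->query("//td[@class='acat']") as $cell) {
    // getNodePath() describes the node's position in the tree PHP actually parsed.
    $paths[] = $cell->getNodePath();
}

foreach ($paths as $path) {
    // Feeding the path back into query() should find exactly that one node again.
    $found = $xpath->query($path);
    print $path . ' => ' . $found->length . " match(es)<br />\n";
}
```

    The printed paths come from PHP's own view of the document, so they can be compared with what Firebug reports to see where the two trees differ.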

