  1. #1
    Senior Coder
    Join Date
    May 2006
    Posts
    1,680
    Thanks
    28
    Thanked 4 Times in 4 Posts

    Using php - simplexml Some help needed.

    Hi,

    I am looking at a download from clickbank and I notice that
    it has two files: a very small one with a .dtd suffix, which I
    list below, and a huge 26 MB file with an .xml suffix.

    Here is the .dtd


    <!ELEMENT Catalog ( Category* ) >
    <!ELEMENT Category ( Name, Site*, Category* ) >
    <!ELEMENT Commission ( #PCDATA ) >
    <!ELEMENT Description ( #PCDATA ) >
    <!ELEMENT EarnedPerSale ( #PCDATA ) >
    <!ELEMENT TotalEarningsPerSale ( #PCDATA ) >
    <!ELEMENT TotalRebillAmt ( #PCDATA ) >
    <!ELEMENT HasRecurringProducts ( #PCDATA ) >
    <!ELEMENT Gravity ( #PCDATA ) >
    <!ELEMENT Id ( #PCDATA ) >
    <!ELEMENT Name ( #PCDATA ) >
    <!ELEMENT PercentPerSale ( #PCDATA ) >
    <!ELEMENT PopularityRank ( #PCDATA ) >
    <!ELEMENT Referred ( #PCDATA ) >
    <!ELEMENT Site ( Commission? | Description+ | EarnedPerSale? | TotalEarningsPerSale? | TotalRebillAmt? | Gravity? | Id+ | PercentPerSale? | PopularityRank+ | Referred? | Title+ | HasRecurringProducts )* >
    <!ELEMENT Title ( #PCDATA ) >


    And here are the first few lines of the .xml file.


    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE Catalog SYSTEM "marketplace_feed_v1.dtd">
    <Catalog>
    <Category>
    <Name>Business to Business</Name>
    <Site>
    <Id>REGEASY</Id>
    <PopularityRank>1</PopularityRank>
    <Title><![CDATA[Registry Easy - #1 Converting Registry Cleaner & System Optimizer.]]></Title>
    <Description><![CDATA[Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.]]></Description>
    <HasRecurringProducts>false</HasRecurringProducts>
    <Gravity>226.333</Gravity>
    <EarnedPerSale>31.7204</EarnedPerSale>
    <PercentPerSale>75.0</PercentPerSale>
    <TotalEarningsPerSale>31.7204</TotalEarningsPerSale>
    <TotalRebillAmt>0.0</TotalRebillAmt>
    <Referred>68.0</Referred>
    <Commission>75</Commission>
    </Site>
    <Site>
    <Id>BRYXEN4</Id>
    <PopularityRank>2</PopularityRank>
    <Title><![CDATA[Keyword Elite 2.0: The New Generation Of Keyword Research Software!]]></Title>
    <Description><![CDATA[Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.]]></Description>
    <HasRecurringProducts>true</HasRecurringProducts>
    <Gravity>229.6</Gravity>
    <EarnedPerSale>65.1052</EarnedPerSale>
    <PercentPerSale>48.0</PercentPerSale>
    <TotalEarningsPerSale>74.1738</TotalEarningsPerSale>
    <TotalRebillAmt>15.2186</TotalRebillAmt>
    <Referred>79.0</Referred>
    <Commission>50</Commission>
    </Site>


    Ok - so that shows the header info and the first two records of data.

    Now, the DOCTYPE line in the header info refers to the .dtd file.

    I could use the info in the .dtd file to create a table
    with the columns (fields) that it lists.

    Or I could just create the table structure from looking at the first few
    records in the xml file that I have shown.

    Once I have done that, I guess that I write a php script
    to open the file, step through each row and pull out the contents found between the tags.

    As it finds each tag it can locate the contents and update the table.

    So:
    PHP Code:
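    $CB_file = file('clickbank.xml');

    for ($i = 0; $i < count($CB_file); $i++) {

        // split the current line of the file (delimiter still to be decided)
        $arrayOfLine = explode('???', $CB_file[$i]);

        // column and value still to be filled in
        $sql = "UPDATE cbdb SET ????? = ??????";

        $result = mysql_query($sql) or die("could not update CBDB: " . mysql_error());
    }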
    Yes, I know that I have a lot of gaps to fill in

    But my question is: can this approach work with an
    xml file of 28 Mb? And, based on the files that I have,
    can you please help me fill in the gaps?

    PS I have JUST READ UP ON SIMPLEXML see below

    Thanks for any input and help.
    Last edited by jeddi; 10-10-2009 at 01:12 PM.

  • #2
    Senior Coder
    Join Date
    May 2006
    Posts
    1,680
    Thanks
    28
    Thanked 4 Times in 4 Posts
    Hi,

    Someone has suggested using "simplexml"

    I have read it up and as far as I can tell
    this is what I need to do:

    Code:
    $CB_file = file('clickbank.xml');
    
    $xmlstr ="<<<XML ".$CB_file." XML";
    
    // ( do I have to *** line breaks at all ? )
    
    // Then I continue to check validity:
    
    $xmlObject = simplexml_load_string($xmlstr); 
    
    // not sure about how I go to this next line
    
    $xml = new SimpleXMLElement($xmlstr);
    After this I guess that I need a foreach loop to work through the
    whole file ?

  • #3
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    Code:
    $CB_file = file('clickbank.xml');
    $CB_file is an array, not a string. Read the documentation for file:
    http://www.php.net/manual/en/function.file.php

    Code:
    $xmlstr ="<<<XML ".$CB_file." XML";
    This line is incorrect, so I will talk about the intention: it is meant to build a string with the content of $CB_file, which as I said is an array, not a string.

    Use one of these if you want to build $xmlstr:
    Code:
    $xmlstr = join($CB_file);
    $xmlstr = implode($CB_file); // same thing as previous
    $xmlstr = file_get_contents('clickbank.xml');
    Read the manual for join, implode and file_get_contents (the last one needs only a single step):
    http://www.php.net/manual/en/function.join.php
    http://www.php.net/manual/en/function.implode.php
    http://www.php.net/manual/en/functio...t-contents.php

    Code:
    // ( do I have to *** line breaks at all ? )
    only to be easy to read for you

    Code:
    // Then I continue to check validity:
    
    $xmlObject = simplexml_load_string($xmlstr);
    There is also simplexml_load_file, which lets you skip the previous steps (they are unnecessary, in my opinion):
    http://www.php.net/manual/en/functio...-load-file.php

    Code:
    // not sure about how I go to this next line
    
    $xml = new SimpleXMLElement($xmlstr);
    The XML is already loaded at this point; see the $xmlObject line.

    Always keep the manual close at hand, that's important.
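
As a rough illustration of the points above, here is a minimal sketch that parses a small inline sample with simplexml_load_string and walks it with two nested foreach loops. The sample XML is a cut-down, hypothetical excerpt in the shape of the feed from the first post, and the `$rows` array and its keys are names chosen only for this example. For the real file, simplexml_load_file('clickbank.xml') would take the place of the inline string.

```php
<?php
// Cut-down sample in the shape of the feed (assumption: real records
// carry more fields, and categories may be nested).
$xmlstr = '<Catalog>
  <Category>
    <Name>Business to Business</Name>
    <Site>
      <Id>REGEASY</Id>
      <PopularityRank>1</PopularityRank>
    </Site>
    <Site>
      <Id>BRYXEN4</Id>
      <PopularityRank>2</PopularityRank>
    </Site>
  </Category>
</Catalog>';

$catalog = simplexml_load_string($xmlstr);
if ($catalog === false) {
    die('could not parse XML');
}

$rows = array();
foreach ($catalog->Category as $category) {
    foreach ($category->Site as $site) {
        // Cast each node to a plain PHP value before storing it.
        $rows[] = array(
            'cat' => (string) $category->Name,
            'id'  => (string) $site->Id,
            'pop' => (int) (string) $site->PopularityRank,
        );
    }
}

print_r($rows);
```

Because foreach visits every Category and every Site in turn, no manual counters are needed in this sketch.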

    best regards

  • #4
    Senior Coder
    Join Date
    May 2006
    Posts
    1,680
    Thanks
    28
    Thanked 4 Times in 4 Posts
    Thanks, I read the manual and a couple of tutes.

    I now have something close to working

    But I get an error on trying to write to the database:
    it may be because I need to convert the data ?
    PHP Code:
    $sql = "INSERT INTO `clickbank` (cat,id,pop)
    VALUES ('$category->Name','$site->Id','$site->PopularityRank')";

    I noticed in the tute it said something that might apply:

    It gave this example:


    PHP Code:
    $xml = 'test_file.xml';
    $xml = simplexml_load_file($xml);
    $value_to_store = (string) $xml->make[0]->model;
    // This converts the "Mustang" SimpleXMLElement object to a string, making it disk storable.
    Does this mean that I have to do this:
    PHP Code:
    $Db_id = (string) $xml->Category->Site->Id;
    for each field?

    And is this enough? Or do I need to add counters to keep track of which row is being processed and then use something like:

    PHP Code:
    $Db_id = (string) $xml->Category[$cnt1]->Site->Id;

    The error message I get from the script is :

    could not execute INSERT set up clients.You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc,recurr,grav,earn,percent,totearn,rebill,refer,comm) VALUES ' at line 1
    Your advice is appreciated

  • #5
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    Quote Originally Posted by jeddi View Post
    Thanks, I read the manual and a couple of tutes.

    I now have something close to working

    But I get an error on trying to write to the database:
    it may be because I need to convert the data ?
    PHP Code:
    $sql = "INSERT INTO `clickbank` (cat,id,pop)
    VALUES ('$category->Name','$site->Id','$site->PopularityRank')";

    Use var_dump to see what's in each variable. My guess is that each one is an array, so you must extract only the values you need.
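
A small sketch of what such an inspection would show, assuming the variables come from SimpleXML as in the earlier posts: each child node is a SimpleXMLElement object rather than an array or a string, and an explicit cast gives a plain value. The one-record XML string here is a hypothetical stand-in for a Site entry from the feed.

```php
<?php
// Hypothetical single record in the shape of a <Site> entry.
$site = simplexml_load_string(
    '<Site><Id>REGEASY</Id><Gravity>226.333</Gravity></Site>'
);

// The child node is an object, not a string or an array.
var_dump($site->Id instanceof SimpleXMLElement); // bool(true)

// Explicit casts turn the nodes into plain PHP values.
$id      = (string) $site->Id;
$gravity = (float) (string) $site->Gravity;

var_dump($id); // string(7) "REGEASY"
var_dump(is_float($gravity)); // bool(true)
```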

    And is this enough? Or do I need to add counters to keep track of which row is being processed and then use something like:

    PHP Code:
    $Db_id = (string) $xml->Category[$cnt1]->Site->Id;
    Yes, something like that; $cnt1 is the position of the Category node in the tree.

    The error message I get from the script is :

    Your advice is appreciated
    Check my assumption that these are arrays, and also make sure you have a valid SQL query. Print the query before you use it by adding something like this right after the $sql line:
    PHP Code:
    print '<pre>'.$sql.'</pre>';
    best regards

  • #6
    Senior Coder
    Join Date
    May 2006
    Posts
    1,680
    Thanks
    28
    Thanked 4 Times in 4 Posts
    OK - that was a good idea

    This is the output I get:

    INSERT INTO clickbank (cat,id,pop,title,desc,recurr,grav,earn,percent,totearn,rebill,refer,comm)
    VALUES
    ('Business to Business','REGEASY','1','Registry Easy - #1 Converting Registry Cleaner & System Optimizer.','Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! ww.cheesesoft.com/affiliates/registry-easy/.','false','226.333','31.7204','75.0','31.7204','0.0','68.0','75')

    Looks like the values are getting through fine.
    Could the problem be that grav is set up in the table as:

    double(5,2) maybe the 31.7204 doesn't fit.

    Actually I don't understand that number, is it supposed to be 31720.40 dollars or 317,204 dollars ? Or only 31.72 dollars ?

    Or maybe it is something in that long description ?

    PS I had to edit the url that was in the desc because it got messed up in this forum post.

    it was
    Last edited by jeddi; 10-10-2009 at 07:53 PM.

  • #7
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    from http://dev.mysql.com/doc/refman/5.0/...-overview.html:
    DOUBLE[(M,D)]

    M is the total number of digits and D is the number of digits following the decimal point. If M and D are omitted, values are stored to the limits allowed by the hardware.
    I don't know. Judging by the error message in your previous post, the problem seems to be with the value for the cat field or with the left round bracket '(' after 'values'.
    Check the types of the columns and whether the values are of the same types.
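
To make the DOUBLE(M,D) definition above concrete, here is a rough PHP sketch. It only mimics the rule from the manual excerpt (M total digits, D of them after the decimal point); it does not talk to MySQL, and `storedAs` and `fitsDouble` are hypothetical helper names invented for this illustration.

```php
<?php
// Approximates how a value would look once kept to $d decimal places.
function storedAs($value, $d)
{
    return number_format(round($value, $d), $d, '.', '');
}

// True when the rounded value needs at most $m digits in total,
// $d of them after the decimal point.
function fitsDouble($value, $m, $d)
{
    return abs(round($value, $d)) < pow(10, $m - $d);
}

echo storedAs(31.7204, 2), "\n";  // 31.72
echo storedAs(226.333, 2), "\n";  // 226.33

var_dump(fitsDouble(226.333, 5, 2));  // bool(true)
var_dump(fitsDouble(31720.40, 5, 2)); // bool(false)
```

Under this reading, 31.7204 is simply thirty-one point seven two zero four, and both it and 226.333 fit a (5,2) column once cut to two decimals.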

    best regards

  • #8
    Senior Coder
    Join Date
    May 2006
    Posts
    1,680
    Thanks
    28
    Thanked 4 Times in 4 Posts
    OK - I think I have found the problem

    I think it was the field name "desc", because DESC is a reserved word used in the ORDER BY part of sql.

    It was a guess, but when I changed the name to "descrip" the first three rows got processed OK
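
Renaming the column works. Another option, sketched below, is to wrap every column name in backticks, which MySQL accepts even for reserved words such as desc. The `quoteIdentifier` helper and the shortened column list are assumptions made for this example, and the `?` placeholders stand in for the values.

```php
<?php
// Hypothetical helper: wraps a column name in backticks, doubling any
// backtick inside the name so the name cannot break out of the quoting.
function quoteIdentifier($name)
{
    return '`' . str_replace('`', '``', $name) . '`';
}

// Shortened column list, including the reserved word from the thread.
$columns = array('cat', 'id', 'pop', 'title', 'desc');

$columnList   = implode(', ', array_map('quoteIdentifier', $columns));
$placeholders = implode(', ', array_fill(0, count($columns), '?'));

$sql = "INSERT INTO `clickbank` ($columnList) VALUES ($placeholders)";

echo $sql, "\n";
```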

    This is my table structure now:
    PHP Code:

    $sql = "CREATE TABLE `clickbank` (
        `cb_id` smallint(8) NOT NULL AUTO_INCREMENT,
        `id` varchar(10) NOT NULL default 'none',
        `cat` varchar(50) NOT NULL default 'none',
        `pop` smallint(8) NOT NULL default '1',
        `title` varchar(100) NOT NULL default 'n',
        `descrip` varchar(300) NOT NULL default 'n',
        `recurr` char(1) NOT NULL default 'n',
        `grav` double(10,2) NOT NULL default '99.99',
        `earn` double(10,2) NOT NULL default '99.99',
        `percent` double(5,2) NOT NULL default '99.99',
        `totearn` double(10,2) NOT NULL default '99.99',
        `rebill` double(10,2) NOT NULL default '99.99',
        `refer` double(10,2) NOT NULL default '99.99',
        `comm` double(5,2) NOT NULL default '99.99',
        PRIMARY KEY (`cb_id`)
    )";
    I still have a problem, and it seems to be caused by single quotes in the description data.

    This is my output:
    0) Business to Business
    0) REGEASY1Registry Easy - #1 Converting Registry Cleaner & System Optimizer.Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates....72040.068.075

    INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn, rebill, refer, comm )
    VALUES ( 'Business to Business', 'REGEASY', '1', 'Registry Easy - #1 Converting Registry Cleaner & System Optimizer.', 'Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.', 'false',
    '226.333', '31.7204', '75.0', '31.7204', '0.0', '68.0', '75' )

    1) BRYXEN42Keyword Elite 2.0: The New Generation Of Keyword Research Software!Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliat...815.218679.050

    INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn, rebill, refer, comm )
    VALUES ( 'Business to Business', 'BRYXEN4', '2', 'Keyword Elite 2.0: The New Generation Of Keyword Research Software!', 'Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.', 'true', '229.6', '65.1052', '48.0', '74.1738', '15.2186', '79.0', '50' )

    2) MAVERICKCO3Maverick Coaching - Cell Phone Cash.Cell Phone Cash: A Brand New Course By Maverick Coaching Members Are Making At Least $279/Day With Cell Phones! Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: Http://cellphonecash.maverickcoachin...212.563486.050

    INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn,rebill, refer, comm )
    VALUES ( 'Business to Business', 'MAVERICKCO', '3', 'Maverick Coaching - Cell Phone Cash.', 'Cell Phone Cash: A Brand New Course By Maverick Coaching Members Are Making At Least $279/Day With Cell Phones! Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: ttp://cellphonecash.maverickcoaching.com/affiliates.php.', 'true',
    '674.459', '12.9828', '50.0', '25.5462', '12.5634', '86.0', '50' )

    could not execute INSERT set up clients.You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: Http://cellph' at line 4

  • #9
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    Code:
    'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: ttp://cellphonecash.maverickcoaching.com/affiliates.php.'
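    The single quote (') inside the description is the problem there: it ends the SQL string early, so the rest of the text is read as SQL. Escape the value before you put it in the query: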
    http://www.php.net/manual/en/functio...ape-string.php
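
Besides escaping each value by hand, a prepared statement avoids the issue altogether because the values never become part of the SQL text. The sketch below is only an illustration under loud assumptions: it uses PDO instead of the mysql_* calls from the thread, and an in-memory SQLite database (which needs the pdo_sqlite extension) stands in for the MySQL table so that the example is self-contained. With MySQL, only the connection string passed to PDO would differ.

```php
<?php
// Assumption: pdo_sqlite is available; the in-memory SQLite database is a
// stand-in for the MySQL table from the thread.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE clickbank (id TEXT, descrip TEXT)');

// Part of the description that broke the hand-built INSERT.
$descrip = "Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support!";

// The ? placeholders keep the data separate from the SQL text, so the
// single quotes in the description need no manual escaping.
$stmt = $pdo->prepare('INSERT INTO clickbank (id, descrip) VALUES (?, ?)');
$stmt->execute(array('MAVERICKCO', $descrip));

$stored = $pdo->query('SELECT descrip FROM clickbank')->fetchColumn();
var_dump($stored === $descrip); // bool(true)
```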

    best regards

