  1. #1
    mlse (Regular Coder)

    Can't parse PHP using DOM or the older xml parser functions!

    Hi all,

    I am trying to read large PHP files into XML documents and having no luck!

    Both the XML Parser functions and the DOM functions throw errors when the PI data contains characters that are illegal in XML text (e.g. "&"), or even quoted XML fragments (e.g. "<?xml"). I am aware of the limitation that "?>" cannot appear inside quoted strings, but the manual doesn't say that other tags can't be quoted!

    I thought the PI data handler in either case was supposed to be able to deal with this kind of thing!

    I can get round this with a few hacks (i.e. manually chopping out and then re-inserting everything within <?php ?> tags), but is there a way to force the XML functionality to behave properly without using CDATA? (Which I shouldn't have to use anyway if I have a PI handler registered!).
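The chop-out-and-re-insert workaround described above could look roughly like the following. This is only a sketch: the helper names, the `<phpblock>` placeholder element and the sample `<page>` document are all hypothetical, not taken from the original post. The non-greedy pattern stops at the first closing PHP tag it sees, so it shares the limitation already mentioned (a closing tag inside a quoted string would end the block early).

```php
<?php
// Hypothetical helpers sketching the workaround; all names are illustrative.

// Swap each PHP block for a numbered placeholder element and remember the original text.
function chopPhpBlocks(string $source, array &$blocks): string
{
    return preg_replace_callback(
        '/<\?php.*?\?>/s',
        function (array $match) use (&$blocks): string {
            $blocks[] = $match[0];
            return '<phpblock id="' . (count($blocks) - 1) . '"/>';
        },
        $source
    );
}

// Put the remembered PHP blocks back in place of their placeholders.
function restorePhpBlocks(string $xml, array $blocks): string
{
    return preg_replace_callback(
        '/<phpblock id="(\d+)"\/>/',
        function (array $match) use ($blocks): string {
            return $blocks[(int) $match[1]];
        },
        $xml
    );
}

// Made-up sample input: an XML page with one embedded PHP block.
$source  = '<page><?php $ref =& $string; echo "<b>"; ?><title>Hi</title></page>';
$blocks  = array();
$chopped = chopPhpBlocks($source, $blocks);

// With the PHP removed, the remaining markup is ordinary XML.
$doc    = new DOMDocument();
$parsed = $doc->loadXML($chopped);

// After any XML processing, the PHP blocks can be put back.
$restored = restorePhpBlocks($chopped, $blocks);
```

Using an element rather than a comment as the placeholder keeps the marker visible to both the DOM and the SAX-style element handlers, which may or may not be what you want.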
    Last edited by mlse; 02-18-2008 at 07:19 PM.

  • #2
    Regular Coder
    Can you post the XML you are using, and the PHP you are using to read it?

  • #3
    mlse (Regular Coder)
    Hi,

    Yep, there are two blocks of code that I've tried.

    Firstly, here's the PHP to be read in by the PI handler:

    PHP Code:
    <?php

    $string = "Hello World!";
    $ref =& $string;

    echo ('<?xml version="1.0" encoding="utf-8" ?'.'>'.
          '<mydocument>'.
          '  <mytag />'.
          '</mydocument>');
    ?>
    And here's the code:

    DOM Built-in (filename domparser.php):
    PHP Code:
    $doc = new DOMDocument();
    $doc->loadXML(file_get_contents("tobeparsed.php"));

    This generates the following warning:

    Warning: DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, line: 1 in domparser.php on line 2
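One possible reading of this warning, assuming tobeparsed.php contains nothing but the single PHP block shown above: the parser treats the whole file as one processing instruction and then finds no document element, which every well-formed XML document needs. The sketch below inlines the file contents as a string and wraps them in a hypothetical `<wrapper>` root element (the name is an illustration, not something from the original post) before handing them to DOMDocument, then reads the PI back.

```php
<?php
// Stand-in for file_get_contents("tobeparsed.php"), inlined so the sketch is self-contained.
$source = <<<'SRC'
<?php
$string = "Hello World!";
$ref =& $string;
echo ('<?xml version="1.0" encoding="utf-8" ?'.'>'.
      '<mydocument>'.
      '  <mytag />'.
      '</mydocument>');
?>
SRC;

$doc = new DOMDocument();

// "wrapper" is a hypothetical root element name chosen for this sketch.
$loaded = $doc->loadXML('<wrapper>' . $source . '</wrapper>');

// The PHP block should now be the first child of the root, as a PI node.
$pi = $loaded ? $doc->documentElement->firstChild : null;

if ($pi instanceof DOMProcessingInstruction) {
    echo "PI target: {$pi->target}\n";
    echo "PI data length: " . strlen($pi->data) . "\n";
} else {
    echo "Document did not load as expected\n";
}
```

If this reading is right, the "&" and the quoted "<?xml" inside the PI are not what the DOM parser is objecting to; the missing root element is.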

    The other way uses the XML functions and registers a PI handler with an instantiated xml parser resource. Here's the gist of it:

    PHP Code:
        $xparser = xml_parser_create();

        xml_parser_set_option($xparser, XML_OPTION_CASE_FOLDING, FALSE);

        // Register the handlers, collecting the return status of each call.
        $handstat = array();
        $handstat[] = xml_set_object($xparser, $this);
        $handstat[] = xml_set_element_handler($xparser, "tag_start", "tag_end");
        $handstat[] = xml_set_character_data_handler($xparser, "tag_data");
        $handstat[] = xml_set_default_handler($xparser, "tag_default");
        $handstat[] = xml_set_processing_instruction_handler($xparser, "tag_pi");
        $handstat[] = xml_set_external_entity_ref_handler($xparser, "tag_entref");
        $handstat[] = xml_set_notation_decl_handler($xparser, "tag_notdec");

        // Bail out if any registration failed.
        foreach ($handstat as $retn)
          {
            if ($retn === FALSE)
              {
                xml_parser_free($xparser);
                throw new Exception("Handler registration failure: ".implode(":", $handstat));
              }
          }

        $status = xml_parse($xparser, $xmlstr, TRUE);

        if ($status === 0)
          $errmsg = ("XML: error in file '".$this->m_uri."' at line ".xml_get_current_line_number($xparser).": ".xml_error_string(xml_get_error_code($xparser)));

        xml_parser_free($xparser);
    Now, with tobeparsed.php as it is, this aborts completely within xml_parse (i.e. the PI handler function is never called). However, if I use '<'.'?xml ... in the echo statement, the file is parsed correctly. The DOM method still throws a warning, though.

    I have got round this now with a couple of hacks, but it would be nice to get the internal calls to the PI handler(s) to work correctly!
    Last edited by mlse; 02-19-2008 at 12:02 PM.

