  1. #1
    Regular Coder
    Join Date
    Aug 2002
    Location
    Silicon Valley, CA
    Posts
    980
    Thanks
    0
    Thanked 0 Times in 0 Posts

    RSS Element Parsing Issue

    I'm working on developing an RSS/RDF/Atom Parser in JavaScript. I've already successfully implemented complete support for RSS 0.9x and 2.0. So far, so good. However, I've run into two minor problems. One is mentioned here, and one is in another post.

    If you have an RSS feed that looks something like this:

    Code:
    <?xml version="1.0" encoding="iso-8859-1"?>
    <rss version="2.0">
    
    	<channel> 
    		<title>Skyzyx.com</title>
    		<link>http://www.skyzyx.com/</link>
    		<description>Advocacy, Evangelism, and Propaganda for Standards-Compliant Design.</description>
    
    		<item>
    			<title>SpamMaster Joe</title>
    			<link>http://www.skyzyx.com/archives/000160.php</link>
    			<description>Joe Lieberman spammed me today! Well, kind of...</description>
    		</item>
    
    	</channel>
    </rss>
    Assuming that I've created a variable to represent the <channel> tag, I could use the following code to parse out the value of the first <description> tag:

    Figure 1:
    Code:
    RSSChannel.getElementsByTagName("description")[0].firstChild.data;
    For naming purposes, note that <item> has a child <description>. <item> is itself a child of <channel>, and so is its "brother", the channel-level <description>. We therefore have two <description> tags in an Uncle-Nephew relationship, and that is how I will refer to them.

    If the Uncle <description> exists (which it should, since it's a required element, but I digress), then the above code will parse out the value of the Uncle <description>. So far, so good.

    Keeping the same assumptions, the following code will parse out the value of the Nephew <description> tag:

    Figure 2:
    Code:
    RSSChannel.getElementsByTagName("item")[0].getElementsByTagName("description")[0].firstChild.data;
    Excellent! The problem is this: If the Uncle <description> tag is omitted for whatever reason by the creator of the RSS feed, then the code from Figure 1 will default to the Nephew <description> tag instead of returning false (which is what I've programmed the script to do if the sought-out element does not exist). This simply won't do.

    Is there any way (ideas welcome!) to restrict the lookup to an element's own "generation", so that it never picks up a nested element of the same name?
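
    To make the problem concrete, here is a minimal sketch that runs without a browser. It assumes plain objects as stand-ins for DOM nodes; makeNode, findDescendants, and findChildren are hypothetical names used only for illustration, and findDescendants merely imitates the any-depth search that getElementsByTagName performs.

```javascript
// Hypothetical stand-in for a DOM node: a plain object, not a real DOM.
function makeNode(nodeName, children, text) {
    return { nodeName: nodeName, childNodes: children || [], text: text || "" };
}

// Imitates getElementsByTagName: collects matches at ANY depth, in document order.
function findDescendants(oParent, sTagName, aResults) {
    aResults = aResults || [];
    for (var i = 0; i < oParent.childNodes.length; i++) {
        var oChild = oParent.childNodes[i];
        if (oChild.nodeName == sTagName) aResults.push(oChild);
        findDescendants(oChild, sTagName, aResults);
    }
    return aResults;
}

// Looks at direct children only, i.e. a single "generation".
function findChildren(oParent, sTagName) {
    var aResults = [];
    for (var i = 0; i < oParent.childNodes.length; i++) {
        if (oParent.childNodes[i].nodeName == sTagName) {
            aResults.push(oParent.childNodes[i]);
        }
    }
    return aResults;
}

// A channel whose own <description> (the Uncle) has been omitted.
var channel = makeNode("channel", [
    makeNode("title", [], "Skyzyx.com"),
    makeNode("item", [
        makeNode("description", [], "Joe Lieberman spammed me today! Well, kind of...")
    ])
]);

var deep = findDescendants(channel, "description");
var shallow = findChildren(channel, "description");

console.log(deep.length);    // 1: the Nephew is picked up by mistake
console.log(shallow.length); // 0: no channel-level <description> exists
```

    The any-depth search finds the Nephew even though the channel has no <description> of its own, which is exactly the behaviour I want to avoid.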

  • #2
    Master Coder
    Join Date
    Feb 2003
    Location
    Umeå, Sweden
    Posts
    5,575
    Thanks
    0
    Thanked 83 Times in 74 Posts
    Code:
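    // Returns an array of the direct children of oParent whose nodeName is sTagName.
    function getChildrenByTagName(oParent, sTagName){
        var
            i=0,
            aResults=[],
            oTemp;
        // item() returns null past the last child, which ends the loop.
        while((oTemp=oParent.childNodes.item(i++)))
            if(oTemp.nodeName==sTagName)
                aResults.push(oTemp);
        return aResults;
    }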
    You can also do it using DOM2 NodeFilters with either TreeWalkers or NodeIterators, or you can use DOM3 XPath, to create dynamic collections. The function above is far more browser compatible than those approaches, however.
    liorean <[lio@wg]>
    Articles: RegEx evolt wsabstract , Named Arguments
    Useful Threads: JavaScript Docs & Refs, FAQ - HTML & CSS Docs, FAQ - XML Doc & Refs
    Moz: JavaScript DOM Interfaces MSDN: JScript DHTML KDE: KJS KHTML Opera: Standards

  • #3
    Regular Coder
    Join Date
    Aug 2002
    Location
    Silicon Valley, CA
    Posts
    980
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Originally posted by liorean
    You can do it using DOM2 NodeFilters and either Treewalkers or NodeIterators, or you can use DOM3 XPath, to create dynamic collections. This is far more browser compatible, however.
    Eh?

    Thanks for the code. As far as that goes, would I be able to add this to the DOM with, say, a prototype, so that I can use it in the same manner as getElementById or getElementsByTagName?

    What would be the recommended syntax for utilizing this as-is?

  • #4
    Master Coder
    Join Date
    Feb 2003
    Location
    Umeå, Sweden
    Posts
    5,575
    Thanks
    0
    Thanked 83 Times in 74 Posts
    Because of inconsistent browser behavior, you shouldn't try to prototype it. Only Mozilla allows prototyping on all the main DOM objects. Opera allows prototyping only on the Node object (unless they have added more since I last tried, in Opera 7.02). I have no idea how Safari handles it. IE/Windows doesn't allow it at all, however.

    So, instead of calling it as a method, the old way,
    Code:
    [object DOMElement].getChildrenByTagName([string TagName]);
    you should use it as-is, as a global (or local) function:
    Code:
    getChildrenByTagName([object DOMElement], [string TagName]);
    It returns an array with the elements. (Note that it's an array, not a collection, so it will not be dynamically updated if you change the DOM, unlike the collections returned by the regular DOM methods.)
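
    As a rough sketch of how the function call might be wrapped to return false for a missing element, as the original question wanted: getChildText, fakeElement, and fakeText below are hypothetical names, and the fake nodes are plain objects that stand in for a real parsed feed so the example runs without a browser.

```javascript
// The function from post #2, repeated so this sketch is self-contained.
function getChildrenByTagName(oParent, sTagName) {
    var i = 0, aResults = [], oTemp;
    while ((oTemp = oParent.childNodes.item(i++)))
        if (oTemp.nodeName == sTagName)
            aResults.push(oTemp);
    return aResults;
}

// Hypothetical helper: text of the first direct child named sTagName,
// or false when no such child (or no text inside it) exists.
function getChildText(oParent, sTagName) {
    var aMatches = getChildrenByTagName(oParent, sTagName);
    if (aMatches.length == 0 || !aMatches[0].firstChild) return false;
    return aMatches[0].firstChild.data;
}

// Plain-object stand-ins for DOM nodes (assumption: not a real DOM).
function fakeElement(nodeName, aChildren) {
    return {
        nodeName: nodeName,
        firstChild: aChildren[0] || null,
        childNodes: { item: function (n) { return aChildren[n] || null; } }
    };
}
function fakeText(sData) {
    return {
        nodeName: "#text",
        data: sData,
        firstChild: null,
        childNodes: { item: function () { return null; } }
    };
}

var oItem = fakeElement("item", [
    fakeElement("title", [fakeText("SpamMaster Joe")]),
    fakeElement("description", [fakeText("Joe Lieberman spammed me today! Well, kind of...")])
]);
// This channel deliberately has no <description> of its own.
var oChannel = fakeElement("channel", [
    fakeElement("title", [fakeText("Skyzyx.com")]),
    oItem
]);

var channelDescription = getChildText(oChannel, "description"); // false
var itemDescription = getChildText(oItem, "description");       // the item's text
```

    With a real feed, oChannel and oItem would be the actual <channel> and <item> elements rather than these stand-ins.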

  • #5
    Regular Coder
    Join Date
    Aug 2002
    Location
    Silicon Valley, CA
    Posts
    980
    Thanks
    0
    Thanked 0 Times in 0 Posts
    That makes sense. Thanks!

    Originally posted by liorean
    It returns an array with the elements. (Note that it's an array, not a collection, so it will not be dynamically updated if you change the DOM, in difference to the regular DOM methods.)
    Since it's just parsing out an RSS feed, I don't think I'll need to update any of the document, so this should work fine.
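
    For anyone who does modify the document later, here is a small sketch of what "array, not a collection" means in practice. It assumes plain objects in place of real DOM nodes; aKids and oChannel are hypothetical stand-ins.

```javascript
// The function from post #2, repeated so this sketch is self-contained.
function getChildrenByTagName(oParent, sTagName) {
    var i = 0, aResults = [], oTemp;
    while ((oTemp = oParent.childNodes.item(i++)))
        if (oTemp.nodeName == sTagName)
            aResults.push(oTemp);
    return aResults;
}

// Plain-object stand-in for a <channel> with one <item> child.
var aKids = [{ nodeName: "item" }];
var oChannel = {
    childNodes: { item: function (n) { return aKids[n] || null; } }
};

var aBefore = getChildrenByTagName(oChannel, "item"); // snapshot taken now

aKids.push({ nodeName: "item" }); // simulate appending a second <item>

var aAfter = getChildrenByTagName(oChannel, "item");  // fresh call sees the change

console.log(aBefore.length); // 1: the earlier array was not updated
console.log(aAfter.length);  // 2
```

    The earlier result stays as it was, so the function has to be called again after any change to the tree.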

