  1. #1
    New Coder
    Join Date
    Dec 2007
    Posts
    65
    Thanks
    1
    Thanked 0 Times in 0 Posts

    compare XML files, text vs numbers

    AFAIK, when comparing data in two XML files, everything is treated as text (character data).
    This means that a value written as 0.5 in one file will appear different from the same value written as .5 in the second file, because the two strings differ even though the numbers are equal.

    Is there a way to convert data to numbers/decimals first in XML and compare?

  • #2
    Senior Coder
    Join Date
    Jan 2011
    Location
    Missouri
    Posts
    4,452
    Thanks
    23
    Thanked 631 Times in 630 Posts
    What are you using to compare the two files? Maybe things would be a little easier if you included the XML files you're talking about.

    Because "data in the format 0.5 " makes no sense to me and neither does "convert data to numbers/decimals"

  • #3
    New Coder
    Join Date
    Dec 2007
    Posts
    65
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by sunfighter View Post
    What are you using to compare the two files? Maybe things would be a little easier if you included the XML files you're talking about.

    Because "data in the format 0.5 " makes no sense to me and neither does "convert data to numbers/decimals"
    file 1
    <multiplyfactor>0.5</multiplyfactor>

    file2
    <multiplyfactor>.5</multiplyfactor>

    When you compare these as text in xml they are different.
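
    A minimal sketch of the mismatch, in Python (an assumption; the thread does not name a language). The two snippets above are inlined as strings, and only the standard library's `xml.etree.ElementTree` and `decimal.Decimal` are used:

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The snippets from file 1 and file 2, inlined as strings for illustration.
file1 = "<multiplyfactor>0.5</multiplyfactor>"
file2 = "<multiplyfactor>.5</multiplyfactor>"

text1 = ET.fromstring(file1).text
text2 = ET.fromstring(file2).text

same_as_text = text1 == text2                      # compares "0.5" with ".5"
same_as_number = Decimal(text1) == Decimal(text2)  # compares the numeric values

print(same_as_text, same_as_number)  # prints: False True
```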

  • #4
    Moderator
    Join Date
    May 2002
    Location
    Hayward, CA
    Posts
    1,461
    Thanks
    1
    Thanked 23 Times in 21 Posts
    qwertyjjj, XML has no concept of numbers, believe it or not. A parsed DOM of your document will look like this:

    multiplyfactor (element)
    -- .5 (text node, whose value is a string)

    Your best bet would be to have some custom scripting to identify elements containing numbers, and then convert their contents to actual numbers using parseFloat (parseInt would not preserve the fraction) or whatever your language's equivalent is.
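
    One way such a script might look, sketched in Python rather than JavaScript. The function name `numeric_values` and the `<settings>` and `<label>` elements are invented for illustration; `Decimal` stands in for "your language's equivalent":

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

def numeric_values(xml_text):
    """Map each leaf element's tag to a Decimal when its text parses as one."""
    found = {}
    for elem in ET.fromstring(xml_text).iter():
        text = (elem.text or "").strip()
        if len(elem) == 0 and text:      # leaf element with non-empty text
            try:
                # Repeated tags overwrite earlier ones; acceptable for a sketch.
                found[elem.tag] = Decimal(text)
            except InvalidOperation:
                pass                     # not a number; treat it as a string
    return found

# Hypothetical document; only <multiplyfactor> comes from the thread.
doc = "<settings><multiplyfactor>.5</multiplyfactor><label>half</label></settings>"
print(numeric_values(doc))
```

    Note that `Decimal` also accepts strings such as "1e3" and "NaN", so a real script would probably want a stricter test for what counts as a number.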
    "The first step to confirming there is a bug in someone else's work is confirming there are no bugs in your own."
    June 30, 2001
    author, Verbosio prototype XML Editor
    author, JavaScript Developer's Dictionary
    https://alexvincent.us/blog

  • #5
    New Coder
    Join Date
    Dec 2007
    Posts
    65
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Alex Vincent View Post
    qwertyjjj, XML has no concept of numbers, believe it or not. A parsed DOM of your document will look like this:

    multiplyfactor (element)
    -- .5 (text node, whose value is a string)

    Your best bet would be to have some custom scripting to identify elements containing numbers, and then convert their contents to actual numbers using parseFloat (parseInt would not preserve the fraction) or whatever your language's equivalent is.
    What is the best way to do this?
    Give each XML item an attribute with the datatype or the schema should contain the element datatype and then the software can check the schema?

  • #6
    New to the CF scene
    Join Date
    Mar 2011
    Posts
    8
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Have you tried using an XML diff tool?
    I use Liquid Studio, and that has a fairly decent compare / diff tool.
    http://www.liquid-technologies.com/Compare-XML.aspx

  • #7
    New Coder
    Join Date
    Dec 2007
    Posts
    65
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by xmlguy View Post
    Have you tried using an XML diff tool?
    I use Liquid Studio, and that has a fairly decent compare / diff tool.
    http://www.liquid-technologies.com/Compare-XML.aspx
    But programmatically, in software?
    Surely in code you can check a schema to get a datatype?

  • #8
    Senior Coder
    Join Date
    Aug 2006
    Posts
    1,346
    Thanks
    11
    Thanked 288 Times in 287 Posts
    Quote Originally Posted by qwertyjjj View Post
    Surely in code you can check a schema to get a datatype?
    I don't know how that's going to help, though. Both your examples (0.5 and .5) would be valid against the same schema, so what would that tell you? I agree, the way to do this seems to be to load the two xml files into a program and "walk" the objects comparing them.

    Dave
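
    A rough sketch of that "walk", assuming Python and its standard `xml.etree.ElementTree`. The helper names (`as_number`, `same_text`, `same_tree`) and the `<cfg>` and `<name>` elements are made up for the example. Text is compared numerically only when both sides parse as finite decimals; attributes are compared as plain strings and tail text is ignored:

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

def as_number(text):
    """Return a finite Decimal for text, or None if it is not a number."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None

def same_text(a, b):
    """Compare numerically when both values parse, otherwise as strings."""
    a, b = (a or "").strip(), (b or "").strip()
    na, nb = as_number(a), as_number(b)
    if na is not None and nb is not None:
        return na == nb
    return a == b

def same_tree(x, y):
    """Recursively compare tags, attributes, text, and children in order."""
    if x.tag != y.tag or x.attrib != y.attrib or len(x) != len(y):
        return False
    if not same_text(x.text, y.text):
        return False
    return all(same_tree(cx, cy) for cx, cy in zip(x, y))

# Hypothetical documents built around the thread's <multiplyfactor> element.
a = ET.fromstring("<cfg><multiplyfactor>0.5</multiplyfactor><name>x</name></cfg>")
b = ET.fromstring("<cfg><multiplyfactor>.5</multiplyfactor><name>x</name></cfg>")
c = ET.fromstring("<cfg><multiplyfactor>0.6</multiplyfactor><name>x</name></cfg>")
print(same_tree(a, b), same_tree(a, c))  # prints: True False
```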

  • #9
    New Coder
    Join Date
    Dec 2007
    Posts
    65
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by tracknut View Post
    I don't know how that's going to help, though. Both your examples (0.5 and .5) would be valid against the same schema, so what would that tell you? I agree, the way to do this seems to be to load the two xml files into a program and "walk" the objects comparing them.

    Dave
    Couldn't the schema specify precision and scale?
    I.e., require that every value has a digit before the decimal point?

  • #10
    Senior Coder
    Join Date
    Aug 2006
    Posts
    1,346
    Thanks
    11
    Thanked 288 Times in 287 Posts
    Quote Originally Posted by qwertyjjj View Post
    Couldn't the schema specify precision and scale?
    I.e., require that every value has a digit before the decimal point?
    Unfortunately I'm no master of the schema, but logically I'm going to guess that this is not a "schema issue" as both those numbers are completely legitimate representations of "one half". You may need to write a little test example and see if there's a way to get a schema validation to fail one and accept the other.

    Dave
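
    For what it's worth, XML Schema's `xs:decimal` accepts both spellings, but a `pattern` facet on a restricted type can constrain the written form. Python's standard library has no schema validator, so the sketch below only imitates such a facet with a regular expression; the pattern itself and the name `passes_pattern` are assumptions made for the test, not anything from a real schema:

```python
import re

# Hypothetical pattern: at least one digit on each side of the decimal point.
# XSD patterns must match the whole value, so re.fullmatch is the closest analogue.
REQUIRES_LEADING_DIGIT = re.compile(r"[0-9]+\.[0-9]+")

def passes_pattern(text):
    """Return True when text is written as digits, a point, then digits."""
    return REQUIRES_LEADING_DIGIT.fullmatch(text) is not None

print(passes_pattern("0.5"), passes_pattern(".5"))  # prints: True False
```

    This would make the second file invalid rather than make the two values compare equal, and this particular pattern also rejects values such as "1" and "-0.5", so it would need widening before real use.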

  • #11
    New Coder
    Join Date
    Dec 2007
    Posts
    65
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by tracknut View Post
    Unfortunately I'm no master of the schema, but logically I'm going to guess that this is not a "schema issue" as both those numbers are completely legitimate representations of "one half". You may need to write a little test example and see if there's a way to get a schema validation to fail one and accept the other.

    Dave
    I guess the problem is how to tell the software to check them as numbers rather than strings, so that it doesn't find a difference when it compares them.
    If it knows it's a decimal, then it sees 0.5 the same as .5, which is correct.

  • #12
    Moderator
    Join Date
    May 2002
    Location
    Hayward, CA
    Posts
    1,461
    Thanks
    1
    Thanked 23 Times in 21 Posts
    Let's step back a bit. First and foremost: what is going to consume the XML? Specifically, what programming or scripting language is that consumer written in?

    This is most important, since XML without something to parse it is just a string of characters. The language will place constraints and expose capabilities that XML itself doesn't have.

  • #13
    Senior Coder
    Join Date
    Jan 2011
    Location
    Missouri
    Posts
    4,452
    Thanks
    23
    Thanked 631 Times in 630 Posts
    It's easy enough, after parsing the XML, to ensure that you are working with numbers when you do math by explicitly converting the values.

  • #14
    New Coder
    Join Date
    Dec 2007
    Posts
    65
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by sunfighter View Post
    It's easy enough, after parsing the XML, to ensure that you are working with numbers when you do math by explicitly converting the values.
    OK, but imagine that the XML file has 100 different elements.
    How does the parser know which is meant to be a decimal, which a string, which a date, etc.?

    You either hardcode it in the software or you check the schema?

  • #15
    Moderator
    Join Date
    May 2002
    Location
    Hayward, CA
    Posts
    1,461
    Thanks
    1
    Thanked 23 Times in 21 Posts
    Quote Originally Posted by qwertyjjj View Post
    OK, but imagine that the XML file has 100 different elements.
    How does the parser know which is meant to be a decimal, which a string, which a date, etc.?

    You either hardcode it in the software or you check the schema?
    Pretty much.
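
    The "hardcode it" option might look something like the sketch below, assuming Python. The `CONVERTERS` table, the function `typed_values`, and every element name except `multiplyfactor` are hypothetical; the same table could in principle be generated from a schema instead of written by hand:

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

# Hypothetical type table: element name -> function that converts its text.
CONVERTERS = {
    "multiplyfactor": Decimal,
    "startdate": date.fromisoformat,
    "label": str,
}

def typed_values(xml_text):
    """Convert each child element's text with its converter (default: str)."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: CONVERTERS.get(child.tag, str)((child.text or "").strip())
        for child in root
    }

doc1 = ("<item><multiplyfactor>0.5</multiplyfactor>"
        "<startdate>2011-03-01</startdate><label>a</label></item>")
doc2 = ("<item><multiplyfactor>.5</multiplyfactor>"
        "<startdate>2011-03-01</startdate><label>a</label></item>")

# Once converted, the two documents compare equal despite 0.5 vs .5.
print(typed_values(doc1) == typed_values(doc2))  # prints: True
```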

