Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Page 1 of 2 12 LastLast
Results 1 to 15 of 16
  1. #1
    New Coder
    Join Date
    Mar 2012
    Posts
    81
    Thanks
    7
    Thanked 0 Times in 0 Posts

    My regular expression is very slow in JS

    This:

    /[^\(]+\.(gif|jpg|jpeg|png|svg|bmp)[^;\s]*\)/gi

    is very slow in JS.

    I have tested this regular expression in the same code:

    /a/gi

    and it runs ultra quick, so it must be the regex, not the JS code.

    Any ideas to speed it up?

  • #2
    Supreme Master coder! Philip M's Avatar
    Join Date
    Jun 2002
    Location
    London, England
    Posts
    18,145
    Thanks
    203
    Thanked 2,547 Times in 2,525 Posts
    What is that code supposed to do?

    *\ what is the \?

    It is your responsibility to die() if necessary….. - PHP Manual

    All the code given in this post has been tested and is intended to address the question asked.
    Unless stated otherwise it is not just a demonstration.

  • #3
    Regular Coder
    Join Date
    Apr 2012
    Location
    St. Louis, MO
    Posts
    985
    Thanks
    7
    Thanked 101 Times in 101 Posts
    It _looks_ like it allows anything except forward slash and left parenthesis, then a period, and one of six image formats, followed by anything that isn't a semi-colon or whitespace.
    ^_^

    If anyone knows of a website that can offer ColdFusion help that isn't controlled by neurotic, pedantic jerks* (stackoverflow.com), please PM me with a link.
    *
    The neurotic, pedantic jerks are not the owners; just the people who are in control of the "popularity contest".

  • #4
    New Coder
    Join Date
    Mar 2012
    Posts
    81
    Thanks
    7
    Thanked 0 Times in 0 Posts
    Hi,

    Thanks for replying.

    My Regex is very poor and I put this together the best I can using various tutorials etc.

    If anything looks stupid then it probably is.

    It basically searches through a CSS file and returns an array of image URLs. That's all it needs to do. It doesn't need to to replace anything. It works, but takes ages to complete the processing.

  • #5
    Supreme Master coder! Philip M's Avatar
    Join Date
    Jun 2002
    Location
    London, England
    Posts
    18,145
    Thanks
    203
    Thanked 2,547 Times in 2,525 Posts
    Please show all your (relevant) code.

    All the code given in this post has been tested and is intended to address the question asked.
    Unless stated otherwise it is not just a demonstration.

  • #6
    New Coder
    Join Date
    Mar 2012
    Posts
    81
    Thanks
    7
    Thanked 0 Times in 0 Posts
    This is the part causing the problem:

    var imgUrls = cssFile.match(/[^\(]+\.(gif|jpg|jpeg|png|svg|bmp)[^;\s]*\)/gi);

    If I change the above line to something like:

    var imgUrls = cssFile.match(/a/gi);

    ...then there is no problem.

  • #7
    Supreme Master coder! Philip M's Avatar
    Join Date
    Jun 2002
    Location
    London, England
    Posts
    18,145
    Thanks
    203
    Thanked 2,547 Times in 2,525 Posts
    What is it exactly that you are trying to match?

    Does this help you?

    Code:
    <script type = "text/javascript">
    var cssFile = "abc.gif,something else, xyz.jpg, whatever, pqr.bmp";
    var s = cssFile.split(",")
    var imgUrls = [];
    
    for (var i =0; i<s.length; i++) {
    imgUrls[i] = s[i].match(/[a-z]+\.+(gif|jpg|jpeg|png|svg|bmp)/gi);
    }
    alert (imgUrls.join(" "));
    
    </script>
    Last edited by Philip M; 11-27-2012 at 07:27 PM.

    All the code given in this post has been tested and is intended to address the question asked.
    Unless stated otherwise it is not just a demonstration.

  • #8
    New Coder
    Join Date
    Mar 2012
    Posts
    81
    Thanks
    7
    Thanked 0 Times in 0 Posts
    Thanks Phillip,

    It solves the processing speed problem, but doesn't quite get all the images.

    It should sort through an entire CSS file and return all the images in that file (gif|jpg|jpeg|png|svg|bmp).

    The problems are:

    (1) it doesn't account for underscores So:
    'image_name.png' is picked up as 'name.png'

    (2) Numbers are ignored, so image_50.png is completely ignored

    (3) It only gets the filename. So, '../images/filename.png' only returns 'filename.png'. Of course if the path is the same for all images in the stylesheet then that's fine, but it might not always be the case.

    If doing it this way helps the speed problem then I can ensure that all images use the same path, but if it makes no difference to the speed then I'd like to get the full path.

    Thanks again.

  • #9
    Supreme Master coder! Philip M's Avatar
    Join Date
    Jun 2002
    Location
    London, England
    Posts
    18,145
    Thanks
    203
    Thanked 2,547 Times in 2,525 Posts
    Code:
    <script type = "text/javascript">
    var cssFile = "abc.gif,something else, /images/xyz.jpg, whatever, pqr_50.bmp";
    var s = cssFile.split(",")
    var imgUrls = [];
    for (var i =0; i<s.length; i++) {
    imgUrls[i] = s[i].match(/\W?\w.+(gif|jpg|jpeg|png|svg|bmp)/gi);
    }
    alert (imgUrls.join(" "));
    </script>
    Last edited by Philip M; 11-27-2012 at 09:22 PM.

    All the code given in this post has been tested and is intended to address the question asked.
    Unless stated otherwise it is not just a demonstration.

  • #10
    New Coder
    Join Date
    Mar 2012
    Posts
    81
    Thanks
    7
    Thanked 0 Times in 0 Posts
    Thanks again and I really appreciate this, but when tested in a real CSS file it picks up loads of text before the filename as well.

    I'm guessing there's a way to limit the start of the string (file) returned.

    So as an example it returns:

    .class-name{background-image:url('../images/file.png

    ...rather than just:

    ../images/file.png

  • #11
    Senior Coder rnd me's Avatar
    Join Date
    Jun 2007
    Location
    Urbana
    Posts
    4,399
    Thanks
    11
    Thanked 595 Times in 575 Posts
    the fewer unknowns, the faster it will run.

    in css, all your urls should appear sandwiched in either ',",or ().

    try only looking between those delimiters.

    also, many times it's faster to run three different specific regexps rather than using a range than only hits 2 or 3 different chars in real-life but potentially matches many more.




    EDIT:
    this performs over 50X faster than a match() based on the RX in post #1:
    Code:
    css.toLowerCase()
     .split(/(\.gif|jpg|jpeg|png|svg|bmp)/g)
     .join("\n")
     .match(/[\/.\-\w:]+\n(gif|jpg|jpeg|png|svg|bmp)/g)
     .join("`")
     .split("\n")
     .join("")
     .split("`")
     .filter(Boolean)
     .filter(function(a){return  this[a]?0:(this[a]=1)},{})
     .sort()

    performance improvements compared to code in post#1:
    1. using toLowerCase() to avoid the regexp-slowing "i" flag (2.3X faster)
    2. using string split()s instead of regexp split()s (10-40X faster)
    3. using end-of-line matching instead of floating matching (2-3X faster)
    4. using opt-in chars instead of exclusion list terms ( 1.5-3X faster)


    this also sorts the list and filters out invalid and duplicate items, feel free to remove the last there lines if you don't need that.

    again. i think it can be made even faster by looking at ["'()], but 50X is a good head-start.
    Last edited by rnd me; 11-28-2012 at 04:26 AM.
    my site (updated 13/9/26)
    BROWSER STATS [% share] (2014/9/03) IE7:0.1, IE8:4.6, IE11:9.1, IE9:3.1, IE10:3.0, FF:17.2, CH:46, SF:11.4, NON-MOUSE:38%

  • #12
    Supreme Master coder! Philip M's Avatar
    Join Date
    Jun 2002
    Location
    London, England
    Posts
    18,145
    Thanks
    203
    Thanked 2,547 Times in 2,525 Posts
    Quote Originally Posted by johnsmith153 View Post
    Thanks again and I really appreciate this, but when tested in a real CSS file it picks up loads of text before the filename as well.

    I'm guessing there's a way to limit the start of the string (file) returned.

    So as an example it returns:

    .class-name{background-image:url('../images/file.png

    ...rather than just:

    ../images/file.png
    If you specified exactly what it is you wanted then it might be possible to help you better. But if every time you introduce a new spec then it becomes frustrating for both of us.


    Code:
    <script type = "text/javascript">
    var cssFile = "abc.gif, something else, more nonsense, unwanted stuff/mydirectory/images/xyz.jpg, whatever, ../pqr_50.bmp";
    var s = cssFile.split(" ");  // split at spaces
    var imgUrls = [];
    for (var i =0; i<s.length; i++) {
    imgUrls[i] = s[i].match(/(\W\w+\W\w+)?(\W*?)\w+.(gif|jpg|jpeg|png|svg|bmp)/gi);
    }
    
    alert (imgUrls.join("  "));  // abc.gif /mydirectory/images/xyz.jpg ../pqr_50.bmp
    
    var newArr = []; 
    for (var i=0; i<imgUrls.length; i++) {   // if desired remove empty array elements
    if (imgUrls[i]) { 
    newArr.push(imgUrls[i]); 
    } 
    }  
    
    alert (newArr.join("  "));  // abc.gif /mydirectory/images/xyz.jpg ../pqr_50.bmp
    
    </script>
    Last edited by Philip M; 11-28-2012 at 11:38 AM.

    All the code given in this post has been tested and is intended to address the question asked.
    Unless stated otherwise it is not just a demonstration.

  • #13
    Senior Coder rnd me's Avatar
    Join Date
    Jun 2007
    Location
    Urbana
    Posts
    4,399
    Thanks
    11
    Thanked 595 Times in 575 Posts
    Quote Originally Posted by Philip M View Post
    Code:
    var newArr = []; 
    for (var i=0; i<imgUrls.length; i++) {   // if desired remove empty array elements
    if (imgUrls[i]) { 
    newArr.push(imgUrls[i]); 
    } 
    }
    is shorter and faster as:
    Code:
    var newArr = imgUrls.filter(Boolean);
    or maybe you knew that but you like typing...

    aside, despite it's complexity, your whole posted code in #12 is still 3X slower than the one-liner i posted...
    Last edited by rnd me; 11-28-2012 at 02:02 PM.
    my site (updated 13/9/26)
    BROWSER STATS [% share] (2014/9/03) IE7:0.1, IE8:4.6, IE11:9.1, IE9:3.1, IE10:3.0, FF:17.2, CH:46, SF:11.4, NON-MOUSE:38%

  • Users who have thanked rnd me for this post:

    johnsmith153 (11-28-2012)

  • #14
    Supreme Master coder! Philip M's Avatar
    Join Date
    Jun 2002
    Location
    London, England
    Posts
    18,145
    Thanks
    203
    Thanked 2,547 Times in 2,525 Posts
    Quote Originally Posted by rnd me View Post
    is shorter and faster as:
    Code:
    var newArr = imgUrls.filter(Boolean);
    or maybe you knew that but you like typing...

    aside, despite it's complexity, your whole posted code in #12 is still 3X slower than the one-liner i posted...
    Yes, I am sure you are right. Yours could well be a few milliseconds faster. Who cares? I don't think my code is complex.

    Also, mine has the merit that it works. I tested your one-liner with my string and got

    stuff/mydirectory/images/xyz.jpg,../pqr_50.bmp

    var newArr = imgUrls.filter(Boolean); is a feature of HTML5 so only works in the most modern browsers. Oddly enough some benighted people are still running IE7 or IE8. I feel that there is no point is coding which does not work in all reasonably expected browsers (say IE7+). Especially when the older code does just exactly the same thing. Like typing? Not really, but I am not bone idle either.

    However, johnsmith153 (who for some reason is said to be infamous around these parts) seems to be happy, so I shall regard this thread as closed.
    Last edited by Philip M; 11-28-2012 at 03:53 PM. Reason: Correction

    All the code given in this post has been tested and is intended to address the question asked.
    Unless stated otherwise it is not just a demonstration.

  • #15
    Senior Coder rnd me's Avatar
    Join Date
    Jun 2007
    Location
    Urbana
    Posts
    4,399
    Thanks
    11
    Thanked 595 Times in 575 Posts
    Quote Originally Posted by Philip M View Post
    mine has the merit that it works. I tested your one-liner with my string and got

    stuff/mydirectory/images/xyz.jpg,../pqr_50.bmp


    var newArr = imgUrls.filter(Boolean); is a feature of HTML5 so only works in the most modern browsers. Oddly enough some benighted people are still running IE7 or IE8. I feel that there is no point is coding which does not work in all reasonably expected browsers (say IE7+). Especially when the older code does just exactly the same thing. Like typing? Not really, but I am not bone idle either.

    um, the regexp is for CSS, not self-made-up testcases of url-ish junk. mine finds all the ones that the orig code does in the heap of css i tested:

    http://cdn.atlas.illinois.edu/css/?f...les/screen.css






    the old what about 3-versions-behind IE users? i expected it.
    this is an ECMA5 thing, not a HTML5 thing. why would HTML spec a core JS method anyway; that doesn't make sense... it's really from prototype.js, but better than that library's version because of the 2nd this argument.
    i really think that average coders can shim such basic backwards-compatibility if need be. maybe it is asking too much, i'll consider it. Then again, someone around here always seems to stick up for the 5 year old browser crowd...


    either way, no harm, a simple drop in can support all the way back to IE4 if need be:
    Code:
    Array.prototype.filter = [].filter || function(d, e) {
    	for (var f = this.length, b = [], a = 0; a < f; a++) if (a in this) {
    		var c = this[a];
    		d.call(e, c, a, this) && b.push(c)
    	}
    	return b
    };
    even with the compat dropin (which nobody types), the combined code is still shorter than doing it the loopy way.

    i really can't live without map/filter, so i don't understand how anyone doing any data processing would want to skip over that incredibly useful set of functionality. that's just me. It's the best thing to happen to JS since new Date().getFullYear()...


    if you want to safely live in the luxury of functional coding, include fallback compat drop-ins with your standard library.

    here are all 9 methods ( map, filter, every, some, lastIndexOf, indexOf, reduce, reduceRight, forEach ) in under 1000bytes:


    Code:
    (function (){var $="concat",B="length",C="call",e=Array.prototype,t,n,r={map:function(e,t){var n=this[$](),r=n[B],i=0,s=[];for(;i<r;i++)i in n&&(s[i]=e[C](t,n[i],i,n));return s},filter:function(e,t){var n=this[$](),r=n[B],i=0,s=[],o=0;for(;i<r;i++)i in n&&e[C](t,n[i],i,n)&&(s[o++]=n[i]);return s},every:function(e,t){var n=this[$](),r=n[B],i=0;return r&&n.filter(e,t)[B]==r},some:function(e,t){var n=this[$](),r=n[B],i=1;for(;r--;)if(r in n&&e[C](n,n[r],r,n)&&!--i)return!0;return!1},lastIndexOf:function(e,t){var n=this[$](),r=n[B],i=t||-1;for(;r>i;r--)if(r in n&&n[r]===e)return l;return-1},indexOf:function(e,t){var n=this[$](),r=n[B],i=t||0;for(;i<r;i++)if(i in n&&n[i]===e)return i;return-1},reduce:function(e,t){var n=this[$](),r=n[B],i=0,s=t||n[i++];for(;i<r;i++)s=e[C](null,s,n[i],i,n);return s},reduceRight:function(e,t){var n=this[$](),r=n[B],i=r-1,s=t||n[i--];for(;i>-1;i--)s=e[C](null,s,n[i],i,n);return s},forEach:function(e,t){this[$]().map(e,t)}};for(t in r)n=e[t],e[t]=n||r[t]})();
    unlike last time; look ma, no eval()!
    Last edited by rnd me; 11-28-2012 at 05:52 PM.
    my site (updated 13/9/26)
    BROWSER STATS [% share] (2014/9/03) IE7:0.1, IE8:4.6, IE11:9.1, IE9:3.1, IE10:3.0, FF:17.2, CH:46, SF:11.4, NON-MOUSE:38%


  •  
    Page 1 of 2 12 LastLast

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •