  1. #1
    New to the CF scene
    Join Date
    Aug 2005
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Find External Links within the Web Page

    Hi All,

    I am developing a toolbar, and I need to find the external links within a web page. I am testing against the http://www.w3.org site. My problem is that links such as http://jigsaw.w3.org/css-validator/ and http://validator.w3.org/ are also reported as external, although they are internal links. Can anybody help me find only the true external links within a web page? I have written the following code to find external links:

    function DisplayExternalLinks()
    {
    var netDomains = document.location.host;
    if(netDomains.indexOf(" ") != -1)
    {
    netDomains = getDomains(netDomains);
    }

    var netDomainsArray = getDomainsArray(netDomains);

    for(var k = 0,n=1; k < document.links.length; k++)
    {
    if(document.links[k].hostname.length < 1)
    {
    continue;
    }

    if(document.links[k].target.length > 0)
    {
    continue;
    }
    var hostName = document.links[k].hostname.toLowerCase();
    for(var m = 0; m < netDomainsArray.length; m++)
    {
    if(netDomainsArray[m] != hostName)
    {
    var docLink=document.links[k];
    var im='<img src="http://www.rampweb.com/toolbar/images/external_link.gif" alt="External Link">';
    var h1= docLink.outerHTML;
    docLink.outerHTML='<span style=\"color:#91060A;font:x-small arial;\"><b>'+ im+' '+'</b></span> '+h1; n=n+1; continue;
    }//end of if()
    }//end of inner for()
    }//end of outer for loop

    if(n == 1)
    {
    alert('External Links are not found in this web page !');
    }
    }
    DisplayExternalLinks();

    function getDomains(netDomains)
    {
    // Strip any spaces from the host string
    var splitArray = netDomains.split(" ");
    netDomains = splitArray.join("");
    return netDomains;
    }

    function getDomainsArray(netDomains)
    {
    netDomains = netDomains.toLowerCase();
    var myArray = netDomains.split(",");
    return myArray;
    }


    Thanks in Advance
    Prasad
    Last edited by dvpra_gnt; 08-19-2005 at 03:35 PM.

  • #2
    Senior Coder A1ien51's Avatar
    Join Date
    Jun 2002
    Location
    Between DC and Baltimore In a Cave
    Posts
    2,717
    Thanks
    1
    Thanked 94 Times in 88 Posts
    This forum is for completed scripts, NOT questions.
    Tech Author [Ajax In Action, JavaScript: Visual Blueprint]

  • #3
    Kor
    Kor is offline
    Red Devil Mod Kor's Avatar
    Join Date
    Apr 2003
    Location
    Bucharest, ROMANIA
    Posts
    8,478
    Thanks
    58
    Thanked 379 Times in 375 Posts
    What about this:
    Code:
    <script type="text/javascript">
    var param;
    function findL(){
      var myLinks = [];
      var allT = document.getElementsByTagName('*');
      for(var i=0;i<allT.length;i++){
        if(allT[i].getAttribute('href')){
          // add the link, then drop it again if its host is already listed
          myLinks[myLinks.length]=allT[i].href;
          for(var j=0;j<myLinks.length-1;j++){
            check(allT[i].href,myLinks[j]);
            if(param==true){myLinks.splice(myLinks.length-1);break}
          }
        }
      }
      alert(myLinks)
    }
    function check(a,b){
      // compare only the host part of the two URLs
      a = a.split('//')[1].split('/')[0];
      b = b.split('//')[1].split('/')[0];
      param = (a==b);
    }
    onload=findL
    </script>
     
    KOR
    Offshore programming
    -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

  • #4
    Senior Coder
    Join Date
    Feb 2003
    Posts
    1,665
    Thanks
    0
    Thanked 27 Times in 25 Posts
    How about checking the href value for the presence of http:// string and the absence of the w3.org string?

    e.g.
    Code:
    var anchors = document.links;
    for (var i=0; i<anchors.length; i++) {
    	if ( (anchors[i].href.indexOf('http://') != -1) && (anchors[i].href.indexOf('w3.org') == -1) ) {
    		anchors[i].target = "_blank";
    	}
    }
    …sorta thing.


    [edit]

    The above method gives inconsistent results when tested offline and online.
    Oddly - and annoyingly - the browser implicitly credits all href values as having http:// present, even if that string isn't explicitly present in the href value.
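
    A likely explanation, offered as a sketch rather than a confirmed diagnosis: the href property of a link returns the URL already resolved against the page address, not the raw attribute text. The snippet below imitates that resolution with the standard URL constructor (available in current browsers and Node.js, not in 2005-era browsers). The function names resolveHref and looksAbsoluteHttp and the sample addresses are made up for illustration.

    ```javascript
    // Imitates how a browser turns a raw href attribute into the absolute
    // URL that anchor.href reports. `pageUrl` is the address of the page
    // that contains the link.
    function resolveHref(rawHref, pageUrl) {
      return new URL(rawHref, pageUrl).href;
    }

    // The test from the first snippet: does the resolved href contain "http://"?
    function looksAbsoluteHttp(rawHref, pageUrl) {
      return resolveHref(rawHref, pageUrl).indexOf('http://') !== -1;
    }

    // Online, a relative link picks up the http:// prefix of the page.
    var online = resolveHref('Consortium/', 'http://www.w3.org/');

    // Offline, the same kind of link resolves to a file:// address instead.
    var offline = resolveHref('about.html', 'file:///home/user/site/index.html');
    ```

    Under these assumptions a relative link passes the http:// test when the page is served over HTTP and fails it when the page is opened from disk, which would account for the differing results.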

    Here's an alternative check which appears to give better results…

    Code:
    if ( (anchors[i].href.indexOf(window.location.hostname) == -1) && (anchors[i].href.indexOf('w3.org') == -1) )
    It might prove useful in some way.
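
    One possible refinement, as a minimal sketch and not a tested toolbar solution: compare the link's hostname against the site's base domain, so that subdomains such as jigsaw.w3.org and validator.w3.org count as internal. A plain indexOf('w3.org') on the whole href would also match an unrelated URL that merely contains that text in its path. The names belongsToDomain, isExternalHost and findExternalLinks are hypothetical, and the base domain ('w3.org' here) is assumed to be supplied by the caller.

    ```javascript
    // True when `host` is `domain` itself or any subdomain of it.
    function belongsToDomain(host, domain) {
      host = host.toLowerCase();
      domain = domain.toLowerCase();
      if (host === domain) {
        return true;
      }
      var suffix = '.' + domain;
      // Require a leading dot so that "notw3.org" does not match "w3.org".
      return host.length > suffix.length &&
             host.slice(-suffix.length) === suffix;
    }

    // True when the link's hostname lies outside the site's base domain.
    function isExternalHost(linkHost, siteDomain) {
      if (!linkHost) {
        return false; // mailto: and javascript: links carry no hostname
      }
      return !belongsToDomain(linkHost, siteDomain);
    }

    // Collects the external links of a document-like object. In a browser,
    // pass `document`; any object with a `links` list of {hostname} works.
    function findExternalLinks(doc, siteDomain) {
      var external = [];
      for (var i = 0; i < doc.links.length; i++) {
        if (isExternalHost(doc.links[i].hostname, siteDomain)) {
          external.push(doc.links[i]);
        }
      }
      return external;
    }
    ```

    In a page this could be called as findExternalLinks(document, 'w3.org'). The results are collected into a separate array first because rewriting outerHTML while looping over document.links changes the list during iteration.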
    Last edited by Bill Posters; 08-19-2005 at 06:18 PM.

