  1. #1
    New to the CF scene
    Join Date
    Aug 2006
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    regex help required to grab links from hrefs

    Hi,

    I am very new to PHP and have been doing quite well (if I do say so myself), but I have hit a bit of a brick wall with regular expressions and wondered if anyone could point me in the right direction...

    What I am trying to do is split the URL and the link text of an anchor tag into two separate variables, so

    PHP Code:
    $link = "<a href=\"/media/press_releases/\">News &amp; Press Releases</a>";
    would end up as two variables...
    PHP Code:
    $url = "/media/press_releases/";
    $title = "News &amp; Press Releases";
    I assume I'm going to need to use preg_match, but my knowledge of regexes is abysmal and I don't really understand them properly.

    Any help that can be given would be very much appreciated.

    Thank you,

    dudey

  • #2
    Senior Coder chump2877's Avatar
    Join Date
    Dec 2004
    Location
    the U.S. of freakin' A.
    Posts
    2,798
    Thanks
    19
    Thanked 156 Times in 147 Posts
    I'm sure there's a way to combine this into one regex, and I'm not the king of regex either, but this would work, I think:

    PHP Code:
    $link = "<a href=\"/media/press_releases/\">News &amp; Press Releases</a>";
    // Look for a URL that includes a scheme, e.g. http://...
    preg_match('(\b[a-zA-Z0-9]+://[^( |\>)]+\b)', $link, $matches);
    // Group 4 captures the text between the opening tag and </a>
    preg_match('/(<a)(.+?)(>)(.+?)(<\/a>)/', $link, $matches2);

    echo "URL is: " . $matches[0] . "<br>";
    echo "Title is: " . $matches2[4];
    Regards, R.J.

    ---------------------------------------------------------

    Help spread the word! Like my YouTube-to-Mp3 Conversion Script on Facebook !! :)
    [Related videos and tutorials are also available at my YouTube channel and on Dailymotion]
    Get free updates about new software version releases, features, and bug fixes!

  • #3
    New to the CF scene
    Join Date
    Aug 2006
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hmmm, almost works ... apart from the url bit.

    Can you possibly explain what each part of your script is actually doing, so that I can have a play with it (hopefully with a little understanding)? For instance, why that specific part of the array?

    Thanks for your help.

    dudey

  • #4
    New to the CF scene
    Join Date
    Aug 2006
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Ah, I see ... I think...
    PHP Code:
    preg_match('/(<a)(.+?)(>)(.+?)(<\/a>)/', $link, $matches2);
    echo "Title is: " . $matches2[4];
    the [4] is getting the fourth instance of whatever is between the brackets ... so in this case it would be the content between the closing > of the opening tag and the end tag of </a> ... so I guess '.+?' means 'any character' ?
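
    To make the numbering concrete, here is a minimal sketch that prints every entry of the match array for the pattern above. It assumes the same $link string used earlier in the thread; nothing else is added. Index 0 holds the whole match, and each later index corresponds to one pair of brackets, counted by opening bracket from left to right.

```php
<?php
// Same input string as in the thread.
$link = '<a href="/media/press_releases/">News &amp; Press Releases</a>';

// Five bracketed groups: (<a) (.+?) (>) (.+?) (<\/a>)
preg_match('/(<a)(.+?)(>)(.+?)(<\/a>)/', $link, $matches);

// $matches[0] is the full match; $matches[1] to $matches[5] are the groups.
foreach ($matches as $index => $text) {
    echo $index . ' => ' . $text . "\n";
}
```

    In '.+?' the dot means "any character except a newline", the plus means "one or more times", and the question mark makes the repetition stop as early as possible.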

  • #5
    New to the CF scene
    Join Date
    Aug 2006
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    still can't seem to get the url part of it to work though ... any ideas?
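
    One likely reason, sketched below: the first pattern in post #2 only matches text that contains a scheme followed by "://", and the href here is a relative path. The helper name hasSchemeUrl() and the example.com link are made up for illustration and are not from the thread.

```php
<?php
// Hypothetical helper: true when the scheme-based pattern from post #2 matches.
function hasSchemeUrl(string $html): bool
{
    // Pattern copied from post #2; the outer ( ) act as the regex delimiters.
    return preg_match('(\b[a-zA-Z0-9]+://[^( |\>)]+\b)', $html) === 1;
}

// The link from the original question: a relative path with no scheme.
$relative = '<a href="/media/press_releases/">News &amp; Press Releases</a>';
// Hypothetical absolute link, for comparison only.
$absolute = '<a href="http://example.com/media/press_releases/">News</a>';

var_dump(hasSchemeUrl($relative)); // the string contains no "://"
var_dump(hasSchemeUrl($absolute)); // the string contains "http://"
```

    If that is the cause, the pattern needs to key off the href attribute rather than the shape of the URL.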

    Thanks for your help so far

    dudey

  • #6
    Senior Coder chump2877's Avatar
    Join Date
    Dec 2004
    Location
    the U.S. of freakin' A.
    Posts
    2,798
    Thanks
    19
    Thanked 156 Times in 147 Posts
    PHP Code:
    <?php

    $link = '<a href="/media/press_releases/">News &amp; Press Releases</a>';
    // Group 4 is the href value, group 8 is the link text
    preg_match('/(<a)(.*?)(href="|href=\')(.+?)("|\')(.*?)(>)([^<>]+?)(<\/a>)/i', $link, $matches);

    echo "URL is: " . $matches[4] . "<br>";
    echo "Title is: " . $matches[8];

    ?>
    Give that a whirl... I fooled around with it some more, but it may not be perfect for finding all URLs.

    The "URL" is whatever the fourth parenthesized pattern matched, and the "Title" is whatever the eighth parenthesized pattern matched.

    For the URL: (.+?) is any character repeated one or more times, but not so "greedy" as to include the closing (") character as well.

    similar logic for the Title...
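
    The greedy versus non-greedy difference is easier to see side by side. The sketch below is an illustration only: the $tag string with an extra title attribute is a made-up input, chosen so that the two patterns give different results.

```php
<?php
// Hypothetical input with two quoted attributes.
$tag = '<a href="/media/press_releases/" title="News">';

// Greedy: (.+) grabs as much as it can, so it runs to the LAST double quote.
preg_match('/href="(.+)"/', $tag, $greedy);

// Lazy: (.+?) grabs as little as it can, so it stops at the FIRST double quote.
preg_match('/href="(.+?)"/', $tag, $lazy);

echo "Greedy: " . $greedy[1] . "\n";
echo "Lazy:   " . $lazy[1] . "\n";
```

    With only one attribute in the tag both versions would return the same text, which is why the problem often goes unnoticed until the markup changes.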

    you can also try looking here: http://www.codingforums.com/showthread.php?t=76949
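
    If a page contains several links, the same pattern can be run with preg_match_all instead of preg_match. This is a rough sketch under that assumption, not a general HTML parser; extractLinks() is a hypothetical helper name and the second link in $html is made up for the example.

```php
<?php
// Hypothetical helper: returns a url/title pair for every simple <a> tag found.
function extractLinks(string $html): array
{
    // Same pattern as above: group 4 is the href value, group 8 the link text.
    $pattern = '/(<a)(.*?)(href="|href=\')(.+?)("|\')(.*?)(>)([^<>]+?)(<\/a>)/i';

    // PREG_SET_ORDER gives one array per matched link, indexed like preg_match.
    preg_match_all($pattern, $html, $sets, PREG_SET_ORDER);

    $links = [];
    foreach ($sets as $set) {
        $links[] = ['url' => $set[4], 'title' => $set[8]];
    }
    return $links;
}

$html = '<a href="/media/press_releases/">News &amp; Press Releases</a> '
      . "<a href='/about/'>About us</a>";

print_r(extractLinks($html));
```

    Two limits worth knowing: the pattern does not check that the opening and closing quotes are the same kind, and links whose text contains other tags (for example an image or bold text) are skipped because group 8 excludes angle brackets.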

    Edit: Added a couple of things to my regex pattern...
    Last edited by chump2877; 08-28-2006 at 03:18 PM.
    Regards, R.J.


  • #7
    New to the CF scene
    Join Date
    Aug 2006
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Great stuff.
    Thanks very much, it is much appreciated.

    dudey

  • #8
    Senior Coder chump2877's Avatar
    Join Date
    Dec 2004
    Location
    the U.S. of freakin' A.
    Posts
    2,798
    Thanks
    19
    Thanked 156 Times in 147 Posts
    ...
    Regards, R.J.


