09-24-2006, 05:53 PM | #1
ralph.m
regular expression to extract html links from a string

Hello, I have spent hours trying to figure this out. It should be simple, but it has turned out not to be.

I have a string and need to extract the links from it.

So this input:

Code:
hello today is a fine day to post a <a href="http://www.linkme.com">link</a> to my favorite website.
should return:

Code:
<a href="http://www.linkme.com">link</a>
I need to use preg_match_all() to find all occurrences of HTML links in my string. preg_match_all() will put all the occurrences into an array.

So far I have found about a dozen examples of this on the web, but they must not have been written for PHP, because I keep getting errors like "Unknown modifier '['" and others.
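
For context on the "Unknown modifier" errors: PHP's preg_* functions expect the pattern to be wrapped in delimiters, and whatever follows the closing delimiter is read as modifiers. A pattern copied from another language or tool usually has no delimiters, so PHP takes its first character as the delimiter and trips over whatever comes after the matching closing character. A minimal sketch, assuming the sample string from this post; the pattern and variable names are illustrative only, not taken from any of the pages mentioned:

```php
<?php
$contents = 'hello today is a fine day to post a <a href="http://www.linkme.com">link</a> to my favorite website.';

// Pattern pasted without delimiters. PHP treats "(" and ")" as the delimiter
// pair, so the "[" that follows is parsed as a modifier and compilation fails.
$bare = '(<a)[^>]*>';
$failed = @preg_match_all($bare, $contents, $ignored); // @ hides the warning in this demo

// Same pattern wrapped in "~" delimiters, with "i" as the only modifier.
$delimited = '~(<a)[^>]*>~i';
$count = preg_match_all($delimited, $contents, $links);

// $failed is false because the first pattern never compiled;
// $links[0] holds the opening <a ...> tags found by the second one.
```

Choosing a delimiter that does not occur in the pattern (here "~" instead of "/") also avoids having to escape the slash in "</a>".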

My own attempt at the regular expression I need is:

PHP Code:
$preg="'\<a{1}'.*?'\<\/a\>'/i'";
preg_match_all($preg, $contents, $links);
(That is: one occurrence of "<a", then zero or more occurrences of any character (.), then "</a>". By the way, this is probably terrible; as you can see, I'm no good at regular expressions.)

I would sincerely appreciate help, because I have been working on this damn problem for so long.

Last edited by ralph.m; 09-24-2006 at 05:58 PM.. Reason: wanted to make it clear I wasn't crawling (I changed "page" to "string" in the title)
09-24-2006, 06:08 PM | #2
ralph.m
I found this in one of the other threads:

PHP Code:
preg_match_all("/<a.*? href=\"(.*?)\".*?>(.*?)<\/a>/i",$string,$results); 
but it doesn't work: when I use preg_replace() with the same expression to replace links with '', there are still tons of links left in the string.
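
A few likely reasons links survive that replacement: the pattern only accepts double-quoted href values, and without the "s" modifier "." does not match a newline, so an anchor whose attributes or text wrap onto another line is skipped. The sketch below compares the posted pattern with a looser one on a made-up input; the input string and the looser pattern are assumptions for illustration, not a complete HTML parser:

```php
<?php
// Hypothetical input: three anchors written in different styles.
$string = "one <a href=\"http://a.example/\">A</a>, "
        . "two <a href='http://b.example/'>B</a>, "
        . "three <a\n   href=\"http://c.example/\">C\nmore</a> end";

// The pattern from the post: needs double quotes, and "." stops at newlines.
$strict = "/<a.*? href=\"(.*?)\".*?>(.*?)<\/a>/i";

// Looser sketch: any <a ...> tag up to the nearest </a>.
// \b keeps tags like <abbr> out; "s" lets "." cross newlines.
$loose = '~<a\b[^>]*>.*?</a>~is';

$afterStrict = preg_replace($strict, '', $string); // two anchors are left behind
$afterLoose  = preg_replace($loose, '', $string);  // no anchors remain
```

The looser pattern ignores the href entirely, which is fine for stripping links but not for extracting their targets.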
09-24-2006, 06:49 PM | #3
Fumigator
That one you found will work for some anchor tags but not all. It doesn't find tags that use single quotes, for example. But it's a pretty good start.

This will include single quotes too:
PHP Code:
$preg = "/<a.*? href=(\"|')(.*?)(\"|').*?>(.*?)<\/a>/i";
If you want to post an example of an anchor tag that didn't get picked up by that regex then we can try to modify it further.
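
To show how the captures from that pattern line up, here is a short usage sketch with preg_match_all(). The second anchor in the sample string is made up for illustration; PREG_SET_ORDER is a standard flag that groups the captures per match rather than per group:

```php
<?php
$preg = "/<a.*? href=(\"|')(.*?)(\"|').*?>(.*?)<\/a>/i";

$contents = 'post a <a href="http://www.linkme.com">link</a> and '
          . "<a class='x' href='http://example.org/'>another</a> here.";

// Each element of $links is one anchor: [0] whole tag, [1] opening quote,
// [2] href value, [3] closing quote, [4] link text.
preg_match_all($preg, $contents, $links, PREG_SET_ORDER);

foreach ($links as $link) {
    echo $link[2], ' => ', $link[4], "\n";
}
// http://www.linkme.com => link
// http://example.org/ => another
```

Note that the two quote groups are independent, so a value opened with " and closed with ' would also match; a backreference to the first group is one way to tighten that.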
12-19-2007, 07:09 AM | #4
urgido
And how would I extract only links of the form /somedir/ or /somedir? Links like /someurl.php or http://www.lalala.com/index.php should be ignored.

Regards
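
One possible approach to the question above, sketched under the assumption that "/somedir" means a single root-relative path segment with no dot in it: collect every href first, then filter the values with a second, simpler pattern. The function name extractDirLinks is hypothetical, and the type declarations assume PHP 7 or later:

```php
<?php
function extractDirLinks(string $html): array
{
    // Capture the href value of each anchor; \1 requires the same closing quote.
    preg_match_all('~<a\b[^>]*?\shref=(["\'])(.*?)\1~is', $html, $m);

    // Keep only "/name" or "/name/": letters, digits, "_" and "-" in the segment.
    return array_values(preg_grep('~^/[\w-]+/?$~', $m[2]));
}

$html = '<a href="/somedir/">a</a> <a href="/somedir">b</a> '
      . '<a href="/someurl.php">c</a> <a href="http://www.lalala.com/index.php">d</a>';

$dirs = extractDirLinks($html); // the two directory-style links only
```

Splitting extraction from filtering keeps each pattern short; the filter can be relaxed (for example to allow nested directories) without touching the anchor pattern.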