  1. #1
    New to the CF scene
    Join Date
    Apr 2004
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Feeding from WEBSITES of my choice...

    Hi!

    I hope this is the appropriate place to put this, because PHP can do anything. If this is not an appropriate board, then the admins have full right to kill my posting.
    Anyway, I want to write a script in PHP so that I can post news or other info on my website from different web sources, like CNN, BBC, etc., and I should also have control over its layout.
    I have seen many paid services like moreover.com, but I would like to do something of my own because I am learning PHP, and I would like any directions on how to start. All I am asking for is a navigator: I will be the driver, I will have my car, and I will put in my gas. My car is running on PHP+MySQL.

    Appreciate it.
    bLyZ
    [ FIGHTCLUB is my food for soul....... ]

  • #2
    Super Moderator
    Join Date
    May 2002
    Location
    Perth Australia
    Posts
    4,081
    Thanks
    11
    Thanked 99 Times in 97 Posts
    As you can guess, we are not going to tell you how to grab copyrighted content: not only could it get both you and us into strife, it's also bad juju in general.

    Luckily, many sites with content worth borrowing have ways and means to let you get at their headlines/content easily and legally. Normally these are RSS/RDF XML feeds. If you like PHP so much, pop over to PHP.net, as they have their headlines available in XML format and probably some pointers to common RSS/RDF parsing routines (or google for `PHP RDF RSS parser`).
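
For readers who want a starting point, here is a minimal sketch of reading headlines out of an RSS 2.0 feed. It assumes PHP 5 or later with the SimpleXML extension; the sample feed, the example.com links, and the function name `rss_headlines` are made up for illustration. A real script would fetch the XML from the feed URL the site publishes.

```php
<?php
// Hypothetical sample feed, inlined so the sketch runs without a network connection.
$rss = '<rss version="2.0">'
     . '<channel>'
     . '<title>Example News</title>'
     . '<item><title>First headline</title><link>http://www.example.com/news/1</link></item>'
     . '<item><title>Second &amp; last headline</title><link>http://www.example.com/news/2</link></item>'
     . '</channel>'
     . '</rss>';

// Return an array of array('title' => ..., 'link' => ...), one per <item> element.
function rss_headlines($xml)
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        return array(); // the input was not well-formed XML
    }
    $items = array();
    foreach ($feed->channel->item as $item) {
        $items[] = array(
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        );
    }
    return $items;
}

// Print the headlines as a list, escaping the text so the layout stays under your control.
$headlines = rss_headlines($rss);
print "<ul>\n";
foreach ($headlines as $h) {
    print '<li><a href="' . htmlspecialchars($h['link']) . '">'
        . htmlspecialchars($h['title']) . "</a></li>\n";
}
print "</ul>\n";
```

Because the feed is parsed as XML rather than matched with regular expressions, entities such as `&amp;` are decoded on the way in, which is why the output is escaped again with `htmlspecialchars`.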
    resistance is...

    MVC is the current buzz in web application architectures. It comes from event-driven desktop application design and doesn't fit into web application design very well. But luckily nobody really knows what MVC means, so we can call our presentation layer separation mechanism MVC and move on. (Rasmus Lerdorf)

  • #3
    New to the CF scene
    Join Date
    Apr 2004
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Well! Google and Yahoo have news from different web sources.

  • #4
    Senior Coder missing-score's Avatar
    Join Date
    Jan 2003
    Location
    UK
    Posts
    2,194
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Google and Yahoo probably have permission to use the sources, because a big company like either of them could get into a lot of trouble if they didn't...

  • #5
    Senior Coder
    Join Date
    Apr 2003
    Location
    Canada
    Posts
    1,063
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Similarly, I would like to be able to have the hockey scores updated on my site without having to enter them myself (as in getting them from another site)...
    Of course, I doubt the hockey scores are copyrighted...
    So is there a way to get the http://www.rds.ca source and grab the score for the Tampa Bay-Montreal series?
    Shawn

  • #6
    Regular Coder
    Join Date
    Oct 2003
    Posts
    603
    Thanks
    2
    Thanked 1 Time in 1 Post

  • #7
    Senior Coder
    Join Date
    Apr 2003
    Location
    Canada
    Posts
    1,063
    Thanks
    2
    Thanked 0 Times in 0 Posts
    will do, thanks
    Shawn

  • #8
    Senior Coder
    Join Date
    Apr 2003
    Location
    Canada
    Posts
    1,063
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Oh, but I have to pay for that. I want to learn how to do it myself...
    Shawn

  • #9
    Mega-ultimate member
    Join Date
    Jun 2002
    Location
    Winona, MN - The land of 10,000 lakes
    Posts
    1,855
    Thanks
    1
    Thanked 45 Times in 42 Posts
    Just to follow up on the copyright issue: if you're using this for personal reasons, there is the often forgotten "fair-use" clause in copyright law. As long as you don't try to make money off the information, you have some latitude.

    I personally have a site that grabs headlines (it parses HTML for h1-h3 tags and reads RSS feeds) and puts them on my "personal information" page. Then I have it password protected. Sure, it's not super secure, but if anyone came crying to me about no permission to use blah blah blah, I'd throw the fair use right back at them.

    At least that's the case in the US. Not sure about the rest of the world.

    Oh, and one caveat: I don't "break in" to other sites to grab content, I simply read the existing public information. It's really no more than an automatic browser of sorts.
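
The "parses HTML for h1-h3 tags" step described above could look roughly like the sketch below. It is not the poster's code: it assumes PHP 5 or later with the DOM extension, and the sample page and the function name `extract_headlines` are hypothetical.

```php
<?php
// Hypothetical sample page; in practice this would be the fetched HTML of a public page.
$html = '<html><body>'
      . '<h1>Top story</h1><p>Some text.</p>'
      . '<h2>Second <b>story</b></h2>'
      . '<h4>Ignored footer heading</h4>'
      . '<h3>Third story</h3>'
      . '</body></html>';

// Return the text of every <h1>, <h2> and <h3> element, in document order.
function extract_headlines($html)
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $headlines = array();
    foreach ($xpath->query('//*[self::h1 or self::h2 or self::h3]') as $node) {
        // textContent drops nested markup such as <b>, leaving only the text.
        $headlines[] = trim($node->textContent);
    }
    return $headlines;
}

$headlines = extract_headlines($html);
print_r($headlines);
```

Using an HTML parser rather than a regular expression means nested tags inside a heading do not need special handling.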

  • #10
    Senior Coder
    Join Date
    Apr 2003
    Location
    Canada
    Posts
    1,063
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Are you saying I wouldn't be able to get the hockey scores and put them on my website automatically?

    But anyway, is there a way to do it? If there is, where should I start?
    Thanks
    BTW, I agree about the copyright thing, but some things just can't be copyrightable...
    Shawn

  • #11
    Mega-ultimate member
    Join Date
    Jun 2002
    Location
    Winona, MN - The land of 10,000 lakes
    Posts
    1,855
    Thanks
    1
    Thanked 45 Times in 42 Posts
    Well, it depends. If the site has an RSS feed, you'll need to parse that. Otherwise, you'll need to putz with regular expressions and parse their HTML page. VERY SLOW, but it does the job.

    Here's some code I use:

    PHP Code:
    <?php
    // open the remote page for reading (needs allow_url_fopen enabled in php.ini)
    $fh = fopen("http://www.yoursite.com/page.html", "r");
    print "<ul class=\"contentTxt\">";
    $d = "";
    // read the page line by line until fgets() reports end of file
    while (($data = fgets($fh)) !== false) {
        //strip tabs and line returns along with other markup I dont care about
        $data = preg_replace("/(\r|\n|\t|<b>|<\/b>|<img.*?>)*/", "", $data);
        $d .= $data;
    }
    fclose($fh);
    //grab the data between <font size=+2> tags
    preg_match_all("/<font size=\"\+2\"><a href=\"(.*?)\">(.*?)<\/font>/", $d, $matches);
    // $matches[1] holds the link targets, $matches[2] the link text
    for ($i = 0; $i < count($matches[1]); $i++) {
        print "<li><a href=\"http://ssa.usps.gov/redir.php?url=http://blue.usps.gov/news/link/" . $matches[1][$i] . "\">" . $matches[2][$i] . "</a></li>";
    }
    print "</ul>";
    ?>

  • #12
    Senior Coder
    Join Date
    Apr 2003
    Location
    Canada
    Posts
    1,063
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Hmm, so basically read the file and regexp your way to the desired part... makes sense. A little complex though, anyway.
    PHP Code:
    $data = preg_replace("/(\r|\n|\t|<b>|<\/b>|<img.*?>)*/","",$data); 
    What is this line checking for exactly?
    Shawn

  • #13
    Senior Coder missing-score's Avatar
    Join Date
    Jan 2003
    Location
    UK
    Posts
    2,194
    Thanks
    0
    Thanked 0 Times in 0 Posts
    That line checks for and removes:

    \r (return)
    \n (newline)
    \t (tab)
    <b></b> Tags
    <img /> tags
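
To see that line at work, here is a small sketch that runs it on a made-up input line. The sample string and the `$simpler` pattern are illustrations added here, not part of the code posted earlier in the thread.

```php
<?php
// Made-up sample line containing a tab, bold tags, an image tag and a CR/LF line ending.
$line = "\t<b>Habs win</b> <img src=\"puck.gif\" alt=\"\">4-3\r\n";

// The pattern from the thread: each alternative is replaced with an empty string.
$clean = preg_replace("/(\r|\n|\t|<b>|<\/b>|<img.*?>)*/", "", $line);

// An equivalent pattern without the outer group and *, which some may find easier to read.
$simpler = preg_replace('/[\r\n\t]|<\/?b>|<img.*?>/', "", $line);

var_dump($clean);   // string(12) "Habs win 4-3"
var_dump($simpler); // string(12) "Habs win 4-3"
```

Both patterns are case sensitive, so `<B>` or `<IMG ...>` would pass through untouched unless the `i` modifier is added after the closing delimiter.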

  • #14
    Senior Coder
    Join Date
    Apr 2003
    Location
    Canada
    Posts
    1,063
    Thanks
    2
    Thanked 0 Times in 0 Posts
    thanks:

    PHP Code:
    preg_match_all("/<font size=\"\+2\"><a href=\"(.*?)\">(.*?)<\/font>/",$d,$matches); 
    I'm guessing this line takes all occurrences of:
    <font size="+2">
    <a href="any text">
    any text
    </font>
    in $d
    and stores it in an array ($matches)

    Am I guessing correctly?
    Shawn

  • #15
    Senior Coder missing-score's Avatar
    Join Date
    Jan 2003
    Location
    UK
    Posts
    2,194
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Yup, you got it.
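
As a small illustration of what ends up in `$matches`, the sketch below runs the same `preg_match_all` call on a made-up `$d`. The sample markup and file names are hypothetical, and the layout shown is for the default `PREG_PATTERN_ORDER` flag.

```php
<?php
// Made-up input in the shape the pattern expects.
$d = '<font size="+2"><a href="story1.html">First headline</a></font>'
   . '<p>filler</p>'
   . '<font size="+2"><a href="story2.html">Second headline</a></font>';

preg_match_all("/<font size=\"\+2\"><a href=\"(.*?)\">(.*?)<\/font>/", $d, $matches);

// $matches[0] holds each full match, $matches[1] the first group (the href),
// and $matches[2] the second group (everything between the <a ...> tag and </font>).
print_r($matches[1]);
print_r($matches[2]);
```

With markup nested like this sample, the second group also captures the trailing `</a>`, because the pattern only stops at `</font>`. If the page is laid out that way, adding `<\/a>` before `<\/font>` in the pattern would keep the captured text clean.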

