  1. #1
    Mega-ultimate member
    Join Date
    Jun 2002
    Location
    Winona, MN - The land of 10,000 lakes
    Posts
    1,855
    Thanks
    1
    Thanked 45 Times in 42 Posts

    fsockopen, how to have the browser use the http response header

    All right, I've got this working from behind a proxy. Now I would like the browser to interpret the HTTP response header properly rather than display it. Any ideas?

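    Here's the code:
    Code:
    <?php
    /* your proxy server address */
    $proxy = "192.168.1.1";
    /* your proxy server port */
    $port = 8080;
    /* the URL you want to connect to */
    $url = "http://www.codingforums.com";
    /* the Host header must name the target site, not the proxy */
    $host = parse_url($url, PHP_URL_HOST);
    $fp = fsockopen($proxy, $port, $errno, $errstr, 30) or die("Unable to open: $errstr ($errno)");
    /* Connection: close asks the server to end the stream, so feof() eventually becomes true */
    fputs($fp, "GET $url HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n");
    while (!feof($fp)) {
    	$line = fgets($fp, 4000);
    	print($line);
    }
    fclose($fp);
    ?>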
    The ultimate goal of this is to create an RSS reader, so any tips along those lines would be appreciated. Also, I've been googling for days to find an RSS reader written in PHP that works through a proxy (i.e. uses fsockopen). Does anyone have some code examples or links?

    Thanks


    Screen shot of problem attached
    Attached thumbnail: header.jpg

  • #2
    Senior Coder
    Join Date
    Jun 2002
    Location
    frankfurt, german banana republic
    Posts
    1,848
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I think I know how to handle your problem, but I fear I'm introducing another one you might not be aware of. Anyway, the response consists of the HTTP headers and the content body. The body is what should be displayed, so we need to retrieve that and print it. Here's some example code for doing that (without the proxy part because I'm not behind one, but the same code applies there as well):

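    Code:
    $out = '';
    $fp = fsockopen("www.codingforums.com", 80, $errno, $errstr, 30);
    if (!$fp) {
        // stop here: without a connection there is nothing to split below
        die("$errstr ($errno)<br>\n");
    }
    fputs($fp, "GET / HTTP/1.0\r\nHost: www.codingforums.com\r\n\r\n");
    while (!feof($fp)) {
        $out .= fgets($fp, 128);
    }
    fclose($fp);
    // [0] = headers before Content-Type, [1] = the Content-Type line, [2] = everything after it
    $response = preg_split('/(Content-Type\:.+?)[\r\n]+/i', $out, -1, PREG_SPLIT_DELIM_CAPTURE);
    // header() accepts one header per call, so send the leading lines one by one
    foreach (preg_split('/[\r\n]+/', trim($response[0])) as $line) {
        header($line);
    }
    header($response[1]);
    print $response[2];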
    I just store the response in a string and split it at the point where the content begins. Anything up to this point, and the boundary itself (what's in the regex) can be sent through header(), the rest can be printed.
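
    A more general way to find the point where the content begins, sketched here as an alternative and not as what the code above does: HTTP separates the header block from the body with one empty line, so splitting at the first "\r\n\r\n" works whichever headers a server or proxy adds. The function name splitHttpResponse() and the sample response are made up for illustration.

```php
<?php
// Hypothetical helper: split a raw HTTP response into header lines and body.
function splitHttpResponse(string $raw): array
{
    // The header block ends at the first empty line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $body = $parts[1] ?? '';
    return [$headerLines, $body];
}

// Made-up sample response, including an extra header of the kind a proxy may add.
$raw = "HTTP/1.0 200 OK\r\n"
     . "Content-Type: text/xml\r\n"
     . "X-Cache: MISS from proxy.example\r\n"
     . "\r\n"
     . "<rss><channel></channel></rss>";

[$headerLines, $body] = splitHttpResponse($raw);
```

    Each entry of $headerLines could then be passed to header() one at a time, and $body printed or handed to a parser.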

    And now the problem: you will see that relative paths to images etc. are not resolved correctly, because the browser interprets them relative to your script file's address. It can be fixed, though; it's just work... much work: find all the paths, open a new connection for each, download the response and display it according to its content type.
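
    One shortcut for the relative-path problem, offered as a sketch and not as the approach described above: instead of fetching every image through the script, insert a <base href> element so the browser resolves relative paths against the original site. This assumes the browser can reach that site itself and that the page has a <head> tag. The function name addBaseHref() is made up for this example.

```php
<?php
// Hypothetical helper: insert <base href="..."> right after the opening <head> tag.
function addBaseHref(string $html, string $baseUrl): string
{
    $base = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '">';
    // Match <head> or <head ...attributes...> (but not <header>), first occurrence only.
    return preg_replace_callback(
        '/<head(\s[^>]*)?>/i',
        function ($m) use ($base) {
            return $m[0] . $base;
        },
        $html,
        1
    );
}

$html = '<html><head><title>t</title></head><body><img src="images/logo.gif"></body></html>';
$fixed = addBaseHref($html, 'http://www.codingforums.com/');
```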

    Concerning useful RSS libraries: the ones I posted in your thread about RSS processing in the XML forum weren't OK?

    EDIT: I'm going nuts with automatically parsed URLs... why is it impossible to edit the post so that URLs aren't parsed? The checkbox reappears every time I try to edit my post. Duh. People, don't click the link, it doesn't work.
    Last edited by mordred; 09-09-2003 at 03:34 PM.
    De gustibus non est disputandum.

  • #3
    Mega-ultimate member
    Actually, that shouldn't be too much of a problem because, as I said, I'm going to use this for an RSS news feed, which will have absolute paths for the links, so the relative-path issue should not arise.

    The real problem (what I posted earlier was simplified) was that, because I was getting the HTTP headers, my parser threw an error that the document was not well-formed.

    I'll work with what you posted to eliminate the http headers and try again.


    "Concerning useful RSS libraries: the ones I posted in your thread about RSS processing in the XML forum weren't OK?"
    They were great. The only problem is that all the RSS readers / aggregators I found use "include()" or "fopen()", which don't work from behind my proxy. The only way around that was to use fsockopen, hence the header problem and so on.
    Thanks for your help.
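
    For readers hitting the same wall: PHP's HTTP stream wrapper can be pointed at a proxy through a stream context, which lets fopen() or file_get_contents() go through the proxy without hand-written socket code. This is only a sketch, assuming a PHP version that supports the "proxy" and "request_fulluri" HTTP context options; the proxy address is the placeholder from the first post and buildProxyContext() is a made-up name.

```php
<?php
// Hypothetical helper: build a stream context that routes HTTP requests via a proxy.
function buildProxyContext(string $proxyHost, int $proxyPort)
{
    return stream_context_create([
        'http' => [
            'proxy'           => "tcp://$proxyHost:$proxyPort",
            // Some proxy servers require the full URL in the request line.
            'request_fulluri' => true,
            'timeout'         => 30,
        ],
    ]);
}

$context = buildProxyContext('192.168.1.1', 8080);
$options = stream_context_get_options($context);

// Usage (needs a reachable proxy, so it is left commented out):
// $feed = file_get_contents('http://www.codingforums.com/', false, $context);
```

    With this approach the wrapper strips the HTTP headers itself, so the returned string starts at the body.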

  • #4
    Mega-ultimate member
    Hmm, it looks better, but I'm still getting the following headers. How could I modify the regex to handle them as well?

    <headers>
    X-Cache: MISS from myProxyServer.com
    X-Cache: MISS from myProxyDomain
    Proxy-Connection: close
    </headers>

  • #5
    Senior Coder
    I suppose you just want to get rid of them in the final output, i.e. deleting them with the help of a regexp? This should do:

    $out = preg_replace('/^(x-cache|proxy)(.+?)[\n\r]+/im', '', $out);

    // and then splitting the "cleaned" output
    $response = preg_split(...
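
    To see what the preg_replace() pattern above does, here is a small check against a made-up response. The /m modifier makes ^ match at the start of every line and /i ignores case, so each line beginning with "X-Cache" or "Proxy" is deleted together with its line break. The sample data is invented for the example.

```php
<?php
// Made-up response containing proxy headers like those in the previous post.
$out = "HTTP/1.0 200 OK\r\n"
     . "Content-Type: text/xml\r\n"
     . "X-Cache: MISS from myProxyServer.com\r\n"
     . "Proxy-Connection: close\r\n"
     . "\r\n"
     . "<rss></rss>";

// Remove every line that starts with "x-cache" or "proxy" (case-insensitive).
$cleaned = preg_replace('/^(x-cache|proxy)(.+?)[\n\r]+/im', '', $out);
```

    Note that [\n\r]+ is greedy: when the removed header is the last one, the empty line after it is swallowed too. That does no harm for the Content-Type split used here, but it would matter if the response were split at the empty line instead.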

  • #6
    Mega-ultimate member
    Mmm, much better. Now another question: the XML parser function has this loop:

    Code:
    	while ($data = fread($fp, 4096))
    		xml_parse($xml_parser, $data, feof($fp))
    			or die(sprintf("XML error: %s at line %d",
    				xml_error_string(xml_get_error_code($xml_parser)),
    				xml_get_current_line_number($xml_parser)));

    That loop reads from a file handle ($fp), but I've got the contents in a variable. What's the easiest way to modify this? Write the variable to a local file and then open it with the parser, or modify the parser line to read the variable? If the latter, how would you do that?

  • #7
    Senior Coder
    It depends a little on the context of your application. If you need to work on the whole XML file and it's not too big, I would first accumulate the file contents in a string variable and when it's complete, parse it with

    xml_parse($parser, $data);

    However, the approach you have above with the repeated calling of xml_parse() and only sending a chunk of data is better, provided you only do limited work on the XML file, and it's quite big and needs to be downloaded over the network. Say for instance you just want the first three <foobar> elements, and don't care about the other megabytes of the XML file, you'd go with this approach.

    You could check in your while loop if the extracted data is already behind the headers, and if so, set a marker that XML parsing should be used from now on, and parse the data chunks in the following iterations.
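
    Both options can be sketched with the feed already held in a string. The snippet below assumes the HTTP headers are already gone and uses a made-up two-item feed; countItems() and the chunk size are illustrative. It feeds the string to xml_parse() piece by piece and passes true as the third argument only for the last piece, which is the same contract the fread() loop relies on. Passing the whole string with true in a single call works the same way.

```php
<?php
// Hypothetical example: count <item> elements in an XML string, parsing it chunk by chunk.
function countItems(string $xml, int $chunkSize = 16): int
{
    $count = 0;
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$count) {
            // Element names arrive upper-cased because case folding is on by default.
            if ($name === 'ITEM') {
                $count++;
            }
        },
        function ($parser, $name) {
            // Nothing to do when an element closes.
        }
    );

    $chunks = str_split($xml, $chunkSize);
    $last = count($chunks) - 1;
    foreach ($chunks as $index => $chunk) {
        // The third argument tells the parser whether this is the final chunk.
        if (!xml_parse($parser, $chunk, $index === $last)) {
            $error = xml_error_string(xml_get_error_code($parser));
            throw new RuntimeException("XML error: $error");
        }
    }
    return $count;
}

$feed = '<rss><channel><item><title>A</title></item><item><title>B</title></item></channel></rss>';
$items = countItems($feed);
```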
    Last edited by mordred; 09-11-2003 at 03:08 PM.

  • #8
    Mega-ultimate member

    Whoot!!!

    All I can say is, whoot! Thanks for the help mordred et al.

    So anyone else can have an RSS feeder, here are the files. They use fsockopen!

    File rssFeed.php (needs to be included via include)
    Code:
    <?php
    //SET PROXY SERVER IP HERE
    	$proxy = "ENTER PROXY IP HERE";
    //SET PROXY PORT HERE (usually 8080)
    	$port = 8080;
    //SET YOUR EMAIL ADDRESS HERE
    	$email = "my.email@something.com";
    //SET RESULT LIMITS
    	$limit = 3;
    //TO DISPLAY AN RSS FEED CALL THE openRSS FUNCTION WITH THE
    //URL OF THE NEWS FEED
    //EXAMPLE:
    //	openRSS("http://z.about.com/6/g/stlouis/b/index.xml");
    //END EXAMPLE
    	print "\n<!-- BEGIN DATA / NEWS FEED  -->\n";
    	print "<!-- For further information contact ".$email." -->\n";
    
    //DO NOT EDIT BELOW UNLESS YOU KNOW WHAT YOU ARE DOING
    $i=0;
    // *************** BEGIN FUNCTIONS ********************
    include_once("rssFunctions.php");
    function parseRSSFeed( $contents, $l ) {
    	global $limit;
    	$limit=$l;
    	$xml_parser = xml_parser_create();
    	xml_set_element_handler($xml_parser, "startElement", "endElement");
    	xml_set_character_data_handler($xml_parser, "characterData");
    
    	print "<ul>";
    //the whole feed arrives in one string, so mark it as the final chunk
    	xml_parse($xml_parser, $contents, true) or die(sprintf("XML error: %s at line %d",
    		xml_error_string(xml_get_error_code($xml_parser)),
    		xml_get_current_line_number($xml_parser)));
    	print "\n</ul>\n";
    	print "<!-- END DATA / NEWSFEED -->\n";
    	xml_parser_free($xml_parser);
    }
    function openRSS( $file, $l=5 ) {
    //the handlers in rssFunctions.php read these as globals
    	global $proxy, $port, $insideitem, $tag, $title, $description, $link, $i;
    //open a connection
    	$fp = fsockopen ($proxy, $port) or die("Unable to open proxy connection");
    	$out = '';
    //get the xml news feed; the Host header names the feed's server, not the proxy
    	$host = parse_url($file, PHP_URL_HOST);
    	fputs ($fp, "GET $file HTTP/1.0\r\nHost: $host\r\n\r\n");
    //read news feed
    	while (!feof($fp)) {
    		$out .= fgets ($fp,128);
    	}
    //close news file
    	fclose ($fp);
    //clean up headers
    	$out = preg_replace('/^(age)(.+?)[\n\r]+/im', '', $out);
    	$out = preg_replace('/^(x-cache|proxy)(.+?)[\n\r]+/im', '', $out);
    	$response = preg_split('/(Content-Type\:.+?)[\r\n]+/i', $out, -1, PREG_SPLIT_DELIM_CAPTURE);
    	$contents = $response[2];
    //begin reading feed
    //set default variables (reset the item counter so every feed starts from zero)
    	$i = 0;
    	$insideitem = false;
    	$tag = "";
    	$title = "";
    	$description = "";
    	$link = "";
    //parse the xml newsfeed
    	parseRSSFeed($contents, $l);
    }
    // *************** END FUNCTIONS ********************
    ?>
    and the rssFunctions.php file (included in the file above)
    Code:
    <?php
    function startElement($parser, $name, $attrs) {
    	global $insideitem, $tag, $title, $description, $link;
    	if ($insideitem) {
    		$tag = $name;
    	} elseif ($name == "ITEM") {
    		$insideitem = true;
    	}
    }
    function endElement($parser, $name) {
    	global $insideitem, $tag, $title, $description, $link, $i, $limit;
    	if ($name == "ITEM") {
    
    		if(trim($title)!="Customize this feed") {
    			$i++;
    			if($i<=$limit) {
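    				// escape the link as well, since it ends up inside a quoted attribute
    				printf("\n\t<li><strong><a href='%s'>%s</a></strong> <br />%s",
    					htmlspecialchars(trim($link), ENT_QUOTES),htmlspecialchars(trim($title)),htmlspecialchars(trim($description)));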
    				printf("</li>");
    			}
    		}
    		$title = "";
    		$description = "";
    		$link = "";
    		$insideitem = false;
    	}
    }
    function characterData($parser, $data) {
    	global $insideitem, $tag, $title, $description, $link;
    	if ($insideitem) {
    		switch ($tag) {
    			case "TITLE":
    			$title .= $data;
    			break;
    			case "DESCRIPTION":
    			$description .= $data;
    			break;
    			case "LINK":
    			$link .= $data;
    			break;
    		}
    	}
    }
    ?>
    This will print the news feed as an unordered list.

    Thanks again.

  • #9
    Mega-ultimate member
    Ok, not sure if I'm talking to myself here, but...

    I'm now going to try to modify this code to read the XML newsfeed from weather.com. I've signed up with them and they say that I need to store the feed in cache for at least 30 minutes.

    How would I do that?
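
    One common way to do this, sketched under assumptions: keep the last downloaded feed in a local file and only fetch it again once that file is older than 30 minutes. getCachedFeed(), the cache path and the $fetch callback are made-up names; $fetch stands for whatever downloads the feed, for example the fsockopen code above.

```php
<?php
// Hypothetical helper: return the cached copy if it is fresh, otherwise fetch and store.
function getCachedFeed(string $cacheFile, int $maxAgeSeconds, callable $fetch): string
{
    clearstatcache(true, $cacheFile);
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAgeSeconds) {
        return file_get_contents($cacheFile);
    }
    $data = $fetch();
    // LOCK_EX avoids two requests writing the file at the same moment.
    file_put_contents($cacheFile, $data, LOCK_EX);
    return $data;
}

// Demonstration with a fake fetcher that counts how often it is called.
$calls = 0;
$fetch = function () use (&$calls) {
    $calls++;
    return "<weather>sunny</weather>";
};
$cacheFile = sys_get_temp_dir() . '/weather_feed_' . uniqid() . '.xml';

$first  = getCachedFeed($cacheFile, 30 * 60, $fetch); // cache miss: calls $fetch
$second = getCachedFeed($cacheFile, 30 * 60, $fetch); // cache hit: reads the file
unlink($cacheFile);
```

    The string returned by getCachedFeed() can be handed to the XML parser exactly like a freshly downloaded feed.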

