  1. #1
    New to the CF scene
    Join Date
    Jan 2004
    Location
    Romania
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Counting all the words from a file...

    I would like to count all the words in a given file or website.

    That is, I want to count the frequency of each word.

    What is wrong with my code?

    http://www.redhits.com/cl/419/rank.txt

  • #2
    Regular Coder
    Join Date
    Jun 2002
    Location
    Montreal, Canada
    Posts
    644
    Thanks
    0
    Thanked 0 Times in 0 Posts
    What exactly are you trying to calculate? The number of occurrences of each word, or how many times words from your predefined dictionary array are found?

    I noticed you're reading 500 bytes from the file, parsing them, and looping like that until the file is completely read. The problem is that if a 500-byte chunk ends halfway through a word, that word gets split in two and it'll mess up all the calculations. I suggest you first loop through the file and append it all to a string, then do your calculations once that's done.
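
To illustrate the suggestion above, here is a minimal sketch that appends every chunk to one string and only then counts the words. The function names `read_all` and `count_words` and the sample text are made up for this example, and lowercasing the words is my own assumption, not something from the original code.

```php
<?php
// Hypothetical helper: read an open stream chunk by chunk into one string.
function read_all($handle, int $chunkSize = 500): string
{
    $buffer = '';
    while (!feof($handle)) {
        // Only accumulate here; a chunk may end in the middle of a word.
        $buffer .= fread($handle, $chunkSize);
    }
    return $buffer;
}

// Hypothetical helper: count how often each word occurs in a string.
function count_words(string $text): array
{
    // Split on any run of whitespace so spaces, tabs and line breaks all separate words.
    $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    $counts = array_count_values($words);
    arsort($counts); // most frequent word first
    return $counts;
}

// Demo on an in-memory stream; a real script would fopen() the file instead.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "the cat and the hat\nthe end");
rewind($handle);
print_r(count_words(read_all($handle)));
fclose($handle);
```

Because the counting happens only after the loop has finished, the chunk size no longer affects the result.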

  • #3
    Super Moderator
    Join Date
    May 2002
    Location
    Perth Australia
    Posts
    4,073
    Thanks
    11
    Thanked 98 Times in 96 Posts
    When code starts looking like that, nine times out of ten there is an easier way (I found out the hard way, OK).

    PHP Code:
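    <?php
    // Fetch the page, join its lines, strip the HTML tags,
    // then remove line breaks and tabs.
    $yaks = str_replace(
        array( "\r\n" , "\n" , "\t" ) ,
        '' ,
        strip_tags(
            implode( '' , file( 'http://www.redhits.com/cl/tfile.htm' ) )
        )
    );
    // Split on spaces and count how often each word occurs.
    $bits = array_count_values( explode( ' ' , $yaks ) ) ;
    // Sort by count, highest first, and dump the result.
    arsort( $bits ) ;
    print_r( $bits ) ;
    ?>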
    Note that PHP's strip_tags() is not too good at removing JavaScript (especially inline) and inline styles, so you may need to remove that stuff separately.
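
One possible way to do that separate clean-up, as a rough sketch: strip_tags() removes the `<script>` and `<style>` tags themselves but leaves the code between them in the text, so this drops those elements with a regular expression first. The function name `html_to_text` and the sample markup are hypothetical, and a regex like this is only an approximation that may miss unusual or malformed markup.

```php
<?php
// Hypothetical helper: reduce an HTML page to its visible text.
function html_to_text(string $html): string
{
    // Drop <script> and <style> elements together with their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1\s*>#is', ' ', $html);
    // Put a space before every tag so adjacent blocks do not fuse into one word.
    $html = str_replace('<', ' <', $html);
    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Made-up sample markup for illustration.
$html = '<p>Hello <b>world</b></p><script>var x = "hello";</script>'
      . '<style>p { color: red; }</style><p>bye</p>';
echo html_to_text($html), "\n";
```

The cleaned string can then go through explode() and array_count_values() as in the code above.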

