  1. #1
    New Coder
    Join Date
    May 2007
    Posts
    93
    Thanks
    4
    Thanked 0 Times in 0 Posts

    paying to fix accents in a small script (10 lines of code)

    I have a script that outputs the filenames of all files inside a zip. The problem is that accented characters are not displayed (é è â ô, etc.).

    Code:
    header('Content-Type: text/html; charset=utf-8');
    setlocale(LC_ALL, 'fr_CA.UTF-8');
    ini_set('display_startup_errors', 1);
    error_reporting(E_ALL);
    ini_set('display_errors', 1);

    $downloadlink = "test.zip";
    $za = new ZipArchive();
    // Open the archive once, with consistency checks, and stop on failure.
    if ($za->open($downloadlink, ZipArchive::CHECKCONS) !== true) {
        die('Could not open ' . $downloadlink);
    }
    // Print the base name of every entry in the archive.
    for ($i = 0; $i < $za->numFiles; $i++) {
        $stat = $za->statIndex($i);
        $toune = basename($stat['name']);
        echo "$toune<br>";
    }
    $za->close();
    You can download the file I am testing with (test.zip) here:
    pirate-punk.com/test.zip

    I need a fix that will display any filename with any accent.

    I've been desperately looking for help on multiple forums and nobody has come up with a solution. So just tell me your price (make it reasonable, please) and I will PayPal it to you once you can confirm a solution has been found.

  • #2
    Regular Coder nomanic's Avatar
    Join Date
    Feb 2009
    Location
    United Kingdom
    Posts
    255
    Thanks
    9
    Thanked 33 Times in 33 Posts
    <DmncAtrny> I will write on a huge cement block "BY ACCEPTING THIS BRICK THROUGH YOUR WINDOW, YOU ACCEPT IT AS IS AND AGREE TO MY DISCLAIMER OF ALL WARRANTIES, EXPRESS OR IMPLIED, AS WELL AS DISCLAIMERS OF ALL LIABILITY, DIRECT, INDIRECT, CONSEQUENTIAL OR INCIDENTAL, THAT MAY ARISE FROM THE INSTALLATION OF THIS BRICK INTO YOUR BUILDING."
    <DmncAtrny> And then hurl it through the window of a Sony officer
    <DmncAtrny> and run like hell

    Portfolio, Tutorials - http://www.nomanic.biz/

  • #3
    Senior Coder
    Join Date
    Sep 2010
    Posts
    2,186
    Thanks
    15
    Thanked 253 Times in 253 Posts
    Did you try my suggestion of just unpacking the archive into a directory? Your accents are extended ASCII characters, which PHP doesn't handle very well.
    Welcome to http://www.myphotowizard.net

    where you can edit images, make a photo calendar, add text to images, and do much more.


    When you know what you're doing it's called Engineering, when you don't know, it's called Research and Development. And you can always charge more for Research and Development.

  • #4
    New Coder
    Join Date
    May 2007
    Posts
    93
    Thanks
    4
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by DrDOS View Post
    Did you try my suggestion of just unpacking the archive into a directory? Your accents are extended ASCII characters, which PHP doesn't handle very well.
    That wouldn't work for me. I have 4000 zip files averaging 100 MB each. I can't extract them just to get the file names, and I can't temporarily extract them each time a visitor needs the file list because it would overload my CPU.

  • #5
    Senior Coder
    Join Date
    Sep 2010
    Posts
    2,186
    Thanks
    15
    Thanked 253 Times in 253 Posts
    Limiting your options doesn't improve your chance of success. The unpacked files take up the same space on the server. If you have to add another hard drive, no big deal, it just adds backup for the files. Continually accessing the files in the archive uses CPU too, about the same time as unpacking the archive. PHP has a sleep function; you could let it sleep(10); between unpackings to allow the server to do other jobs.

    You have to choose between something that 'sorta' works and something that works.

  • #6
    New Coder
    Join Date
    May 2007
    Posts
    93
    Thanks
    4
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by DrDOS View Post
    Limiting your options doesn't improve your chance of success. The unpacked files take up the same space on the server. If you have to add another hard drive, no big deal, it just adds backup for the files. Continually accessing the files in the archive uses CPU too, about the same time as unpacking the archive. PHP has a sleep function; you could let it sleep(10); between unpackings to allow the server to do other jobs.

    You have to choose between something that 'sorta' works and something that works.
    I can't upgrade my dedicated server with a new HD; I would have to buy another server.

    Anyway, I tried what you said:

    Code:
    <?php
    $root = $_SERVER['DOCUMENT_ROOT'];
    $zip = new ZipArchive;
    if ($zip->open('test.zip') === TRUE) {
        // Double quotes are needed so that $root is interpolated into the path.
        $zip->extractTo("$root/test/");
        $zip->close();
        echo 'ok';
    } else {
        echo 'failed';
    }
    
    ?>
    Once extracted, the accents in the file names are still missing.
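
One way to narrow this down, offered as a sketch rather than a confirmed fix: read the raw entry names and check whether they are already valid UTF-8. Older zip tools often store names in a DOS code page, so the example falls back to converting from CP850. That code page is an assumption and may need changing for a given archive. `zipNameToUtf8()` and `listZipNamesUtf8()` are hypothetical helper names, and `ZipArchive::FL_ENC_RAW` is only available on reasonably recent PHP/libzip builds.

```php
<?php
// Sketch: return a zip entry name as UTF-8.
// ASSUMPTION: names that are not valid UTF-8 were stored in the DOS code
// page CP850; change $fallback if the archives were made with another one.
function zipNameToUtf8(string $raw, string $fallback = 'CP850'): string
{
    // With the /u modifier, preg_match() fails on invalid UTF-8 input.
    if (preg_match('//u', $raw) === 1) {
        return $raw; // already valid UTF-8, leave untouched
    }
    $converted = iconv($fallback, 'UTF-8', $raw);
    return $converted === false ? $raw : $converted;
}

// List every entry name without extracting anything.
function listZipNamesUtf8(string $zipPath): array
{
    $names = [];
    $za = new ZipArchive();
    if ($za->open($zipPath) !== true) {
        return $names;
    }
    for ($i = 0; $i < $za->numFiles; $i++) {
        // FL_ENC_RAW asks libzip for the name bytes exactly as stored.
        $raw = $za->getNameIndex($i, ZipArchive::FL_ENC_RAW);
        if ($raw !== false) {
            $names[] = zipNameToUtf8($raw);
        }
    }
    $za->close();
    return $names;
}
```

Printing `bin2hex()` of a raw name next to the converted one is a quick way to see which code page an archive really uses.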

  • #7
    Senior Coder
    Join Date
    Sep 2010
    Posts
    2,186
    Thanks
    15
    Thanked 253 Times in 253 Posts
    OK, I see that the file names are corrupted when PHP unzips the file, so you can't use it for that. However, when I do this:

    `unzip '86 Crew - 2000 - Bad Bad Reggae.zip' `;

    I get clean file names. Note the backticks. You can put that line in a PHP file, just as it is, and it will extract the zip file. This is on a Linux machine, with Bash. I'll look for code that will let you open, as opposed to extract, a single file from the archive. However, that doesn't mean you're out of the woods if I find it (and I will); the file names may cause other problems, and you still may end up having to extract all the archives. You might want to download all the zips to your own machine as a backup in case there are problems.
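
For reference, the backtick operator in PHP runs its contents as a shell command, the same as `shell_exec()`. A minimal sketch of the same call with the archive name quoted by `escapeshellarg()` might look like the following. The helper names and the target directory are made up for illustration; `-o` (overwrite without prompting) and `-d` (target directory) are standard Info-ZIP unzip options.

```php
<?php
// Build an unzip command line with shell-safe quoting.
// -o overwrites existing files without prompting, -d sets the target directory.
function buildUnzipCommand(string $archive, string $destDir): string
{
    return 'unzip -o ' . escapeshellarg($archive) . ' -d ' . escapeshellarg($destDir);
}

// Run the command; returns unzip's output, or null if nothing came back.
function unzipWithShell(string $archive, string $destDir): ?string
{
    // shell_exec() is the function form of the backtick operator.
    $out = shell_exec(buildUnzipCommand($archive, $destDir));
    return is_string($out) ? $out : null;
}
```

One caveat: `escapeshellarg()` can drop non-ASCII bytes when the process locale is not a UTF-8 one, so an archive whose own name contains accents may need something like `setlocale(LC_CTYPE, 'en_US.UTF-8')` first.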

  • #8
    Regular Coder patryk's Avatar
    Join Date
    Oct 2012
    Location
    /dev/couch
    Posts
    398
    Thanks
    2
    Thanked 64 Times in 64 Posts
    here you go:
    Code:
    <?php
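    header('Content-Type: text/plain; charset=utf-8');

    // List the archive quietly and keep only the 4th column (the file name).
    $data = shell_exec('unzip -qq -l "test.zip" | awk \'{print $4}\'');
    //$data = shell_exec('unzip -ql test.zip');

    // Callback: turn one "#Uxxxx" escape into the UTF-8 character it stands for.
    function uni2chr($o) {
    	// Rewrite "#U00e9" as the JSON escape "\u00e9" and let json_decode() expand it.
    	$escape = str_replace('#U', '\u', $o[0]);
    	$decoded = json_decode('{"chr":"' . $escape . '"}', true);
    	return $decoded['chr'];
    }

    // Replace every "#U" escape that is followed by four hex digits.
    $replaced = preg_replace_callback('|#U[0-9a-f]{4}|', 'uni2chr', $data);

    echo $replaced;
    ?>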
    You'll need to modify it if you have spaces in filenames, though. And read your PMs :P

    -------------------------------------------------------------------------------
    "Real Programmers can write assembly code in any language" - Larry Wall

  • #9
    Senior Coder
    Join Date
    Sep 2010
    Posts
    2,186
    Thanks
    15
    Thanked 253 Times in 253 Posts
    If patryk's method works, that's a great thing, because the only way I've found to extract the files with a clean file name is directly on the command line. Anything else corrupts them. Only the PHP chr and ord functions seem to handle them well.

  • #10
    Regular Coder patryk's Avatar
    Join Date
    Oct 2012
    Location
    /dev/couch
    Posts
    398
    Thanks
    2
    Thanked 64 Times in 64 Posts
    Quote Originally Posted by DrDOS View Post
    If patryk's method works, that's a great thing, because the only way I've found to extract the files with a clean file name is directly on the command line. Anything else corrupts them. Only the PHP chr and ord functions seem to handle them well.
    If you have unzip installed then it will work, and it is a quasi command-line solution, by the way.
    Code:
    shell_exec('unzip -qq -l "test.zip" | awk \'{print $4}\'');
    returns something like this:
    Code:
    #U00e9_file2.txt
    e_file1.txt
    So if you replace the hex Unicode values of the 'accented' characters, you'll get what you want (that's where preg_replace_callback() and json_decode() come in).
    But like I said: it will only work if there are no spaces in the file names. Otherwise you'll have to use something other than ... | awk '{print $4}'
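
One possible way around the spaces problem, sketched under the assumption that Info-ZIP unzip is installed: its zipinfo mode, `unzip -Z1`, prints only the entry names, one per line, so no awk column splitting is needed. The function names below are hypothetical, and whether the names come out with `#Uxxxx` escapes depends on the unzip build and locale, so the decoder is written to leave ordinary text alone.

```php
<?php
// Turn Info-ZIP "#Uxxxx" escapes back into UTF-8 characters.
function decodeUnzipEscapes(string $listing): string
{
    $decoded = preg_replace_callback(
        '/#U([0-9a-fA-F]{4})/',
        function (array $m): string {
            // json_decode() understands \uXXXX escapes inside a JSON string.
            $char = json_decode('"\u' . $m[1] . '"');
            // Leave the escape untouched if it cannot be decoded.
            return is_string($char) ? $char : $m[0];
        },
        $listing
    );
    return $decoded ?? $listing;
}

// List entry names, one per line, without extracting anything.
// ASSUMPTION: Info-ZIP unzip is on the PATH; -Z1 is its zipinfo mode that
// prints only the file names, so names containing spaces stay intact.
function listZipNamesWithUnzip(string $zipPath): array
{
    $out = shell_exec('unzip -Z1 ' . escapeshellarg($zipPath));
    if (!is_string($out) || $out === '') {
        return [];
    }
    return explode("\n", rtrim(decodeUnzipEscapes($out), "\n"));
}
```

For example, `decodeUnzipEscapes("#U00e9_file2.txt")` gives `é_file2.txt`, and a name such as `Caf#U00e9 du coin.mp3` keeps its spaces.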


  • #11
    New Coder
    Join Date
    May 2007
    Posts
    93
    Thanks
    4
    Thanked 0 Times in 0 Posts
    Thanks for the answer guys.

    Just one question: does running the shell_exec "unzip" command actually extract temporary files on my server? Because, like I explained, I am already very limited in drive space and I can't afford to extract files just to get the names (I already have 500 GB of zip files).

    Also, will it require significantly more CPU usage than the code I was originally using?

  • #12
    Senior Coder
    Join Date
    Sep 2010
    Posts
    2,186
    Thanks
    15
    Thanked 253 Times in 253 Posts
    Quote Originally Posted by anarchoi View Post
    Thanks for the answer guys.

    Just one question: does running the shell_exec "unzip" command actually extract temporary files on my server? Because, like I explained, I am already very limited in drive space and I can't afford to extract files just to get the names (I already have 500 GB of zip files).

    Also, will it require significantly more CPU usage than the code I was originally using?
    I'm actually making some kind of progress on this. I've found that you can make a listing of the files in the archive, and that you can extract a file on command and possibly pipe it to an application without having to download it. It will have the #U type filename, not the accented character, but you can use UTF-8 to represent the accented characters on a web page, so everything should work.

    Now, why I got involved with this is that I'm coding an encryption application, with web pages and all, that includes use of the accented characters. So I've got two projects that fit together. One thing I've found, that may be important, is that you can't use echo or print with them; you must use file_put_contents and file_get_contents. You can also put them in an array, implode the array, and use file_put_contents to put the result in a file.
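
The "extract a file on command and pipe it" idea could look roughly like the sketch below. It assumes Info-ZIP unzip, whose `-p` option writes a member's data to standard output instead of to disk; the function names are hypothetical and the content type is a placeholder.

```php
<?php
// Build a command that writes one archive member to standard output.
// -p makes unzip send the file data to stdout instead of writing to disk.
function buildStreamCommand(string $archive, string $member): string
{
    return 'unzip -p ' . escapeshellarg($archive) . ' ' . escapeshellarg($member);
}

// Send one member to the browser without extracting it to disk.
function streamZipMember(string $archive, string $member): void
{
    // Placeholder content type; pick one that matches the member.
    header('Content-Type: application/octet-stream');
    // passthru() forwards the command's raw output directly to the client.
    passthru(buildStreamCommand($archive, $member));
}
```

Note that unzip treats the member argument as a wildcard pattern, so names containing characters such as `[`, `*` or `?` may need extra escaping.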

    One problem I have is that the sound on this buggy went out a couple of months ago (it works for YouTube), so I can't hear the mp3s.

  • #13
    Regular Coder patryk's Avatar
    Join Date
    Oct 2012
    Location
    /dev/couch
    Posts
    398
    Thanks
    2
    Thanked 64 Times in 64 Posts
    Quote Originally Posted by anarchoi View Post
    Thanks for the answer guys.

    Just one question: does running the shell_exec "unzip" command actually extract temporary files on my server? Because, like I explained, I am already very limited in drive space and I can't afford to extract files just to get the names (I already have 500 GB of zip files).

    Also, will it require significantly more CPU usage than the code I was originally using?
    Don't worry, unzip with the '-l' switch won't extract anything; it will just list the files inside your zip archive. CPU usage should be about the same. If anything, it will take fewer CPU cycles, since sh (most servers use the sh shell) is way more efficient than PHP in that regard.


