  1. #1
    Regular Coder
    Join Date
    Dec 2007
    Posts
    269
    Thanks
    28
    Thanked 0 Times in 0 Posts

    regular expression for special character

    I am using the following regex for usernames. Why does it reject Turkish characters?

    Code:
    "/^[a-zA-Z ]+$/"

  • #2
    Regular Coder PHP6's Avatar
    Join Date
    Aug 2008
    Location
    Czech Republic
    Posts
    238
    Thanks
    18
    Thanked 34 Times in 33 Posts
    I think that is because you have allowed only the characters a to z, A to Z, and space. All of those are basic Latin letters. If you need Turkish characters, you have to add them as well.
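    A minimal sketch of the point above, assuming the script file and the sample names are UTF-8 encoded (the sample names are only illustrative):

```php
<?php
// The original pattern from post #1: ASCII letters and spaces only.
$asciiOnly = '/^[a-zA-Z ]+$/';

// preg_match() returns 1 on a match and 0 on no match.
$plain   = preg_match($asciiOnly, 'zode');   // every character is in a-z
$turkish = preg_match($asciiOnly, 'zodeş');  // ş is outside a-z and A-Z

var_dump($plain);    // int(1)
var_dump($turkish);  // int(0)
```

    The class [a-zA-Z ] lists exactly 53 characters, so any letter with a cedilla, breve, or other diacritic falls outside it.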

  • #3
    Regular Coder
    Quote Originally Posted by PHP6
    If you need Turkish characters, you have to add them as well.
    I know I have to add them, but I do not know how.

  • #4
    Regular Coder PHP6's Avatar
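    Try the following syntax to add a character to your list by its code:

    \xhh - character with hex code hh
    For example, adding \xA9 will allow users to enter the copyright sign (\xAE is the registered sign). Note that \xhh only reaches codes up to FF; for higher code points such as ş (U+015F), PCRE uses the form \x{hhhh} together with the u (UTF-8) modifier.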
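    As a rough sketch of this approach, the Turkish letters that lie outside a-z and A-Z could be listed by code point. The variable names are only illustrative, and the sample names are assumed to be UTF-8:

```php
<?php
// Turkish letters outside a-z/A-Z, written as code points:
// Ç ç (C7, E7), Ğ ğ (11E, 11F), İ ı (130, 131),
// Ö ö (D6, F6), Ş ş (15E, 15F), Ü ü (DC, FC).
// Single quotes keep the backslashes intact for PCRE.
$turkish = '\x{C7}\x{E7}\x{11E}\x{11F}\x{130}\x{131}'
         . '\x{D6}\x{F6}\x{15E}\x{15F}\x{DC}\x{FC}';

// The u modifier puts PCRE in UTF-8 mode, which \x{...} above FF requires.
$pattern = '/^[a-zA-Z ' . $turkish . ']+$/u';

var_dump(preg_match($pattern, 'zodeş'));  // int(1)
var_dump(preg_match($pattern, 'çşğ'));    // int(1)
var_dump(preg_match($pattern, 'zode9'));  // int(0): digits are not in the class
```

    Listing code points keeps the accepted alphabet explicit, at the cost of having to extend the list for every further language.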

  • #5
    ess
    Regular Coder
    Join Date
    Oct 2006
    Location
    United Kingdom
    Posts
    866
    Thanks
    7
    Thanked 30 Times in 29 Posts
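    Use Unicode-based character validation when accepting input in any language other than English.

    For example, the range U+0041 to U+007A covers the basic Latin letters A-Z and a-z, plus the six punctuation characters U+005B to U+0060 that sit between them. Note that the \uXXXX notation, as in /^[\u0041-\u007A]+$/, is JavaScript syntax; in PHP's PCRE the same range is written /^[\x{0041}-\x{007A}]+$/u.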

    Please visit http://www.unicode.org/ to identify the right character ranges for the languages you wish to support.

    Cheers
    ~E
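    A small sketch of what the range U+0041 to U+007A accepts and rejects in PHP, assuming UTF-8 input (the sample strings are illustrative):

```php
<?php
// Range U+0041..U+007A in PCRE syntax; the u modifier enables UTF-8 mode.
$range = '/^[\x{0041}-\x{007A}]+$/u';

var_dump(preg_match($range, 'zode'));   // int(1)
var_dump(preg_match($range, 'zo_de'));  // int(1): "_" is U+005F, inside the range
var_dump(preg_match($range, 'zodeş'));  // int(0): ş is U+015F, outside the range
```

    A contiguous code point range is rarely the same thing as "the letters of a language", so it is worth checking the code charts before relying on one.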

  • #6
    Regular Coder PHP6's Avatar
    I have just searched the PHP manual and found something interesting. I had no idea that regular expressions supported this; maybe other forum users will like it too:

    Unicode character properties
    Since PHP 4.4.0 and 5.1.0, three additional escape sequences to match generic character types are available when UTF-8 mode is selected. They are:

    \p{xx}
    a character with the xx property

    \P{xx}
    a character without the xx property

    \X
    an extended Unicode sequence

    The property names represented by xx above are limited to the Unicode general category properties. Each character has exactly one such property, specified by a two-letter abbreviation. For compatibility with Perl, negation can be specified by including a circumflex between the opening brace and the property name. For example, \p{^Lu} is the same as \P{Lu}.

    If only one letter is specified with \p or \P, it includes all the properties that start with that letter. In this case, in the absence of negation, the curly brackets in the escape sequence are optional; these two examples have the same effect:

    \p{L}
    \pL
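    To illustrate the two equivalences just described, here is a short sketch. The sample name is an assumption and must be UTF-8:

```php
<?php
$name = 'Şule';  // illustrative sample: one upper case letter, three lower case

// \p{L} and \pL are two spellings of the same property test.
$braced   = preg_match('/^\p{L}+$/u', $name);
$unbraced = preg_match('/^\pL+$/u', $name);

// \p{^Lu} and \P{Lu} both mean "not an upper case letter".
// preg_match_all() returns the number of matches found.
$caret   = preg_match_all('/\p{^Lu}/u', $name);
$capital = preg_match_all('/\P{Lu}/u', $name);

var_dump($braced, $unbraced);  // int(1) int(1)
var_dump($caret, $capital);    // int(3) int(3)
```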

    Table 1. Supported property codes

    C Other
    Cc Control
    Cf Format
    Cn Unassigned
    Co Private use
    Cs Surrogate
    L Letter
    Ll Lower case letter
    Lm Modifier letter
    Lo Other letter
    Lt Title case letter
    Lu Upper case letter
    M Mark
    Mc Spacing mark
    Me Enclosing mark
    Mn Non-spacing mark
    N Number
    Nd Decimal number
    Nl Letter number
    No Other number
    P Punctuation
    Pc Connector punctuation
    Pd Dash punctuation
    Pe Close punctuation
    Pf Final punctuation
    Pi Initial punctuation
    Po Other punctuation
    Ps Open punctuation
    S Symbol
    Sc Currency symbol
    Sk Modifier symbol
    Sm Mathematical symbol
    So Other symbol
    Z Separator
    Zl Line separator
    Zp Paragraph separator
    Zs Space separator

    Extended properties such as "Greek" or "InMusicalSymbols" are not supported by PCRE.

    Specifying caseless matching does not affect these escape sequences. For example, \p{Lu} always matches only upper case letters.

    The \X escape matches any number of Unicode characters that form an extended Unicode sequence. \X is equivalent to (?>\PM\pM*).

    That is, it matches a character without the "mark" property, followed by zero or more characters with the "mark" property, and treats the sequence as an atomic group (see below). Characters with the "mark" property are typically accents that affect the preceding character.
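    The difference between a single character and an extended sequence can be seen with a letter followed by a combining accent. This is only a sketch; the "\u{...}" string escape assumes PHP 7 or later:

```php
<?php
// "e" followed by U+0301 COMBINING ACUTE ACCENT, then a plain "a".
$text = "e\u{0301}a";

// In UTF-8 mode "." matches one code point at a time.
$codePoints = preg_match_all('/./u', $text);

// \X matches a base character together with its following marks.
$sequences = preg_match_all('/\X/u', $text);

var_dump($codePoints);  // int(3)
var_dump($sequences);   // int(2)
```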

    Matching characters by Unicode property is not fast, because PCRE has to search a structure that contains data for over fifteen thousand characters. That is why the traditional escape sequences such as \d and \w do not use Unicode properties in PCRE.

  • Users who have thanked PHP6 for this post:

    zodehala (09-08-2008)

  • #7
    Regular Coder PHP6's Avatar
    Here is a working example. It should work for any language; I have successfully tested it with UTF-8 characters in a couple of languages (Czech, German and English).

    PHP Code:
    // The u modifier turns on UTF-8 mode, which \p{L} needs to see multibyte letters.
    if (preg_match('/^[\p{L}\s]+$/u', $userName)) {
      echo 'user name is OK';
    }

    Please post whether it works for you; I want to save this topic for future reference.
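    For repeated use, the check could be wrapped in a small function. This is a sketch rather than a definitive implementation: the name isValidUserName and the sample names are hypothetical, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper; the name is not from the thread.
function isValidUserName(string $userName): bool
{
    // \p{L} = any Unicode letter, \s = whitespace, u = UTF-8 mode.
    // Comparing with 1 also treats an error (false) as "not valid".
    return preg_match('/^[\p{L}\s]+$/u', $userName) === 1;
}

// Try a few illustrative names, with and without digits.
$samples = ['zode', 'zodeş', 'çşğ', 'Jiří Novák', '9zode', 'zo9de'];
foreach ($samples as $candidate) {
    echo $candidate, ' => ', isValidUserName($candidate) ? 'OK' : 'WRONG', "\n";
}
```

    Because \s is part of the class, a name consisting only of spaces would also pass; trimming the input and checking for an empty string first may be worth considering.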

  • Users who have thanked PHP6 for this post:

    zodehala (09-08-2008)

  • #8
    Regular Coder
    Thanks to all. It works.

  • #9
    Regular Coder
    Oh, sorry: it works, but only for names with Turkish characters (ç, ğ, ö, ş), not for plain Latin-1 names.

    For example:

    zodeş is OK, but zode is not OK
    zode1 is OK, but zodehala is not

    I want to allow Latin-1 plus Turkish characters.

    What should I do?
    Last edited by zodehala; 09-09-2008 at 09:52 AM.

  • #10
    Regular Coder PHP6's Avatar
    That looks very strange. I checked both zodeş and zode and they were OK. Just in case, try adding a-zA-Z to the class; that should definitely help.

    PHP Code:
    <?php
    $userName = 'zodeş';
    $userName = 'zode'; // overrides the line above; comment one out to test the other
    // a-zA-Z is already covered by \p{L}; the u modifier enables UTF-8 mode.
    if (preg_match('/^[\p{L}a-zA-Z\s]+$/u', $userName)) {
      echo 'user name is OK';
    }
    ?>
    I am not sure I understand the question properly. Do you perhaps want to allow only Turkish characters and reject regular Latin ones?

  • #11
    Regular Coder
    Quote Originally Posted by PHP6
    I am not sure I understand the question properly. Do you perhaps want to allow only Turkish characters and reject regular Latin ones?
    It should accept only letters (including Turkish characters), not digits.

    The code above does not run correctly for me. Its output is as follows:

    zode                   -> user name is OK
    zodeş                  -> user name is OK
    çşğ                    -> user name is OK
    9zode, zode9 or zo9de  -> user name is OK

    Can a human name start with a number, or contain one? In Turkish, no.

    It should accept only letters (including Turkish characters), not digits.

    Is that clear?

  • #12
    Regular Coder PHP6's Avatar
    So, as I see it, I understood everything correctly. There must be some kind of error on your side, since the script works fine on my server.

    input:
    PHP Code:
    <?php
    $userName = '9zodeş';
    // u modifier: treat pattern and subject as UTF-8 so \p{L} sees whole letters.
    if (preg_match('/^[\p{L}\s]+$/u', $userName)) {
      echo 'user name is OK';
    } else {
      echo 'user name is WRONG';
    }
    ?>
    output:
    user name is WRONG
    input:
    PHP Code:
    <?php
    $userName = 'çşğ';
    // u modifier: treat pattern and subject as UTF-8 so \p{L} sees whole letters.
    if (preg_match('/^[\p{L}\s]+$/u', $userName)) {
      echo 'user name is OK';
    } else {
      echo 'user name is WRONG';
    }
    ?>
    output:
    user name is OK
    P.S. As far as I know, there is no language in the world that uses numbers in names.
    Can a human name start with a number, or contain one? In Turkish, no.
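    One possible reason for results differing between two servers, offered as a guess rather than a diagnosis of this case, is that the input is not UTF-8 on one of them. The sketch below assumes PHP 7 or later and uses ISO-8859-9 (the Turkish Latin-5 encoding) as a hypothetical alternative encoding:

```php
<?php
// The same name in two encodings (both inputs are illustrative).
$utf8   = "zode\u{015F}";  // ş as the two UTF-8 bytes C5 9F
$latin5 = "zode\xFE";      // ş as the single ISO-8859-9 byte FE

$pattern = '/^[\p{L}\s]+$/u';

$utf8Result = preg_match($pattern, $utf8);
var_dump($utf8Result);    // int(1)

// With the u modifier, a subject that is not valid UTF-8 is an error:
// preg_match() returns false instead of 0 or 1.
$latin5Result = preg_match($pattern, $latin5);
$latin5Error  = preg_last_error();
var_dump($latin5Result);                          // bool(false)
var_dump($latin5Error === PREG_BAD_UTF8_ERROR);   // bool(true)
```

    Since a plain if (preg_match(...)) treats false the same as "no match", checking preg_last_error() can help tell a rejected name apart from an encoding problem.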

  • Users who have thanked PHP6 for this post:

    zodehala (09-09-2008)

