Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 4 of 4
  1. #1
    Regular Coder
    Join Date
    May 2012
    Posts
    153
    Thanks
    115
    Thanked 0 Times in 0 Posts

    problem when copying fom "MS Word" to database

    I have a form in the user interface of my website, where the user can type in a paragraph and it will display on a page of their website.

    It works fine if they just type it in, BUT if if they type their content in "MS Word" and then COPY it into the textarea (in my user input), I get a wierd charactor everyplace there should be a double quote " or a series of periods. I think "Word" does not always output normal asscii characters.

    In summary..... Double quotes normally look like this ", whereas copied from “Word”, they look like this
    And series of periods normally look like this . . . . whereas Word converts them to this ……………

    Is there anyway to deal with this? Thank you in advance.

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,613
    Thanks
    80
    Thanked 4,634 Times in 4,596 Posts
    No, of course MS Word (or *ANY* good word processing program!) does not limit itself to the silly limited set of ASCII characters! Why would it *WANT* to?

    Word processors *want* to use all those characters to make the final output look better.

    So they use "smart quotes" and en-dashes and em-dashes and ellipses and much more
    He said, “now is the time for all ‘good’ men to come to!”
    And he said it—loudly—tongue-in-cheek.
    See smart quotes *AND* smart apostrophes there? And the em-dash? (You might call it "long dash".)

    If you are going to allow people to paste word processed text into your <textarea>s, or even into standard text fields, then you will need to store the text as UNICODE and *NOT* as ASCII. In MySQL, that means storing the data as UTF8, most likely.

    You can specify that all fields in a table are Unicode by putting the charset at the end of CREATE TABLE, thus:
    Code:
    CREATE TABLE mytable (
        name varchar(100),
        address varchar(50),
        city varchar(20)
    ) ENGINE = InnoDB DEFAULT CHARSET = utf8;
    Or you can pick and choose which fields are unicode and which are not:
    Code:
    CREATE TABLE mixed (
        name varchar(100) CHARSET utf8,
        email varchar(50)
    );
    (Here, email will be the default character set of your current MySQL *installation*, likely CHARSET latin1, whereas name will be Unicode.)
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    Eggweezer (02-02-2013)

  • #3
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,613
    Thanks
    80
    Thanked 4,634 Times in 4,596 Posts
    By the by, if you have an existing table that has been specified wrongly, you *CAN* alter it.

    How to tell? Use the SHOW CREATE TABLE yourtablename; command.

    It will dump out the table and show you the CHARSET default for the entire table and the character set of each field, if different than the table default.

    For example:
    Code:
    mysql> show create table xxx;
    +-------+----------------------------------------
    | Table | Create Table
    +-------+----------------------------------------
    | xxx   | CREATE TABLE `xxx` (
      `name` varchar(100) DEFAULT NULL,
      `email` varchar(20) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
    +-------+----------------------------------------
    Okay, so that table has two fields and both use the DEFAULT for the table character set of "latin1".

    So I can do:
    Code:
    mysql> alter table xxx modify name varchar(100) charset utf8;
    Query OK, 0 rows affected (0.05 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> show create table xxx;
    +-------+--------------------------------------------------------------------
    | Table | Create Table
    +-------+--------------------------------------------------------------------
    | xxx   | CREATE TABLE `xxx` (
      `name` varchar(100) CHARACTER SET utf8 DEFAULT NULL,
      `email` varchar(20) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
    +-------+--------------------------------------------------------------------
    See? The table still defaults to "latin1"--and that means the EMAIL field will be "latin1". But the NAME field is now UTF8 and so can hold Unicode characters.

    Easy with MySQL!
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    Eggweezer (02-02-2013)

  • #4
    Regular Coder
    Join Date
    May 2012
    Posts
    153
    Thanks
    115
    Thanked 0 Times in 0 Posts
    Wow. That enlightened me. Very interesting.
    Thank you very much Old Pendant.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •