Using Latin character sets for storing emails and domain names in MySQL


The generic advice on Stack Overflow is to use utf8 or utf8mb4 everywhere in MySQL, even for fields that will only ever contain Latin characters.
What is the best character set for email field?
best character set and collation for storing Tags, and URLs in MySQL DB

To clarify, for a column containing only Latin characters, would using utf8mb4:
…result in a larger index and higher memory usage?
…use more storage space when using column type varchar(100) or char(100)?
…allow more than 100 characters to be stored in column type varchar(100) or char(100)?

It’s 2017. Use utf8mb4 and VARCHAR(255) for every generic "string" field unless you have a very compelling reason to deviate from that. Even pure English speakers love to use quirky non-Latin characters in situations like "¯\_(ツ)_/¯" and "ᕕ( ᐛ )ᕗ" or even emoji.
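To make the utf8 versus utf8mb4 distinction concrete, here is a minimal Python sketch. It assumes that MySQL's legacy utf8 character set (utf8mb3) stores at most 3 bytes per character while utf8mb4 stores up to 4, and it uses Python's own UTF-8 codec as a stand-in for MySQL's encoding. The function names are made up for illustration.

```python
def max_utf8_bytes_per_char(text: str) -> int:
    """Return the widest UTF-8 encoding of any single character in text."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_in_3_byte_utf8(text: str) -> bool:
    # Assumption: MySQL's legacy utf8 (utf8mb3) stores at most
    # 3 bytes per character, so 4-byte characters need utf8mb4.
    return max_utf8_bytes_per_char(text) <= 3


shrug = "¯\\_(ツ)_/¯"      # kaomoji built from 1-, 2- and 3-byte characters
emoji = "\U0001F600"       # a grinning-face emoji, 4 bytes in UTF-8

print(fits_in_3_byte_utf8(shrug))  # True
print(fits_in_3_byte_utf8(emoji))  # False
```

Under that assumption the kaomoji would survive in a legacy utf8 column, but the emoji would not, which is the practical argument for utf8mb4.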

Email addresses can contain non-ASCII characters both in the domain component and in the local-part before the @. Whatever rules there were for these things seem to be getting thrown out the window one by one, so all bets are off for what the future holds. Hopefully the @ stays; that’s the only thing I’d count on.

Unless you have a system that is juggling billions of email addresses in memory, the storage cost of a VARCHAR is largely irrelevant. Remember, VARCHAR(100) and VARCHAR(255) take exactly the same amount of space for a 50-character string. The only thing the 100-length field does is get on someone’s nerves when their email address is "too long" and gets trimmed arbitrarily.
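As a rough illustration of that claim, the sketch below models the general rule from the MySQL manual: a VARCHAR value is stored as a length prefix plus the encoded data, with a 1-byte prefix when the column can never need more than 255 bytes and a 2-byte prefix otherwise. The helper name and its parameters are hypothetical, and actual on-disk sizes depend on the storage engine and row format, so treat the numbers as an approximation.

```python
def varchar_storage_bytes(value: str, declared_chars: int,
                          encoding: str = "utf-8",
                          max_bytes_per_char: int = 4) -> int:
    """Approximate the bytes needed to store value in VARCHAR(declared_chars)."""
    if len(value) > declared_chars:
        raise ValueError("value exceeds the declared character length")
    # One length byte if the column can never exceed 255 bytes, else two.
    prefix = 1 if declared_chars * max_bytes_per_char <= 255 else 2
    return prefix + len(value.encode(encoding))


address = "a" * 38 + "@example.com"   # a 50-character, ASCII-only address

# utf8mb4 columns (up to 4 bytes per character)
print(varchar_storage_bytes(address, 100))   # 52
print(varchar_storage_bytes(address, 255))   # 52

# latin1 columns (1 byte per character)
print(varchar_storage_bytes(address, 100, "latin-1", 1))   # 51
print(varchar_storage_bytes(address, 255, "latin-1", 1))   # 51
```

In this model the declared length never changes the cost of a given value within one character set; only the length prefix differs between the two character sets, and only by a single byte.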

Additionally, VARCHAR measures the length in characters and not bytes, a difference that is only relevant when multi-byte characters are involved. An ASCII-only string takes identical amounts of space in Latin-1, UTF-8 and UTF8MB4.
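The byte counts are easy to check outside the database. The following sketch uses Python's "latin-1" and "utf-8" codecs as stand-ins for MySQL's latin1 and utf8mb4; that mapping is an assumption (MySQL's latin1 is closer to Windows-1252), although the two agree for plain ASCII. The sample addresses are invented.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded byte length of text per codec, or None if unencodable."""
    sizes = {}
    for codec in ("latin-1", "utf-8"):
        try:
            sizes[codec] = len(text.encode(codec))
        except UnicodeEncodeError:
            # The text contains characters this codec cannot represent.
            sizes[codec] = None
    return sizes


print(encoded_sizes("user@example.com"))   # {'latin-1': 16, 'utf-8': 16}
print(encoded_sizes("josé@example.com"))   # {'latin-1': 16, 'utf-8': 17}
print(encoded_sizes("用户@example.com"))    # {'latin-1': None, 'utf-8': 18}
```

An ASCII-only address costs the same either way, an accented character costs one extra byte in UTF-8, and an address with characters outside Latin-1 cannot be stored in the Latin-1 column at all.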

Don’t use CHAR for variable length character fields. The 1980s have died. Let it go.