Can we have performance issues when using mixed table charsets in MySQL or in Postgres?
I'm working on a database whose tables use different charsets. Since it's a big database, I was wondering whether this could lead to performance issues. Yes, the value comparison a DB performs most often is in a JOIN, and ours compare integers, but are there any other performance problems we could run into with tables in different charsets, apart from the extra space some charsets take?
Asked by sekmo
Rick James
MySQL:
For zip_code (postal_code), stored as a string (CHAR or VARCHAR), most charsets work equally well. However, when JOINing on such a column, the collation must be the same.
- If it is the same, an index on that column can be used.
- If it is not, then the index is useless, and the query must scan the entire table.
Since the collation includes the charset, that forces the charset to be the same, also.
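A minimal sketch of how one might check this, assuming two hypothetical tables (`customers` and `regions`, names invented for illustration) whose `postal_code` columns were created with different charsets. Running `EXPLAIN` before and after aligning the column shows whether the join can use the index; the exact plan depends on the server version and the data.

```sql
-- Hypothetical tables, assumed for illustration only.
CREATE TABLE customers (
  id INT PRIMARY KEY,
  postal_code VARCHAR(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci,
  INDEX (postal_code)
);

CREATE TABLE regions (
  postal_code VARCHAR(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  region_name VARCHAR(50),
  INDEX (postal_code)
);

-- Mismatched collations: inspect the `type` and `key` columns for each table.
EXPLAIN
SELECT c.id, r.region_name
FROM regions AS r
JOIN customers AS c ON c.postal_code = r.postal_code;

-- Align the column (MySQL converts the stored values), then compare the plan.
ALTER TABLE customers
  MODIFY postal_code VARCHAR(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

EXPLAIN
SELECT c.id, r.region_name
FROM regions AS r
JOIN customers AS c ON c.postal_code = r.postal_code;
```

In the `EXPLAIN` output, `type: ALL` with no `key` indicates a full scan of that table, while `ref` with a named key indicates an index lookup.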
The choice of collation is rather minor. However, if there can be letters in the string (postal_code, country_code, etc), you need to decide whether to force the tables (and user queries) to use a particular case.
- Collation ..._bin treats cases as different: 'de' won't match 'DE' (for Germany).
- Collation ..._ci is "Case Insensitive", so those would match.
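The difference between the two collation families can be seen without any table. This is a small sketch using `utf8mb4_bin` and `utf8mb4_general_ci` as representative collations (my choice, not named in the answer); the `_utf8mb4` introducers keep the result independent of the session charset.

```sql
-- Compare the same two literals under a binary and a case-insensitive collation.
SELECT _utf8mb4'de' COLLATE utf8mb4_bin        = _utf8mb4'DE' AS bin_match,  -- 0
       _utf8mb4'de' COLLATE utf8mb4_general_ci = _utf8mb4'DE' AS ci_match;   -- 1
```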
If you do string comparisons with incompatible collations, those comparisons cannot use an index on the string column. I've seen this happen when doing a JOIN on a string column where the joined tables had different collations (naturally, if the tables have different character sets, they also have different collations).
But you said your joins are on integer columns, not string columns. So joins shouldn't be a problem in your case.
You can also have performance problems when doing lookups against string columns if your table character set doesn't match your session character set.
Example: My table is defined with utf8mb4, but I set my session to utf8, so string literals will be utf8. Seems like a harmless change, right?
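A minimal sketch of that experiment, assuming a hypothetical table `mytable` with an indexed `utf8mb4` column `name` (the table definition is my assumption, not taken from the answer). On MySQL 8.0, `utf8` is an alias for `utf8mb3` and `SET NAMES utf8` may emit a deprecation warning.

```sql
-- Hypothetical table, assumed for illustration.
CREATE TABLE IF NOT EXISTS mytable (
  id INT PRIMARY KEY,
  name VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  INDEX (name)
);

-- String literals sent by this session are now utf8, not utf8mb4.
SET NAMES utf8;

-- Check `type` and `key`: `ref` with a key means the index on `name` is used.
EXPLAIN SELECT * FROM mytable WHERE name = 'abc123';
```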
I guess the utf8 string 'abc123' has a clear way to be promoted to utf8mb4 to match the column it compares to.
But if I force a specific collation that is not supported by utf8mb4, I see it has to do a table-scan and compare to rows one by one, instead of an indexed lookup:
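A sketch of the forced-collation case, again assuming the hypothetical `mytable` with an indexed `utf8mb4` column `name`; `utf8_general_ci` is used here as an example of a collation that belongs to `utf8` rather than `utf8mb4`.

```sql
-- Hypothetical table, assumed for illustration.
CREATE TABLE IF NOT EXISTS mytable (
  id INT PRIMARY KEY,
  name VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  INDEX (name)
);

SET NAMES utf8;

-- The explicit COLLATE on the literal takes precedence over the column's
-- implicit collation, so the column values must be converted for comparison.
-- Check whether `type` is now `ALL` (full table scan).
EXPLAIN SELECT * FROM mytable WHERE name = 'abc123' COLLATE utf8_general_ci;
```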
There's a difference between implicit collations and explicit collations. Suppose I set my session to use something that doesn't have a clear path to utf8mb4:
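A sketch of the implicit case, assuming `latin1` as the session charset (my choice of example; the answer does not name one here) and the same hypothetical `mytable`.

```sql
-- Hypothetical table, assumed for illustration.
CREATE TABLE IF NOT EXISTS mytable (
  id INT PRIMARY KEY,
  name VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  INDEX (name)
);

SET NAMES latin1;

-- No COLLATE clause: the literal's collation is only implicit, so the server
-- is free to convert the literal to the column's charset.
EXPLAIN SELECT * FROM mytable WHERE name = 'abc123';
```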
So far so good, but if I am explicit about the collation:
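A sketch of the explicit case under the same assumptions (hypothetical `mytable`, `latin1` session charset chosen by me). I have not reproduced the original output; depending on the server version, a statement like this may be rejected with an "Illegal mix of collations" error or planned as a full scan, so treat the outcome as something to verify on your own server.

```sql
-- Hypothetical table, assumed for illustration.
CREATE TABLE IF NOT EXISTS mytable (
  id INT PRIMARY KEY,
  name VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  INDEX (name)
);

SET NAMES latin1;

-- Explicit latin1 collation on the literal versus an implicit utf8mb4 column.
EXPLAIN SELECT * FROM mytable WHERE name = 'abc123' COLLATE latin1_swedish_ci;
```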
The bottom line is that you should use the same character set and collation to make your life easier. Use it for all tables and for the session too.
In these modern times, it's hard to think of a reason to use anything other than utf8mb4.
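One way to find out how mixed a schema currently is, sketched with `information_schema` and a hypothetical table name for the conversion step. `utf8mb4_general_ci` is only an example target collation; converting a table rebuilds it, which can be slow on large tables, so try it on a copy first.

```sql
-- List every string column in the current schema with its charset and collation.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND CHARACTER_SET_NAME IS NOT NULL
ORDER BY CHARACTER_SET_NAME, COLLATION_NAME, TABLE_NAME, COLUMN_NAME;

-- Convert one table (hypothetical name) and all of its string columns.
ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

-- Make the session match the tables.
SET NAMES utf8mb4;
```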
P.S. Space shouldn't be a problem. UTF-8 is a variable-width character encoding: it allows multibyte characters, but it doesn't expand characters that fit in a single byte, so characters in the ASCII range (0-127) are stored in one byte anyway. The UTF-8 article on Wikipedia has a nice explanation of the details.
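The byte counts can be checked directly. In this small sketch, `0xC3A9` is the UTF-8 encoding of "é", written as a hex literal so the result does not depend on the client's encoding.

```sql
SELECT LENGTH(_utf8mb4'abc')                         AS ascii_bytes,      -- 3
       CHAR_LENGTH(_utf8mb4 0xC3A9)                  AS e_acute_chars,    -- 1
       LENGTH(_utf8mb4 0xC3A9)                       AS e_acute_utf8mb4,  -- 2
       LENGTH(CONVERT(_utf8mb4 0xC3A9 USING latin1)) AS e_acute_latin1;   -- 1
```

ASCII text takes the same space in either charset; only non-ASCII characters cost extra bytes in `utf8mb4`.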