I'm working on a database whose tables use different charsets. Since it's a big database, I was wondering whether this could lead to performance issues. Yes, the most common value comparison a DB does is in a JOIN, and that is usually done on integers, but are there any other performance problems we could run into with tables that have different charsets, besides the extra space some charsets take?
Can we have performance issues when using mixed table charsets in MySQL or in Postgres?
Asked by sekmo (522 views)
There are 2 answers below.
Answer by Rick James
MySQL:
For zip_code (postal_code), stored as a string (CHAR or VARCHAR), most charsets work equally well. However, when JOINing on such a column, the collation must be the same.
- If it is the same, an index on that column can be used.
- If it is not, then the index is useless, and the query must scan the entire table.
Since the collation includes the charset, that forces the charset to be the same, also.
The choice of collation is rather minor. However, if there can be letters in the string (postal_code, country_code, etc), you need to decide whether to force the tables (and user queries) to use a particular case.
- Collation ..._bin treats cases as different: 'de' won't match 'DE' (for Germany).
- Collation ..._ci is "Case Insensitive", so those would match.
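The difference between the two kinds of collation can be pictured with a small Python sketch. It is only a rough analogy: real ..._ci collations apply more rules than case folding, and the helper names `equal_bin`, `equal_ci`, and `normalize_code` are hypothetical, not part of MySQL.

```python
def equal_bin(a: str, b: str) -> bool:
    # Byte-for-byte comparison, roughly how a ..._bin collation behaves.
    return a.encode("utf-8") == b.encode("utf-8")


def equal_ci(a: str, b: str) -> bool:
    # Case-insensitive comparison, a rough stand-in for a ..._ci collation.
    return a.casefold() == b.casefold()


def normalize_code(code: str) -> str:
    # One way to "force a particular case" before storing or querying.
    return code.strip().upper()


print(equal_bin("de", "DE"))   # False
print(equal_ci("de", "DE"))    # True
print(normalize_code(" de "))  # DE
```

If every writer and every query normalizes codes this way, a ..._bin collation stops being a source of surprises, because only one spelling of each code ever reaches the table.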
If you do string comparisons with incompatible collations, those comparisons cannot use an index on the string column. I've seen this happen when doing a JOIN on a string column where the joined tables had different collations (naturally, if the tables have different character sets, they also have different collations).
But you said your joins are on integer columns, not string columns. So joins shouldn't be a problem in your case.
You can also have performance problems when doing lookups against string columns if your table character set doesn't match your session character set.
Example: My table is defined with utf8mb4, but I set my session to utf8, so string literals will be utf8. Seems like a harmless change, right?
It is: the lookup still uses the index. I guess the utf8 string 'abc123' has a clear way to be promoted to utf8mb4 to match the column it compares to.
But if I force a specific collation that is not supported by utf8mb4, I see it has to do a table-scan and compare to rows one by one, instead of an indexed lookup.
There's a difference between implicit collations and explicit collations. Suppose I set my session to use something that doesn't have a clear path to utf8mb4, and leave the collation of the string literal implicit.
So far so good, but that changes if I am explicit about the collation.
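The general effect, an explicit collation that does not match the index turning an indexed lookup into a scan, can be reproduced in miniature. The sketch below is an analogy and not MySQL: it uses SQLite through Python's built-in `sqlite3` module, with SQLite's BINARY and NOCASE collations standing in for two incompatible collations. The table `country`, the index `idx_country_code`, and the helper `plan` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE country ("
    "id INTEGER PRIMARY KEY, code TEXT COLLATE BINARY, name TEXT)"
)
# The index inherits the column's BINARY collation.
conn.execute("CREATE INDEX idx_country_code ON country (code)")
conn.executemany(
    "INSERT INTO country (code, name) VALUES (?, ?)",
    [("DE", "Germany"), ("FR", "France"), ("IT", "Italy")],
)


def plan(sql):
    """Return the readable steps of SQLite's query plan for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# The comparison uses the column's own collation: the index is usable.
same_collation = plan("SELECT name FROM country WHERE code = 'DE'")

# An explicit, different collation on the literal: the index no longer applies.
forced_collation = plan(
    "SELECT name FROM country WHERE code = 'DE' COLLATE NOCASE"
)

print(same_collation)
print(forced_collation)
```

The exact wording of the plan varies between SQLite versions, but the first plan should begin with SEARCH and name the index, while the second should begin with SCAN. In MySQL the equivalent check is to run EXPLAIN on the query and see whether it reports an index lookup or a full table scan.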
The bottom line is that you should use the same character set and collation to make your life easier. Use it for all tables and for the session too.
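To find out how mixed a schema currently is, one option is to list every string column with its character set and collation and group the result. The sketch below assumes the rows come from MySQL's `information_schema.COLUMNS` view; the query is shown as a string only, with `%s` as the placeholder style used by common Python MySQL drivers, and the grouping runs on hypothetical sample rows so it needs no server.

```python
from collections import defaultdict

# Query to run with your own MySQL client; pass the schema name as the parameter.
AUDIT_SQL = """
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = %s AND COLLATION_NAME IS NOT NULL
"""


def group_by_collation(rows):
    """Map (charset, collation) to the list of 'table.column' names using it."""
    groups = defaultdict(list)
    for table, column, charset, collation in rows:
        groups[(charset, collation)].append(f"{table}.{column}")
    return dict(groups)


def is_consistent(rows):
    """True when every string column shares one charset and collation."""
    return len(group_by_collation(rows)) <= 1


# Hypothetical rows, shaped like the result of AUDIT_SQL.
sample_rows = [
    ("users", "email", "utf8mb4", "utf8mb4_unicode_ci"),
    ("users", "country_code", "latin1", "latin1_swedish_ci"),
    ("orders", "country_code", "utf8mb4", "utf8mb4_unicode_ci"),
]

for (charset, collation), columns in group_by_collation(sample_rows).items():
    print(charset, collation, columns)
print(is_consistent(sample_rows))  # False
```

In the sample data, `users.country_code` and `orders.country_code` land in different groups, which is exactly the kind of pair that would lose its index in a JOIN.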
In these modern times, it's hard to think of a reason to use anything other than utf8mb4.
P.S. Space shouldn't be a problem. UTF-8 character sets allow multibyte characters, but they don't expand the size of characters that fit in a single byte. UTF-8 is a variable-width character encoding, so characters in the ASCII range (0-127) are stored in one byte anyway. Read the UTF-8 article on Wikipedia for details; it has a nice explanation.
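The variable-width behaviour is easy to check with Python's built-in UTF-8 encoder. This measures encoded bytes only, not MySQL's on-disk row format, and the helper name `utf8_bytes` is hypothetical.

```python
def utf8_bytes(text: str) -> int:
    # Number of bytes the text occupies once encoded as UTF-8.
    return len(text.encode("utf-8"))


print(utf8_bytes("abc123"))      # 6: one byte per ASCII character
print(utf8_bytes("\u00e9"))      # 2: e with acute accent
print(utf8_bytes("\u20ac"))      # 3: euro sign
print(utf8_bytes("\U0001F600"))  # 4: an emoji outside the Basic Multilingual Plane

# ASCII-only text takes the same space in UTF-8 as in a single-byte charset.
print(utf8_bytes("abc123") == len("abc123".encode("latin-1")))  # True
```

The four-byte case is the one that MySQL's older three-byte utf8 character set cannot store and utf8mb4 can.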