Hello StackOverflow users... I have a tricky situation and I have yet to find an answer. Maybe you can help me.
Database: PostgreSQL 8.4 (can't upgrade)
In this database, there is a users table. Sadly, the usernames that users can provide when they create a user profile is case sensitive, so a username of Alex is not the same as a username of alex.
There is a new system going out and username is no longer case sensitive. I'm trying to find all of the usernames that would be considered duplicates in the old system. This way we can reach out and have them update the usernames manually and then migrate their users to a newer system (without conflict of username).
I have the following query which will show me the counts of each username matching another with the "lower()" function.
select count(*), lower(username)
from users
where deleted = false
group by lower(username) having count(*) > 1
This returns results like the following:
|count|lower |
|-----+--------+
|3 |alex |
|2 |george |
What I need to do is get this data into a temp table and display all of those duplicate users and other details so that we have a list to go through.
I have part of the temp table figured out, but my main issue is: How do I get the distinct values of all of these duplicates? So in the long run, I get results that look like the following (and maybe even without a temp table if possible):
|lower |username|
|-------+--------+
|alex |Alex |
|alex |alex |
|george |georGe |
|george |George |
Restrictions:
- I can't change the version of postgres from 8.4
- Some duplicates will have more than 2 hits (the most I've seen so far is 3)
- Since the users must be informed, there is no way to change the data other then to contact them prior (which is why the list is needed)
I appreciate any suggestions/feedback you may be able to provide.
How about this. Just generate your above list as a CTE, then join with it in the main query:
Or even simpler if you only want a bare list of the affected users: