Ordering by attribute that contains letters and numbers

104 Views Asked by At

We need to order a list of room numbers. You can think of the units being numerically ordered, with the possibility of a letter in front as a prefix, e.g 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, B1, B2, B3, etc.

After reading MySQL 'Order By' - sorting alphanumeric correctly, here's what I tried:

create table units (id integer, unit_number varchar(100));

insert into units (id, unit_number) values (1, 'A1');
insert into units (id, unit_number) values (2, 'A2');
insert into units (id, unit_number) values (3, 'A3');
insert into units (id, unit_number) values (4, 'A4');
insert into units (id, unit_number) values (5, 'A5');
insert into units (id, unit_number) values (6, 'A6');
insert into units (id, unit_number) values (7, 'A7');
insert into units (id, unit_number) values (8, 'A8');
insert into units (id, unit_number) values (9, 'A9');
insert into units (id, unit_number) values (10, 'A10');
insert into units (id, unit_number) values (11, 'B1');
insert into units (id, unit_number) values (12, 'B2');
insert into units (id, unit_number) values (13, 'B3');
insert into units (id, unit_number) values (14, 'B4');
insert into units (id, unit_number) values (15, 'B5');
insert into units (id, unit_number) values (16, 'B6');
insert into units (id, unit_number) values (17, 'B7');
insert into units (id, unit_number) values (18, 'B8');
insert into units (id, unit_number) values (19, 'B9');
insert into units (id, unit_number) values (20, 'B10');

select * from units ORDER BY LENGTH(unit_number), unit_number;

When I get the results back, I'll get ordering like this:

| id       | unit_number    |
| -------- | -------------- |
| 1        | A1             |
| 2        | A2             |
| 3        | A3             |
| 4        | A4             |
| 5        | A5             |
| 6        | A6             |
| 7        | A7             |
| 8        | A8             |
| 9        | A9             |
| 10       | B1             |
| 11       | B2             |
| 12       | B3             |
| 13       | B4             |
| 14       | B5             |
| 15       | B6             |
| 16       | B7             |
| 17       | B8             |
| 18       | B9             |
| 19       | A10            |
| 20       | B10            |

How can I rewrite this query so that the ordering places A10 after A9? This is more of the expectation from the user's perspective.

2

There are 2 best solutions below

1
On BEST ANSWER

You can do this by using some regexes to extract the parts that you need and sort them:

SELECT unit_number
FROM  units
ORDER BY
    SUBSTRING(unit_number FROM '^[A-Za-z]+'), -- This sorts the alphabetical part.
    CAST(SUBSTRING(unit_number FROM '[0-9]+$') AS INTEGER); -- This sorts the numerical part.

This also solves potential issues with more alphabetical characters, like AB123

2
On

Basically, you need to cast the number part to a numeric type to get it sorted accordingly. being the key word here. Make sure that you only cast proper number literals or you raise an exception.

If all consist of a single leading letter and trailing digits, left() and right() are simplest & fastest:

SELECT unit_number
FROM   units
ORDER  BY left(unit_number, 1), right(unit_number, -1)::int;

If the only rule is: "0-n letters, followed by 0-n digits":

SELECT unit_number
FROM   units
ORDER  BY substring(unit_number, '^\D*')
        , substring(unit_number, '\d+$')::int NULLS FIRST;

substring() returns an empty string ('') for the pattern \D* (note the '*'!) if no letter (non-digit, to be precise) is found. That conveniently sorts first as it should.

The same "trick" won't fly for the conversion to integer, as '' is not valid for the type. The pattern \d+ (note the '+'!) results in a null value if no trailing digit is found. That's valid for integer too, but that null has to sort on top. NULLS FIRST achieves that. See:

If unit_number can be null to begin with, you'll need to define where to sort that ...

Alternatively, wrap the integer part in COALESCE(). Same result:

SELECT unit_number
FROM   units
ORDER  BY substring(unit_number, '^\D*')
        , COALESCE(substring(unit_number, '\d+$')::int, -1);

fiddle