Unexpected Sort Behavior PHP Collator::asort for values in format YYYY-MM-DD

I am seeing unexpected sorting behavior with string values in the format YYYY-MM-DD, such as 2016-12-16, when using Collator::asort. With PHP's normal asort there is no issue. With the Collator version there appears to be some bug: the result looks as if it were sorted by the array keys instead.

Edit: I am using PHP Version 5.6.29 for this test.

I made a little sample program to show the issue:

<?php

$list["off-to-the-races"] = "2015-04-14";
$list["new-years-2015"] = "2015-01-01";
$list["ground-hog-day"] = "2015-02-02";
$list["back-to-the-future"] = "2015-08-12";

$locale = setlocale(LC_COLLATE, 0);
$col = Collator::create($locale);

$res_val = collator_get_locale( $col, \Locale::VALID_LOCALE );
$res_act = collator_get_locale( $col, \Locale::ACTUAL_LOCALE );

echo "<pre>";

printf( "Valid locale name: %s <br /> Actual locale name: %s <br />",
        $res_val, $res_act );


echo "<br /><br />Raw List: <br />";
print_r($list);


$col->asort($list);
echo "<br /><br />\$col->asort(\$list): <br />";
print_r($list);

echo "<br /><br />Collator error message: <br />";
echo $col->getErrorMessage();

asort($list); // default flags; $sort_flags was never defined in this script
echo "<br /><br />asort(\$list) (without collation): <br />";
print_r($list);

Here is my output on the test server:

Valid locale name: en_US_POSIX 
Actual locale name: en_US_POSIX 


Raw List: 
Array
(
    [off-to-the-races] => 2015-04-14
    [new-years-2015] => 2015-01-01
    [ground-hog-day] => 2015-02-02
    [back-to-the-future] => 2015-08-12
)


$col->asort($list): 
Array
(
    [back-to-the-future] => 2015-08-12
    [ground-hog-day] => 2015-02-02
    [new-years-2015] => 2015-01-01
    [off-to-the-races] => 2015-04-14
)


Collator error message: 
U_ZERO_ERROR

asort($list) (without collation): 
Array
(
    [new-years-2015] => 2015-01-01
    [ground-hog-day] => 2015-02-02
    [off-to-the-races] => 2015-04-14
    [back-to-the-future] => 2015-08-12
)

As you can see, after $col->asort($list) the list is no longer in its original order, but it has clearly not been sorted as expected.

The whole point of using the YYYY-MM-DD way to represent a date as a string is that if sorted like a string, it would be in both alphabetical and chronological order in most implementations.
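To illustrate that claim, here is a minimal sketch using made-up sample dates and hypothetical single-letter keys. It sorts one copy of the array as plain strings and another by parsed timestamp, then checks that both give the same key order. It avoids Collator entirely, so it only shows the expectation, not the bug.

```php
<?php
// Hypothetical sample data: dates stored as YYYY-MM-DD strings.
$dates = array(
    "c" => "2015-08-12",
    "a" => "2015-01-01",
    "d" => "2016-12-16",
    "b" => "2015-04-14",
);

// Sort one copy as plain strings (byte-wise comparison), keeping the keys.
$asStrings = $dates;
asort($asStrings, SORT_STRING);

// Sort another copy chronologically, by parsed timestamp.
$asTimestamps = $dates;
uasort($asTimestamps, function ($x, $y) {
    $tx = strtotime($x);
    $ty = strtotime($y);
    return $tx < $ty ? -1 : ($tx > $ty ? 1 : 0);
});

// For zero-padded YYYY-MM-DD values the two key orders should match.
var_dump(array_keys($asStrings) === array_keys($asTimestamps));
```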

Note, if I add the following test case with $sort_flag set:

$col->asort($list, 1 );
echo "<br /><br />\$col->asort(\$list, 1): <br />";
print_r($list);

it will work correctly:

$col->asort($list, 1): 
Array
(
    [new-years-2015] => 2015-01-01
    [ground-hog-day] => 2015-02-02
    [off-to-the-races] => 2015-04-14
    [back-to-the-future] => 2015-08-12
)

While this is an adequate workaround if you know all of your values will be in this format, I would expect it to also work as a natural or string sort, for cases where your values are a mix of date strings and other arbitrary strings.
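As far as I can tell, the literal 1 corresponds to the class constant Collator::SORT_STRING, so the workaround can be written with the named flag. The sketch below assumes the intl extension is loaded and uses 'en_US' as a placeholder locale rather than the one from my server.

```php
<?php
// Same data as in the sample program above.
$list = array(
    "off-to-the-races"   => "2015-04-14",
    "new-years-2015"     => "2015-01-01",
    "ground-hog-day"     => "2015-02-02",
    "back-to-the-future" => "2015-08-12",
);

if (class_exists('Collator')) {
    // 'en_US' is an assumed locale for illustration; substitute your own.
    $col = new Collator('en_US');

    // Ask for string comparison explicitly instead of the default flag.
    $col->asort($list, Collator::SORT_STRING);
    print_r($list);
} else {
    echo "The intl extension is not loaded.\n";
}
```

Using the named constant makes it clearer that the values are being compared as strings, which is what I would have expected the default to do for these inputs.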

I am not sure if this is locale/collation specific. Is there any way to debug this further?
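One way I could imagine narrowing this down is to bypass the sort and ask the collator directly how it compares each pair of values, then dump the sort keys it generates. This is only a sketch: it assumes the intl extension is loaded and reuses the locale name from my output above. If compare() orders the dates correctly while the default asort() does not, that would point at the default sort flag handling rather than the collation rules themselves.

```php
<?php
// Values copied from the sample program above.
$values = array("2015-04-14", "2015-01-01", "2015-02-02", "2015-08-12");

if (class_exists('Collator')) {
    // Locale name taken from the "Actual locale name" output; adjust as needed.
    $col = new Collator('en_US_POSIX');

    // 1. Pairwise comparison: the collator's verdict next to plain strcmp().
    foreach ($values as $a) {
        foreach ($values as $b) {
            $icu = $col->compare($a, $b);              // -1, 0, 1, or false on error
            $raw = strcmp($a, $b);
            $raw = $raw < 0 ? -1 : ($raw > 0 ? 1 : 0); // normalize to a sign
            printf("%s vs %s  collator: %s  strcmp: %d\n",
                   $a, $b, var_export($icu, true), $raw);
        }
    }

    // 2. Sort keys: byte strings whose binary order mirrors the collation order.
    foreach ($values as $v) {
        printf("%s => %s\n", $v, bin2hex($col->getSortKey($v)));
    }

    // 3. Symbolic name of the last error code recorded on the collator.
    printf("Last error: %s\n", intl_error_name($col->getErrorCode()));
} else {
    echo "The intl extension is not loaded.\n";
}
```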
