How to parse only 4 digit years


I'm using Joda-Time to parse years like this:

private DateTime attemptParse(String pattern, String date) {
    DateTimeFormatter parser = DateTimeFormat.forPattern(pattern).withLocale(Locale.ENGLISH);
    DateTime parsedDateTime = parser.parseLocalDateTime(date).toDateTime(WET);
    return parsedDateTime;
}

I'm trying to parse multiple formats: "yyyy-MM-dd", "yyyy-MMM-dd", "yyyy MMM dd-dd", "yyyy MMM", (etc.), "yyyy". When one doesn't work, I try the next one.

This works like a charm when the string really is only 4 digits (e.g. "2016"). The problem is that I sometimes receive things like "201400", and Joda-Time matches this against the "yyyy" pattern and returns a date with year 201400.

I'd like to avoid an ugly if that checks whether year > 9999. Is there any way to do this using Joda-Time?


There are 2 best solutions below

7
On

To parse multiple formats, you can create one DateTimeParser per pattern and combine them all into a single formatter (instead of trying one formatter after another).

This requires a DateTimeFormatterBuilder, which is also what lets you enforce a specific number of digits in the input (unfortunately, DateTimeFormat.forPattern() alone can't do that).

First, create one org.joda.time.format.DateTimeParser instance for each possible pattern:

// only yyyy
DateTimeParser p1 = new DateTimeFormatterBuilder()
    // year with exactly 4 digits
    .appendYear(4, 4).toParser();
// yyyy-MM-dd
DateTimeParser p2 = new DateTimeFormatterBuilder()
    // year with exactly 4 digits
    .appendYear(4, 4)
    // rest of the pattern
    .appendPattern("-MM-dd").toParser();
// yyyy MMM
DateTimeParser p3 = new DateTimeFormatterBuilder()
    // year with exactly 4 digits
    .appendYear(4, 4)
    // rest of the pattern
    .appendPattern(" MMM").toParser();

Then put all these parsers in an array and create a DateTimeFormatter from it:

// create array with all the possible patterns
DateTimeParser[] possiblePatterns = new DateTimeParser[] { p1, p2, p3 };

DateTimeFormatter parser = new DateTimeFormatterBuilder()
    // append all the possible patterns
    .append(null, possiblePatterns)
    // use the locale you want (in case of month names and other locale sensitive data)
    .toFormatter().withLocale(Locale.ENGLISH);

I also used Locale.ENGLISH, as in your question's code. This locale means month names are expected in English (so MMM can parse values like Jan and Sep). With this, you can parse the inputs:

System.out.println(parser.parseLocalDateTime("2014")); // OK
System.out.println(parser.parseLocalDateTime("201400")); // exception
System.out.println(parser.parseLocalDateTime("2014-10-10")); // OK
System.out.println(parser.parseLocalDateTime("201400-10-10")); // exception
System.out.println(parser.parseLocalDateTime("2014 Jul")); // OK
System.out.println(parser.parseLocalDateTime("201400 Jul")); // exception

When the year is 2014, the code works fine. When it's 201400, it throws a java.lang.IllegalArgumentException, such as:

java.lang.IllegalArgumentException: Invalid format: "201400" is malformed at "00"

DateTimeFormatter is immutable and thread-safe, so you don't need to create it every time your validation method is called. You can create it outside of the method (such as in a static final field).

This is better than creating a new formatter every time you perform a validation and moving on to the next one when an exception occurs. The combined formatter already does this internally, trying each pattern until it finds one that works (or throwing an exception if all patterns fail).


Java new Date/Time API

Joda-Time is in maintenance mode and is being replaced by the new APIs, so I don't recommend starting a new project with it. Even Joda-Time's website says: "Note that Joda-Time is considered to be a largely “finished” project. No major enhancements are planned. If using Java SE 8, please migrate to java.time (JSR-310).".

If you can't (or don't want to) migrate from Joda-Time to the new API, you can ignore this section.

If you're using Java 8, consider using the new java.time API. It's easier to use, less buggy, and less error-prone than the old APIs.

If you're using Java 6 or 7, you can use the ThreeTen Backport, a great backport of Java 8's new date/time classes. For Android, you'll also need ThreeTenABP.

The code below works for both. The only difference is the package name (java.time in Java 8, org.threeten.bp in the ThreeTen Backport and Android's ThreeTenABP); the class and method names are the same.

This new API is much stricter than the previous ones, so the formatter only accepts the exact number of digits (note that some classes are very similar to Joda-Time's):

// 4 digits in year
DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy", Locale.ENGLISH);
fmt.parse("2014"); // OK
fmt.parse("201400"); // exception
fmt.parse("201"); // exception

This code works with 2014, but with 201400 or 201 (or any other value without exactly 4 digits) it throws an exception:

java.time.format.DateTimeParseException: Text '201400' could not be parsed at index 0

With this, your validation code can keep working with an array of pattern strings, trying each one in turn.
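
As a rough illustration of that idea, here is a minimal sketch that tries a list of pattern strings in order and reports the first one that accepts the whole input. The class name PatternMatcher, the method firstMatchingPattern and the particular pattern list are made up for this example; adapt them to the formats you actually receive.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper: names and the pattern list are illustrative only.
final class PatternMatcher {

    // Patterns to try, in order.
    private static final String[] PATTERNS = { "yyyy-MM-dd", "yyyy-MMM-dd", "yyyy MMM", "yyyy" };

    // Formatters are immutable and thread-safe, so they are built only once.
    private static final DateTimeFormatter[] FORMATTERS = new DateTimeFormatter[PATTERNS.length];
    static {
        for (int i = 0; i < PATTERNS.length; i++) {
            FORMATTERS[i] = DateTimeFormatter.ofPattern(PATTERNS[i], Locale.ENGLISH);
        }
    }

    /** Returns the first pattern that parses the whole input, or null if none does. */
    static String firstMatchingPattern(String input) {
        for (int i = 0; i < FORMATTERS.length; i++) {
            try {
                FORMATTERS[i].parse(input); // throws unless the entire text matches
                return PATTERNS[i];
            } catch (DateTimeParseException e) {
                // not this pattern, try the next one
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatchingPattern("2014"));       // yyyy
        System.out.println(firstMatchingPattern("2014-10-10")); // yyyy-MM-dd
        System.out.println(firstMatchingPattern("2014 Jul"));   // yyyy MMM
        System.out.println(firstMatchingPattern("201400"));     // null
    }
}
```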


There's only one detail: when parsing to a date, Joda-Time sets default values when the input doesn't have some fields (like month becomes January, day becomes 1, hour/minute/second are set to zero, etc).

If you are just validating the input, then you don't need to return anything. Just check if the exception is thrown and you'll know if the input is valid or not.
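
A validation-only check could be sketched like this. The class YearValidator and the method isFourDigitYear are hypothetical names chosen for the example, and treating null as invalid is an assumption on my part.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical validator: returns true/false instead of a parsed value.
final class YearValidator {

    // "yyyy" accepts exactly 4 digits under the default (strict) parsing mode.
    private static final DateTimeFormatter YEAR_ONLY =
            DateTimeFormatter.ofPattern("yyyy", Locale.ENGLISH);

    static boolean isFourDigitYear(String input) {
        if (input == null) {
            return false; // assumption: null counts as invalid input
        }
        try {
            YEAR_ONLY.parse(input); // result is discarded, only success matters
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isFourDigitYear("2014"));   // true
        System.out.println(isFourDigitYear("201400")); // false
        System.out.println(isFourDigitYear("201"));    // false
    }
}
```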

If you just need the year value, though, you can use the Year class:

DateTimeFormatter parser = DateTimeFormatter.ofPattern("yyyy", Locale.ENGLISH);
System.out.println(Year.parse("2014", parser)); // ok
System.out.println(Year.parse("201400", parser)); // exception

If you want the year value as an int:

Year year = Year.parse("2014", parser);
int yearValue = year.getValue(); // 2014

But if you want a date object, you'll need to set the default values yourself with a DateTimeFormatterBuilder, because the new API is very strict and doesn't fill in missing fields automatically.

I also parse it to a LocalDateTime, just as an example:

DateTimeFormatter fmt = new DateTimeFormatterBuilder()
    // string pattern
    .appendPattern("yyyy")
    // default month is January
    .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
    // default day is 1
    .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
    // default hour is zero
    .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
    // default minute is zero
    .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
    // set locale
    .toFormatter(Locale.ENGLISH);
// create LocalDateTime
System.out.println(LocalDateTime.parse("2014", fmt)); // 2014-01-01T00:00
System.out.println(LocalDateTime.parse("201400", fmt)); // exception

You can choose whatever values you want as the default for the fields, and use any of the new available date types.
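
For instance, the same defaulting technique should carry over to one of the other formats from the question. The sketch below parses "yyyy MMM" input to a LocalDate with the day defaulting to 1; the class and method names are invented for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

// Hypothetical example: year + abbreviated English month, day defaults to 1.
final class YearMonthDefaults {

    private static final DateTimeFormatter YEAR_MONTH = new DateTimeFormatterBuilder()
            // string pattern
            .appendPattern("yyyy MMM")
            // default day is 1
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            // English month names
            .toFormatter(Locale.ENGLISH);

    static LocalDate parseYearMonth(String input) {
        return LocalDate.parse(input, YEAR_MONTH);
    }

    public static void main(String[] args) {
        System.out.println(parseYearMonth("2014 Jul"));   // 2014-07-01
        System.out.println(parseYearMonth("201400 Jul")); // throws DateTimeParseException
    }
}
```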

1
On

What you are saying is that Joda-Time should somehow guess that it should parse "201400" as 2014. I don't think that's reasonably within the scope of that library. You should pre-process the data yourself, for example by truncating it to four characters:

String normalizedDate = String.format("%.4s", date).trim();
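
Note that in a format specifier a width (as in %4s) only pads shorter strings, while a precision (as in %.4s) is what cuts longer ones. A truncating version of this pre-processing idea might therefore look like the sketch below. The class and method names are hypothetical, and whether silently dropping the extra digits is acceptable depends on your data.

```java
// Hypothetical pre-processing helper: keeps at most the first 4 characters.
final class YearTruncation {

    static String firstFourChars(String date) {
        // precision .4 limits the output to 4 characters; shorter input is left as-is
        return String.format("%.4s", date.trim());
    }

    public static void main(String[] args) {
        System.out.println(firstFourChars("201400"));             // 2014
        System.out.println(firstFourChars("2016"));               // 2016
        System.out.println(String.format("%4s", "201400"));       // 201400 (width only pads)
    }
}
```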