Regular Expression to validate any Hebrew year in the Hebrew Numeric System

I need to validate input to make sure it's a valid Hebrew year number, which can only have certain valid sequences of letters.

Something like this pseudo-regex: ([:thousands:]')?[:hundreds combinations:][:tens:]?("[:alef-tes digits:])? But sometimes the number is only a single units digit, or only tens, without any hundreds or thousands.

I'm not sure what this is formally called: the Hebrew numeric system, Hebrew numerology, the Hebrew numeral system, or gematria?

Overview: https://en.wikipedia.org/wiki/Hebrew_numerals
Test here: https://hebrewnumerals.github.io/

Examples:

  • 1 = א
  • 15 = ט"ו
  • 16 = ט"ז
  • 42 = מ"ב
  • 133 = קל"ג
  • 499 = תצ"ט
  • 501 = תק"א
  • 651 = תרנ"א
  • 765 = תשס"ה
  • 872 = תתע"ב
  • 1015 = א׳ט"ו
  • 1724 = א׳תשכ"ד
  • 2389 = ב׳שפ"ט
  • 4129 = ד׳קכ"ט
  • 4487 = ד׳תפ"ז
  • 6298 = ו׳רצ"ח
  • 7892 = ז׳תתצ"ב
  • 9301 = ט׳ש"א

Some examples of invalid year numbers:

  • ז׳צת"ב (Taf 400 cannot be after Tzadik 90)
  • שר"ו (Shin 300 followed by Resh 200 is invalid; it should be Taf 400 followed by Kuf 100)
  • פ'א"ב (Alef 1 followed by Bet 2 is invalid; it should be Gimmel 3)
  • ל"מ (Lamed 30 followed by Mem 40 is invalid; it should be Ayin 70, ע)
  • ט"ט (Tes 9 twice is invalid; it should be 18, י"ח)

(I searched for existing libraries and calendar source code, and made a number of attempts on regex101.)

Answer by LetsDoGood:

A non-regex PHP workaround I'm using for now:

  1. Start with the reverse direction: write a function numberToHeb($num) that reliably converts a number (e.g. 4345) into a properly formed Hebrew numeral (ד'שמ"ה). I currently use PHP's built-in calendar conversion functions for this, treating the year field as a general-purpose number-to-Hebrew converter (limited to the range 1-9999).
  2. Write a function hebNumber($str) that takes the Hebrew numeral to be validated (e.g. ד'שמ"ה) and checks whether it is a valid formulation of a Hebrew number (1-9999 in this case).
  3. Inside hebNumber, convert each Hebrew letter to its numerical value (ד'=4000, ש=300, מ=40, ה=5), add up the total (4345), and pass the total through numberToHeb($num) from step 1. If the resulting string matches the string given to hebNumber($str), the input passes validation; otherwise it was not a properly formed Hebrew numeral.
function numberToHeb(int $num){
    // Limitation: the built-in calendar conversion functions used here to
    // produce the Hebrew representation only handle 1-9999.
    // Requires the calendar and mbstring extensions.
    
    $x1 = mb_convert_encoding(
        jdtojewish(
            jewishtojd(1,1,$num), true,
            CAL_JEWISH_ADD_GERESHAYIM|CAL_JEWISH_ADD_ALAFIM_GERESH
        ),
        "UTF-8",
        "ISO-8859-8"
    );

    return mb_substr($x1, mb_strrpos($x1, ' ') + 1);

    // alternatively can port to PHP something like
    //    https://hebrewnumerals.github.io/ function Generate(...)
    // or https://github.com/MattMcManis/Aleph/blob/master/src/Aleph/Aleph/Converter.cs
}

function hebNumber(string $num){
    $lets = [
        // Note: RTL text may render these pairs reversed (even the arrow can
        // look flipped), but each entry really is "letter" => value.
        "א" => 1,
        "ב" => 2,
        "ג" => 3,
        "ד" => 4,
        "ה" => 5,
        "ו" => 6,
        "ז" => 7,
        "ח" => 8,
        "ט" => 9,
        "י" => 10,
        "כ" => 20,
        "ל" => 30,
        "מ" => 40,
        "נ" => 50,
        "ס" => 60,
        "ע" => 70,
        "פ" => 80,
        "צ" => 90,
        "ק" => 100,
        "ר" => 200,
        "ש" => 300,
        "ת" => 400,
    ];
    $parts = explode("'", $num); // if has thousands digit
    $main = end($parts);
    $total = 0;
    if (count($parts)>1) {
        // has thousands digit; add to total
        $yrpart = $parts[0];
        $main = $parts[1];
        $total = ($lets[$yrpart]??0) * 1000;
    }
    foreach (mb_str_split($main) as $char) {
        // simply add up each individual letter to total
        $total += $lets[$char] ?? 0;
    }

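    // numberToHeb() only handles 1-9999, so reject any other total up front
    // (e.g. empty input, or a thousands prefix that is not a units letter).
    if ($total < 1 || $total > 9999) {
        return false;
    }

    // Round-trip check: a well-formed numeral must be exactly the string
    // that numberToHeb() generates for its total.
    return numberToHeb($total) === $num ? $total : false;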
}
>>> hebNumber("ד'שמ\"ה")
=> 4345
>>> hebNumber("ד'שה\"מ")
=> false
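
For the regex the question originally asked about, here is one possible sketch rather than a definitive pattern. The helper name isHebrewNumeral() is hypothetical. It assumes the notation used in the question's examples: an optional thousands letter followed by a geresh (ASCII ' or U+05F3), a gershayim (ASCII " or U+05F4) before the last letter whenever the rest has two or more letters, no final-form letters, and 15/16 written as ט"ו and ט"ז. Treating a bare thousands prefix (letter plus geresh) as a round thousand is also an assumption. The check is split into two patterns, one for punctuation placement and one for letter order, which keeps each pattern readable.

```php
<?php
// Hypothetical helper: regex-based check for Hebrew numerals 1-9999,
// following the notation of the question's examples (see assumptions above).
function isHebrewNumeral(string $s): bool
{
    // Treat U+05F3 (geresh) and U+05F4 (gershayim) like ASCII ' and ".
    $s = str_replace(["\u{05F3}", "\u{05F4}"], ["'", '"'], $s);
    if ($s === '') {
        return false;
    }

    $unit   = '[אבגדהוזחט]';               // letters worth 1-9
    $letter = '[אבגדהוזחטיכלמנסעפצקרשת]';  // any non-final letter

    // Step 1: punctuation shape. Optional "<unit>'" thousands prefix, then
    // either one bare letter or several letters with '"' before the last one.
    $shape = '/^(?:(' . $unit . ')\')?((?:' . $letter . '+")?' . $letter . ')?\z/u';
    if (!preg_match($shape, $s, $m)) {
        return false;
    }

    // Drop the gershayim so only the letter sequence remains.
    $body = str_replace('"', '', $m[2] ?? '');
    if ($body === '') {
        return true; // bare thousands prefix, assumed to mean a round thousand
    }

    // Step 2: letter order. Hundreds (100-900), then tens and units, with
    // 15 and 16 spelled tet-vav / tet-zayin instead of yod-he / yod-vav.
    $hundreds  = '(?:תתק|תת|ת[קרש]?|[קרש])';
    $tensUnits = '(?:ט[וז]|י[אבגדזחט]?|[כלמנסעפצ]' . $unit . '?|' . $unit . ')';

    return preg_match('/^' . $hundreds . '?' . $tensUnits . '?\z/u', $body) === 1;
}
```

Unlike hebNumber(), this sketch only answers yes or no and does not compute the numeric value. It may also disagree with hebNumber() wherever PHP's calendar functions format a numeral differently from the question's examples, so the two checks are not interchangeable without testing.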