Regex to parse a name of one or more words after a decimal number and before two or more spaces

Problem: How can I create a regex to parse a name like "DISNAY LAND 2.0 GCP" from an Array of lines in Scala, where each line looks like this:

DE1ALAT0002  32.4756  -86.4393  106.1 ZQ DISNAY LAND 2.0 GCP             23456

// For use in code:

val regex = """(?:[\d\.\d]){2}\s*(?:[\d.\d])\s*(ZQ)\s*([A-Z])""".r // my attempt
val getName = row match {
  case regex(name) => name
  case _ =>
}

The only things I know for sure are:

1) there is a varying number of spaces between values
2) the useful value "DISNAY LAND 2.0 GCP" comes after a decimal number and the letters "ZQ"
3) the words of the name are separated by a single space, and the name may consist of one or many words
4) the name ends with two or more spaces

Sorry if this is a duplicate question, but after a long search I did not find the right solution.

Many thanks for any answers.

Answer by Wiktor Stribiżew:

You may use an .unanchored pattern like

\d\.\d+\s+ZQ\s+(\S+(?:\s\S+)*)

See the regex demo. Details:

  • \d\.\d+ - 1 digit, . and then 1+ digits
  • \s+ - 1+ whitespaces
  • ZQ - ZQ substring
  • \s+ - 1+ whitespaces (here, the left-hand side context definition ends, now, starting to capture the value we need to return)
  • (\S+(?:\s\S+)*) - Capturing group 1:
    • \S+ - 1 or more non-whitespace chars
    • (?:\s\S+)* - a non-capturing group that matches 0 or more sequences of a single whitespace (\s) and then 1+ non-whitespace chars (so, up to the double whitespace or end of string).
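
To illustrate how the capturing group behaves at its boundaries, here is a minimal sketch. The object name, the nameOf helper, and the sample rows are made up for illustration and are not part of the original answer. It uses findFirstMatchIn, which searches anywhere in the input, so .unanchored is not required for this particular call.

```scala
import scala.util.matching.Regex

object CaptureGroupSketch {
  // Same pattern as described above.
  val pattern: Regex = """\d\.\d+\s+ZQ\s+(\S+(?:\s\S+)*)""".r

  // Hypothetical helper: returns capture group 1 if the pattern is found.
  def nameOf(line: String): Option[String] =
    pattern.findFirstMatchIn(line).map(_.group(1))

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, modelled on the one in the question.
    println(nameOf("X 1.5 ZQ ALPHA  99"))      // single-word name
    println(nameOf("X 1.5 ZQ ALPHA BETA  99")) // capture stops at the double space
    println(nameOf("X 1.5 ZQ ALPHA BETA"))     // name runs to the end of the input
    println(nameOf("X 1.5 QQ ALPHA  99"))      // no ZQ marker, so no match
  }
}
```

Returning an Option makes the "no match" case explicit instead of relying on a default value.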

Scala demo:

val regex = """\d\.\d+\s+ZQ\s+(\S+(?:\s\S+)*)""".r.unanchored
val row = "DE1ALAT0002  32.4756  -86.4393  106.1 ZQ DISNAY LAND 2.0 GCP             23456"
val getName = row match {
  case regex(name) => name // group 1: the name
  case _ => ""             // fallback so getName is a String, not Any
}
println(getName)

Output: DISNAY LAND 2.0 GCP
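
Since the question mentions an Array of lines, the same pattern could be applied to every line at once. The following is only a sketch under that assumption: the object name, the extractNames helper, and the second sample line are hypothetical. It uses collect with the regex as an extractor, so lines that do not match are simply skipped.

```scala
object ParseNamesSketch {
  // The pattern from the answer, unanchored so it can match inside a longer line.
  private val NameAfterZq = """\d\.\d+\s+ZQ\s+(\S+(?:\s\S+)*)""".r.unanchored

  // Hypothetical helper: keeps only the names from lines where the pattern matches.
  def extractNames(lines: Array[String]): Array[String] =
    lines.collect { case NameAfterZq(name) => name }

  def main(args: Array[String]): Unit = {
    val lines = Array(
      "DE1ALAT0002  32.4756  -86.4393  106.1 ZQ DISNAY LAND 2.0 GCP             23456",
      "a line without the expected marker" // made-up non-matching line
    )
    extractNames(lines).foreach(println) // prints: DISNAY LAND 2.0 GCP
  }
}
```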