Extract multiple strings from a long string efficiently in JavaScript

I am trying to extract multiple strings using different patterns from one long string. I have done this successfully using different Regex queries but I feel this is not very efficient. Here is an example of the input string:

const input = "....Number of students: 5...[New]Break at 1:45 pm\nStudents involved are: John, Joseph, Maria\nLunch at 2:00 pm...Activities remaining: long jump, shuffle..";

Three prefixes mark the data I want; the value to extract follows each prefix:
const prefix1 = "Students involved are:";
const prefix2 = "Activities remaining:";
const prefix3 = "Number of students:";

The only delimiter I can rely on is the newline that ends each of these values. Example: Students involved are: John, Joseph, Maria\n

I have used the following regex to accomplish this:

const students = input.match(new RegExp(prefix1 + "(.*)"));
const activities = input.match(new RegExp(prefix2 + "(.*)"));
const numOfStudents = input.match(new RegExp(prefix3 + "(.*)"));

Is there a better and more efficient way of accomplishing the above where I have to iterate through the long string only once?

There is 1 best solution below

You can combine all three regexes into one, like this: /(?<=Number of students: )(?<number>[^\n]+).*?(?<=Students involved are: )(?<students>[^\n]+).*?(?<=Activities remaining: )(?<activities>[^\n]+)/gms

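The named capture groups then give you all three values from a single match. Note that this pattern expects the three fields to appear in exactly this order: number, students, activities.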

let s = `const input = ....Number of students: 5...
 at 1:45 pm
Students involved are: John, Joseph, Maria
Lunch at 2:00 pm...Activities remaining: long jump, shuffle..`

let pattern = /(?<=Number of students: )(?<number>[^\n]+).*?(?<=Students involved are: )(?<students>[^\n]+).*?(?<=Activities remaining: )(?<activities>[^\n]+)/gms

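// exec returns null when the pattern does not match, so check before reading groups
let m = pattern.exec(s)

if (m) {
  console.log(m.groups.students)   // John, Joseph, Maria
  console.log(m.groups.number)     // 5...
  console.log(m.groups.activities) // long jump, shuffle..
}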
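
If the three fields might not always arrive in the same order, one possible variant is a single alternation scanned with matchAll. This is a different technique from the combined pattern above, shown only as a minimal sketch: the sample input is modeled on the string in the question, and the `found` object is a hypothetical name. It assumes each prefix starts a value that runs to the next newline (or the end of the string).

```javascript
// Hypothetical sample input, modeled on the string in the question.
const input =
  "....Number of students: 5...[New]Break at 1:45 pm\n" +
  "Students involved are: John, Joseph, Maria\n" +
  "Lunch at 2:00 pm...Activities remaining: long jump, shuffle..";

// One alternation over the three prefixes; group 2 runs to the next newline.
const pattern = /(Number of students|Students involved are|Activities remaining): ([^\n]*)/g;

// Collect prefix -> raw value in a single left-to-right scan of the string.
const found = {};
for (const [, prefix, value] of input.matchAll(pattern)) {
  found[prefix] = value;
}

console.log(found);
```

Because a value runs to the end of its line, a second prefix on the same line would be swallowed by the first value, so this sketch only fits input where each wanted value is the last thing on its line.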

Since the only delimiter is the newline that ends each line, each group captures everything up to that newline, so you will still need to clean up the captured values (for example, the trailing dots).
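
The clean-up step could look something like the sketch below. The `raw` object and the `toList` helper are hypothetical names, and the code assumes that trailing dots are filler rather than data and that list values are comma-separated; adjust both assumptions to your real input.

```javascript
// Hypothetical raw captures, shaped like the values the groups return.
const raw = {
  number: "5...[New]Break at 1:45 pm",
  students: "John, Joseph, Maria",
  activities: "long jump, shuffle..",
};

// parseInt reads the leading digits and ignores whatever follows them.
const numOfStudents = Number.parseInt(raw.number, 10);

// Split a comma-separated value, dropping trailing dots and stray whitespace.
// Assumption: trailing dots are filler, not part of the data.
const toList = (value) =>
  value
    .replace(/\.+$/, "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);

const students = toList(raw.students);
const activities = toList(raw.activities);

console.log(numOfStudents, students, activities);
```

If the count can be missing, check the parsed number with Number.isNaN before using it, since parseInt returns NaN when the value does not start with a digit.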