I'm writing a script for Adobe InDesign that reads data from a CSV file exported from Excel.
CSV has some annoyances with commas and quotes, so I tried splitting with a regex instead of split(","). The regex I'm trying to use is:
/,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)/g
A line from the CSV file may look like:
[SP],"Example 3/8"" Wrench, (2) Screwdrivers","Example 3/8"" Wrench, (2) Screwdrivers"
Which should be 3 cells:
| Cell 1 | Cell 2 | Cell 3 |
|---|---|---|
| [SP] | Example 3/8" Wrench, (2) Screwdrivers | Example 3/8" Wrench, (2) Screwdrivers |
The following code works in JavaScript but not in ExtendScript:
var csvData = csvLine.split(/,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)/g);
I also tried using https://regex101.com/ as a sanity check, and it identifies commas where I would expect.
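For reference, this is a minimal sketch of how I checked the expected result in a standard JavaScript engine (a browser console or Node, not ExtendScript). The `unquote` helper is a hypothetical name I made up for the example. It only strips the enclosing quotes and turns the doubled `""` back into `"`, since the split itself leaves the quoting in place.

```javascript
// Sample line from above (Excel doubles any quote embedded in a cell).
var csvLine = '[SP],"Example 3/8"" Wrench, (2) Screwdrivers","Example 3/8"" Wrench, (2) Screwdrivers"';

// Split only on commas that are followed by an even number of quotes
// up to the end of the line, i.e. commas outside quoted cells.
var cells = csvLine.split(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/);

// Hypothetical helper: remove enclosing quotes and collapse "" into ".
function unquote(cell) {
    var last = cell.length - 1;
    if (cell.length >= 2 && cell.charAt(0) === '"' && cell.charAt(last) === '"') {
        cell = cell.substring(1, last);
    }
    return cell.replace(/""/g, '"');
}

for (var i = 0; i < cells.length; i++) {
    cells[i] = unquote(cells[i]);
}
// In a standard engine, cells now holds the 3 expected values:
// [SP] | Example 3/8" Wrench, (2) Screwdrivers | Example 3/8" Wrench, (2) Screwdrivers
```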
With ExtendScript I got:
| Cell 1 | Cell 2 | Cell 3 |
|---|---|---|
| [SP] | Example 3/8"" Wrench | Example 3/8" Wrench, (2) Screwdrivers |
The second cell is truncated, but the third cell is not.
Is this a bug in ExtendScript, or am I doing something wrong? Is there a different regex that would work in ExtendScript?
Looks like another ExtendScript glitch indeed.
Here is the 'stupid' working solution:
It looks pretty awful, but on the other hand you can see exactly what the code does, which is good for maintenance.
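A minimal sketch of a regex-free splitter in that spirit is shown below. It walks the line one character at a time and tracks whether it is inside a quoted cell, so it does not depend on lookahead support in the ExtendScript regex engine. The function name `splitCsvLine` is hypothetical, and the sketch assumes each record sits on a single line with `"` as the quote character and `""` as an escaped quote. It is an illustration under those assumptions, not necessarily the exact code referred to above.

```javascript
// Hypothetical helper: split one CSV line into cells without using regex.
function splitCsvLine(line) {
    var cells = [];
    var current = "";      // text of the cell being built
    var inQuotes = false;  // true while inside a quoted cell

    for (var i = 0; i < line.length; i++) {
        var ch = line.charAt(i);
        if (inQuotes) {
            if (ch === '"') {
                if (line.charAt(i + 1) === '"') {
                    current += '"';  // "" inside quotes is a literal quote
                    i++;             // skip the second quote of the pair
                } else {
                    inQuotes = false; // closing quote
                }
            } else {
                current += ch;       // commas are kept while inside quotes
            }
        } else if (ch === '"') {
            inQuotes = true;         // opening quote
        } else if (ch === ',') {
            cells.push(current);     // a comma outside quotes ends the cell
            current = "";
        } else {
            current += ch;
        }
    }
    cells.push(current);             // last cell has no trailing comma
    return cells;
}

var csvData = splitCsvLine('[SP],"Example 3/8"" Wrench, (2) Screwdrivers","Example 3/8"" Wrench, (2) Screwdrivers"');
// csvData has 3 cells:
// [SP] | Example 3/8" Wrench, (2) Screwdrivers | Example 3/8" Wrench, (2) Screwdrivers
```

Unlike the regex split, this version also removes the enclosing quotes and unescapes the doubled quotes in the same pass, so no separate clean-up step is needed.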