Text Scanning to read in unknown number of variables and unknown number of runs


I am trying to read in a csv file which will have the format

  Var1 Val1A Val1B ... Val1Q
  Var2 Val2A Val2B ... Val2Q
  ...

And I will not know ahead of time how many variables (rows) or how many runs (columns) will be in the file.

I have been trying to get textscan to work, but no matter what I try I cannot get either the variable names isolated or a rows-by-columns cell array. This is what I have been trying:

  fID = fopen(strcat(pwd,'/',inputFile),'rt');

  if fID == -1
      disp('Could not find file')
      return
  end

  vars = textscan(fID, '%s,%*s','delimiter','\n');
  fclose(fID);

Does anyone have a suggestion?


There are 2 best solutions below


For any given file, do all the lines have the same number of fields? If they do, you could read the first line, use it to count the fields, and then use textscan to read the whole file.

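fID = fopen(fullfile(pwd, inputFile), 'rt');
firstLine = fgetl(fID);
% Count the fields in the first line (assumes single-space delimiters)
numFields = length(strfind(firstLine, ' ')) + 1;
fclose(fID);

% Build a format with one %s per field
formatString = repmat('%s', 1, numFields);

fID = fopen(fullfile(pwd, inputFile), 'rt');
% The delimiter must be passed as a name-value pair
vars = textscan(fID, formatString, 'Delimiter', ' ');
fclose(fID);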

Now vars is a 1-by-numFields cell array: vars{1} holds the variable names, and each of the remaining cells holds one column of values.

This assumes the delimiter is a space, even though you said it was a csv file. If the fields are really separated by commas, use ',' instead of ' ' both when counting the fields in the first line and when calling textscan.
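For a comma-delimited file, the same idea might look like the sketch below. It writes a small made-up file to a temporary location (the file contents and the tempname path are assumptions for illustration only) and rewinds the file with frewind rather than reopening it.

```matlab
% Write a small hypothetical CSV so the example is self-contained
inputFile = [tempname, '.csv'];
fID = fopen(inputFile, 'wt');
fprintf(fID, 'Var1,1.0,2.0,3.0\n');
fprintf(fID, 'Var2,4.0,5.0,6.0\n');
fclose(fID);

% Count the fields in the first line, then rewind instead of reopening
fID = fopen(inputFile, 'rt');
firstLine = fgetl(fID);
numFields = numel(strfind(firstLine, ',')) + 1;
frewind(fID);

% One %s per field, split on commas
formatString = repmat('%s', 1, numFields);
vars = textscan(fID, formatString, 'Delimiter', ',');
fclose(fID);
delete(inputFile);

varNames = vars{1};   % cell array of variable names, one per row
```

Counting commas this way assumes no field contains a quoted comma; a file with quoted fields would need a more careful parser.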


If the file has the same number of columns in each row (you just don't know how many to begin with), try the following.

First, parse just the first row to find the number of columns, then parse the full file:

% Open the file and read the first line
fid = fopen('myfile.txt');
firstLine = fgetl(fid);
fclose(fid);

% Split the first line on whitespace. textscan returns a 1x1 cell that
% holds the fields, so the column count is the number of elements in tmp{1}
tmp = textscan(firstLine, '%s');
n = numel(tmp{1});

% Now scan the whole file with one %s per column
fid = fopen('myfile.txt');
tmp = textscan(fid, repmat('%s ', [1, n]));
fclose(fid);
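
textscan returns one cell per column, so getting the rows-by-columns cell array the question asks for takes one more step. The following is a minimal sketch of that step, assuming every value after the first column is numeric; the sample text and the names data, names and values are made up for illustration.

```matlab
% Stand-in for the file contents: two variables, three runs each
txt = sprintf('Var1 1 2 3\nVar2 4 5 6');
tmp = textscan(txt, '%s %s %s %s');

% Concatenate the per-column cells into one rows-by-columns cell array
data = [tmp{:}];

% First column holds the variable names, the rest hold the values
names  = data(:, 1);
values = str2double(data(:, 2:end));
```

Here str2double returns NaN for any entry it cannot convert, which makes non-numeric values easy to spot afterwards.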