- File: it is a file with about 100 rows and 7 columns. XLS extension (MS Excel 97-2003).
- Issue: unable to read in the file with R through
read.xlsx
orread.xlsx2
. Code used to try to read the file:
library(xlsxjars) library(rJava) library(xlsx) excel <- read.xlsx("File.xls",sheetIndex=1,startRow=1,stringsAsFactor=F)
Error prompted:
Error in
.jcall ("RJavaTools","Ljava/lang/object;","invokeMethod",cl,
: java.lang.IllegalArgumentException: Your InputStream was neither an OLE2 stream, nor an OOXML stream.
PD: for the record, I did try to read it with read.csv
and it does read it but since it's an Excel file and it's not separated by commas or dots, R reads it as if everything is in 1 column. Maybe anyone can suggest a way to read it through read.csv
?
There are two things you have stated that make me think that you are not dealing with the type of file you think you are dealing with:
If it really was a file in the MS Excel 97-2003 native format(s),
read.csv
would not have worked at all on it. The error fromread.xlsx
supports this. This appears to be a text file, and your comment that usingsep="\t"
inread.table
allows it to be read correctly confirms it. What you have is a tab separated text file that happens to have the suffix.xls
.It should be noted that the suffix of a file is not a guarantee that the file is of the purported format. A much better indicators is running the
file
shell command on it (if that command is available on your operating system). File extensions are sometimes used by GUIs to determine which application to use to open the file with. As Excel is capable of reading a tab delimited file (and because Excel prorbably is not set to be the default application for a.tsv
file, which would be the more traditional file extension), by using a.xls
extension, the file will (generally) be opened by Excel.In summary, just because the filename ends in
.xls
does not mean it really is an Excel native format file.