I have the following CSV file:
;A;C;D;E;F;G;H;I;K;L;M;N;P;Q;R;S;T;V;W;X;Y
Position1;0,054213776;0,003005945;0,027905128;0,00375423;0,290228233;0,064954976;0,002462278;0,047134442;0,005404894;0,081739388;0,002012803;0,046380669;0,020762236;0,03654459;0,057469835;0,011760176;0,002482397;0,026511666;0,108202585;0,011974854;0,058416108
Position2;0,004057157;0,041518985;0,019806132;0,051610208;0,003572703;0,036402843;0,074879075;0,010325334;0,044981263;0,09328763;0,03897166;0,064762246;0,029074767;0,004175355;0,013691361;0,109767515;0,046100376;0,002930728;0,248865169;0;0,028268182
Position3;0,051305224;0,064958634;0,025061506;0,001931642;0,022646096;0,053596034;0,060665537;0,002355053;0,002426384;0,264133805;0,030836312;0,032183821;0,018242803;0,048333116;0,11381004;0,066739613;0,052130556;0,005772064;0,047369009;2,92638E-05;0,033100145
Basically, my row names are Position1, Position2, Position3 and my column names are A, C, D, ..., Y. I have loaded the file in R using the following command:
data<- read.csv2(f, header=TRUE)
where f has been selected before.
However, if I ask for the row names using data[,1]
I get
[1] Position1 Position2 Position3
Levels: Position1 Position2 Position3
which seems ok. However, if I now ask for the column names via data[1,]
I get the following:
X A C D E F G H I K L M N P Q R S T V W X.1
1 Position1 0.05421378 0.003005945 0.02790513 0.00375423 0.2902282 0.06495498 0.002462278 0.04713444 0.005404894 0.08173939 0.002012803 0.04638067 0.02076224 0.03654459 0.05746983 0.01176018 0.002482397 0.02651167 0.1082026 0.01197485
Y
1 0.05841611
which I do not understand. For some reason, R thinks that the first element [1,1] should have a name and uses X for it, while in the CSV file the first element is empty, i.e.
[1,1]=empty A C D E..........Y
Position1
Position2
Position3
How should I read the CSV file in R?
Edit: I removed the leading semicolon from the header line and used the following command: data<- read.table(f, header=TRUE, sep=";")
However, if I now want to ask for the rownames via data[,1]
I get the following:
[1] 0,054213776 0,004057157 0,051305224
Levels: 0,004057157 0,051305224 0,054213776
while the column names via data[1,]
are:
A C D E F G H I K L M N P Q R S T V W
Position1 0,054213776 0,003005945 0,027905128 0,00375423 0,290228233 0,064954976 0,002462278 0,047134442 0,005404894 0,081739388 0,002012803 0,046380669 0,020762236 0,03654459 0,057469835 0,011760176 0,002482397 0,026511666 0,108202585
X Y
Position1 0,011974854 0,058416108
This is still not correct. Any suggestions?
I think
data <- read.csv2(f, header=TRUE, row.names=1)
will do what you want.
The row names and column names in an R matrix or data.frame are not considered part of the table data (that is, they're not the first row and column of the data). Rather, they are kept as separate attributes, which are retrievable using colnames(dat) and rownames(dat) (and settable using rownames(dat) <- ... and colnames(dat) <- ...). dimnames() is useful for retrieving or setting both column and row names at the same time ...
header=TRUE (which is the default for read.csv[2]) tells R that it should treat the first row of the CSV file as column names (rather than assuming they're data, and that it should make up generic column names).
row.names=1 tells R that it should treat the first column of the CSV file as row names (ditto).
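As a quick check of the points above, here is a minimal sketch that reads a cut-down copy of the question's data from an inline string instead of a file. The names csv_text, dat and dat2 are only illustrative, and only the first three data columns are kept for brevity. It also shows what I'd expect to be the read.table equivalent, which needs dec="," so that the decimal commas are parsed as numbers rather than text.

```r
# Sample data from the question, truncated to three columns (assumption:
# the real file has the same layout, with an empty first header field)
csv_text <- ";A;C;D
Position1;0,054213776;0,003005945;0,027905128
Position2;0,004057157;0,041518985;0,019806132
Position3;0,051305224;0,064958634;0,025061506"

# read.csv2 defaults to sep = ";" and dec = ","; row.names = 1 moves the
# first column out of the data and into the row-name attribute
dat <- read.csv2(text = csv_text, header = TRUE, row.names = 1)

rownames(dat)          # the Position labels
colnames(dat)          # the letter labels from the header line
dimnames(dat)          # both at once, as a list of two character vectors
dat["Position2", "C"]  # index by name; the value is numeric, not a factor

# The same thing spelled out with read.table
dat2 <- read.table(text = csv_text, header = TRUE, sep = ";", dec = ",",
                   row.names = 1)
```

Note that data[,1] and data[1,] return the first column and first row of the data, not the names, which is why the accessor functions are used here.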