I have the following data
GOBPID Term ADX_KD_06.ip ADX_KD_24.ip ADX_LG_06.ip (more columns)
GO:0000003 reproduction 0 0 0
GO:0000165 MAPK cascade 0 0 0
(more rows)
When I read it like this:
d1 <- read.table("http://dpaste.com/1487049/plain/",sep="\t",header=TRUE)
I expect d1$GOBPID to contain values like GO:0000003, but it returns the contents of the Term column
instead.
> d1$GOBPID
[1] reproduction MAPK cascade ....
Basically, it doesn't assign the header column as it should. Why is that? What's the right way to do it?
How big are your actual data?
As Richie Cotton pointed out, count.fields is useful for identifying how many delimiters there are in each row of your data. In this case, however, it was a little more useful to open the file up in a decent text editor that shows tab characters, and you would see that every line except for the first has a trailing tab. Because all the other rows have one more tab than the first, R assumes the first "column" should be the row.names, which leads to the problem you're having.
Here are two possible options for this data:
Option 1
This is convenient if your data are small: use gsub to get rid of the trailing tabs, and use read.delim on the output of that.
Option 2
Read the table in skipping the first line, drop the last column (which should be all NA values), and add names by reading just the first line using scan.
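The diagnosis and both options above can be sketched roughly as follows. This is a minimal sketch, not the original answer's code: instead of the dpaste URL it uses a small made-up sample written to a temporary file, on the assumption that the real file has the same shape (a header line without a trailing tab, data lines with one). The names `sample_lines`, `f`, `cleaned`, `d1` and `d2` are illustrative only.

```r
# Hypothetical sample mimicking the file described above: the header has
# no trailing tab, but every data line ends with one.
sample_lines <- c(
  "GOBPID\tTerm\tADX_KD_06.ip\tADX_KD_24.ip\tADX_LG_06.ip",
  "GO:0000003\treproduction\t0\t0\t0\t",
  "GO:0000165\tMAPK cascade\t0\t0\t0\t"
)
f <- tempfile(fileext = ".txt")
writeLines(sample_lines, f)

# Diagnosis: count the tab-separated fields on each line and compare
# the header line with the data lines.
count.fields(f, sep = "\t")

# Option 1: strip the trailing tab from every line, then parse the
# cleaned text. Reads the whole file into memory, so best for small data.
cleaned <- gsub("\t$", "", readLines(f))
d1 <- read.delim(text = cleaned, stringsAsFactors = FALSE)

# Option 2: skip the header line, drop the empty last column created by
# the trailing tab, then take the column names from the first line.
d2 <- read.delim(f, header = FALSE, skip = 1, stringsAsFactors = FALSE)
d2 <- d2[-ncol(d2)]
names(d2) <- scan(f, what = "", sep = "\t", nlines = 1, quiet = TRUE)

d1$GOBPID
d2$GOBPID
```

With either option, GOBPID should hold the GO identifiers and Term the descriptions. For the real data, replace `f` with the path or URL of the file; Option 2 avoids holding a second copy of the text in memory, which may matter if the file is large.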