How to get column name and column index

Hi, I have the dataframe below. Because the columns contain NAs, every column was read in with the character datatype. I need to get the name and index of each column that contains only string values.

In the example below, I want to get the column name and column index of Zo-A and Zo-B:

 ZONE-1        Zo-A         Zone-3        Zo-B
 58            On             75          NA
 60            NA             NA          High
 NA            Off            68          Low
 70            On             NA          NA

So far I have tried converting all columns to numeric, which turned every value in the Zo-A and Zo-B columns into NA. And when I use the code below to get the column index, I get NA as the result:

a <- which(colnames(df)=="Zo-A" )

match_col <- match(c("Zo-A", "Zo-B"), names(df))

I need to do the following:

  1. Get the names of the columns that consist of string values
  2. Get the column index of each of those columns


To obtain this, with k a logical vector that is TRUE for the columns holding string values, we can use the code below:

 names(df)[k]
    Zo.A Zo.B 

 which(k)
   Zo.A Zo.B 
     2    4 
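One way to build the vector k used above is sketched here. This is an assumption, not necessarily what the answer intended: it treats a column as "string" when it is stored as character or factor, and the helper name is_text is made up for illustration. The data is read with the default check.names = TRUE, which is why the names appear as Zo.A and Zo.B.

```r
# Example data from the question; check.names = TRUE (the default)
# turns "Zo-A" into "Zo.A", matching the output shown above.
df <- read.table(text = "
ZONE-1 Zo-A Zone-3 Zo-B
58 On 75 NA
60 NA NA High
NA Off 68 Low
70 On NA NA
", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: TRUE for columns stored as character or factor.
is_text <- function(x) is.character(x) || is.factor(x)

# Named logical vector with one entry per column.
k <- sapply(df, is_text)

names(df)[k]   # names of the text columns
which(k)       # their positions, named by column
```

If the text columns should instead be detected by content (for example, any value containing a letter), the helper would need to be replaced accordingly.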

From what I understand of your question, what you need is quite simple.

First, read the data in.

df <- read.table(text = "
ZONE-1        Zo-A         Zone-3        Zo-B
 58            On             75          NA
 60            NA             NA          High
 NA            Off            68          Low
 70            On             NA          NA
", header = TRUE, check.names = FALSE)

str(df)
'data.frame':   4 obs. of  4 variables:
 $ ZONE-1: int  58 60 NA 70
 $ Zo-A  : Factor w/ 2 levels "Off","On": 2 NA 1 2
 $ Zone-3: int  75 NA 68 NA
 $ Zo-B  : Factor w/ 2 levels "High","Low": NA 1 2 NA

df
  ZONE-1 Zo-A Zone-3 Zo-B
1     58   On     75 <NA>
2     60 <NA>     NA High
3     NA  Off     68  Low
4     70   On     NA <NA>

Now, question (1), "first get the column names which consist of string values". All column names are strings, so this can be done with either names or colnames.

names(df)
[1] "ZONE-1" "Zo-A"   "Zone-3" "Zo-B" 

colnames(df)
[1] "ZONE-1" "Zo-A"   "Zone-3" "Zo-B" 

Now question (2), to get the column index of "the same". (I assume you are asking about column Zo-A.)

a <- which(colnames(df) == "Zo-A")
a
[1] 2

a2 <- grep("Zo-A", colnames(df))
a2
[1] 2
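To look up several columns in one call, match() accepts a vector of names. The sketch below rebuilds a small data frame only to be self-contained. It also shows one likely reason a lookup returns NA: if the data was created with the default check.names = TRUE, "Zo-A" has been renamed to "Zo.A", so the original name is no longer found. That diagnosis is an assumption about the asker's setup.

```r
# Small stand-in for the question's data, keeping the original names.
df <- data.frame(
  `ZONE-1` = c(58L, 60L, NA, 70L),
  `Zo-A`   = c("On", NA, "Off", "On"),
  `Zone-3` = c(75L, NA, 68L, NA),
  `Zo-B`   = c(NA, "High", "Low", NA),
  check.names = FALSE, stringsAsFactors = FALSE
)

# match() returns the position of each requested name, or NA if absent.
match_col <- match(c("Zo-A", "Zo-B"), names(df))
match_col
# [1] 2 4

# make.names() mimics what check.names = TRUE does to the names;
# after that renaming, the original names no longer match.
mangled <- make.names(names(df))
match(c("Zo-A", "Zo-B"), mangled)
# [1] NA NA
```

Unlike grep(), match() compares whole strings exactly, so a name such as "Zo-A2" would not be picked up by accident.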

Data in dput format.

df <-
structure(list(`ZONE-1` = c(58L, 60L, NA, 70L), `Zo-A` = structure(c(2L, 
NA, 1L, 2L), .Label = c("Off", "On"), class = "factor"), `Zone-3` = c(75L, 
NA, 68L, NA), `Zo-B` = structure(c(NA, 1L, 2L, NA), .Label = c("High", 
"Low"), class = "factor")), .Names = c("ZONE-1", "Zo-A", "Zone-3", 
"Zo-B"), class = "data.frame", row.names = c(NA, -4L))

If you need only the column names composed entirely of alphabetic characters and punctuation marks, you can use the following regular expression. Names containing a digit, such as ZONE-1 and Zone-3, do not match.

a3 <- grep("^[[:alpha:][:punct:]]*$", colnames(df))
a3
[1] 2 4

When reading the data, you can set stringsAsFactors = FALSE so that text columns are read as character rather than factor. If missing values appear in the file as the string "NA", you can declare that with the na.strings parameter of read.csv:

df <- read.csv('file.csv', header = TRUE, stringsAsFactors = FALSE, na.strings = c("NA"))

Then try:

type <- sapply(df, class)               # class of each column
indexes <- which(type == 'character')   # positions of the character columns
nameofindexes <- names(indexes)         # names of those columns
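As a self-contained check of the steps above, here is a sketch that reads the question's data from an inline string instead of 'file.csv'. The comma-separated layout and check.names = FALSE (used to keep names such as Zo-A unchanged) are assumptions added for the example.

```r
# Inline copy of the question's data, written as CSV for illustration.
df <- read.csv(text = "ZONE-1,Zo-A,Zone-3,Zo-B
58,On,75,NA
60,NA,NA,High
NA,Off,68,Low
70,On,NA,NA",
  header = TRUE, stringsAsFactors = FALSE,
  na.strings = "NA", check.names = FALSE)

type <- sapply(df, class)               # class of each column
indexes <- which(type == "character")   # positions of the character columns
nameofindexes <- names(indexes)         # names of those columns

indexes
nameofindexes
# [1] "Zo-A" "Zo-B"
```

Because the NA entries are treated as missing values rather than text, the numeric columns stay integer and only Zo-A and Zo-B come back as character.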