Omitting NAs from Data

First time posting. Apologies if I'm not as clear as I intend.

I have an Excel (xlsx) spreadsheet of sequencing data. It is generally laid out as follows: column 1 = organism families (hundreds of organisms down this column); columns 2-x = specific samples.

Many of the cells scattered throughout the data are zero or too low, and I want to omit them. I set my data so that anything under 5 becomes NA. Since different samples will have more, fewer, or different species omitted by that threshold, I want to separate the data by sample. Code so far is:

```r
library(readxl)

# Files work, I just omitted my directories to place online
my_counts <- read_excel("...Family_120821.xlsx", sheet = "family_Counts")
my_perc <- read_excel("...Family_120821.xlsx", sheet = "family_Percentages")

# Set anything below the threshold to NA
my_counts[my_counts < 5] <- NA
my_counts
my_perc[my_perc < 0.05] <- NA
my_perc

# My attempt to keep the family column with the sample (not valid R as written):
# S13 <- my_counts$family , my_counts$Sample.13
S13 <- my_counts$Sample.13
S13A <- na.omit(S13)
S13A

S14 <- my_counts$Sample.14
S14A <- na.omit(S14)
S14A

S15 <- my_counts$Sample.15
S15A <- na.omit(S15)
S15A

# ...
```

First question: is there a better way to go about this, so that I can replicate it on different data without typing out each individual sample? Most important question: when I do this, I get the values I want, with no NAs. But they are just values, when I want another data frame so I can write it back to an xlsx. As I have it, I lose the association to the organism.

Ex: Before

All samples by associated organisms

Ex: After

Single sample, no NAs, but also no association to organism index

Essentially, I want the following image broken into individual samples, with only the organisms that met my threshold of 5 for counts and 0.05 for percents.

enter image description here
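For illustration, here is a minimal sketch of one way the per-sample split described above might be done without typing out each sample. It is not a definitive solution: the toy `my_counts` data frame and its values are made up to stand in for the real sheet, and it assumes the organism column is named `family` and every other column is a numeric sample. The name `per_sample` is hypothetical.

```r
# Toy stand-in for my_counts (hypothetical values); the real data comes from read_excel().
my_counts <- data.frame(
  family    = c("FamA", "FamB", "FamC"),
  Sample.13 = c(10, 2, 7),
  Sample.14 = c(0, 8, 3),
  stringsAsFactors = FALSE
)

# Assumption: every column except "family" is a numeric sample column.
sample_cols <- setdiff(names(my_counts), "family")

# Apply the threshold to the sample columns only, leaving "family" untouched.
my_counts[sample_cols] <- lapply(my_counts[sample_cols], function(x) {
  x[!is.na(x) & x < 5] <- NA
  x
})

# Build one data frame per sample: keep "family" plus that sample,
# and drop the rows where the sample is NA.
per_sample <- lapply(sample_cols, function(s) {
  keep <- !is.na(my_counts[[s]])
  my_counts[keep, c("family", s)]
})
names(per_sample) <- sample_cols

per_sample$Sample.13

# Optional, assuming the writexl package is installed (output file name is hypothetical):
# a named list of data frames is written as one sheet per sample.
# writexl::write_xlsx(per_sample, "per_sample.xlsx")
```

Because each element of `per_sample` is a two-column data frame, the organism name stays attached to its value. The same loop could be run on `my_perc` with the 0.05 threshold.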
