Write Genepop friendly .txt file from a regular data frame in R


I've lost an entire day of work trying to do this operation, which looks very simple but apparently isn't that easy:

I have a data frame whose first column is my population column; the other columns are loci, and each row is an individual (this is the format used by hierFstat, by the way).

I would like to convert this data frame to a .txt file usable by the genepop package, but I can't find a solution that fits what I want to do, and I don't seem to be able to write such a function myself.

Here is my data frame: (image not available)

Has anyone ever had to do such an operation and know a way?

Thanks for your time !

(Sorry, I'm new to the forum; I hope I'm clear and correct.)


There are 2 best solutions below


I guess write.table() is the function you are looking for. Just have a look at the R documentation via ?write.table. It should look like this:

write.table(df_name, file= "filename.txt", ...)
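To make the call above concrete, here is a minimal sketch with a made-up data frame (the column names and values are assumptions for illustration only). Note that this writes a plain tab-separated table; it does not by itself produce the Genepop layout (title line, locus names, "Pop" separators).

```r
# Hypothetical toy data frame; names and values are invented for this example.
df_name <- data.frame(
  Pop    = c("A", "A", "B"),
  LocusA = c("076331", "012076", "331012"),
  stringsAsFactors = FALSE
)

# Write to a temporary directory so the example does not touch your project.
out_file <- file.path(tempdir(), "filename.txt")

# quote = FALSE keeps the allele codes unquoted,
# row.names = FALSE drops the automatic row index,
# sep = "\t" separates the columns with tabs.
write.table(df_name, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Inspect the result
readLines(out_file)
```

Storing the allele codes as character strings is what preserves leading zeros such as "076331" in the written file.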

You could also create a csv file and save it as a txt file afterwards (for example via Excel). For that you could use write.csv() or write.csv2(). Please have a look at the corresponding R documentation, or simply google the functions.

Is there a specific reason why you want to use a function of a specific package?


To help others figure out the answer to your question, it's important to share a reproducible code example.

Dummy data frame example:

Sample LocusA LocusB
1 076331 145131
2 012076 NA
3 331012 145012
4 NA 131076

etc...
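As a reproducible version of the dummy table above, the data frame could be built directly in R along these lines. This is only a sketch: it assumes the genotypes are stored as character strings, which is what keeps leading zeros such as "076331" intact.

```r
# Dummy data matching the table above; genotypes are stored as strings
# so that leading zeros are not dropped.
rawtable <- data.frame(
  Sample = 1:4,
  LocusA = c("076331", "012076", "331012", NA),
  LocusB = c("145131", NA, "145012", "131076"),
  stringsAsFactors = FALSE
)

# Check the column types and the missing values
str(rawtable)
```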

Convert data frame to genepop format:

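# Read the data from a .csv file (or another format).
# colClasses = "character" keeps leading zeros such as "076331".
rawtable <- read.csv("raw_alleles.csv", header = TRUE, colClasses = "character")

# Replace every NA in the data frame with the Genepop missing-data code.
# It must be a string: the numeric literal 000000 is just 0.
rawtable[is.na(rawtable)] <- "000000"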

# Select and join the alleles.
# You can drop the sample number and keep only "Group1" to declare
# the population; this does not compromise the downstream analysis.
haplotypes <- as.data.frame(paste("Group1_", rawtable$Sample, ",", " ", rawtable$LocusA, " ", rawtable$LocusB, sep = ""))

#Build the Genepop format
sink("genepop_format.txt")
cat("Title: Any information \n")
cat("LocusA \n")
cat("LocusB \n")
cat("Pop \n")
invisible(apply(haplotypes, 1,function(x) cat(x,"\n")))
sink()

The output file should be similar to this:

------------- The file starts here ---------------- 
Title: Any information
LocusA
LocusB
Pop
Group1_1, 076331 145131
Group1_2, 012076 000000
Group1_3, 331012 145012
Group1_4, 000000 131076
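
The example above hard-codes one population and two loci. For the layout described in the question (first column is the population, every other column is a locus), the same idea could be generalized roughly as follows. This is only a sketch: write_genepop() is a hypothetical helper, not a function from the genepop or hierfstat packages, and it assumes the genotype columns already hold fixed-width codes stored as character strings (numeric columns would lose leading zeros).

```r
# Hypothetical helper, not part of any package.
# df:      data frame, first column = population, other columns = loci
# file:    path of the text file to write
# title:   free-text first line of the Genepop file
# missing: code written in place of NA genotypes
write_genepop <- function(df, file, title = "Title: Any information",
                          missing = "000000") {
  pop  <- as.character(df[[1]])
  loci <- df[-1]

  # Convert each locus column to character and replace NA with the missing code.
  loci[] <- lapply(loci, function(x) {
    x <- as.character(x)
    x[is.na(x)] <- missing
    x
  })

  # One string of space-separated genotypes per individual.
  genotypes <- do.call(paste, c(loci, sep = " "))

  # Header: title line, then one locus name per line.
  out <- c(title, names(loci))

  # One "Pop" block per population, in order of first appearance.
  for (p in unique(pop)) {
    idx <- which(pop == p)
    out <- c(out, "Pop",
             paste0(p, "_", seq_along(idx), " , ", genotypes[idx]))
  }

  writeLines(out, file)
  invisible(out)
}

# Usage with made-up data
demo <- data.frame(
  Pop    = c("North", "North", "South"),
  LocusA = c("076331", "012076", NA),
  LocusB = c("145131", NA, "145012"),
  stringsAsFactors = FALSE
)
out_file <- file.path(tempdir(), "genepop_demo.txt")
write_genepop(demo, out_file)
readLines(out_file)
```

Building the whole file as a character vector and writing it once with writeLines() avoids having to remember to close sink() if something fails halfway through.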