I need to export my data frame with colour conditional formatting on specific rows in R

32 Views Asked by At

I have a dataframe, df1 which is data manipulated from an imported excel, and I need to export it back out to have specific formatting that the original file had - it has many columns, but essentially if the 'Subproject' column has NA1 in it, that whole row should be Yellow (#FFFF00).

If 'Subporject' column has a code starting SE followed by any number (i.e. could be SE235, or SE062) the whole row should be red (#FF0000).

Also, if in the Sample_Thaws column there is any number other than 0 (i.e. 1 to 10), and the code in the 'Subproject' column starts with RE and not SE, the whole row should be blue (#0099FF).

I can export df1 without the formatting, but don't know how to put it in, ive been trying to use openxlsx library.

Or if possible, is there a way when reading in the excel to read the row colours and give an annotation column of the row colours such that i could use excels conditional formatting after export to put the colours back in.

The code below is what I've been trying to do, but when I use the saveWorkbook(), no file is created and I'm unsure why. I also didn't add an argument for the fact that rows in blue need the Subproject code to start with RE. Sorry this is a bit long!

# #read in the original data
> df1 <- read.xlsx("1946_P2_master.xlsx")  
# #(I then did the data manipulation and assigned the dataframe back to df1)

# Create a workbook
>   wb <- createWorkbook()

# #we need colors such that SE in Subproject column gives a red colour, or NA1 in the same column gives yellow, and  any number but 0 in samplethaws gives blue,
> yellow_rows <- which(df1$Subproject == "NA1")
> red_rows <- which(grepl("^SE\\d+", df1$Subproject))
> blue_rows <- which(df1$Sample_Thaws != 0)

# Add a worksheet
> addWorksheet(wb, "Sheet1")

# Write data to the worksheet
> writeData(wb, "Sheet1", df1)

# Create styles for yellow, red, and blue
> yellow_style <- createStyle(fgFill = "#FFFF00")
>     red_style <- createStyle(fgFill = "#FF0000")
>     blue_style <- createStyle(fgFill = "#0099FF")

# Apply styles to the respective rows
> styles_and_rows <- list(
>       list(style = yellow_style, rows = yellow_rows),
>       list(style = red_style, rows = red_rows),
>       list(style = blue_style, rows = blue_rows)
>     )

# Loop through the list of styles and rows
>  for (style_row_pair in styles_and_rows) {
>       style <- style_row_pair$style
>       rows <- style_row_pair$rows

# Check if rows are not empty, Apply style to each row
> if (length(rows) > 0) {
>         for (row in rows) {
>           addStyle(wb, sheet = "Sheet1", style = style, rows = row + 1, cols = 1:ncol(df1))
>         }
>       }
>     }

# Write the dataframe with applied styles to an Excel file
>  saveWorkbook(wb, "formatted_data.xlsx")

dput(head(df1)) returns the following (example data):

"3330_534-20210403 RE278.2 2 of 15", "3330_534-20210403 RE278.2 3 of 15", 
"3330_534-20210403 RE278.2 4 of 15"), Sample_Project_Code = c("3330_", 
"3330_", "3330_", "3330_", "3330_"), Sample_Original_ID = c("534-20210403", 
"534-20210403", "534-20210403", "534-20210403", "534-20210403"
), Sample_Part = c(NA, "1 of 15", "2 of 15", "3 of 15", "4 of 15"
), Original_Batch_Code = c("RE277.4", "RE278.2", "RE278.2", "RE278.2", 
"RE278.2"), Subproject = c("RE277.4", "RE278.2", "RE278.2", "RE278.2", 
"RE278.2"), LastAction = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), Date_Of_Import = structure(c(18720, 18720, 18720, 
18720, 18720), class = "Date"), ItemType = c("A", "B", "B", "B", 
"B"), Sample_Container_Type = c("card", "2ml tube ", "2ml tube ", 
"2ml tube ", "2ml tube "), Sample_Thaws = c(0, 0, 0, 0, 0), Age = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_), Subject_Sex = c("M", 
"M", "M", "M", "M"), named_person = c("Kay", "Kay", "Kay", "Kay", 
"Kay"), Researcher = c("Bee", "Bee", "Bee", "Bee", "Bee"), Technician = c("Jay", 
"Jay", "Jay", "Jay", "Jay"), Identifier = c("ACR", "ACR", "ACR", 
"ACR", "ACR"), PPL_Sender = c("Bee", "Bee", "Bee", "Bee", "Bee"
), Extraction_Date = structure(c(18720, 18720, 18720, 18720, 
18720), class = "Date"), Sample_PLPI = c(340, 65, 65, 65, 65), 
    Date_Sent = structure(c(18720, 18720, 18720, 18720, 18720
    ), class = "Date"), Date_Received = structure(c(18720, 18720, 
    18720, 18720, 18720), class = "Date"), D = c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_), Ethics_Code = c("33/333/3", 
    "33/333/4", "33/333/5", "33/333/6", "33/333/7"), Sample_Volume = c(NA, 
    500, 500, 500, 500), Method = c(NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), Comments = c(NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), Row = c(20, 34, 34, 35, 35), Col = c("A", 
    "K", "L", "A", "B"), Level5Name = c("Book 18", "Shelf 14", 
    "Shelf 14", "Shelf 14", "Shelf 14"), Level4Name = c("Compartment B", 
    "Compartment E", "Compartment E", "Compartment E", "Compartment E"
    ), Level3Name = c("Freezer 8", "Freezer 7", "Freezer 7", 
    "Freezer 7", "Freezer 7"), Level2Name = c("Ground Floor", 
    "Ground Floor", "Ground Floor", "Ground Floor", "Ground Floor"
    ), Level1Name = c("AH", "AH", "AH", "AH", "AH")), row.names = c(NA, 
5L), class = "data.frame")```
0

There are 0 best solutions below