It might be silly, but I am having trouble merging two datasets (in a new file); I want to have the structure of Excel A (which I am calling BehaviouralEconomicsTourism.xlsx) with the information from the variables (same variables with different spelling from scopus.csv). So, I want them to appear as new rows, adding the information from these variables in common. Does someone have any suggestions? I tried to follow the following code (I tried some other variations before but did not save). But it did not work for hte purpose intended
# Install and load necessary packages if not already installed
if (!requireNamespace("tidyverse", quietly = TRUE)) install.packages("tidyverse")
if (!requireNamespace("readxl", quietly = TRUE)) install.packages("readxl")
if (!requireNamespace("writexl", quietly = TRUE)) install.packages("writexl")
if (!requireNamespace("janitor", quietly = TRUE)) install.packages("janitor")
library(tidyverse)
library(readxl)
library(writexl)
library(janitor)
# Load datasets from Sheet2 in BehaviouralEconomicsTourism.xlsx
behavioral_economics_df <- read_excel("/Users/s5320381/Downloads/BehaviouralEconomicsTourism.xlsx", sheet = "Sheet2")
scopus_df <- read.csv("/Users/s5320381/Downloads/scopus.csv")
# Clean column names
scopus_df <- janitor::clean_names(scopus_df)
behavioral_economics_df <- janitor::clean_names(behavioral_economics_df)
# Mapping columns between the two datasets
column_mapping <- c(
"authors" = "authors",
"author_full_names" = "author_full_names",
"article_title" = "title",
"publication_year" = "year",
"volume" = "volume",
"issue" = "issue",
"publisher" = "publisher",
"source_title" = "source_title",
"author_keywords" = "author_keywords",
"abstract" = "abstract",
"issn" = "issn",
"isbn" = "isbn",
"start_page" = "page_start",
"end_page" = "page_end",
"doi" = "doi",
"language" = "language_of_original_document",
"doi_link" = "link",
"affiliations" = "affiliations"
)
# Convert relevant columns to character in scopus_df
scopus_df <- scopus_df %>%
mutate(across(c(volume, issue, start_page, end_page), as.character))
# Convert relevant columns to double in behavioral_economics_df
behavioral_economics_df <- behavioral_economics_df %>%
mutate(across(c(volume, issue, start_page, end_page), as.double))
# Merge datasets based on common columns
merged_df <- bind_rows(behavioral_economics_df, scopus_df %>% select(column_mapping, references))
# Save the merged dataset to a new Excel file
writexl::write_xlsx(merged_df, "/Users/s5320381/Downloads/test201223.xlsx")
I tried to merge two datasets with same variables that possess different header for these variables.
Assuming you have the following dataframes (given you didn't share a piece of code):
Then you can use
dplyr::selectto select and rename the variables you need as follows:Note that for bind_rows to work you need that variables in both dataframes have same names and types so I believe the error might be related to this detail.