I have a dataframe in which one column contains short descriptions of policies. I also have a list with three sublists: 'private', 'public', and 'commercial'. Each sublist contains words that are associated with private, public, or commercial EV charging policies.
I need code that:
Checks whether words from the sublists are present in each text:
a) If a word is present: take the name of its sublist and enter it in the column 'charging_type'
b) If no word is present: leave 'charging_type' empty
c) If words from multiple sublists are present: take the names of all matching sublists and enter them in the column 'charging_type'
The data (generated via dput()) looks like this:
The list of lists containing the dictionaries:
list(private = list(c("income tax credit", "property owners",
"homeowner", "owner", "personal use", "condominium", "common interest",
"housing association", "tenant", "residential", "residents",
"purchasers")), commercial = list(c("commercial", "fleet", "nonprofit",
"non-residential", "industrial")), public = list(c("public",
"utility-owned", "fast charging", "interstates", "highways",
"network", "general public", "operators", "operator", "IOUs",
"infrastructure", "disadvantaged communities", "charging station service providers",
"low-income", "underserved")))
The dataframe with the text:
structure(list(manual_classification = c("private", NA, "public",
"public", "?", NA, "private", "public", "commercial", NA, "public",
NA, NA, "private", NA), Text = c("a common interest development including a community apartment \ncondominium and cooperative development may not prohibit or restrict\nthe installation or use of ev charging stations or ev dedicated\ntime of use tou meter in a homeowner s designated parking space or\nunit these entities may put reasonable restrictions on ev charging\nstations but the policies may not significantly increase the cost of\nthe ev charging stations or significantly decrease its efficiency or\nperformance restrictions may be placed on tou meter installations if\nthey are based on the structure of or available space in the building \nif installation in the homeowner s designated parking space or unit is\nnot possible with authorization the homeowner may add ev charging\nstations or a ev dedicated tou meter in a common area the homeowner\nmust obtain appropriate approvals from the common interest development\nassociation and agree in writing to comply with applicable architectural\nstandards engage a licensed installation contractor provide a\ncertificate of insurance and pay for the electricity usage \nmaintenance and other costs associated with the ev charging stations or\ntou meter any application for approval should be processed by the\ncommon interest development association without willful avoidance or\ndelay the homeowner and each successive homeowner of the parking space\nor unit equipped with ev charging stations or a tou meter is responsible\nfor the cost of the installation maintenance repair removal or\nreplacement of the equipment as well as any resulting damage to the ev\ncharging stations tou meter or surrounding area the homeowner must\nalso maintain a 1 million umbrella liability coverage policy and name\nthe common interest development as an additional insured entity under\nthe policy if ev charging stations or an ev dedicated tou meter is\ninstalled in a common area for use by all members of the association \nthe common interest 
development must develop terms for use of the ev\ncharging stations or tou meter \n reference california civil code 4745 and 6713 http www oal ca gov ",
"the san joaquin valley air pollution control district sjvapcd and thesouth coast air quality management district aqmd administer enhancedfleet modernization program efmp pilot retire and replace programs providing incentives to replace a vehicle eligible for retirement with amore fuel efficient vehicle used vehicles must be no more than eightyears old and applicants must live in the san joaquin valley or southcoast air basins eligible replacement vehicles must meet a minimum fueleconomy average by model year or average at least 35 miles per gallon mpg alternative fuel vehicles are also eligible including plug inhybrid electric vehicles phev and battery electric vehicles evs funding for alternative transportation mobility options such as publictransportation or car sharing is also available in lieu of purchasinganother vehicle the incentive amounts vary by income level as comparedto the federal poverty level fpl and replacement vehicle type alleligible applicants must have a household income that is at or below400 of the fpl data align center income eligibility fuel economy greater than 35 mpg phev or zev low income 225 fpl 4 500 4 500 moderate income 300 fpl 3 500 3 500 above moderate income 400 fpl 2 500 2 500 residents living in qualified disadvantaged communities may be eligiblefor higher incentive amounts and for residents replacing their vehicleswith a phev or ev a rebate of up to 2 000 for the purchase ofelectric vehicle supply equipment residents of south coast aqmd mayalso be eligible to receive a rebate of 7 500 for alternativetransportation mobility options for more information includingeligible vehicles and applicable requirements see the california airresources board efmp https ww2 arb ca gov our work programs enhanced fleet modernization program sjvapcd driveclean https www valleyair org drivecleaninthesanjoaquin replace and south coast aqmd replace yourride http www replaceyourride com websites reference california health and safety code 44062 3 and 
44125 http www oal ca gov ",
"municipalities may not restrict the types of evs such as plug in hybridelectric vehicles that may access an ev charging station that ispublic intended for passenger vehicle use and funded in any part bythe state or utility ratepayers reference california government code 65850 9 http www oal ca gov ",
"the washington state department of ecology ecology will work with the\noffice of the governor and state agencies to select projects and\ndistribute funding to leverage 15 of washington s portion of the vw\nenvironmental mitigation\ntrust https www epa gov enforcement volkswagen clean air act civil settlement \nfor the acquisition installation operation and maintenance of\nlight duty zero emission vehicle charging infrastructure \necology will establish a competitive process to identify and select\nprojects to fund with the remaining 85 of the appropriation to maximize\ntotal air pollution reduction and health benefits improve air quality\nin areas disproportionately affected by air pollution leverage\nadditional matching funds achieve substantial emission reduction beyond\nwhat would occur absent the funding accelerate fleet turnover to the\ncleanest engines and accelerate adoption of electric vehicles \nequipment and vessels as appropriate ecology will work with state\nagencies to select projects and distribute funding for more\ninformation see the ecology vw enforcement\naction https ecology wa gov about us how we operate grants loans find a grant or loan volkswagen enforcement action grants \nwebsite \n",
"an individual may not park a motor vehicle within any parking space\nspecifically designated for charging evs to use the parking space evs\nmust be actively charging violators may receive a fine of up to 750 \n reference nevada revised statutes 484b 468 https www leg state nv us ",
"by january 1 2026 the kentucky finance and administration cabinet\n cabinet must increase the use of ethanol biodiesel and other\nalternative transportation fuels and replace at least 50 of light duty\nstate fleet vehicles with new alternative fuel vehicles or vehicles\nequipped with low emission technology beginning december 1 2024 the\ncabinet must compile annual reports detailing the progress made towards\nthese requirements including a life cycle cost assessment vehicle\nreplacement timeline and targets for increased alternative fuels in\nstate agency vehicles \n\n reference senate bill 281 2023 https legislature ky gov legislation pages default aspx 26 u s code 30b http www gpo gov fdsys and kentucky revised statutes 45a 625 and 152 715 http lrc ky gov krs titles htm ",
"an income tax credit is available for 50 of the cost of alternative\nfueling infrastructure up to 5 000 qualifying infrastructure\nincludes electric vehicle charging stations and equipment to dispense\nfuel that is 85 or more natural gas propane or hydrogen unused\ncredits may be carried over into future tax years for more information \nincluding how to claim the credit please see the new york state\n department of taxation and\nfinance http www tax ny gov pit credits alt fuels elec vehicles htm \nwebsite reference new york tax\nlaw http public leginfo state ny us lawssrch cgi nvlwo 187 b ",
"tva will establish and fund a network of direct current fast charging\n dcfc stations every 50 miles along interstates and major highways\nthrough the fast charge network program program the program offers\nfunding for public dcfc stations along ev corridor gaps up to 150 000\nper dcfc station eligible applicants include tva local power companies \nand eligible projects must include a minimum of two dcfc ports per\nlocation program participants must identify suitable host sites and\nagree to own operate and maintain program funded dcfc stations for a\nminimum of five years for more information including guidelines and\nadditional eligibility requirements see the tva fast charge\nnetwork https energyright com ev fast charger program website \n",
"pwp offers rebates of 3 000 per port for commercial workplace multi unit dwelling mud and fleet customers for the installation ofnetworked level 2 ev charging stations or rebates of 1 500 per portfor non networked level 2 ev charging stations pwp also offers rebatesof 6 000 for the installation of direct current fast charging dcfc stations or level 2 ev charging stations installed at select sites including disadvantaged communities additional terms and conditionsapply for more information including how to apply see the pwp commercial ev and charger incentiveprogram https ww5 cityofpasadena net water and power commercialchargerrebate website ",
"the u s department of transportation s dot nevi formulaprogram https afdc energy gov laws 12744 requires the alabamadepartment of transportation aldot to submit an annual evinfrastructure deployment plan plan to the dot and u s department ofenergy doe joint office of energy andtransportation https driveelectric gov joint office describinghow the state intends to distribute nevi funds the submitted plans mustbe established according to neviguidance https www fhwa dot gov environment alternative fuel corridors nominations 90d nevi formula program guidance pdf for more information about alabama s nevi planning process see thealabama department of economic and community affairs electric vehiclecharging infrastructure program https adeca alabama gov ev website to review alabama s nevi plan see the joint office state plans for evcharging https driveelectric gov state plans website ",
"at least one parking space or 10 of parking spaces rounded to the next\nwhole number must be made ready for level 2 ev charging stations at all\nnew buildings electrical capacity must accommodate the potential to\nserve a minimum of 20 of the total parking spaces with level 2 ev\ncharging stations for assembly education or mercantile buildings the\nrequirements apply only to employee parking spaces buildings classified\nas utility or miscellaneous and some residential buildings are exempt\nfrom these requirements additional terms and conditions apply \n reference revised code of washington 51 50 0429 https apps leg wa gov rcw ",
"cities and counties that receive funding from the road maintenance andrehabilitation program are encouraged to use funds towards advancedtransportation technologies and communication systems including butnot limited to zero emission vehicle fueling infrastructure andinfrastructure to vehicle communications for autonomous vehicles reference california streets and highways code 2030 http leginfo legislature ca gov faces home xhtml ",
NA, "homeowners associations hoas or condominium associations may not\nprohibit the installation of an ev charging station for personal use\nwithin the ev charging station owner s designated parking space hoas\nmay establish restrictions on the number size placement manner of\ninstallation and insurance requirements for the ev charging station if\nit is installed on the exterior of the property or in a common area \nhoas are not liable for the ev charging station \na condominium association may prohibit the installation of an ev\ncharging station if it is not technically feasible or practical due to\nsafety risks structural issues or engineering conditions condominiums\nmay establish requirements on the manner of installation architectural\ndesign insurance requirements and community related expenses for the\nev charging station \n reference virginia code 55 1 1823 1 55 1 1962 1 and 55 1 2139 1 https law lis virginia gov vacode ",
"new evs must be equipped with a conductive charger inlet port that meetsthe specifications contained in society of automotive engineers sae standard j1772 evs must be equipped with an on board charger with aminimum output of 3 3 kilowatts kw these requirements do not apply toevs that are only capable of level 1 charging which has a maximum powerof 12 amperes amps a branch circuit rating of 15 amps and continuouspower of 1 44 kw reference california code of regulations title 13 section 1962 2 http www oal ca gov "
), charging_type = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA)), row.names = c(NA, -15L), class = c("tbl_df",
"tbl", "data.frame"))
I want the column charging_type to be automatically classified according to the words in the lists.
So far I tried this approach:
find_max_matching_list <- function(text, cleaned_codebook_list) {
  max_matches <- 0
  max_list_name <- NULL
  matched_phrases <- character()
  for (list_name in names(cleaned_codebook_list)) {
    codebook_phrases <- cleaned_codebook_list[[list_name]]
    matches <- sum(sapply(codebook_phrases, function(phrase) grepl(phrase, text)))
    if (matches > max_matches) {
      max_matches <- matches
      max_list_name <- list_name
      matched_phrases <- codebook_phrases[sapply(codebook_phrases, function(phrase) grepl(phrase, text))]
    }
  }
  return(list(max_list_name = max_list_name, matched_phrases = matched_phrases))
}
# Applying the function to each row in df_text
result <- apply(df_text, 1, function(row) {
  find_max_matching_list(row["Text"], cleaned_codebook_list)
})
# Assigning results to the dataframe
df_text$charging_type <- sapply(result, function(res) res$max_list_name)
df_text$matched_words <- sapply(result, function(res) paste(res$matched_phrases, collapse = ", "))
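For reference, requirement (c) can be sketched on toy data as below. This is only a minimal illustration, not a tested solution for the full data: `codebook`, `texts`, and `classify_text` are hypothetical names, the codebook is assumed to be flattened so that each sublist is a plain character vector, and "empty" is represented as `NA`. Matching is done on literal substrings via `grepl(..., fixed = TRUE)`, so a short phrase such as "owner" would also match inside "homeowner".

```r
# Hypothetical toy codebook: each sublist is a plain character vector
codebook <- list(
  private    = c("homeowner", "residential"),
  commercial = c("fleet", "industrial"),
  public     = c("public", "fast charging")
)

# Hypothetical toy texts, including one without matches and one NA
texts <- c(
  "the homeowner may add ev charging stations",
  "rebates for fleet customers and public fast charging stations",
  "evs must be actively charging",
  NA
)

# Return the names of ALL sublists that have at least one phrase in `text`,
# collapsed into one string; NA if the text is NA or nothing matches
classify_text <- function(text, codebook) {
  if (is.na(text)) return(NA_character_)
  hit <- vapply(
    codebook,
    function(phrases) {
      # literal substring match of every phrase against the text
      any(vapply(phrases, grepl, logical(1), x = text, fixed = TRUE))
    },
    logical(1)
  )
  if (!any(hit)) return(NA_character_)
  paste(names(codebook)[hit], collapse = ", ")
}

charging_type <- vapply(texts, classify_text, character(1),
                        codebook = codebook, USE.NAMES = FALSE)
charging_type
# "private"  "commercial, public"  NA  NA
```

On the real data, the same idea would presumably need `unlist()` on each sublist first, because the `dput()` output above wraps every character vector in an extra `list()`.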
You asked the same question yesterday, and I answered it with an easy-to-understand, reproducible example. Such examples are beneficial because other people can often transfer the coding approach to their own field or use case more easily. However, it seems that you have deleted that question, and with it the answer, and are looking for code that uses the exact data you provided. So I'll give you the same answer here as yesterday, this time using your data. If it does not do the job for you, please explain.
data (that you provided; maybe v1 needs some cleaning before, e.g., many empty spaces)