I have some data that I extracted with ArcGIS, and I want to construct a database keyed by watershed identifier (for example, HUC_8 = 14040106). The data contains the watershed identifier (HUC_8), the watershed area, the soil type, and the soil area. Each watershed identifier is repeated once for every soil type it contains. I want to build a table organized by watershed, with each identifier appearing only once in its column, and with the soil area for each soil type extracted into its own column. I have attached a subset of the data below, which I hope makes this clear. I am somewhat new to R, but I feel that this could be done with a for loop. Knowing how to do this would be extremely helpful, since I work a lot with GIS but would like to perform more of my analysis in R.
HUC_8 WatershedArea Soil SoilArea A_Area B_Area C_Area D_Area Null_Area
14040106 461104.4883 B 96590.33424
14040106 461104.4883 C 86282.93487
14040106 461104.4883 D 24945.9992
14050007 921494.3621 Null 2.861388
14050007 921494.3621 A 87214.28385
14050007 921494.3621 B 131417.8659
14050007 921494.3621 C 268324.5125
14050007 921494.3621 D 314131.5806
14060001 627348.8316 Null 8119.375083
14060001 627348.8316 A 5315.511117
14060001 627348.8316 B 286915.9001
14060001 627348.8316 C 114357.5251
14060001 627348.8316 D 163671.7545
Essentially, it sounds like you want to reshape your data from long format to wide format. The reshape2 library can come in handy here: its dcast function transforms the data into a wide data frame with one row per watershed and one column per soil type, so no for loop is needed.
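As a minimal sketch of that reshape, assuming the subset from the question is read into a data frame (the names long, wide, and soil_cols are my own, not from the original post), something like the following should work. Reading HUC_8 as character is also an assumption, made so the identifier is treated as a label rather than a number.

```r
library(reshape2)

# Subset of the data from the question, in long format:
# one row per watershed / soil-type combination.
# HUC_8 is read as character (an assumption) so it stays an identifier.
long <- read.table(header = TRUE, stringsAsFactors = FALSE,
                   colClasses = c("character", "numeric", "character", "numeric"),
                   text = "
HUC_8 WatershedArea Soil SoilArea
14040106 461104.4883 B 96590.33424
14040106 461104.4883 C 86282.93487
14040106 461104.4883 D 24945.9992
14050007 921494.3621 Null 2.861388
14050007 921494.3621 A 87214.28385
14050007 921494.3621 B 131417.8659
14050007 921494.3621 C 268324.5125
14050007 921494.3621 D 314131.5806
14060001 627348.8316 Null 8119.375083
14060001 627348.8316 A 5315.511117
14060001 627348.8316 B 286915.9001
14060001 627348.8316 C 114357.5251
14060001 627348.8316 D 163671.7545
")

# Long -> wide: one row per HUC_8 / WatershedArea pair,
# one column per soil type, filled with the matching SoilArea.
wide <- dcast(long, HUC_8 + WatershedArea ~ Soil, value.var = "SoilArea")

# Rename the soil-type columns to match the headers in the question
# (A -> A_Area, ..., Null -> Null_Area).
soil_cols <- setdiff(names(wide), c("HUC_8", "WatershedArea"))
names(wide)[match(soil_cols, names(wide))] <- paste0(soil_cols, "_Area")

wide
```

Soil types that do not occur in a watershed (for example, type A in 14040106) come out as NA. If you would rather have zeros there, dcast accepts fill = 0.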