Suppose I have something like the following vector:
text <- as.character(c("string1", "str2ing", "3string", "stringFOUR", "5tring", "string6", "s7ring", "string8", "string9", "string10"))
I want to execute a loop that does pair-wise comparisons of the edit distance for all possible combinations of these strings (e.g. string 1 to string 2, string 1 to string 3, and so forth). The output should be a matrix with one row and one column per string.
I have the following code below:
#Matrix of pair-wise combinations
m <- expand.grid(text,text)
#Define number of strings
n <- c(1:10)
#Begin loop; "method='osa'" in stringdist is default
for (i in 1:10) {
n[i] <- stringdist(m[i,1], m[i,2], method="osa")
write.csv(data.frame(distance=n[i]),file="/File/Path/output.csv",append=TRUE)
print(n[i])
flush.console()
}
The stringdist() function is from the stringdist package; base R's utils package provides a similar function, adist().
My question is, why is my loop not writing the results as a matrix, and how do I stop the loop from overwriting each individual distance calculation (ie: save all results in matrix form)?
I would suggest using stringdistmatrix instead of stringdist (especially if you are using expand.grid).
As for your concrete question: "My question is, why is my loop not writing the results as a matrix"
It is not clear why you would expect the output to be a matrix. You are calculating one element at a time, saving it to a vector, and then writing each value to disk.
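To illustrate the stringdistmatrix suggestion, here is a minimal sketch. It assumes the stringdist package is installed, and the output file is a temporary file chosen only for illustration, not the path from the question.

```r
library(stringdist)  # assumes the stringdist package is installed

text <- c("string1", "str2ing", "3string", "stringFOUR", "5tring",
          "string6", "s7ring", "string8", "string9", "string10")

# All pairwise OSA distances in one call; the result is a 10 x 10 matrix
d <- stringdistmatrix(text, text, method = "osa")
dimnames(d) <- list(text, text)  # label rows and columns with the strings

# Write the whole matrix once instead of appending inside a loop
out_file <- tempfile(fileext = ".csv")  # hypothetical output location
write.csv(d, file = out_file)
```

Because the matrix is built in a single call and written once, nothing is overwritten and no loop is needed.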
Also, you should be aware that several arguments of write.csv, including append, cannot be changed: attempts to set them are ignored with a warning, so every iteration of your loop overwrites the file. Use write.table instead.
If you want to do this iteratively, I would do the following: