Given this:
dict = Dict(("y" => ":x / 2"))
df = DataFrame(x = [1, 2, 3, 4])
df
4×1 DataFrame
│ Row │ x │
│ │ Int64 │
├─────┼───────┤
│ 1 │ 1 │
│ 2 │ 2 │
│ 3 │ 3 │
│ 4 │ 4 │
I want to make this:
4×2 DataFrame
│ Row │ x │ y │
│ │ Int64 │ Float64 │
├─────┼───────┼─────────┤
│ 1 │ 1 │ 0.5 │
│ 2 │ 2 │ 1.0 │
│ 3 │ 3 │ 1.5 │
│ 4 │ 4 │ 2.0 │
This seems like a perfect application for DataFramesMeta, either @with or @eachrow, but I haven't been able to get my expression to evaluate as expected in an environment where :x exists.
Basically, I want to be able to iterate over (k, v) pairs in dict and create one new column for each Symbol(k) with corresponding values eval(Meta.parse(v)), or something along those lines, where the evaluation occurs such that Symbols like :x exist at the time of evaluation.
I didn't expect this to work, and it doesn't:
[df[Symbol(k)] = eval(Meta.parse(v)) for (k, v) in dict]
ERROR: MethodError: no method matching /(::Symbol, ::Int64)
But this illustrates the problem: I need the expressions to be evaluated in an environment where the symbols they contain exist.
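To make the failure concrete, here is a minimal sketch in Base Julia (no packages needed) that inspects what `Meta.parse` returns for the string above. The variable names `ex` and `err` are only for illustration.

```julia
# Parse the string from the dict and look at the pieces of the expression.
ex = Meta.parse(":x / 2")

println(ex.head)              # kind of expression (a function call)
println(ex.args[1])           # the function being called
println(typeof(ex.args[2]))   # how the parser stores :x
println(eval(ex.args[2]))     # evaluating that piece on its own

# Evaluating the whole expression calls / on a Symbol and an Int,
# because nothing has replaced :x with a column.
err = try
    eval(ex)
catch e
    e
end
println(typeof(err))
```

The parser stores `:x` as a quoted symbol, so a plain `eval` just hands the symbol itself to `/`. Something has to rewrite that quoted symbol into a column lookup before evaluation, which is what the DataFramesMeta macros do when they see the expression at macro-expansion time.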
However, moving it inside a @with doesn't work:
using DataFramesMeta
@with(df, [eval(Meta.parse(v)) for (k, v) in dict])
ERROR: MethodError: no method matching /(::Symbol, ::Int64)
Using @eachrow fails the same way:
using DataFramesMeta
@eachrow df begin
    for (k, v) in dict
        @newcol tmp::Vector{Float32}
        tmp = eval(Meta.parse(v))
    end
end
ERROR: MethodError: no method matching /(::Symbol, ::Int64)
I'm guessing I'm unclear on some key element of how DataFramesMeta creates an environment within a DataFrame. I also don't necessarily have to use DataFramesMeta for this; any reasonably concise option will work, since I can encapsulate it in a package function.
Note: I control the format of the strings to be parsed into expressions, but I want to avoid complexity such as specifying the name of the DataFrame object in the string, or broadcasting every operation. I want the expression syntax in the initial string to be reasonably clear to non-Julia programmers.
UPDATE: I tried all three solutions in the comments on this question, and they have a problem: they don't work inside functions.
dict = Dict(("y" => ":x / 2"))
data = DataFrame(x = [1, 2, 3, 4])
function transform_from_dict(df, dict)
    new = eval(Meta.parse("@transform(df, " * join(join.(collect(dict), " = "), ", ") * ")"))
    return new
end
transform_from_dict(data, dict)
ERROR: UndefVarError: df not defined
Or:
function transform_from_dict!(df, dict)
    [df[!, Symbol(k)] = eval(:(@with(df, $(Meta.parse(v))))) for (k, v) in dict]
    return nothing
end
transform_from_dict!(data, dict)
ERROR: UndefVarError: df not defined
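Both errors appear to come from the same rule: `eval` always evaluates in the global scope of the module, so a function argument such as `df` is invisible to it. A minimal sketch in Base Julia that reproduces this without any DataFrame; the function names `halve_broken` and `halve_fixed` are hypothetical, and splicing the value with `$` is one possible way around the problem, not the only one.

```julia
# eval runs in the module's global scope, not in the calling function.
function halve_broken(column_values)
    # The expression refers to the *name* column_values. eval looks that
    # name up as a global variable and cannot see the function argument.
    return eval(:(column_values / 2))
end

function halve_fixed(column_values)
    # $column_values splices the argument's value into the expression,
    # so no name lookup is needed when eval runs.
    return eval(:($column_values / 2))
end

# Capture the error instead of aborting the script.
broken_result = try
    halve_broken([1, 2, 3, 4])
catch e
    e
end
println(typeof(broken_result))
println(halve_fixed([1, 2, 3, 4]))
```

The same idea carries over to the DataFrame case: interpolating the data frame itself (`$df`) rather than writing the name `df` removes the dependency on a global variable.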
I worked on this answer in parallel with @Ajar; nothing is copied from that answer, and I did not know about it at the time. I was completely new to Julia, so I installed it locally because I assumed the online compilers would not know about DataFrame. I later understood that the packages have to be loaded at the start of the script in any case, online or offline. I have added the package information that beginners might need.
The @with solution:
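The code for this step did not survive in the text here, so what follows is only a minimal sketch of how an @with-based version could work inside a function, not necessarily the original code. It assumes DataFrames and DataFramesMeta are installed, and it assumes that interpolating the data frame with `$df` is acceptable; the function name is taken from the question.

```julia
using DataFrames
using DataFramesMeta

# Sketch: add one column per (name => expression string) pair, in place.
function transform_from_dict!(df, dict)
    for (k, v) in dict
        # $df splices the DataFrame object itself into the expression, so
        # eval (which runs in global scope) does not need a global named df.
        # @with then resolves symbols such as :x against that data frame.
        df[!, Symbol(k)] = eval(:(@with($df, $(Meta.parse(v)))))
    end
    return nothing
end

dict = Dict("y" => ":x / 2")
data = DataFrame(x = [1, 2, 3, 4])
transform_from_dict!(data, dict)
println(data)
```

Since this evaluates arbitrary strings as code, it is only reasonable when the strings come from a trusted source, as is the case in the question.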
The @transform solution:
Thanks go to the other commenters; the essential ideas are listed in @Ajar's answer.