With the new Clojure 1.7 I decided to figure out where I can use transducers. I understand what benefits they can give, but I can't find good examples of writing custom transducers with explanations.
OK, I tried to test what is happening. I opened the Clojure documentation, and the examples there use `xf` as an argument. First: what does this `xf` or `xform` mean?
The following produces an identity transducer:
(defn my-identity [xf]
  (fn
    ([]
     (println "Arity 0.")
     (xf))
    ([result]
     (println "Arity 1: " result " = " (xf result))
     (xf result))
    ([result input]
     (println "Arity 2: " result input " = " (xf result input))
     (xf result input))))
I took the parameter names `[result input]` from the documentation example. I thought it works like the `reduce` function, where `result` is the part reduced so far and `input` is the next collection element.
So when I ran `(transduce my-identity + (range 5))` I got the result `10`, which is what I expected.
Then I read about `eduction`, but I can't understand what it is. Anyway, I ran `(eduction my-identity (range 5))` and got:
Arity 2: nil 0 = nil
Arity 2: nil 1 = nil
Arity 1: nil = nil
(0 0 1 1)
Every item got duplicated because I call `xf` in the `println` statement.
Why is every item duplicated?
Why did I get `nil`?
Will I always get `nil` when making an eduction?
Can I rely on this behavior?
Anyway, I ran
> (reduce + (eduction my-identity (range 5)))
clojure.core.Eduction cannot be cast to clojure.lang.IReduce
OK, the result is an `Eduction` that is NOT reducible, but is printed like a list. Why is it not reducible? When I type `(doc eduction)` I get:
Returns a reducible/iterable application of the transducers
to the items in coll.
Shouldn't `(transduce xform f coll)` and `(reduce f (eduction xform coll))` be the same?
I ran
> (reduce + (sequence my-identity (range 5)))
20
Of course I got `20` because of the duplicates. Again, I thought that `(transduce xform f coll)` and `(reduce f (sequence xform coll))` should always be equal, at least in such a small example without any stateful transducers. Is it stupid that they are not, or am I wrong?
OK, then I tried `(type (sequence my-identity (range 5)))` and got
clojure.lang.LazySeq
I thought it was lazy, but when I tried to take the `first` element, Clojure computed the whole sequence at once.
So, my summary:
1) What does `xf` or `xform` mean?
2) Why do I get `nil` as the `result` argument with `eduction` or `sequence`?
3) Can I always be sure that it will be `nil` with `eduction` or `sequence`?
4) What is `eduction`, and what is the idiomatic reason that it is not reducible? Or if it is, how can I reduce it?
5) Why do I get side effects with `sequence` or `eduction`?
6) Can I create actual lazy sequences with transducers?
Many questions; let's start with a few answers:
Yes, `xf` == `xform` is a "transducer".
Your `my-identity` function does not compile. You have a parameter and then multiple other arities of the function. I believe you forgot a `(fn ...)`.
Your argument to your identity transducer is called `xf`. However, this is usually called `rf`, which means "reducing function". Now the confusing part is that what a transducer returns is also a reducing function (hence `comp` just works). Still, it's confusing that you call it `xf`; you should call it `rf`.
Transducers are usually "constructed" since they may be stateful and/or are passed parameters. In your case, you don't need to construct it since it's simple and doesn't have state or even a parameter. However, be aware that you'd usually wrap your function in another `fn`-returning function. This means you'd have to call `(my-identity)` instead of just passing it as `my-identity`. Again, it's fine here, just slightly unconventional and possibly confusing.
Let's first continue and pretend that your `my-identity` transducer is correct (it's not, and I'll explain later what's going on).
`eduction` is relatively rarely used. It creates a "process", i.e. you can run it over and over again and see the result. Basically, just like you have lists or vectors that hold your items, the eduction will "hold" the result of the transducer applied. Note that to actually do anything you still need an `rf` (reducing function).
In the beginning I think it is helpful to think of reducing functions as `conj` (or actually `conj!`) or, in your case, `+`.
Your `eduction` prints the elements it produces since it implements `Iterable`, which is called by `println` or your REPL. It simply prints out every element that you add in your transducer with the arity 2 call.
Your call to `(reduce + (eduction my-identity (range 5)))` doesn't work since `Eduction` (the object being constructed in `eduction`) only implements `IReduceInit`. `IReduceInit`, as its name suggests, does require an initial value. So this will work: `(reduce + 0 (eduction my-identity (range 5)))`
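
To see the init-value form in isolation, here is a minimal sketch that uses the built-in `(map inc)` transducer instead of `my-identity`, so that the flaw discussed further down does not get in the way. The var names `ed`, `total` and `as-vec` are made up for illustration.

```clojure
;; An eduction bundles a transducer with a collection; nothing runs yet.
(def ed (eduction (map inc) (range 5)))

;; Supplying the init value 0 uses the three-argument form of reduce,
;; which is the one IReduceInit supports.
(def total (reduce + 0 ed))   ; sums 1 2 3 4 5 => 15

;; The same eduction can be run again, here poured into a vector.
(def as-vec (into [] ed))     ; => [1 2 3 4 5]
```

Each use of `ed` re-applies the transducer to the source collection, which is what "process" means above.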
Now if you run the above `reduce` as I suggest, you'll see something very interesting. It prints 10, even though your eduction earlier printed `(0 0 1 1 2 2 3 3 4 4)` (which, if you add it together, is 20). What's going on here?
As stated earlier, your transducer has a flaw. It doesn't work properly. The problem is that you call your `rf` and then call it a second time in your arity 2 function. In Clojure, stuff isn't mutable, unless it's somehow internally mutable for optimization purposes :). Here the problem is that sometimes Clojure uses mutation, and you get duplicates even though you never capture the result of the first call to `rf` in your arity 2 function (the one made as the argument to your `println`).
Let's fix your function but leave the second `rf` call in there:
Note:
- I renamed `xf` to `rf` as noted earlier.
- The result of the first call to `rf` is captured and passed on to the second call of `rf`.
- This transducer is not an identity transducer but doubles every element.
Observe carefully:
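
A minimal sketch of such a transducer, following the notes above. The name `my-doubling` is made up for illustration, the debug `println` calls are left out to keep it short, and the `reduced?` check is an added safety measure rather than something the notes mention.

```clojure
;; Hypothetical name. Calls rf twice per input, threading the result of
;; the first call into the second, so every element is emitted twice.
(defn my-doubling [rf]
  (fn
    ([] (rf))
    ([result] (rf result))
    ([result input]
     (let [r1 (rf result input)]
       ;; Do not call rf again if the downstream step asked to stop.
       (if (reduced? r1)
         r1
         (rf r1 input))))))

(def via-transduce (transduce my-doubling + (range 5)))              ; => 20
(def via-eduction  (reduce + 0 (eduction my-doubling (range 5))))    ; => 20
(def via-sequence  (sequence my-doubling (range 5)))  ; => (0 0 1 1 2 2 3 3 4 4)
```

Because the result of the first `rf` call is no longer thrown away, all three contexts now agree: the sum is 20 everywhere, and the sequence shows each element twice.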
To answer your questions:
`eduction` doesn't really pass `nil` as the `result` argument when it's being reduced. It only gets `nil` when being printed, which goes through the `Iterable` interface.
The `nil` really comes from `TransformerIterator`, which is a special class created for transducers. This class is also used for `sequence`, as you noticed. As the docs state:
The reason that you receive `nil` as the `result` argument is that an iterator has no resulting collection which holds the elements iterated over so far. It simply goes over each element. No state is being accumulated.
You can see the reducing function that is used by the `TransformerIterator` as an inner class here: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/TransformerIterator.java
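
The difference in the `result` argument can also be observed from the outside. The sketch below uses a hypothetical helper, `spy-result`, which records every `result` value its arity 2 step receives; the atoms and their names are assumptions made for this example.

```clojure
;; Hypothetical helper: returns a transducer that records each `result`
;; handed to its arity 2 step into the atom `seen`.
(defn spy-result [seen]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input]
       (swap! seen conj result)
       (rf result input)))))

;; Under transduce, `result` is the running sum kept by +.
(def seen-transduce (atom []))
(transduce (spy-result seen-transduce) + (range 3))
;; @seen-transduce => [0 0 1]

;; Under sequence, the iterator keeps no accumulator, so `result` is nil.
(def seen-sequence (atom []))
(doall (sequence (spy-result seen-sequence) (range 3)))
;; @seen-sequence => [nil nil nil]
```

Note that `spy-result` takes a parameter, so it is "constructed" by calling it, unlike `my-identity`, which is passed as is.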
In `TransformerIterator.java`, search (`CTRL+F`) for `xf.invoke` to see how your transducer is getting called.
The `sequence` function isn't really as lazy as a truly lazy sequence, but I think this explains this part of your question: Are Clojure transducers eager?
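
One way to check how eager `sequence` is on your own setup is to count how many inputs pass through a side-effecting transducer. This is only a sketch: the names `processed`, `counting` and `s` are made up, and the exact count may depend on the Clojure version, so the comments avoid stating a number.

```clojure
(def processed (atom 0))

;; A transducer with a side effect, so we can count how many inputs ran.
(def counting (map (fn [x] (swap! processed inc) x)))

(def s (sequence counting (range 1000)))
(def first-item (first s))

;; Snapshot of the counter right after taking only the first element.
;; Expect more than one input to have been processed, but far fewer
;; than all 1000.
(def after-first @processed)
```

With an input as short as `(range 5)`, that batch already covers everything, which matches what you saw.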
`sequence` simply computes the results of a transducer incrementally. Nothing else.
Lastly, a proper identity function with some debug statements:
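
A minimal sketch of what that could look like, keeping the messages from the question. The key point is that `rf` is called exactly once per arity and its result is bound before it is printed and returned.

```clojure
(defn my-identity [rf]
  (fn
    ([]
     (println "Arity 0.")
     (rf))
    ([result]
     ;; Call rf once, then print and return the same value.
     (let [res (rf result)]
       (println "Arity 1:" result "=" res)
       res))
    ([result input]
     (let [res (rf result input)]
       (println "Arity 2:" result input "=" res)
       res))))
```

With this version, `(transduce my-identity + (range 5))` and `(reduce + 0 (eduction my-identity (range 5)))` both return 10, and `(sequence my-identity (range 5))` yields `(0 1 2 3 4)` without duplicates.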