Why doesn't Scala optimize calls to the same Extractor?

151 Views Asked by At

Take the following example, why is the extractor called multiple times as opposed to temporarily storing the results of the first call and matching against that. Wouldn't it be reasonable to assume that results from unapply would not change given the same string.

object Name {
  val NameReg = """^(\w+)\s(?:(\w+)\s)?(\w+)$""".r

  def unapply(fullName: String): Option[(String, String, String)] = {
    val NameReg(fname, mname, lname) = fullName
    Some((fname, if (mname == null) "" else mname, lname))
  }
}

"John Smith Doe" match {
  case Name("Jane", _, _) => println("I know you, Jane.")
  case Name(f, "", _) => println(s"Hi ${f}")
  case Name(f, m, _) => println(s"Howdy, ${f} ${m}.")
  case _ => println("Don't know you")
}
2

There are 2 best solutions below

0
On BEST ANSWER

Wouldn't it be reasonable to assume that results from unapply would not change given the same string.

Unfortunately, assuming isn't good enough for a (static) compiler. In order for memoizing to be a legal optimization, the compiler has to prove that the expression being memoized is pure and referentially transparent. However, in the general case, this is equivalent to solving the Halting Problem.

It would certainly be possible to write an optimization pass which tries to prove purity of certain expressions and memoizes them iff and only iff it succeeds, but that may be more trouble than it's worth. Such proofs get very hard very quickly, so they are only likely to succeed for very trivial expressions, which execute very quickly anyway.

0
On

What is a pattern match? The spec says it matches the "shape" of the value and binds vars to its "components."

In the realm of mutation, you have questions like, if I match on case class C(var v: V), does a case C(x) capture the mutable field? Well, the answer was no:

https://issues.scala-lang.org/browse/SI-5158

The spec says (sorry) that order of evaluation may be changed, so it recommends against side-effects:

In the interest of efficiency the evaluation of a pattern matching expression may try patterns in some other order than textual sequence. This might affect evaluation through side effects in guards.

That's in relation to guard expressions, presumably because extractors were added after case classes.

There's no special promise to evaluate extractors exactly once. (Explicitly in the spec, that is.)

The semantics are only that "patterns are tried in sequence".

Also, your example for regexes can be simplified, since a regex will not be re-evaluated when unapplied to its own Match. See the example in the doc. That is,

Name.NameReg findFirstMatchIn s map {
  case Name("Jane",_,_) =>
}

where

object Name { def unapply(m: Regex.Matcher) = ... }