How do compiled queries in slick actually work?


I am looking for a detailed explanation of how compiled queries are executed. I can't understand how they compile only once, or what the advantage of using them is.


There are 2 best solutions below


Assuming this question is about the usage, not the internal implementation of Compiled queries, here is my answer:

When you write a Slick query, Slick internally creates a data structure for all the involved expressions - an abstract syntax tree (AST). When you want to run this query, Slick takes that data structure and translates (in other words, compiles) it into a SQL string. This can be a fairly time-intensive process, sometimes taking longer than actually executing a fast SQL query on the DB. So ideally we shouldn't redo this translation to SQL every single time the query needs to be executed. But how to avoid it? By caching the translated/compiled SQL query.
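As a REPL-style sketch of what this means in practice (the users table here is hypothetical), you can ask a query's result action for its generated SQL; producing that string is exactly the Scala-to-SQL compilation step described above:

```scala
import slick.jdbc.H2Profile.api._

class Users(tag: Tag) extends Table[(Long, String)](tag, "users") {
  def id   = column[Long]("id", O.PrimaryKey)
  def name = column[String]("name")
  def *    = (id, name)
}
val users = TableQuery[Users]

// Building the query only builds the AST; no SQL exists yet.
val q = users.filter(_.name === "Alice").map(_.id)

// Asking for the statement triggers the Scala-to-SQL compilation.
// Without caching, this translation happens again for every execution.
println(q.result.statements.head)
```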

Slick could compile a query only the first time it is run and cache the result for subsequent runs. But it doesn't, because that would make it harder for the user to reason about Slick's execution time: the same code would be slow the first time but faster later. (Slick would also need to recognize a query when it is run a second time and look up the SQL in some internal cache, which would complicate the implementation.)

So instead Slick compiles the query every time, unless you explicitly cache it. This makes the behavior very predictable and ultimately easier to reason about. To cache a query, you wrap it in Compiled and store the result in a place that will NOT be recomputed the next time you need the query. So using a def like def q1 = Compiled(...) does not make much sense, because it would compile the query every time. It should be a val or lazy val. You also probably do not want to put that val into a class you instantiate multiple times. A good place is a val in a top-level Scala singleton object, which is computed only once and kept for the lifetime of the JVM.
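A minimal sketch of that recommended pattern (table and query names are hypothetical): the Compiled value lives in a singleton object, so the Scala-to-SQL compilation runs exactly once per JVM.

```scala
import slick.jdbc.H2Profile.api._

object UserQueries {
  class Users(tag: Tag) extends Table[(Long, String)](tag, "users") {
    def id   = column[Long]("id", O.PrimaryKey)
    def name = column[String]("name")
    def *    = (id, name)
  }
  val users = TableQuery[Users]

  // val, not def: compiled once when the object is initialized, reused forever.
  val byId = Compiled { (id: Rep[Long]) => users.filter(_.id === id) }
}

// Each call only substitutes the parameter into the cached SQL:
// db.run(UserQueries.byId(42L).result)
```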

So in other words, Compiled does nothing magical. It only allows you to trigger Slick's Scala-to-SQL compilation explicitly and returns a value that contains the SQL. Importantly, this lets you trigger compilation separately from actually executing the query, so you can compile once but run the query multiple times.


The advantage is easy to explain: Query compilation takes time, both in Slick and in the database server. If you execute the same query many times, it's faster to compile only once.

Slick needs to compile an AST with collection operations into a SQL statement. (Without compiled queries you still build the AST for every execution, but compared to the compilation time this step is extremely fast.)

The database server has to build an execution plan for a query. This means parsing the query, translating it to native database operations, and finding optimizations based on the data layout (e.g. which index to use). This part can be avoided even if you don't use compiled queries in Slick, simply by using bind variables, so that you always get the same SQL string for different sets of parameters. A database server keeps a cache of recently used / compiled execution plans, so as long as the SQL statement is identical, the execution plan is only a hash lookup away and doesn't need to be computed again. Slick relies on this kind of caching; there is no direct communication from Slick to the database server to reuse an old query.
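A small sketch of the bind-variable point (the users table here is hypothetical): Slick's .bind turns a literal into a bind variable, so the generated SQL string stays identical across parameter values and the server can reuse its cached execution plan.

```scala
import slick.jdbc.H2Profile.api._

class Users(tag: Tag) extends Table[(Long, String)](tag, "users") {
  def id   = column[Long]("id", O.PrimaryKey)
  def name = column[String]("name")
  def *    = (id, name)
}
val users = TableQuery[Users]

// Plain literal: the value is inlined, so each id yields a different SQL string.
val inlined = users.filter(_.id === 42L)

// .bind turns the literal into a bind variable ("?" in the SQL),
// giving one SQL string for all ids.
val bound = users.filter(_.id === 42L.bind)
```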

As to how they are implemented, there is some additional complexity for dealing with streaming / non-streaming and compiled / applied / ad-hoc queries in the same way, but the interesting entry point is in Compiled:

implicit def function1IsCompilable[A, B <: Rep[_], P, U](implicit
    ashape: Shape[ColumnsShapeLevel, A, P, A],
    pshape: Shape[ColumnsShapeLevel, P, P, _],
    bexe: Executable[B, U]
): Compilable[A => B, CompiledFunction[A => B, A, P, B, U]] =
  new Compilable[A => B, CompiledFunction[A => B, A, P, B, U]] {
    def compiled(raw: A => B, profile: BasicProfile) =
      new CompiledFunction[A => B, A, P, B, U](raw, identity[A => B],
        pshape.asInstanceOf[Shape[ColumnsShapeLevel, P, P, A]], profile)
  }

This gives you an implicit Compilable object for every Function1. Similar methods for arities 2 to 22 are auto-generated. Because the individual parameters only need a Shape, they can also be nested tuples, HLists or any custom type. (We still provide abstractions for all function arities because it's syntactically more convenient to write, say, a Function10 than a Function1 that takes a Tuple10 as its argument.)
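To make the arity point concrete, here is a sketch of a two-parameter compiled function (hypothetical users table); the auto-generated arity-2 instance lets you write a Function2 directly instead of a Function1 taking a Tuple2:

```scala
import slick.jdbc.H2Profile.api._

class Users(tag: Tag) extends Table[(Long, String)](tag, "users") {
  def id   = column[Long]("id", O.PrimaryKey)
  def name = column[String]("name")
  def *    = (id, name)
}
val users = TableQuery[Users]

// A Compiled Function2: both parameters become bind variables in the SQL.
val idRange = Compiled { (min: Rep[Long], max: Rep[Long]) =>
  users.filter(u => u.id >= min && u.id <= max)
}

// Applied like a normal function: db.run(idRange(10L, 20L).result)
```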

There's a method in Shape that only exists to support compiled functions:

/** Build a packed representation containing QueryParameters that can extract
  * data from the unpacked representation later.
  * This method is not available for shapes where Mixed and Unpacked are
  * different types. */
def buildParams(extract: Any => Unpacked): Packed

The "packed" representation built by this method can produce an AST containing QueryParameter nodes with the correct type. They are treated the same as other literals during compilation, except that the actual values are not known. The extractor starts as identity at the top level and is refined to extract record elements as required. For example, if you have a Tuple2 parameter, the AST will end up with two QueryParameter nodes which know how to extract the first and second parameter of the tuple at a later point.

This later point is when the compiled query is applied. Executing such an AppliedCompiledFunction uses the pre-compiled SQL statement (or compiles it on the fly when you use it for the first time) and fills in the statement parameters by threading the argument value through the extractors.
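The extractor threading described above can be illustrated in plain Scala (a conceptual sketch, not the actual Slick internals): the top-level extractor starts as identity and is refined per record element, so each QueryParameter knows how to pull its own value out of the argument.

```scala
// Top-level extractor: identity on the whole argument (here a Tuple2).
val top: Any => (Int, String) = _.asInstanceOf[(Int, String)]

// Refined extractors, as conceptually attached to the two QueryParameter nodes:
val first:  Any => Int    = top.andThen(_._1)
val second: Any => String = top.andThen(_._2)

// When the compiled query is applied, each statement parameter is filled in
// by threading the argument through its extractor:
first((42, "Alice"))  // 42
second((42, "Alice")) // "Alice"
```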