How many routes does cowboy support?


I'm looking for a way, in Cowboy, to map arbitrary paths (stored in the database) to specific blog posts.

That is: I've got a few thousand blog posts, which are accessible via a few names each, such as the canonical URL (e.g. /post/42), some aliases (e.g. /2013/11/25/erlang-rocks), historical locations (e.g. /path-on-old-blog/12345), etc.

I know that I could simply use a catch-all route:

{ "/[...]", catch_all_handler, [] },

...and then look up the path in the database, but I was considering creating the routes from the database, as follows:

Posts = posts:all(),
Paths = [get_handlers_for_post(P) || P <- Posts],
Routes = lists:flatten(Paths),

get_handlers_for_post(P) ->
    % Generate a list of paths with IDs from the database.
    % Return something that looks like this:
    %  [{"/canonical/1", post_handler, [1]},
    %   {"/first-alias", post_handler, [1]}].
    % TODO: code goes here...

That is: put all the possible paths in the router, pointing to the same handler, each with the ID of the post.
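As a minimal sketch of that idea, assume each post is available as an {Id, Paths} tuple. That shape, the module name post_routes, and the function routes/1 are hypothetical and only stand in for whatever posts:all() really returns:

```erlang
-module(post_routes).
-export([routes/1]).

%% Assumption: Posts is a list of {Id, Paths} tuples, where Paths is a
%% list of path strings (canonical URL, aliases, historical locations).
%% Every path becomes one route tuple pointing at the same handler,
%% carrying the post ID as the handler option.
routes(Posts) ->
    [{Path, post_handler, [Id]} || {Id, Paths} <- Posts, Path <- Paths].
```

The resulting flat list is what would be passed to the router, for example as cowboy_router:compile([{'_', Routes}]).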

The question is: Is this sensible? How many routes does Cowboy support?


There are 2 best solutions below

3
On

You can do it, but there is no need. Cowboy has a very effective pattern matching syntax for routing. Take the routes from your example, for instance:

[{"/canonical/1", post_handler, [1]},
 {"/first-alias", post_handler, [1]}].

The first URL has an additional path segment, which can be treated as optional. In Cowboy you can represent these two routes as

"/:first/[:second]"

This matches /canonical/1 as well as /first-alias.

Both first and second are parametrized and can take any value. The square brackets around :second indicate that the segment is optional.

So how do you actually access these parameters in route handlers?

It is simple really. Cowboy provides a binding function in the cowboy_req module, and you can access the parameters of your URL from there like so:

cowboy_req:binding(first,Req)

For your first URL this returns {<<"canonical">>,Req}. That is the Cowboy 1.x return shape; in Cowboy 2.x, binding/2 returns just the value.

Notice that the binding name is passed as an atom. Use parameters and optional parameters and you should be able to match your entire URL collection.

Read more on routing here

More explanation

As I understand it, you have thousands of different blog posts and their URLs are not consistent. Rather than creating routes dynamically, I suggest finding groups of URLs that follow a consistent pattern and covering each group with one route. Fallback occurs automatically in Cowboy: if a path does not match one pattern, the router tries the next one, and so on.

For instance,

/:a/:b

will match

/hello/man, /hello/world, /hello/slash/

but won't match /hello/man/world.

/:a/:b/[:c]

will match /hello/man, /hello/world, and /hello/man/world.

There is no hard limit on the number of routes. You can have as many as you need.

0
On

Asked long ago, still interesting.

No, I don't think generated Cowboy routing rules are an efficient way to do a lookup on a large set of unstructured paths.

The dispatch rules produced by cowboy_router:compile/1 are a structure of tuples, lists and binaries, like this:

[{'_',[],
      [{[<<"canonical">>,<<"1">>],[],post_handler,[1]},
       {[<<"first-alias">>],[],post_handler,[1]}]}]

Routing is a linear search in this structure. It is copied to every request handler process, so if it is very large, the copying would have a significant overhead per request.

In recent versions of Cowboy, the routes can be stored in persistent_term, which eliminates the copying. It is still a linear search though.

For a large set of unstructured paths, I believe an ETS table lookup would be more efficient, since ETS tables of type set are implemented as hash tables.
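A minimal sketch of the ETS approach follows. The module name path_index, its functions, and the not_found return value are assumptions for illustration and not part of Cowboy:

```erlang
-module(path_index).
-export([new/1, lookup/2]).

%% Build a table of {Path, PostId} rows from a list of such tuples.
%% The `set` type is hash-based, so a lookup by key does not scan the rows.
new(Aliases) ->
    Tab = ets:new(path_index, [set, protected, {read_concurrency, true}]),
    true = ets:insert(Tab, Aliases),
    Tab.

%% Return {ok, PostId} for a known path, or not_found otherwise.
lookup(Tab, Path) ->
    case ets:lookup(Tab, Path) of
        [{_Path, PostId}] -> {ok, PostId};
        [] -> not_found
    end.
```

A catch-all handler could call path_index:lookup/2 with the request path. The table belongs to the process that creates it, so it should be created by a long-lived process rather than by a request handler.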

Another option I want to mention, since you're considering code generation, is to generate an Erlang module containing a function which does the lookup. This eliminates copying and can benefit from compiler optimizations of pattern matching.

%% Generated module
-module(blog_path_aliases).
-export([lookup/1]).
lookup(<<"/2013/11/25/erlang-rocks">>) -> 42;
lookup(<<"/path-on-old-blog/12345">>) -> 42;
lookup(<<"/some-other/path">>) -> 123;
...
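One possible way to produce and load such a module is sketched below. The alias_codegen module, the {Path, Id} input shape, and the final catch-all clause returning not_found are assumptions added for illustration. Paths are written with ~w, so they appear in the generated source as byte lists such as <<47,97>>, which is less readable than the hand-written example but compiles to the same binaries:

```erlang
-module(alias_codegen).
-export([source/1, load/1]).

%% Build the source text of blog_path_aliases as iodata:
%% one lookup/1 clause per {Path, Id} pair, plus an assumed fallback clause.
source(Aliases) ->
    Clauses = [io_lib:format("lookup(~w) -> ~w;~n", [Path, Id])
               || {Path, Id} <- Aliases],
    ["%% Generated module\n",
     "-module(blog_path_aliases).\n",
     "-export([lookup/1]).\n",
     Clauses,
     "lookup(_) -> not_found.\n"].

%% Write the source to the current directory, compile it in memory,
%% and load the resulting code into the running node.
load(Aliases) ->
    File = "blog_path_aliases.erl",
    ok = file:write_file(File, source(Aliases)),
    {ok, Module, Binary} = compile:file(File, [binary, return_errors]),
    code:load_binary(Module, File, Binary).
```

After alias_codegen:load/1 returns {module, blog_path_aliases}, a handler can call blog_path_aliases:lookup/1 directly. The module would need to be regenerated and reloaded whenever the aliases in the database change.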