How does Plug.Cowboy Work
Overview
Architecture Overview
Plug
Cowboy
Plug.Cowboy
Ranch
Receiving a request
ranch_conns_sup:start_protocol/2
Cowboy HTTP
Cowboy Stream
cowboy_stream_h
Cowboy Router
Cowboy Handler
Plug.Cowboy.Handler
Sending a response
Starting Plug.Cowboy
Plug.Cowboy.child_spec/2
Ranch supervisor
Ranch Acceptors Supervisor
Ranch Conns Supervisor
References
A deep dive into how the https://github.com/elixir-plug/plug_cowboy library works. This tour does not aim to cover every single aspect of the Plug.Cowboy library; rather, it aims to provide a good understanding of the main mechanisms operating under the hood.
Overview
Today we'll look at how an HTTP request is handled by Plug and all the different systems involved in making that happen. By the end of this code tour, you'll understand the full lifecycle of an HTTP request, from a socket being opened to your response being sent back.
Architecture Overview
The overall architecture of Plug.Cowboy relies on four main components:
Plug
Plug is a specification to help you build web endpoints. It gives you the tools to handle HTTP requests, set status codes and send responses back. Here's a very simple example:
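A minimal sketch of such a plug, assuming the `plug_cowboy` dependency is available in your project. The module name `Example.HelloWorldPlug` matches the one referenced later in this tour, while the response text is purely illustrative:

```elixir
defmodule Example.HelloWorldPlug do
  @behaviour Plug
  import Plug.Conn

  # init/1 runs once with the options given when the plug is mounted.
  @impl true
  def init(opts), do: opts

  # call/2 runs for every request and must return the connection.
  @impl true
  def call(conn, _opts) do
    conn
    |> put_resp_content_type("text/plain")
    |> send_resp(200, "Hello, world!")
  end
end
```

A plug is just a module with `init/1` and `call/2`; everything else in this tour is about how a request reaches `call/2` and how the response leaves it.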
However, Plug by itself isn't capable of handling any HTTP requests, and this is precisely where Cowboy comes in.
Cowboy
Cowboy is the actual web server, written in Erlang, that parses incoming requests and sends outgoing responses. Cowboy works in tandem with Ranch, which we give a brief overview of below.
Plug.Cowboy
Plug can use a multitude of different web servers, via different adapters. Plug.Cowboy is a very slim Plug Adapter that specifically "glues" Cowboy with Plug
Ranch
Cowboy understands how to handle the HTTP protocol, but it does not know how to accept incoming socket connections or manage the TCP protocol; for this it depends on Ranch.
Ranch is a socket acceptor pool toolkit for TCP protocols, written in Erlang, that is widely used by web servers in both Elixir and Erlang.
Receiving a request
In this section you'll get a better understanding of how the lifecycle of a request is managed. To illustrate the flow, we will assume that our client sends a simple GET request with no additional headers or body (e.g. `GET http://localhost:4000`) to our Example Application.
Before we begin, it is important to know that Ranch, the underlying socket acceptor used by Cowboy and subsequently Plug.Cowboy, has two main moving pieces:
* Ranch acceptors - A fleet of processes that actively accept connections from external clients
* Ranch connections - After an acceptor accepts a new connection, it delegates to this process to actually begin processing the incoming request
Everything begins with the fleet of acceptors that was initialised by the Ranch Acceptors Supervisor. This fleet works in the "Transport" layer (TCP in our case, which translates into the `ranch_tcp` module being used), waiting indefinitely for incoming connections to accept. Most socket operations are handled by `ranch_tcp`, which is for the most part a simple wrapper around OTP's `gen_tcp`, the native interface to TCP/IP sockets in Erlang.
As soon as a connection is accepted, a Connection Socket is returned and passed to ranch_tcp:controlling_process/2, which, according to the documentation (https://erlang.org/doc/man/gen_tcp.html#controlling_process-2), assigns a new controlling process to the Socket (the connections supervisor process in this case).
```erlang
loop(LSocket, Transport, Logger, ConnsSup) ->
    %% ranch_tcp:accept/2 waits indefinitely for an incoming connection
    %% to be established on the provided socket.
    _ = case Transport:accept(LSocket, infinity) of
        {ok, CSocket} ->
            case Transport:controlling_process(CSocket, ConnsSup) of
                ok ->
                    %% This call will not return until process has been started
                    %% AND we are below the maximum number of connections.
                    ranch_conns_sup:start_protocol(ConnsSup, CSocket);
                {error, _} ->
                    Transport:close(CSocket)
            end;
```
As soon as a connection gets accepted, we bind it to the controlling process. This process will receive socket events, such as when data is pushed through or a disconnect happens. The branch finishes by invoking ranch_conns_sup:start_protocol/2, which we'll now dig deeper into.
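The accept-and-hand-over step can be sketched in Elixir using `:gen_tcp` directly. This is a minimal illustration rather than Ranch's code: the module name `AcceptorSketch`, the `{:new_connection, socket}` message and the socket options are assumptions made for the example.

```elixir
defmodule AcceptorSketch do
  # Hypothetical module: one acceptor loop built directly on :gen_tcp.

  # Opens a passive listening socket (port 0 lets the OS pick a free port).
  def listen(port) do
    :gen_tcp.listen(port, [:binary, active: false, reuseaddr: true])
  end

  # Waits for a client, makes `owner` the controlling process of the
  # accepted socket, notifies it, and goes back to accepting.
  def loop(listen_socket, owner) do
    case :gen_tcp.accept(listen_socket, :infinity) do
      {:ok, socket} ->
        case :gen_tcp.controlling_process(socket, owner) do
          :ok -> send(owner, {:new_connection, socket})
          {:error, _reason} -> :gen_tcp.close(socket)
        end

        loop(listen_socket, owner)

      {:error, _reason} ->
        # The listening socket was closed; stop accepting.
        :ok
    end
  end
end
```

Only the current owner of a socket may call `controlling_process/2`, which is why the hand-over happens inside the acceptor that just accepted the connection.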
ranch_conns_sup:start_protocol/2
We begin by sending a message (start_protocol) to the supervised connections supervisor process
```erlang
start_protocol(SupPid, Socket) ->
    SupPid ! {?MODULE, start_protocol, self(), Socket},
    %% Receiving a response back unblocks this function, which can
    %% finally return to the acceptor.
    receive SupPid -> ok end.
```
The Conns Supervisor process, on the other hand, will be waiting for any incoming messages sent to it and, as illustrated below, we're interested in the start_protocol message sent by the code snippet above.
```erlang
receive
    {?MODULE, start_protocol, To, Socket} ->
        %% The cowboy_clear (Protocol) process gets initialised and
        %% returns its Pid, which we will use below.
        try Protocol:start_link(Ref, Socket, Transport, Opts) of
            {ok, Pid} ->
                %% A handshake routine between the newly initialised
                %% cowboy_clear process and Ranch begins to take place.
                handshake(State, CurConns, NbChildren, Sleepers, To, Socket, Pid, Pid);
```
```erlang
handshake(State=#state{ref=Ref, transport=Transport, handshake_timeout=HandshakeTimeout,
        max_conns=MaxConns}, CurConns, NbChildren, Sleepers, To, Socket, SupPid, ProtocolPid) ->
    %% The protocol process (cowboy_clear) is now assigned to receive
    %% messages from the provided socket.
    case Transport:controlling_process(Socket, ProtocolPid) of
        ok ->
            %% Emit the handshake message, with the Socket, so that
            %% cowboy_clear becomes aware of it.
            ProtocolPid ! {handshake, Ref, Transport, Socket, HandshakeTimeout},
            put(SupPid, active),
            CurConns2 = CurConns + 1,
            %% If we're still below the maximum number of connections,
            %% we can process the request straight away.
            if CurConns2 < MaxConns ->
                    %% Reply to the acceptor that sent the start_protocol
                    %% message (To), flagging that the connection is now
                    %% being handled. This unblocks the acceptor so it can
                    %% accept further connections.
                    To ! self(),
                    loop(State, CurConns2, NbChildren + 1, Sleepers);
```
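The blocking rendezvous between an acceptor and the connections supervisor can be reduced to a few lines of Elixir. This is a sketch under assumptions: the module `RendezvousSketch` and the `{:count, pid}` message are invented for the example, and connection limits are left out entirely.

```elixir
defmodule RendezvousSketch do
  # Hypothetical names; only the message shapes mirror the Erlang above.

  # Acceptor side: send the request, then block until the supervisor
  # replies with its own pid.
  def start_protocol(sup_pid, socket) do
    send(sup_pid, {__MODULE__, :start_protocol, self(), socket})

    receive do
      ^sup_pid -> :ok
    end
  end

  # Supervisor side: acknowledge each request, which unblocks the caller.
  def supervisor_loop(handled \\ 0) do
    receive do
      {__MODULE__, :start_protocol, to, _socket} ->
        send(to, self())
        supervisor_loop(handled + 1)

      {:count, to} ->
        send(to, {:count, handled})
        supervisor_loop(handled)
    end
  end
end
```

Because the acceptor waits for the reply, the supervisor can delay it to apply back-pressure, which is how the real code stops accepting once the connection limit is reached.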
The cowboy_clear:start_link function spawns a process of its own, which runs the connection_process routine.
The connection_process/4 function synchronises with Ranch's conns_sup through the ranch:handshake/1 function and delegates the processing of the request to the selected protocol (cowboy_http in our case).
```erlang
connection_process(Parent, Ref, Transport, Opts) ->
    ProxyInfo = case maps:get(proxy_header, Opts, false) of
        true ->
            {ok, ProxyInfo0} = ranch:recv_proxy_header(Ref, 1000),
            ProxyInfo0;
        false ->
            undefined
    end,
    %% A handshake with Ranch is initialised, which waits for the
    %% handshake message from conns_sup. This is how our protocol
    %% gets access to the connected socket.
    {ok, Socket} = ranch:handshake(Ref),
    %% Use cowboy_http2 directly only when 'http' is missing.
    %% Otherwise switch to cowboy_http2 from cowboy_http.
    %%
    %% @todo Extend this option to cowboy_tls and allow disabling
    %% the switch to cowboy_http2 in cowboy_http. Also document it.
    Protocol = case maps:get(protocols, Opts, [http2, http]) of
        [http2] -> cowboy_http2;
        [_|_] -> cowboy_http
    end,
    %% Calls the local init/7, which hands over to the protocol
    %% module's own init function (cowboy_http:init/6, shown in the
    %% next section).
    init(Parent, Ref, Socket, Transport, ProxyInfo, Opts, Protocol).
```
The relevant part of Ranch's handshake can be seen below, where we can see the format of the expected message (which is emitted by our connections supervision process handshake routine).
```erlang
handshake(Ref, Opts) ->
    receive {handshake, Ref, Transport, CSocket, HandshakeTimeout} ->
        %% In our case the transport handshake returns the provided
        %% CSocket back to us (things would be different if we were
        %% exploring the SSL transport rather than TCP).
        case Transport:handshake(CSocket, Opts, HandshakeTimeout) of
            OK = {ok, _} ->
                OK;
```
Let's dive deeper into what happens when the Protocol, which in our case is Cowboy HTTP, gets initialised.
Cowboy HTTP
With the connection established and acknowledged, we begin looking at how the request itself gets processed. This began in none other than the Cowboy Clear module we explored above, which, after going through a handshake routine with Ranch and receiving the connection socket in return, finally delegates the work of processing that same Socket to the chosen protocol module, which in our case is Cowboy HTTP. Given that we're exploring the simplest flow possible, this code becomes a lot simpler, since we can ignore most of the SSL logic in it.
```erlang
init(Parent, Ref, Socket, Transport, ProxyHeader, Opts) ->
    Peer0 = Transport:peername(Socket),
    Sock0 = Transport:sockname(Socket),
    %% Evaluates to {ok, undefined}, as we're not handling SSL in this code tour.
    Cert1 = case Transport:name() of
        ssl ->
            case ssl:peercert(Socket) of
                {error, no_peercert} ->
                    {ok, undefined};
                Cert0 ->
                    Cert0
            end;
        _ ->
            {ok, undefined}
    end,
    case {Peer0, Sock0, Cert1} of
        {{ok, Peer}, {ok, Sock}, {ok, Cert}} ->
            State = #state{
                parent=Parent, ref=Ref, socket=Socket,
                transport=Transport, proxy_header=ProxyHeader, opts=Opts,
                peer=Peer, sock=Sock, cert=Cert,
                last_streamid=maps:get(max_keepalive, Opts, 1000)},
            setopts_active(State),
            %% The loop begins to process the request.
            loop(set_timeout(State, request_timeout));
```
The Loop function has a multitude of scenarios which we won't get to explore in this code tour and we will instead focus on the simplest scenario possible of parsing the incoming GET request
The State passed to parsing always gets initialised with #ps_request_line{empty_lines=0} which causes the function below to be our first match
From reading HTTP's RFC7230, we learn that the first request line always has the following format: request-line = method SP request-target SP HTTP-version CRLF
Therefore, our first step towards parsing the request is the method.
To do so, we sequentially process each character until we find the first space character.
We then proceed to the second part of the request line, the request-target or URI. For brevity, we'll only explore the HTTP case, since that is the one used in our example.
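The character-by-character approach can be illustrated with Elixir binary pattern matching. This is a deliberately simplified sketch, not Cowboy's parser: the module `RequestLineSketch` and its return values are assumptions, and it performs no validation, enforces no length limits and only recognises HTTP/1.0 and HTTP/1.1.

```elixir
defmodule RequestLineSketch do
  # Hypothetical parser for: method SP request-target SP HTTP-version CRLF
  # Returns {:ok, method, target, version, rest}, :more when the buffer
  # is incomplete, or :error for an unsupported version.

  def parse(buffer), do: parse_method(buffer, <<>>)

  # Accumulate method bytes until the first space.
  defp parse_method(<<" ", rest::binary>>, acc), do: parse_target(rest, acc, <<>>)
  defp parse_method(<<char, rest::binary>>, acc), do: parse_method(rest, <<acc::binary, char>>)
  defp parse_method(<<>>, _acc), do: :more

  # Accumulate request-target bytes until the next space.
  defp parse_target(<<" ", rest::binary>>, method, acc), do: parse_version(rest, method, acc)

  defp parse_target(<<char, rest::binary>>, method, acc),
    do: parse_target(rest, method, <<acc::binary, char>>)

  defp parse_target(<<>>, _method, _acc), do: :more

  # The version plus CRLF is exactly 10 bytes long.
  defp parse_version(<<"HTTP/1.1\r\n", rest::binary>>, method, target),
    do: {:ok, method, target, "HTTP/1.1", rest}

  defp parse_version(<<"HTTP/1.0\r\n", rest::binary>>, method, target),
    do: {:ok, method, target, "HTTP/1.0", rest}

  defp parse_version(buffer, _method, _target) when byte_size(buffer) < 10, do: :more
  defp parse_version(_buffer, _method, _target), do: :error
end
```

Calling `RequestLineSketch.parse("GET / HTTP/1.1\r\nHost: localhost:4000\r\n\r\n")` yields the method, target and version, and hands back the remaining bytes so that header parsing can continue from there.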
I won't continue further into the parsing, as it is pretty much the same pattern applied until we reach the end of the line. After we're done with the first request line we move on to the headers, which follow pretty much the same pattern as well. With the request now parsed, we can shift our attention to the function that receives the result of the parse_request/3 function: after_parse/1. This function simply delegates its work to the Cowboy Stream module, which we will explore next.
Cowboy Stream
The Cowboy Stream module can be summarised as taking a list of handlers and going through them one by one until the request has been fully processed.
These handlers can be configured but for the sake of simplicity we'll just explore the default option for now (cowboy_stream_h).
cowboy_stream_h
We're getting closer to finally reaching the edge of our example application, but before we do that we still have to go through a sequence of middleware, which get passed to a request_process function. We'll be using the defaults (cowboy_router and cowboy_handler) in this tour but these could alternatively also be configured.
```erlang
init(StreamID, Req=#{ref := Ref}, Opts) ->
    Env = maps:get(env, Opts, #{}),
    Middlewares = maps:get(middlewares, Opts, [cowboy_router, cowboy_handler]),
    Shutdown = maps:get(shutdown_timeout, Opts, 5000),
    %% A process is spawned to handle the request, with the
    %% Middlewares being passed to it.
    Pid = proc_lib:spawn_link(?MODULE, request_process, [Req, Env, Middlewares]),
    Expect = expect(Req),
    {Commands, Next} = cowboy_stream:init(StreamID, Req, Opts),
    {[{spawn, Pid, Shutdown}|Commands],
        #state{next=Next, ref=Ref, pid=Pid, expect=Expect}}.
```
Each one of the provided middleware gets executed in sequence, if we follow the default sequence we'd first handle cowboy_router and then cowboy_handler
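Running middleware in sequence amounts to threading a request and an environment through a list of modules. The sketch below assumes each middleware exports `execute/2` and returns `{:ok, req, env}` to continue or `{:stop, req}` to halt; the modules `MiddlewareSketch`, `RouterSketch` and `HandlerSketch` are stand-ins invented for the example, not Cowboy's own.

```elixir
defmodule MiddlewareSketch do
  # Hypothetical runner: calls each middleware in order, feeding the
  # request and environment returned by one into the next.
  def run(req, env, []), do: {:ok, req, env}

  def run(req, env, [middleware | rest]) do
    case middleware.execute(req, env) do
      {:ok, req2, env2} -> run(req2, env2, rest)
      {:stop, req2} -> {:stop, req2}
    end
  end
end

defmodule RouterSketch do
  # Stand-in for a router: records which handler should run.
  def execute(req, env), do: {:ok, req, Map.put(env, :handler, :hello)}
end

defmodule HandlerSketch do
  # Stand-in for a handler: uses the handler chosen by the router.
  def execute(req, %{handler: handler} = env) do
    {:ok, Map.put(req, :handled_by, handler), env}
  end
end
```

The order matters: the handler stand-in can only run after the router stand-in has placed `:handler` in the environment, which mirrors why the default sequence starts with the router.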
Let's then explore each Middleware in closer detail.
Cowboy Router
The whole point of Cowboy Router is quite simple: it "traverses" the dispatch rules (the MFA provided by Plug.Cowboy), finds the entry matching the request's host and path, and stores the result in a Request structure and an Environment structure.
```erlang
execute(Req=#{host := Host, path := Path}, Env=#{dispatch := Dispatch0}) ->
    %% The provided MFA will be similar to:
    %% [{'_',[],[{'_',[],'Elixir.Plug.Cowboy.Handler',{'Elixir.Example.HelloWorldPlug',[]}}]}]
    Dispatch = case Dispatch0 of
        {persistent_term, Key} -> persistent_term:get(Key);
        _ -> Dispatch0
    end,
    case match(Dispatch, Host, Path) of
        {ok, Handler, HandlerOpts, Bindings, HostInfo, PathInfo} ->
            {ok, Req#{
                host_info => HostInfo,
                path_info => PathInfo,
                bindings => Bindings
            }, Env#{
                handler => Handler,
                handler_opts => HandlerOpts
            }};
```
These values ultimately get passed to the next stage of the Middleware pipeline, which turns out to be Cowboy Handler.
Cowboy Handler
We're almost at our final destination! The Cowboy Handler simply takes the Request/Environment pair provided by the Cowboy Router and invokes the handler it contains. As it just so happens, the Handler contained within the Environment structure is none other than Plug.Cowboy.Handler, and its options contain the HelloWorldPlug that we've implemented.
```erlang
execute(Req, Env=#{handler := Handler, handler_opts := HandlerOpts}) ->
    %% Handler = Plug.Cowboy.Handler and HandlerOpts = {HelloWorldPlug, Opts}
    try Handler:init(Req, HandlerOpts) of
        {ok, Req2, State} ->
            Result = terminate(normal, Req2, State, Handler),
            {ok, Req2, Env#{result => Result}};
        {Mod, Req2, State} ->
            Mod:upgrade(Req2, Env, Handler, State);
        {Mod, Req2, State, Opts} ->
            Mod:upgrade(Req2, Env, Handler, State, Opts)
```
Plug.Cowboy.Handler
The handler will, in turn, call the plug we developed at the beginning of this tour to apply our own logic based on the request,
```elixir
# The Handler and HandlerOpts are provided by the Cowboy Handler
# that we've explored above.
def init(req, {plug, opts}) do
  conn = @connection.conn(req)

  try do
    %{adapter: {@connection, req}} =
      conn
      |> plug.call(opts)
      |> maybe_send(plug)

    {:ok, req, {plug, opts}}
```
and finally emit a response back to the client through maybe_send.
Sending a response
We've explored how the request gets processed, all the way from being emitted by the client to finally being handled by our own application logic. Let's now explore sending a response back to the client. It all begins with the maybe_send function mentioned above, which delegates the task of emitting the response to an "adapter", which in our case is Plug.Cowboy.Conn.
The actual action of sending the response back is in turn passed to cowboy_req:reply/4.
That reply function ends up calling the do_reply function to process the actual response which also defers it to cast/2
cast/2 is quite simple, all it does is send a message to the PID handling the request (that we now know is Cowboy Clear) with the contents of the response
Cowboy Clear process, in turn, will be waiting for a message with this exact format, to begin processing it
info/3 itself calls the commands function, which, as the name suggests, actions a series of commands, one of them being sending back a response to the client as we see below
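The response path described above boils down to one process sending a message and another turning it into bytes. The sketch below is an assumption-laden illustration: the module `ReplySketch`, the exact message shape `{{pid, stream_id}, {:response, status, headers, body}}` and the tiny reason-phrase table are chosen for the example and should not be read as Cowboy's actual internals.

```elixir
defmodule ReplySketch do
  # Request-process side: hand the response over to the connection process.
  def reply(%{pid: pid, streamid: stream_id}, status, headers, body) do
    send(pid, {{pid, stream_id}, {:response, status, headers, body}})
    :ok
  end

  # Connection-process side: wait for a response message and render it
  # as HTTP/1.1 bytes, passed to `write_fun` (a stand-in for the socket).
  def connection_loop(write_fun) do
    receive do
      {{pid, _stream_id}, {:response, status, headers, body}} when pid == self() ->
        header_lines = for {name, value} <- headers, do: [name, ": ", value, "\r\n"]
        status_line = ["HTTP/1.1 ", Integer.to_string(status), " ", reason(status), "\r\n"]
        write_fun.([status_line, header_lines, "\r\n", body])
        connection_loop(write_fun)
    end
  end

  # Illustrative subset of reason phrases.
  defp reason(200), do: "OK"
  defp reason(404), do: "Not Found"
  defp reason(_), do: ""
end
```

Keeping the socket in a single process means the request process never writes to it directly; it only describes the response and lets the connection process do the I/O.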
And that is it! We now understand how Plug.Cowboy both receives requests and sends back responses through a TCP connection.
Starting Plug.Cowboy
Below, we initialise all the options to start a Plug.Cowboy process under our application supervision tree, namely:
* scheme - can be either http or https (depending on whether you'd like to have TLS enabled)
* plug - Your plug, which will process incoming requests and return a response
* options - All options available to Plug.Cowboy (which will impact the underlying cowboy and ranch configuration). In this case we only make use of `port` to specify which port we want to open for listening to requests (4000), but a full list of all the options is available here: https://hexdocs.pm/plug_cowboy/Plug.Cowboy.html#module-options
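A sketch of what such a supervision tree entry typically looks like, assuming the `plug_cowboy` dependency and the `Example.HelloWorldPlug` module from earlier. The names `Example.Application` and `Example.Supervisor` are placeholders for your own application:

```elixir
defmodule Example.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # scheme, plug and options as described in the list above.
      {Plug.Cowboy, scheme: :http, plug: Example.HelloWorldPlug, options: [port: 4000]}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: Example.Supervisor)
  end
end
```

The `{Plug.Cowboy, opts}` tuple is what makes the supervisor ask Plug.Cowboy for a child spec when the application boots.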
If you take a closer look at our Example Application, the most common way of starting Plug.Cowboy is by specifying it under the supervision tree of your application with a series of options (such as the port number), which ends up invoking the code below.
This child_spec function is always invoked through the Supervisor.start_link/2 function as a means to initialise all applications under the supervision tree with the provided options that we've covered in the "Example App" section.
Plug.Cowboy.child_spec/2
There are two main aspects within the Plug.Cowboy.child_spec/2 that we need to understand: The first one is the Dispatcher, which, unless explicitly configured, is built from the provided plug (in this case our "HelloWorldPlug") and returns an MFA (Module, Function, Arguments) which in the end will result in our plug getting called to handle the incoming request.
The second one is the fact that Plug.Cowboy actually starts a Ranch process underneath, which we'll dive into next.
Ranch supervisor
Building on what was mentioned above, if we take a closer look at the second entry (named start) returned by the child spec, we notice the following format: {ranch_listener_sup, start_link, [Ref, Transport, TransOpts, Protocol, ProtoOpts]}
This is again an MFA, which will initialise the Ranch Listener process under our application's supervision tree; we will begin to explore it in the next section. If we look deeper into the options being passed to the Ranch Listener process we have:
* Ref - The listener name, in our case "Example.HelloWorldPlug.HTTP", built by Plug.Cowboy.build_ref/2, which combines the scheme (http) with the plug name (Example.HelloWorldPlug)
* Transport - Since we are using HTTP (and not https) the module used will be "ranch_tcp"
* TransportOpts0 - The options that will be provided to the ranch transport module
* Protocol - Provided through the "cowboy_protocol" variable seen above, it too is inferred from the scheme being used, which in this case will result in "cowboy_clear" being used
* ProtoOpts - Like TransportOpts0, these are the options that will be provided to the protocol module
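How the listener name and the transport/protocol pair follow from the scheme can be sketched as below. The module `ListenerArgsSketch` is hypothetical, and the way the reference is assembled (concatenating the plug module with the upper-cased scheme) is an assumption consistent with the "Example.HelloWorldPlug.HTTP" name mentioned above rather than a copy of Plug.Cowboy's source.

```elixir
defmodule ListenerArgsSketch do
  # Builds a listener name such as Example.HelloWorldPlug.HTTP.
  def build_ref(plug, scheme) do
    Module.concat(plug, scheme |> to_string() |> String.upcase())
  end

  # Plain HTTP runs over TCP; HTTPS runs over TLS.
  def transport_and_protocol(:http), do: {:ranch_tcp, :cowboy_clear}
  def transport_and_protocol(:https), do: {:ranch_ssl, :cowboy_tls}
end
```

Because the scheme is part of the reference, the same plug can be served over both HTTP and HTTPS under two distinct listener names.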
Afterwards, all these values are returned to Plug.Cowboy, which in turn passes them to the Application's supervisor to get initialised, resulting in ranch_listener_sup:start_link/5 getting called. Now, the magic begins to happen.
The start_link function begins by invoking the set_new_listener_opts action, which stores all our listener options (such as max connections) in an ETS table. These options are then retrieved at multiple stages of our handling of a request.
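Storing listener options in ETS so that any process can read them later can be sketched as follows. The module `ListenerOptsSketch`, the table name and the `{:max_conns, ref}` key layout are assumptions made for the illustration, not Ranch's actual schema.

```elixir
defmodule ListenerOptsSketch do
  @table :listener_opts_sketch

  # Creates a public named table so that any process can read and write it.
  def init do
    :ets.new(@table, [:named_table, :public, :set])
  end

  # Stores the connection limit for a given listener reference.
  def set_max_connections(ref, max_conns) do
    :ets.insert(@table, {{:max_conns, ref}, max_conns})
  end

  # Reads the limit back; raises if it was never stored.
  def get_max_connections(ref) do
    :ets.lookup_element(@table, {:max_conns, ref}, 2)
  end
end
```

Reading from ETS avoids sending a message to a central process on every lookup, which matters when many connection processes consult the same options.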
Afterwards, supervisor:start_link/2 is called with ranch_listener_sup itself as the callback module, which invokes its init/1 function and initialises the two most important processes of this whole tour:
* ranch_conns_sup - The connections supervisor
* ranch_acceptors_sup - The supervisor in charge of managing our whole fleet of acceptors, which accept any incoming connection
Ranch Acceptors Supervisor
The processes under this supervisor are responsible for starting the server on the correct address and accepting connections, which are then delegated to the Ranch Conns Supervisor.
The ranch_acceptors_sup:init function is tasked with both setting up a socket to begin listening on the configured port number (4000 in our case)
```erlang
LSocket = case maps:get(socket, TransOpts, undefined) of
    undefined ->
        SocketOpts = maps:get(socket_opts, TransOpts, []),
        %% We temporarily put the logger in the process dictionary
        %% so that it can be used from ranch:filter_options. The
        %% interface as it currently is does not allow passing it
        %% down otherwise.
        put(logger, Logger),
        %% ranch_tcp:listen/1 is used to initialise a new socket
        %% listening on the provided port.
        case Transport:listen(SocketOpts) of
            {ok, Socket} ->
                erase(logger),
                Socket;
            {error, Reason} ->
                listen_error(Ref, Transport, SocketOpts, Reason, Logger)
        end;
```
and initialising a "fleet" of acceptors under its supervision (100 in our case), each its own process, which will accept any incoming connection being established with our server in the configured port
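The idea of one listening socket shared by a fleet of acceptor processes can be sketched with `:gen_tcp` alone. This is an unsupervised illustration: the module `AcceptorPoolSketch`, the `on_accept` callback and the socket options are assumptions, and a real pool would run its acceptors under a supervisor as described above.

```elixir
defmodule AcceptorPoolSketch do
  # Opens one listening socket and starts `num_acceptors` processes
  # that all accept connections on it concurrently.
  def start(port, num_acceptors, on_accept) do
    {:ok, listen_socket} = :gen_tcp.listen(port, [:binary, active: false, reuseaddr: true])

    acceptors =
      for _ <- 1..num_acceptors do
        spawn(fn -> accept_loop(listen_socket, on_accept) end)
      end

    {:ok, listen_socket, acceptors}
  end

  # Each acceptor blocks in accept/1, reports the new socket through
  # the callback, and then waits for the next connection.
  defp accept_loop(listen_socket, on_accept) do
    case :gen_tcp.accept(listen_socket) do
      {:ok, socket} ->
        on_accept.(socket)
        accept_loop(listen_socket, on_accept)

      {:error, _reason} ->
        :ok
    end
  end
end
```

Having several processes blocked in `accept` at once lets the server pick up new connections even while one acceptor is busy handing a socket over.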
Ranch Conns Supervisor
The Ranch Conns Supervisor runs in an indefinite loop, listening for any incoming requests to start handling.
As we've seen, the Acceptor's task is to send a message to the Conns Supervisor requesting it to start the protocol process, which we see below.
```erlang
loop(State=#state{parent=Parent, ref=Ref, conn_type=ConnType,
        transport=Transport, protocol=Protocol, opts=Opts,
        max_conns=MaxConns, logger=Logger}, CurConns, NbChildren, Sleepers) ->
    %% Receive will wait for any incoming message.
    receive
        %% The "start_protocol" message sent by the acceptor is received.
        {?MODULE, start_protocol, To, Socket} ->
            %% Protocol is :cowboy_clear as per Plug.Cowboy.
            try Protocol:start_link(Ref, Socket, Transport, Opts) of
                {ok, Pid} ->
                    handshake(State, CurConns, NbChildren, Sleepers, To, Socket, Pid, Pid);
                {ok, SupPid, ProtocolPid} when ConnType =:= supervisor ->
                    handshake(State, CurConns, NbChildren, Sleepers, To, Socket, SupPid, ProtocolPid);
```
After this, our entire application is fired up and ready to begin accepting incoming requests from our clients.
References
* Plug - https://github.com/elixir-plug/plug
* Plug.Cowboy - https://github.com/elixir-plug/plug_cowboy
* Cowboy - https://github.com/ninenines/cowboy
* Ranch - https://github.com/ninenines/ranch
* gen_tcp - https://erlang.org/doc/man/gen_tcp.html
* HTTP's RFC7230 - https://tools.ietf.org/html/rfc7230#section-3.1.1