Managing Haskell wreq efficiently

December 10 2018

If you have ever needed to make HTTP requests from your Haskell code, chances are that you have used the wreq library. If it was a one-off job, or you only make a request or two every now and then, you might not have noticed that wreq needs managing once you are making a lot of network requests.

In my workplace deployments, backend services make lots of HTTP requests. Specifically, they make multiple HTTP requests to the same server, and there is a group of servers that they talk to. For example, we make requests to Google, AWS, Azure and Digital Ocean cloud services, and to each we make multiple requests. I have noticed that if you do not use an HTTP session manager when making network requests in this pattern with wreq, it:

tends to use up significant memory, probably due to keeping so many TCP connections open
tends to be slower, because it sets up and tears down an entire TCP connection for every request
can even lead to TCP socket leaks (read more)

Browsers and other popular HTTP clients usually handle this automatically by keeping TCP connections open and re-using them. But in wreq you have to be explicit about it.

Using wreq’s session manager

wreq has a Network.Wreq.Session module, which exposes an HTTP session manager. The API is straightforward, and is used like this:

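import           Data.ByteString.Lazy (ByteString)
import           Network.Wreq
import           Network.Wreq.Session (Session)
import qualified Network.Wreq.Session as Sess

main :: IO ()
main = do
  -- Create the session once and share it across all requests.
  sess  <- Sess.newSession
  resp  <- mkGetRequest sess
  resp2 <- mkAnotherRequest sess
  ...

-- Sess.get returns the whole response; its body is a lazy ByteString.
mkGetRequest :: Session -> IO (Response ByteString)
mkGetRequest sess = Sess.get sess "http://httpbin.org/get"

mkAnotherRequest :: Session -> IO (Response ByteString)
mkAnotherRequest sess = do
  Sess.get sess "http://httpbin.org/get"
  ...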

The wreq documentation recommends using a session if you’re making multiple requests to the same server, so that it can re-use TCP connections. But this advice is tucked away in the Session module, separate from the main modules, which makes it easy to overlook.

Also, newSession creates a manager that manages cookies as well. That is, any cookie set by a server is sent back on subsequent requests (as browsers do) when using the same manager. This is not really desirable in backend systems unless you’re dealing with a user session. Wreq exposes another function called newAPISession. It is used exactly like newSession, but it gives you just an HTTP manager that does no cookie handling.

import           Network.Wreq
import qualified Network.Wreq.Session as Sess

main = do
  sess <- Sess.newAPISession
  ...

Underneath, wreq uses the HTTP Manager from the http-client package for sessions. You can use the Manager directly from the http-client package as well.
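As a rough sketch of what using the Manager directly could look like, the following skips wreq entirely and talks to http-client. It assumes the http-client and http-client-tls packages are available; fetchWith is a hypothetical helper name made up for this example, not something either library provides.

```haskell
import qualified Data.ByteString.Lazy    as L
import           Network.HTTP.Client     (Manager, Response, httpLbs,
                                          newManager, parseRequest,
                                          responseStatus)
import           Network.HTTP.Client.TLS (tlsManagerSettings)

-- Hypothetical helper: perform a GET using a shared Manager.
fetchWith :: Manager -> String -> IO (Response L.ByteString)
fetchWith manager url = do
  request <- parseRequest url   -- throws if the URL cannot be parsed
  httpLbs request manager       -- connections are pooled by the Manager

main :: IO ()
main = do
  -- Create the Manager once; every request below shares its connection pool.
  manager <- newManager tlsManagerSettings
  resp1 <- fetchWith manager "http://httpbin.org/get"
  resp2 <- fetchWith manager "http://httpbin.org/get"
  print (responseStatus resp1)
  print (responseStatus resp2)
```

The important part is the same as with wreq sessions: the Manager is created once and passed to every request, rather than being created per request.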

Tidying things up

Finally, you would probably not want to pass the Session explicitly as an argument to every function. Instead, put a Reader monad constraint on your functions and make the HTTP session manager part of your environment. Something like:

{-# LANGUAGE FlexibleContexts #-}
import           Control.Monad.IO.Class (MonadIO, liftIO)
import           Control.Monad.Reader
import           Data.ByteString.Lazy   (ByteString)
import           Network.Wreq           (Response)
import           Network.Wreq.Session   (Session)
import qualified Network.Wreq.Session   as Sess

type App r a = ReaderT r IO a

main :: IO ()
main = do
  sess <- Sess.newAPISession
  -- The session is supplied once here; the functions below read it via ask.
  res <- flip runReaderT sess $ do
    resp  <- mkGetRequest
    resp2 <- mkAnotherRequest
    ...
  print res

mkGetRequest :: (MonadReader Session m, MonadIO m) => m (Response ByteString)
mkGetRequest = do
  sess <- ask
  liftIO $ Sess.get sess "http://httpbin.org/get"

mkAnotherRequest :: (MonadReader Session m, MonadIO m) => m (Response ByteString)
mkAnotherRequest = do
  sess <- ask
  liftIO $ Sess.get sess "http://httpbin.org/get"
  ...
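In a real application the environment usually carries more than the session. Here is a minimal sketch of that idea, assuming a record that holds the session next to some configuration. Env, envSession, envBaseUrl, joinUrl and getPath are all hypothetical names invented for this example.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import           Control.Monad.IO.Class (MonadIO, liftIO)
import           Control.Monad.Reader   (MonadReader, asks, runReaderT)
import           Data.ByteString.Lazy   (ByteString)
import           Network.Wreq           (Response)
import           Network.Wreq.Session   (Session)
import qualified Network.Wreq.Session   as Sess

-- Hypothetical environment: the shared session plus one piece of config.
data Env = Env
  { envSession :: Session
  , envBaseUrl :: String
  }

-- Pure helper that builds the full URL from a base URL and a path.
joinUrl :: String -> String -> String
joinUrl base path = base ++ path

-- Read the session and base URL from the environment, then make the request.
getPath :: (MonadReader Env m, MonadIO m) => String -> m (Response ByteString)
getPath path = do
  sess <- asks envSession
  base <- asks envBaseUrl
  liftIO $ Sess.get sess (joinUrl base path)

main :: IO ()
main = do
  sess <- Sess.newAPISession
  let env = Env { envSession = sess, envBaseUrl = "http://httpbin.org" }
  resp <- runReaderT (getPath "/get") env
  print resp
```

With this shape, adding a logger or credentials later means adding a field to Env rather than changing the signature of every function that makes a request.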

Wreq also has another problem. It throws an exception when the server responds with a non-2xx status. It also throws exceptions if the network connection fails. In production code we need to handle this behaviour of wreq as well to make it safer. But I’ll probably discuss that in another post.
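Until then, here is a minimal sketch of one way to keep those exceptions from escaping, by wrapping the call in try from Control.Exception. It assumes the failures surface as http-client's HttpException; safeGet is a hypothetical helper name, not part of wreq.

```haskell
import           Control.Exception    (try)
import           Control.Lens         ((^.))
import           Data.ByteString.Lazy (ByteString)
import           Network.HTTP.Client  (HttpException)
import           Network.Wreq         (Response, responseStatus, statusCode)
import           Network.Wreq.Session (Session)
import qualified Network.Wreq.Session as Sess

-- Hypothetical helper: turn a thrown HttpException into a Left value.
safeGet :: Session -> String -> IO (Either HttpException (Response ByteString))
safeGet sess url = try (Sess.get sess url)

main :: IO ()
main = do
  sess   <- Sess.newAPISession
  result <- safeGet sess "http://httpbin.org/status/404"
  case result of
    Left err   -> putStrLn ("request failed: " ++ show err)
    Right resp -> print (resp ^. responseStatus . statusCode)
```

This only catches HttpException, so any other kind of exception still propagates to the caller.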

rayanon