  1. [/
  2. Copyright (c) 2016-2019 Vinnie Falco (vinnie dot falco at gmail dot com)
  3. Distributed under the Boost Software License, Version 1.0. (See accompanying
  4. file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
  5. Official repository: https://github.com/boostorg/beast
  6. ]
  7. [section HTTP Comparison to Other Libraries]
There are a few published C++ libraries which implement some of the HTTP
protocol. We analyze the message model chosen by those libraries and discuss
the advantages and disadvantages relative to Beast.
  11. The general strategy used by the author to evaluate external libraries is
  12. as follows:
  13. * Review the message model. Can it represent a complete request or
  14. response? What level of allocator support is present? How much
  15. customization is possible?
  16. * Review the stream abstraction. This is the type of object, such as
  17. a socket, which may be used to parse or serialize (i.e. read and write).
Can user defined types be specified? What's the level of conformance
to Asio or Networking-TS concepts?
  20. * Check treatment of buffers. Does the library manage the buffers
  21. or can users provide their own buffers?
  22. * How does the library handle corner cases such as trailers,
  23. Expect: 100-continue, or deferred commitment of the body type?
  24. [note
Declaration examples from external libraries have been edited:
portions have been removed for simplification.
  27. ]
  28. [heading cpp-netlib]
  29. [@https://github.com/cpp-netlib/cpp-netlib/tree/092cd570fb179d029d1865aade9f25aae90d97b9 [*cpp-netlib]]
  30. is a network programming library previously intended for Boost but not
  31. having gone through formal review. As of this writing it still uses the
  32. Boost name, namespace, and directory structure although the project states
  33. that Boost acceptance is no longer a goal. The library is based on Boost.Asio
  34. and bills itself as ['"a collection of network related routines/implementations
  35. geared towards providing a robust cross-platform networking library"]. It
cites ['"Common Message Type"] as a feature. As of the branch previously
linked, it uses these declarations:
  38. ```
  39. template <class Tag>
  40. struct basic_message {
  41. public:
  42. typedef Tag tag;
  43. typedef typename headers_container<Tag>::type headers_container_type;
  44. typedef typename headers_container_type::value_type header_type;
  45. typedef typename string<Tag>::type string_type;
  46. headers_container_type& headers() { return headers_; }
  47. headers_container_type const& headers() const { return headers_; }
  48. string_type& body() { return body_; }
  49. string_type const& body() const { return body_; }
  50. string_type& source() { return source_; }
  51. string_type const& source() const { return source_; }
  52. string_type& destination() { return destination_; }
  53. string_type const& destination() const { return destination_; }
  54. private:
  55. friend struct detail::directive_base<Tag>;
  56. friend struct detail::wrapper_base<Tag, basic_message<Tag> >;
  57. mutable headers_container_type headers_;
  58. mutable string_type body_;
  59. mutable string_type source_;
  60. mutable string_type destination_;
  61. };
  62. ```
  63. This container is the base class template used to represent HTTP messages.
It uses "tag" type style specializations for a variety of trait classes,
  65. allowing for customization of the various parts of the message. For example,
  66. a user specializes `headers_container<T>` to determine what container type
  67. holds the header fields. We note some problems with the container declaration:
  68. * The header and body containers may only be default-constructed.
  69. * No stateful allocator support.
  70. * There is no way to defer the commitment of the type for `body_` to
  71. after the headers are read in.
  72. * The message model includes a "source" and "destination." This is
  73. extraneous metadata associated with the connection which is not part
  74. of the HTTP protocol specification and belongs elsewhere.
  75. * The use of `string_type` (a customization point) for source,
  76. destination, and body suggests that `string_type` models a
  77. [*ForwardRange] whose `value_type` is `char`. This representation
  78. is less than ideal, considering that the library is built on
  79. Boost.Asio. Adapting a __DynamicBuffer__ to the required forward
  80. range destroys information conveyed by the __ConstBufferSequence__
  81. and __MutableBufferSequence__ used in dynamic buffers. The consequence
  82. is that cpp-netlib implementations will be less efficient than an
  83. equivalent __NetTS__ conforming implementation.
  84. * The library uses specializations of `string<Tag>` to change the type
  85. of string used everywhere, including the body, field name and value
  86. pairs, and extraneous metadata such as source and destination. The
  87. user may only choose a single type: field name, field values, and
  88. the body container will all use the same string type. This limits
  89. utility of the customization point. The library's use of the string
  90. trait is limited to selecting between `std::string` and `std::wstring`.
  91. We do not find this use-case compelling given the limitations.
  92. * The specialized trait classes generate a proliferation of small
  93. additional framework types. To specialize traits, users need to exit
  94. their namespace and intrude into the `boost::network::http` namespace.
The way the traits are used in the library limits the usefulness
of the traits to trivial purposes.
* The `string<Tag>` customization point constrains user defined body types
  98. to few possible strategies. There is no way to represent an HTTP message
  99. body as a filename with accompanying algorithms to store or retrieve data
  100. from the file system.
  101. The design of the message container in this library is cumbersome
  102. with its system of customization using trait specializations. The
  103. use of these customizations is extremely limited due to the way they
  104. are used in the container declaration, making the design overly
  105. complex without corresponding benefit.
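To make two of the points above concrete (construction of the body with arguments, and deferring the choice of body type until the headers are known), here is a minimal sketch. All names in it (`header`, `message`, `string_body`, `vector_body`, `wants_binary_body`) are hypothetical and chosen for illustration; they are not taken from cpp-netlib or from any other library discussed here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; not the API of any library discussed here.

// The header is a complete type on its own, so it can be parsed
// before anything is known about the body.
struct header
{
    std::map<std::string, std::string> fields;
};

// A body is a policy type which names its container through value_type.
struct string_body { using value_type = std::string; };
struct vector_body { using value_type = std::vector<unsigned char>; };

template<class Body>
struct message : header
{
    typename Body::value_type body;

    // Adopt an already-parsed header and forward the remaining
    // arguments to the constructor of the body container.
    template<class... Args>
    explicit message(header h, Args&&... args)
        : header(std::move(h))
        , body(std::forward<Args>(args)...)
    {
    }
};

// The decision about the body type is made from the header fields alone.
inline bool wants_binary_body(header const& h)
{
    auto const it = h.fields.find("Content-Type");
    return it != h.fields.end() &&
        it->second == "application/octet-stream";
}

// Usage: inspect the header first, then commit to a body type and
// construct the body with arguments (here: 4 bytes of value 0x2a).
inline std::size_t demo_deferred_body()
{
    header h;
    h.fields["Content-Type"] = "application/octet-stream";
    assert(wants_binary_body(h));
    message<vector_body> m(
        std::move(h), 4, static_cast<unsigned char>(0x2a));
    return m.body.size();
}
```

Because the header is a separate base class, the caller can read it, look at the fields, and only then pick `message<vector_body>` or `message<string_body>`. A container whose members may only be default-constructed cannot express either step.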
  106. [heading Boost.HTTP]
  107. [@https://github.com/BoostGSoC14/boost.http/tree/45fc1aa828a9e3810b8d87e669b7f60ec100bff4 [*boost.http]]
  108. is a library resulting from the 2014 Google Summer of Code. It was submitted
  109. for a Boost formal review and rejected in 2015. It is based on Boost.Asio,
  110. and development on the library has continued to the present. As of the branch
  111. previously linked, it uses these message declarations:
  112. ```
  113. template<class Headers, class Body>
  114. struct basic_message
  115. {
  116. typedef Headers headers_type;
  117. typedef Body body_type;
  118. headers_type &headers();
  119. const headers_type &headers() const;
  120. body_type &body();
  121. const body_type &body() const;
  122. headers_type &trailers();
  123. const headers_type &trailers() const;
  124. private:
  125. headers_type headers_;
  126. body_type body_;
  127. headers_type trailers_;
  128. };
  129. typedef basic_message<boost::http::headers, std::vector<std::uint8_t>> message;
  130. template<class Headers, class Body>
  131. struct is_message<basic_message<Headers, Body>>: public std::true_type {};
  132. ```
  133. * This container cannot model a complete message. The ['start-line] items
  134. (method and target for requests, reason-phrase for responses) are
  135. communicated out of band, as is the ['http-version]. A function that
  136. operates on the message including the start line requires additional
  137. parameters. This is evident in one of the
  138. [@https://github.com/BoostGSoC14/boost.http/blob/45fc1aa828a9e3810b8d87e669b7f60ec100bff4/example/basic_router.cpp#L81 example programs].
  139. The `500` and `"OK"` arguments represent the response ['status-code] and
  140. ['reason-phrase] respectively:
  141. ```
  142. ...
  143. http::message reply;
  144. ...
  145. self->socket.async_write_response(500, string_ref("OK"), reply, yield);
  146. ```
  147. * `headers_`, `body_`, and `trailers_` may only be default-constructed,
  148. since there are no explicitly declared constructors.
  149. * There is no way to defer the commitment of the [*Body] type to after
  150. the headers are read in. This is related to the previous limitation
  151. on default-construction.
  152. * No stateful allocator support. This follows from the previous limitation
  153. on default-construction. Buffers for start-line strings must be
  154. managed externally from the message object since they are not members.
  155. * The trailers are stored in a separate object. Aside from the combinatorial
  156. explosion of the number of additional constructors necessary to fully
  157. support arbitrary forwarded parameter lists for each of the headers, body,
  158. and trailers members, the requirement to know in advance whether a
  159. particular HTTP field will be located in the headers or the trailers
  160. poses an unnecessary complication for general purpose functions that
  161. operate on messages.
  162. * The declarations imply that `std::vector` is a model of [*Body].
  163. More formally, that a body is represented by the [*ForwardRange]
  164. concept whose `value_type` is an 8-bit integer. This representation
  165. is less than ideal, considering that the library is built on
  166. Boost.Asio. Adapting a __DynamicBuffer__ to the required forward range
  167. destroys information conveyed by the __ConstBufferSequence__ and
  168. __MutableBufferSequence__ used in dynamic buffers. The consequence is
  169. that Boost.HTTP implementations will be less efficient when dealing
  170. with body containers than an equivalent __NetTS__ conforming
  171. implementation.
  172. * The [*Body] customization point constrains user defined types to
  173. very limited implementation strategies. For example, there is no way
  174. to represent an HTTP message body as a filename with accompanying
  175. algorithms to store or retrieve data from the file system.
  176. This representation addresses a narrow range of use cases. It has
  177. limited potential for customization and performance. It is more difficult
  178. to use because it excludes the start line fields from the model.
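For contrast, the following is a minimal sketch of a container which does model a complete response, so that a function operating on it needs no out-of-band parameters. The `response` type and the `to_string` function are hypothetical names used only for illustration, and the integer encoding of the version is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical type for illustration only.
struct response
{
    unsigned version = 11;      // assumed encoding: 11 means HTTP/1.1
    unsigned status = 200;      // status-code
    std::string reason = "OK";  // reason-phrase
    std::multimap<std::string, std::string> fields;
    std::string body;
};

// Everything needed to produce the wire format lives in the message,
// so the signature takes the message and nothing else.
inline std::string to_string(response const& res)
{
    std::string out =
        "HTTP/" + std::to_string(res.version / 10) +
        "." + std::to_string(res.version % 10) +
        " " + std::to_string(res.status) +
        " " + res.reason + "\r\n";
    for(auto const& f : res.fields)
        out += f.first + ": " + f.second + "\r\n";
    out += "\r\n";
    out += res.body;
    return out;
}

// Usage: the status-code and reason-phrase travel with the message.
inline std::string demo_complete_response()
{
    response res;
    res.status = 500;
    res.reason = "Internal Server Error";
    res.fields.emplace("Content-Length", "2");
    res.body = "no";
    return to_string(res);
}
```

With the start line inside the container, a generic function such as a logger or a router can accept the message alone, and the status-code cannot drift out of sync with a separately passed argument.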
  179. [heading C++ REST SDK (cpprestsdk)]
  180. [@https://github.com/Microsoft/cpprestsdk/tree/381f5aa92d0dfb59e37c0c47b4d3771d8024e09a [*cpprestsdk]]
  181. is a Microsoft project which ['"...aims to help C++ developers connect to and
  182. interact with services"]. It offers the most functionality of the libraries
reviewed here, including support for WebSocket services using its websocket++
  184. dependency. It can use native APIs such as HTTP.SYS when building Windows
  185. based applications, and it can use Boost.Asio. The WebSocket module uses
  186. Boost.Asio exclusively.
  187. As cpprestsdk is developed by a large corporation, it contains quite a bit
  188. of functionality and necessarily has more interfaces. We will break down
  189. the interfaces used to model messages into more manageable pieces. This
  190. is the container used to store the HTTP header fields:
  191. ```
  192. class http_headers
  193. {
  194. public:
  195. ...
  196. private:
  197. std::map<utility::string_t, utility::string_t, _case_insensitive_cmp> m_headers;
  198. };
  199. ```
  200. This declaration is quite bare-bones. We note the typical problems of
  201. most field containers:
  202. * The container may only be default-constructed.
  203. * No support for allocators, stateful or otherwise.
  204. * There are no customization points at all.
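As a rough illustration of what allocator support in a field container could look like, here is a minimal sketch. The names `basic_fields`, `iless`, and `counting_allocator` are hypothetical and are not part of cpprestsdk; the sketch only passes the allocator to the map nodes, whereas a fuller design would presumably also propagate it to the strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Case-insensitive ordering for field names (hypothetical helper).
struct iless
{
    bool operator()(std::string const& a, std::string const& b) const
    {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y)
            { return std::tolower(x) < std::tolower(y); });
    }
};

// A stateful allocator: it carries a pointer to a counter which
// records how many allocations were made through it.
template<class T>
struct counting_allocator
{
    using value_type = T;
    std::size_t* count;

    explicit counting_allocator(std::size_t* c) noexcept : count(c) {}

    template<class U>
    counting_allocator(counting_allocator<U> const& other) noexcept
        : count(other.count) {}

    T* allocate(std::size_t n)
    {
        ++*count;
        return std::allocator<T>().allocate(n);
    }

    void deallocate(T* p, std::size_t n) noexcept
    {
        std::allocator<T>().deallocate(p, n);
    }
};

template<class T, class U>
bool operator==(counting_allocator<T> const& a,
    counting_allocator<U> const& b) noexcept
{ return a.count == b.count; }

template<class T, class U>
bool operator!=(counting_allocator<T> const& a,
    counting_allocator<U> const& b) noexcept
{ return a.count != b.count; }

// Field container which accepts an allocator instance at construction.
template<class Allocator =
    std::allocator<std::pair<std::string const, std::string>>>
class basic_fields
{
public:
    basic_fields() = default;
    explicit basic_fields(Allocator const& alloc)
        : map_(iless{}, alloc) {}

    void insert(std::string name, std::string value)
    { map_.emplace(std::move(name), std::move(value)); }

    std::size_t count(std::string const& name) const
    { return map_.count(name); }

private:
    std::multimap<std::string, std::string, iless, Allocator> map_;
};

// Usage: returns the number of allocations seen by the allocator.
inline std::size_t demo_fields()
{
    std::size_t allocations = 0;
    using alloc_type =
        counting_allocator<std::pair<std::string const, std::string>>;
    basic_fields<alloc_type> f{alloc_type{&allocations}};
    f.insert("Content-Type", "text/plain");
    f.insert("content-type", "text/html");
    assert(f.count("CONTENT-TYPE") == 2);
    return allocations;
}
```

The exact allocation count depends on the standard library implementation, but each inserted field needs at least one node, so the counter reaches at least two here.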
  205. Now we analyze the structure of
  206. the larger message container. The library uses a handle/body idiom. There
  207. are two public message container interfaces, one for requests (`http_request`)
  208. and one for responses (`http_response`). Each interface maintains a private
  209. shared pointer to an implementation class. Public member function calls
  210. are routed to the internal implementation. This is the first implementation
  211. class, which forms the base class for both the request and response
  212. implementations:
  213. ```
  214. namespace details {
  215. class http_msg_base
  216. {
  217. public:
  218. http_headers &headers() { return m_headers; }
  219. _ASYNCRTIMP void set_body(const concurrency::streams::istream &instream, const utf8string &contentType);
  220. /// Set the stream through which the message body could be read
  221. void set_instream(const concurrency::streams::istream &instream) { m_inStream = instream; }
  222. /// Set the stream through which the message body could be written
  223. void set_outstream(const concurrency::streams::ostream &outstream, bool is_default) { m_outStream = outstream; m_default_outstream = is_default; }
  224. const pplx::task_completion_event<utility::size64_t> & _get_data_available() const { return m_data_available; }
  225. protected:
  226. /// Stream to read the message body.
  227. concurrency::streams::istream m_inStream;
  228. /// stream to write the msg body
  229. concurrency::streams::ostream m_outStream;
  230. http_headers m_headers;
  231. bool m_default_outstream;
  232. /// <summary> The TCE is used to signal the availability of the message body. </summary>
  233. pplx::task_completion_event<utility::size64_t> m_data_available;
  234. };
  235. ```
  236. To understand these declarations we need to first understand that cpprestsdk
  237. uses the asynchronous model defined by Microsoft's
  238. [@https://msdn.microsoft.com/en-us/library/dd504870.aspx [*Concurrency Runtime]].
  239. Identifiers from the [@https://msdn.microsoft.com/en-us/library/jj987780.aspx [*`pplx` namespace]]
  240. define common asynchronous patterns such as tasks and events. The
  241. `concurrency::streams::istream` parameter and `m_data_available` data member
indicate a lack of separation of concerns. The representation of HTTP messages
  243. should not be conflated with the asynchronous model used to serialize or
  244. parse those messages in the message declarations.
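One way to picture the separation argued for here is a message type which is plain data, with serialization supplied by a free function that is generic over the stream. The sketch below assumes a hypothetical stream requirement (a `write_some` member taking a pointer and a size, writing at least one byte and returning the number of bytes written); `request`, `write`, and `short_write_sink` are illustrative names and not part of any library discussed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// The message is plain data. It knows nothing about streams,
// tasks, events, or cancellation.
struct request
{
    std::string method;
    std::string target;
    std::string body;
};

// Serialization is a separate algorithm, generic over the stream.
// Assumed requirement on Stream (hypothetical):
//   std::size_t write_some(char const*, std::size_t)
// which writes at least one byte and returns the count written.
template<class Stream>
std::size_t write(Stream& stream, request const& req)
{
    std::string const wire =
        req.method + " " + req.target + " HTTP/1.1\r\n" +
        "Content-Length: " + std::to_string(req.body.size()) +
        "\r\n\r\n" + req.body;
    std::size_t written = 0;
    while(written < wire.size())
        written += stream.write_some(
            wire.data() + written, wire.size() - written);
    return written;
}

// A user defined stream which accepts at most 8 bytes per call,
// to exercise the loop above.
struct short_write_sink
{
    std::string received;

    std::size_t write_some(char const* p, std::size_t n)
    {
        std::size_t const k = std::min<std::size_t>(n, 8);
        received.append(p, k);
        return k;
    }
};

// Usage: the same request could be written to any conforming stream.
inline std::string demo_write()
{
    request req{"POST", "/upload", "hello"};
    short_write_sink sink;
    std::size_t const n = write(sink, req);
    assert(n == sink.received.size());
    return sink.received;
}
```

Since `request` holds no stream or event members, swapping the synchronous loop for an asynchronous one would change only the algorithm, not the message declaration.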
  245. The next declaration forms the complete implementation class referenced by the
  246. handle in the public interface (which follows after):
  247. ```
  248. /// Internal representation of an HTTP request message.
  249. class _http_request final : public http::details::http_msg_base, public std::enable_shared_from_this<_http_request>
  250. {
  251. public:
  252. _ASYNCRTIMP _http_request(http::method mtd);
  253. _ASYNCRTIMP _http_request(std::unique_ptr<http::details::_http_server_context> server_context);
  254. http::method &method() { return m_method; }
  255. const pplx::cancellation_token &cancellation_token() const { return m_cancellationToken; }
  256. _ASYNCRTIMP pplx::task<void> reply(const http_response &response);
  257. private:
  258. // Actual initiates sending the response, without checking if a response has already been sent.
  259. pplx::task<void> _reply_impl(http_response response);
  260. http::method m_method;
  261. std::shared_ptr<progress_handler> m_progress_handler;
  262. };
  263. } // namespace details
  264. ```
  265. As before, we note that the implementation class for HTTP requests concerns
  266. itself more with the mechanics of sending the message asynchronously than
  267. it does with actually modeling the HTTP message as described in __rfc7230__:
* The constructor accepting `std::unique_ptr<http::details::_http_server_context>`
  269. breaks encapsulation and separation of concerns. This cannot be extended
  270. for user defined server contexts.
  271. * The "cancellation token" is stored inside the message. This breaks the
  272. separation of concerns.
  273. * The `_reply_impl` function implies that the message implementation also
  274. shares responsibility for the means of sending back an HTTP reply. This
would be better if it were completely separate from the message container.
  276. Finally, here is the public class which represents an HTTP request:
  277. ```
  278. class http_request
  279. {
  280. public:
  281. const http::method &method() const { return _m_impl->method(); }
  282. void set_method(const http::method &method) const { _m_impl->method() = method; }
  283. /// Extract the body of the request message as a string value, checking that the content type is a MIME text type.
  284. /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
  285. pplx::task<utility::string_t> extract_string(bool ignore_content_type = false)
  286. {
  287. auto impl = _m_impl;
  288. return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->extract_string(ignore_content_type); });
  289. }
  290. /// Extracts the body of the request message into a json value, checking that the content type is application/json.
  291. /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
  292. pplx::task<json::value> extract_json(bool ignore_content_type = false) const
  293. {
  294. auto impl = _m_impl;
  295. return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->_extract_json(ignore_content_type); });
  296. }
  297. /// Sets the body of the message to the contents of a byte vector. If the 'Content-Type'
  298. void set_body(const std::vector<unsigned char> &body_data);
  299. /// Defines a stream that will be relied on to provide the body of the HTTP message when it is
  300. /// sent.
  301. void set_body(const concurrency::streams::istream &stream, const utility::string_t &content_type = _XPLATSTR("application/octet-stream"));
  302. /// Defines a stream that will be relied on to hold the body of the HTTP response message that
  303. /// results from the request.
void set_response_stream(const concurrency::streams::ostream &stream)
  305. {
  306. return _m_impl->set_response_stream(stream);
  307. }
  308. /// Defines a callback function that will be invoked for every chunk of data uploaded or downloaded
  309. /// as part of the request.
  310. void set_progress_handler(const progress_handler &handler);
  311. private:
  312. friend class http::details::_http_request;
  313. friend class http::client::http_client;
  314. std::shared_ptr<http::details::_http_request> _m_impl;
  315. };
  316. ```
It is clear from this declaration that the message model in
this library is driven by its use-case (interacting with REST servers)
rather than by the goal of modeling HTTP messages generally. We note
problems similar to the other declarations:
  321. * There are no compile-time customization points at all. The only
  322. customization is in the `concurrency::streams::istream` and
  323. `concurrency::streams::ostream` reference parameters. Presumably,
  324. these are abstract interfaces which may be subclassed by users
  325. to achieve custom behaviors.
  326. * The extraction of the body is conflated with the asynchronous model.
  327. * No way to define an allocator for the container used when extracting
  328. the body.
  329. * A body can only be extracted once, limiting the use of this container
  330. when using a functional programming style.
  331. * Setting the body requires either a vector or a `concurrency::streams::istream`.
  332. No user defined types are possible.
  333. * The HTTP request container conflates HTTP response behavior (see the
  334. `set_response_stream` member). Again this is likely purpose-driven but
  335. the lack of separation of concerns limits this library to only the
  336. uses explicitly envisioned by the authors.
  337. The general theme of the HTTP message model in cpprestsdk is "no user
  338. definable customizations". There is no allocator support, and no
  339. separation of concerns. It is designed to perform a specific set of
  340. behaviors. In other words, it does not follow the open/closed principle.
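To illustrate the kind of user definable customization that is missing, here is a minimal sketch of a body customization point which admits a body stored in a file as well as one stored in memory. The names `string_body`, `file_body`, `message`, and `payload` are hypothetical, and the single static `read` function is an assumption of this sketch rather than a description of any library's interface.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// A body policy supplies the stored type and the algorithm to read it.
struct string_body
{
    using value_type = std::string;

    static std::string read(value_type const& v) { return v; }
};

// The stored value is only a file name. The accompanying algorithm
// retrieves the data from the file system on demand.
struct file_body
{
    struct value_type { std::string path; };

    static std::string read(value_type const& v)
    {
        std::ifstream in(v.path, std::ios::binary);
        return std::string(
            std::istreambuf_iterator<char>(in),
            std::istreambuf_iterator<char>());
    }
};

template<class Body>
struct message
{
    typename Body::value_type body;
};

// Generic code works with any body policy, and reading the payload
// does not consume it, so it may be read more than once.
template<class Body>
std::string payload(message<Body> const& m)
{
    return Body::read(m.body);
}

// Usage: writes a scratch file at the given path, reads it back
// through the message twice, then removes the file.
inline std::string demo_file_body(std::string const& path)
{
    {
        std::ofstream out(path, std::ios::binary);
        out << "stored on disk";
    }
    message<file_body> m;
    m.body.path = path;
    std::string const first = payload(m);
    std::string const second = payload(m);
    assert(first == second);
    std::remove(path.c_str());
    return first;
}
```

The message container itself never changes; adding a new storage strategy means writing one more policy type, which is the open/closed property the paragraph above finds lacking.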
  341. Tasks in the Concurrency Runtime operate in a fashion similar to
  342. `std::future`, but with some improvements such as continuations which
are not yet in the C++ standard. The costs of using a task based
asynchronous interface instead of completion handlers are well
  345. documented: synchronization points along the call chain of composed
  346. task operations which cannot be optimized away. See:
  347. [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3747.pdf
  348. [*A Universal Model for Asynchronous Operations]] (Kohlhoff).
  349. [endsect]