design_rationale.qbk 24 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665
  1. [/==============================================================================
  2. Copyright (c) 2007-2012 Barend Gehrels, Amsterdam, the Netherlands.
  3. Copyright (c) 2008-2012 Bruno Lalande, Paris, France.
  4. Copyright (c) 2009-2012 Mateusz Loskot, London, UK., London, UK
  5. Use, modification and distribution is subject to the Boost Software License,
  6. Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  7. http://www.boost.org/LICENSE_1_0.txt)
  8. ===============================================================================/]
  9. [/ note the source code in this QBK is the only not (yet) checked by a compiler]
  10. [section:design Design Rationale]
  11. Suppose you need a C++ program to calculate the distance between two points.
  12. You might define a struct:
  13. struct mypoint
  14. {
  15. double x, y;
  16. };
  17. and a function, containing the algorithm:
  18. double distance(mypoint const& a, mypoint const& b)
  19. {
  20. double dx = a.x - b.x;
  21. double dy = a.y - b.y;
  22. return sqrt(dx * dx + dy * dy);
  23. }
  24. Quite simple, and it is usable, but not generic. For a library it has to be
  25. designed way further. The design above can only be used for 2D points, for the
  26. struct [*mypoint] (and no other struct), in a Cartesian coordinate system. A
  27. generic library should be able to calculate the distance:
  28. * for any point class or struct, not on just this [*mypoint] type
  29. * in more than two dimensions
  30. * for other coordinate systems, e.g. over the earth or on a sphere
  31. * between a point and a line or between other geometry combinations
  32. * in higher precision than ['double]
  33. * avoiding the square root: often we don't want to do that because it is a relatively expensive
  34. function, and for comparing distances it is not necessary
  35. In this and following sections we will make the design step by step more generic.
  36. [heading Using Templates]
  37. The distance function can be changed into a template function. This is trivial and allows
  38. calculating the distance between other point types than just [*mypoint]. We add two template
  39. parameters, allowing input of two different point types.
  40. template <typename P1, typename P2>
  41. double distance(P1 const& a, P2 const& b)
  42. {
  43. double dx = a.x - b.x;
  44. double dy = a.y - b.y;
  45. return std::sqrt(dx * dx + dy * dy);
  46. }
  47. This template version is slightly better, but not much.
  48. Consider a C++ class where member variables are protected...
  49. Such a class does not allow to access `x` and `y` members directly. So, this paragraph is short
  50. and we just move on.
  51. [heading Using Traits]
  52. We need to take a generic approach and allow any point type as input to the distance function.
  53. Instead of accessing `x` and `y` members, we will add a few levels of indirection, using a
  54. traits system. The function then becomes:
  55. template <typename P1, typename P2>
  56. double distance(P1 const& a, P2 const& b)
  57. {
  58. double dx = get<0>(a) - get<0>(b);
  59. double dy = get<1>(a) - get<1>(b);
  60. return std::sqrt(dx * dx + dy * dy);
  61. }
  62. This adapted distance function uses a generic get function, with dimension as a template parameter,
  63. to access the coordinates of a point. This get forwards to the traits system, defined as following:
  64. namespace traits
  65. {
  66. template <typename P, int D>
  67. struct access {};
  68. }
  69. which is then specialized for our [*mypoint] type, implementing a static method called `get`:
  70. namespace traits
  71. {
  72. template <>
  73. struct access<mypoint, 0>
  74. {
  75. static double get(mypoint const& p)
  76. {
  77. return p.x;
  78. }
  79. };
  80. // same for 1: p.y
  81. ...
  82. }
  83. Calling `traits::access<mypoint, 0>::get(a)` now returns us our `x` coordinate. Nice, isn't it?
  84. It is too verbose for a function like this, used so often in the library. We can shorten the syntax
  85. by adding an extra free function:
  86. template <int D, typename P>
  87. inline double get(P const& p)
  88. {
  89. return traits::access<P, D>::get(p);
  90. }
  91. This enables us to call `get<0>(a)`, for any point having the traits::access specialization,
  92. as shown in the distance algorithm at the start of this paragraph. So we wanted to enable classes
  93. with methods like `x()`, and they are supported as long as there is a specialization of the access
  94. `struct` with a static `get` function returning `x()` for dimension 0, and similar for 1 and `y()`.
  95. [heading Dimension Agnosticism]
  96. Now we can calculate the distance between points in 2D, points of any structure or class.
  97. However, we wanted to have 3D as well. So we have to make it dimension agnostic. This complicates
  98. our distance function. We can use a `for` loop to walk through dimensions, but for loops have
  99. another performance than the straightforward coordinate addition which was there originally.
  100. However, we can make more usage of templates and make the distance algorithm as following,
  101. more complex but attractive for template fans:
  102. template <typename P1, typename P2, int D>
  103. struct pythagoras
  104. {
  105. static double apply(P1 const& a, P2 const& b)
  106. {
  107. double d = get<D-1>(a) - get<D-1>(b);
  108. return d * d + pythagoras<P1, P2, D-1>::apply(a, b);
  109. }
  110. };
  111. template <typename P1, typename P2 >
  112. struct pythagoras<P1, P2, 0>
  113. {
  114. static double apply(P1 const&, P2 const&)
  115. {
  116. return 0;
  117. }
  118. };
  119. The distance function is calling that `pythagoras` structure, specifying the number of dimensions:
  120. template <typename P1, typename P2>
  121. double distance(P1 const& a, P2 const& b)
  122. {
  123. BOOST_STATIC_ASSERT(( dimension<P1>::value == dimension<P2>::value ));
  124. return sqrt(pythagoras<P1, P2, dimension<P1>::value>::apply(a, b));
  125. }
  126. The dimension which is referred to is defined using another traits class:
  127. ``
  128. namespace traits
  129. {
  130. template <typename P>
  131. struct dimension {};
  132. }
  133. ``
  134. which has to be specialized again for the `struct mypoint`.
  135. Because it only has to publish a value, we conveniently derive it from the
  136. __boost_mpl__ `class boost::mpl::int_`:
  137. ``
  138. namespace traits
  139. {
  140. template <>
  141. struct dimension<mypoint> : boost::mpl::int_<2>
  142. {};
  143. }
  144. ``
  145. Like the free get function, the library also contains a dimension meta-function.
  146. template <typename P>
  147. struct dimension : traits::dimension<P>
  148. {};
  149. Below is explained why the extra declaration is useful. Now we have agnosticism in the number of
  150. dimensions. Our more generic distance function now accepts points of three or more dimensions.
  151. The compile-time assertion will prevent point a having two dimension and point b having
  152. three dimensions.
  153. [heading Coordinate Type]
  154. We assumed double above. What if our points are in integer?
  155. We can easily add a traits class, and we will do that. However, the distance between two integer
  156. coordinates can still be a fractionized value. Besides that, a design goal was to avoid square
  157. roots. We handle these cases below, in another paragraph. For the moment we keep returning double,
  158. but we allow integer coordinates for our point types. To define the coordinate type, we add
  159. another traits class, `coordinate_type`, which should be specialized by the library user:
  160. namespace traits
  161. {
  162. template <typename P>
  163. struct coordinate_type{};
  164. // specialization for our mypoint
  165. template <>
  166. struct coordinate_type<mypoint>
  167. {
  168. typedef double type;
  169. };
  170. }
  171. Like the access function, where we had a free get function, we add a proxy here as well.
  172. A longer version is presented later on, the short function would look like this:
  173. template <typename P>
  174. struct coordinate_type : traits::coordinate_type<P> {};
  175. We now can modify our distance algorithm again. Because it still returns double, we only
  176. modify the `pythagoras` computation class. It should return the coordinate type of its input.
  177. But, it has two input, possibly different, point types. They might also differ in their
  178. coordinate types. Not that that is very likely, but we’re designing a generic library and we
  179. should handle those strange cases. We have to choose one of the coordinate types and of course
  180. we select the one with the highest precision. This is not worked out here, it would be too long,
  181. and it is not related to geometry. We just assume that there is a meta-function `select_most_precise`
  182. selecting the best type.
  183. So our computation class becomes:
  184. template <typename P1, typename P2, int D>
  185. struct pythagoras
  186. {
  187. typedef typename select_most_precise
  188. <
  189. typename coordinate_type<P1>::type,
  190. typename coordinate_type<P2>::type
  191. >::type computation_type;
  192. static computation_type apply(P1 const& a, P2 const& b)
  193. {
  194. computation_type d = get<D-1>(a) - get<D-1>(b);
  195. return d * d + pythagoras <P1, P2, D-1> ::apply(a, b);
  196. }
  197. };
  198. [heading Different Geometries]
  199. We have designed a dimension agnostic system supporting any point type of any
  200. coordinate type. There are still some tweaks but they will be worked out later.
  201. Now we will see how we calculate the distance between a point and a polygon, or
  202. between a point and a line-segment. These formulae are more complex, and the
  203. influence on design is even larger. We don’t want to add a function with another
  204. name:
  205. template <typename P, typename S>
  206. double distance_point_segment(P const& p, S const& s)
  207. We want to be generic, the distance function has to be called from code not
  208. knowing the type of geometry it handles, so it has to be named distance. We also
  209. cannot create an overload because that would be ambiguous, having the same
  210. template signature. There are two solutions:
  211. * tag dispatching
  212. * SFINAE
  213. We select tag dispatching because it fits into the traits system. The earlier
  214. versions (previews) of Boost.Geometry used SFINAE but we found it had several
  215. drawbacks for such a big design, so the switch to tag dispatching was made.
  216. With tag dispatching the distance algorithm inspects the type of geometry of the
  217. input parameters. The distance function will be changed into this:
  218. template <typename G1, typename G2>
  219. double distance(G1 const& g1, G2 const& g2)
  220. {
  221. return dispatch::distance
  222. <
  223. typename tag<G1>::type,
  224. typename tag<G2>::type,
  225. G1, G2
  226. >::apply(g1, g2);
  227. }
  228. It is referring to the tag meta-function and forwarding the call to the [*apply] method of
  229. a ['dispatch::distance] structure. The [*tag] meta-function is another traits class, and should
  230. be specialized for per point type, both shown here:
  231. namespace traits
  232. {
  233. template <typename G>
  234. struct tag {};
  235. // specialization
  236. template <>
  237. struct tag<mypoint>
  238. {
  239. typedef point_tag type;
  240. };
  241. }
  242. Free meta-function, like coordinate_system and get:
  243. template <typename G>
  244. struct tag : traits::tag<G> {};
  245. [*Tags] (`point_tag`, `segment_tag`, etc) are empty structures with the purpose to specialize a
  246. dispatch struct. The dispatch struct for distance, and its specializations, are all defined
  247. in a separate namespace and look like the following:
  248. namespace dispatch {
  249. template < typename Tag1, typename Tag2, typename G1, typename G2 >
  250. struct distance
  251. {};
  252. template <typename P1, typename P2>
  253. struct distance < point_tag, point_tag, P1, P2 >
  254. {
  255. static double apply(P1 const& a, P2 const& b)
  256. {
  257. // here we call pythagoras
  258. // exactly like we did before
  259. ...
  260. }
  261. };
  262. template <typename P, typename S>
  263. struct distance
  264. <
  265. point_tag, segment_tag, P, S
  266. >
  267. {
  268. static double apply(P const& p, S const& s)
  269. {
  270. // here we refer to another function
  271. // implementing point-segment
  272. // calculations in 2 or 3
  273. // dimensions...
  274. ...
  275. }
  276. };
  277. // here we might have many more
  278. // specializations,
  279. // for point-polygon, box-circle, etc.
  280. } // namespace
  281. So yes, it is possible; the distance algorithm is generic now in the sense that it also
  282. supports different geometry types. One drawback: we have to define two dispatch specializations
  283. for point - segment and for segment - point separately. That will also be solved, in the paragraph
  284. reversibility below. The example below shows where we are now: different point types,
  285. geometry types, dimensions.
  286. point a(1,1);
  287. point b(2,2);
  288. std::cout << distance(a,b) << std::endl;
  289. segment s1(0,0,5,3);
  290. std::cout << distance(a, s1) << std::endl;
  291. rgb red(255, 0, 0);
  292. rbc orange(255, 128, 0);
  293. std::cout << "color distance: " << distance(red, orange) << std::endl;
  294. [heading Kernel Revisited]
  295. We described above that we had a traits class `coordinate_type`, defined in namespace traits,
  296. and defined a separate `coordinate_type` class as well. This was actually not really necessary
  297. before, because the only difference was the namespace clause. But now that we have another
  298. geometry type, a segment in this case, it is essential. We can call the `coordinate_type`
  299. meta-function for any geometry type, point, segment, polygon, etc, implemented again by tag dispatching:
  300. template <typename G>
  301. struct coordinate_type
  302. {
  303. typedef typename dispatch::coordinate_type
  304. <
  305. typename tag<G>::type, G
  306. >::type type;
  307. };
  308. Inside the dispatch namespace this meta-function is implemented twice: a generic version and
  309. one specialization for points. The specialization for points calls the traits class.
  310. The generic version calls the point specialization, as a sort of recursive meta-function definition:
  311. namespace dispatch
  312. {
  313. // Version for any geometry:
  314. template <typename GeometryTag, typename G>
  315. struct coordinate_type
  316. {
  317. typedef typename point_type
  318. <
  319. GeometryTag, G
  320. >::type point_type;
  321. // Call specialization on point-tag
  322. typedef typename coordinate_type < point_tag, point_type >::type type;
  323. };
  324. // Specialization for point-type:
  325. template <typename P>
  326. struct coordinate_type<point_tag, P>
  327. {
  328. typedef typename
  329. traits::coordinate_type<P>::type
  330. type;
  331. };
  332. }
  333. So it calls another meta-function point_type. This is not elaborated in here but realize that it
  334. is available for all geometry types, and typedefs the point type which makes up the geometry,
  335. calling it type.
  336. The same applies for the meta-function dimension and for the upcoming meta-function coordinate system.
  337. [heading Coordinate System]
  338. Until here we assumed a Cartesian system. But we know that the Earth is not flat.
  339. Calculating a distance between two GPS-points with the system above would result in nonsense.
  340. So we again extend our design. We define for each point type a coordinate system type
  341. using the traits system again. Then we make the calculation dependant on that coordinate system.
  342. Coordinate system is similar to coordinate type, a meta-function, calling a dispatch function
  343. to have it for any geometry-type, forwarding to its point specialization, and finally calling
  344. a traits class, defining a typedef type with a coordinate system. We don’t show that all here again.
  345. We only show the definition of a few coordinate systems:
  346. struct cartesian {};
  347. template<typename DegreeOrRadian>
  348. struct geographic
  349. {
  350. typedef DegreeOrRadian units;
  351. };
  352. So Cartesian is simple, for geographic we can also select if its coordinates are stored in degrees
  353. or in radians.
  354. The distance function will now change: it will select the computation method for the corresponding
  355. coordinate system and then call the dispatch struct for distance. We call the computation method
  356. specialized for coordinate systems a strategy. So the new version of the distance function is:
  357. template <typename G1, typename G2>
  358. double distance(G1 const& g1, G2 const& g2)
  359. {
  360. typedef typename strategy_distance
  361. <
  362. typename coordinate_system<G1>::type,
  363. typename coordinate_system<G2>::type,
  364. typename point_type<G1>::type,
  365. typename point_type<G2>::type,
  366. dimension<G1>::value
  367. >::type strategy;
  368. return dispatch::distance
  369. <
  370. typename tag<G1>::type,
  371. typename tag<G2>::type,
  372. G1, G2, strategy
  373. >::apply(g1, g2, strategy());
  374. }
  375. The strategy_distance mentioned here is a struct with specializations for different coordinate
  376. systems.
  377. template <typename T1, typename T2, typename P1, typename P2, int D>
  378. struct strategy_distance
  379. {
  380. typedef void type;
  381. };
  382. template <typename P1, typename P2, int D>
  383. struct strategy_distance<cartesian, cartesian, P1, P2, D>
  384. {
  385. typedef pythagoras<P1, P2, D> type;
  386. };
  387. So, here is our `pythagoras` again, now defined as a strategy. The distance dispatch function just
  388. calls its apply method.
  389. So this is an important step: for spherical or geographical coordinate systems, another
  390. strategy (computation method) can be implemented. For spherical coordinate systems
  391. have the haversine formula. So the dispatching traits struct is specialized like this
  392. template <typename P1, typename P2, int D = 2>
  393. struct strategy_distance<spherical, spherical, P1, P2, D>
  394. {
  395. typedef haversine<P1, P2> type;
  396. };
  397. // struct haversine with apply function
  398. // is omitted here
  399. For geography, we have some alternatives for distance calculation. There is the Andoyer method,
  400. fast and precise, and there is the Vincenty method, slower and more precise, and there are some
  401. less precise approaches as well.
  402. Per coordinate system, one strategy is defined as the default strategy. To be able to use
  403. another strategy as well, we modify our design again and add an overload for the distance
  404. algorithm, taking a strategy object as a third parameter.
  405. This new overload distance function also has the advantage that the strategy can be constructed
  406. outside the distance function. Because it was constructed inside above, it could not have
  407. construction parameters. But for Andoyer or Vincenty, or the haversine formula, it certainly
  408. makes sense to have a constructor taking the radius of the earth as a parameter.
  409. So, the distance overloaded function is:
  410. template <typename G1, typename G2, typename S>
  411. double distance(G1 const& g1, G2 const& g2, S const& strategy)
  412. {
  413. return dispatch::distance
  414. <
  415. typename tag<G1>::type,
  416. typename tag<G2>::type,
  417. G1, G2, S
  418. >::apply(g1, g2, strategy);
  419. }
  420. The strategy has to have a method apply taking two points as arguments (for points). It is not
  421. required that it is a static method. A strategy might define a constructor, where a configuration
  422. value is passed and stored as a member variable. In those cases a static method would be
  423. inconvenient. It can be implemented as a normal method (with the const qualifier).
  424. We do not list all implementations here, Vincenty would cover half a page of mathematics,
  425. but you will understand the idea. We can call distance like this:
  426. distance(c1, c2)
  427. where `c1` and `c2` are Cartesian points, or like this:
  428. distance(g1, g2)
  429. where `g1` and `g2` are Geographic points, calling the default strategy for Geographic
  430. points (e.g. Andoyer), and like this:
  431. distance(g1, g2, vincenty<G1, G2>(6275))
  432. where a strategy is specified explicitly and constructed with a radius.
  433. [/ 'TODO: What to do with this section? We have concepts already --mloskot]
  434. [heading Point Concept]
  435. The five traits classes mentioned in the previous sections form together the
  436. Point Concept. Any point type for which specializations are implemented in
  437. the traits namespace should be accepted as a valid type. So the Point Concept
  438. consists of:
  439. * a specialization for `traits::tag`
  440. * a specialization for `traits::coordinate_system`
  441. * a specialization for `traits::coordinate_type`
  442. * a specialization for `traits::dimension`
  443. * a specialization for `traits::access`
  444. The last one is a class, containing the method get and the (optional) method
  445. set, the first four are metafunctions, either defining type or declaring a
  446. value (conform MPL conventions).
  447. So we now have agnosticism for the number of dimensions, agnosticism for
  448. coordinate systems; the design can handle any coordinate type, and it can
  449. handle different geometry types. Furthermore we can specify our own
  450. strategies, the code will not compile in case of two points with different
  451. dimensions (because of the assertion), and it will not compile for two points
  452. with different coordinate systems (because there is no specialization).
  453. A library can check if a point type fulfills the requirements imposed by the
  454. concepts. This is handled in the upcoming section Concept Checking.
  455. [heading Return Type]
  456. We promised that calling `std::sqrt` was not always necessary. So we define a distance result `struct`
  457. that contains the squared value and is convertible to a double value. This, however, only has to
  458. be done for `pythagoras`. The spherical distance functions do not take the square root so for them
  459. it is not necessary to avoid the expensive square root call; they can just return their distance.
  460. So the distance result struct is dependant on strategy, therefore made a member type of
  461. the strategy. The result struct looks like this:
  462. template<typename T = double>
  463. struct cartesian_distance
  464. {
  465. T sq;
  466. explicit cartesian_distance(T const& v) : sq (v) {}
  467. inline operator T() const
  468. {
  469. return std::sqrt(sq);
  470. }
  471. };
  472. It also has operators defined to compare itself to other results without taking the square root.
  473. Each strategy should define its return type, within the strategy class, for example:
  474. typedef cartesian_distance<T> return_type;
  475. or:
  476. typedef double return_type;
  477. for cartesian (pythagoras) and spherical, respectively.
  478. Again our distance function will be modified, as expected, to reflect the new return type.
  479. For the overload with a strategy it is not complex:
  480. template < typename G1, typename G2, typename Strategy >
  481. typename Strategy::return_type distance( G1 const& G1 , G2 const& G2 , S const& strategy)
  482. But for the one without strategy we have to select strategy, coordinate type, etc.
  483. It would be spacious to do it in one line so we add a separate meta-function:
  484. template <typename G1, typename G2 = G1>
  485. struct distance_result
  486. {
  487. typedef typename point_type<G1>::type P1;
  488. typedef typename point_type<G2>::type P2;
  489. typedef typename strategy_distance
  490. <
  491. typename cs_tag<P1>::type,
  492. typename cs_tag<P2>::type,
  493. P1, P2
  494. >::type S;
  495. typedef typename S::return_type type;
  496. };
  497. and modify our distance function:
  498. template <typename G1, typename G2>
  499. inline typename distance_result<G1, G2>::type distance(G1 const& g1, G2 const& g2)
  500. {
  501. // ...
  502. }
  503. Of course also the apply functions in the dispatch specializations will return a result like this.
  504. They have a strategy as a template parameter everywhere, making the less verbose version possible.
  505. [heading Axis Order]
  506. This is an important note for users who develop GIS applications, require compliance with OGC
  507. standards or use variety of coordinate reference systems (CRS).
  508. Boost.Geometry convention is that coordinate values of 2D tuple is always given according to
  509. mathematical axis order: X,Y. Considering GIS applications, that always means easting,northing or,
  510. in terms of geographic coordinate system: longitude,latitude.
  511. Boost.Geometry does not allow or respect coordinate values listed in the axis order as
  512. specified by coordinate reference system (CRS).
  513. In practice, users may easily adapt point types with alternate axis ordering and
  514. specify coordinate access in order as expected by Boost.Geometry.
  515. [heading Summary]
  516. In this design rationale, __boost_geometry__ is step by step designed using tag dispatching,
  517. concepts, traits, and metaprogramming. We used the well-known distance function
  518. to show the design.
  519. __boost_geometry__ is designed like described here, with
  520. some more techniques as automatically reversing template arguments, tag casting,
  521. and reusing implementation classes or dispatch classes as policies in other
  522. dispatch classes.
  523. [endsect] [/ end of section Design Rationale]