gradient.rst 40 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980
  1. Tutorial: Image Gradient
  2. ========================
  3. .. contents::
  4. :local:
  5. :depth: 1
  6. This comprehensive (and long) tutorial will walk you through an example of
  7. using GIL to compute the image gradients.
  8. We will start with some very simple and non-generic code and make it more
  9. generic as we go along. Let us start with a horizontal gradient and use the
  10. simplest possible approximation to a gradient - central difference.
  11. The gradient at pixel x can be approximated with the half-difference of its
  12. two neighboring pixels::
  13. D[x] = (I[x-1] - I[x+1]) / 2
  14. For simplicity, we will also ignore the boundary cases - the pixels along the
  15. edges of the image for which one of the neighbors is not defined. The focus of
  16. this document is how to use GIL, not how to create a good gradient generation
  17. algorithm.
  18. Interface and Glue Code
  19. -----------------------
  20. Let us first start with 8-bit unsigned grayscale image as the input and 8-bit
  21. signed grayscale image as the output.
  22. Here is how the interface to our algorithm looks like:
  23. .. code-block:: cpp
  24. #include <boost/gil.hpp>
  25. using namespace boost::gil;
  26. void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  27. {
  28. assert(src.dimensions() == dst.dimensions());
  29. ... // compute the gradient
  30. }
  31. ``gray8c_view_t`` is the type of the source image view - an 8-bit grayscale
  32. view, whose pixels are read-only (denoted by the "c").
  33. The output is a grayscale view with a 8-bit signed (denoted by the "s")
  34. integer channel type. See Appendix 1 for the complete convention GIL uses to
  35. name concrete types.
  36. GIL makes a distinction between an image and an image view.
  37. A GIL **image view**, is a shallow, lightweight view of a rectangular grid of
  38. pixels. It provides access to the pixels but does not own the pixels.
  39. Copy-constructing a view does not deep-copy the pixels. Image views do not
  40. propagate their constness to the pixels and should always be taken by a const
  41. reference. Whether a view is mutable or read-only (immutable) is a property of
  42. the view type.
  43. A GIL `image`, on the other hand, is a view with associated ownership.
  44. It is a container of pixels; its constructor/destructor allocates/deallocates
  45. the pixels, its copy-constructor performs deep-copy of the pixels and its
  46. ``operator==`` performs deep-compare of the pixels. Images also propagate
  47. their constness to their pixels - a constant reference to an image will not
  48. allow for modifying its pixels.
  49. Most GIL algorithms operate on image views; images are rarely
  50. needed. GIL's design is very similar to that of the STL. The STL
  51. equivalent of GIL's image is a container, like ``std::vector``,
  52. whereas GIL's image view corresponds to STL range, which is often
  53. represented with a pair of iterators. STL algorithms operate on
  54. ranges, just like GIL algorithms operate on image views.
  55. GIL's image views can be constructed from raw data - the dimensions,
  56. the number of bytes per row and the pixels, which for chunky views are
  57. represented with one pointer. Here is how to provide the glue between
  58. your code and GIL:
  59. .. code-block:: cpp
  60. void ComputeXGradientGray8(
  61. unsigned char const* src_pixels, ptrdiff_t src_row_bytes,
  62. int w, int h,
  63. signed char* dst_pixels, ptrdiff_t dst_row_bytes)
  64. {
  65. gray8c_view_t src = interleaved_view(w, h, (gray8_pixel_t const*)src_pixels, src_row_bytes);
  66. gray8s_view_t dst = interleaved_view(w, h, (gray8s_pixel_t*)dst_pixels, dst_row_bytes);
  67. x_gradient(src, dst);
  68. }
  69. This glue code is very fast and views are lightweight - in the above example
  70. the views have a size of 16 bytes. They consist of a pointer to the top left
  71. pixel and three integers - the width, height, and number of bytes per row.
  72. First Implementation
  73. --------------------
  74. Focusing on simplicity at the expense of speed, we can compute the horizontal
  75. gradient like this:
  76. .. code-block:: cpp
  77. void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  78. {
  79. for (int y = 0; y < src.height(); ++y)
  80. for (int x = 1; x < src.width() - 1; ++x)
  81. dst(x, y) = (src(x-1, y) - src(x+1, y)) / 2;
  82. }
  83. We use image view's ``operator(x,y)`` to get a reference to the pixel at a
  84. given location and we set it to the half-difference of its left and right
  85. neighbors. ``operator()`` returns a reference to a grayscale pixel.
  86. A grayscale pixel is convertible to its channel type (``unsigned char`` for
  87. ``src``) and it can be copy-constructed from a channel.
  88. (This is only true for grayscale pixels).
  89. While the above code is easy to read, it is not very fast, because the binary
  90. ``operator()`` computes the location of the pixel in a 2D grid, which involves
  91. addition and multiplication. Here is a faster version of the above:
  92. .. code-block:: cpp
  93. void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  94. {
  95. for (int y = 0; y < src.height(); ++y)
  96. {
  97. gray8c_view_t::x_iterator src_it = src.row_begin(y);
  98. gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
  99. for (int x=1; x < src.width() - 1; ++x)
  100. dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
  101. }
  102. }
  103. We use pixel iterators initialized at the beginning of each row. GIL's
  104. iterators are Random Access Traversal iterators. If you are not
  105. familiar with random access iterators, think of them as if they were
  106. pointers. In fact, in the above example the two iterator types are raw
  107. C pointers and their ``operator[]`` is a fast pointer indexing
  108. operator.
  109. The code to compute gradient in the vertical direction is very
  110. similar:
  111. .. code-block: cpp
  112. void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  113. {
  114. for (int x = 0; x < src.width(); ++x)
  115. {
  116. gray8c_view_t::y_iterator src_it = src.col_begin(x);
  117. gray8s_view_t::y_iterator dst_it = dst.col_begin(x);
  118. for (int y = 1; y < src.height() - 1; ++y)
  119. dst_it[y] = (src_it[y-1] - src_it[y+1]) / 2;
  120. }
  121. }
  122. Instead of looping over the rows, we loop over each column and create a
  123. ``y_iterator``, an iterator moving vertically. In this case a simple pointer
  124. cannot be used because the distance between two adjacent pixels equals the
  125. number of bytes in each row of the image. GIL uses here a special step
  126. iterator class whose size is 8 bytes - it contains a raw C pointer and a step.
  127. Its ``operator[]`` multiplies the index by its step.
  128. The above version of ``y_gradient``, however, is much slower (easily an order
  129. of magnitude slower) than ``x_gradient`` because of the memory access pattern;
  130. traversing an image vertically results in lots of cache misses. A much more
  131. efficient and cache-friendly version will iterate over the columns in the inner
  132. loop:
  133. .. code-block:: cpp
  134. void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  135. {
  136. for (int y = 1; y < src.height() - 1; ++y)
  137. {
  138. gray8c_view_t::x_iterator src1_it = src.row_begin(y-1);
  139. gray8c_view_t::x_iterator src2_it = src.row_begin(y+1);
  140. gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
  141. for (int x = 0; x < src.width(); ++x)
  142. {
  143. *dst_it = ((*src1_it) - (*src2_it)) / 2;
  144. ++dst_it;
  145. ++src1_it;
  146. ++src2_it;
  147. }
  148. }
  149. }
  150. This sample code also shows an alternative way of using pixel iterators -
  151. instead of ``operator[]`` one could use increments and dereferences.
  152. Using Locators
  153. --------------
  154. Unfortunately this cache-friendly version requires the extra hassle of
  155. maintaining two separate iterators in the source view. For every pixel, we
  156. want to access its neighbors above and below it. Such relative access can be
  157. done with GIL locators:
  158. .. code-block:: cpp
  159. void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  160. {
  161. gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
  162. for (int y = 1; y < src.height() - 1; ++y)
  163. {
  164. gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
  165. for (int x = 0; x < src.width(); ++x)
  166. {
  167. (*dst_it) = (src_loc(0,-1) - src_loc(0,1)) / 2;
  168. ++dst_it;
  169. ++src_loc.x(); // each dimension can be advanced separately
  170. }
  171. src_loc+=point<std::ptrdiff_t>(-src.width(), 1); // carriage return
  172. }
  173. }
  174. The first line creates a locator pointing to the first pixel of the
  175. second row of the source view. A GIL pixel locator is very similar to
  176. an iterator, except that it can move both horizontally and
  177. vertically. ``src_loc.x()`` and ``src_loc.y()`` return references to a
  178. horizontal and a vertical iterator respectively, which can be used to
  179. move the locator along the desired dimension, as shown
  180. above. Additionally, the locator can be advanced in both dimensions
  181. simultaneously using its ``operator+=`` and ``operator-=``. Similar to
  182. image views, locators provide binary ``operator()`` which returns a
  183. reference to a pixel with a relative offset to the current locator
  184. position. For example, ``src_loc(0,1)`` returns a reference to the
  185. neighbor below the current pixel. Locators are very lightweight
  186. objects - in the above example the locator has a size of 8 bytes - it
  187. consists of a raw pointer to the current pixel and an int indicating
  188. the number of bytes from one row to the next (which is the step when
  189. moving vertically). The call to ``++src_loc.x()`` corresponds to a
  190. single C pointer increment. However, the example above performs more
  191. computations than necessary. The code ``src_loc(0,1)`` has to compute
  192. the offset of the pixel in two dimensions, which is slow. Notice
  193. though that the offset of the two neighbors is the same, regardless of
  194. the pixel location. To improve the performance, GIL can cache and
  195. reuse this offset::
  196. void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  197. {
  198. gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
  199. gray8c_view_t::xy_locator::cached_location_t above = src_loc.cache_location(0,-1);
  200. gray8c_view_t::xy_locator::cached_location_t below = src_loc.cache_location(0, 1);
  201. for (int y = 1; y < src.height() - 1; ++y)
  202. {
  203. gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
  204. for (int x = 0; x < src.width(); ++x)
  205. {
  206. (*dst_it) = (src_loc[above] - src_loc[below]) / 2;
  207. ++dst_it;
  208. ++src_loc.x();
  209. }
  210. src_loc+=point<std::ptrdiff_t>(-src.width(), 1);
  211. }
  212. }
  213. In this example ``src_loc[above]`` corresponds to a fast pointer indexing
  214. operation and the code is efficient.
  215. Creating a Generic Version of GIL Algorithms
  216. --------------------------------------------
  217. Let us make our ``x_gradient`` more generic. It should work with any image
  218. views, as long as they have the same number of channels. The gradient
  219. operation is to be computed for each channel independently.
  220. Here is how the new interface looks like:
  221. .. code-block:: cpp
  222. template <typename SrcView, typename DstView>
  223. void x_gradient(const SrcView& src, const DstView& dst)
  224. {
  225. gil_function_requires<ImageViewConcept<SrcView> >();
  226. gil_function_requires<MutableImageViewConcept<DstView> >();
  227. gil_function_requires
  228. <
  229. ColorSpacesCompatibleConcept
  230. <
  231. typename color_space_type<SrcView>::type,
  232. typename color_space_type<DstView>::type
  233. >
  234. >();
  235. ... // compute the gradient
  236. }
  237. The new algorithm now takes the types of the input and output image
  238. views as template parameters. That allows using both built-in GIL
  239. image views, as well as any user-defined image view classes. The
  240. first three lines are optional; they use ``boost::concept_check`` to
  241. ensure that the two arguments are valid GIL image views, that the
  242. second one is mutable and that their color spaces are compatible
  243. (i.e. have the same set of channels).
  244. GIL does not require using its own built-in constructs. You are free
  245. to use your own channels, color spaces, iterators, locators, views and
  246. images. However, to work with the rest of GIL they have to satisfy a
  247. set of requirements; in other words, they have to \e model the
  248. corresponding GIL _concept_. GIL's concepts are defined in the user
  249. guide.
  250. One of the biggest drawbacks of using templates and generic
  251. programming in C++ is that compile errors can be very difficult to
  252. comprehend. This is a side-effect of the lack of early type
  253. checking - a generic argument may not satisfy the requirements of a
  254. function, but the incompatibility may be triggered deep into a nested
  255. call, in code unfamiliar and hardly related to the problem. GIL uses
  256. ``boost::concept_check`` to mitigate this problem. The above three
  257. lines of code check whether the template parameters are valid models
  258. of their corresponding concepts. If a model is incorrect, the compile
  259. error will be inside ``gil_function_requires``, which is much closer
  260. to the problem and easier to track. Furthermore, such checks get
  261. compiled out and have zero performance overhead. The disadvantage of
  262. using concept checks is the sometimes severe impact they have on
  263. compile time. This is why GIL performs concept checks only in debug
  264. mode, and only if ``BOOST_GIL_USE_CONCEPT_CHECK`` is defined (off by
  265. default).
  266. The body of the generic function is very similar to that of the
  267. concrete one. The biggest difference is that we need to loop over the
  268. channels of the pixel and compute the gradient for each channel:
  269. .. code-block:: cpp
  270. template <typename SrcView, typename DstView>
  271. void x_gradient(const SrcView& src, const DstView& dst)
  272. {
  273. for (int y=0; y < src.height(); ++y)
  274. {
  275. typename SrcView::x_iterator src_it = src.row_begin(y);
  276. typename DstView::x_iterator dst_it = dst.row_begin(y);
  277. for (int x = 1; x < src.width() - 1; ++x)
  278. for (int c = 0; c < num_channels<SrcView>::value; ++c)
  279. dst_it[x][c] = (src_it[x-1][c]- src_it[x+1][c]) / 2;
  280. }
  281. }
  282. Having an explicit loop for each channel could be a performance problem.
  283. GIL allows us to abstract out such per-channel operations:
  284. .. code-block:: cpp
  285. template <typename Out>
  286. struct halfdiff_cast_channels
  287. {
  288. template <typename T> Out operator()(T const& in1, T const& in2) const
  289. {
  290. return Out((in1 - in2) / 2);
  291. }
  292. };
  293. template <typename SrcView, typename DstView>
  294. void x_gradient(const SrcView& src, const DstView& dst)
  295. {
  296. typedef typename channel_type<DstView>::type dst_channel_t;
  297. for (int y=0; y < src.height(); ++y)
  298. {
  299. typename SrcView::x_iterator src_it = src.row_begin(y);
  300. typename DstView::x_iterator dst_it = dst.row_begin(y);
  301. for (int x=1; x < src.width() - 1; ++x)
  302. {
  303. static_transform(src_it[x-1], src_it[x+1], dst_it[x],
  304. halfdiff_cast_channels<dst_channel_t>());
  305. }
  306. }
  307. }
  308. The ``static_transform`` is an example of a channel-level GIL algorithm.
  309. Other such algorithms are ``static_generate``, ``static_fill`` and
  310. ``static_for_each``. They are the channel-level equivalents of STL
  311. ``generate``, ``transform``, ``fill`` and ``for_each`` respectively.
  312. GIL channel algorithms use static recursion to unroll the loops; they never
  313. loop over the channels explicitly.
  314. Note that sometimes modern compilers (at least Visual Studio 8) already unroll
  315. channel-level loops, such as the one above. However, another advantage of
  316. using GIL's channel-level algorithms is that they pair the channels
  317. semantically, not based on their order in memory. For example, the above
  318. example will properly match an RGB source with a BGR destination.
  319. Here is how we can use our generic version with images of different types:
  320. .. code-block:: cpp
  321. // Calling with 16-bit grayscale data
  322. void XGradientGray16_Gray32(
  323. unsigned short const* src_pixels, ptrdiff_t src_row_bytes,
  324. int w, int h,
  325. signed int* dst_pixels, ptrdiff_t dst_row_bytes)
  326. {
  327. gray16c_view_t src=interleaved_view(w, h, (gray16_pixel_t const*)src_pixels, src_row_bytes);
  328. gray32s_view_t dst=interleaved_view(w, h, (gray32s_pixel_t*)dst_pixels, dst_row_bytes);
  329. x_gradient(src,dst);
  330. }
  331. // Calling with 8-bit RGB data into 16-bit BGR
  332. void XGradientRGB8_BGR16(
  333. unsigned char const* src_pixels, ptrdiff_t src_row_bytes,
  334. int w, int h,
  335. signed short* dst_pixels, ptrdiff_t dst_row_bytes)
  336. {
  337. rgb8c_view_t src = interleaved_view(w, h, (rgb8_pixel_t const*)src_pixels, src_row_bytes);
  338. rgb16s_view_t dst = interleaved_view(w, h, (rgb16s_pixel_t*)dst_pixels, dst_row_bytes);
  339. x_gradient(src, dst);
  340. }
  341. // Either or both the source and the destination could be planar - the gradient code does not change
  342. void XGradientPlanarRGB8_RGB32(
  343. unsigned short const* src_r, unsigned short const* src_g, unsigned short const* src_b,
  344. ptrdiff_t src_row_bytes, int w, int h,
  345. signed int* dst_pixels, ptrdiff_t dst_row_bytes)
  346. {
  347. rgb16c_planar_view_t src = planar_rgb_view (w, h, src_r, src_g, src_b, src_row_bytes);
  348. rgb32s_view_t dst = interleaved_view(w, h,(rgb32s_pixel_t*)dst_pixels, dst_row_bytes);
  349. x_gradient(src,dst);
  350. }
  351. As these examples illustrate, both the source and the destination can be
  352. interleaved or planar, of any channel depth (assuming the destination channel
  353. is assignable to the source), and of any compatible color spaces.
  354. GIL 2.1 can also natively represent images whose channels are not
  355. byte-aligned, such as 6-bit RGB222 image or a 1-bit Gray1 image.
  356. GIL algorithms apply to these images natively. See the design guide or sample
  357. files for more on using such images.
  358. Image View Transformations
  359. --------------------------
  360. One way to compute the y-gradient is to rotate the image by 90 degrees,
  361. compute the x-gradient and rotate the result back.
  362. Here is how to do this in GIL:
  363. .. code-block:: cpp
  364. template <typename SrcView, typename DstView>
  365. void y_gradient(const SrcView& src, const DstView& dst)
  366. {
  367. x_gradient(rotated90ccw_view(src), rotated90ccw_view(dst));
  368. }
  369. ``rotated90ccw_view`` takes an image view and returns an image view
  370. representing 90-degrees counter-clockwise rotation of its input. It is
  371. an example of a GIL view transformation function. GIL provides a
  372. variety of transformation functions that can perform any axis-aligned
  373. rotation, transpose the view, flip it vertically or horizontally,
  374. extract a rectangular subimage, perform color conversion, subsample
  375. view, etc. The view transformation functions are fast and shallow -
  376. they don't copy the pixels, they just change the "coordinate system"
  377. of accessing the pixels. ``rotated90cw_view``, for example, returns a
  378. view whose horizontal iterators are the vertical iterators of the
  379. original view. The above code to compute ``y_gradient`` is slow
  380. because of the memory access pattern; using ``rotated90cw_view`` does
  381. not make it any slower.
  382. Another example: suppose we want to compute the gradient of the N-th
  383. channel of a color image. Here is how to do that:
  384. .. code-block:: cpp
  385. template <typename SrcView, typename DstView>
  386. void nth_channel_x_gradient(const SrcView& src, int n, const DstView& dst)
  387. {
  388. x_gradient(nth_channel_view(src, n), dst);
  389. }
  390. ``nth_channel_view`` is a view transformation function that takes any
  391. view and returns a single-channel (grayscale) view of its N-th
  392. channel. For interleaved RGB view, for example, the returned view is
  393. a step view - a view whose horizontal iterator skips over two channels
  394. when incremented. If applied on a planar RGB view, the returned type
  395. is a simple grayscale view whose horizontal iterator is a C pointer.
  396. Image view transformation functions can be piped together. For
  397. example, to compute the y gradient of the second channel of the even
  398. pixels in the view, use:
  399. .. code-block:: cpp
  400. y_gradient(subsampled_view(nth_channel_view(src, 1), 2,2), dst);
  401. GIL can sometimes simplify piped views. For example, two nested
  402. subsampled views (views that skip over pixels in X and in Y) can be
  403. represented as a single subsampled view whose step is the product of
  404. the steps of the two views.
  405. 1D pixel iterators
  406. ------------------
  407. Let's go back to ``x_gradient`` one more time. Many image view
  408. algorithms apply the same operation for each pixel and GIL provides an
  409. abstraction to handle them. However, our algorithm has an unusual
  410. access pattern, as it skips the first and the last column. It would be
  411. nice and instructional to see how we can rewrite it in canonical
  412. form. The way to do that in GIL is to write a version that works for
  413. every pixel, but apply it only on the subimage that excludes the first
  414. and last column:
  415. .. code-block:: cpp
  416. void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
  417. {
  418. for (int y=0; y < src.height(); ++y)
  419. {
  420. gray8c_view_t::x_iterator src_it = src.row_begin(y);
  421. gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
  422. for (int x = 0; x < src.width(); ++x)
  423. dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
  424. }
  425. }
  426. void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
  427. {
  428. assert(src.width()>=2);
  429. x_gradient_unguarded(subimage_view(src, 1, 0, src.width()-2, src.height()),
  430. subimage_view(dst, 1, 0, src.width()-2, src.height()));
  431. }
  432. ``subimage_view`` is another example of a GIL view transformation
  433. function. It takes a source view and a rectangular region (in this
  434. case, defined as x_min,y_min,width,height) and returns a view
  435. operating on that region of the source view. The above implementation
  436. has no measurable performance degradation from the version that
  437. operates on the original views.
  438. Now that ``x_gradient_unguarded`` operates on every pixel, we can
  439. rewrite it more compactly:
  440. .. code-block:: cpp
  441. void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
  442. {
  443. gray8c_view_t::iterator src_it = src.begin();
  444. for (gray8s_view_t::iterator dst_it = dst.begin(); dst_it!=dst.end(); ++dst_it, ++src_it)
  445. *dst_it = (src_it.x()[-1] - src_it.x()[1]) / 2;
  446. }
  447. GIL image views provide ``begin()`` and ``end()`` methods that return
  448. one dimensional pixel iterators which iterate over each pixel in the
  449. view, left to right and top to bottom. They do a proper "carriage
  450. return" - they skip any unused bytes at the end of a row. As such,
  451. they are slightly suboptimal, because they need to keep track of their
  452. current position with respect to the end of the row. Their increment
  453. operator performs one extra check (are we at the end of the row?), a
  454. check that is avoided if two nested loops are used instead. These
  455. iterators have a method ``x()`` which returns the more lightweight
  456. horizontal iterator that we used previously. Horizontal iterators have
  457. no notion of the end of rows. In this case, the horizontal iterators
  458. are raw C pointers. In our example, we must use the horizontal
  459. iterators to access the two neighbors properly, since they could
  460. reside outside the image view.
  461. STL Equivalent Algorithms
  462. -------------------------
  463. GIL provides STL equivalents of many algorithms. For example,
  464. ``std::transform`` is an STL algorithm that sets each element in a
  465. destination range the result of a generic function taking the
  466. corresponding element of the source range. In our example, we want to
  467. assign to each destination pixel the value of the half-difference of
  468. the horizontal neighbors of the corresponding source pixel. If we
  469. abstract that operation in a function object, we can use GIL's
  470. ``transform_pixel_positions`` to do that:
  471. .. code-block:: cpp
  472. struct half_x_difference
  473. {
  474. int operator()(const gray8c_loc_t& src_loc) const
  475. {
  476. return (src_loc.x()[-1] - src_loc.x()[1]) / 2;
  477. }
  478. };
  479. void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
  480. {
  481. transform_pixel_positions(src, dst, half_x_difference());
  482. }
  483. GIL provides the algorithms ``for_each_pixel`` and
  484. ``transform_pixels`` which are image view equivalents of STL
  485. ``std::for_each`` and ``std::transform``. It also provides
  486. ``for_each_pixel_position`` and ``transform_pixel_positions``, which
  487. instead of references to pixels, pass to the generic function pixel
  488. locators. This allows for more powerful functions that can use the
  489. pixel neighbors through the passed locators. GIL algorithms iterate
  490. through the pixels using the more efficient two nested loops (as
  491. opposed to the single loop using 1-D iterators)
  492. Color Conversion
  493. ----------------
  494. Instead of computing the gradient of each color plane of an image, we
  495. often want to compute the gradient of the luminosity. In other words,
  496. we want to convert the color image to grayscale and compute the
  497. gradient of the result. Here how to compute the luminosity gradient of
  498. a 32-bit float RGB image:
  499. .. code-block:: cpp
  500. void x_gradient_rgb_luminosity(rgb32fc_view_t const& src, gray8s_view_t const& dst)
  501. {
  502. x_gradient(color_converted_view<gray8_pixel_t>(src), dst);
  503. }
  504. ``color_converted_view`` is a GIL view transformation function that
  505. takes any image view and returns a view in a target color space and
  506. channel depth (specified as template parameters). In our example, it
  507. constructs an 8-bit integer grayscale view over 32-bit float RGB
  508. pixels. Like all other view transformation functions,
  509. ``color_converted_view`` is very fast and shallow. It doesn't copy the
  510. data or perform any color conversion. Instead it returns a view that
  511. performs color conversion every time its pixels are accessed.
  512. In the generic version of this algorithm we might like to convert the
  513. color space to grayscale, but keep the channel depth the same. We do
  514. that by constructing the type of a GIL grayscale pixel with the same
  515. channel as the source, and color convert to that pixel type:
  516. .. code-block:: cpp
  517. template <typename SrcView, typename DstView>
  518. void x_luminosity_gradient(SrcView const& src, DstView const& dst)
  519. {
  520. using gray_pixel_t = pixel<typename channel_type<SrcView>::type, gray_layout_t>;
  521. x_gradient(color_converted_view<gray_pixel_t>(src), dst);
  522. }
  523. When the destination color space and channel type happens to be the
  524. same as the source one, color conversion is unnecessary. GIL detects
  525. this case and avoids calling the color conversion code at all -
  526. i.e. ``color_converted_view`` returns back the source view unchanged.
  527. Image
  528. -----
  529. The above example has a performance problem - ``x_gradient``
  530. dereferences most source pixels twice, which will cause the above code
  531. to perform color conversion twice. Sometimes it may be more efficient
  532. to copy the color converted image into a temporary buffer and use it
  533. to compute the gradient - that way color conversion is invoked once
  534. per pixel. Using our non-generic version we can do it like this:
  535. .. code-block:: cpp
  536. void x_luminosity_gradient(rgb32fc_view_t const& src, gray8s_view_t const& dst)
  537. {
  538. gray8_image_t ccv_image(src.dimensions());
  539. copy_pixels(color_converted_view<gray8_pixel_t>(src), view(ccv_image));
  540. x_gradient(const_view(ccv_image), dst);
  541. }
  542. First we construct an 8-bit grayscale image with the same dimensions
  543. as our source. Then we copy a color-converted view of the source into
  544. the temporary image. Finally we use a read-only view of the temporary
  545. image in our ``x_gradient algorithm``. As the example shows, GIL
  546. provides global functions ``view`` and ``const_view`` that take an
  547. image and return a mutable or an immutable view of its pixels.
  548. Creating a generic version of the above is a bit trickier:
  549. .. code-block:: cpp
  550. template <typename SrcView, typename DstView>
  551. void x_luminosity_gradient(const SrcView& src, const DstView& dst)
  552. {
  553. using d_channel_t = typename channel_type<DstView>::type;
  554. using channel_t = typename channel_convert_to_unsigned<d_channel_t>::type;
  555. using gray_pixel_t = pixel<channel_t, gray_layout_t>;
  556. using gray_image_t = image<gray_pixel_t, false>;
  557. gray_image_t ccv_image(src.dimensions());
  558. copy_pixels(color_converted_view<gray_pixel_t>(src), view(ccv_image));
  559. x_gradient(const_view(ccv_image), dst);
  560. }
  561. First we use the ``channel_type`` metafunction to get the channel type
  562. of the destination view. A metafunction is a function operating on
  563. types. In GIL metafunctions are class templates (declared with
  564. ``struct`` type specifier) which take their parameters as template
  565. parameters and return their result in a nested typedef called
  566. ``type``. In this case, ``channel_type`` is a unary metafunction which
  567. in this example is called with the type of an image view and returns
  568. the type of the channel associated with that image view.
  569. GIL constructs that have an associated pixel type, such as pixels,
  570. pixel iterators, locators, views and images, all model
  571. ``PixelBasedConcept``, which means that they provide a set of
  572. metafunctions to query the pixel properties, such as ``channel_type``,
  573. ``color_space_type``, ``channel_mapping_type``, and ``num_channels``.
  574. After we get the channel type of the destination view, we use another
  575. metafunction to remove its sign (if it is a signed integral type) and
  576. then use it to generate the type of a grayscale pixel. From the pixel
  577. type we create the image type. GIL's image class is specialized over
  578. the pixel type and a boolean indicating whether the image should be
  579. planar or interleaved. Single-channel (grayscale) images in GIL must
  580. always be interleaved. There are multiple ways of constructing types
  581. in GIL. Instead of instantiating the classes directly we could have
  582. used type factory metafunctions. The following code is equivalent:
  583. .. code-block:: cpp
  584. template <typename SrcView, typename DstView>
  585. void x_luminosity_gradient(SrcView const& src, DstView const& dst)
  586. {
  587. typedef typename channel_type<DstView>::type d_channel_t;
  588. typedef typename channel_convert_to_unsigned<d_channel_t>::type channel_t;
  589. typedef typename image_type<channel_t, gray_layout_t>::type gray_image_t;
  590. typedef typename gray_image_t::value_type gray_pixel_t;
  591. gray_image_t ccv_image(src.dimensions());
  592. copy_and_convert_pixels(src, view(ccv_image));
  593. x_gradient(const_view(ccv_image), dst);
  594. }
  595. GIL provides a set of metafunctions that generate GIL types -
  596. ``image_type`` is one such meta-function that constructs the type of
  597. an image from a given channel type, color layout, and
  598. planar/interleaved option (the default is interleaved). There are also
  599. similar meta-functions to construct the types of pixel references,
  600. iterators, locators and image views. GIL also has metafunctions
  601. ``derived_pixel_reference_type``, ``derived_iterator_type``,
  602. ``derived_view_type`` and ``derived_image_type`` that construct the
  603. type of a GIL construct from a given source one by changing one or
  604. more properties of the type and keeping the rest.
  605. From the image type we can use the nested typedef ``value_type`` to
  606. obtain the type of a pixel. GIL images, image views and locators have
  607. nested typedefs ``value_type`` and ``reference`` to obtain the type of
  608. the pixel and a reference to the pixel. If you have a pixel iterator,
  609. you can get these types from its ``iterator_traits``. Note also the
  610. algorithm ``copy_and_convert_pixels``, which is an abbreviated version
  611. of ``copy_pixels`` with a color converted source view.
  612. Virtual Image Views
  613. -------------------
  614. So far we have been dealing with images that have pixels stored in
  615. memory. GIL allows you to create an image view of an arbitrary image,
  616. including a synthetic function. To demonstrate this, let us create a
  617. view of the Mandelbrot set. First, we need to create a function
  618. object that computes the value of the Mandelbrot set at a given
  619. location (x,y) in the image:
  620. .. code-block:: cpp
  621. // models PixelDereferenceAdaptorConcept
  622. struct mandelbrot_fn
  623. {
  624. typedef point<ptrdiff_t> point_t;
  625. typedef mandelbrot_fn const_t;
  626. typedef gray8_pixel_t value_type;
  627. typedef value_type reference;
  628. typedef value_type const_reference;
  629. typedef point_t argument_type;
  630. typedef reference result_type;
  631. static bool constexpr is_mutable = false;
  632. mandelbrot_fn() {}
  633. mandelbrot_fn(const point_t& sz) : _img_size(sz) {}
  634. result_type operator()(const point_t& p) const
  635. {
  636. // normalize the coords to (-2..1, -1.5..1.5)
  637. double t=get_num_iter(point<double>(p.x/(double)_img_size.x*3-2, p.y/(double)_img_size.y*3-1.5f));
  638. return value_type((bits8)(pow(t,0.2)*255)); // raise to power suitable for viewing
  639. }
  640. private:
  641. point_t _img_size;
  642. double get_num_iter(const point<double>& p) const
  643. {
  644. point<double> Z(0,0);
  645. for (int i=0; i<100; ++i) // 100 iterations
  646. {
  647. Z = point<double>(Z.x*Z.x - Z.y*Z.y + p.x, 2*Z.x*Z.y + p.y);
  648. if (Z.x*Z.x + Z.y*Z.y > 4)
  649. return i/(double)100;
  650. }
  651. return 0;
  652. }
  653. };
  654. We can now use GIL's ``virtual_2d_locator`` with this function object
  655. to construct a Mandelbrot view of size 200x200 pixels:
  656. .. code-block:: cpp
  657. typedef mandelbrot_fn::point_t point_t;
  658. typedef virtual_2d_locator<mandelbrot_fn,false> locator_t;
  659. typedef image_view<locator_t> my_virt_view_t;
  660. point_t dims(200,200);
  661. // Construct a Mandelbrot view with a locator, taking top-left corner (0,0) and step (1,1)
  662. my_virt_view_t mandel(dims, locator_t(point_t(0,0), point_t(1,1), mandelbrot_fn(dims)));
  663. We can treat the synthetic view just like a real one. For example,
  664. let's invoke our ``x_gradient`` algorithm to compute the gradient of
  665. the 90-degree rotated view of the Mandelbrot set and save the original
  666. and the result:
  667. .. code-block:: cpp
  668. gray8s_image_t img(dims);
  669. x_gradient(rotated90cw_view(mandel), view(img));
  670. // Save the Mandelbrot set and its 90-degree rotated gradient (jpeg cannot save signed char; must convert to unsigned char)
  671. jpeg_write_view("mandel.jpg",mandel);
  672. jpeg_write_view("mandel_grad.jpg",color_converted_view<gray8_pixel_t>(const_view(img)));
  673. Here is what the two files look like:
  674. .. image:: ../images/mandel.jpg
  675. Run-Time Specified Images and Image Views
  676. -----------------------------------------
  677. So far we have created a generic function that computes the image
  678. gradient of an image view template specialization. Sometimes,
  679. however, the properties of an image view, such as its color space and
  680. channel depth, may not be available at compile time. GIL's
  681. ``dynamic_image`` extension allows for working with GIL constructs
  682. that are specified at run time, also called _variants_. GIL provides
  683. models of a run-time instantiated image, ``any_image``, and a run-time
  684. instantiated image view, ``any_image_view``. The mechanisms are in
  685. place to create other variants, such as ``any_pixel``,
  686. ``any_pixel_iterator``, etc. Most of GIL's algorithms and all of the
  687. view transformation functions also work with run-time instantiated
  688. image views and binary algorithms, such as ``copy_pixels`` can have
  689. either or both arguments be variants.
  690. Lets make our ``x_luminosity_gradient`` algorithm take a variant image
  691. view. For simplicity, let's assume that only the source view can be a
  692. variant. (As an example of using multiple variants, see GIL's image
  693. view algorithm overloads taking multiple variants.)
  694. First, we need to make a function object that contains the templated
  695. destination view and has an application operator taking a templated
  696. source view:
  697. .. code-block:: cpp
  698. #include <boost/gil/extension/dynamic_image/dynamic_image_all.hpp>
  699. template <typename DstView>
  700. struct x_gradient_obj
  701. {
  702. typedef void result_type; // required typedef
  703. const DstView& _dst;
  704. x_gradient_obj(const DstView& dst) : _dst(dst) {}
  705. template <typename SrcView>
  706. void operator()(const SrcView& src) const { x_luminosity_gradient(src, _dst); }
  707. };
  708. The second step is to provide an overload of ``x_luminosity_gradient`` that
  709. takes image view variant and calls GIL's ``apply_operation`` passing it the
  710. function object:
  711. .. code-block:: cpp
  712. template <typename SrcViews, typename DstView>
  713. void x_luminosity_gradient(const any_image_view<SrcViews>& src, const DstView& dst)
  714. {
  715. apply_operation(src, x_gradient_obj<DstView>(dst));
  716. }
  717. ``any_image_view<SrcViews>`` is the image view variant. It is
  718. templated over ``SrcViews``, an enumeration of all possible view types
  719. the variant can take. ``src`` contains inside an index of the
  720. currently instantiated type, as well as a block of memory containing
  721. the instance. ``apply_operation`` goes through a switch statement
  722. over the index, each case of which casts the memory to the correct
  723. view type and invokes the function object with it. Invoking an
  724. algorithm on a variant has the overhead of one switch
  725. statement. Algorithms that perform an operation for each pixel in an
  726. image view have practically no performance degradation when used with
  727. a variant.
  728. Here is how we can construct a variant and invoke the algorithm:
  729. .. code-block:: cpp
  730. #include <boost/mpl/vector.hpp>
  731. #include <boost/gil/extension/io/jpeg_dynamic_io.hpp>
  732. typedef mpl::vector<gray8_image_t, gray16_image_t, rgb8_image_t, rgb16_image_t> my_img_types;
  733. any_image<my_img_types> runtime_image;
  734. jpeg_read_image("input.jpg", runtime_image);
  735. gray8s_image_t gradient(runtime_image.dimensions());
  736. x_luminosity_gradient(const_view(runtime_image), view(gradient));
  737. jpeg_write_view("x_gradient.jpg", color_converted_view<gray8_pixel_t>(const_view(gradient)));
  738. In this example, we create an image variant that could be 8-bit or
  739. 16-bit RGB or grayscale image. We then use GIL's I/O extension to load
  740. the image from file in its native color space and channel depth. If
  741. none of the allowed image types matches the image on disk, an
  742. exception will be thrown. We then construct a 8 bit signed
  743. (i.e. ``char``) image to store the gradient and invoke ``x_gradient``
  744. on it. Finally we save the result into another file. We save the view
  745. converted to 8-bit unsigned, because JPEG I/O does not support signed
  746. char.
  747. Note how free functions and methods such as ``jpeg_read_image``,
  748. ``dimensions``, ``view`` and ``const_view`` work on both templated and
  749. variant types. For templated images ``view(img)`` returns a templated
  750. view, whereas for image variants it returns a view variant. For
  751. example, the return type of ``view(runtime_image)`` is
  752. ``any_image_view<Views>`` where ``Views`` enumerates four views
  753. corresponding to the four image types. ``const_view(runtime_image)``
  754. returns a ``any_image_view`` of the four read-only view types, etc.
  755. A warning about using variants: instantiating an algorithm with a
  756. variant effectively instantiates it with every possible type the
  757. variant can take. For binary algorithms, the algorithm is
  758. instantiated with every possible combination of the two input types!
  759. This can take a toll on both the compile time and the executable size.
  760. Conclusion
  761. ----------
  762. This tutorial provides a glimpse at the challenges associated with
  763. writing generic and efficient image processing algorithms in GIL. We
  764. have taken a simple algorithm and shown how to make it work with image
  765. representations that vary in bit depth, color space, ordering of the
  766. channels, and planar/interleaved structure. We have demonstrated that
  767. the algorithm can work with fully abstracted virtual images, and even
  768. images whose type is specified at run time. The associated video
  769. presentation also demonstrates that even for complex scenarios the
  770. generated assembly is comparable to that of a C version of the
  771. algorithm, hand-written for the specific image types.
  772. Yet, even for such a simple algorithm, we are far from making a fully
  773. generic and optimized code. In particular, the presented algorithms
  774. work on homogeneous images, i.e. images whose pixels have channels
  775. that are all of the same type. There are examples of images, such as a
  776. packed 565 RGB format, which contain channels of different
  777. types. While GIL provides concepts and algorithms operating on
  778. heterogeneous pixels, we leave the task of extending x_gradient as an
  779. exercise for the reader. Second, after computing the value of the
  780. gradient we are simply casting it to the destination channel
  781. type. This may not always be the desired operation. For example, if
  782. the source channel is a float with range [0..1] and the destination is
  783. unsigned char, casting the half-difference to unsigned char will
  784. result in either 0 or 1. Instead, what we might want to do is scale
  785. the result into the range of the destination channel. GIL's
  786. channel-level algorithms might be useful in such cases. For example,
  787. \p channel_convert converts between channels by linearly scaling the
  788. source channel value into the range of the destination channel.
  789. There is a lot to be done in improving the performance as
  790. well. Channel-level operations, such as the half-difference, could be
  791. abstracted out into atomic channel-level algorithms and performance
  792. overloads could be provided for concrete channel
  793. types. Processor-specific operations could be used, for example, to
  794. perform the operation over an entire row of pixels simultaneously, or
  795. the data could be pre-fetched. All of these optimizations can be
  796. realized as performance specializations of the generic
  797. algorithm. Finally, compilers, while getting better over time, are
  798. still failing to fully optimize generic code in some cases, such as
  799. failing to inline some functions or put some variables into
  800. registers. If performance is an issue, it might be worth trying your
  801. code with different compilers.