- Tutorial: Image Gradient
- ========================
- .. contents::
- :local:
- :depth: 1
- This comprehensive (and long) tutorial will walk you through an example of
- using GIL to compute the image gradients.
- We will start with some very simple and non-generic code and make it more
- generic as we go along. Let us start with a horizontal gradient and use the
- simplest possible approximation to a gradient - central difference.
- The gradient at pixel x can be approximated with the half-difference of its
- two neighboring pixels::
- D[x] = (I[x-1] - I[x+1]) / 2
- For simplicity, we will also ignore the boundary cases - the pixels along the
- edges of the image for which one of the neighbors is not defined. The focus of
- this document is how to use GIL, not how to create a good gradient generation
- algorithm.
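As a quick sanity check of the formula, here is a minimal plain C++ sketch, independent of GIL, that applies the half-difference to a single row held in a ``std::vector``. The function name ``half_difference_row`` is made up for this illustration, and the two border elements are left at zero, in line with the decision to ignore boundary cases.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // Hypothetical helper, not part of GIL: D[x] = (I[x-1] - I[x+1]) / 2
    // for interior elements; the two border elements stay 0.
    std::vector<signed char> half_difference_row(std::vector<unsigned char> const& row)
    {
        std::vector<signed char> out(row.size(), 0);
        for (std::size_t x = 1; x + 1 < row.size(); ++x)
        {
            // Convert to int first so the subtraction cannot wrap around.
            int const diff = int(row[x - 1]) - int(row[x + 1]);
            out[x] = static_cast<signed char>(diff / 2); // always within [-127, 127]
        }
        return out;
    }

For the row 10, 20, 40, 80 the interior results are (10 - 40) / 2 = -15 and (20 - 80) / 2 = -30, which is why the output channel has to be signed.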
- Interface and Glue Code
- -----------------------
- Let us start with an 8-bit unsigned grayscale image as the input and an 8-bit
- signed grayscale image as the output.
- Here is what the interface to our algorithm looks like:
- .. code-block:: cpp
- #include <boost/gil.hpp>
- using namespace boost::gil;
- void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- assert(src.dimensions() == dst.dimensions());
- ... // compute the gradient
- }
- ``gray8c_view_t`` is the type of the source image view - an 8-bit grayscale
- view, whose pixels are read-only (denoted by the "c").
- The output is a grayscale view with an 8-bit signed (denoted by the "s")
- integer channel type. See Appendix 1 for the complete convention GIL uses to
- name concrete types.
- GIL makes a distinction between an image and an image view.
- A GIL **image view** is a shallow, lightweight view of a rectangular grid of
- pixels. It provides access to the pixels but does not own the pixels.
- Copy-constructing a view does not deep-copy the pixels. Image views do not
- propagate their constness to the pixels and should always be taken by a const
- reference. Whether a view is mutable or read-only (immutable) is a property of
- the view type.
- A GIL **image**, on the other hand, is a view with associated ownership.
- It is a container of pixels; its constructor/destructor allocates/deallocates
- the pixels, its copy-constructor performs deep-copy of the pixels and its
- ``operator==`` performs deep-compare of the pixels. Images also propagate
- their constness to their pixels - a constant reference to an image will not
- allow for modifying its pixels.
- Most GIL algorithms operate on image views; images are rarely
- needed. GIL's design is very similar to that of the STL. The STL
- equivalent of GIL's image is a container, like ``std::vector``,
- whereas GIL's image view corresponds to STL range, which is often
- represented with a pair of iterators. STL algorithms operate on
- ranges, just like GIL algorithms operate on image views.
- GIL's image views can be constructed from raw data - the dimensions,
- the number of bytes per row and the pixels, which for chunky views are
- represented with one pointer. Here is how to provide the glue between
- your code and GIL:
- .. code-block:: cpp
- void ComputeXGradientGray8(
- unsigned char const* src_pixels, ptrdiff_t src_row_bytes,
- int w, int h,
- signed char* dst_pixels, ptrdiff_t dst_row_bytes)
- {
- gray8c_view_t src = interleaved_view(w, h, (gray8_pixel_t const*)src_pixels, src_row_bytes);
- gray8s_view_t dst = interleaved_view(w, h, (gray8s_pixel_t*)dst_pixels, dst_row_bytes);
- x_gradient(src, dst);
- }
- This glue code is very fast and views are lightweight - in the above example
- the views have a size of 16 bytes. They consist of a pointer to the top left
- pixel and three integers - the width, height, and number of bytes per row.
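To make the "pointer plus three integers" description concrete, the following plain C++ sketch models such a view by hand. The struct ``raw_gray8_view`` and its members are hypothetical and only illustrate the idea; GIL's real view types are richer and are not implemented this way.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>

    // Hypothetical stand-in for a read-only 8-bit grayscale view.
    struct raw_gray8_view
    {
        unsigned char const* first; // top-left pixel
        int width;
        int height;
        std::ptrdiff_t row_bytes;   // bytes between the starts of consecutive rows

        // Locating a pixel costs one multiplication and one addition.
        unsigned char const& operator()(int x, int y) const
        {
            assert(x >= 0 && x < width && y >= 0 && y < height);
            return *(first + y * row_bytes + x);
        }
    };

Copying such an object copies only the pointer and the integers, never the pixels. Because ``row_bytes`` may be larger than ``width``, padded rows are handled without any extra work.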
- First Implementation
- --------------------
- Focusing on simplicity at the expense of speed, we can compute the horizontal
- gradient like this:
- .. code-block:: cpp
- void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- for (int y = 0; y < src.height(); ++y)
- for (int x = 1; x < src.width() - 1; ++x)
- dst(x, y) = (src(x-1, y) - src(x+1, y)) / 2;
- }
- We use image view's ``operator(x,y)`` to get a reference to the pixel at a
- given location and we set it to the half-difference of its left and right
- neighbors. ``operator()`` returns a reference to a grayscale pixel.
- A grayscale pixel is convertible to its channel type (``unsigned char`` for
- ``src``) and it can be copy-constructed from a channel.
- (This is only true for grayscale pixels).
- While the above code is easy to read, it is not very fast, because the binary
- ``operator()`` computes the location of the pixel in a 2D grid, which involves
- addition and multiplication. Here is a faster version of the above:
- .. code-block:: cpp
- void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- for (int y = 0; y < src.height(); ++y)
- {
- gray8c_view_t::x_iterator src_it = src.row_begin(y);
- gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
- for (int x=1; x < src.width() - 1; ++x)
- dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
- }
- }
- We use pixel iterators initialized at the beginning of each row. GIL's
- iterators are Random Access Traversal iterators. If you are not
- familiar with random access iterators, think of them as if they were
- pointers. In fact, in the above example the two iterator types are raw
- C pointers and their ``operator[]`` is a fast pointer indexing
- operator.
- The code to compute gradient in the vertical direction is very
- similar:
- .. code-block:: cpp
- void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- for (int x = 0; x < src.width(); ++x)
- {
- gray8c_view_t::y_iterator src_it = src.col_begin(x);
- gray8s_view_t::y_iterator dst_it = dst.col_begin(x);
- for (int y = 1; y < src.height() - 1; ++y)
- dst_it[y] = (src_it[y-1] - src_it[y+1]) / 2;
- }
- }
- Instead of looping over the rows, we loop over each column and create a
- ``y_iterator``, an iterator moving vertically. In this case a simple pointer
- cannot be used because the distance between two adjacent pixels equals the
- number of bytes in each row of the image. Here GIL uses a special step
- iterator class whose size is 8 bytes - it contains a raw C pointer and a step.
- Its ``operator[]`` multiplies the index by its step.
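The idea behind a step iterator can be sketched in a few lines of plain C++. The type ``step_ptr`` below is a hypothetical illustration for 8-bit pixels, not GIL's actual class; it only shows how a pointer and a step are enough to walk down a column.

.. code-block:: cpp

    #include <cstddef>

    // Hypothetical vertical iterator over 8-bit pixels: a pointer plus a step.
    struct step_ptr
    {
        unsigned char const* p; // current pixel
        std::ptrdiff_t step;    // bytes between vertically adjacent pixels

        // Indexing multiplies the index by the step.
        unsigned char const& operator[](std::ptrdiff_t i) const { return *(p + i * step); }

        // Incrementing moves one row down.
        step_ptr& operator++() { p += step; return *this; }
    };

With a row size of 4 bytes, ``col[2]`` reads the byte 8 positions after the current pixel, i.e. the pixel two rows below it in the same column.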
- The above version of ``y_gradient``, however, is much slower (easily an order
- of magnitude slower) than ``x_gradient`` because of the memory access pattern;
- traversing an image vertically results in lots of cache misses. A much more
- efficient and cache-friendly version will iterate over the columns in the inner
- loop:
- .. code-block:: cpp
- void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- for (int y = 1; y < src.height() - 1; ++y)
- {
- gray8c_view_t::x_iterator src1_it = src.row_begin(y-1);
- gray8c_view_t::x_iterator src2_it = src.row_begin(y+1);
- gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
- for (int x = 0; x < src.width(); ++x)
- {
- *dst_it = ((*src1_it) - (*src2_it)) / 2;
- ++dst_it;
- ++src1_it;
- ++src2_it;
- }
- }
- }
- This sample code also shows an alternative way of using pixel iterators -
- instead of ``operator[]`` one could use increments and dereferences.
- Using Locators
- --------------
- Unfortunately this cache-friendly version requires the extra hassle of
- maintaining two separate iterators in the source view. For every pixel, we
- want to access its neighbors above and below it. Such relative access can be
- done with GIL locators:
- .. code-block:: cpp
- void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
- for (int y = 1; y < src.height() - 1; ++y)
- {
- gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
- for (int x = 0; x < src.width(); ++x)
- {
- (*dst_it) = (src_loc(0,-1) - src_loc(0,1)) / 2;
- ++dst_it;
- ++src_loc.x(); // each dimension can be advanced separately
- }
- src_loc+=point<std::ptrdiff_t>(-src.width(), 1); // carriage return
- }
- }
- The first line creates a locator pointing to the first pixel of the
- second row of the source view. A GIL pixel locator is very similar to
- an iterator, except that it can move both horizontally and
- vertically. ``src_loc.x()`` and ``src_loc.y()`` return references to a
- horizontal and a vertical iterator respectively, which can be used to
- move the locator along the desired dimension, as shown
- above. Additionally, the locator can be advanced in both dimensions
- simultaneously using its ``operator+=`` and ``operator-=``. Similar to
- image views, locators provide binary ``operator()`` which returns a
- reference to a pixel with a relative offset to the current locator
- position. For example, ``src_loc(0,1)`` returns a reference to the
- neighbor below the current pixel. Locators are very lightweight
- objects - in the above example the locator has a size of 8 bytes - it
- consists of a raw pointer to the current pixel and an int indicating
- the number of bytes from one row to the next (which is the step when
- moving vertically). The call to ``++src_loc.x()`` corresponds to a
- single C pointer increment. However, the example above performs more
- computations than necessary. The code ``src_loc(0,1)`` has to compute
- the offset of the pixel in two dimensions, which is slow. Notice
- though that the offset of the two neighbors is the same, regardless of
- the pixel location. To improve the performance, GIL can cache and
- reuse this offset::
- void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
- gray8c_view_t::xy_locator::cached_location_t above = src_loc.cache_location(0,-1);
- gray8c_view_t::xy_locator::cached_location_t below = src_loc.cache_location(0, 1);
- for (int y = 1; y < src.height() - 1; ++y)
- {
- gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
- for (int x = 0; x < src.width(); ++x)
- {
- (*dst_it) = (src_loc[above] - src_loc[below]) / 2;
- ++dst_it;
- ++src_loc.x();
- }
- src_loc+=point<std::ptrdiff_t>(-src.width(), 1);
- }
- }
- In this example ``src_loc[above]`` corresponds to a fast pointer indexing
- operation and the code is efficient.
- Creating a Generic Version of GIL Algorithms
- --------------------------------------------
- Let us make our ``x_gradient`` more generic. It should work with any image
- views, as long as they have the same number of channels. The gradient
- operation is to be computed for each channel independently.
- Here is what the new interface looks like:
- .. code-block:: cpp
- template <typename SrcView, typename DstView>
- void x_gradient(const SrcView& src, const DstView& dst)
- {
- gil_function_requires<ImageViewConcept<SrcView> >();
- gil_function_requires<MutableImageViewConcept<DstView> >();
- gil_function_requires
- <
- ColorSpacesCompatibleConcept
- <
- typename color_space_type<SrcView>::type,
- typename color_space_type<DstView>::type
- >
- >();
- ... // compute the gradient
- }
- The new algorithm now takes the types of the input and output image
- views as template parameters. That allows using both built-in GIL
- image views, as well as any user-defined image view classes. The
- first three lines are optional; they use ``boost::concept_check`` to
- ensure that the two arguments are valid GIL image views, that the
- second one is mutable and that their color spaces are compatible
- (i.e. have the same set of channels).
- GIL does not require using its own built-in constructs. You are free
- to use your own channels, color spaces, iterators, locators, views and
- images. However, to work with the rest of GIL they have to satisfy a
- set of requirements; in other words, they have to *model* the
- corresponding GIL *concept*. GIL's concepts are defined in the user
- guide.
- One of the biggest drawbacks of using templates and generic
- programming in C++ is that compile errors can be very difficult to
- comprehend. This is a side-effect of the lack of early type
- checking - a generic argument may not satisfy the requirements of a
- function, but the incompatibility may be triggered deep into a nested
- call, in code unfamiliar and hardly related to the problem. GIL uses
- ``boost::concept_check`` to mitigate this problem. The above three
- lines of code check whether the template parameters are valid models
- of their corresponding concepts. If a model is incorrect, the compile
- error will be inside ``gil_function_requires``, which is much closer
- to the problem and easier to track. Furthermore, such checks get
- compiled out and have zero performance overhead. The disadvantage of
- using concept checks is the sometimes severe impact they have on
- compile time. This is why GIL performs concept checks only in debug
- mode, and only if ``BOOST_GIL_USE_CONCEPT_CHECK`` is defined (off by
- default).
- The body of the generic function is very similar to that of the
- concrete one. The biggest difference is that we need to loop over the
- channels of the pixel and compute the gradient for each channel:
- .. code-block:: cpp
- template <typename SrcView, typename DstView>
- void x_gradient(const SrcView& src, const DstView& dst)
- {
- for (int y=0; y < src.height(); ++y)
- {
- typename SrcView::x_iterator src_it = src.row_begin(y);
- typename DstView::x_iterator dst_it = dst.row_begin(y);
- for (int x = 1; x < src.width() - 1; ++x)
- for (int c = 0; c < num_channels<SrcView>::value; ++c)
- dst_it[x][c] = (src_it[x-1][c]- src_it[x+1][c]) / 2;
- }
- }
- Having an explicit loop for each channel could be a performance problem.
- GIL allows us to abstract out such per-channel operations:
- .. code-block:: cpp
- template <typename Out>
- struct halfdiff_cast_channels
- {
- template <typename T> Out operator()(T const& in1, T const& in2) const
- {
- return Out((in1 - in2) / 2);
- }
- };
- template <typename SrcView, typename DstView>
- void x_gradient(const SrcView& src, const DstView& dst)
- {
- typedef typename channel_type<DstView>::type dst_channel_t;
- for (int y=0; y < src.height(); ++y)
- {
- typename SrcView::x_iterator src_it = src.row_begin(y);
- typename DstView::x_iterator dst_it = dst.row_begin(y);
- for (int x=1; x < src.width() - 1; ++x)
- {
- static_transform(src_it[x-1], src_it[x+1], dst_it[x],
- halfdiff_cast_channels<dst_channel_t>());
- }
- }
- }
- The ``static_transform`` is an example of a channel-level GIL algorithm.
- Other such algorithms are ``static_generate``, ``static_fill`` and
- ``static_for_each``. They are the channel-level equivalents of STL
- ``generate``, ``transform``, ``fill`` and ``for_each`` respectively.
- GIL channel algorithms use static recursion to unroll the loops; they never
- loop over the channels explicitly.
- Note that sometimes modern compilers (at least Visual Studio 8) already unroll
- channel-level loops, such as the one above. However, another advantage of
- using GIL's channel-level algorithms is that they pair the channels
- semantically, not based on their order in memory. For example, the above
- example will properly match an RGB source with a BGR destination.
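The "static recursion" mentioned above can be illustrated with a small plain C++ sketch. The names ``static_transform_n`` and ``halfdiff`` are hypothetical, and the sketch deliberately simplifies: it pairs channels by array index, whereas GIL's algorithms pair them semantically according to the color layout.

.. code-block:: cpp

    #include <cstddef>

    // Hypothetical compile-time recursion over N channels; not GIL's implementation.
    template <std::size_t N>
    struct static_transform_n
    {
        template <typename In, typename Out, typename Op>
        static void apply(In const* a, In const* b, Out* out, Op op)
        {
            static_transform_n<N - 1>::apply(a, b, out, op); // channels 0 .. N-2
            out[N - 1] = op(a[N - 1], b[N - 1]);             // channel N-1
        }
    };

    // Recursion terminator: zero channels, nothing to do.
    template <>
    struct static_transform_n<0>
    {
        template <typename In, typename Out, typename Op>
        static void apply(In const*, In const*, Out*, Op) {}
    };

    // Per-channel operation: half-difference computed in int.
    struct halfdiff
    {
        int operator()(int a, int b) const { return (a - b) / 2; }
    };

A call such as ``static_transform_n<3>::apply(left, right, out, halfdiff())`` expands at compile time into three assignments, so no run-time loop over the channels remains.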
- Here is how we can use our generic version with images of different types:
- .. code-block:: cpp
- // Calling with 16-bit grayscale data
- void XGradientGray16_Gray32(
- unsigned short const* src_pixels, ptrdiff_t src_row_bytes,
- int w, int h,
- signed int* dst_pixels, ptrdiff_t dst_row_bytes)
- {
- gray16c_view_t src=interleaved_view(w, h, (gray16_pixel_t const*)src_pixels, src_row_bytes);
- gray32s_view_t dst=interleaved_view(w, h, (gray32s_pixel_t*)dst_pixels, dst_row_bytes);
- x_gradient(src,dst);
- }
- // Calling with 8-bit RGB data into 16-bit BGR
- void XGradientRGB8_BGR16(
- unsigned char const* src_pixels, ptrdiff_t src_row_bytes,
- int w, int h,
- signed short* dst_pixels, ptrdiff_t dst_row_bytes)
- {
- rgb8c_view_t src = interleaved_view(w, h, (rgb8_pixel_t const*)src_pixels, src_row_bytes);
- bgr16s_view_t dst = interleaved_view(w, h, (bgr16s_pixel_t*)dst_pixels, dst_row_bytes);
- x_gradient(src, dst);
- }
- // Either or both the source and the destination could be planar - the gradient code does not change
- void XGradientPlanarRGB16_RGB32(
- unsigned short const* src_r, unsigned short const* src_g, unsigned short const* src_b,
- ptrdiff_t src_row_bytes, int w, int h,
- signed int* dst_pixels, ptrdiff_t dst_row_bytes)
- {
- rgb16c_planar_view_t src = planar_rgb_view (w, h, src_r, src_g, src_b, src_row_bytes);
- rgb32s_view_t dst = interleaved_view(w, h,(rgb32s_pixel_t*)dst_pixels, dst_row_bytes);
- x_gradient(src,dst);
- }
- As these examples illustrate, both the source and the destination can be
- interleaved or planar, of any channel depth (assuming the source channel
- value is assignable to the destination channel), and of any compatible color spaces.
- GIL 2.1 can also natively represent images whose channels are not
- byte-aligned, such as a 6-bit RGB222 image or a 1-bit Gray1 image.
- GIL algorithms apply to these images natively. See the design guide or sample
- files for more on using such images.
- Image View Transformations
- --------------------------
- One way to compute the y-gradient is to rotate the image by 90 degrees,
- compute the x-gradient and rotate the result back.
- Here is how to do this in GIL:
- .. code-block:: cpp
- template <typename SrcView, typename DstView>
- void y_gradient(const SrcView& src, const DstView& dst)
- {
- x_gradient(rotated90ccw_view(src), rotated90ccw_view(dst));
- }
- ``rotated90ccw_view`` takes an image view and returns an image view
- representing 90-degrees counter-clockwise rotation of its input. It is
- an example of a GIL view transformation function. GIL provides a
- variety of transformation functions that can perform any axis-aligned
- rotation, transpose the view, flip it vertically or horizontally,
- extract a rectangular subimage, perform color conversion, subsample a
- view, etc. The view transformation functions are fast and shallow -
- they don't copy the pixels, they just change the "coordinate system"
- of accessing the pixels. ``rotated90ccw_view``, for example, returns a
- view whose horizontal iterators are the vertical iterators of the
- original view. The above code to compute ``y_gradient`` is slow
- because of the memory access pattern; using ``rotated90ccw_view`` does
- not make it any slower.
- Another example: suppose we want to compute the gradient of the N-th
- channel of a color image. Here is how to do that:
- .. code-block:: cpp
- template <typename SrcView, typename DstView>
- void nth_channel_x_gradient(const SrcView& src, int n, const DstView& dst)
- {
- x_gradient(nth_channel_view(src, n), dst);
- }
- ``nth_channel_view`` is a view transformation function that takes any
- view and returns a single-channel (grayscale) view of its N-th
- channel. For an interleaved RGB view, for example, the returned view is
- a step view - a view whose horizontal iterator skips over two channels
- when incremented. If applied on a planar RGB view, the returned type
- is a simple grayscale view whose horizontal iterator is a C pointer.
- Image view transformation functions can be piped together. For
- example, to compute the y gradient of the second channel of the even
- pixels in the view, use:
- .. code-block:: cpp
- y_gradient(subsampled_view(nth_channel_view(src, 1), 2,2), dst);
- GIL can sometimes simplify piped views. For example, two nested
- subsampled views (views that skip over pixels in X and in Y) can be
- represented as a single subsampled view whose step is the product of
- the steps of the two views.
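The remark about nested subsampled views follows from simple arithmetic, shown here in one dimension with plain C++. The type ``strided_row`` and the function ``subsample`` are hypothetical names for this sketch and are unrelated to GIL's ``subsampled_view``.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>

    // Hypothetical 1D strided view: element i lives at base[i * step].
    struct strided_row
    {
        int const* base;
        std::ptrdiff_t step;
        std::ptrdiff_t size;

        int operator[](std::ptrdiff_t i) const { return base[i * step]; }
    };

    // Subsampling a strided view yields another strided view; the steps multiply.
    strided_row subsample(strided_row const& v, std::ptrdiff_t k)
    {
        assert(k > 0);
        return strided_row{v.base, v.step * k, (v.size + k - 1) / k};
    }

Taking every second element and then every third element of the result is the same as taking every sixth element of the original row, so the nested view needs no more state than a single one.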
- 1D pixel iterators
- ------------------
- Let's go back to ``x_gradient`` one more time. Many image view
- algorithms apply the same operation for each pixel and GIL provides an
- abstraction to handle them. However, our algorithm has an unusual
- access pattern, as it skips the first and the last column. It would be
- nice and instructional to see how we can rewrite it in canonical
- form. The way to do that in GIL is to write a version that works for
- every pixel, but apply it only on the subimage that excludes the first
- and last column:
- .. code-block:: cpp
- void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- for (int y=0; y < src.height(); ++y)
- {
- gray8c_view_t::x_iterator src_it = src.row_begin(y);
- gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
- for (int x = 0; x < src.width(); ++x)
- dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
- }
- }
- void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- assert(src.width()>=2);
- x_gradient_unguarded(subimage_view(src, 1, 0, src.width()-2, src.height()),
- subimage_view(dst, 1, 0, src.width()-2, src.height()));
- }
- ``subimage_view`` is another example of a GIL view transformation
- function. It takes a source view and a rectangular region (in this
- case, defined as x_min,y_min,width,height) and returns a view
- operating on that region of the source view. The above implementation
- has no measurable performance degradation from the version that
- operates on the original views.
- Now that ``x_gradient_unguarded`` operates on every pixel, we can
- rewrite it more compactly:
- .. code-block:: cpp
- void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- gray8c_view_t::iterator src_it = src.begin();
- for (gray8s_view_t::iterator dst_it = dst.begin(); dst_it!=dst.end(); ++dst_it, ++src_it)
- *dst_it = (src_it.x()[-1] - src_it.x()[1]) / 2;
- }
- GIL image views provide ``begin()`` and ``end()`` methods that return
- one dimensional pixel iterators which iterate over each pixel in the
- view, left to right and top to bottom. They do a proper "carriage
- return" - they skip any unused bytes at the end of a row. As such,
- they are slightly suboptimal, because they need to keep track of their
- current position with respect to the end of the row. Their increment
- operator performs one extra check (are we at the end of the row?), a
- check that is avoided if two nested loops are used instead. These
- iterators have a method ``x()`` which returns the more lightweight
- horizontal iterator that we used previously. Horizontal iterators have
- no notion of the end of rows. In this case, the horizontal iterators
- are raw C pointers. In our example, we must use the horizontal
- iterators to access the two neighbors properly, since they could
- reside outside the image view.
- STL Equivalent Algorithms
- -------------------------
- GIL provides STL equivalents of many algorithms. For example,
- ``std::transform`` is an STL algorithm that sets each element in a
- destination range to the result of a generic function taking the
- corresponding element of the source range. In our example, we want to
- assign to each destination pixel the value of the half-difference of
- the horizontal neighbors of the corresponding source pixel. If we
- abstract that operation in a function object, we can use GIL's
- ``transform_pixel_positions`` to do that:
- .. code-block:: cpp
- struct half_x_difference
- {
- int operator()(const gray8c_loc_t& src_loc) const
- {
- return (src_loc.x()[-1] - src_loc.x()[1]) / 2;
- }
- };
- void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
- {
- transform_pixel_positions(src, dst, half_x_difference());
- }
- GIL provides the algorithms ``for_each_pixel`` and
- ``transform_pixels`` which are image view equivalents of STL
- ``std::for_each`` and ``std::transform``. It also provides
- ``for_each_pixel_position`` and ``transform_pixel_positions``, which
- instead of references to pixels, pass to the generic function pixel
- locators. This allows for more powerful functions that can use the
- pixel neighbors through the passed locators. GIL algorithms iterate
- through the pixels using the more efficient two nested loops (as
- opposed to the single loop using 1-D iterators).
- Color Conversion
- ----------------
- Instead of computing the gradient of each color plane of an image, we
- often want to compute the gradient of the luminosity. In other words,
- we want to convert the color image to grayscale and compute the
- gradient of the result. Here is how to compute the luminosity gradient of
- a 32-bit float RGB image:
- .. code-block:: cpp
- void x_gradient_rgb_luminosity(rgb32fc_view_t const& src, gray8s_view_t const& dst)
- {
- x_gradient(color_converted_view<gray8_pixel_t>(src), dst);
- }
- ``color_converted_view`` is a GIL view transformation function that
- takes any image view and returns a view in a target color space and
- channel depth (specified as template parameters). In our example, it
- constructs an 8-bit integer grayscale view over 32-bit float RGB
- pixels. Like all other view transformation functions,
- ``color_converted_view`` is very fast and shallow. It doesn't copy the
- data or perform any color conversion. Instead it returns a view that
- performs color conversion every time its pixels are accessed.
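Converting on access rather than up front can be sketched in plain C++ as follows. The types ``rgb8`` and ``lazy_gray_row`` are hypothetical, and the integer luma weights are an assumption (roughly Rec. 601) chosen for the illustration; GIL's own conversion formula may differ. The counter is there only to make the cost of repeated access visible.

.. code-block:: cpp

    #include <cstddef>

    struct rgb8 { unsigned char r, g, b; };

    // Hypothetical lazy grayscale accessor: converts on every read, stores nothing.
    struct lazy_gray_row
    {
        rgb8 const* pixels;
        mutable int conversions; // counts how often the conversion runs

        explicit lazy_gray_row(rgb8 const* p) : pixels(p), conversions(0) {}

        unsigned char operator[](std::ptrdiff_t i) const
        {
            ++conversions;
            rgb8 const& p = pixels[i];
            // Assumed weights 77/256, 150/256, 29/256; not taken from GIL.
            return static_cast<unsigned char>((77 * p.r + 150 * p.g + 29 * p.b) / 256);
        }
    };

Constructing the accessor does no work at all. Every read pays for one conversion, so reading the same pixel twice converts it twice.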
- In the generic version of this algorithm we might like to convert the
- color space to grayscale, but keep the channel depth the same. We do
- that by constructing the type of a GIL grayscale pixel with the same
- channel as the source, and color convert to that pixel type:
- .. code-block:: cpp
- template <typename SrcView, typename DstView>
- void x_luminosity_gradient(SrcView const& src, DstView const& dst)
- {
- using gray_pixel_t = pixel<typename channel_type<SrcView>::type, gray_layout_t>;
- x_gradient(color_converted_view<gray_pixel_t>(src), dst);
- }
- When the destination color space and channel type happen to be the
- same as the source one, color conversion is unnecessary. GIL detects
- this case and avoids calling the color conversion code at all -
- i.e. ``color_converted_view`` returns back the source view unchanged.
- Image
- -----
- The above example has a performance problem - ``x_gradient``
- dereferences most source pixels twice, which will cause the above code
- to perform color conversion twice. Sometimes it may be more efficient
- to copy the color converted image into a temporary buffer and use it
- to compute the gradient - that way color conversion is invoked once
- per pixel. Using our non-generic version we can do it like this:
- .. code-block:: cpp
- void x_luminosity_gradient(rgb32fc_view_t const& src, gray8s_view_t const& dst)
- {
- gray8_image_t ccv_image(src.dimensions());
- copy_pixels(color_converted_view<gray8_pixel_t>(src), view(ccv_image));
- x_gradient(const_view(ccv_image), dst);
- }
- First we construct an 8-bit grayscale image with the same dimensions
- as our source. Then we copy a color-converted view of the source into
- the temporary image. Finally we use a read-only view of the temporary
- image in our ``x_gradient`` algorithm. As the example shows, GIL
- provides global functions ``view`` and ``const_view`` that take an
- image and return a mutable or an immutable view of its pixels.
- Creating a generic version of the above is a bit trickier:
- .. code-block:: cpp
- template <typename SrcView, typename DstView>
- void x_luminosity_gradient(const SrcView& src, const DstView& dst)
- {
- using d_channel_t = typename channel_type<DstView>::type;
- using channel_t = typename channel_convert_to_unsigned<d_channel_t>::type;
- using gray_pixel_t = pixel<channel_t, gray_layout_t>;
- using gray_image_t = image<gray_pixel_t, false>;
- gray_image_t ccv_image(src.dimensions());
- copy_pixels(color_converted_view<gray_pixel_t>(src), view(ccv_image));
- x_gradient(const_view(ccv_image), dst);
- }
- First we use the ``channel_type`` metafunction to get the channel type
- of the destination view. A metafunction is a function operating on
- types. In GIL metafunctions are class templates (declared with
- ``struct`` type specifier) which take their parameters as template
- parameters and return their result in a nested typedef called
- ``type``. In this case, ``channel_type`` is a unary metafunction which
- in this example is called with the type of an image view and returns
- the type of the channel associated with that image view.
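For readers new to the idiom, here is a minimal metafunction written with the standard library only. The names ``element_type`` and ``unsigned_element_type`` are hypothetical and stand in for GIL's ``channel_type`` and ``channel_convert_to_unsigned``; the sketch shows the shape of the technique, not GIL's definitions.

.. code-block:: cpp

    #include <type_traits>
    #include <vector>

    // Hypothetical unary metafunction: maps a container type to its element type.
    template <typename Container>
    struct element_type
    {
        typedef typename Container::value_type type; // the "return value"
    };

    // Metafunctions compose: take the element type, then strip its sign.
    template <typename Container>
    struct unsigned_element_type
    {
        typedef typename std::make_unsigned<
            typename element_type<Container>::type>::type type;
    };

    // "Calling" a metafunction means naming its nested type.
    static_assert(std::is_same<unsigned_element_type<std::vector<short> >::type,
                               unsigned short>::value,
                  "short elements map to unsigned short");

Everything here is evaluated by the compiler, so a metafunction has no run-time cost.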
- GIL constructs that have an associated pixel type, such as pixels,
- pixel iterators, locators, views and images, all model
- ``PixelBasedConcept``, which means that they provide a set of
- metafunctions to query the pixel properties, such as ``channel_type``,
- ``color_space_type``, ``channel_mapping_type``, and ``num_channels``.
- After we get the channel type of the destination view, we use another
- metafunction to remove its sign (if it is a signed integral type) and
- then use it to generate the type of a grayscale pixel. From the pixel
- type we create the image type. GIL's image class is specialized over
- the pixel type and a boolean indicating whether the image should be
- planar or interleaved. Single-channel (grayscale) images in GIL must
- always be interleaved. There are multiple ways of constructing types
- in GIL. Instead of instantiating the classes directly we could have
- used type factory metafunctions. The following code is equivalent:
- .. code-block:: cpp
- template <typename SrcView, typename DstView>
- void x_luminosity_gradient(SrcView const& src, DstView const& dst)
- {
- typedef typename channel_type<DstView>::type d_channel_t;
- typedef typename channel_convert_to_unsigned<d_channel_t>::type channel_t;
- typedef typename image_type<channel_t, gray_layout_t>::type gray_image_t;
- typedef typename gray_image_t::value_type gray_pixel_t;
- gray_image_t ccv_image(src.dimensions());
- copy_and_convert_pixels(src, view(ccv_image));
- x_gradient(const_view(ccv_image), dst);
- }
- GIL provides a set of metafunctions that generate GIL types.
- ``image_type`` is one such metafunction; it constructs the type of
- an image from a given channel type, color layout, and
- planar/interleaved option (the default is interleaved). There are also
- similar metafunctions to construct the types of pixel references,
- iterators, locators and image views. GIL also has metafunctions
- ``derived_pixel_reference_type``, ``derived_iterator_type``,
- ``derived_view_type`` and ``derived_image_type`` that construct the
- type of a GIL construct from a given source one by changing one or
- more properties of the type and keeping the rest.
- From the image type we can use the nested typedef ``value_type`` to
- obtain the type of a pixel. GIL images, image views and locators have
- nested typedefs ``value_type`` and ``reference`` to obtain the type of
- the pixel and a reference to the pixel. If you have a pixel iterator,
- you can get these types from its ``iterator_traits``. Note also the
- algorithm ``copy_and_convert_pixels``, which is an abbreviated version
- of ``copy_pixels`` with a color converted source view.
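The nested ``value_type`` and ``reference`` typedefs follow the same convention as standard containers and iterators. As a rough analogy, assuming nothing beyond the standard library, the sketch below queries these types on a ``std::vector`` and uses ``std::iterator_traits`` inside a generic function. The helper ``sum_range`` is hypothetical and is not part of GIL.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Containers expose nested typedefs, much like GIL images, views and locators.
using buffer_t = std::vector<unsigned char>;
static_assert(std::is_same<buffer_t::value_type, unsigned char>::value,
              "value_type names the element type");
static_assert(std::is_same<buffer_t::reference, unsigned char&>::value,
              "reference names a reference to the element");

// For iterators (including raw pointers) the types come from iterator_traits.
// Hypothetical helper: sums a range using the iterator's own value type.
template <typename Iterator>
typename std::iterator_traits<Iterator>::value_type
sum_range(Iterator first, Iterator last)
{
    using value_t = typename std::iterator_traits<Iterator>::value_type;
    value_t total = value_t();
    for (; first != last; ++first)
        total = static_cast<value_t>(total + *first);
    return total;
}
```

Because the return type is derived from the iterator, the same function works unchanged for ``std::vector<int>::iterator`` and for a plain ``int*``.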
- Virtual Image Views
- -------------------
- So far we have been dealing with images that have pixels stored in
- memory. GIL also allows you to create an image view over an arbitrary
- source of pixels, including a synthetic function. To demonstrate this, let us create a
- view of the Mandelbrot set. First, we need to create a function
- object that computes the value of the Mandelbrot set at a given
- location (x,y) in the image:
- .. code-block:: cpp
- // models PixelDereferenceAdaptorConcept
- struct mandelbrot_fn
- {
- typedef point<ptrdiff_t> point_t;
- typedef mandelbrot_fn const_t;
- typedef gray8_pixel_t value_type;
- typedef value_type reference;
- typedef value_type const_reference;
- typedef point_t argument_type;
- typedef reference result_type;
- static bool constexpr is_mutable = false;
- mandelbrot_fn() {}
- mandelbrot_fn(const point_t& sz) : _img_size(sz) {}
- result_type operator()(const point_t& p) const
- {
- // normalize the coords to (-2..1, -1.5..1.5)
- double t=get_num_iter(point<double>(p.x/(double)_img_size.x*3-2, p.y/(double)_img_size.y*3-1.5));
- return value_type((std::uint8_t)(pow(t,0.2)*255)); // raise to power suitable for viewing
- }
- private:
- point_t _img_size;
- double get_num_iter(const point<double>& p) const
- {
- point<double> Z(0,0);
- for (int i=0; i<100; ++i) // 100 iterations
- {
- Z = point<double>(Z.x*Z.x - Z.y*Z.y + p.x, 2*Z.x*Z.y + p.y);
- if (Z.x*Z.x + Z.y*Z.y > 4)
- return i/(double)100;
- }
- return 0;
- }
- };
- We can now use GIL's ``virtual_2d_locator`` with this function object
- to construct a Mandelbrot view of size 200x200 pixels:
- .. code-block:: cpp
- typedef mandelbrot_fn::point_t point_t;
- typedef virtual_2d_locator<mandelbrot_fn,false> locator_t;
- typedef image_view<locator_t> my_virt_view_t;
- point_t dims(200,200);
- // Construct a Mandelbrot view with a locator, taking top-left corner (0,0) and step (1,1)
- my_virt_view_t mandel(dims, locator_t(point_t(0,0), point_t(1,1), mandelbrot_fn(dims)));
- We can treat the synthetic view just like a real one. For example,
- let's invoke our ``x_gradient`` algorithm to compute the gradient of
- the 90-degree rotated view of the Mandelbrot set and save the original
- and the result:
- .. code-block:: cpp
- gray8s_image_t img(dims);
- x_gradient(rotated90cw_view(mandel), view(img));
- // Save the Mandelbrot set and its 90-degree rotated gradient (jpeg cannot save signed char; must convert to unsigned char)
- jpeg_write_view("mandel.jpg",mandel);
- jpeg_write_view("mandel_grad.jpg",color_converted_view<gray8_pixel_t>(const_view(img)));
- Here is what the two files look like:
- .. image:: ../images/mandel.jpg
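The principle behind virtual views can be illustrated without GIL at all. The following is a minimal sketch, assuming a deliberately simplified interface: a view is anything with ``width``, ``height`` and a call operator taking ``(x, y)``. All names here (``fn_view``, ``mem_view``, ``ramp_fn``, ``x_gradient_sketch``) are hypothetical and are not GIL's locator or view machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view whose pixels are computed on demand by a function object.
template <typename Fn>
struct fn_view
{
    std::ptrdiff_t width;
    std::ptrdiff_t height;
    Fn fn;

    int operator()(std::ptrdiff_t x, std::ptrdiff_t y) const { return fn(x, y); }
};

// Hypothetical view over pixels stored in memory, exposing the same interface.
struct mem_view
{
    std::ptrdiff_t width;
    std::ptrdiff_t height;
    std::vector<int> const* pixels;

    int operator()(std::ptrdiff_t x, std::ptrdiff_t y) const
    {
        return (*pixels)[static_cast<std::size_t>(y * width + x)];
    }
};

// Synthetic "image": the value at (x, y) is x*x + y; nothing is stored.
struct ramp_fn
{
    int operator()(std::ptrdiff_t x, std::ptrdiff_t y) const
    {
        return static_cast<int>(x * x + y);
    }
};

// Horizontal half-difference written once against the shared interface.
// Border columns are left at zero.
template <typename View>
std::vector<int> x_gradient_sketch(View const& src)
{
    assert(src.width >= 0 && src.height >= 0);
    std::vector<int> dst(static_cast<std::size_t>(src.width * src.height), 0);
    for (std::ptrdiff_t y = 0; y < src.height; ++y)
        for (std::ptrdiff_t x = 1; x + 1 < src.width; ++x)
            dst[static_cast<std::size_t>(y * src.width + x)] =
                (src(x + 1, y) - src(x - 1, y)) / 2;
    return dst;
}
```

The algorithm never learns whether a pixel was read from memory or computed on the fly, which is the same property that lets ``x_gradient`` run on the Mandelbrot view.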
- Run-Time Specified Images and Image Views
- -----------------------------------------
- So far we have created a generic function that computes the image
- gradient of an image view template specialization. Sometimes,
- however, the properties of an image view, such as its color space and
- channel depth, may not be available at compile time. GIL's
- ``dynamic_image`` extension allows for working with GIL constructs
- that are specified at run time, also called *variants*. GIL provides
- models of a run-time instantiated image, ``any_image``, and a run-time
- instantiated image view, ``any_image_view``. The mechanisms are in
- place to create other variants, such as ``any_pixel``,
- ``any_pixel_iterator``, etc. Most of GIL's algorithms and all of the
- view transformation functions also work with run-time instantiated
- image views. Binary algorithms, such as ``copy_pixels``, can have
- either or both arguments be variants.
- Let's make our ``x_luminosity_gradient`` algorithm take a variant image
- view. For simplicity, let's assume that only the source view can be a
- variant. (As an example of using multiple variants, see GIL's image
- view algorithm overloads taking multiple variants.)
- First, we need to make a function object that contains the templated
- destination view and has an application operator taking a templated
- source view:
- .. code-block:: cpp
- #include <boost/gil/extension/dynamic_image/dynamic_image_all.hpp>
- template <typename DstView>
- struct x_gradient_obj
- {
- typedef void result_type; // required typedef
- const DstView& _dst;
- x_gradient_obj(const DstView& dst) : _dst(dst) {}
- template <typename SrcView>
- void operator()(const SrcView& src) const { x_luminosity_gradient(src, _dst); }
- };
- The second step is to provide an overload of ``x_luminosity_gradient`` that
- takes an image view variant and calls GIL's ``apply_operation``, passing it the
- function object:
- .. code-block:: cpp
- template <typename SrcViews, typename DstView>
- void x_luminosity_gradient(const any_image_view<SrcViews>& src, const DstView& dst)
- {
- apply_operation(src, x_gradient_obj<DstView>(dst));
- }
- ``any_image_view<SrcViews>`` is the image view variant. It is
- templated over ``SrcViews``, an enumeration of all possible view types
- the variant can take. Internally, ``src`` stores an index of the
- currently instantiated type, as well as a block of memory containing
- the instance. ``apply_operation`` goes through a switch statement
- over the index, each case of which casts the memory to the correct
- view type and invokes the function object with it. Invoking an
- algorithm on a variant has the overhead of one switch
- statement. Algorithms that perform an operation for each pixel in an
- image view have practically no performance degradation when used with
- a variant.
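The dispatch pattern described above has a close analogue in the C++17 standard library. The sketch below is only an analogy, assuming ``std::variant`` and ``std::visit`` in place of ``any_image_view`` and ``apply_operation``; the types ``gray8_rows`` and ``gray16_rows`` and the function object ``max_value_obj`` are hypothetical stand-ins, not GIL constructs.

```cpp
#include <cstdint>
#include <variant>
#include <vector>

// Hypothetical stand-ins for two concrete view types.
struct gray8_rows  { std::vector<std::uint8_t>  pixels; };
struct gray16_rows { std::vector<std::uint16_t> pixels; };

// The variant stores an index of the active type plus storage for the instance.
using any_rows = std::variant<gray8_rows, gray16_rows>;

// Function object with a templated call operator, in the spirit of x_gradient_obj.
struct max_value_obj
{
    template <typename Rows>
    unsigned operator()(Rows const& rows) const
    {
        unsigned best = 0;
        for (auto p : rows.pixels)
        {
            unsigned const v = p;
            if (v > best)
                best = v;
        }
        return best;
    }
};

// One dispatch per call; the per-pixel loop then runs on the concrete type.
inline unsigned max_value(any_rows const& src)
{
    return std::visit(max_value_obj{}, src);
}
```

The cost of selecting the concrete type is paid once per call to ``max_value``, not once per pixel, which is why whole-image algorithms see practically no slowdown.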
- Here is how we can construct a variant and invoke the algorithm:
- .. code-block:: cpp
- #include <boost/mpl/vector.hpp>
- #include <boost/gil/extension/io/jpeg_dynamic_io.hpp>
- typedef mpl::vector<gray8_image_t, gray16_image_t, rgb8_image_t, rgb16_image_t> my_img_types;
- any_image<my_img_types> runtime_image;
- jpeg_read_image("input.jpg", runtime_image);
- gray8s_image_t gradient(runtime_image.dimensions());
- x_luminosity_gradient(const_view(runtime_image), view(gradient));
- jpeg_write_view("x_gradient.jpg", color_converted_view<gray8_pixel_t>(const_view(gradient)));
- In this example, we create an image variant that could be an 8-bit or
- 16-bit RGB or grayscale image. We then use GIL's I/O extension to load
- the image from file in its native color space and channel depth. If
- none of the allowed image types matches the image on disk, an
- exception will be thrown. We then construct an 8-bit signed
- (i.e. ``signed char``) image to store the gradient and invoke
- ``x_luminosity_gradient`` with its view as the destination. Finally we
- save the result into another file. We save the view
- converted to 8-bit unsigned, because JPEG I/O does not support signed
- char.
- Note how free functions and methods such as ``jpeg_read_image``,
- ``dimensions``, ``view`` and ``const_view`` work on both templated and
- variant types. For templated images ``view(img)`` returns a templated
- view, whereas for image variants it returns a view variant. For
- example, the return type of ``view(runtime_image)`` is
- ``any_image_view<Views>`` where ``Views`` enumerates four views
- corresponding to the four image types. ``const_view(runtime_image)``
- returns an ``any_image_view`` of the four read-only view types, etc.
- A warning about using variants: instantiating an algorithm with a
- variant effectively instantiates it with every possible type the
- variant can take. For binary algorithms, the algorithm is
- instantiated with every possible combination of the two input types!
- This can take a toll on both the compile time and the executable size.
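The combinatorial growth can be seen in a small standard-library analogy. In the sketch below, which assumes C++17 ``std::variant`` rather than GIL's variants, visiting two variants of two alternatives each forces the compiler to instantiate the call operator for all four ``(Src, Dst)`` pairs, even though only one pair is used at run time. The tag types and ``describe_pair`` are hypothetical.

```cpp
#include <string>
#include <variant>

// Hypothetical tag types standing in for concrete image view types.
struct gray8_tag { static constexpr char const* name = "gray8"; };
struct rgb8_tag  { static constexpr char const* name = "rgb8"; };

using any_tag = std::variant<gray8_tag, rgb8_tag>;

// Binary function object: instantiated once per (Src, Dst) combination.
struct describe_pair_obj
{
    template <typename Src, typename Dst>
    std::string operator()(Src const&, Dst const&) const
    {
        return std::string(Src::name) + "->" + Dst::name;
    }
};

// With two variants of two alternatives each, 2 x 2 = 4 instantiations are generated.
inline std::string describe_pair(any_tag const& src, any_tag const& dst)
{
    return std::visit(describe_pair_obj{}, src, dst);
}
```

With four alternatives on each side, as in the image variant above, a binary algorithm would be instantiated sixteen times.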
- Conclusion
- ----------
- This tutorial provides a glimpse at the challenges associated with
- writing generic and efficient image processing algorithms in GIL. We
- have taken a simple algorithm and shown how to make it work with image
- representations that vary in bit depth, color space, ordering of the
- channels, and planar/interleaved structure. We have demonstrated that
- the algorithm can work with fully abstracted virtual images, and even
- images whose type is specified at run time. The associated video
- presentation also demonstrates that even for complex scenarios the
- generated assembly is comparable to that of a C version of the
- algorithm, hand-written for the specific image types.
- Yet, even for such a simple algorithm, we are far from producing fully
- generic and optimized code. First, the presented algorithms
- work on homogeneous images, i.e. images whose pixels have channels
- that are all of the same type. There are examples of images, such as a
- packed 565 RGB format, which contain channels of different
- types. While GIL provides concepts and algorithms operating on
- heterogeneous pixels, we leave the task of extending ``x_gradient`` as an
- exercise for the reader. Second, after computing the value of the
- gradient we are simply casting it to the destination channel
- type. This may not always be the desired operation. For example, if
- the source channel is a float with range [0..1] and the destination is
- unsigned char, casting the half-difference to unsigned char will
- result in either 0 or 1. Instead, what we might want to do is scale
- the result into the range of the destination channel. GIL's
- channel-level algorithms might be useful in such cases. For example,
- ``channel_convert`` converts between channels by linearly scaling the
- source channel value into the range of the destination channel.
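The difference between casting and scaling is easy to demonstrate. The two helpers below are hypothetical and written only for illustration; they are not GIL's ``channel_convert``, and the rounding rule (round to nearest) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Plain cast: the fractional part is discarded, so every value in [0, 1) becomes 0.
inline std::uint8_t cast_channel(float v)
{
    return static_cast<std::uint8_t>(v);
}

// Linear scaling from [0, 1] to [0, 255], rounding to the nearest integer.
inline std::uint8_t scale_channel(float v)
{
    assert(v >= 0.0f && v <= 1.0f);
    return static_cast<std::uint8_t>(std::lround(v * 255.0f));
}
```

For a mid-gray input of ``0.5f``, the cast yields 0 while the scaled conversion yields 128, which preserves the relative intensity in the destination range.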
- There is a lot to be done in improving the performance as
- well. Channel-level operations, such as the half-difference, could be
- abstracted out into atomic channel-level algorithms and performance
- overloads could be provided for concrete channel
- types. Processor-specific operations could be used, for example, to
- perform the operation over an entire row of pixels simultaneously, or
- the data could be pre-fetched. All of these optimizations can be
- realized as performance specializations of the generic
- algorithm. Finally, compilers, while getting better over time, are
- still failing to fully optimize generic code in some cases, such as
- failing to inline some functions or put some variables into
- registers. If performance is an issue, it might be worth trying your
- code with different compilers.