- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- <html xmlns="http://www.w3.org/1999/xhtml">
- <head>
- <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
-
- <title>Affine region detectors - Boost.GIL documentation</title>
- <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
- <link rel="stylesheet" href="../_static/style.css" type="text/css" />
- <script type="text/javascript">
- var DOCUMENTATION_OPTIONS = {
- URL_ROOT: '../',
- VERSION: '',
- COLLAPSE_MODINDEX: false,
- FILE_SUFFIX: '.html'
- };
- </script>
- <script type="text/javascript" src="../_static/jquery.js"></script>
- <script type="text/javascript" src="../_static/underscore.js"></script>
- <script type="text/javascript" src="../_static/doctools.js"></script>
- <link rel="index" title="Index" href="../genindex.html" />
- <link rel="search" title="Search" href="../search.html" />
- <link rel="top" title="Boost.GIL documentation" href="../index.html" />
- <link rel="up" title="Image Processing" href="index.html" />
- <link rel="next" title="IO extensions" href="../io.html" />
- <link rel="prev" title="Basics" href="basics.html" />
- </head>
- <body>
- <div class="header">
- <table border="0" cellpadding="7" cellspacing="0" width="100%" summary=
- "header">
- <tr>
- <td valign="top" width="300">
- <h3><a href="../index.html"><img
- alt="C++ Boost" src="../_static/gil.png" border="0"></a></h3>
- </td>
- <td >
- <h1 align="center"><a href="../index.html"></a></h1>
- </td>
- <td>
- <div id="searchbox" style="display: none">
- <form class="search" action="../search.html" method="get">
- <input type="text" name="q" size="18" />
- <input type="submit" value="Search" />
- <input type="hidden" name="check_keywords" value="yes" />
- <input type="hidden" name="area" value="default" />
- </form>
- </div>
- <script type="text/javascript">$('#searchbox').show(0);</script>
- </td>
- </tr>
- </table>
- </div>
- <hr/>
- <div class="content">
- <div class="navbar" style="text-align:right;">
-
-
- <a class="prev" title="Basics" href="basics.html"><img src="../_static/prev.png" alt="prev"/></a>
- <a class="up" title="Image Processing" href="index.html"><img src="../_static/up.png" alt="up"/></a>
- <a class="next" title="IO extensions" href="../io.html"><img src="../_static/next.png" alt="next"/></a>
-
- </div>
-
- <div class="section" id="affine-region-detectors">
- <h1>Affine region detectors</h1>
- <div class="section" id="what-is-being-detected">
- <h2>What is being detected?</h2>
- <p>An affine region is any region of the image
- that remains stable under affine transformations. Examples include
- edges under affine conditions, corners (small patches of an image),
- or any other stable features.</p>
- </div>
- <hr class="docutils" />
- <div class="section" id="available-detectors">
- <h2>Available detectors</h2>
- <p>At the moment, the following detectors are implemented:</p>
- <ul class="simple">
- <li>Harris detector</li>
- <li>Hessian detector</li>
- </ul>
- </div>
- <hr class="docutils" />
- <div class="section" id="algorithm-steps">
- <h2>Algorithm steps</h2>
- <div class="section" id="harris-and-hessian">
- <h3>Harris and Hessian</h3>
- <p>Both are derived from a concept called the Moravec window. Let's have a look
- at the image below:</p>
- <div class="figure" id="id1">
- <img alt="Moravec window corner case" src="../_images/Moravec-window-corner.png" />
- <p class="caption"><span class="caption-text">Moravec window corner case</span></p>
- </div>
- <p>As can be seen, moving the yellow window in any direction causes
- a very large change in intensity. Now, let's have a look at the edge case:</p>
- <div class="figure" id="id2">
- <img alt="Moravec window edge case" src="../_images/Moravec-window-edge.png" />
- <p class="caption"><span class="caption-text">Moravec window edge case</span></p>
- </div>
- <p>In this case, the intensity changes only when the window moves in
- a particular direction.</p>
- <p>This is the key concept in understanding how the two corner detectors
- work.</p>
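- <p>The window idea above can be sketched in a few lines of standard C++.
- This is only an illustration, not the Boost.GIL implementation: the
- <code class="docutils literal"><span class="pre">GrayImage</span></code> struct and the
- <code class="docutils literal"><span class="pre">window_change</span></code> function are hypothetical names,
- and the sum of squared differences is assumed as the measure of intensity change.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal grayscale image: row-major float pixels.
struct GrayImage {
    int width;
    int height;
    std::vector<float> pixels;

    float at(int x, int y) const
    {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return pixels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                      + static_cast<std::size_t>(x)];
    }
};

// Sum of squared differences between the window centred at (x, y)
// and the same window shifted by (shift_x, shift_y).
// Assumption: both windows lie fully inside the image.
float window_change(const GrayImage& image, int x, int y,
                    int shift_x, int shift_y, int radius)
{
    float sum = 0.0f;
    for (int j = -radius; j <= radius; ++j) {
        for (int i = -radius; i <= radius; ++i) {
            const float original = image.at(x + i, y + j);
            const float shifted = image.at(x + i + shift_x, y + j + shift_y);
            sum += (shifted - original) * (shifted - original);
        }
    }
    return sum;
}
```

- <p>On a vertical edge, a shift along the edge leaves the window unchanged, so the
- function returns zero, while a shift across the edge returns a positive value.
- On a corner, shifts in both directions return positive values.</p>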
- <p>The algorithms have the same structure:</p>
- <ol class="arabic simple">
- <li>Compute image derivatives</li>
- <li>Compute weighted sum</li>
- <li>Compute response</li>
- <li>Threshold (optional)</li>
- </ol>
- <p>Harris and Hessian differ in what <strong>derivatives they compute</strong>. Harris
- computes the following derivatives:</p>
- <p><code class="docutils literal"><span class="pre">HarrisMatrix</span> <span class="pre">=</span> <span class="pre">[(dx)^2,</span> <span class="pre">dxdy],</span> <span class="pre">[dxdy,</span> <span class="pre">(dy)^2]</span></code></p>
- <p>(note that <code class="docutils literal"><span class="pre">(dx)^2</span></code> and <code class="docutils literal"><span class="pre">(dy)^2</span></code> are <strong>numerical</strong> powers of the first derivatives, not the gradient applied again).</p>
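- <p>A minimal sketch of computing the Harris terms per pixel, in plain standard C++
- rather than the Boost.GIL API. The names
- <code class="docutils literal"><span class="pre">HarrisEntries</span></code> and
- <code class="docutils literal"><span class="pre">harris_entries</span></code> are hypothetical,
- and central differences with zeroed borders are assumptions made for brevity.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The three distinct Harris matrix terms, one value per pixel.
struct HarrisEntries {
    std::vector<float> dx_dx; // (dx)^2
    std::vector<float> dx_dy; // dx * dy
    std::vector<float> dy_dy; // (dy)^2
};

// Assumption: central differences approximate the first derivatives;
// border pixels are left at zero.
HarrisEntries harris_entries(const std::vector<float>& image, int width, int height)
{
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    assert(image.size() == w * h);

    HarrisEntries out{std::vector<float>(w * h, 0.0f),
                      std::vector<float>(w * h, 0.0f),
                      std::vector<float>(w * h, 0.0f)};
    for (std::size_t y = 1; y + 1 < h; ++y) {
        for (std::size_t x = 1; x + 1 < w; ++x) {
            const std::size_t i = y * w + x;
            const float dx = (image[i + 1] - image[i - 1]) / 2.0f;
            const float dy = (image[i + w] - image[i - w]) / 2.0f;
            out.dx_dx[i] = dx * dx; // numerical square, not a second derivative
            out.dx_dy[i] = dx * dy;
            out.dy_dy[i] = dy * dy;
        }
    }
    return out;
}
```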
- <p>The three distinct terms of the matrix can be stored as three separate images
- to simplify the implementation. Hessian, on the other hand, computes
- second-order derivatives:</p>
- <p><code class="docutils literal"><span class="pre">HessianMatrix</span> <span class="pre">=</span> <span class="pre">[dxdx,</span> <span class="pre">dxdy][dxdy,</span> <span class="pre">dydy]</span></code></p>
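- <p>The Hessian terms can be sketched the same way. Again this is plain standard C++,
- not the Boost.GIL API: <code class="docutils literal"><span class="pre">HessianEntries</span></code> and
- <code class="docutils literal"><span class="pre">hessian_entries</span></code> are hypothetical names,
- and the finite-difference stencils and zeroed borders are assumptions.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The three distinct Hessian matrix terms, one value per pixel.
struct HessianEntries {
    std::vector<float> dx_dx; // second derivative along x
    std::vector<float> dx_dy; // mixed second derivative
    std::vector<float> dy_dy; // second derivative along y
};

// Assumption: standard second-order finite differences;
// border pixels are left at zero.
HessianEntries hessian_entries(const std::vector<float>& image, int width, int height)
{
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    assert(image.size() == w * h);

    HessianEntries out{std::vector<float>(w * h, 0.0f),
                       std::vector<float>(w * h, 0.0f),
                       std::vector<float>(w * h, 0.0f)};
    for (std::size_t y = 1; y + 1 < h; ++y) {
        for (std::size_t x = 1; x + 1 < w; ++x) {
            const std::size_t i = y * w + x;
            const float centre = image[i];
            out.dx_dx[i] = image[i + 1] - 2.0f * centre + image[i - 1];
            out.dy_dy[i] = image[i + w] - 2.0f * centre + image[i - w];
            out.dx_dy[i] = (image[i + w + 1] - image[i + w - 1]
                            - image[i - w + 1] + image[i - w - 1]) / 4.0f;
        }
    }
    return out;
}
```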
- <p><strong>Weighted sum</strong> is the same for both. Usually a Gaussian blur
- matrix is used as the weights, because corners should have hill-like
- curvature in the gradients, and other weights might give noisy results.
- To compute it, overlay the weight matrix on the image at position (x, y), compute the sum of
- <code class="docutils literal"><span class="pre">s[i,j]=image[x</span> <span class="pre">+</span> <span class="pre">i,</span> <span class="pre">y</span> <span class="pre">+</span> <span class="pre">j]</span> <span class="pre">*</span> <span class="pre">weights[i,</span> <span class="pre">j]</span></code> for <code class="docutils literal"><span class="pre">i,</span> <span class="pre">j</span></code>
- from zero to the weight matrix dimensions, then move the window
- and compute again until the whole image is covered.</p>
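- <p>A minimal sketch of the weighted sum for a single window position, in plain
- standard C++. The function names
- <code class="docutils literal"><span class="pre">gaussian_weights</span></code> and
- <code class="docutils literal"><span class="pre">weighted_sum</span></code> are hypothetical,
- and normalising the weights so that they sum to one is an assumption, not something
- the text above requires.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Normalised size x size Gaussian weights, row-major, peak at the centre.
std::vector<float> gaussian_weights(int size, float sigma)
{
    assert(size > 0 && size % 2 == 1 && sigma > 0.0f);
    const int radius = size / 2;
    std::vector<float> weights;
    weights.reserve(static_cast<std::size_t>(size) * static_cast<std::size_t>(size));
    float total = 0.0f;
    for (int j = -radius; j <= radius; ++j) {
        for (int i = -radius; i <= radius; ++i) {
            const float w = std::exp(-static_cast<float>(i * i + j * j)
                                     / (2.0f * sigma * sigma));
            weights.push_back(w);
            total += w;
        }
    }
    for (float& w : weights) {
        w /= total; // assumption: weights are scaled to sum to one
    }
    return weights;
}

// Weighted sum of the size x size window whose top-left pixel is (x, y):
// the sum over i, j of image[x + i, y + j] * weights[i, j].
float weighted_sum(const std::vector<float>& image, int width, int x, int y,
                   const std::vector<float>& weights, int size)
{
    assert(x >= 0 && y >= 0 && size > 0 && x + size <= width);
    assert(weights.size() == static_cast<std::size_t>(size * size));
    assert(image.size() >= static_cast<std::size_t>((y + size) * width));
    float sum = 0.0f;
    for (int j = 0; j < size; ++j) {
        for (int i = 0; i < size; ++i) {
            sum += image[static_cast<std::size_t>((y + j) * width + (x + i))]
                 * weights[static_cast<std::size_t>(j * size + i)];
        }
    }
    return sum;
}
```

- <p>In the detectors, this sum would be applied to each of the three derivative
- images in turn, once per window position.</p>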
- <p><strong>Response computation</strong> is a matter of choice. Consider the general form
- of both matrices above:</p>
- <p><code class="docutils literal"><span class="pre">[a,</span> <span class="pre">b][c,</span> <span class="pre">d]</span></code></p>
- <p>One of the response functions is:</p>
- <p><code class="docutils literal"><span class="pre">response</span> <span class="pre">=</span> <span class="pre">det</span> <span class="pre">-</span> <span class="pre">k</span> <span class="pre">*</span> <span class="pre">trace^2</span> <span class="pre">=</span> <span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span> <span class="pre">-</span> <span class="pre">k</span> <span class="pre">*</span> <span class="pre">(a</span> <span class="pre">+</span> <span class="pre">d)^2</span></code></p>
- <p><code class="docutils literal"><span class="pre">k</span></code> is called the discrimination constant. Typical values range from <code class="docutils literal"><span class="pre">0.04</span></code> to
- <code class="docutils literal"><span class="pre">0.06</span></code>.</p>
- <p>The other is simply the determinant:</p>
- <p><code class="docutils literal"><span class="pre">response</span> <span class="pre">=</span> <span class="pre">det</span> <span class="pre">=</span> <span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span></code></p>
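- <p>Both response functions can be sketched as small free functions. The names
- <code class="docutils literal"><span class="pre">determinant_response</span></code> and
- <code class="docutils literal"><span class="pre">harris_response</span></code> are hypothetical;
- the sketch takes the determinant of the 2x2 matrix with rows
- <code class="docutils literal"><span class="pre">[a,</span> <span class="pre">b]</span></code> and
- <code class="docutils literal"><span class="pre">[c,</span> <span class="pre">d]</span></code> as
- <code class="docutils literal"><span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span></code>
- and assumes a positive <code class="docutils literal"><span class="pre">k</span></code>.</p>

```cpp
#include <cassert>

// Determinant of the 2x2 matrix with rows [a, b] and [c, d].
float determinant_response(float a, float b, float c, float d)
{
    return a * d - b * c;
}

// Response of the form det - k * trace^2, where trace = a + d.
// Assumption: k is a small positive discrimination constant.
float harris_response(float a, float b, float c, float d, float k)
{
    assert(k > 0.0f);
    const float trace = a + d;
    return determinant_response(a, b, c, d) - k * trace * trace;
}
```

- <p>With this form, a matrix that is strong in only one direction (an edge) has a
- zero determinant and therefore a negative response, while a matrix that is strong
- in both directions (a corner) can have a positive one.</p>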
- <p><strong>Thresholding</strong> is optional, but without it the result will be
- extremely noisy. For complex images, such as outdoor scenes, a suitable
- threshold is on the order of 100000000 for Harris and on the order
- of 10000 for Hessian. For simpler images, values on the order of hundreds or thousands should be
- enough. These numbers assume a <code class="docutils literal"><span class="pre">uint8_t</span></code> grayscale image.</p>
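- <p>Thresholding can be sketched as a single pass over the response image that keeps
- only the positions whose response exceeds the chosen value. The names
- <code class="docutils literal"><span class="pre">Point</span></code> and
- <code class="docutils literal"><span class="pre">threshold_responses</span></code> are hypothetical,
- and a strict greater-than comparison is an assumption.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pixel coordinate type.
struct Point {
    int x;
    int y;
};

// Collects, in row-major order, every position whose response is
// strictly greater than the threshold.
std::vector<Point> threshold_responses(const std::vector<float>& responses,
                                       int width, int height, float threshold)
{
    assert(width > 0 && height > 0);
    assert(responses.size() == static_cast<std::size_t>(width)
                               * static_cast<std::size_t>(height));
    std::vector<Point> kept;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y * width + x);
            if (responses[i] > threshold) {
                kept.push_back(Point{x, y});
            }
        }
    }
    return kept;
}
```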
- <p>For a deeper explanation, please refer to the following <strong>papers</strong>:</p>
- <p><a class="reference external" href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.434.4816&rep=rep1&type=pdf">Harris, Christopher G., and Mike Stephens. “A combined corner and edge
- detector.” In Alvey vision conference, vol. 15, no. 50, pp. 10-5244.
- 1988.</a></p>
- <p><a class="reference external" href="https://hal.inria.fr/inria-00548252/document">Mikolajczyk, Krystian, and Cordelia Schmid. “An affine invariant interest point detector.” In European conference on computer vision, pp. 128-142. Springer, Berlin, Heidelberg, 2002.</a></p>
- <p><a class="reference external" href="https://hal.inria.fr/inria-00548528/document">Mikolajczyk, Krystian, Tinne Tuytelaars, Cordelia Schmid, Andrew Zisserman, Jiri Matas, Frederik Schaffalitzky, Timor Kadir, and Luc Van Gool. “A comparison of affine region detectors.” International journal of computer vision 65, no. 1-2 (2005): 43-72.</a></p>
- </div>
- </div>
- </div>
- <div class="navbar" style="text-align:right;">
-
-
- <a class="prev" title="Basics" href="basics.html"><img src="../_static/prev.png" alt="prev"/></a>
- <a class="up" title="Image Processing" href="index.html"><img src="../_static/up.png" alt="up"/></a>
- <a class="next" title="IO extensions" href="../io.html"><img src="../_static/next.png" alt="next"/></a>
-
- </div>
- </div>
- <div class="footer" role="contentinfo">
- Last updated on 2019-12-10 00:12:10.
- Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.5.6.
- </div>
- </body>
- </html>