<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Affine region detectors - Boost.GIL documentation</title>
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../_static/style.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: '../',
VERSION: '',
COLLAPSE_MODINDEX: false,
FILE_SUFFIX: '.html'
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="top" title="Boost.GIL documentation" href="../index.html" />
<link rel="up" title="Image Processing" href="index.html" />
<link rel="next" title="IO extensions" href="../io.html" />
<link rel="prev" title="Basics" href="basics.html" />
</head>
<body>
<div class="header">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary=
"header">
<tr>
<td valign="top" width="300">
<h3><a href="../index.html"><img
alt="C++ Boost" src="../_static/gil.png" border="0" /></a></h3>
</td>
<td >
<h1 align="center"><a href="../index.html"></a></h1>
</td>
<td>
<div id="searchbox" style="display: none">
<form class="search" action="../search.html" method="get">
<input type="text" name="q" size="18" />
<input type="submit" value="Search" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
</td>
</tr>
</table>
</div>
<hr/>
<div class="content">
<div class="navbar" style="text-align:right;">
<a class="prev" title="Basics" href="basics.html"><img src="../_static/prev.png" alt="prev"/></a>
<a class="up" title="Image Processing" href="index.html"><img src="../_static/up.png" alt="up"/></a>
<a class="next" title="IO extensions" href="../io.html"><img src="../_static/next.png" alt="next"/></a>
</div>
<div class="section" id="affine-region-detectors">
<h1>Affine region detectors</h1>
<div class="section" id="what-is-being-detected">
<h2>What is being detected?</h2>
<p>An affine region is any region of an image that remains stable
under affine transformations. Such a region can be an edge under
affinity conditions, a corner (a small patch of the image),
or any other stable feature.</p>
</div>
<hr class="docutils" />
<div class="section" id="available-detectors">
<h2>Available detectors</h2>
<p>At the moment, the following detectors are implemented:</p>
<ul class="simple">
<li>Harris detector</li>
<li>Hessian detector</li>
</ul>
</div>
<hr class="docutils" />
<div class="section" id="algorithm-steps">
<h2>Algorithm steps</h2>
<div class="section" id="harris-and-hessian">
<h3>Harris and Hessian</h3>
<p>Both are derived from a concept called the Moravec window. Let's have a look
at the image below:</p>
<div class="figure" id="id1">
<img alt="Moravec window corner case" src="../_images/Moravec-window-corner.png" />
<p class="caption"><span class="caption-text">Moravec window corner case</span></p>
</div>
<p>Moving the yellow window in any direction causes a large change
in intensity. Now, let's have a look at the edge case:</p>
<div class="figure" id="id2">
<img alt="Moravec window edge case" src="../_images/Moravec-window-edge.png" />
<p class="caption"><span class="caption-text">Moravec window edge case</span></p>
</div>
<p>In this case, the intensity changes only when the window moves in
one particular direction (across the edge), not along it.</p>
<p>This is the key concept in understanding how the two corner detectors
work.</p>
<p>The algorithms have the same structure:</p>
<ol class="arabic simple">
<li>Compute image derivatives</li>
<li>Compute weighted sum</li>
<li>Compute response</li>
<li>Threshold (optional)</li>
</ol>
<p>Harris and Hessian differ in <strong>which derivatives they compute</strong>. Harris
builds the following matrix from first-order derivatives:</p>
<p><code class="docutils literal"><span class="pre">HarrisMatrix</span> <span class="pre">=</span> <span class="pre">[[(dx)^2,</span> <span class="pre">dxdy],</span> <span class="pre">[dxdy,</span> <span class="pre">(dy)^2]]</span></code></p>
<p>(note that <code class="docutils literal"><span class="pre">(dx)^2</span></code> and <code class="docutils literal"><span class="pre">(dy)^2</span></code> are <strong>numerical</strong> powers, not the gradient applied a second time).</p>
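<p>As a minimal sketch of the Harris derivative step, the snippet below computes the three distinct
matrix terms for every interior pixel. It does not use the Boost.GIL API: the <code class="docutils literal"><span class="pre">GrayImage</span></code>
and <code class="docutils literal"><span class="pre">HarrisEntries</span></code> types and the <code class="docutils literal"><span class="pre">compute_harris_entries</span></code> function are hypothetical
names introduced for illustration, and central differences are only one possible choice of derivative filter.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not a Boost.GIL type): row-major grayscale image.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<float> data;

    GrayImage(std::size_t w, std::size_t h) : width(w), height(h), data(w * h, 0.0f) {}
    float& at(std::size_t x, std::size_t y) { return data[y * width + x]; }
    float at(std::size_t x, std::size_t y) const { return data[y * width + x]; }
};

// The three distinct terms of the Harris matrix, one image per term.
struct HarrisEntries {
    GrayImage dx2;   // (dx)^2, a numerical square of the x gradient
    GrayImage dxdy;  // dx * dy
    GrayImage dy2;   // (dy)^2, a numerical square of the y gradient
};

// Assumption: central differences approximate the gradient.
// Border pixels have no full neighbourhood and are left at zero.
HarrisEntries compute_harris_entries(const GrayImage& image) {
    assert(image.data.size() == image.width * image.height);
    HarrisEntries out{GrayImage(image.width, image.height),
                      GrayImage(image.width, image.height),
                      GrayImage(image.width, image.height)};
    for (std::size_t y = 1; y + 1 < image.height; ++y) {
        for (std::size_t x = 1; x + 1 < image.width; ++x) {
            const float dx = (image.at(x + 1, y) - image.at(x - 1, y)) / 2.0f;
            const float dy = (image.at(x, y + 1) - image.at(x, y - 1)) / 2.0f;
            out.dx2.at(x, y) = dx * dx;
            out.dxdy.at(x, y) = dx * dy;
            out.dy2.at(x, y) = dy * dy;
        }
    }
    return out;
}
```

<p>Keeping one image per term means the later steps can treat each term as an ordinary grayscale image.</p>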
<p>The three distinct terms of the matrix can be stored as three separate images
to simplify the implementation. Hessian, on the other hand, computes second-order
derivatives:</p>
<p><code class="docutils literal"><span class="pre">HessianMatrix</span> <span class="pre">=</span> <span class="pre">[[dxdx,</span> <span class="pre">dxdy],</span> <span class="pre">[dxdy,</span> <span class="pre">dydy]]</span></code></p>
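<p>The Hessian terms can be sketched in the same way with second-order finite differences. Again, this is
an illustration rather than the Boost.GIL implementation: <code class="docutils literal"><span class="pre">GrayImage</span></code>,
<code class="docutils literal"><span class="pre">HessianEntries</span></code> and <code class="docutils literal"><span class="pre">compute_hessian_entries</span></code> are hypothetical names,
and the finite-difference stencils are an assumption.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not a Boost.GIL type): row-major grayscale image.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<float> data;

    GrayImage(std::size_t w, std::size_t h) : width(w), height(h), data(w * h, 0.0f) {}
    float& at(std::size_t x, std::size_t y) { return data[y * width + x]; }
    float at(std::size_t x, std::size_t y) const { return data[y * width + x]; }
};

// The three distinct terms of the Hessian matrix, one image per term.
struct HessianEntries {
    GrayImage dxdx;  // second derivative along x
    GrayImage dxdy;  // mixed derivative
    GrayImage dydy;  // second derivative along y
};

// Assumption: standard second-order finite-difference stencils.
// Border pixels have no full neighbourhood and are left at zero.
HessianEntries compute_hessian_entries(const GrayImage& image) {
    assert(image.data.size() == image.width * image.height);
    HessianEntries out{GrayImage(image.width, image.height),
                       GrayImage(image.width, image.height),
                       GrayImage(image.width, image.height)};
    for (std::size_t y = 1; y + 1 < image.height; ++y) {
        for (std::size_t x = 1; x + 1 < image.width; ++x) {
            const float centre = image.at(x, y);
            out.dxdx.at(x, y) = image.at(x + 1, y) - 2.0f * centre + image.at(x - 1, y);
            out.dydy.at(x, y) = image.at(x, y + 1) - 2.0f * centre + image.at(x, y - 1);
            out.dxdy.at(x, y) = (image.at(x + 1, y + 1) - image.at(x + 1, y - 1)
                                 - image.at(x - 1, y + 1) + image.at(x - 1, y - 1)) / 4.0f;
        }
    }
    return out;
}
```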
<p><strong>Weighted sum</strong> is the same for both. Usually a Gaussian blur
matrix is used as the weights, because corners should have a hill-like
curvature in the gradients, and other weights might give noisy results.
To compute it, overlay the weight matrix on the image and sum the terms
<code class="docutils literal"><span class="pre">s[i,</span> <span class="pre">j]</span> <span class="pre">=</span> <span class="pre">image[x</span> <span class="pre">+</span> <span class="pre">i,</span> <span class="pre">y</span> <span class="pre">+</span> <span class="pre">j]</span> <span class="pre">*</span> <span class="pre">weights[i,</span> <span class="pre">j]</span></code> for <code class="docutils literal"><span class="pre">i,</span> <span class="pre">j</span></code>
from zero to the weight matrix dimensions, then move the window
and repeat until the whole image is covered.</p>
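<p>The sliding weighted sum can be sketched as follows. This is not the Boost.GIL implementation:
<code class="docutils literal"><span class="pre">GrayImage</span></code>, <code class="docutils literal"><span class="pre">make_gaussian_weights</span></code> and
<code class="docutils literal"><span class="pre">weighted_sum</span></code> are hypothetical names. The sketch also assumes an odd-sized square weight matrix
centred on the current pixel (so the output stays aligned with the input) and leaves pixels too close to the border at zero.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not a Boost.GIL type): row-major grayscale image.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<float> data;

    GrayImage(std::size_t w, std::size_t h) : width(w), height(h), data(w * h, 0.0f) {}
    float& at(std::size_t x, std::size_t y) { return data[y * width + x]; }
    float at(std::size_t x, std::size_t y) const { return data[y * width + x]; }
};

// Builds a size x size Gaussian weight matrix, normalised so the weights sum to 1.
GrayImage make_gaussian_weights(std::size_t size, float sigma) {
    assert(size % 2 == 1 && sigma > 0.0f);
    GrayImage weights(size, size);
    const float centre = static_cast<float>(size / 2);
    float total = 0.0f;
    for (std::size_t y = 0; y < size; ++y) {
        for (std::size_t x = 0; x < size; ++x) {
            const float ox = static_cast<float>(x) - centre;
            const float oy = static_cast<float>(y) - centre;
            const float w = std::exp(-(ox * ox + oy * oy) / (2.0f * sigma * sigma));
            weights.at(x, y) = w;
            total += w;
        }
    }
    for (float& w : weights.data) {
        w /= total;
    }
    return weights;
}

// Slides the weight matrix over the image and stores the weighted sum at the window centre.
GrayImage weighted_sum(const GrayImage& image, const GrayImage& weights) {
    assert(weights.width == weights.height && weights.width % 2 == 1);
    GrayImage result(image.width, image.height);
    const std::size_t r = weights.width / 2;  // window radius
    for (std::size_t y = r; y + r < image.height; ++y) {
        for (std::size_t x = r; x + r < image.width; ++x) {
            float sum = 0.0f;
            for (std::size_t j = 0; j < weights.height; ++j) {
                for (std::size_t i = 0; i < weights.width; ++i) {
                    sum += image.at(x + i - r, y + j - r) * weights.at(i, j);
                }
            }
            result.at(x, y) = sum;
        }
    }
    return result;
}
```

<p>In a detector, a function like this would be applied to each of the three matrix-term images in turn.</p>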
<p><strong>Response computation</strong> is a matter of choice. Given the general form
of both matrices above</p>
<p><code class="docutils literal"><span class="pre">[[a,</span> <span class="pre">b],</span> <span class="pre">[c,</span> <span class="pre">d]]</span></code></p>
<p>one of the response functions is</p>
<p><code class="docutils literal"><span class="pre">response</span> <span class="pre">=</span> <span class="pre">det</span> <span class="pre">-</span> <span class="pre">k</span> <span class="pre">*</span> <span class="pre">trace^2</span> <span class="pre">=</span> <span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span> <span class="pre">-</span> <span class="pre">k</span> <span class="pre">*</span> <span class="pre">(a</span> <span class="pre">+</span> <span class="pre">d)^2</span></code></p>
<p><code class="docutils literal"><span class="pre">k</span></code> is called the discrimination constant. Usual values are between <code class="docutils literal"><span class="pre">0.04</span></code> and
<code class="docutils literal"><span class="pre">0.06</span></code>.</p>
<p>The other is simply the determinant:</p>
<p><code class="docutils literal"><span class="pre">response</span> <span class="pre">=</span> <span class="pre">det</span> <span class="pre">=</span> <span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span></code></p>
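<p>Both response functions are short enough to write out directly. The sketch below works on a single
2x2 matrix <code class="docutils literal"><span class="pre">[[a,</span> <span class="pre">b],</span> <span class="pre">[c,</span> <span class="pre">d]]</span></code>; the <code class="docutils literal"><span class="pre">Matrix2x2</span></code> type and the function names are
hypothetical, and the default <code class="docutils literal"><span class="pre">k</span> <span class="pre">=</span> <span class="pre">0.04</span></code> is simply the lower end of the usual range.</p>

```cpp
#include <cassert>

// Hypothetical 2x2 matrix laid out as [[a, b], [c, d]].
struct Matrix2x2 {
    double a;
    double b;
    double c;
    double d;
};

double determinant(const Matrix2x2& m) { return m.a * m.d - m.b * m.c; }

double trace(const Matrix2x2& m) { return m.a + m.d; }

// response = det - k * trace^2, where k is the discrimination constant.
double trace_penalised_response(const Matrix2x2& m, double k = 0.04) {
    assert(k >= 0.0);
    const double t = trace(m);
    return determinant(m) - k * t * t;
}

// response = det
double determinant_response(const Matrix2x2& m) { return determinant(m); }
```

<p>With the first function, a matrix with two strong diagonal terms (a corner-like case such as
<code class="docutils literal"><span class="pre">[[1,</span> <span class="pre">0],</span> <span class="pre">[0,</span> <span class="pre">1]]</span></code>) gives a positive response, while a matrix with only one
(an edge-like case such as <code class="docutils literal"><span class="pre">[[1,</span> <span class="pre">0],</span> <span class="pre">[0,</span> <span class="pre">0]]</span></code>) gives a negative one.</p>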
<p><strong>Thresholding</strong> is optional, but without it the result will be
extremely noisy. For complex images, such as outdoor scenes, a suitable
threshold is on the order of 100000000 for Harris and on the order
of 10000 for Hessian. For simpler images, values on the order of hundreds or thousands should be
enough. These numbers assume a <code class="docutils literal"><span class="pre">uint8_t</span></code> grayscale image.</p>
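<p>A minimal sketch of the thresholding step: keep the coordinates of every pixel whose response exceeds
the threshold. The <code class="docutils literal"><span class="pre">Point</span></code> type and <code class="docutils literal"><span class="pre">threshold_responses</span></code> function are hypothetical,
and the response image is assumed to be stored row-major in a flat vector. A suitable threshold depends on the image and the detector, as described above.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pixel coordinate type.
struct Point {
    std::size_t x;
    std::size_t y;
};

// Returns the coordinates of all pixels whose response is strictly above the threshold.
std::vector<Point> threshold_responses(const std::vector<float>& response,
                                       std::size_t width,
                                       std::size_t height,
                                       float threshold) {
    assert(response.size() == width * height);
    std::vector<Point> points;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            if (response[y * width + x] > threshold) {
                points.push_back(Point{x, y});
            }
        }
    }
    return points;
}
```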
<p>For a deeper explanation, please refer to the following <strong>papers</strong>:</p>
<p><a class="reference external" href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.434.4816&amp;rep=rep1&amp;type=pdf">Harris, Christopher G., and Mike Stephens. &#8220;A combined corner and edge
detector.&#8221; In Alvey vision conference, pp. 147-151.
1988.</a></p>
<p><a class="reference external" href="https://hal.inria.fr/inria-00548252/document">Mikolajczyk, Krystian, and Cordelia Schmid. &#8220;An affine invariant interest point detector.&#8221; In European conference on computer vision, pp. 128-142. Springer, Berlin, Heidelberg, 2002.</a></p>
<p><a class="reference external" href="https://hal.inria.fr/inria-00548528/document">Mikolajczyk, Krystian, Tinne Tuytelaars, Cordelia Schmid, Andrew Zisserman, Jiri Matas, Frederik Schaffalitzky, Timor Kadir, and Luc Van Gool. &#8220;A comparison of affine region detectors.&#8221; International journal of computer vision 65, no. 1-2 (2005): 43-72.</a></p>
</div>
</div>
</div>
<div class="navbar" style="text-align:right;">
<a class="prev" title="Basics" href="basics.html"><img src="../_static/prev.png" alt="prev"/></a>
<a class="up" title="Image Processing" href="index.html"><img src="../_static/up.png" alt="up"/></a>
<a class="next" title="IO extensions" href="../io.html"><img src="../_static/next.png" alt="next"/></a>
</div>
</div>
<div class="footer" role="contentinfo">
Last updated on 2019-12-10 00:12:10.
Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.5.6.
</div>
</body>
</html>