  1. <html>
  2. <head>
  3. <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
  4. <title>Advanced Topics</title>
  5. <link rel="stylesheet" href="../../../../../doc/src/boostbook.css" type="text/css">
  6. <meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
  7. <link rel="home" href="../index.html" title="Chapter&#160;1.&#160;Boost.Compute">
  8. <link rel="up" href="../index.html" title="Chapter&#160;1.&#160;Boost.Compute">
  9. <link rel="prev" href="tutorial.html" title="Tutorial">
  10. <link rel="next" href="interop.html" title="Interoperability">
  11. </head>
  12. <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
  13. <table cellpadding="2" width="100%"><tr>
  14. <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../boost.png"></td>
  15. <td align="center"><a href="../../../../../index.html">Home</a></td>
  16. <td align="center"><a href="../../../../../libs/libraries.htm">Libraries</a></td>
  17. <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
  18. <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
  19. <td align="center"><a href="../../../../../more/index.htm">More</a></td>
  20. </tr></table>
  21. <hr>
  22. <div class="spirit-nav">
  23. <a accesskey="p" href="tutorial.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="interop.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
  24. </div>
  25. <div class="section">
  26. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  27. <a name="boost_compute.advanced_topics"></a><a class="link" href="advanced_topics.html" title="Advanced Topics">Advanced Topics</a>
  28. </h2></div></div></div>
  29. <div class="toc"><dl class="toc">
  30. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.vector_data_types">Vector
  31. Data Types</a></span></dt>
  32. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.custom_functions">Custom
  33. Functions</a></span></dt>
  34. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.custom_types">Custom Types</a></span></dt>
  35. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.complex_values">Complex
  36. Values</a></span></dt>
  37. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.lambda_expressions">Lambda
  38. Expressions</a></span></dt>
  39. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.asynchronous_operations">Asynchronous
  40. Operations</a></span></dt>
  41. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.performance_timing">Performance
  42. Timing</a></span></dt>
  43. <dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.opencl_api_interoperability">OpenCL
  44. API Interoperability</a></span></dt>
  45. </dl></div>
  46. <p>
  47. The following topics show advanced features of the Boost Compute library.
  48. </p>
  49. <div class="section">
  50. <div class="titlepage"><div><div><h3 class="title">
  51. <a name="boost_compute.advanced_topics.vector_data_types"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.vector_data_types" title="Vector Data Types">Vector
  52. Data Types</a>
  53. </h3></div></div></div>
  54. <p>
  55. In addition to the built-in scalar types (e.g. <code class="computeroutput"><span class="keyword">int</span></code>
  56. and <code class="computeroutput"><span class="keyword">float</span></code>), OpenCL also provides
  57. vector data types (e.g. <code class="computeroutput"><span class="identifier">int2</span></code>
  58. and <code class="computeroutput"><span class="identifier">float4</span></code>). These can be
  59. used with the Boost Compute library on both the host and device.
  60. </p>
  61. <p>
  62. Boost.Compute provides typedefs for these types which take the form: <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">scalarN_</span></code> where <code class="computeroutput"><span class="identifier">scalar</span></code>
  63. is a scalar data type (e.g. <code class="computeroutput"><span class="keyword">int</span></code>,
  64. <code class="computeroutput"><span class="keyword">float</span></code>, <code class="computeroutput"><span class="keyword">char</span></code>)
  65. and <code class="computeroutput"><span class="identifier">N</span></code> is the size of the
  66. vector. Supported vector sizes are: 2, 4, 8, and 16.
  67. </p>
  68. <p>
  69. The following example shows how to transfer a set of 3D points stored as
  70. an array of <code class="computeroutput"><span class="keyword">float</span></code>s on the host
  71. to the device and then calculate the sum of the point coordinates using the
  72. <code class="computeroutput"><a class="link" href="../boost/compute/accumulate.html" title="Function accumulate">accumulate()</a></code>
  73. function. The sum is then transferred to the host, and the centroid is computed
  74. by dividing it by the total number of points.
  75. </p>
  76. <p>
  77. Note that even though the points are 3D, they are stored as <code class="computeroutput"><span class="identifier">float4</span></code> due to OpenCL's alignment requirements; the unused fourth component is set to zero.
  78. </p>
  79. <p>
  80. </p>
  81. <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
  82. <span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">copy</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
  83. <span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">accumulate</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
  84. <span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">container</span><span class="special">/</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
  85. <span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">types</span><span class="special">/</span><span class="identifier">fundamental</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
  86. <span class="keyword">namespace</span> <span class="identifier">compute</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">;</span>
  87. <span class="comment">// the point centroid example calculates and displays the</span>
  88. <span class="comment">// centroid of a set of 3D points stored as float4's</span>
  89. <span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
  90. <span class="special">{</span>
  91. <span class="keyword">using</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">float4_</span><span class="special">;</span>
  92. <span class="comment">// get default device and setup context</span>
  93. <span class="identifier">compute</span><span class="special">::</span><span class="identifier">device</span> <span class="identifier">device</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_device</span><span class="special">();</span>
  94. <span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">context</span><span class="special">(</span><span class="identifier">device</span><span class="special">);</span>
  95. <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span> <span class="identifier">queue</span><span class="special">(</span><span class="identifier">context</span><span class="special">,</span> <span class="identifier">device</span><span class="special">);</span>
  96. <span class="comment">// point coordinates</span>
  97. <span class="keyword">float</span> <span class="identifier">points</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">1.0f</span><span class="special">,</span> <span class="number">2.0f</span><span class="special">,</span> <span class="number">3.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
  98. <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">3.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
  99. <span class="number">1.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">2.5f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
  100. <span class="special">-</span><span class="number">7.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">3.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
  101. <span class="number">3.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">5.0f</span><span class="special">,</span> <span class="number">0.0f</span> <span class="special">};</span>
  102. <span class="comment">// create vector for five points</span>
  103. <span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">float4_</span><span class="special">&gt;</span> <span class="identifier">vector</span><span class="special">(</span><span class="number">5</span><span class="special">,</span> <span class="identifier">context</span><span class="special">);</span>
  104. <span class="comment">// copy point data to the device</span>
  105. <span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span>
  106. <span class="keyword">reinterpret_cast</span><span class="special">&lt;</span><span class="identifier">float4_</span> <span class="special">*&gt;(</span><span class="identifier">points</span><span class="special">),</span>
  107. <span class="keyword">reinterpret_cast</span><span class="special">&lt;</span><span class="identifier">float4_</span> <span class="special">*&gt;(</span><span class="identifier">points</span><span class="special">)</span> <span class="special">+</span> <span class="number">5</span><span class="special">,</span>
  108. <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span>
  109. <span class="identifier">queue</span>
  110. <span class="special">);</span>
  111. <span class="comment">// calculate sum</span>
  112. <span class="identifier">float4_</span> <span class="identifier">sum</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">accumulate</span><span class="special">(</span>
  113. <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">float4_</span><span class="special">(</span><span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">),</span> <span class="identifier">queue</span>
  114. <span class="special">);</span>
  115. <span class="comment">// calculate centroid</span>
  116. <span class="identifier">float4_</span> <span class="identifier">centroid</span><span class="special">;</span>
  117. <span class="keyword">for</span><span class="special">(</span><span class="identifier">size_t</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="number">3</span><span class="special">;</span> <span class="identifier">i</span><span class="special">++){</span>
  118. <span class="identifier">centroid</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">sum</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">5.0f</span><span class="special">;</span>
  119. <span class="special">}</span>
  120. <span class="comment">// print centroid</span>
  121. <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"centroid: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">centroid</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
  122. <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
  123. <span class="special">}</span>
  124. </pre>
  125. <p>
  126. </p>
  127. </div>
  128. <div class="section">
  129. <div class="titlepage"><div><div><h3 class="title">
  130. <a name="boost_compute.advanced_topics.custom_functions"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.custom_functions" title="Custom Functions">Custom
  131. Functions</a>
  132. </h3></div></div></div>
  133. <p>
  134. The OpenCL runtime and the Boost Compute library provide a number of built-in
  135. functions such as sqrt() and dot(), but these are often not sufficient
  136. for solving the problem at hand.
  137. </p>
  138. <p>
  139. The Boost Compute library provides a few different ways to create custom
  140. functions that can be passed to the provided algorithms such as <code class="computeroutput"><a class="link" href="../boost/compute/transform.html" title="Function transform">transform()</a></code> and <code class="computeroutput"><a class="link" href="../boost/compute/reduce.html" title="Function reduce">reduce()</a></code>.
  141. </p>
  142. <p>
  143. The most basic method is to provide the raw source code for a function:
  144. </p>
  145. <p>
  146. </p>
  147. <pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">function</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;</span> <span class="identifier">add_four</span> <span class="special">=</span>
  148. <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">make_function_from_source</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;(</span>
  149. <span class="string">"add_four"</span><span class="special">,</span>
  150. <span class="string">"int add_four(int x) { return x + 4; }"</span>
  151. <span class="special">);</span>
  152. <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">output</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
  153. </pre>
  154. <p>
  155. </p>
  156. <p>
  157. This can also be done more succinctly using the <code class="computeroutput">BOOST_COMPUTE_FUNCTION</code>
  158. macro:
  159. </p>
  160. <pre class="programlisting"><span class="identifier">BOOST_COMPUTE_FUNCTION</span><span class="special">(</span><span class="keyword">int</span><span class="special">,</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="special">(</span><span class="keyword">int</span> <span class="identifier">x</span><span class="special">),</span>
  161. <span class="special">{</span>
  162. <span class="keyword">return</span> <span class="identifier">x</span> <span class="special">+</span> <span class="number">4</span><span class="special">;</span>
  163. <span class="special">});</span>
  164. <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">output</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
  165. </pre>
  166. <p>
  167. </p>
  168. <p>
  169. Also see <a href="http://kylelutz.blogspot.com/2014/03/custom-opencl-functions-in-c-with.html" target="_top">"Custom
  170. OpenCL functions in C++ with Boost.Compute"</a> for more details.
  171. </p>
  172. </div>
  173. <div class="section">
  174. <div class="titlepage"><div><div><h3 class="title">
  175. <a name="boost_compute.advanced_topics.custom_types"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.custom_types" title="Custom Types">Custom Types</a>
  176. </h3></div></div></div>
  177. <p>
  178. Boost.Compute provides the <code class="computeroutput">BOOST_COMPUTE_ADAPT_STRUCT</code>
  179. macro which allows a C++ struct/class to be wrapped and used in OpenCL.
  180. </p>
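<p>
Since device buffers are filled by copying raw bytes, a struct made of plain scalar members without padding is the simplest candidate for adaptation. The sketch below is host-only standard C++ and checks those layout properties at compile time; the struct <code class="computeroutput">Particle</code> and its members are hypothetical. The macro invocation itself is shown only in a comment, on the assumption that it takes the C++ type, the name to use in OpenCL, and a parenthesized member list, and that it is declared in <code class="computeroutput">boost/compute/types/struct.hpp</code>.
</p>

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical struct used only for illustration.
struct Particle
{
    float x;
    float y;
    float dx;
    float dy;
};

// The struct is transferred as raw bytes, so it should be trivially
// copyable and have a predictable (standard) layout.
static_assert(std::is_trivially_copyable<Particle>::value,
              "Particle must be trivially copyable");
static_assert(std::is_standard_layout<Particle>::value,
              "Particle must have standard layout");

// Four floats with no padding between or after the members.
static_assert(sizeof(Particle) == 4 * sizeof(float),
              "Particle must not contain padding");

// With Boost.Compute available, the struct would then be adapted at
// global scope (assumed invocation, not compiled in this sketch):
//
//   #include <boost/compute/types/struct.hpp>
//   BOOST_COMPUTE_ADAPT_STRUCT(Particle, Particle, (x, y, dx, dy))
```

<p>
Once adapted, the type could be used as the element type of a device vector in the same way as the built-in types.
</p>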
  181. </div>
  182. <div class="section">
  183. <div class="titlepage"><div><div><h3 class="title">
  184. <a name="boost_compute.advanced_topics.complex_values"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.complex_values" title="Complex Values">Complex
  185. Values</a>
  186. </h3></div></div></div>
  187. <p>
  188. While OpenCL itself doesn't natively support complex data types, the Boost
  189. Compute library provides them.
  190. </p>
  191. <p>
  192. To use complex values first include the following header:
  193. </p>
  194. <p>
  195. </p>
  196. <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">types</span><span class="special">/</span><span class="identifier">complex</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
  197. </pre>
  198. <p>
  199. </p>
  200. <p>
  201. A vector of complex values can be created like so:
  202. </p>
  203. <p>
  204. </p>
  205. <pre class="programlisting"><span class="comment">// create vector on device</span>
  206. <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">vector</span><span class="special">;</span>
  207. <span class="comment">// insert two complex values</span>
  208. <span class="identifier">vector</span><span class="special">.</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;(</span><span class="number">1.0f</span><span class="special">,</span> <span class="number">3.0f</span><span class="special">));</span>
  209. <span class="identifier">vector</span><span class="special">.</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;(</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">));</span>
  210. </pre>
  211. <p>
  212. </p>
  213. </div>
  214. <div class="section">
  215. <div class="titlepage"><div><div><h3 class="title">
  216. <a name="boost_compute.advanced_topics.lambda_expressions"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.lambda_expressions" title="Lambda Expressions">Lambda
  217. Expressions</a>
  218. </h3></div></div></div>
  219. <p>
  220. The lambda expression framework allows for functions and predicates to be
  221. defined at the call-site of an algorithm.
  222. </p>
  223. <p>
  224. Lambda expressions use the placeholders <code class="computeroutput"><span class="identifier">_1</span></code>
  225. and <code class="computeroutput"><span class="identifier">_2</span></code> to indicate the arguments.
  226. The following declarations will bring the lambda placeholders into the current
  227. scope:
  228. </p>
  229. <p>
  230. </p>
  231. <pre class="programlisting"><span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">lambda</span><span class="special">::</span><span class="identifier">_1</span><span class="special">;</span>
  232. <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">lambda</span><span class="special">::</span><span class="identifier">_2</span><span class="special">;</span>
  233. </pre>
  234. <p>
  235. </p>
  236. <p>
  237. The following examples show how to use lambda expressions along with the
  238. Boost.Compute algorithms to perform more complex operations on the device.
  239. </p>
  240. <p>
  241. To count the number of odd values in a vector:
  242. </p>
  243. <p>
  244. </p>
  245. <pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">count_if</span><span class="special">(</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">_1</span> <span class="special">%</span> <span class="number">2</span> <span class="special">==</span> <span class="number">1</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
  246. </pre>
  247. <p>
  248. </p>
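<p>
One caveat, assuming the device applies the C99-style remainder that OpenCL C uses for signed integers: a negative odd value has a remainder of -1, so a predicate of the form <code class="computeroutput">_1 % 2 == 1</code> would not count it. The host-only sketch below uses standard C++ (where the remainder follows the same rule since C++11) to show the difference; both helper names are hypothetical.
</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts values whose remainder is exactly 1.
// Negative odd values have remainder -1 and are therefore missed.
std::size_t count_odd_equals_one(const std::vector<int> &values)
{
    return static_cast<std::size_t>(
        std::count_if(values.begin(), values.end(),
                      [](int x) { return x % 2 == 1; }));
}

// Counts values whose remainder is non-zero.
// This covers both positive and negative odd values.
std::size_t count_odd_nonzero(const std::vector<int> &values)
{
    return static_cast<std::size_t>(
        std::count_if(values.begin(), values.end(),
                      [](int x) { return x % 2 != 0; }));
}
```

<p>
For the input 1, 2, 3, -3, 4 the first helper returns 2 and the second returns 3. If the data may contain negative values, the analogous device predicate would be <code class="computeroutput">_1 % 2 != 0</code>.
</p>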
  249. <p>
  250. To multiply each value in a vector by three and subtract four:
  251. </p>
  252. <p>
  253. </p>
  254. <pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">_1</span> <span class="special">*</span> <span class="number">3</span> <span class="special">-</span> <span class="number">4</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
  255. </pre>
  256. <p>
  257. </p>
  258. <p>
  259. Lambda expressions can also be used to create function&lt;&gt; objects:
  260. </p>
  261. <p>
  262. </p>
  263. <pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">function</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;</span> <span class="identifier">add_four</span> <span class="special">=</span> <span class="identifier">_1</span> <span class="special">+</span> <span class="number">4</span><span class="special">;</span>
  264. </pre>
  265. <p>
  266. </p>
  267. </div>
  268. <div class="section">
  269. <div class="titlepage"><div><div><h3 class="title">
  270. <a name="boost_compute.advanced_topics.asynchronous_operations"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.asynchronous_operations" title="Asynchronous Operations">Asynchronous
  271. Operations</a>
  272. </h3></div></div></div>
  273. <p>
  274. A major performance bottleneck in GPGPU applications is memory transfer.
  275. This can be alleviated by overlapping memory transfer with computation. The
  276. Boost Compute library provides the <code class="computeroutput"><a class="link" href="../boost/compute/copy_async.html" title="Function template copy_async">copy_async()</a></code>
  277. function, which performs asynchronous memory transfers between the host
  278. and the device.
  279. </p>
  280. <p>
  281. For example, to initiate a copy from the host to the device and then perform
  282. other actions:
  283. </p>
  284. <p>
  285. </p>
  286. <pre class="programlisting"><span class="comment">// data on the host</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="identifier">host_vector</span> <span class="special">=</span> <span class="special">...</span>
<span class="comment">// create a vector on the device</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="identifier">device_vector</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">size</span><span class="special">(),</span> <span class="identifier">context</span><span class="special">);</span>
<span class="comment">// copy data to the device asynchronously</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">future</span><span class="special">&lt;</span><span class="keyword">void</span><span class="special">&gt;</span> <span class="identifier">f</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy_async</span><span class="special">(</span>
    <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">queue</span>
<span class="special">);</span>
<span class="comment">// perform other work on the host or device</span>
<span class="comment">// ...</span>
<span class="comment">// ensure the copy is completed</span>
<span class="identifier">f</span><span class="special">.</span><span class="identifier">wait</span><span class="special">();</span>
<span class="comment">// use data on the device (e.g. sort)</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">sort</span><span class="special">(</span><span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">queue</span><span class="special">);</span>
</pre>
<p>
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.performance_timing"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.performance_timing" title="Performance Timing">Performance
Timing</a>
</h3></div></div></div>
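<p>
Operations enqueued on a command queue can be timed with OpenCL event
profiling: create the queue with profiling enabled, then read the elapsed
time from the event associated with the operation. For example, to measure
the time to copy a vector of data from the host to the device:
</p>
<p>
</p>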
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">algorithm</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">vector</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">cstdlib</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">event</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">system</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">copy</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">async</span><span class="special">/</span><span class="identifier">future</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">container</span><span class="special">/</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">namespace</span> <span class="identifier">compute</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="comment">// get the default device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">device</span> <span class="identifier">gpu</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_device</span><span class="special">();</span>
    <span class="comment">// create context for default device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">context</span><span class="special">(</span><span class="identifier">gpu</span><span class="special">);</span>
    <span class="comment">// create command queue with profiling enabled</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span> <span class="identifier">queue</span><span class="special">(</span>
        <span class="identifier">context</span><span class="special">,</span> <span class="identifier">gpu</span><span class="special">,</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span><span class="special">::</span><span class="identifier">enable_profiling</span>
    <span class="special">);</span>
    <span class="comment">// generate random data on the host</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">host_vector</span><span class="special">(</span><span class="number">16000000</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">generate</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">rand</span><span class="special">);</span>
    <span class="comment">// create a vector on the device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">device_vector</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">size</span><span class="special">(),</span> <span class="identifier">context</span><span class="special">);</span>
    <span class="comment">// copy data from the host to the device asynchronously</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">future</span><span class="special">&lt;</span><span class="keyword">void</span><span class="special">&gt;</span> <span class="identifier">future</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy_async</span><span class="special">(</span>
        <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">queue</span>
    <span class="special">);</span>
    <span class="comment">// wait for copy to finish</span>
    <span class="identifier">future</span><span class="special">.</span><span class="identifier">wait</span><span class="special">();</span>
    <span class="comment">// get elapsed time from event profiling information</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">chrono</span><span class="special">::</span><span class="identifier">milliseconds</span> <span class="identifier">duration</span> <span class="special">=</span>
        <span class="identifier">future</span><span class="special">.</span><span class="identifier">get_event</span><span class="special">().</span><span class="identifier">duration</span><span class="special">&lt;</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">chrono</span><span class="special">::</span><span class="identifier">milliseconds</span><span class="special">&gt;();</span>
    <span class="comment">// print elapsed time in milliseconds</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"time: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">duration</span><span class="special">.</span><span class="identifier">count</span><span class="special">()</span> <span class="special">&lt;&lt;</span> <span class="string">" ms"</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.opencl_api_interoperability"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.opencl_api_interoperability" title="OpenCL API Interoperability">OpenCL
API Interoperability</a>
</h3></div></div></div>
<p>
Boost.Compute is designed to interoperate easily with the OpenCL API. Each
wrapper class has a conversion operator to its underlying OpenCL type, so
wrapper objects can be passed directly to OpenCL functions.
</p>
<p>
For example, a context object can be passed directly to clGetContextInfo():
</p>
<pre class="programlisting"><span class="comment">// create context object</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">ctx</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">default_context</span><span class="special">();</span>
<span class="comment">// query number of devices using the OpenCL API</span>
<span class="identifier">cl_uint</span> <span class="identifier">num_devices</span><span class="special">;</span>
<span class="identifier">clGetContextInfo</span><span class="special">(</span><span class="identifier">ctx</span><span class="special">,</span> <span class="identifier">CL_CONTEXT_NUM_DEVICES</span><span class="special">,</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">cl_uint</span><span class="special">),</span> <span class="special">&amp;</span><span class="identifier">num_devices</span><span class="special">,</span> <span class="number">0</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"num_devices: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">num_devices</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
</pre>
<p>
</p>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2013, 2014 Kyle Lutz<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="tutorial.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="interop.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>