+++++++++++++++++++++++++++++++++++++++++++
 Building Hybrid Systems with Boost.Python
+++++++++++++++++++++++++++++++++++++++++++

:Author: David Abrahams
:Contact: dave@boost-consulting.com
:organization: `Boost Consulting`_
:date: 2003-05-14

:Author: Ralf W. Grosse-Kunstleve

:copyright: Copyright David Abrahams and Ralf W. Grosse-Kunstleve 2003. All rights reserved

.. contents:: Table of Contents

.. _`Boost Consulting`: http://www.boost-consulting.com

==========
 Abstract
==========

Boost.Python is an open source C++ library which provides a concise
IDL-like interface for binding C++ classes and functions to
Python. Leveraging the full power of C++ compile-time introspection
and of recently developed metaprogramming techniques, this is achieved
entirely in pure C++, without introducing a new syntax.
Boost.Python's rich set of features and high-level interface make it
possible to engineer packages from the ground up as hybrid systems,
giving programmers easy and coherent access to both the efficient
compile-time polymorphism of C++ and the extremely convenient run-time
polymorphism of Python.

==============
 Introduction
==============

Python and C++ are in many ways as different as two languages could
be: while C++ is usually compiled to machine code, Python is
interpreted. Python's dynamic type system is often cited as the
foundation of its flexibility, while in C++ static typing is the
cornerstone of its efficiency. C++ has an intricate and difficult
compile-time meta-language, while in Python, practically everything
happens at runtime.

Yet for many programmers, these very differences mean that Python and
C++ complement one another perfectly. Performance bottlenecks in
Python programs can be rewritten in C++ for maximal speed, and
authors of powerful C++ libraries choose Python as a middleware
language for its flexible system integration capabilities.
Furthermore, the surface differences mask some strong similarities:

* 'C'-family control structures (if, while, for...)

* Support for object-orientation, functional programming, and generic
  programming (these are both *multi-paradigm* programming languages.)

* Comprehensive operator overloading facilities, recognizing the
  importance of syntactic variability for readability and
  expressivity.

* High-level concepts such as collections and iterators.

* High-level encapsulation facilities (C++: namespaces, Python: modules)
  to support the design of re-usable libraries.

* Exception-handling for effective management of error conditions.

* C++ idioms in common use, such as handle/body classes and
  reference-counted smart pointers, mirror Python reference semantics.

Given Python's rich 'C' interoperability API, it should in principle
be possible to expose C++ type and function interfaces to Python with
an analogous interface to their C++ counterparts. However, the
facilities provided by Python alone for integration with C++ are
relatively meager. Compared to C++ and Python, 'C' has only very
rudimentary abstraction facilities, and support for exception-handling
is completely missing. 'C' extension module writers are required to
manually manage Python reference counts, which is both annoyingly
tedious and extremely error-prone. Traditional extension modules also
tend to contain a great deal of boilerplate code repetition which
makes them difficult to maintain, especially when wrapping an evolving
API.

These limitations have led to the development of a variety of wrapping
systems. SWIG_ is probably the most popular package for the
integration of C/C++ and Python. A more recent development is SIP_,
which was specifically designed for interfacing Python with the Qt_
graphical user interface library. Both SWIG and SIP introduce their
own specialized languages for customizing inter-language bindings.
This has certain advantages, but having to deal with three different
languages (Python, C/C++ and the interface language) also introduces
practical and mental difficulties. The CXX_ package demonstrates an
interesting alternative. It shows that at least some parts of
Python's 'C' API can be wrapped and presented through a much more
user-friendly C++ interface. However, unlike SWIG and SIP, CXX does
not include support for wrapping C++ classes as new Python types.

The features and goals of Boost.Python_ overlap significantly with
many of these other systems. That said, Boost.Python attempts to
maximize convenience and flexibility without introducing a separate
wrapping language. Instead, it presents the user with a high-level
C++ interface for wrapping C++ classes and functions, managing much of
the complexity behind-the-scenes with static metaprogramming.
Boost.Python also goes beyond the scope of earlier systems by
providing:

* Support for C++ virtual functions that can be overridden in Python.

* Comprehensive lifetime management facilities for low-level C++
  pointers and references.

* Support for organizing extensions as Python packages,
  with a central registry for inter-language type conversions.

* A safe and convenient mechanism for tying into Python's powerful
  serialization engine (pickle).

* Coherence with the rules for handling C++ lvalues and rvalues that
  can only come from a deep understanding of both the Python and C++
  type systems.

The key insight that sparked the development of Boost.Python is that
much of the boilerplate code in traditional extension modules could be
eliminated using C++ compile-time introspection. Each argument of a
wrapped C++ function must be extracted from a Python object using a
procedure that depends on the argument type. Similarly the function's
return type determines how the return value will be converted from C++
to Python. Of course argument and return types are part of each
function's type, and this is exactly the source from which
Boost.Python deduces most of the information required.

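As a loose run-time analogy (Boost.Python does this at compile time
with C++ templates, not with Python introspection), the sketch below
picks argument and result converters purely from a function's declared
signature. The names ``Unsigned``, ``converters``, ``to_unsigned`` and
``wrap`` are hypothetical and exist only for this illustration.

```python
import inspect


class Unsigned(int):
    """Marker type standing in for the C++ type 'unsigned'."""


def to_unsigned(value):
    """Checked conversion: accept only non-negative integers."""
    if not isinstance(value, int) or value < 0:
        raise TypeError("expected a non-negative integer")
    return value


# Hypothetical registry: declared type -> conversion procedure.
converters = {Unsigned: to_unsigned, str: str}


def wrap(func):
    """Build a wrapper whose conversions are read off func's signature."""
    sig = inspect.signature(func)
    arg_converters = [converters[p.annotation] for p in sig.parameters.values()]
    result_converter = converters[sig.return_annotation]

    def wrapper(*args):
        if len(args) != len(arg_converters):
            raise TypeError("wrong number of arguments")
        converted = [conv(arg) for conv, arg in zip(arg_converters, args)]
        return result_converter(func(*converted))

    return wrapper


def greet(x: Unsigned) -> str:
    return ("hello", "Boost.Python", "world!")[x]


wrapped_greet = wrap(greet)
print(wrapped_greet(1))
```

The author of ``greet`` writes no conversion code at all: the
signature is the only input ``wrap`` needs, which is the property the
paragraph above attributes to C++ function types.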
This approach leads to *user guided wrapping*: as much information is
extracted directly from the source code to be wrapped as is possible
within the framework of pure C++, and some additional information is
supplied explicitly by the user. Mostly the guidance is mechanical
and little real intervention is required. Because the interface
specification is written in the same full-featured language as the
code being exposed, the user has unprecedented power available when
she does need to take control.

.. _Python: http://www.python.org/
.. _SWIG: http://www.swig.org/
.. _SIP: http://www.riverbankcomputing.co.uk/sip/index.php
.. _Qt: http://www.trolltech.com/
.. _CXX: http://cxx.sourceforge.net/
.. _Boost.Python: http://www.boost.org/libs/python/doc

===========================
 Boost.Python Design Goals
===========================

The primary goal of Boost.Python is to allow users to expose C++
classes and functions to Python using nothing more than a C++
compiler. In broad strokes, the user experience should be one of
directly manipulating C++ objects from Python.

However, it's also important not to translate all interfaces *too*
literally: the idioms of each language must be respected. For
example, though C++ and Python both have an iterator concept, they are
expressed very differently. Boost.Python has to be able to bridge the
interface gap.

It must be possible to insulate Python users from crashes resulting
from trivial misuses of C++ interfaces, such as accessing
already-deleted objects. By the same token the library should
insulate C++ users from the low-level Python 'C' API, replacing
error-prone 'C' interfaces like manual reference-count management and
raw ``PyObject`` pointers with more-robust alternatives.

Support for component-based development is crucial, so that C++ types
exposed in one extension module can be passed to functions exposed in
another without loss of crucial information like C++ inheritance
relationships.

Finally, all wrapping must be *non-intrusive*, without modifying or
even seeing the original C++ source code. Existing C++ libraries have
to be wrappable by third parties who only have access to header files
and binaries.

==========================
 Hello Boost.Python World
==========================

And now for a preview of Boost.Python, and how it improves on the raw
facilities offered by Python. Here's a function we might want to
expose::

    #include <stdexcept>

    char const* greet(unsigned x)
    {
        static char const* const msgs[] = { "hello", "Boost.Python", "world!" };

        if (x > 2)
            throw std::range_error("greet: index out of range");

        return msgs[x];
    }

To wrap this function in standard C++ using the Python 'C' API, we'd
need something like this::

    #include <Python.h>

    extern "C" // all Python interactions use 'C' linkage and calling convention
    {
        // Wrapper to handle argument/result conversion and checking
        PyObject* greet_wrap(PyObject* self, PyObject* args)
        {
            int x;
            if (PyArg_ParseTuple(args, "i", &x)) // extract/check arguments
            {
                char const* result = greet(x); // invoke wrapped function
                return PyString_FromString(result); // convert result to Python
            }
            return 0; // error occurred
        }

        // Table of wrapped functions to be exposed by the module
        static PyMethodDef methods[] = {
            { "greet", greet_wrap, METH_VARARGS, "return one of 3 parts of a greeting" }
            , { NULL, NULL, 0, NULL } // sentinel
        };

        // module initialization function; Python looks for init<module name>
        DL_EXPORT(void) inithello()
        {
            (void) Py_InitModule("hello", methods); // add the methods to the module
        }
    }

Now here's the wrapping code we'd use to expose it with Boost.Python::

    #include <boost/python.hpp>
    using namespace boost::python;

    BOOST_PYTHON_MODULE(hello)
    {
        def("greet", greet, "return one of 3 parts of a greeting");
    }

and here it is in action::

    >>> import hello
    >>> for x in range(3):
    ...     print hello.greet(x)
    ...
    hello
    Boost.Python
    world!

Aside from the fact that the 'C' API version is much more verbose,
it's worth noting a few things that it doesn't handle correctly:

* The original function accepts an unsigned integer, and the Python
  'C' API only gives us a way of extracting signed integers. The
  Boost.Python version will raise a Python exception if we try to pass
  a negative number to ``hello.greet``, but the other one will proceed
  to do whatever the C++ implementation does when converting a
  negative integer to unsigned (usually wrapping to some very large
  number), and pass the incorrect translation on to the wrapped
  function.

* That brings us to the second problem: if the C++ ``greet()``
  function is called with a number greater than 2, it will throw an
  exception. Typically, if a C++ exception propagates across the
  boundary with code generated by a 'C' compiler, it will cause a
  crash. As you can see in the first version, there's no C++
  scaffolding there to prevent this from happening. Functions wrapped
  by Boost.Python automatically include an exception-handling layer
  which protects Python users by translating unhandled C++ exceptions
  into a corresponding Python exception.

* A slightly more-subtle limitation is that the argument conversion
  used in the Python 'C' API case can only get that integer ``x`` in
  *one way*. ``PyArg_ParseTuple`` can't convert Python ``long`` objects
  (arbitrary-precision integers) which happen to fit in an ``unsigned
  int`` but not in a ``signed long``, nor will it ever handle a
  wrapped C++ class with a user-defined implicit ``operator unsigned
  int()`` conversion. Boost.Python's dynamic type conversion
  registry allows users to add arbitrary conversion methods.

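The first point can be reproduced without building an extension
module. The sketch below uses the standard ``ctypes`` module, whose
integer types do no overflow checking, to show the silent wrap-around.
``checked_unsigned`` is a hypothetical helper showing the kind of check
a wrapper has to make; it is not how Boost.Python is implemented.

```python
import ctypes


def checked_unsigned(value):
    """Reject negative input instead of letting it wrap around."""
    if value < 0:
        raise OverflowError("can't convert negative value to unsigned int")
    return ctypes.c_uint(value).value


# Unchecked conversion: -1 becomes the largest value an unsigned int can hold.
wrapped = ctypes.c_uint(-1).value
print(wrapped)

# Checked conversion: the caller gets an exception instead of a bogus index.
try:
    checked_unsigned(-1)
except OverflowError as error:
    print("rejected:", error)
```

With the unchecked path, ``greet`` would receive a huge index and take
its out-of-range branch, which leads straight into the second problem
in the list above.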
==================
 Library Overview
==================

This section outlines some of the library's major features. Except as
necessary to avoid confusion, details of library implementation are
omitted.

------------------
 Exposing Classes
------------------

C++ classes and structs are exposed with a similarly-terse interface.
Given::

    #include <string>

    struct World
    {
        void set(std::string msg) { this->msg = msg; }
        std::string greet() { return msg; }
        std::string msg;
    };

The following code will expose it in our extension module::

    #include <boost/python.hpp>
    using namespace boost::python;

    BOOST_PYTHON_MODULE(hello)
    {
        class_<World>("World")
            .def("greet", &World::greet)
            .def("set", &World::set)
        ;
    }

Although this code has a certain pythonic familiarity, people
sometimes find the syntax a bit confusing because it doesn't look like
most of the C++ code they're used to. All the same, this is just
standard C++. Because of their flexible syntax and operator
overloading, C++ and Python are great for defining domain-specific
(sub)languages (DSLs), and that's what we've done in Boost.Python.
To break it down::

    class_<World>("World")

constructs an unnamed object of type ``class_<World>`` and passes
``"World"`` to its constructor. This creates a new-style Python class
called ``World`` in the extension module, and associates it with the
C++ type ``World`` in the Boost.Python type conversion registry. We
might have also written::

    class_<World> w("World");

but that would've been more verbose, since we'd have to name ``w``
again to invoke its ``def()`` member function::

    w.def("greet", &World::greet)

There's nothing special about the location of the dot for member
access in the original example: C++ allows any amount of whitespace on
either side of a token, and placing the dot at the beginning of each
line allows us to chain as many successive calls to member functions
as we like with a uniform syntax. The other key fact that allows
chaining is that ``class_<>`` member functions all return a reference
to ``*this``.

So the example is equivalent to::

    class_<World> w("World");
    w.def("greet", &World::greet);
    w.def("set", &World::set);

It's occasionally useful to be able to break down the components of a
Boost.Python class wrapper in this way, but the rest of this article
will stick to the terse syntax.

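The chaining idiom itself is not specific to C++. As a minimal sketch
in Python, the hypothetical ``ClassBuilder`` below plays the role of
``class_<>``: every ``def_`` call returns the builder, so calls can be
strung together. The names ``ClassBuilder``, ``def_`` and ``build`` are
invented for this example.

```python
class ClassBuilder:
    """Hypothetical stand-in for class_<>: collects methods, then builds a type."""

    def __init__(self, name):
        self.name = name
        self.namespace = {}

    def def_(self, name, func):
        self.namespace[name] = func
        return self  # returning the builder is what makes chaining possible

    def build(self):
        return type(self.name, (object,), self.namespace)


def set_msg(self, msg):
    self.msg = msg


def greet(self):
    return self.msg


World = (ClassBuilder("World")
         .def_("greet", greet)
         .def_("set", set_msg)
         .build())

planet = World()
planet.set("howdy")
print(planet.greet())
```

If ``def_`` returned ``None`` instead, each registration would need
its own statement naming the builder, just like the ``w.def(...)`` form
shown above.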
For completeness, here's the wrapped class in use::

    >>> import hello
    >>> planet = hello.World()
    >>> planet.set('howdy')
    >>> planet.greet()
    'howdy'

Constructors
============

Since our ``World`` class is just a plain ``struct``, it has an
implicit no-argument (nullary) constructor. Boost.Python exposes the
nullary constructor by default, which is why we were able to write::

    >>> planet = hello.World()

However, well-designed classes in any language may require constructor
arguments in order to establish their invariants. Unlike Python,
where ``__init__`` is just a specially-named method, in C++
constructors cannot be handled like ordinary member functions. In
particular, we can't take their address: ``&World::World`` is an
error. The library provides a different interface for specifying
constructors. Given::

    struct World
    {
        World(std::string msg); // added constructor
        ...

we can modify our wrapping code as follows::

    class_<World>("World", init<std::string>())
        ...

Of course, a C++ class may have additional constructors, and we can
expose those as well by passing more instances of ``init<...>`` to
``def()``::

    class_<World>("World", init<std::string>())
        .def(init<double, double>())
        ...

Boost.Python allows wrapped functions, member functions, and
constructors to be overloaded to mirror C++ overloading.

Data Members and Properties
===========================

Any publicly-accessible data members in a C++ class can be easily
exposed as either ``readonly`` or ``readwrite`` attributes::

    class_<World>("World", init<std::string>())
        .def_readonly("msg", &World::msg)
        ...

and can be used directly in Python::

    >>> planet = hello.World('howdy')
    >>> planet.msg
    'howdy'

This does *not* result in adding attributes to the ``World`` instance
``__dict__``, which can result in substantial memory savings when
wrapping large data structures. In fact, no instance ``__dict__``
will be created at all unless attributes are explicitly added from
Python. Boost.Python owes this capability to the new Python 2.2 type
system, in particular the descriptor interface and ``property`` type.

In C++, publicly-accessible data members are considered a sign of poor
design because they break encapsulation, and style guides usually
dictate the use of "getter" and "setter" functions instead. In
Python, however, ``__getattr__``, ``__setattr__``, and since 2.2,
``property`` mean that attribute access is just one more
well-encapsulated syntactic tool at the programmer's disposal.
Boost.Python bridges this idiomatic gap by making Python ``property``
creation directly available to users. If ``msg`` were private, we
could still expose it as an attribute in Python as follows::

    class_<World>("World", init<std::string>())
        .add_property("msg", &World::greet, &World::set)
        ...

The example above mirrors the familiar usage of properties in Python
2.2+::

    >>> class World(object):
    ...     def __init__(self, msg):
    ...         self.__msg = msg
    ...     def greet(self):
    ...         return self.__msg
    ...     def set(self, msg):
    ...         self.__msg = msg
    ...     msg = property(greet, set)

Operator Overloading
====================

The ability to write arithmetic operators for user-defined types has
been a major factor in the success of both languages for numerical
computation, and the success of packages like NumPy_ attests to the
power of exposing operators in extension modules. Boost.Python
provides a concise mechanism for wrapping operator overloads. The
example below shows a fragment from a wrapper for the Boost rational
number library::

    class_<rational<int> >("rational_int")
        .def(init<int, int>()) // constructor, e.g. rational_int(3,4)
        .def("numerator", &rational<int>::numerator)
        .def("denominator", &rational<int>::denominator)
        .def(-self)        // __neg__ (unary minus)
        .def(self + self)  // __add__ (homogeneous)
        .def(self * self)  // __mul__
        .def(self + int()) // __add__ (heterogeneous)
        .def(int() + self) // __radd__
        ...

The magic is performed using a simplified application of "expression
templates" [VELD1995]_, a technique originally developed for
optimization of high-performance matrix algebra expressions. The
essence is that instead of performing the computation immediately,
operators are overloaded to construct a type *representing* the
computation. In matrix algebra, dramatic optimizations are often
available when the structure of an entire expression can be taken into
account, rather than evaluating each operation "greedily".
Boost.Python uses the same technique to build an appropriate Python
method object based on expressions involving ``self``.

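The idea of an operator that *describes* a computation instead of
performing it can be sketched in a few lines of Python. Everything
here is hypothetical and simplified: ``self_`` stands in for
Boost.Python's ``self`` object, ``OperatorSpec`` for the type
representing the expression, and ``expose`` for the ``def()`` overload
that consumes it. Only the homogeneous forms are handled.

```python
import operator
from fractions import Fraction


class OperatorSpec:
    """Describes an operator application without performing it."""

    def __init__(self, method_name, compute):
        self.method_name = method_name
        self.compute = compute


class SelfPlaceholder:
    """Hypothetical stand-in for Boost.Python's ``self`` object."""

    def __neg__(self):
        return OperatorSpec("__neg__", operator.neg)

    def __add__(self, other):
        return OperatorSpec("__add__", operator.add)

    def __mul__(self, other):
        return OperatorSpec("__mul__", operator.mul)


self_ = SelfPlaceholder()


def expose(cls, *specs):
    """Turn each recorded expression into a special method on cls."""
    for spec in specs:
        def method(instance, *others, _compute=spec.compute):
            operands = [instance.value] + [other.value for other in others]
            return cls(_compute(*operands))
        setattr(cls, spec.method_name, method)
    return cls


class Rational:
    """Thin wrapper around a value, playing the role of the wrapped C++ type."""

    def __init__(self, value):
        self.value = Fraction(value)


# Nothing is computed here: each expression only records which method to add.
expose(Rational, -self_, self_ + self_, self_ * self_)

half = Rational(Fraction(1, 2))
total = half + half * half
print(total.value)
```

The expression ``self_ + self_`` never adds anything. It yields a
description that ``expose`` turns into an ``__add__`` method, which is
the role the paragraph above assigns to expressions involving ``self``.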
.. _NumPy: http://www.pfdubois.com/numpy/

Inheritance
===========

C++ inheritance relationships can be represented to Boost.Python by adding
an optional ``bases<...>`` argument to the ``class_<...>`` template
parameter list as follows::

    class_<Derived, bases<Base1,Base2> >("Derived")
        ...

This has two effects:

1. When the ``class_<...>`` is created, Python type objects
   corresponding to ``Base1`` and ``Base2`` are looked up in
   Boost.Python's registry, and are used as bases for the new Python
   ``Derived`` type object, so methods exposed for the Python ``Base1``
   and ``Base2`` types are automatically members of the ``Derived``
   type. Because the registry is global, this works correctly even if
   ``Derived`` is exposed in a different module from either of its
   bases.

2. C++ conversions from ``Derived`` to its bases are added to the
   Boost.Python registry. Thus wrapped C++ methods expecting (a
   pointer or reference to) an object of either base type can be
   called with an object wrapping a ``Derived`` instance. Wrapped
   member functions of class ``T`` are treated as though they have an
   implicit first argument of ``T&``, so these conversions are
   necessary to allow the base class methods to be called for derived
   objects.

Of course it's possible to derive new Python classes from wrapped C++
classes. Because Boost.Python uses the new-style class
system, that works very much as for the Python built-in types. There
is one significant detail in which it differs: the built-in types
generally establish their invariants in their ``__new__`` function, so
that derived classes do not need to call ``__init__`` on the base
class before invoking its methods::

    >>> class L(list):
    ...     def __init__(self):
    ...         pass
    ...
    >>> L().reverse()
    >>>

Because C++ object construction is a one-step operation, C++ instance
data cannot be constructed until the arguments are available, in the
``__init__`` function::

    >>> class D(SomeBoostPythonClass):
    ...     def __init__(self):
    ...         pass
    ...
    >>> D().some_boost_python_method()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: bad argument type for built-in operation

This happened because Boost.Python couldn't find instance data of type
``SomeBoostPythonClass`` within the ``D`` instance; ``D``'s ``__init__``
function masked construction of the base class. It could be corrected
by either removing ``D``'s ``__init__`` function or having it call
``SomeBoostPythonClass.__init__(...)`` explicitly.

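The same failure mode and the same fix can be seen with ordinary Python
classes. In this sketch the hypothetical ``Base`` stands in for a
wrapped class and its ``_impl`` attribute for the C++ instance data.
The pure-Python version fails with ``AttributeError`` where the session
above shows ``TypeError``, but the cause is the same.

```python
class Base:
    """Stand-in for a wrapped class: its state only exists after __init__."""

    def __init__(self):
        self._impl = [3, 1, 2]  # plays the role of the C++ instance data

    def smallest(self):
        return min(self._impl)


class Broken(Base):
    def __init__(self):
        pass  # masks Base.__init__, so _impl is never created


class Fixed(Base):
    def __init__(self):
        Base.__init__(self)  # construct the base part explicitly


try:
    Broken().smallest()
except AttributeError as error:
    print("broken:", error)

print("fixed:", Fixed().smallest())
```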
Virtual Functions
=================

Deriving new types in Python from extension classes is not very
interesting unless they can be used polymorphically from C++. In
other words, Python method implementations should appear to override
the implementation of C++ virtual functions when called *through base
class pointers/references from C++*. Since the only way to alter the
behavior of a virtual function is to override it in a derived class,
the user must build a special derived class to dispatch a polymorphic
class' virtual functions::

    //
    // interface to wrap:
    //
    class Base
    {
     public:
        virtual int f(std::string x) { return 42; }
        virtual ~Base() {}
    };

    // f is a non-const member function, so b must be a non-const reference
    int calls_f(Base& b, std::string x) { return b.f(x); }

    //
    // Wrapping Code
    //

    // Dispatcher class
    struct BaseWrap : Base
    {
        // Store a pointer to the Python object
        BaseWrap(PyObject* self_) : self(self_) {}
        PyObject* self;

        // Default implementation, for when f is not overridden
        int f_default(std::string x) { return this->Base::f(x); }
        // Dispatch implementation
        int f(std::string x) { return call_method<int>(self, "f", x); }
    };

    ...
        def("calls_f", calls_f);
        class_<Base, BaseWrap>("Base")
            .def("f", &Base::f, &BaseWrap::f_default)
            ;

Now here's some Python code which demonstrates::

    >>> class Derived(Base):
    ...     def f(self, s):
    ...         return len(s)
    ...
    >>> calls_f(Base(), 'foo')
    42
    >>> calls_f(Derived(), 'forty-two')
    9

Things to notice about the dispatcher class:

* The key element which allows overriding in Python is the
  ``call_method`` invocation, which uses the same global type
  conversion registry as the C++ function wrapping does to convert its
  arguments from C++ to Python and its return type from Python to C++.

* Any constructor signatures you wish to wrap must be replicated with
  an initial ``PyObject*`` argument.

* The dispatcher must store this argument so that it can be used to
  invoke ``call_method``.

* The ``f_default`` member function is needed when the function being
  exposed is not pure virtual; there's no other way ``Base::f`` can be
  called on an object of type ``BaseWrap``, since it overrides ``f``.

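The division of labour between ``f`` and ``f_default`` can be modelled
entirely in Python. In this sketch, ``CppBase`` plays the C++ class,
``Dispatcher`` plays ``BaseWrap``, and ``PyBase`` plays the Python
class that Boost.Python would generate. All three names are
hypothetical, and the by-name ``getattr`` lookup is only an analogue of
``call_method``.

```python
class CppBase:
    """Plays the role of the C++ class Base."""

    def f(self, x):
        return 42


class Dispatcher(CppBase):
    """Plays the role of BaseWrap: forwards f to the Python-side object."""

    def __init__(self, py_self):
        self.py_self = py_self  # stored so that f can call back into Python

    def f(self, x):
        # Look the method up by name at call time, like call_method.
        return getattr(self.py_self, "f")(x)

    def f_default(self, x):
        # The only route to the base implementation once f is overridden.
        return CppBase.f(self, x)


class PyBase:
    """Plays the role of the Python class exposed as 'Base'."""

    def __init__(self):
        self._cpp = Dispatcher(self)

    def f(self, x):
        return self._cpp.f_default(x)


class Derived(PyBase):
    def f(self, s):
        return len(s)


def calls_f(b, x):
    """'C++' code that only knows about CppBase and calls its virtual f."""
    return b._cpp.f(x)


print(calls_f(PyBase(), "foo"))
print(calls_f(Derived(), "forty-two"))
```

Without ``f_default``, ``PyBase.f`` would have to call
``Dispatcher.f``, which calls back into ``PyBase.f``, and the two would
recurse forever. That is why the non-pure-virtual case needs a separate
entry point to ``Base::f``.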
Deeper Reflection on the Horizon?
=================================

Admittedly, this formula is tedious to repeat, especially on a project
with many polymorphic classes. That it is necessary reflects some
limitations in C++'s compile-time introspection capabilities: there's
no way to enumerate the members of a class and find out which are
virtual functions. At least one very promising project has been
started to write a front-end which can generate these dispatchers (and
other wrapping code) automatically from C++ headers.

Pyste_ is being developed by Bruno da Silva de Oliveira. It builds on
GCC_XML_, which generates an XML version of GCC's internal program
representation. Since GCC is a highly-conformant C++ compiler, this
ensures correct handling of the most-sophisticated template code and
full access to the underlying type system. In keeping with the
Boost.Python philosophy, a Pyste interface description is neither
intrusive on the code being wrapped, nor expressed in some unfamiliar
language: instead it is a 100% pure Python script. If Pyste is
successful it will mark a move away from wrapping everything directly
in C++ for many of our users. It will also allow us the choice to
shift some of the metaprogram code from C++ to Python. We expect that
soon, not only our users but the Boost.Python developers themselves
will be "thinking hybrid" about their own code.

.. _`GCC_XML`: http://www.gccxml.org/HTML/Index.html
.. _`Pyste`: http://www.boost.org/libs/python/pyste

---------------
 Serialization
---------------

*Serialization* is the process of converting objects in memory to a
form that can be stored on disk or sent over a network connection. The
serialized object (most often a plain string) can be retrieved and
converted back to the original object. A good serialization system will
automatically convert entire object hierarchies. Python's standard
``pickle`` module is just such a system. It leverages the language's strong
runtime introspection facilities for serializing practically arbitrary
user-defined objects. With a few simple and unintrusive provisions this
powerful machinery can be extended to also work for wrapped C++ objects.
Here is an example::

    #include <string>

    struct World
    {
        World(std::string a_msg) : msg(a_msg) {}
        std::string greet() const { return msg; }
        std::string msg;
    };

    #include <boost/python.hpp>
    using namespace boost::python;

    struct World_picklers : pickle_suite
    {
        static tuple
        getinitargs(World const& w) { return make_tuple(w.greet()); }
    };

    BOOST_PYTHON_MODULE(hello)
    {
        class_<World>("World", init<std::string>())
            .def("greet", &World::greet)
            .def_pickle(World_picklers())
        ;
    }

Now let's create a ``World`` object and put it to rest on disk::

    >>> import hello
    >>> import pickle
    >>> a_world = hello.World("howdy")
    >>> pickle.dump(a_world, open("my_world", "w"))

In a potentially *different script* on a potentially *different
computer* with a potentially *different operating system*::

    >>> import pickle
    >>> resurrected_world = pickle.load(open("my_world", "r"))
    >>> resurrected_world.greet()
    'howdy'

Of course the ``cPickle`` module can also be used for faster
processing.


Boost.Python's ``pickle_suite`` fully supports the ``pickle`` protocol
defined in the standard Python documentation. Like a
``__getinitargs__`` function in Python, the ``pickle_suite``'s
``getinitargs()`` is responsible for creating the argument tuple that
will be used to reconstruct the pickled object. The other elements of
the Python pickling protocol, ``__getstate__`` and ``__setstate__``,
can optionally be provided via C++ ``getstate`` and ``setstate``
functions. C++'s static type system allows the library to ensure at
compile-time that nonsensical combinations of functions (e.g.
``getstate`` without ``setstate``) are not used.
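
As a rough pure-Python illustration of how the three hooks fit
together, the sketch below uses a stand-in ``World`` class rather than
the wrapped C++ type. The ``visits`` counter and the explicit
``__reduce__`` method are assumptions added for illustration: Python
3's ``pickle`` does not consult ``__getinitargs__`` on its own, so
``__reduce__`` combines the hooks by hand here.

```python
import pickle


class World:
    """Pure-Python stand-in for the wrapped C++ World class (illustrative)."""

    def __init__(self, msg):
        self.msg = msg
        self.visits = 0  # extra state that is not a constructor argument

    def greet(self):
        self.visits += 1
        return self.msg

    # Plays the role of getinitargs(): arguments used to rebuild the object.
    def __getinitargs__(self):
        return (self.msg,)

    # Plays the role of getstate(): state beyond the constructor arguments.
    def __getstate__(self):
        return {"visits": self.visits}

    # Plays the role of setstate(): restores what __getstate__ captured.
    def __setstate__(self, state):
        self.visits = state["visits"]

    # Tie the hooks together: (callable, constructor args, extra state).
    def __reduce__(self):
        return (type(self), self.__getinitargs__(), self.__getstate__())


original = World("howdy")
original.greet()

# Round trip through an in-memory pickle instead of a file on disk.
restored = pickle.loads(pickle.dumps(original))
```

On loading, ``pickle`` first calls ``World("howdy")`` with the saved
constructor arguments and then hands the saved dictionary to
``__setstate__``, which mirrors the division of labour between
``getinitargs`` and ``getstate``/``setstate`` described above.
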

Enabling serialization of more complex C++ objects requires a little
more work than is shown in the example above. Fortunately the
``object`` interface (see next section) greatly helps in keeping the
code manageable.

------------------
 Object interface
------------------

Experienced 'C' language extension module authors will be familiar
with the ubiquitous ``PyObject*``, manual reference-counting, and the
need to remember which API calls return "new" (owned) references or
"borrowed" (raw) references. These constraints are not just
cumbersome but also a major source of errors, especially in the
presence of exceptions.

Boost.Python provides a class ``object`` which automates reference
counting and provides conversion to Python from C++ objects of
arbitrary type. This significantly reduces the learning effort for
prospective extension module writers.

Creating an ``object`` from any other type is extremely simple::

    object s("hello, world");  // s manages a Python string

``object`` has templated interactions with all other types, with
automatic to-python conversions. It happens so naturally that it's
easily overlooked::

    object ten_Os = 10 * s[4]; // -> "oooooooooo"

In the example above, ``4`` and ``10`` are converted to Python objects
before the indexing and multiplication operations are invoked.

The ``extract<T>`` class template can be used to convert Python objects
to C++ types::

    double x = extract<double>(o);

If a conversion in either direction cannot be performed, an
appropriate exception is thrown at runtime.

The ``object`` type is accompanied by a set of derived types
that mirror the Python built-in types such as ``list``, ``dict``,
``tuple``, etc. as much as possible. This enables convenient
manipulation of these high-level types from C++::

    dict d;
    d["some"] = "thing";
    d["lucky_number"] = 13;
    list l = d.keys();

This almost looks and works like regular Python code, but it is pure
C++. Of course we can wrap C++ functions which accept or return
``object`` instances.
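
For comparison, here is a sketch of the plain Python statements that
the C++ snippets in this section mirror. The variable names are
illustrative, and ``list(...)`` is used because ``dict.keys()`` returns
a view rather than a list in Python 3.

```python
# Python counterpart of: object s("hello, world");
s = "hello, world"

# Python counterpart of: object ten_Os = 10 * s[4];
ten_os = 10 * s[4]  # s[4] is "o", repeated ten times

# Python counterpart of the dict/list manipulation shown in C++.
d = {}
d["some"] = "thing"
d["lucky_number"] = 13
keys = list(d.keys())
```

The C++ and Python versions line up almost statement for statement,
which is the point of the ``object`` interface.
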

=================
 Thinking hybrid
=================

Because of the practical and mental difficulties of combining
programming languages, it is common to settle on a single language at
the outset of any development effort. For many applications, performance
considerations dictate the use of a compiled language for the core
algorithms. Unfortunately, due to the complexity of the static type
system, the price we pay for runtime performance is often a
significant increase in development time. Experience shows that
writing maintainable C++ code usually takes longer and requires *far*
more hard-earned working experience than developing comparable Python
code. Even when developers are comfortable working exclusively in
compiled languages, they often augment their systems with some type of
ad hoc scripting layer for the benefit of their users without ever
availing themselves of the same advantages.

Boost.Python enables us to *think hybrid*. Python can be used for
rapidly prototyping a new application; its ease of use and the large
pool of standard libraries give us a head start on the way to a
working system. If necessary, the working code can be used to
discover rate-limiting hotspots. To maximize performance these can
be reimplemented in C++, together with the Boost.Python bindings
needed to tie them back into the existing higher-level procedure.
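
One way to discover such hotspots in a Python prototype is the standard
``cProfile`` module. The sketch below is only an illustration: the
functions ``slow_inner_product`` and ``run_prototype`` are hypothetical
stand-ins for real application code, and the problem sizes are
arbitrary.

```python
import cProfile
import io
import pstats


def slow_inner_product(xs, ys):
    """Deliberately naive pure-Python loop standing in for a numeric kernel."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def run_prototype(n=2000, repeats=20):
    """Hypothetical prototype: cheap setup followed by a repeated kernel."""
    xs = [float(i) for i in range(n)]
    ys = [2.0] * n
    return sum(slow_inner_product(xs, ys) for _ in range(repeats))


def profile_report(func, top=5):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


result, report = profile_report(run_prototype)
print(report)
```

In a report like this, a function that accounts for most of the
cumulative time (here most likely ``slow_inner_product``) is the natural
candidate for a C++ reimplementation exposed through Boost.Python.
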

Of course, this *top-down* approach is less attractive if it is clear
from the start that many algorithms will eventually have to be
implemented in C++. Fortunately Boost.Python also enables us to
pursue a *bottom-up* approach. We have used this approach very
successfully in the development of a toolbox for scientific
applications. The toolbox started out mainly as a library of C++
classes with Boost.Python bindings, and for a while the growth was
mainly concentrated on the C++ parts. However, as the toolbox is
becoming more complete, more and more newly added functionality can be
implemented in Python.

.. image:: images/python_cpp_mix.png

This figure shows the estimated ratio of newly added C++ and Python
code over time as new algorithms are implemented. We expect this
ratio to level out near 70% Python. Being able to solve new problems
mostly in Python rather than a more difficult statically typed
language is the return on our investment in Boost.Python. The ability
to access all of our code from Python allows a broader group of
developers to use it in the rapid development of new applications.

=====================
 Development history
=====================

The first version of Boost.Python was developed in 2000 by Dave
Abrahams at Dragon Systems, where he was privileged to have Tim Peters
as a guide to "The Zen of Python". One of Dave's jobs was to develop
a Python-based natural language processing system. Since it was
eventually going to be targeting embedded hardware, it was always
assumed that the compute-intensive core would be rewritten in C++ to
optimize speed and memory footprint [#proto]_. The project also wanted to
test all of its C++ code using Python test scripts [#test]_. The only
tool we knew of for binding C++ and Python was SWIG_, and at the time
its handling of C++ was weak. It would be false to claim any deep
insight into the possible advantages of Boost.Python's approach at
this point. Dave's interest and expertise in fancy C++ template
tricks had just reached the point where he could do some real damage,
and Boost.Python emerged as it did because it filled a need and
because it seemed like a cool thing to try.

This early version was aimed at many of the same basic goals we've
described in this paper, differing most noticeably by having a
slightly more cumbersome syntax and by lacking special support for
operator overloading, pickling, and component-based development.
These last three features were quickly added by Ullrich Koethe and
Ralf Grosse-Kunstleve [#feature]_, and other enthusiastic contributors arrived
on the scene to contribute enhancements like support for nested
modules and static member functions.

By early 2001 development had stabilized and few new features were
being added; however, a disturbing new fact came to light: Ralf had
begun testing Boost.Python on pre-release versions of a compiler using
the EDG_ front-end, and the mechanism at the core of Boost.Python
responsible for handling conversions between Python and C++ types was
failing to compile. As it turned out, we had been exploiting a very
common bug in the implementation of all the C++ compilers we had
tested. We knew that as C++ compilers rapidly became more
standards-compliant, the library would begin failing on more
platforms. Unfortunately, because the mechanism was so central to the
functioning of the library, fixing the problem looked very difficult.

Fortunately, later that year Lawrence Berkeley and later Lawrence
Livermore National Labs contracted with `Boost Consulting`_ for support
and development of Boost.Python, and there was a new opportunity to
address fundamental issues and ensure a future for the library. A
redesign effort began with the low level type conversion architecture,
building in standards-compliance and support for component-based
development (in contrast to version 1 where conversions had to be
explicitly imported and exported across module boundaries). A new
analysis of the relationship between the Python and C++ objects was
done, resulting in more intuitive handling for C++ lvalues and
rvalues.

The emergence of a powerful new type system in Python 2.2 made the
choice of whether to maintain compatibility with Python 1.5.2 easy:
the opportunity to throw away a great deal of elaborate code for
emulating classic Python classes alone was too good to pass up. In
addition, Python iterators and descriptors provided crucial and
elegant tools for representing similar C++ constructs. The
development of the generalized ``object`` interface allowed us to
further shield C++ programmers from the dangers and syntactic burdens
of the Python 'C' API. A great number of other features including C++
exception translation, improved support for overloaded functions, and
most significantly, CallPolicies for handling pointers and
references, were added during this period.

In October 2002, version 2 of Boost.Python was released. Development
since then has concentrated on improved support for C++ runtime
polymorphism and smart pointers. Peter Dimov's ingenious
``boost::shared_ptr`` design in particular has allowed us to give the
hybrid developer a consistent interface for moving objects back and
forth across the language barrier without loss of information. At
first, we were concerned that the sophistication and complexity of the
Boost.Python v2 implementation might discourage contributors, but the
emergence of Pyste_ and several other significant feature
contributions have laid those fears to rest. Daily questions on the
Python C++-sig and a backlog of desired improvements show that the
library is getting used. To us, the future looks bright.

.. _`EDG`: http://www.edg.com

=============
 Conclusions
=============

Boost.Python achieves seamless interoperability between two rich and
complementary language environments. Because it leverages template
metaprogramming to introspect about types and functions, the user
never has to learn a third syntax: the interface definitions are
written in concise and maintainable C++. Also, the wrapping system
doesn't have to parse C++ headers or represent the type system: the
compiler does that work for us.

Computationally intensive tasks play to the strengths of C++ and are
often impossible to implement efficiently in pure Python, while jobs
like serialization that are trivial in Python can be very difficult in
pure C++. Given the luxury of building a hybrid software system from
the ground up, we can approach design with new confidence and power.

===========
 Citations
===========

.. [VELD1995] T. Veldhuizen, "Expression Templates," C++ Report,
   Vol. 7 No. 5 June 1995, pp. 26-31.
   http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html

===========
 Footnotes
===========

.. [#proto] In retrospect, it seems that "thinking hybrid" from the
   ground up might have been better for the NLP system: the
   natural component boundaries defined by the pure Python
   prototype turned out to be inappropriate for getting the
   desired performance and memory footprint out of the C++ core,
   which eventually caused some redesign overhead on the Python
   side when the core was moved to C++.

.. [#test] We also have some reservations about driving all C++
   testing through a Python interface, unless that's the only way
   it will be ultimately used. Any transition across language
   boundaries with such different object models can inevitably
   mask bugs.

.. [#feature] These features were expressed very differently in v1 of
   Boost.Python.