[section Pickle support]

[section Introduction]

Pickle is a Python module for object serialization, also known as persistence, marshalling, or flattening.

It is often necessary to save the contents of an object to a file and restore them later. One approach to this problem is to write a pair of functions that write data to and read data from a file in a special format. A powerful alternative approach is to use Python's pickle module. Exploiting Python's ability for introspection, the pickle module recursively converts nearly arbitrary Python objects into a stream of bytes that can be written to a file.

The Boost Python Library supports the pickle module through the interface described in detail in the [@https://docs.python.org/2/library/pickle.html Python Library Reference for pickle]. This interface involves the special methods `__getinitargs__`, `__getstate__` and `__setstate__`, as described in the following sections. Note that `Boost.Python` is also fully compatible with Python's cPickle module.

[endsect]
[section The Pickle Interface]

At the user level, the Boost.Python pickle interface involves three special methods:

[variablelist
[[__getinitargs__][When an instance of a Boost.Python extension class is pickled, the pickler tests if the instance has a `__getinitargs__` method. This method must return a Python `tuple` (it is most convenient to use a [link object_wrappers.boost_python_tuple_hpp.class_tuple `boost::python::tuple`]). When the instance is restored by the unpickler, the contents of this tuple are used as the arguments for the class constructor.

If `__getinitargs__` is not defined, `pickle.load` will call the constructor (`__init__`) without arguments; that is, the object must be default-constructible.]]
[[__getstate__][When an instance of a `Boost.Python` extension class is pickled, the pickler tests if the instance has a `__getstate__` method. This method should return a Python object representing the state of the instance.]]
[[__setstate__][When an instance of a `Boost.Python` extension class is restored by the unpickler (`pickle.load`), it is first constructed using the result of `__getinitargs__` as arguments (see above). Subsequently the unpickler tests if the new instance has a `__setstate__` method. If so, this method is called with the result of `__getstate__` (a Python object) as the argument.]]
]

The three special methods described above may be `.def()`\ 'ed individually by the user. However, `Boost.Python` provides an easy-to-use high-level interface via the `boost::python::pickle_suite` class that also enforces consistency: `__getstate__` and `__setstate__` must be defined as a pair. Use of this interface is demonstrated by the following examples.
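
The sequence described above (construct from the result of `__getinitargs__`, then pass the result of `__getstate__` to `__setstate__`) can be imitated in pure Python. The following is a minimal sketch under stated assumptions, not Boost.Python's implementation: `World` and its `secret_number` attribute are hypothetical stand-ins for an extension class and its member data, and `__reduce__` is used here only as a convenient way to route the standard pickler through the three methods.

```python
import pickle


class World(object):
    """Hypothetical pure-Python stand-in for a wrapped C++ class."""

    def __init__(self, country):
        self.country = country
        self.secret_number = 0  # state that the constructor cannot restore

    def __getinitargs__(self):
        # Arguments passed to the constructor when unpickling.
        return (self.country,)

    def __getstate__(self):
        # Any Python object that represents the remaining state.
        return (self.secret_number,)

    def __setstate__(self, state):
        # Called on the freshly constructed instance.
        (self.secret_number,) = state

    def __reduce__(self):
        # Assumption of this sketch: tell the pickler to rebuild the object as
        # World(*initargs) and then call instance.__setstate__(state).
        return (type(self), self.__getinitargs__(), self.__getstate__())


original = World("Germany")
original.secret_number = 42
restored = pickle.loads(pickle.dumps(original))
```

The restored object is first built by calling the constructor with the saved arguments and only then receives the saved state through `__setstate__`.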
[endsect]

[section Example]

There are three files in `python/test` that show how to provide pickle support.

[section pickle1.cpp]

The C++ class in this example can be fully restored by passing the appropriate argument to the constructor. Therefore it is sufficient to define the pickle interface method `__getinitargs__`. This is done in the following way:

Definition of the C++ pickle function:
``
    struct world_pickle_suite : boost::python::pickle_suite
    {
      static
      boost::python::tuple
      getinitargs(world const& w)
      {
        return boost::python::make_tuple(w.get_country());
      }
    };
``
Establishing the Python binding:
``
    class_<world>("world", init<const std::string&>())
      // ...
      .def_pickle(world_pickle_suite())
      // ...
``

[endsect]
[section pickle2.cpp]

The C++ class in this example contains member data that cannot be restored by any of the constructors. Therefore it is necessary to provide the `__getstate__`/`__setstate__` pair of pickle interface methods:

Definition of the C++ pickle functions:
``
    struct world_pickle_suite : boost::python::pickle_suite
    {
      static
      boost::python::tuple
      getinitargs(const world& w)
      {
        // ...
      }

      static
      boost::python::tuple
      getstate(const world& w)
      {
        // ...
      }

      static
      void
      setstate(world& w, boost::python::tuple state)
      {
        // ...
      }
    };
``
Establishing the Python bindings for the entire suite:
``
    class_<world>("world", init<const std::string&>())
      // ...
      .def_pickle(world_pickle_suite())
      // ...
``

For simplicity, the `__dict__` is not included in the result of `__getstate__`. This is not generally recommended, but it is a valid approach if the object's `__dict__` is expected to be always empty. Note that the safety guard described below will catch the cases where this assumption is violated.

[endsect]
[section pickle3.cpp]

This example is similar to pickle2.cpp. However, the object's `__dict__` is included in the result of `__getstate__`. This requires a little more code but is unavoidable if the object's `__dict__` is not always empty.
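
The idea of carrying the `__dict__` inside the state can be sketched at the Python level. This is an illustration under stated assumptions rather than the code of pickle3.cpp: `World` is a hypothetical pure-Python stand-in, `__slots__` is used only to keep the "internal" members out of the instance `__dict__` (as C++ member data would be), and `__reduce__` stands in for the pickling machinery of an extension class.

```python
import pickle


class World(object):
    """Hypothetical stand-in; slot attributes imitate C++ member data."""

    __slots__ = ("_country", "_secret_number", "__dict__")

    def __init__(self, country):
        self._country = country
        self._secret_number = 0

    def __getinitargs__(self):
        return (self._country,)

    def __getstate__(self):
        # Include the instance __dict__ alongside the internal state.
        return (self.__dict__, self._secret_number)

    def __setstate__(self, state):
        instance_dict, secret_number = state
        # Restore attributes that were added to the instance from Python.
        self.__dict__.update(instance_dict)
        self._secret_number = secret_number

    def __reduce__(self):
        # Assumption of this sketch: rebuild via constructor, then __setstate__.
        return (type(self), self.__getinitargs__(), self.__getstate__())


original = World("Japan")
original._secret_number = 7
original.note = "added from Python"  # lands in the instance __dict__
restored = pickle.loads(pickle.dumps(original))
```

Because the `__dict__` travels with the state, attributes added from Python survive the round trip together with the internal data.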
[endsect]

[endsect]

[section Pitfall and Safety Guard]

The pickle protocol described above has an important pitfall that the end user of a Boost.Python extension module might not be aware of:

[*`__getstate__` is defined and the instance's `__dict__` is not empty.]

The author of a `Boost.Python` extension class might provide a `__getstate__` method without considering the possibilities that:

* the class is used in Python as a base class. Most likely the `__dict__` of instances of the derived class needs to be pickled in order to restore the instances correctly.
* the user adds items to the instance's `__dict__` directly. Again, the `__dict__` of the instance then needs to be pickled.

To alert the user to this far-from-obvious problem, a safety guard is provided. If `__getstate__` is defined and the instance's `__dict__` is not empty, `Boost.Python` tests if the class has an attribute `__getstate_manages_dict__`. An exception is raised if this attribute is not defined:
``
    RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set)
``
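
The condition tested by the safety guard can be written out as a short Python sketch. The helper `check_getstate_manages_dict` and the classes `World` and `ManagedWorld` are hypothetical names introduced for illustration; the logic follows the description above and is not taken from the Boost.Python sources.

```python
def check_getstate_manages_dict(instance):
    """Hypothetical helper imitating the safety guard described above."""
    cls = type(instance)
    # Look for a user-provided __getstate__ (ignoring the built-in object type).
    defines_getstate = any(
        "__getstate__" in vars(klass)
        for klass in cls.__mro__
        if klass is not object
    )
    dict_not_empty = bool(getattr(instance, "__dict__", None))
    if defines_getstate and dict_not_empty:
        if not hasattr(cls, "__getstate_manages_dict__"):
            raise RuntimeError(
                "Incomplete pickle support (__getstate_manages_dict__ not set)"
            )


class World(object):
    """Stand-in whose __getstate__ ignores the instance __dict__."""

    __slots__ = ("_secret_number", "__dict__")

    def __init__(self):
        self._secret_number = 0

    def __getstate__(self):
        return (self._secret_number,)  # the __dict__ is NOT included


class ManagedWorld(World):
    """Stand-in that handles the __dict__ and says so explicitly."""

    __getstate_manages_dict__ = True

    def __getstate__(self):
        return (self.__dict__, self._secret_number)
```

In this sketch, a `World` instance passes the check as long as its `__dict__` is empty and fails once an attribute is added from Python, whereas a `ManagedWorld` instance passes in both cases.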
To resolve this problem, it should first be established that the `__getstate__` and `__setstate__` methods manage the instance's `__dict__` correctly. Note that this can be done either at the C++ or the Python level. Finally, the safety guard should intentionally be overridden. For example, in C++ (from pickle3.cpp):
``
    struct world_pickle_suite : boost::python::pickle_suite
    {
      // ...

      static bool getstate_manages_dict() { return true; }
    };
``
Alternatively in Python:
``
    import your_bpl_module
    class your_class(your_bpl_module.your_class):
      __getstate_manages_dict__ = 1
      def __getstate__(self):
        # your code here
      def __setstate__(self, state):
        # your code here
``

[endsect]
[section Practical Advice]

* In `Boost.Python` extension modules with many extension classes, providing complete pickle support for all classes would be a significant overhead. In general, complete pickle support should only be implemented for extension classes that will eventually be pickled.
* Avoid using `__getstate__` if the instance can also be reconstructed by way of `__getinitargs__`. This automatically avoids the pitfall described above.
* If `__getstate__` is required, include the instance's `__dict__` in the Python object that is returned.

[endsect]
[section Light-weight alternative: pickle support implemented in Python]

The pickle4.cpp example demonstrates an alternative technique for implementing pickle support. First we direct Boost.Python, via the `class_::enable_pickling()` member function, to define only the basic attributes required for pickling:
``
    class_<world>("world", init<const std::string&>())
      // ...
      .enable_pickling()
      // ...
``
This enables the standard Python pickle interface as described in the Python documentation. By "injecting" a `__getinitargs__` method into the definition of the wrapped class we make all instances pickleable:
``
    # import the wrapped world class
    from pickle4_ext import world

    # definition of __getinitargs__
    def world_getinitargs(self):
      return (self.get_country(),)

    # now inject __getinitargs__ (Python is a dynamic language!)
    world.__getinitargs__ = world_getinitargs
``
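
The effect of the injection can be tried without building the extension. The sketch below is an assumption-laden imitation: the `world` class is a hypothetical pure-Python stand-in for `pickle4_ext.world`, and its `__reduce__` method only plays the role of the basic pickling attributes by looking up `__getinitargs__` at pickling time.

```python
import pickle


class world(object):
    """Hypothetical stand-in for the wrapped class from pickle4_ext."""

    def __init__(self, country):
        self._country = country

    def get_country(self):
        return self._country

    def __reduce__(self):
        # Assumption of this sketch: use __getinitargs__ if it has been
        # provided, otherwise call the constructor without arguments.
        getinitargs = getattr(self, "__getinitargs__", None)
        initargs = getinitargs() if getinitargs is not None else ()
        return (type(self), initargs)


def world_getinitargs(self):
    return (self.get_country(),)


# Inject the method after the class has been defined.
world.__getinitargs__ = world_getinitargs

restored = pickle.loads(pickle.dumps(world("Hello")))
```

Without the injected method, the stand-in would be rebuilt by calling `world()` with no arguments, which fails because the constructor requires a country.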
See also the tutorial section on injecting additional methods from Python.

[endsect]

[endsect]