[/
 / Copyright (c) 2008 Eric Niebler
 /
 / Distributed under the Boost Software License, Version 1.0. (See accompanying
 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
 /]
[section:tips_n_tricks Tips 'N Tricks]

Squeeze the most performance out of xpressive with these tips and tricks.
[h2 Compile Patterns Once And Reuse Them]

Compiling a regex (dynamic or static) is /far/ more expensive than executing a
match or search. If you have the option, prefer to compile a pattern into
a _basic_regex_ object once and reuse it rather than recreating it over
and over.

Since _basic_regex_ objects are not mutated by any of the regex algorithms, they
are completely thread-safe once their initialization (and that of any grammars of
which they are members) completes. The easiest way to reuse your patterns is
to simply make your _basic_regex_ objects `static const`.
[h2 Reuse _match_results_ Objects]

The _match_results_ object caches dynamically allocated memory. For this
reason, it is far better to reuse the same _match_results_ object if you
have to do many regex searches.

Caveat: _match_results_ objects are not thread-safe, so don't go wild
reusing them across threads.
[h2 Prefer Algorithms That Take A _match_results_ Object]

This is a corollary to the previous tip. If you are doing multiple searches,
you should prefer the regex algorithms that accept a _match_results_ object
over the ones that don't, and you should reuse the same _match_results_ object
each time. If you don't provide a _match_results_ object, a temporary one
will be created for you and discarded when the algorithm returns. Any
memory cached in the object will be deallocated and will have to be reallocated
the next time.
[h2 Prefer Algorithms That Accept Iterator Ranges Over Null-Terminated Strings]

xpressive provides overloads of the _regex_match_ and _regex_search_
algorithms that operate on C-style null-terminated strings. You should
prefer the overloads that take iterator ranges. When you pass a
null-terminated string to a regex algorithm, the end iterator is calculated
immediately by calling `strlen`. If you already know the length of the string,
you can avoid this overhead by calling the regex algorithms with a `[begin, end)`
pair.
[h2 Use Static Regexes]

On average, static regexes execute about 10 to 15% faster than their
dynamic counterparts. It's worth familiarizing yourself with the static
regex dialect.
[h2 Understand [^syntax_option_type::optimize]]

The `optimize` flag tells the regex compiler to spend some extra time analyzing
the pattern. It can cause some patterns to execute faster, but it increases
the time to compile the pattern, and often increases the amount of memory
consumed by the pattern. If you plan to reuse your pattern, `optimize` is
usually a win. If you will only use the pattern once, don't use `optimize`.
[h1 Common Pitfalls]

Keep the following tips in mind to avoid stepping in potholes with xpressive.
[h2 Create Grammars On A Single Thread]

With static regexes, you can create grammars by nesting regexes inside one
another. When compiling the outer regex, both the outer and inner regex objects,
and all the regex objects to which they refer either directly or indirectly, are
modified. For this reason, it's dangerous for global regex objects to participate
in grammars. It's best to build regex grammars from a single thread. Once built,
the resulting regex grammar can be executed from multiple threads without
problems.
[h2 Beware Nested Quantifiers]

This is a pitfall common to many regular expression engines. Some patterns can
cause exponentially bad performance. Often these patterns involve one quantified
term nested within another quantifier, such as `"(a*)*"`, although in many
cases, the problem is harder to spot. Beware of patterns that have nested
quantifiers.
[endsect]