dalzhim.github.io C++ Developer

Boost interval_map with a Dynamic Finite Domain

2015-08-06

I have been exploring some particular use cases with boost ICL’s interval_map in order to map values to intervals based on a dynamic finite domain. I would like to share what I have learned in the process so I will start with a quick overview of a few concepts of the ICL and then show some code snippets to get it working.

Quick introduction to boost ICL

The Interval Container Library (ICL) of boost offers interesting data structures to work with ranges. The tutorial of the library takes you through the common uses of the library by illustrating the examples with date/time related scenarios that make it easy to understand how those data structures solve real problems. As an example, you could fill an interval_set with business days so that you can easily look up whether or not the business will be opened on a specific date. You could also use an interval_map to create a schedule where employees are assigned to specific days, allowing you to easily look up who is working on a specific day.

An interval_map is very similar to a std::map as its value_type is a std::pair that associates a key_type with a mapped_type. In this particular case, the key_type is an interval based on the domain of the interval_map, and the mapped type is the codomain of the interval_map. These are the first template parameters that an interval_map accepts (domain_type and codomain_type). Using the chosen domain_type, an interval_map can represent interval_type with open intervals, semi-open intervals, or closed intervals. In any case, an interval is always made of two domain_type values. Hence, interval_type is templated on domain_type.

There are three different interval_maps which handle the underlying pairs in different ways. The first is the boost::icl::interval_map. Whenever there are overlapping or adjacent intervals that share the same codomain value, they will be merge together into a single value_type instance. The boost::icl::separate_interval_map has a slightly different behavior as it will only merge overlapping ranges. Adjacent ranges will be kept separate. Finally, the boost::icl::split_interval_map doesn't merge anything at all, so by adding the ranges [1, 4] and [2, 3], we end up with the ranges [1, 1], [2, 3] and [4, 4]. When overlapping occurs, the various interval_map structures handle the codomain value according to the type codomain_combine which makes the combining behavior customizable.

What is a dynamic finite domain?

Now, what exactly do I mean by a dynamic finite abstract domain? Dynamic means the elements part of the domain may change over time. As an example, if we would consider the first names of everyone in the room you are currently in as a domain, it would be dynamic as it will most certainly change at some point in time. The dynamic domain I will implement for the interval_map is based on an ordered set of objects. As for finite, there is a finite amount of elements within the ordered set. It can be zero, it can be 5000, etc. The important point is that it even if it may differ over time, it can always be counted.

An important thing to note is that once an object becomes part of the domain, its position relative to neighboring elements of the domain cannot be changed without breaking things down. If the ordering of an element would change because of a modification, then it means that the object has become a different element of the domain. In order to do so, it must first be removed from the ordered set, then it can be modified, and finally it can be inserted again.

Let's start with this short piece of code where I assume that DynamicDomain is a simple wrapper for int and is ordered the same way: ```cpp class DynamicDomain; class Value; std::set<std::unique_ptr> domainSet; domainSet.insert(std::unique_ptr(new DynamicDomain(1))); domainSet.insert(std::unique_ptr(new DynamicDomain(2))); domainSet.insert(std::unique_ptr(new DynamicDomain(3))); domainSet.insert(std::unique_ptr(new DynamicDomain(4))); domainSet.insert(std::unique_ptr(new DynamicDomain(5))); typedef boost::icl::interval_map<DynamicDomain*, Value> MapType; MapType intervalMap; ``` Please note that for simplicity, domainSet holds unique_ptr instances and MapType holds raw pointers. In order to use shared_ptr in both cases, the ICL's interval_map would need a different solution than what I'm about to present in this article. As such, the elements in domainSet must outlive the raw pointers used in intervalMap.</p>

There are 5 requirements to use `DynamicDomain*` as the domain_type:

  1. domainSet must be ordered according to the LessThanComparable concept
  2. DynamicDomain instances can be modified as long as their ordering is preserved
  3. MapType must use closed_interval exclusively
  4. DynamicDomain instances shall be able to return pointers to their immediate neighbors within domainSet
  5. A nullptr bound within an interval implies an empty interval
## 1. domainSet must be ordered according to the LessThanComparable concept

Both std::set and interval_map use std::less as the default comparator. The implementation of this standard functor calls operator<. But this won't order pointers properly because in that case, it will result in an arithmetic comparison of the addresses. That's not the behavior that we are looking for. In order to provide a more useful behavior, we need to specialize std::less for `T = DynamicDomain*`. ```cpp namespace std { template <> struct less<DynamicDomain*> { bool operator()(const DynamicDomain* lhs, const DynamicDomain* rhs) const { return lhs->intValue < rhs->intValue; } }; } ``` ## 2. DynamicDomain instances can be modified as long as their ordering is preserved

If we change the intValue of the first element of domainSet to 6, then the order has been broken and no operation (insertion, lookup, removal) shall be performed on the container until the ordering is restored. Otherwise, operations might randomly fail or succeed depending on which elements are tested by binary searches. We can then proceed to modify the rest of the DynamicDomain instances and once we every instance has been incremented by 5, all operations on the container become valid once again. In a similar way, any corresponding `DynamicDomain*` values in intervalMap should be erased before it can be removed from domainSet, otherwise all operations might fail.

## 3. MapType must use closed_interval exclusively

The ICL offers two categories of interval types : static and dynamic. Static interval types are boost::icl::open_interval, boost::icl::left_open_interval, boost::icl::right_open_interval and boost::icl::closed_interval. Those types correspond to different interpretations of a pair of bounds. An open bound is an exclusive bound, while a closed bound is an inclusive one. In other words, open bounds perform comparisons with operator< and operator> while closed bounds perform comparisons with operator>= and operator<=. Mathematical representations of these types of interval are (lower, upper), (lower, upper], [lower, upper) and [lower, upper] respectively.

On the other hand, dynamic interval types are boost::icl::continuous_interval and boost::icl::discrete_interval. The difference between dynamic and static interval types, is that dynamic intervals may mix any of the four static interval types in the same collection, while static interval types won't.

The reason I am telling you about the various types of interval is that only one among them can be safely used within MapType and all the others might lead to crashes. The answer is boost::icl::closed_interval. Let's suppose intervalMap holds one single range: [0, 4]. Now we mean to get rid of the fifth DynamicDomain instance, so we ask the map to remove it from its range and the map changes its internal representation this way: [0, 4). We effectively removed the fifth element from the range, but intervalMap still holds a pointer to it. Which means it might start crashing when the real object is actually removed from domainSet and freed.

The reason for which the map has changed its internal representation that way is that the default interval type used for pointers is boost::icl::discrete_interval, which is a dynamic interval type, which allows a closed range to change into a half-open range. Thus we need to make sure it only ever uses boost::icl::closed_interval to avoid this problem. Here's a new typedef for MapType which does the trick: ```cpp typedef boost::icl::interval_map<DynamicDomain*, Value, boost::icl::partial_absorber, std::less, boost::icl::inplace_plus, boost::icl::inter_section, boost::icl::closed_interval<DynamicDomain*>> MapType; ``` We had to find out the default values of many template parameters in order to change the interval_type. This might get tedious if we have many different maps to build this way. So here's a generic typedef that will default to closed_interval instances and make it easier to create typedefs for pointer-based interval maps: ```cpp template <typename Domain, typename Codomain, typename Traits = boost::icl::partial_absorber, templateclass Compare = std::less, templateclass Combine = boost::icl::inplace_plus, templateclass Section = boost::icl::inter_section> using closed_interval_map = boost::icl::interval_map<Domain*, Codomain, Traits, Compare, Combine, Section, boost::icl::closed_interval<Domain*>>; typedef closed_interval_map<DynamicDomain, Value> MapType; ``` Now we can use the closed_interval_map type and provide it with the Domain and the Codomain which helps us keep things sane when defining multiple types.</p> ## 4. DynamicDomain instances shall be able to return pointers to their immediate neighbors within domainSet

First, let's start by the problem that motivates this requirement: crashes. Even with all that we've done up to now, overlapping ranges will crash. Let's suppose domainSet contains five elements and intervalMap contains a single range [0, 4]. Now let's insert a new range: [2, 2]. Suddenly, the internal representation of the map goes from holding one range, to holding three of them : [0, 1], [2, 2] and [3, 4]. Well, how is the map supposed to guess the addresses of elements 1 and 3 if we have never provided it directly? It does so through the boost::icl::detail::successor and boost::icl::detail::predecessor templates. The responsibility of these templates is to provide the neighbouring elements of the domain when provided any element of the domain.

It turns out that there are two variations of both templates, each of which receives a distinct bool parameter. It is an optimization meant to avoid if conditions on whether or not the sorting is std::less (the default) or std::greater.

Now we need to do quite a lot of work to get this working. First, the DynamicDomain elements must be aware of their siblings. Second, we need to specialize the ICL templates.

Here is a quick way to prototype our little experiment. The class has a pointer to the collection that owns it. When looking for neighbours, it simply looks itself up and then finds the neighbour. A better implementation could be based on a double linked list and the domain type could be list nodes holding the `DynamicDomain*`. Anyway, here's the simple prototype: ```cpp class DynamicDomain { public: DynamicDomain(int intValue, std::set<DynamicDomain*>* collection); DynamicDomain* getPredecessor(); DynamicDomain* getSuccessor(); int intValue; private: std::set<DynamicDomain*>* collection; }; std::set<std::unique_ptr> domainSet; domainSet.insert(std::unique_ptr(new DynamicDomain(1, &domainSet))); domainSet.insert(std::unique_ptr(new DynamicDomain(2, &domainSet))); domainSet.insert(std::unique_ptr(new DynamicDomain(3, &domainSet))); domainSet.insert(std::unique_ptr(new DynamicDomain(4, &domainSet))); domainSet.insert(std::unique_ptr(new DynamicDomain(5, &domainSet))); ``` And here are the template specializations: ```cpp namespace boost { namespace icl { namespace detail { template<> struct successor<DynamicDomain*, true> { inline static DynamicDomain* apply(DynamicDomain* value) { return value->getSuccessor(); } }; template<> struct successor<DynamicDomain*, false> { inline static DynamicDomain* apply(DynamicDomain* value) { return value->getPredecessor(); } }; template<> struct predecessor<DynamicDomain*, true> { inline static DynamicDomain* apply(DynamicDomain* value) { return value->getPredecessor(); } }; template<> struct predecessor<DynamicDomain*, false> { inline static DynamicDomain* apply(DynamicDomain* value) { return value->getSuccessor(); } }; }}} ``` ## 5. A nullptr bound within an interval implies an empty interval

In order to fix some more crashes when using mapInstance, we need to help the interval_map understand that some of the elements it discovers through the successor and predecessor templates are invalid as bounds. This invalid value is nullptr. We must once again specialize one of the ICL's templates: boost::icl::is_empty. The default implementation of is_empty for closed_interval checks if the upper bound is smaller than the lower bound. Because this implies dereferencing our pointers, we need to check for nullptr before the default behavior can be applied. ```cpp namespace boost { namespace icl { template bool is_empty(const boost::icl::closed_interval& object) { return object.lower() == nullptr || object.upper() == nullptr || boost::icl::domain_less(object.upper(), object.lower()); } }} ``` ## Conclusion

Now that we have met the 5 requirements stated earlier, we can use the members of domainSet as a dynamic finite domain for the ICL's data structures. This has been most useful to me in order to solve problems where an interval_map would be appropriate except for one problem: some operations invalidate a large part of the data structure because the discrete values used in ordering the DynamicDomain instances would shift. As an example, consider a spreadsheet software. When you select rows 5-10 and you insert a new row at index 3, then your selection has shifted to rows 6-10. With the dynamic finite domain, I can make changes to the discrete values and prevent the interval_map from being invalidated.

The complete code for this article can be found over here : https://github.com/Dalzhim/ArticleDynamicFiniteDomain


Similar Posts

Comments