I have been exploring some particular use cases with boost ICL’s interval_map in order to map values to intervals based on a dynamic finite domain. I would like to share what I have learned in the process so I will start with a quick overview of a few concepts of the ICL and then show some code snippets to get it working.
Quick introduction to boost ICL
The Interval Container Library (ICL) of boost offers interesting data structures to work with ranges. The tutorial of the library takes you through the common uses of the library by illustrating the examples with date/time related scenarios that make it easy to understand how those data structures solve real problems. As an example, you could fill an interval_set
with business days so that you can easily look up whether or not the business will be opened on a specific date. You could also use an interval_map
to create a schedule where employees are assigned to specific days, allowing you to easily look up who is working on a specific day.
An interval_map
is very similar to a std::map
as its value_type
is a std::pair
that associates a key_type
with a mapped_type
. In this particular case, the key_type
is an interval based on the domain of the interval_map
, and the mapped type is the codomain of the interval_map
. These are the first template parameters that an interval_map
accepts (domain_type
and codomain_type
). Using the chosen domain_type
, an interval_map
can represent interval_type
with open intervals, semi-open intervals, or closed intervals. In any case, an interval is always made of two domain_type
values. Hence, interval_type
is templated on domain_type
.
There are three different interval_map
s which handle the underlying pairs in different ways. The first is the boost::icl::interval_map
. Whenever there are overlapping or adjacent intervals that share the same codomain value, they will be merge together into a single value_type
instance. The boost::icl::separate_interval_map
has a slightly different behavior as it will only merge overlapping ranges. Adjacent ranges will be kept separate. Finally, the boost::icl::split_interval_map
doesn't merge anything at all, so by adding the ranges [1, 4] and [2, 3], we end up with the ranges [1, 1], [2, 3] and [4, 4]. When overlapping occurs, the various interval_map
structures handle the codomain value according to the type codomain_combine
which makes the combining behavior customizable.
What is a dynamic finite domain?
Now, what exactly do I mean by a dynamic finite abstract domain? Dynamic means the elements part of the domain may change over time. As an example, if we would consider the first names of everyone in the room you are currently in as a domain, it would be dynamic as it will most certainly change at some point in time. The dynamic domain I will implement for the interval_map is based on an ordered set of objects. As for finite, there is a finite amount of elements within the ordered set. It can be zero, it can be 5000, etc. The important point is that it even if it may differ over time, it can always be counted.
An important thing to note is that once an object becomes part of the domain, its position relative to neighboring elements of the domain cannot be changed without breaking things down. If the ordering of an element would change because of a modification, then it means that the object has become a different element of the domain. In order to do so, it must first be removed from the ordered set, then it can be modified, and finally it can be inserted again.
Let's start with this short piece of code where I assume that DynamicDomain is a simple wrapper for There are 5 requirements to use `DynamicDomain*` as the Both If we change the intValue of the first element of domainSet to 6, then the order has been broken and no operation (insertion, lookup, removal) shall be performed on the container until the ordering is restored. Otherwise, operations might randomly fail or succeed depending on which elements are tested by binary searches. We can then proceed to modify the rest of the DynamicDomain instances and once we every instance has been incremented by 5, all operations on the container become valid once again. In a similar way, any corresponding `DynamicDomain*` values in The ICL offers two categories of interval types : static and dynamic. Static interval types are On the other hand, dynamic interval types are The reason I am telling you about the various types of interval is that only one among them can be safely used within The reason for which the map has changed its internal representation that way is that the default interval type used for pointers is First, let's start by the problem that motivates this requirement: crashes. Even with all that we've done up to now, overlapping ranges will crash. Let's suppose It turns out that there are two variations of both templates, each of which receives a distinct bool parameter. It is an optimization meant to avoid if conditions on whether or not the sorting is Now we need to do quite a lot of work to get this working. First, the Here is a quick way to prototype our little experiment. The class has a pointer to the collection that owns it. When looking for neighbours, it simply looks itself up and then finds the neighbour. A better implementation could be based on a double linked list and the domain type could be list nodes holding the `DynamicDomain*`. Anyway, here's the simple prototype:
```cpp
class DynamicDomain
{
public:
DynamicDomain(int intValue, std::set<DynamicDomain*>* collection);
DynamicDomain* getPredecessor();
DynamicDomain* getSuccessor();
int intValue;
private:
std::set<DynamicDomain*>* collection;
};
std::set<std::unique_ptr In order to fix some more crashes when using Now that we have met the 5 requirements stated earlier, we can use the members of The complete code for this article can be found over here : https://github.com/Dalzhim/ArticleDynamicFiniteDomainint
and is ordered the same way:
```cpp
class DynamicDomain;
class Value;
std::set<std::unique_ptrdomainSet
holds unique_ptr
instances and MapType
holds raw pointers. In order to use shared_ptr
in both cases, the ICL's interval_map
would need a different solution than what I'm about to present in this article. As such, the elements in domainSet
must outlive the raw pointers used in intervalMap
.</p>
domain_type
:
## 1. domainSet
must be ordered according to the LessThanComparable conceptDynamicDomain
instances can be modified as long as their ordering is preservedMapType
must use closed_interval
exclusivelyDynamicDomain
instances shall be able to return pointers to their immediate neighbors within domainSet
nullptr
bound within an interval implies an empty intervaldomainSet
must be ordered according to the LessThanComparable
concept
std::set
and interval_map
use std::less
as the default comparator. The implementation of this standard functor calls operator<
. But this won't order pointers properly because in that case, it will result in an arithmetic comparison of the addresses. That's not the behavior that we are looking for. In order to provide a more useful behavior, we need to specialize std::less
for `T = DynamicDomain*`.
```cpp
namespace std
{
template <>
struct less<DynamicDomain*>
{
bool operator()(const DynamicDomain* lhs, const DynamicDomain* rhs) const
{
return lhs->intValue < rhs->intValue;
}
};
}
```
## 2. DynamicDomain
instances can be modified as long as their ordering is preserved
intervalMap
should be erased before it can be removed from domainSet
, otherwise all operations might fail.MapType
must use closed_interval
exclusively
boost::icl::open_interval
, boost::icl::left_open_interval
, boost::icl::right_open_interval
and boost::icl::closed_interval
. Those types correspond to different interpretations of a pair of bounds. An open bound is an exclusive bound, while a closed bound is an inclusive one. In other words, open bounds perform comparisons with operator<
and operator>
while closed bounds perform comparisons with operator>=
and operator<=
. Mathematical representations of these types of interval are (lower, upper), (lower, upper], [lower, upper) and [lower, upper] respectively.boost::icl::continuous_interval
and boost::icl::discrete_interval
. The difference between dynamic and static interval types, is that dynamic intervals may mix any of the four static interval types in the same collection, while static interval types won't.MapType
and all the others might lead to crashes. The answer is boost::icl::closed_interval
. Let's suppose intervalMap
holds one single range: [0, 4]. Now we mean to get rid of the fifth DynamicDomain
instance, so we ask the map to remove it from its range and the map changes its internal representation this way: [0, 4). We effectively removed the fifth element from the range, but intervalMap
still holds a pointer to it. Which means it might start crashing when the real object is actually removed from domainSet
and freed.boost::icl::discrete_interval
, which is a dynamic interval type, which allows a closed range to change into a half-open range. Thus we need to make sure it only ever uses boost::icl::closed_interval
to avoid this problem. Here's a new typedef for MapType
which does the trick:
```cpp
typedef boost::icl::interval_map<DynamicDomain*, Value, boost::icl::partial_absorber, std::less, boost::icl::inplace_plus, boost::icl::inter_section, boost::icl::closed_interval<DynamicDomain*>> MapType;
```
We had to find out the default values of many template parameters in order to change the interval_type
. This might get tedious if we have many different maps to build this way. So here's a generic typedef that will default to closed_interval
instances and make it easier to create typedefs for pointer-based interval maps:
```cpp
template <typename Domain, typename Codomain, typename Traits = boost::icl::partial_absorber, templateclosed_interval_map
type and provide it with the Domain
and the Codomain
which helps us keep things sane when defining multiple types.</p>
## 4. DynamicDomain
instances shall be able to return pointers to their immediate neighbors within domainSet
domainSet
contains five elements and intervalMap
contains a single range [0, 4]. Now let's insert a new range: [2, 2]. Suddenly, the internal representation of the map goes from holding one range, to holding three of them : [0, 1], [2, 2] and [3, 4]. Well, how is the map supposed to guess the addresses of elements 1 and 3 if we have never provided it directly? It does so through the boost::icl::detail::successor
and boost::icl::detail::predecessor
templates. The responsibility of these templates is to provide the neighbouring elements of the domain when provided any element of the domain.std::less
(the default) or std::greater
.DynamicDomain
elements must be aware of their siblings. Second, we need to specialize the ICL templates.nullptr
bound within an interval implies an empty interval
mapInstance
, we need to help the interval_map
understand that some of the elements it discovers through the successor and predecessor templates are invalid as bounds. This invalid value is nullptr
. We must once again specialize one of the ICL's templates: boost::icl::is_empty
. The default implementation of is_empty
for closed_interval
checks if the upper bound is smaller than the lower bound. Because this implies dereferencing our pointers, we need to check for nullptr
before the default behavior can be applied.
```cpp
namespace boost { namespace icl {
template
bool is_emptydomainSet
as a dynamic finite domain for the ICL's data structures. This has been most useful to me in order to solve problems where an interval_map would be appropriate except for one problem: some operations invalidate a large part of the data structure because the discrete values used in ordering the DynamicDomain instances would shift. As an example, consider a spreadsheet software. When you select rows 5-10 and you insert a new row at index 3, then your selection has shifted to rows 6-10. With the dynamic finite domain, I can make changes to the discrete values and prevent the interval_map
from being invalidated.