Getting around the immutability of std::set in the name of performance.
June 01, 2003
URL:http://drdobbs.com/building-a-mutable-set/184401664
I recently implemented an algorithm that computes the intersection (if any) among each pair of segments in a collection of nonvertical line segments, as shown in Figure 1. The requirements of the algorithm and the application in which it was used forced me to look for a new type of container to provide the flexibility and efficiency I needed. What I found was not a new container, but a twisted version of std::set that perfectly suited my needs. Though based on std::set, this construction is by no means general-purpose; it was developed to help solve a very specific problem. On the other hand, this is not just a fancy parlor trick. I have used this construction effectively in production code.
The basic algorithm is shown in Figure 2. As the algorithm proceeds and intersections are found, the x-coordinates of these intersection points are added to ScanSet. The segments in the subcollection ActiveSegs are ordered based on the current scan position x_c. For each segment in ActiveSegs, its intersection with the vertical line at x_c is calculated. The segments are then sorted by increasing value of the y-coordinates of these points of intersection. What makes this algorithm difficult is that this sort order changes each time the scan position moves. In particular, if two or more segments intersect at x_c, the relative order of their y-coordinates to the left of x_c is the reverse of their order to the right of x_c (see Figure 3). A naive implementation of the algorithm would reorder the entire collection of segments each time the scanline moves. But this is a very inefficient ( O(n^2 * log(n) ) algorithm. A more efficient solution exploits the fact that the algorithm only needs to re-order small groups of intersecting segments, rather than the entire collection. I'll describe that solution later in this article.
The details of detecting and reporting segment intersections are not fundamental to this article, so they are omitted. To put the algorithm into context, the line segment collections may contain several hundred thousand segments, so runtime efficiency is paramount. On the other hand, the users of this code typically have ample memory on their machines, so memory efficiency is of lesser importance. Also, all segment endpoints and intersections have integer coordinates.
1. Change the sort order of ActiveSegs on each iteration.
2. Insert a segment in sorted order.
3. Remove a segment.
4. Reverse the order of intersecting segments.
The following chart summarizes the runtime behavior of the standard containers on each of the four requirements. map is excluded, since no associations are being formed, as is deque, since it behaves more or less like vector. That leaves list, vector, and set as possible candidates.
Requirement list vector set ----------------------------------------------------------- 1) Change sorting O(n*log(n)) O(n*log(n)) not possible 2) Insert segment O(n) O(n) O(log(n)) in sorted order 3) Remove segment constant O(n) O(log(n)) 4) Reverse segments O(n) O(n) not possiblelist and vector perform similarly on everything but removal. Unfortunately, they have linear run time or worse for most of the required operations. Since these operations are performed within the main loop of the algorithm, run times are quadratic, so list and vector are simply too slow.
That leaves set, which does quite well on insertion and removal, because that is what it is designed to do. Unfortunately, set is not designed for any form of reorganization. Copying the data from one set to another could satisfy requirements 1 and 4, but this approach would be very slow.
In summary, the standard containers make trade offs between flexibility and runtime efficiency. list and vector are flexible and slow, while set is fast but limited in what it can do. Or is it?
The set declaration looks like this:
int x_coord = some_scan_position; std::set<Segment, fCompare> SegSet(fCompare(x_coord));This set is a perfectly acceptable and will correctly order all of the segments inserted into it based on the x-coordinate stored in fCompare. But what happens when the scanline moves and this x-coordinate needs to change? Unfortunately, std::set cannot provide this flexibility.
A first attempt at circumventing this problem might be to make the state data m_scan_pos a static variable within fCompare, and then provide a static method set_scan_position() to change the data. This way, a copy of the comparison object could be modified and the changes would be reflected in the object stored by the set. Of course, this strategy would prevent fCompare from being used in more than one set at a time unless all the sets want their scan positions changed simultaneously.
A more subtle approach would be to pass the set a reference to the comparison object, rather than the object itself. Then the referenced object could be modified and the changes would be reflected within the set. Unfortunately (for this discussion), the Standards committee knew the dangers in allowing a set's comparison object to be modified in this way, so the Standard explicitly prevents std::set from storing such a reference [2].
class fCompare { public: fCompare(const int& coord) : m_scan_pos(coord) {} private: const int& m_scan_pos; }; int scan_pos; std::set<Segment, fCompare> SegSet( fCompare(scan_pos) );
0 1 2 3 4 4 3 2 0 1 2 3 4 1 0No matter what the intentions were when creating this set, this probably was not the expected outcome. For a set to do its job, it must rely on the state of its data and the consistency of its comparison object. Once either the state or the comparison object is modified, the set will still attempt to do its job, but likely with unintended results. This point is so important that there are entire items in both Scott Meyers' book, Effective STL [3] (Item 22) and in Herb Sutter's book, More Exceptional C++ [4] (Item 8), illustrating the dangers in modifying a set. To be accurate, both items talk of the pitfalls of modifying a set's contents by means of const_cast, but in summary, they both say that once you do something that makes the set's contents inconsistent with the set's comparison object, the set is no longer responsible for what may go wrong, so you'd better know what you're doing.
Getting back to the algorithm, at each scan position x_c, the collection of active segments needs to have its ordering updated to reflect this new position. If intersections exist at this scan position, manual reordering of the segments is required to keep the segment order compatible with the comparison object. That is, if segment s1 intersects segment s2 at x_c, the order of these two segments needs to be reversed when the scanline crosses their point of intersection. But their order relative to the remaining segments in the collection does not change; any segments that were above or below the two intersecting segments will still be above or below them. The upshot of this is that if the order of the segments in each intersecting group is manually reversed each time the scan position is updated, the segment order will agree with the current state of the comparison object, and the underlying std::set will behave consistently.
I have two options for performing the reversal of intersecting segments:
The task of reordering intersecting segments is performed using the second option just described. This option can be made safer by creating a private MutableIter class inside MutableSegmentSet. This iterator inherits from std::set::iterator and redefines operator* to make it return a Segment&, allowing write access to the contents of the iterator. In addition, the increment and decrement operators are redefined to return MutableIters, so that they can be used in standard algorithms. Now the order of a range (first, last) can be reversed by writing
MutableIter mi_first(first); MutableIter mi_last(last); std::reverse(mi_first, mi_last);In the main loop of the algorithm, each time a new scan position is reached, the method set_scan_pos() is called, which, in turn, calls the private method reorder(). This method is responsible for detecting groups of segments that intersect at the current scan position and then calling the reverse() method to reverse their order. reverse() makes use of the MutableIter and uses std::reverse to do the actual reordering. All of the usual set methods (insert, erase, find) are passed through to the contained m_set. In its full context, the MutableSegmentSet would also add functionality for reporting any discovered intersections and for updating the collection of scan positions.
Aside from the details of reporting segment intersections, this implementation shows the necessary framework for creating a "mutable" set. Access to the mutable portion of the set's functionality is limited to one method, set_scan_pos(), which is in charge of maintaining the consistency of the comparison object and the order of segments in the set.
[1] Jon Bentley and Thomas Ottmann, "Algorithms for Reporting and Counting Geometric Intersections," IEEE Trans. Computers C-28, 643-647 (1979).
[2] ISO/IEC, International Standard, Programming Languages -- C++, Reference Number ISO/IEC 14882:1998(E), 1998, Section 23.1.2.
[3] Scott Meyers, Effective STL, Addison-Wesley, 2001.
[4] Herb Sutter, More Exceptional C++, Addison-Wesley, 2002.
Brian Ruud holds a Ph.D. in mathematics from the University of Washington. He is a software engineer in the field of electronic design automation and is currently working for Virage Logic (www.viragelogic.com/), focusing on geometric algorithm design and infrastructure. His interests are in STL usage and generic programming. He can be reached at [email protected].
Figure 1: A collection of non-vertical line segments
Figure 2: Outline of the Bentley-Ottmann algorithm
C = collection of all non-vertical line segments. ScanSet = sorted collection of x-values of all segment endpoints. ActiveSegs = sorted subcollection of segments from C which are intersected by the current scanline. for (each x_c in ScanSet) Reorder ActiveSegs based on x_c for each segment s in C if (x_c is the left endpoint of s) insert s into ActiveSegs report any new intersections created and update ScanSet for each segment s in ActiveSegs if (x_c is the right endpoint of s) remove s from ActiveSegs report any new intersections created and update ScanSet else // x_c is a point of intersection of two or more segments in ActiveSegs reverse the order of the segments that intersect s at this point report any new intersections created and update ScanSet
Figure 3: To the left of x_c, segment1 precedes segment2 in the sort order. To the right of x_c, segment 2 precedes segment1.
Listing 1: Segment class and a function object that orders the segments based on a scanline position
class Segment { public: // Return y-coord for point on segment given its x-coord. int evaluate(const int xcoord) const { return lower_y + slope() * (xcoord - lower_x); } // other functions not shown private: int lower_x, lower_y, upper_x, upper_y; // other members not shown }; class fCompare { public: fCompare(int scan_pos) : m_scan_pos(scan_pos) {} bool operator() (const Segment& lhs, const Segment& rhs) const { if (lhs.evaluate(m_scan_pos) < rhs.evaluate(m_scan_pos)) return true; else // handle ties } private: int m_scan_pos; };
#include <set> #include <algorithm> #include <iostream> template<class T> struct print : public std::unary_function<T, void> { print(std::ostream& out) : os(out) {} void operator() (T x) { os << x;} std::ostream& os; }; enum SortOrder { Increasing, Decreasing }; class fCompare { public: fCompare(const SortOrder& o) : m_order(o) {} bool operator() (const int a, const int b) const { if (m_order == Increasing) return a < b; else return b < a; } private: const SortOrder& m_order; }; int main() { SortOrder sort_state = Increasing; fCompare comp_obj(sort_state); std::set<int, fCompare> IntSet(comp_obj); for (int i = 0; i < 5; ++i) { IntSet.insert(i); } for_each(IntSet.begin(), IntSet.end(), print<int>(std::cout)); std::cout << std::endl; sort_state = Decreasing; for (int i = 0; i < 5; ++i) { IntSet.insert(i); } for_each(IntSet.begin(), IntSet.end(), print<int>(std::cout)); std::cout << std::endl; }
#include <set> #include "segment.h" class MutableSegmentSet; class fCompare { public: fCompare(const MutableSegmentSet& set) : m_set(set) {} bool operator() (const Segment& lhs, const Segment& rhs) const; private: const MutableSegmentSet& m_set; }; class MutableSegmentSet { public: typedef std::set<Segment, fCompare> SegSet; typedef SegSet::iterator iterator; typedef SegSet::const_iterator const_iterator; MutableSegmentSet() : m_set(fCompare(*this)), m_scan_pos(0) {} void set_coord(int x) { m_scan_pos = x; reorder(); } int get_coord() const { return m_scan_pos; } // methods begin, end // methods insert, erase, find private: std::set<Segment, fCompare> m_set; int m_scan_pos; class MutableIter : public SegSet::iterator { public: typedef SegSet::iterator BaseIter; // constructors Segment& operator*() { return const_cast<Segment&>(this->BaseIter::operator*()); } MutableIter& operator++(); const MutableIter operator++(int); MutableIter& operator--(); const MutableIter operator--(int); bool operator==( const MutableIter& rhs); bool operator!=( const MutableIter& rhs); Segment* operator->(); }; void reorder(); void reverse(iterator first, iterator last); iterator get_intersecting_segs(iterator first, iterator last); };
Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.