
Implement your own iterator II

This article implements a tree-structured container and then an STL-style iterator for it.

It serves as a supplementary, worked example for the previous article on how to implement a custom iterator.

Implementation of tree_t

I plan to implement a tree container that is simple in one sense and not in another. It is not simple because it is meant to model a standard file directory structure; it is simple because I only plan to implement the most essential tree interfaces, such as traversal.

It imitates an ordinary file directory and is dedicated to reproducing how folders behave. It has nothing to do with binary trees, AVL trees, or red-black trees.

The first thing to establish is that tree_t depends on generic_node_t. tree_t itself is not responsible for the tree algorithms; it only holds a pointer to the root node. Everything related to tree operations lives in generic_node_t.

tree_t

The implementation of tree_t is therefore given first:

namespace dp::tree{
  template<typename Data, typename Node = detail::generic_node_t<Data>>
  class tree_t : detail::generic_tree_ops<Node> {
    public:
    using Self = tree_t<Data, Node>;
    using BaseT = detail::generic_tree_ops<Node>;
    using NodeT = Node;
    using NodePtr = Node *;
    using iterator = typename Node::iterator;
    using const_iterator = typename Node::const_iterator;
    using reverse_iterator = typename Node::reverse_iterator;
    using const_reverse_iterator = typename Node::const_reverse_iterator;

    using difference_type = std::ptrdiff_t;
    using value_type = typename iterator::value_type;
    using pointer = typename iterator::pointer;
    using reference = typename iterator::reference;
    using const_pointer = typename iterator::const_pointer;
    using const_reference = typename iterator::const_reference;

    ~tree_t() { clear(); }

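    void clear() override {
      delete _root;    // deleting a null pointer is a no-op
      _root = nullptr; // avoid a double delete when clear() runs again, e.g. from the destructor
      BaseT::clear();
    }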

    void insert(Data const &data) {
      if (!_root) {
        _root = new NodeT{data};
        return;
      }
      _root->insert(data);
    }
    void insert(Data &&data) {
      if (!_root) {
        _root = new NodeT{std::move(data)}; // move here as well, instead of copying
        return;
      }
      _root->insert(std::move(data));
    }
    template<typename... Args>
    void emplace(Args &&...args) {
      if (!_root) {
        _root = new NodeT{std::forward<Args>(args)...};
        return;
      }
      _root->emplace(std::forward<Args>(args)...);
    }

    Node const &root() const { return *_root; }
    Node &root() { return *_root; }

    iterator begin() { return _root->begin(); }
    iterator end() { return _root->end(); }
    const_iterator begin() const { return _root->begin(); }
    const_iterator end() const { return _root->end(); }
    reverse_iterator rbegin() { return _root->rbegin(); }
    reverse_iterator rend() { return _root->rend(); }
    const_reverse_iterator rbegin() const { return _root->rbegin(); }
    const_reverse_iterator rend() const { return _root->rend(); }

    private:
    NodePtr _root{nullptr};
  }; // class tree_t

} // namespace dp::tree

The necessary interfaces are essentially forwarded to _root.
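The forwarding idea can be sketched in isolation. The following is a minimal sketch, not the dp-tree.hh code: mini_node and mini_tree are hypothetical names, and the sketch assumes ownership through std::unique_ptr rather than raw pointers, purely to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node: all tree logic lives here.
template<typename Data>
struct mini_node {
  explicit mini_node(Data d) : data(std::move(d)) {}

  // Append a child holding `d` and return it.
  mini_node &insert(Data d) {
    children.push_back(std::make_unique<mini_node>(std::move(d)));
    children.back()->parent = this;
    return *children.back();
  }
  std::size_t size() const { return children.size(); }

  Data data;
  mini_node *parent{nullptr};
  std::vector<std::unique_ptr<mini_node>> children;
};

// Hypothetical container: it only owns the root and forwards to it.
template<typename Data>
class mini_tree {
public:
  // The first insert creates the root; later inserts are forwarded to it.
  void insert(Data d) {
    if (!_root) {
      _root = std::make_unique<mini_node<Data>>(std::move(d));
      return;
    }
    _root->insert(std::move(d));
  }
  bool empty() const { return _root == nullptr; }
  void clear() { _root.reset(); }
  mini_node<Data> &root() {
    assert(_root != nullptr); // calling root() on an empty tree is a usage error
    return *_root;
  }

private:
  std::unique_ptr<mini_node<Data>> _root;
};
```

With this shape, inserting 1, 2 and 3 into an empty mini_tree<int> yields a root holding 1 with two children, which mirrors how tree_t::insert behaves in the listing above.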

generic_node_t

Now let's look at the implementation of the node.

A tree node holds the following data:

namespace dp::tree::detail{
  template<typename Data>
  struct generic_node_t {
    using Node = generic_node_t<Data>;
    using NodePtr = Node *; //std::unique_ptr<Node>;
    using Nodes = std::vector<NodePtr>;

    private:
    Data _data{};
    NodePtr _parent{nullptr};
    Nodes _children{};
    
    // ...
  };
}

Based on this, we can implement node insertion, deletion, and basic access operations.

They are omitted here for reasons of space.

If you are interested, please refer to the source files dp-tree.hh and tree.cc.
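To give a rough idea of what such operations can look like, here is a minimal sketch. It is not the omitted dp-tree.hh code: sketch_node and its member names are assumptions chosen to resemble the calls used elsewhere in this article (insert, parent, size, empty, operator[], operator*), and remove is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the omitted node operations.
template<typename Data>
struct sketch_node {
  using Node = sketch_node<Data>;
  using NodePtr = Node *;

  explicit sketch_node(Data d) : _data(std::move(d)) {}
  sketch_node(sketch_node const &) = delete;            // a node owns its subtree,
  sketch_node &operator=(sketch_node const &) = delete; // so copying is disabled
  ~sketch_node() {
    for (NodePtr c : _children) delete c; // recursively frees the subtree
  }

  // Insertion: append a new child and link it back to this node.
  Node &insert(Data d) {
    NodePtr c = new Node{std::move(d)};
    c->_parent = this;
    _children.push_back(c);
    return *c;
  }

  // Deletion: drop the idx-th child together with its subtree.
  void remove(std::size_t idx) {
    assert(idx < _children.size());
    delete _children[idx];
    _children.erase(_children.begin() + static_cast<std::ptrdiff_t>(idx));
  }

  // Basic access.
  Data &operator*() { return _data; }
  Node &operator[](std::size_t idx) { return *_children[idx]; }
  NodePtr parent() const { return _parent; }
  std::size_t size() const { return _children.size(); }
  bool empty() const { return _children.empty(); }

private:
  Data _data{};
  NodePtr _parent{nullptr};
  std::vector<NodePtr> _children{};
};
```

For example, after sketch_node<std::string> root{"usr"}; auto &bin = root.insert("bin"); bin.insert("ls"); the expression *root[0][0] refers to the string "ls", and root[0][0].parent() points back at bin.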

Forward iterator

The complete implementation of the forward iterator is given below, to round out the explanation in the previous article.

A forward iterator here means begin(), end(), and the operations that go with them. Simply put, it supports one-way traversal of the container's elements from start to end.

For the tree structure, begin() refers to the root node. The traversal order is root, then left subtree, then right subtree, that is, a preorder traversal. This is quite different from AVL-style trees, which mainly use in-order traversal.

Accordingly, end() refers to the last node in that order: the bottom-right leaf, reached from the root by repeatedly taking the last child. What does that mean in practice? Incrementing once more at the last leaf does not move the pointer; it sets the _invalid flag to true to indicate that the end has been reached.

In the STL, dereferencing an end() iterator is not allowed. To avoid access errors, the end() implemented here can be dereferenced safely, although the result carries no meaning (conceptually, end() - 1 is the real back() element).

namespace dp::tree::detail{
  template<typename Data>
  struct generic_node_t {

    // ...

    struct preorder_iter_data {

      // iterator traits
      using difference_type = std::ptrdiff_t;
      using value_type = Node;
      using pointer = value_type *;
      using reference = value_type &;
      using iterator_category = std::forward_iterator_tag;
      using self = preorder_iter_data;
      using const_pointer = value_type const *;
      using const_reference = value_type const &;

      preorder_iter_data() {}
      preorder_iter_data(pointer ptr_, bool invalid_ = false)
        : _ptr(ptr_)
          , _invalid(invalid_) {}
      preorder_iter_data(const preorder_iter_data &o)
        : _ptr(o._ptr)
          , _invalid(o._invalid) {}
      preorder_iter_data &operator=(const preorder_iter_data &o) {
        _ptr = o._ptr, _invalid = o._invalid;
        return *this;
      }

      bool operator==(self const &r) const { return _ptr == r._ptr && _invalid == r._invalid; }
      bool operator!=(self const &r) const { return _ptr != r._ptr || _invalid != r._invalid; }
      reference data() { return *_ptr; }
      const_reference data() const { return *_ptr; }
      reference operator*() { return data(); }
      const_reference operator*() const { return data(); }
      pointer operator->() { return &(data()); }
      const_pointer operator->() const { return &(data()); }
      self &operator++() { return _incr(); }
      self operator++(int) {
        self copy{_ptr, _invalid};
        ++(*this);
        return copy;
      }

      static self begin(const_pointer root_) {
        return self{const_cast<pointer>(root_)};
      }
      static self end(const_pointer root_) {
        if (root_ == nullptr) return self{const_cast<pointer>(root_)};
        pointer p = const_cast<pointer>(root_), last{nullptr};
        while (p) {
          last = p;
          if (p->empty())
            break;
          p = &((*p)[p->size() - 1]);
        }
        auto it = self{last, true};
        ++it;
        return it;
      }

      private:
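      self &_incr() {
        if (_invalid || _ptr == nullptr) {
          return (*this);
        }

        if (!_ptr->empty()) {
          // descend: the first child comes next in preorder
          _ptr = &((*_ptr)[0]);
          return (*this);
        }

        // leaf: climb until the node we came from has a sibling on its right
        Node *child = _ptr;
        for (Node *pp = child->parent(); pp; child = pp, pp = pp->parent()) {
          size_type idx = 0;
          for (auto *vv : pp->_children) {
            ++idx;
            if (vv == child) break; // compare with the node we climbed from, not the original leaf
          }
          if (idx < pp->size()) {
            _ptr = &((*pp)[idx]);
            return (*this);
          }
        }
        _invalid = true; // no successor: _ptr stays on the last node
        return (*this);
      }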

      pointer _ptr{};
      bool _invalid{};
      // size_type _child_idx{};
    };

    using iterator = preorder_iter_data;
    using const_iterator = iterator;
    iterator begin() { return iterator::begin(this); }
    const_iterator begin() const { return const_iterator::begin(this); }
    iterator end() { return iterator::end(this); }
    const_iterator end() const { return const_iterator::end(this); }

    // ...
  };
}

This forward iterator traverses the tree starting at the root node, from top to bottom and from left to right.
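The walk itself (descend to the first child, otherwise climb until a right sibling exists) can be sketched in isolation. This is a minimal sketch under stated assumptions, not the dp-tree.hh code: pnode, preorder_next and preorder_values are hypothetical names, and children are owned through std::unique_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical minimal node, just enough to show the walk.
struct pnode {
  int value{};
  pnode *parent{nullptr};
  std::vector<std::unique_ptr<pnode>> children;

  // Append a child holding `v` and return it.
  pnode &add(int v) {
    children.push_back(std::make_unique<pnode>());
    pnode &c = *children.back();
    c.value = v;
    c.parent = this;
    return c;
  }
};

// Preorder successor of `cur`; returns nullptr after the last node.
inline pnode *preorder_next(pnode *cur) {
  assert(cur != nullptr);
  if (!cur->children.empty())
    return cur->children.front().get(); // 1. descend to the first child
  // 2. leaf: climb until the node we came from has a right sibling
  for (pnode *child = cur, *up = cur->parent; up; child = up, up = up->parent) {
    std::size_t idx = 0;
    while (up->children[idx].get() != child) ++idx; // position of child within up
    if (idx + 1 < up->children.size())
      return up->children[idx + 1].get();
  }
  return nullptr; // 3. climbed past the root: the traversal is finished
}

// Collect the values in visiting order.
inline std::vector<int> preorder_values(pnode &root) {
  std::vector<int> out;
  for (pnode *p = &root; p; p = preorder_next(p)) out.push_back(p->value);
  return out;
}
```

For a root 1 with children 2 and 4, where 2 has the single child 3, preorder_values visits 1, 2, 3, 4. The step from 3 to 4 is the interesting one: it has to climb two levels while remembering which child it came from at each level.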

There is a saying: a master simply stands there, seemingly full of openings, and yet none of them turns out to be an opening. preorder_iter_data has a bit of that flavor: there are so many details that, once all of them are taken care of, it becomes hard to explain why the code is written the way it is.

That is only a joke. In reality, a full explanation would take too much space, so I will save the ink and let the code speak for itself.

Reverse iterator

It is similar to the forward iterator, but the algorithm differs.

It is not listed here for reasons of space. If you are interested, please refer to the source files dp-tree.hh and tree.cc.
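As an illustration only, one plausible way to walk a preorder sequence backwards is sketched below. It is an assumption about how such a walk can work, not necessarily what dp-tree.hh does; rnode, rightmost_leaf, preorder_prev and reverse_preorder_values are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical minimal node with a parent link.
struct rnode {
  int value{};
  rnode *parent{nullptr};
  std::vector<std::unique_ptr<rnode>> children;

  rnode &add(int v) {
    children.push_back(std::make_unique<rnode>());
    rnode &c = *children.back();
    c.value = v;
    c.parent = this;
    return c;
  }
};

// Last node of the subtree rooted at n in preorder: keep taking the last child.
inline rnode *rightmost_leaf(rnode *n) {
  assert(n != nullptr);
  while (!n->children.empty()) n = n->children.back().get();
  return n;
}

// Preorder predecessor of `cur`; returns nullptr when cur is the root.
inline rnode *preorder_prev(rnode *cur) {
  assert(cur != nullptr);
  rnode *up = cur->parent;
  if (!up) return nullptr; // the root has no predecessor
  std::size_t idx = 0;
  while (up->children[idx].get() != cur) ++idx; // position of cur within up
  if (idx == 0) return up; // first child: the parent comes right before it
  // otherwise: the last node of the left sibling's subtree comes right before it
  return rightmost_leaf(up->children[idx - 1].get());
}

// Collect the values from the last preorder node back to the root.
inline std::vector<int> reverse_preorder_values(rnode &root) {
  std::vector<int> out;
  for (rnode *p = rightmost_leaf(&root); p; p = preorder_prev(p))
    out.push_back(p->value);
  return out;
}
```

For a root 1 with children 2 and 4, where 2 has the child 3 and 4 has the child 5, the forward order is 1, 2, 3, 4, 5 and this walk produces 5, 4, 3, 2, 1.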

Things to take care of

Let me restate the points to watch when writing an iterator entirely by hand, and add some that were not explained in detail in the previous article:

  1. begin() and end()
  2. The iterator class, usually nested in the container (though not necessarily), must implement at least:

    1. The increment operator, in order to walk forward
    2. The decrement operator, if the iterator is bidirectional (bidirectional_iterator_tag) or random access (random_access_iterator_tag)
    3. operator*, to dereference the iterator: enables (*it).xxx
    4. operator->, as its companion: enables it->xxx
    5. operator!=, to test for the end of the iteration range; define operator== as well (since C++20 the compiler can derive != from ==, so operator== alone is enough there; earlier standards need both)
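The checklist above can be sketched as a small skeleton. This is a minimal sketch on a made-up container, not part of tree_t: slots, item and sum_ids are hypothetical names, and the iterator simply wraps a pointer into a std::array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical container, used only to host the iterator.
struct slots {
  struct item { int id; };
  std::array<item, 3> items{{{10}, {20}, {30}}};

  struct iterator {
    using difference_type = std::ptrdiff_t;
    using value_type = item;
    using pointer = item *;
    using reference = item &;
    using iterator_category = std::forward_iterator_tag;

    iterator() = default;
    explicit iterator(pointer p) : _p(p) {}

    reference operator*() const {                 // enables (*it).id
      assert(_p != nullptr);
      return *_p;
    }
    pointer operator->() const { return _p; }     // enables it->id
    iterator &operator++() { ++_p; return *this; }                  // ++it
    iterator operator++(int) { iterator c{*this}; ++_p; return c; } // it++
    bool operator==(iterator const &r) const { return _p == r._p; }
    bool operator!=(iterator const &r) const { return _p != r._p; }

  private:
    pointer _p{nullptr};
  };

  iterator begin() { return iterator{items.data()}; }
  iterator end() { return iterator{items.data() + items.size()}; }
};

// A range-based for loop needs exactly begin(), end(), !=, prefix ++ and *.
inline int sum_ids(slots &s) {
  int total = 0;
  for (auto &entry : s) total += entry.id;
  return total;
}
```

Since the iterator is only declared as a forward iterator, no decrement operator is provided here.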

Supplementary note:

  1. To work with the algorithms in <algorithm>, you need to define the iterator traits by hand, like this:

    struct preorder_iter_data {
    
      // iterator traits
      using difference_type = std::ptrdiff_t;
      using value_type = Node;
      using pointer = value_type *;
      using reference = value_type &;
      using iterator_category = std::forward_iterator_tag;
      // ...
    };

The purpose is to let std::find_if and the other algorithms, together with helpers such as std::distance and std::advance, pick the right implementation based on iterator_category. If your iterator is not random access, the operations it lacks are emulated where possible: std::distance, for example, counts by walking from the first iterator with ++ until it reaches the last one. Stepping backward with -- has no such fallback; it requires at least a bidirectional iterator.

These traits used to be provided by deriving from std::iterator. That class template has been deprecated since C++17, so the recommendation now is to write the definitions directly by hand.

Defining them is not mandatory: a range-based for loop works without them. They matter once the iterator is handed to code that consults std::iterator_traits, such as the standard algorithms.
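To see the traits at work, here is a minimal sketch on a made-up singly linked chain. The names chain_node, chain_iter, count_links and find_value are hypothetical; only std::iterator_traits, std::distance and std::find_if come from the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical singly linked node; the caller owns the storage.
struct chain_node {
  int value{};
  chain_node *next{nullptr};
};

struct chain_iter {
  // The five member types that std::iterator_traits looks for.
  using difference_type = std::ptrdiff_t;
  using value_type = chain_node;
  using pointer = chain_node *;
  using reference = chain_node &;
  using iterator_category = std::forward_iterator_tag;

  chain_iter() = default; // a default-constructed iterator marks the end
  explicit chain_iter(chain_node *p) : _p(p) {}
  reference operator*() const { return *_p; }
  pointer operator->() const { return _p; }
  chain_iter &operator++() {
    assert(_p != nullptr); // must not step past the end
    _p = _p->next;
    return *this;
  }
  chain_iter operator++(int) { chain_iter c{*this}; ++(*this); return c; }
  bool operator==(chain_iter const &r) const { return _p == r._p; }
  bool operator!=(chain_iter const &r) const { return _p != r._p; }

private:
  chain_node *_p{nullptr};
};

// The library reads the category through iterator_traits.
static_assert(std::is_same<std::iterator_traits<chain_iter>::iterator_category,
                           std::forward_iterator_tag>::value,
              "chain_iter is advertised as a forward iterator");

// For a forward iterator, std::distance counts by repeated ++.
inline std::ptrdiff_t count_links(chain_node *head) {
  return std::distance(chain_iter{head}, chain_iter{});
}

// Locate a value with std::find_if; nullptr when it is absent.
inline chain_node *find_value(chain_node *head, int wanted) {
  auto it = std::find_if(chain_iter{head}, chain_iter{},
                         [wanted](chain_node &n) { return n.value == wanted; });
  return it == chain_iter{} ? nullptr : &*it;
}
```

Removing the five using declarations from chain_iter would make the static_assert and both helper functions fail to compile, which is one way to see what the traits are for.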

  1. In most cases, you declare the category as std::forward_iterator_tag and define the ++ operator to match; if you declare std::bidirectional_iterator_tag, you also need to define the -- operator.

    Both increment and decrement need the prefix and the postfix form. Please refer to the relevant section of the previous article, on how to implement a custom iterator.

  2. Define begin() and end() in the iterator so that the container class can borrow them (in the tree_t example in this article, the container class is generic_node_t).
  3. If you want rbegin/rend, they are not a substitute for --. They usually require another set of definitions, independent of the forward iterator. tree_t has a complete implementation of this, but it is not listed here for reasons of space. If you are interested, please refer to the source files dp-tree.hh and tree.cc.
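The prefix and postfix forms mentioned in the first item can be sketched on a small bidirectional iterator. This is a minimal sketch on a made-up type, not part of tree_t: index_iter and backwards are hypothetical names, and the iterator addresses a std::vector<int> by index.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical bidirectional iterator over a std::vector<int>, by index.
struct index_iter {
  using difference_type = std::ptrdiff_t;
  using value_type = int;
  using pointer = int *;
  using reference = int &;
  using iterator_category = std::bidirectional_iterator_tag;

  index_iter() = default;
  index_iter(std::vector<int> *v, std::size_t i) : _v(v), _i(i) {}

  reference operator*() const {
    assert(_v != nullptr && _i < _v->size());
    return (*_v)[_i];
  }
  pointer operator->() const { return &(*_v)[_i]; }

  // Prefix forms step first and return the iterator itself.
  index_iter &operator++() { ++_i; return *this; }
  index_iter &operator--() { --_i; return *this; }
  // Postfix forms return a copy of the old position.
  index_iter operator++(int) { index_iter c{*this}; ++(*this); return c; }
  index_iter operator--(int) { index_iter c{*this}; --(*this); return c; }

  bool operator==(index_iter const &r) const { return _v == r._v && _i == r._i; }
  bool operator!=(index_iter const &r) const { return !(*this == r); }

private:
  std::vector<int> *_v{nullptr};
  std::size_t _i{0};
};

// Walk from the end back to the beginning using only --.
inline std::vector<int> backwards(std::vector<int> &v) {
  std::vector<int> out;
  index_iter first{&v, 0}, it{&v, v.size()};
  while (it != first) out.push_back(*--it);
  return out;
}
```

Because the category is declared as std::bidirectional_iterator_tag and -- exists, helpers such as std::prev accept this iterator as well.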

Use/test code

Here is some test code:

void test_g_tree() {
  dp::tree::tree_t<tree_data> t;
  UNUSED(t);
  assert(t.rbegin() == t.rend());
  assert(t.begin() == t.end());

  std::array<char, 128> buf;

  //     1
  // 2 3 4 5 6 7
  for (auto v : {1, 2, 3, 4, 5, 6, 7}) {
    std::sprintf(buf.data(), "str#%d", v);
    // t.insert(tree_data{v, buf.data()});
    tree_data vd{v, buf.data()};
    t.insert(std::move(vd));
    // tree_info(t);
  }

  {
    auto v = 8;
    std::sprintf(buf.data(), "str#%d", v);
    tree_data td{v, buf.data()};
    t.insert(td);

    v = 9;
    std::sprintf(buf.data(), "str#%d", v);
    t.emplace(v, buf.data());

    {
      auto b = t.root().begin(), e = t.root().end();
      auto &bNode = (*b), &eNode = (*e);
      std::cout << "::: " << (*bNode) << '\n'; // print bNode.data()
      std::cout << "::: " << (eNode.data()) << '\n';
    }

    {
      int i;
      i = 0;
      for (auto &vv : t) {
        std::cout << i << ": " << (*vv) << ", " << '\n';
        if (i == 8) {
          std::cout << ' ';
        }
        i++;
      }
      std::cout << '\n';
    }

    using T = decltype(t);
    auto it = std::find_if(t.root().begin(), t.root().end(), [](typename T::NodeT &n) -> bool { return (*n) == 9; });

    v = 10;
    std::sprintf(buf.data(), "str#%d", v);
    it->emplace(v, buf.data());

    v = 11;
    std::sprintf(buf.data(), "str#%d", v);
    (*it).emplace(v, buf.data());

    #if defined(_DEBUG)
    auto const itv = t.find([](T::const_reference n) { return (*n) == 10; });
    assert(*(*itv) == 10);
    #endif
  }

  //

  int i;

  i = 0;
  for (auto &v : t) {
    std::cout << i << ": " << (*v) << ", " << '\n';
    if (i == 8) {
      std::cout << ' ';
    }
    i++;
  }
  std::cout << '\n';

  i = 0;
  for (auto it = t.rbegin(); it != t.rend(); ++it, ++i) {
    auto &v = (*it);
    std::cout << i << ": " << (*v) << ", " << '\n';
    if (i == 8) {
      std::cout << ' ';
    }
  }
  std::cout << '\n';
}

This code simply demonstrates usage; it is not written as a proper unit test, nor does it need to be.

Postscript

This article presents a real, working container class together with its iterator implementation. I hope it serves you well as a template for your own code.



hedzr