CPSC 327 Data Structures and Algorithms Spring 2024

Design a data structure to efficiently support:

Answer:

[This reflects as far as we got in class; it may not be the final answer.]

Use a balanced search tree to store (key,element) pairs, a pointer to the node with the minimum key, and a hashtable-based dictionary with (element,node) pairs.

The hashtable stores the node containing each element — but insert and remove in a balanced search tree may involve restructuring of the tree. How does that affect the contents of the hashtable?

With 2-4 trees, all operations may move elements between nodes — but only a couple of nodes (and a fixed number of nodes) are affected in each operation and there are at most three elements per node, so that gives a maximum number of updates in the hashtable no matter the total number of elements. For AVL trees, nodes are relinked but their elements aren't changed (except for one node with a swap during delete). So updating the hashtable is O(1) per restructuring operation.

Discussion:

[This isn't part of the answer or a writeup, but instead outlines a thought process for finding the answer.]

A strategy for designing a data structure is to start with something straightforward and see where it breaks.

insert, findMin, and removeMin sound like OrderedDictionary, but decreaseKey is not an OrderedDictionary operation and doesn't seem like something that could be implemented with OrderedDictionary operations (nothing works with elements instead of keys). So we consider a data structure underlying OrderedDictionary — working directly with the data structure may allow implementation of additional operations.

OrderedDictionary is commonly implemented with a balanced search tree. (No need to get more specific about the kind unless it is matter.) Let's consider how to implement the desired operations in a balanced search tree.

New plan: a balanced search tree with a pointer to the node with the min key.

New plan: need to be able to find the node containing an element given the element. This sounds like a lookup task — add a dictionary with (element,node) as the (key,value) pairs. So now we have a balanced search tree with the (key,element) pairs, a pointer to the node with the minimum key, and a dictionary with (element,node) pairs. Since the implementation of the dictionary matters for running time, let's go with a hashtable.

...but wait, what about the search tree delete and insert? That may involve restructuring of the tree, what does that mean for the (element,node) hashtable? With 2-4 trees, all operations may move elements between nodes — but only a couple of nodes (and a fixed number of nodes) are affected in each operation and there are at most three elements per node, so that gives a maximum number of updates in the hashtable no matter the total number of elements. For AVL trees, nodes are relinked but their elements aren't changed (except for one node with a swap during delete). So updating the hashtable is O(1) per restructuring operation and thus doesn't change the overall big-Oh.

So we now have O(log n) for modification operations and O(1) for find. It's hard to expect that we'll do better for a balanced search tree because decreaseKey would potentially involve moving the element all the way up or down the tree, so something more clever than delete and insert doesn't seem helpful. And it seems pretty good, particularly because the better option for insert(k,e) and remove(k) — a hashtable — is not efficient for ordered dictionaries...but the O(log n) for insert, remove, decrease is mostly due to the balanced search tree, not maintaining the min — a hashtable would reduce everything to O(1) except for an O(N) update-the-min step in removeMin. Could we do better with updating the min?