Skip to content

Conversation

bruno-roustant
Copy link
Collaborator

In the case of SortedIterationKTypeVTypeHashMap, we want to sort the key set of a delegate hash map. We know there are no duplicate keys, and they are randomly distributed, and we don't need to have a stable sort. Hence it is a good case to use an in-place Quick sort instead of IndirectSort merge sort (which needs a copy of the source array).

So with this change, SortedIterationKTypeVTypeHashMap sorts with in-place quick sort. It is faster and does not require a clone of the indices array anymore.

This new QuickSort implementation, with 3-way partitioning, works on a provided comparator and also a provided swapper.

  • The comparator is different than the one in IndirectSort because it receives the indices of the elements to compare (not the value of the indirect indices-array). So it can be used for both direct and indirect sorting. (It is used for indirect sorting in SortedIterationKTypeVTypeHashMap, and for direct sorting in SortTest)
  • The swapper interface has two benefits. It allows the sort implementation to not depend on the array type anymore. And it also allows the caller to provide a custom swapper, able for example to swap the elements of two arrays at the same time. (Useful when we have two simple arrays for keys and values and we sort the keys while the values follow, then we can do a binary search on the keys, very compact)

I extracted the sort certification code from IndirectSortTest to move it to a new SortTest, which now tests both IndirectSort and QuickSort.

final int[] order;
switch (algo) {
case MERGESORT:
order = IndirectSort.mergesort(0, x.length, Integer::compare);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a bug here where the comparator used was not indirect. I fixed that in the new SortTest.Algorithm.INDIRECT_MERGESORT.

@@ -34,7 +34,8 @@ private IndirectSort() {
* Returns the order of elements between indices <code>start</code> and <code>length</code>, as
* indicated by the given <code>comparator</code>.
*
* <p>This routine uses merge sort. It is guaranteed to be stable.
* <p>This routine uses merge sort. It is guaranteed to be stable. It creates a new indices array,
* and clones it while sorting.
*/
public static int[] mergesort(int start, int length, IntBinaryOperator comparator) {
final int[] src = createOrderArray(start, length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait... was it like this before? Becuase that version of mergesort that accepts an external array shouldn't clone it - it doesn't make sense? And right now the mergesort(int, int, comparator) creates a new array just so that it can be cloned in a call to sort(int[], comparator)...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was like this before. AFAIK merge sort requires a source and a destination array. Its memory requirement is O(n).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. Clear.

/*
* HPPC
*
* Copyright (C) 2010-2020 Carrot Search s.c.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update the template (2020)....

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried 2021 but spotless corrected it :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know. It's the template under gradle/validation/spotless/license-header.txt. Don't worry about it.

@bruno-roustant
Copy link
Collaborator Author

What do you think of this proposal Dawid?

@dweiss
Copy link
Member

dweiss commented Oct 19, 2021

Uh, sorry - I thought this was silent consensus - this looks good to me!

@bruno-roustant bruno-roustant merged commit d882975 into carrotsearch:master Oct 19, 2021
@bruno-roustant
Copy link
Collaborator Author

Thanks! I'll try to remember this silent consensus next time.

@dweiss
Copy link
Member

dweiss commented Oct 20, 2021

...or just ping me. :)

@dweiss dweiss added this to the 0.9.1 milestone Dec 15, 2021
@bruno-roustant bruno-roustant deleted the sort branch August 1, 2023 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants