[ODE] Sweep-and-Prune space for ODE
gfyffe at gmail.com
Fri Nov 26 11:00:20 MST 2004
Yes, I did modify all the algorithms I tested to have the same
interface and operate on the same data. I also tried with and without
prefetching on the radix sort, and even without it it's very fast. I
didn't get them to produce ranks, however.
To solve that, I could 1) sort arbitrary objects based on a key, like
std::sort does, or I could 2) produce a list of ranks instead of
sorting the data. Which one does ODE need? It would be pretty easy
to change the interface to the sort routines.
On Fri, 26 Nov 2004 16:10:25 +0100, Pierre Terdiman
<pierre.terdiman at novodex.com> wrote:
> > Okay, now THAT is a nice implementation of radix sort. For floats, it
> > is 2.3 times faster than std::sort! That means it's 5 times faster
> > than Pierre's code. Five. I would strongly suggest that anyone using
> > Pierre's code should switch to the stereopsis code.
> BTW, if you really want relevant numbers you should make sure all versions
> do the same things. For example the memory buffers for the radix sort are
> allocated outside the sort routine in Michael's code snippet, while it's
> done within the RadixSort class in mine (that is, it's usually profiled
> too!). Allocating memory for 50,000,000 elements might take some significant
> time, for example.
> The prefetch instructions is another thing you might want to consider. As
> far as I'm concerned, I didn't want to use them in sake of compatibility
> with non-SSE machines (at the time, I didn't even have one). So, *of course*
> taking advantage of this gives the 25% speedup Michael mentions. But there's
> a price to pay for this.
> Over the years, I've been sent quite a few "faster" sort routines by
> different people. The most common mistake is the problem I already mentioned
> : not providing ranks as output. This is quite a significant speed boost !
> Adding ranks immediately makes you touch twice as much memory, and it's
> often enough to double the CPU time as well.
> Overall I found that radix sort to be quite ok, often faster than the
> - Pierre
More information about the ODE