API

Functions

ChipSort.bitonic_merge — Method.

bitonic_merge(input_a::Vec{N,T}, input_b::Vec{N,T}) where {N,T}

Merges two SIMD.Vec objects of the same type and size using a bitonic sort network. The inputs are assumed to be sorted. Returns a pair of vectors with the first and second halves of the merged sequence.

source

ChipSort.bitonic_merge_interleaved — Method.

bitonic_merge_interleaved(output_a, output_b, input_a, input_b, Val(N))

Merges K pairs of vectors using a bitonic sort network, with interleaved execution. The inputs are assumed to be sorted. Inputs and outputs are acessed directly in memory.

source

ChipSort.chipsort! — Method.

Regular Comb sort followed by Insertion sort when the interval becomes 1.

source

ChipSort.chipsort_medium! — Method.

chipsort_medium!(data, Val(V), Val(J), Val(K))

Sort a medium array in-place. The data is divided between K blocks of J SIMD vectors of size V. The array should typically fit inside lower levels of cache memory (L1 or L2), for instance 8192 Int32 values on an conventional desktop computer. Our recipe is:

Generate small sorted vectors.
Use vectorized Comb sort.
Transpose in-place.
Finish with insertion sort over the nearly sorted array.

Examples

julia> chipsort_medium!(Int32[1:2^13...] .* 0xf0ca .%0x10000, Val(8), Val(8), Val(64))'
1×8192 LinearAlgebra.Adjoint{Int32,Array{Int32,1}}:
 30  32  34  36  38  70  72  74  76  108  110  112  114  116  …  65488  65490  65492  65494  65496  65528  65530  65532  65534

source

ChipSort.chipsort_small! — Method.

chipsort_small!(data, Val(V), Val(J))

Sort a small array using a vectorized sorting network and bitonic merge networks. The initial sort is over J SIMD vectors of size V. The array should typically fit inside register memory, for instance 64 Int32 values on an AVX2 machine.

Examples

julia> chipsort_small!(Int32[1:16...] .* Int32(1729) .%0x100, Val(4), Val(4))'
1×16 LinearAlgebra.Adjoint{Int32,Array{Int32,1}}:
 4  8  12  16  67  71  75  79  130  134  138  142  193  197  201  205

source

ChipSort.combsort! — Method.

Regular version of Comb sort.

source

ChipSort.insertion_sort! — Method.

Insertion sort, adapted from the Julia codebase.

source

ChipSort.merge_vecs — Method.

merge_vecs(input...)

This brave function merges 4, 8 vectors or even more. Relies on the great ability from SIMD.jl and LLVM to handle large vectors. Performance not yet tested.

source

ChipSort.sort_vecs! — Method.

Sort blocks from the input array, optionally transposing to generate sorted sequences of size V (default), or further merging those into sequences of size V*L.

source

ChipSort.transpose! — Method.

In-place transpose of a 3 dimensional array VKJ into VJK.

source

ChipSort.transpose_vecs — Method.

transpose_vecs(input::Vararg{Vec{N,T}, L}) where {L,N,T}

Transposes a matrix of L vectors of size N into N vectors of size L. Sizes should be powers of 2.

source

ChipSort.find_transpose_cycles — Method.

Find the cycle seeds to transpose a matrix with K rows and J columns into J rows and K columns.

source

ChipSort.first_stream — Method.

Returns the stream with the smallest first element. If the two streams are empty, returns nothing.

source