Thanks for using Compiler Explorer
Sponsors
Jakt
C++
Ada
Algol68
Analysis
Android Java
Android Kotlin
Assembly
C
C3
Carbon
C with Coccinelle
C++ with Coccinelle
C++ (Circle)
CIRCT
Clean
Clojure
CMake
CMakeScript
COBOL
C++ for OpenCL
MLIR
Cppx
Cppx-Blue
Cppx-Gold
Cpp2-cppfront
Crystal
C#
CUDA C++
D
Dart
Elixir
Erlang
Fortran
F#
GLSL
Go
Haskell
HLSL
Helion
Hook
Hylo
IL
ispc
Java
Julia
Kotlin
LLVM IR
LLVM MIR
Modula-2
Mojo
Nim
Numba
Nix
Objective-C
Objective-C++
OCaml
Odin
OpenCL C
Pascal
Pony
PTX
Python
Racket
Raku
Ruby
Rust
Sail
Snowball
Scala
Slang
Solidity
Spice
SPIR-V
Swift
LLVM TableGen
Toit
Triton
TypeScript Native
V
Vala
Visual Basic
Vyper
WASM
Yul (Solidity IR)
Zig
Javascript
GIMPLE
Ygen
sway
rust source #1
Output
Compile to binary object
Link to binary
Execute the code
Intel asm syntax
Demangle identifiers
Verbose demangling
Filters
Unused labels
Library functions
Directives
Comments
Horizontal whitespace
Debug intrinsics
Compiler
mrustc (master)
rustc 1.0.0
rustc 1.1.0
rustc 1.10.0
rustc 1.11.0
rustc 1.12.0
rustc 1.13.0
rustc 1.14.0
rustc 1.15.1
rustc 1.16.0
rustc 1.17.0
rustc 1.18.0
rustc 1.19.0
rustc 1.2.0
rustc 1.20.0
rustc 1.21.0
rustc 1.22.0
rustc 1.23.0
rustc 1.24.0
rustc 1.25.0
rustc 1.26.0
rustc 1.27.0
rustc 1.27.1
rustc 1.28.0
rustc 1.29.0
rustc 1.3.0
rustc 1.30.0
rustc 1.31.0
rustc 1.32.0
rustc 1.33.0
rustc 1.34.0
rustc 1.35.0
rustc 1.36.0
rustc 1.37.0
rustc 1.38.0
rustc 1.39.0
rustc 1.4.0
rustc 1.40.0
rustc 1.41.0
rustc 1.42.0
rustc 1.43.0
rustc 1.44.0
rustc 1.45.0
rustc 1.45.2
rustc 1.46.0
rustc 1.47.0
rustc 1.48.0
rustc 1.49.0
rustc 1.5.0
rustc 1.50.0
rustc 1.51.0
rustc 1.52.0
rustc 1.53.0
rustc 1.54.0
rustc 1.55.0
rustc 1.56.0
rustc 1.57.0
rustc 1.58.0
rustc 1.59.0
rustc 1.6.0
rustc 1.60.0
rustc 1.61.0
rustc 1.62.0
rustc 1.63.0
rustc 1.64.0
rustc 1.65.0
rustc 1.66.0
rustc 1.67.0
rustc 1.68.0
rustc 1.69.0
rustc 1.7.0
rustc 1.70.0
rustc 1.71.0
rustc 1.72.0
rustc 1.73.0
rustc 1.74.0
rustc 1.75.0
rustc 1.76.0
rustc 1.77.0
rustc 1.78.0
rustc 1.79.0
rustc 1.8.0
rustc 1.80.0
rustc 1.81.0
rustc 1.82.0
rustc 1.83.0
rustc 1.84.0
rustc 1.85.0
rustc 1.86.0
rustc 1.87.0
rustc 1.88.0
rustc 1.89.0
rustc 1.9.0
rustc 1.90.0
rustc 1.91.0
rustc beta
rustc nightly
rustc-cg-gcc (master)
x86-64 GCCRS (GCC master)
x86-64 GCCRS (GCCRS master)
x86-64 GCCRS 14.1 (GCC assertions)
x86-64 GCCRS 14.1 (GCC)
x86-64 GCCRS 14.2 (GCC assertions)
x86-64 GCCRS 14.2 (GCC)
x86-64 GCCRS 14.3 (GCC assertions)
x86-64 GCCRS 14.3 (GCC)
x86-64 GCCRS 15.1 (GCC assertions)
x86-64 GCCRS 15.1 (GCC)
x86-64 GCCRS 15.2 (GCC assertions)
x86-64 GCCRS 15.2 (GCC)
Options
Source code
#![allow(unused_imports)] #![feature(ptr_sub_ptr, core_intrinsics, inline_const)] use std::intrinsics; use std::mem::{self, ManuallyDrop, MaybeUninit}; use std::ptr; struct GapGuard<T> { pos: *mut T, value: ManuallyDrop<T>, } impl<T> Drop for GapGuard<T> { fn drop(&mut self) { unsafe { ptr::write(self.pos, ManuallyDrop::take(&mut self.value)); } } } fn partition<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], pivot: &T, is_less: &mut F) -> usize { // Novel partition implementation by Lukas Bergdoll and Orson Peters. Branchless Lomuto // partition paired with a cyclic permutation. TODO link writeup. let len = v.len(); let v_base = v.as_mut_ptr(); if len == 0 { return 0; } // Manually unrolled as micro-optimization as only x86 gets auto-unrolling but not Arm. let unroll_len = if const { mem::size_of::<T>() <= 16 } { 2 } else { 1 }; // SAFETY: We checked that `len` is more than zero, which means that reading `v_base` is safe to // do. From there we have a bounded loop where `v_base.add(i)` is guaranteed in-bounds. `v` and // `pivot` can't alias because of type system rules. The drop-guard `gap` ensures that should // `is_less` panic we always overwrite the duplicate in the input. `gap.pos` stores the previous // value of `right` and starts at `v_base` and so it too is in-bounds. Given `UNROLL_LEN == 2` // after the main loop we either have A) the last element in `v` that has not yet been processed // because `len % 2 != 0`, or B) all elements have been processed except the gap value that was // saved at the beginning with `ptr::read(v_base)`. In the case A) the loop will iterate twice, // first performing loop_body to take care of the last element that didn't fit into the unroll. // After that the behavior is the same as for B) where we use the saved value as `right` to // overwrite the duplicate. If this very last call to `is_less` panics the saved value will be // copied back including all possible changes via interior mutability. If `is_less` does not // panic and the code continues we overwrite the duplicate and do `right = right.add(1)`, this // is safe to do with `&mut *gap.value` because `T` is the same as `[T; 1]` and generating a // pointer one past the allocation is safe. unsafe { let mut lt_count = 0; let mut right = v_base.add(1); let mut gap = GapGuard { pos: v_base, value: ManuallyDrop::new(ptr::read(v_base)), }; macro_rules! loop_body { () => {{ let right_is_lt = is_less(&*right, pivot); let left = v_base.add(lt_count); ptr::copy(left, gap.pos, 1); ptr::copy_nonoverlapping(right, left, 1); gap.pos = right; lt_count += right_is_lt as usize; right = right.add(1); }}; } let unroll_end = v_base.add(len - (unroll_len - 1)); while right < unroll_end { for _ in 0..unroll_len { loop_body!(); } } let end = v_base.add(len); loop { let is_done = right == end; right = if is_done { &mut *gap.value } else { right }; loop_body!(); if is_done { mem::forget(gap); break; } } lt_count } } type TestT = u64; pub fn xx(v: &mut [TestT], pivot: &TestT) -> usize { partition(v, pivot, &mut |a, b| a.lt(b)) }
Become a Patron
Sponsor on GitHub
Donate via PayPal
Compiler Explorer Shop
Source on GitHub
Mailing list
Installed libraries
Wiki
Report an issue
How it works
Contact the author
CE on Mastodon
CE on Bluesky
Statistics
Changelog
Version tree