How do I keep MaybeUninit from eagerly zeroing?

I would like to build a SmallVec which is like a Vec but stores up to N values inline. Inline storage should use MaybeUninit so that creating a SmallVec is very cheap. A reduced sample of my attempt:

use std::mem::MaybeUninit;
const N: usize = 24;

pub enum SmallVec<T> {
    Inline([MaybeUninit<T>; N]),
    Heap(*mut T)
}

impl<T> SmallVec<T> {
    fn new() -> Self {
        SmallVec::Inline([const { MaybeUninit::uninit() }; N])
    }

    fn clear(&mut self) {
        *self = SmallVec::new();
    }
}

This compiles, but the codegen always zeros the inline array, defeating the optimization I wanted. For example, I expected the clear() function to merely set the enum discriminant via writing a single byte; instead it zeros the whole thing with a big fat call to memset:

test_small_vec_clear:
    movl    $584, %edx
    xorl    %esi, %esi
    jmpq    *memset@GOTPCREL(%rip)
  • Rust playground link
  • GodBolt link

How can I avoid this zeroing, to realize the purported performance advantages of MaybeUninit? Thanks for any help.
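
For reference, the symbols in that listing come from exported wrapper functions. The wrappers were not included in the reduced sample, but placed in the same module they would presumably look something like the following (an element type of String is assumed here, which is consistent with the 584-byte memset: 24 elements of 24 bytes each plus the discriminant):

#[no_mangle]
pub fn test_small_vec_new() -> SmallVec<String> {
    SmallVec::new()
}

#[no_mangle]
pub fn test_small_vec_clear(v: &mut SmallVec<String>) {
    v.clear();
}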

asked Mar 17 at 0:35 by ridiculous_fish
  • I mean, you can just look at how others solve that problem – cafce25 Commented Mar 17 at 0:59
  • the smallvec crate seems to have a similar issue though the optimization appears to come and go as sizes are tweaked; I wonder if it's some LLVM heuristic... – ridiculous_fish Commented Mar 17 at 1:15
  • This might be interesting as well. – fdan Commented Mar 17 at 1:15

1 Answer

It appears you get the expected optimization if you avoid const { } for uninitialized values:

impl<T> SmallVec<T> {
    fn new() -> Self {
        // Sound: the outer value here is a MaybeUninit<[MaybeUninit<T>; N]>,
        // which requires no initialization, and assume_init writes no bytes.
        SmallVec::Inline(unsafe { MaybeUninit::uninit().assume_init() })
    }

    fn clear(&mut self) {
        *self = SmallVec::new();
    }
}

test_small_vec_new:
    movq    %rdi, %rax
    movq    $0, (%rdi)
    retq

test_small_vec_clear:
    movq    $0, (%rdi)
    retq

Playground
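
For completeness, here is a minimal sketch of how such uninitialized inline storage is typically driven, to show what MaybeUninit buys: construction writes nothing, and only the initialized prefix is ever read or dropped. The len field, push, as_slice, and Drop impl are illustrative additions, not part of the question's reduced sample:

use std::mem::MaybeUninit;

const N: usize = 24;

pub struct InlineBuf<T> {
    // Only the first `len` slots are ever treated as initialized.
    buf: [MaybeUninit<T>; N],
    len: usize,
}

impl<T> InlineBuf<T> {
    pub fn new() -> Self {
        Self {
            // No initialization is performed here.
            buf: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) {
        assert!(self.len < N, "inline capacity exceeded");
        // `write` initializes the slot without reading or dropping the old bytes.
        self.buf[self.len].write(value);
        self.len += 1;
    }

    pub fn as_slice(&self) -> &[T] {
        // SAFETY: exactly the first `len` slots have been initialized by `push`.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr().cast::<T>(), self.len) }
    }
}

impl<T> Drop for InlineBuf<T> {
    fn drop(&mut self) {
        // Drop only the initialized prefix.
        for slot in &mut self.buf[..self.len] {
            // SAFETY: each slot below `len` was initialized and is dropped exactly once.
            unsafe { slot.assume_init_drop() };
        }
    }
}

The same bookkeeping would apply to the enum in the question once a length is tracked alongside it: clear() may reset to an uninitialized Inline array only after any initialized elements have been dropped and the length set back to zero.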


Since the official documentation encourages the approach in your original code for creating arrays of MaybeUninit, yet it resulted in poor codegen, I filed an issue. It was quickly picked up, fixed, and merged. With the latest nightly compiler, your original code now generates the desired result:

test_small_vec_new:                     # @test_small_vec_new
# %bb.0:
    movq    %rdi, %rax
    movq    $0, (%rdi)
    retq
                                        # -- End function

test_small_vec_clear:                   # @test_small_vec_clear
# %bb.0:
    movq    $0, (%rdi)
    retq
                                        # -- End function

If all goes as planned, this will land in the stable compiler in 1.87.0.
