I am writing a simplified version of std::move_only_function in C++11.

In my implementation, I provide Small Object Optimization (a.k.a. SOO), which is basically a union that stores either a function pointer or a type-erased pointer.

I realized that when the received function object is no larger than the union, I can use placement new to construct the wrapped object directly in the memory the union occupies, and then access it through reinterpret_cast.

Here is my implementation:

template<typename Signature>
class UniqueFunction;

template<typename Ret, typename... Args>
class UniqueFunction<Ret( Args... )> {
  struct AnyFn {
    virtual ~AnyFn() noexcept              = default;
    virtual Ret operator()( Args... args ) = 0;
  };
  template<typename Fn>
  struct FnContainer : public AnyFn {
    Fn fntor_;

    FnContainer( Fn fntor ) : fntor_ { std::move( fntor ) } {}

    FnContainer( const FnContainer& )              = delete;
    FnContainer& operator=( const FnContainer& ) & = delete;
    virtual ~FnContainer() noexcept                = default;

    Ret operator()( Args... args ) override { return fntor_( std::forward<Args>( args )... ); }
  };

  union {
    typename std::add_pointer<Ret( Args... )>::type fptr_;
    AnyFn* ftor_;
  } data_; // The union is here.
  // Tag that identifies the type of data currently stored in the union.
  enum class Tag : std::uint8_t { None, Fptr, FtorInline, FtorDync } tag_;

  // other methods...
};

In the constructor, if the object is no larger than the union, I construct it directly in the union's storage with placement new and discard the pointer that placement new returns.

template<typename F>
UniqueFunction( F functor )
{
  if ( sizeof( FnContainer<F> ) <= sizeof data_ ) {
    new ( &data_ ) FnContainer<F>( std::move( functor ) );
    tag_ = Tag::FtorInline;
  } else {
    data_.ftor_ = new FnContainer<F>( std::move( functor ) );
    tag_ = Tag::FtorDync;
  }
}

In the inline case, I reinterpret the address of the union as AnyFn* via reinterpret_cast, so that I can access the derived object stored there through a base-class pointer.

if ( tag_ == Tag::FtorInline ) {
  // When I need to destroy it, I do this:
  // reinterpret_cast<AnyFn*>( &data_ )->~AnyFn();

  // Other cases, I do this:
  ( *reinterpret_cast<AnyFn*>( &data_ ) );
}

I know that there's a rule called strict aliasing in the C++ standard, so I suspect that my optimization is violating the standard and causing some sort of UB; I'm not so sure, though, since this code worked well in testing.

Here is the online test.

The question is:

Is this UB? If so, is there another way for me to achieve the small object optimization I want?

Asked Feb 1 at 13:01 by Konvt, edited Feb 1 at 19:39. 4 comments:
  • What is "SOO"? I don't see you using it inside your post. – Thomas Matthews Commented Feb 1 at 19:27
  • There is no "C/C++ standard". So which standard are you referring to? – Thomas Matthews Commented Feb 1 at 19:28
  • @ThomasMatthews I mean Small Object Optimization and the strict aliasing rule in C++ standard. – Konvt Commented Feb 1 at 19:40
  • The problem with many standard containers and helpers is that they cannot be implemented without UB. UB is anything that no official documentation declares as guaranteed. The internal design documentation of a compiler isn't part of that. Which means you cannot reliably create a portable analog - you have to tailor it to each compiler line and version. – Swift - Friday Pie Commented Feb 1 at 19:56

1 Answer

The placement new call itself is not UB. But it ends the lifetime of the union object, because it reuses the storage that the union object occupies. See [basic.life]/2.5. After that, &data_ is a pointer to an object that is no longer within its lifetime.

The reinterpret_cast doesn't produce a pointer to the AnyFn object. It is equivalent to a static_cast to void* followed by a second static_cast to AnyFn*. The first cast is fine. The second one will only actually produce a pointer to the AnyFn object if that object is pointer-interconvertible with the original pointee (the dead union object). See [expr.static.cast]/13. This condition will not be satisfied. Access through the resulting pointer will be UB.

It is sometimes possible to obtain a pointer to the AnyFn base class subobject by passing the result of the cast to std::launder (available since C++17, so not in C++11). However, this will work only if the AnyFn base class subobject is actually located at the beginning of the storage for data_. This condition is not guaranteed to hold.
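To make the "located at the beginning" condition concrete, here is a small check you could run on your own toolchain. The names First, Second, Both and base_at_start are made up for this illustration and are not from the question. It only compares addresses and never reads through a dubious pointer.

```cpp
#include <cassert>

// Hypothetical types for illustration: two polymorphic bases, one derived class.
struct First  { virtual ~First() = default;  int a = 0; };
struct Second { virtual ~Second() = default; int b = 0; };
struct Both : First, Second {};

// Hypothetical helper: true if the Base subobject of d starts at the
// first byte of d. The implicit derived-to-base conversion applies any
// offset the implementation uses, so comparing the two addresses shows
// whether that offset is zero.
template<typename Base, typename Derived>
bool base_at_start( const Derived& d ) {
  const void* whole = static_cast<const void*>( &d );
  const void* base  = static_cast<const void*>( static_cast<const Base*>( &d ) );
  return whole == base;
}
```

On the mainstream ABIs I know of, base_at_start<First>( obj ) comes out true and base_at_start<Second>( obj ) comes out false for a Both object, but the standard promises neither, which is the point of the caveat above.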

To implement the small object optimization, the usual approach is to make an array of unsigned char (or std::byte, in C++17 and later) one of the members of the union. The small object is then placement new'd into that array. The pointer returned by placement new is stored as a separate member and is used to access the created object.
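A minimal C++11 sketch of that idea, under some assumptions of mine: the class name SmallFn, the buffer size, and the is_inline() accessor are hypothetical. For brevity it drops the union entirely and uses a plain unsigned char buffer member. Plain function pointers simply go through FnContainer like any other callable. Copy and move are deleted, because moving would need an extra virtual hook to relocate the inline object.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

template<typename Signature>
class SmallFn; // hypothetical name, not the asker's UniqueFunction

template<typename Ret, typename... Args>
class SmallFn<Ret( Args... )> {
  struct AnyFn {
    virtual ~AnyFn() noexcept              = default;
    virtual Ret operator()( Args... args ) = 0;
  };
  template<typename Fn>
  struct FnContainer final : AnyFn {
    Fn fntor_;
    explicit FnContainer( Fn fntor ) : fntor_( std::move( fntor ) ) {}
    Ret operator()( Args... args ) override { return fntor_( std::forward<Args>( args )... ); }
  };

  // Assumed buffer size and alignment; tune for your use case.
  static constexpr std::size_t kBufSize  = 4 * sizeof( void* );
  static constexpr std::size_t kBufAlign = alignof( std::max_align_t );

  alignas( kBufAlign ) unsigned char buf_[kBufSize]; // raw storage for small callables
  AnyFn* obj_;    // pointer returned by placement new, or by plain new
  bool inline_;   // true if *obj_ lives inside buf_

  template<typename F>
  void init( F functor, std::true_type ) { // fits: construct inside buf_
    obj_    = ::new ( static_cast<void*>( buf_ ) ) FnContainer<F>( std::move( functor ) );
    inline_ = true;
  }
  template<typename F>
  void init( F functor, std::false_type ) { // too large: allocate on the heap
    obj_    = new FnContainer<F>( std::move( functor ) );
    inline_ = false;
  }

public:
  template<typename F>
  SmallFn( F functor ) : obj_( nullptr ), inline_( false ) {
    using Box = FnContainer<F>;
    // Both size and alignment must fit, so pick the branch at compile time.
    init( std::move( functor ),
          std::integral_constant<bool, sizeof( Box ) <= kBufSize && alignof( Box ) <= kBufAlign>() );
  }

  SmallFn( const SmallFn& )            = delete;
  SmallFn& operator=( const SmallFn& ) = delete;

  ~SmallFn() {
    if ( inline_ )
      obj_->~AnyFn(); // virtual destructor; the storage itself belongs to *this
    else
      delete obj_;
  }

  Ret operator()( Args... args ) {
    assert( obj_ != nullptr );
    return ( *obj_ )( std::forward<Args>( args )... );
  }

  bool is_inline() const noexcept { return inline_; }
};
```

Every access goes through obj_, which is exactly the pointer that new returned, so no reinterpret_cast or std::launder is needed. The cost is one extra pointer per wrapper. With this sketch a captureless lambda should end up in the buffer, while a functor carrying a few hundred bytes of state should fall back to the heap.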

Tags: c++ - Is placement new in an union UB for SOO? - Stack Overflow