Is there a way to have overlapping runtime-sized arrays of different types in C? - Stack Overflow


Updated: 2025-02-22


I would like to have a runtime-sized buffer (allocated via malloc(), for example) where I can safely access any sufficiently aligned and sized slot as a uint8_t, uint16_t, uint32_t, or uint64_t. For a fixed-size buffer this would be fairly straightforward, where N is a compile-time constant:

union Buffer {
    uint8_t u8[N * 8];
    uint16_t u16[N * 4];
    uint32_t u32[N * 2];
    uint64_t u64[N];
};

An alternative approach is to have each unit as a union and then allocate an array of these:

union Buffer {
    uint8_t u8[8];
    uint16_t u16[4];
    uint32_t u32[2];
    uint64_t u64;
};

But this approach has shortcomings as described in a different question I asked. Most notably:

The buffer is not guaranteed to not have padding between the successive union elements.
One cannot safely cast the buffer into a pointer to one of uint8_t, uint16_t, uint32_t, or uint64_t and then index the pointer beyond the boundary of the first element.

So to summarize:

I want overlapping arrays (sharing the same memory) of uint8_t, uint16_t, uint32_t, and uint64_t, where I can safely obtain any sufficiently aligned slot in the array as an lvalue of any of those types.
The object is runtime-sized and allocated via malloc()/realloc()/etc.
No strict-aliasing violations or any undefined behavior.
All elements are contiguous.

Is there any way to do what I described? Or is the only way to use memcpy() to safely read/write the elements as the desired types?

Asked Feb 16 at 22:45 by CPlus

Full disclosure: This is a generalization/follow-up of a different question I asked about a possible implementation of this. This new question focuses on the idea as a whole of having overlapping arrays of different types that can be safely accessed. – CPlus Commented Feb 16 at 22:45

5 Answers

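Simply use helper functions built on memcpy. With optimization enabled, compilers typically replace the fixed-size memcpy with a plain load or store, so no function call is actually made.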

#include <stddef.h>
#include <stdint.h>
#include <string.h>

uint8_t get_8t(const void *buffer, size_t index)
{
    uint8_t val;
    const unsigned char *ucbuff = buffer; 
    ucbuff += sizeof(val) * index;
    memcpy(&val, ucbuff, sizeof(val));
    return val;
}

uint16_t get_16t(const void *buffer, size_t index)
{
    uint16_t val;
    const unsigned char *ucbuff = buffer; 
    ucbuff += sizeof(val) * index;
    memcpy(&val, ucbuff, sizeof(val));
    return val;
}

uint32_t get_32t(const void *buffer, size_t index)
{
    uint32_t val;
    const unsigned char *ucbuff = buffer; 
    ucbuff += sizeof(val) * index;
    memcpy(&val, ucbuff, sizeof(val));
    return val;
}

uint64_t get_64t(const void *buffer, size_t index)
{
    uint64_t val;
    const unsigned char *ucbuff = buffer; 
    ucbuff += sizeof(val) * index;
    memcpy(&val, ucbuff, sizeof(val));
    return val;
}

https://godbolt.org/z/hjh7Mevbx

With optimization enabled, this compiles to the following x86-64 code:

get_8t:
        mov     al, BYTE PTR [rdi+rsi]
        ret
get_16t:
        mov     ax, WORD PTR [rdi+rsi*2]
        ret
get_32t:
        mov     eax, DWORD PTR [rdi+rsi*4]
        ret
get_64t:
        mov     rax, QWORD PTR [rdi+rsi*8]
        ret

Here is an example write function:

void set_32t(void *buffer, size_t index, uint32_t val)
{
    unsigned char *ucbuff = buffer; 
    ucbuff += sizeof(val) * index;
    memcpy(ucbuff, &val, sizeof(val));
}
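
As a minimal sketch of how such helpers might be used together on a dynamically allocated buffer, the following stores one 32-bit value and reads the same slot back both as a uint32_t and byte by byte. The names load_u8, load_u32, store_u32, and byte_sum_of_slot are hypothetical and not part of the answer above; the byte sum is used only because it does not depend on byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers, following the memcpy pattern shown above. */
static uint8_t load_u8(const void *buffer, size_t index)
{
    uint8_t val;
    memcpy(&val, (const unsigned char *)buffer + index * sizeof val, sizeof val);
    return val;
}

static uint32_t load_u32(const void *buffer, size_t index)
{
    uint32_t val;
    memcpy(&val, (const unsigned char *)buffer + index * sizeof val, sizeof val);
    return val;
}

static void store_u32(void *buffer, size_t index, uint32_t val)
{
    memcpy((unsigned char *)buffer + index * sizeof val, &val, sizeof val);
}

/* Stores `val` in 32-bit slot `slot` of a fresh zeroed buffer, then sums the
   four bytes of that slot read one at a time. Returns -1 on failure. */
static long byte_sum_of_slot(uint32_t val, size_t slot)
{
    void *buffer = calloc(slot + 1, sizeof(uint32_t));
    if (buffer == NULL)
        return -1;

    store_u32(buffer, slot, val);

    long sum = -1;
    if (load_u32(buffer, slot) == val) {   /* same-width round trip */
        sum = 0;
        for (size_t i = 0; i < sizeof(uint32_t); i++)
            sum += load_u8(buffer, slot * sizeof(uint32_t) + i);
    }
    free(buffer);
    return sum;
}
```

Note that the 8-bit index is scaled by hand (slot * sizeof(uint32_t) + i), since each helper indexes in units of its own type.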

If you never want to read memory as a different type than you previously wrote it, then this is trivial. Dynamically allocated memory is intended to be flexible. You do not need arrays at all. Simply convert the pointer to the memory to a pointer to the desired type and use it. Given pv a void * that has been assigned from malloc:

// Convert pointer.
uint16_t *pu16 = pv;

// Write element.
pu16[i] = 3;

// Read element.
uint16_t x = pu16[i];

Later, you could write the same memory as uint32_t or other types. As long as you do not write as one type and read as another, this is defined by the C standard: the effective type for a write to dynamically allocated memory is the type of the lvalue used to access it, and the effective type for a read is the type last used to write to it. The exception is that character types may always be used and do not change the effective type.

The array arithmetic is guaranteed by the specification for malloc et al in C 2024 7.24.4.1: “The pointer returned if the allocation succeeds is suitably aligned so that it can be assigned to a pointer to any type of object with a fundamental alignment requirement and size less than or equal to the size requested. It can then be used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).”

If you do need to read memory as a different type than it was written, and not a character type, then you could use memcpy to copy the bytes from the memory into an object of the desired type.
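
The rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names load_u32 and reuse_demo and the 16-byte allocation size are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads 32-bit slot `i` via memcpy, which is allowed
   no matter which type was last used to write those bytes. */
static uint32_t load_u32(const void *pv, size_t i)
{
    uint32_t v;
    memcpy(&v, (const unsigned char *)pv + i * sizeof v, sizeof v);
    return v;
}

/* Returns 1 if every step behaved as expected, 0 otherwise. */
static int reuse_demo(void)
{
    void *pv = malloc(16);
    if (pv == NULL)
        return 0;

    /* Write and read as the same type: plain pointer conversion is enough. */
    uint16_t *pu16 = pv;
    for (size_t i = 0; i < 8; i++)
        pu16[i] = (uint16_t)(i + 1);
    int ok = (pu16[3] == 4);

    /* Read as a different type than was written: go through memcpy. */
    uint32_t both = load_u32(pv, 0);
    uint16_t halves[2];
    memcpy(halves, &both, sizeof halves);
    ok = ok && halves[0] == 1 && halves[1] == 2;

    /* Later, write the same memory as uint32_t, then read it as uint32_t. */
    uint32_t *pu32 = pv;
    pu32[0] = 0xDEADBEEFu;
    ok = ok && pu32[0] == 0xDEADBEEFu;

    free(pv);
    return ok;
}
```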

As detailed in your prior question, what you want to do isn't possible, at least not the way you want.

The closest you can do is create an array of the smaller union and perform a two-step process to retrieve the desired element.

union Buffer {
    uint8_t u8[8];
    uint16_t u16[4];
    uint32_t u32[2];
    uint64_t u64;
};

uint8_t *get_item_8(union Buffer *buffer, int index)
{
    return &buffer[index/8].u8[index%8];
}

uint16_t *get_item_16(union Buffer *buffer, int index)
{
    return &buffer[index/4].u16[index%4];
}

uint32_t *get_item_32(union Buffer *buffer, int index)
{
    return &buffer[index/2].u32[index%2];
}

uint64_t *get_item_64(union Buffer *buffer, int index)
{
    return &buffer[index].u64;
}

Your only option for using a single index that doesn't cause undefined behavior is to memcpy the data to an array of the desired type.
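
A minimal sketch of that memcpy approach, assuming the union occupies exactly 8 bytes with no padding (checked at compile time below). The helper name copy_as_u16 is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

union Buffer {
    uint8_t u8[8];
    uint16_t u16[4];
    uint32_t u32[2];
    uint64_t u64;
};

/* Assumption: no padding in the union, so flat indices line up. */
_Static_assert(sizeof(union Buffer) == 8, "union Buffer is expected to be 8 bytes");

/* Hypothetical helper: copies `count` unions into a new uint16_t array that
   can be indexed with a single flat index. The caller frees the result.
   Returns NULL if the allocation fails. */
static uint16_t *copy_as_u16(const union Buffer *buffer, size_t count)
{
    uint16_t *out = malloc(count * sizeof *buffer);
    if (out != NULL)
        memcpy(out, buffer, count * sizeof *buffer);
    return out;
}
```

With this, element units[1].u16[1] of the original array appears at flat index 5 of the copy. The cost is that the copy is a snapshot: writes to it are not reflected in the original buffer unless they are copied back.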

// all types in the union are the same size, there can be no padding. 
union Buffer {
    uint8_t* u8;
    uint16_t* u16;
    uint32_t* u32;
    uint64_t* u64;
};

union Buffer allocate_buffer(size_t N)
{
    union Buffer B;  // in C the type must be spelled "union Buffer"
    B.u64 = malloc(sizeof(uint64_t) * N);  // NULL if the allocation fails
    return B;
}

Obvious caveats apply: If you are running on platforms with differing byte orders (MSB/LSB), then the ordering of the bytes in the larger types will differ.
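
To illustrate the byte-order caveat, here is a small sketch that reports which byte of a 32-bit value sits at the lowest address. The helper name lowest_address_byte is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the byte stored at the lowest address of `val`. */
static uint8_t lowest_address_byte(uint32_t val)
{
    uint8_t first;
    memcpy(&first, &val, sizeof first);
    return first;
}
```

For the value 0x11223344, a little-endian machine yields 0x44 and a big-endian machine yields 0x11, so the same 8-bit index refers to a different part of the larger value on each.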

I think you misunderstood the answers you are referring to.

The buffer is not guaranteed to not have padding between the successive union elements.

Yes, it is. If you define a union of arrays, there is no padding between the elements of each array, since they are members of a single union. There could be padding if you make an array of unions of arrays, yes. But:

union Buffer {
    uint8_t u8[N * 8];
    uint16_t u16[N * 4];
    uint32_t u32[N * 2];
    uint64_t u64[N];
};

union Buffer u;
printf("%p == %p == %p == %p\n", (void *)u.u8, (void *)u.u16, (void *)u.u32, (void *)u.u64);

But if you put arrays of different sizes in a union and then make an array of that union:

union Buffer {
    uint8_t u8[N];
    uint16_t u16[N];
    uint32_t u32[N];
    uint64_t u64[N];
};
union Buffer arr[2];

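printf("%p == %p %p == %p\n", (void *)arr[0].u8, (void *)arr[0].u64, (void *)arr[1].u8, (void *)arr[1].u64);
// and inside the arrays there is no padding
printf("%p == %p\n", (void *)(&arr[0].u8[1] + 1), (void *)&arr[0].u8[2]);

// But across the elements of arr, only the largest member is contiguous
// (N - 1 is the last valid index of each array)
printf("%p == %p\n", (void *)(&arr[0].u64[N - 1] + 1), (void *)&arr[1].u64[0]);
printf("%p != %p\n", (void *)(&arr[0].u8[N - 1] + 1), (void *)&arr[1].u8[0]);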
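
The size of that gap can also be computed rather than printed. This is a minimal sketch assuming N is 3, a value chosen only for illustration; the function names gap_after_u8 and gap_after_u64 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { N = 3 };  /* assumed value, for illustration only */

union Buffer {
    uint8_t u8[N];
    uint16_t u16[N];
    uint32_t u32[N];
    uint64_t u64[N];
};

/* Bytes between the end of arr[0].u8 and the start of arr[1].u8. */
static size_t gap_after_u8(void)
{
    return sizeof(union Buffer) - sizeof(((union Buffer *)0)->u8);
}

/* Bytes between the end of arr[0].u64 and the start of arr[1].u64. */
static size_t gap_after_u64(void)
{
    return sizeof(union Buffer) - sizeof(((union Buffer *)0)->u64);
}
```

Because the union is at least as large as its largest member, the u8 arrays of neighboring elements are separated by at least 21 bytes here (24 bytes of u64 minus 3 bytes of u8), which is why a uint8_t pointer cannot simply walk from one element into the next.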

One cannot safely cast the buffer into a pointer to one of uint8_t, uint16_t, uint32_t, or uint64_t and then index the pointer beyond the boundary of the first element.

Yes, you can. And yes, you can go out of bound as with any array:

union Buffer u;
uint8_t *p8 = u.u8;
uint16_t *p16 = u.u16;
uint64_t *p64 = (uint64_t*)p8;
// all pointers here are equal
// You can even do:
uint32_t *p32 = u.u32;
p32 += 10; // points to u.u32[10]
uint8_t *p8_a = (uint8_t*)p32; // points to the first byte of u.u32[10]

I would like to have a runtime-sized buffer (allocated via malloc(), for example) where I can safely access any sufficiently aligned and sized slot as a uint8_t, uint16_t, uint32_t, or uint64_t.

Just use pointer typing:

void *pv = malloc(SOME_VALUE);

uint8_t *p8 = pv;  
uint64_t *p64 = pv;

// any pointer can be used as array
p8[sizeof(uint64_t) * N] = 1;
printf("%" PRIu64 "\n", p64[N]);  // PRIu64 comes from <inttypes.h>

And of course, you would need to guard against out-of-bounds indexing yourself. But since C has no array bounds checking in the first place, that is hardly an issue here.
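
One way to do that bounds checking yourself is to keep the size next to the pointer. The sketch below is only an illustration: struct byte_buffer and buffer_get_u32 are hypothetical names, and the read goes through memcpy rather than the pointer conversion shown above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper pairing the allocation with its size in bytes. */
struct byte_buffer {
    void *data;
    size_t size;
};

/* Reads 32-bit slot `index` into *out. Returns false, leaving *out
   untouched, if the slot would extend past the end of the buffer. */
static bool buffer_get_u32(const struct byte_buffer *b, size_t index, uint32_t *out)
{
    if (index >= b->size / sizeof *out)
        return false;
    memcpy(out, (const unsigned char *)b->data + index * sizeof *out, sizeof *out);
    return true;
}
```

Comparing the index against size / sizeof *out, instead of multiplying the index first, avoids overflow in the check itself.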
