For this code:
#include <stdfloat>
std::bfloat16_t foo(std::float32_t f)
{
    return f;
}
GCC generates this code:
foo(_Float32):
    sub rsp, 8
    call __truncsfbf2
    add rsp, 8
    ret
Here we see call __truncsfbf2, which is (?) a software implementation (libgcc/soft-fp/truncsfbf2.c).
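For reference, here is a rough sketch of what a software float32 to bfloat16 conversion typically has to do. This is not the libgcc code. The helper names float_to_bf16_bits and bf16_bits_to_float are made up for illustration, and the sketch assumes IEEE 754 binary32 float and round-to-nearest, ties-to-even:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of libgcc): returns the 16-bit bfloat16
// pattern for a binary32 value, rounding to nearest with ties to even.
std::uint16_t float_to_bf16_bits(float f)
{
    static_assert(sizeof(float) == 4, "assumes IEEE 754 binary32 float");

    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined way to read the bit pattern

    // NaN: keep the sign and upper payload bits, and force a mantissa bit
    // so the result cannot collapse into an infinity.
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);

    // bfloat16 is the upper 16 bits of binary32. Adding 0x7FFF plus the
    // lowest kept bit rounds the discarded 16 bits to nearest, ties to even.
    std::uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7FFFu + lsb;
    return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widens a bfloat16 bit pattern back to float (exact).
float bf16_bits_to_float(std::uint16_t h)
{
    std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

A hardware conversion instruction would do this rounding step in a single operation, which is why I would like to avoid the library call.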
It is known that:
- bfloat16 instructions are part of Intel DL Boost.
- "DL Boost features were introduced in the Cascade Lake architecture."
Hence, when targeting Cascade Lake, I expect to see an Intel DL Boost bfloat16 instruction VCVT... (instead of call __truncsfbf2).
I've already tried adding -march=cascadelake. However, GCC still generates call __truncsfbf2.
Are there any GCC options to generate Intel DL Boost bfloat16 instructions?
The same question goes for Clang.