In the last example of Mark Harris' webinar I don't understand the indexing before the parallel-reduction part. In "Reduction #6" the grid size (number of blocks/dispatches) was ceil(N / (2 * blockSize)), where N is the size of the data, blockSize is the block/workgroup size, and the factor of 2 is there because each thread loads two items at once.

In the final version the slides just say the grid size is "as many as necessary". But what is that concretely? It can't be ceil(N / (2 * blockSize)) like before, because then the while loop would exit after the first iteration. So how many blocks/dispatches should the kernel be launched with?
Tags: cuda, NVIDIA webinar on parallel reduction gridsize, Stack Overflow