
[Question] How to Make the Internal Allocator in thrust::reduce Use Pinned Memory? #4317

Answered by pauleonix
JigaoLuo asked this question in Thrust

The mentioned buffers can indeed be configured to use pinned memory: pass an allocator backed by a thrust::cuda::universal_host_pinned_memory_resource to the execution policy (see e.g. thrust/examples/cuda/custom_temporary_allocation.cu). I'm not sure this solves your issue, though, because thrust::reduce will still copy the result from those buffers to the host stack and synchronize afterwards, as it needs to return by value. I would also expect poor performance from using pinned memory for the device scratch space, since that space is not only used for storing the final result. As mentioned on Discord, cub::DeviceReduce is the right choice in this situation.
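For completeness, a minimal sketch of the allocator approach, modeled on thrust/examples/cuda/custom_temporary_allocation.cu; the vector size and initial values are just illustrative, and this must be built with nvcc as a .cu file:

```cpp
#include <thrust/device_vector.h>
#include <thrust/mr/allocator.h>
#include <thrust/reduce.h>
#include <thrust/system/cuda/execution_policy.h>
#include <thrust/system/cuda/memory_resource.h>

int main()
{
  thrust::device_vector<int> d_vec(1 << 20, 1);

  // Back Thrust's internal temporary buffers with host-pinned memory.
  thrust::cuda::universal_host_pinned_memory_resource mr;
  thrust::mr::allocator<char, thrust::cuda::universal_host_pinned_memory_resource> alloc(&mr);

  // Thrust allocates its scratch space through `alloc`, but the reduction
  // still returns by value, so it synchronizes before returning.
  int sum = thrust::reduce(thrust::cuda::par(alloc), d_vec.begin(), d_vec.end(), 0);

  return sum == (1 << 20) ? 0 : 1;
}
```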

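And a sketch of the cub::DeviceReduce route, which writes the result to a caller-provided pointer instead of returning by value. Here the output pointer is a pinned host allocation (directly device-accessible on systems with unified virtual addressing), so only the given stream needs to be synchronized before reading the result; buffer sizes, the fill value, and the omission of error checking are illustrative assumptions:

```cpp
#include <cub/cub.cuh>
#include <cuda_runtime.h>

int main()
{
  const int num_items = 1 << 20;

  int *d_in = nullptr;
  cudaMalloc(&d_in, num_items * sizeof(int));
  cudaMemset(d_in, 0, num_items * sizeof(int));

  // The reduction result lands in pinned host memory rather than a host-stack variable.
  // Assumes UVA, so the device can write to this pointer directly.
  int *h_sum = nullptr;
  cudaMallocHost(&h_sum, sizeof(int));

  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // Standard two-phase CUB pattern: query the temp-storage size, then run.
  void *d_temp_storage = nullptr;
  size_t temp_storage_bytes = 0;
  cub::DeviceReduce::Sum(d_temp_storage, temp_storage_bytes, d_in, h_sum, num_items, stream);
  cudaMalloc(&d_temp_storage, temp_storage_bytes);
  cub::DeviceReduce::Sum(d_temp_storage, temp_storage_bytes, d_in, h_sum, num_items, stream);

  // Only this stream is synchronized before the result is read on the host.
  cudaStreamSynchronize(stream);
  const int sum = *h_sum;

  cudaFree(d_temp_storage);
  cudaFree(d_in);
  cudaFreeHost(h_sum);
  cudaStreamDestroy(stream);
  return sum == 0 ? 0 : 1;
}
```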
Replies: 2 comments, 4 replies

Answer selected by JigaoLuo
Category: Thrust
Labels: none yet
2 participants