Description
It seems like the default number of processes (which is later used to set the number of jobs) should be the number of logical CPU cores minus 1.
I'm curious why the default number of threads is 2. If multiple threads per process improve PDAL pipeline performance, then we should instead default the number of processes to the number of physical CPU cores minus 1, assuming a modern CPU with hyperthreading.
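As a rough sketch of what I mean (names here are illustrative, not existing code): if `psutil` is available we could query physical cores directly; otherwise, assuming 2 logical cores per physical core on a hyperthreaded CPU, physical cores can be approximated as logical cores // 2.

```python
import os

def default_num_processes():
    """Hypothetical default: physical CPU cores minus 1.

    Assumption: hyperthreading gives ~2 logical cores per physical core,
    so physical ~= logical // 2. psutil.cpu_count(logical=False) would be
    more reliable where psutil is installed.
    """
    logical = os.cpu_count() or 2
    physical = max(1, logical // 2)
    return max(1, physical - 1)
```

Each process would then run with the default 2 threads, keeping the total thread count near the logical core count.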
I'm also not clear on where the default memory limits per worker are set - I think this is all handled on the dask side. I'm seeing a sequence of these warnings when using 5x5 km tiles for a small AOI requiring 4 tiles total, on a machine with 64 GB RAM:
2025-07-11 15:35:38,884 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 11.20 GiB -- Worker memory limit: 16.00 GiB
2025-07-11 15:35:46,663 - distributed.worker.memory - WARNING - Worker is at 81% memory usage. Pausing worker. Process memory: 12.98 GiB -- Worker memory limit: 16.00 GiB
2025-07-11 15:37:05,132 - distributed.worker.memory - WARNING - Worker is at 21% memory usage. Resuming worker. Process memory: 3.50 GiB -- Worker memory limit: 16.00 GiB
This pause/resume cycling slows down the processing.
Some options are presented here: https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os. Some of this could be related to the older OS version on this machine; a modern OS may be better about freeing memory quickly, which dask expects.
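One of the options that page mentions is asking glibc malloc to return freed memory to the OS more aggressively, by setting `MALLOC_TRIM_THRESHOLD_` in the environment before worker processes start. A minimal sketch (the specific value is only a starting point, not something we've tuned):

```python
import os

# Assumption: workers inherit this environment, so it must be set before
# the dask cluster / nanny processes are spawned. Per the dask docs, a
# lower trim threshold makes glibc return freed memory to the OS sooner.
os.environ["MALLOC_TRIM_THRESHOLD_"] = "65536"
```

Whether this helps here depends on the allocator on that older OS, so it's worth testing on the actual machine.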
I will reduce the tile size for now, which should help as I kick off processing for a larger AOI. Longer term, though, I think we can do a better job of setting the number of processes based on the user-specified tile size, the expected memory requirement per tile, and the total available RAM, instead of just the total number of available CPU cores.
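Something along these lines, where the per-tile memory estimate would come from tile size (the `bytes_per_tile` default below is just the ~11-13 GiB I observed per worker above, and all names are hypothetical):

```python
import os

def choose_num_workers(bytes_per_tile=12 * 2**30,
                       total_ram=64 * 2**30,
                       safety=0.75):
    """Hypothetical sizing helper: cap worker count by both available
    RAM and CPU count, rather than CPU count alone.

    safety leaves headroom for the scheduler, OS, and unmanaged memory.
    """
    by_ram = max(1, int(total_ram * safety) // bytes_per_tile)
    by_cpu = max(1, (os.cpu_count() or 2) - 1)
    return min(by_ram, by_cpu)
```

On this 64 GB machine with ~12 GiB per tile, the RAM cap works out to 4 workers, which would avoid the 16 GiB-per-worker pressure seen in the warnings above.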