
Add accelerated option to TransmissionAbsorptionConverter using numba #2036

Open · wants to merge 13 commits into master

Conversation


@hrobarts hrobarts commented Jan 13, 2025

Changes

Add accelerated option to TransmissionAbsorptionConverter using numba

Testing you performed

Please add any demo scripts to https://github.com/TomographicImaging/CIL-Demos/tree/main/misc


Related issues/links

Closes #2032

Checklist

  • I have performed a self-review of my code
  • I have added docstrings in line with the guidance in the developer guide
  • I have updated the relevant documentation
  • I have implemented unit tests that cover any new or modified functionality
  • CHANGELOG.md has been updated with any functionality change
  • Request review from all relevant developers
  • Change pull request label to 'Waiting for review'

Contribution Notes

Please read and adhere to the developer guide and local patterns and conventions.

  • The content of this Pull Request (the Contribution) is intentionally submitted for inclusion in CIL (the Work) under the terms and conditions of the Apache-2.0 License
  • I confirm that the contribution does not violate any intellectual property rights of third parties


@hrobarts hrobarts self-assigned this Jan 13, 2025
@hrobarts commented:

Testing to check the execution time of calculating the log of a large array either serially or with numba, parallelising either per pixel or per projection:


hrobarts commented Jan 13, 2025

Comparison of parallelising the log calculation either by operating on a slice of values in each parallel iteration:

import numba
import numpy as np

@numba.njit(parallel=True)
def numba_loop1(arr_in, num_proj, proj_size, arr_out):
    in_flat = arr_in.ravel()
    out_flat = arr_out.ravel()
    # Each parallel iteration applies -log to one projection-sized slice
    for i in numba.prange(num_proj):
        out_flat[i*proj_size:(i+1)*proj_size] = -np.log(in_flat[i*proj_size:(i+1)*proj_size])

Or using an explicit element-wise loop inside the parallel loop:

@numba.njit(parallel=True)
def numba_loop2(arr_in, num_proj, proj_size, arr_out):
    in_flat = arr_in.ravel()
    out_flat = arr_out.ravel()
    for i in numba.prange(num_proj):
        # Element-wise np.log on scalars inside the parallel loop
        for ij in range(proj_size):
            out_flat[i*proj_size+ij] = -np.log(in_flat[i*proj_size+ij])

import math

@numba.njit(parallel=True)
def numba_loop3(arr_in, num_proj, proj_size, arr_out):
    in_flat = arr_in.ravel()
    out_flat = arr_out.ravel()
    for i in numba.prange(num_proj):
        # Same as numba_loop2 but with math.log instead of np.log
        for ij in range(proj_size):
            out_flat[i*proj_size+ij] = -math.log(in_flat[i*proj_size+ij])

For data.shape = (1800, 1024, 1024)
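The way such timings can be gathered is sketched below in pure Python (a hypothetical stdlib-only harness for illustration, not the numba benchmark itself; `neg_log_serial` and `time_call` are made-up helper names):

```python
import math
import time

def neg_log_serial(values):
    # Serial reference: apply -log(x) element-wise.
    return [-math.log(v) for v in values]

def time_call(fn, *args, repeats=3):
    # Best wall-clock time over a few repeats, to reduce noise.
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

data = [math.e] * 100_000
elapsed = time_call(neg_log_serial, data)
```

The same harness can time each numba variant in turn (after one warm-up call, so JIT compilation is excluded from the measurement).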


hrobarts commented Jan 13, 2025

Also comparing varying sizes of the data chunk used in each thread. Here the number of threads was 28, and the dashed line shows the chunk size and number of chunks we would have had if we had used number of chunks = number of projections for this dataset. The serial version on this dataset takes over 50 seconds.

The large increase occurs around chunk_size = 10^7 for both datasets. We suggest setting the default chunk size to 6400.
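The chunk bookkeeping behind these plots can be sketched in plain Python (`chunk_ranges` is a hypothetical helper, assuming the integer-division logic described above):

```python
def chunk_ranges(total_size, chunk_size=6400):
    # Split [0, total_size) into full chunks plus one remainder range,
    # mirroring the num_chunks / remainder bookkeeping in the PR.
    num_chunks = total_size // chunk_size
    remainder = total_size % chunk_size
    ranges = [(i * chunk_size, (i + 1) * chunk_size) for i in range(num_chunks)]
    if remainder > 0:
        ranges.append((num_chunks * chunk_size, total_size))
    return ranges

# e.g. a (1800, 1024, 1024) dataset divides exactly into full chunks
ranges = chunk_ranges(1800 * 1024 * 1024)
```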

@hrobarts hrobarts marked this pull request as ready for review January 16, 2025 08:54
@hrobarts hrobarts requested a review from gfardell January 16, 2025 09:20

hrobarts commented Jan 21, 2025

Repeated the measurements on Windows with 20 cores.

Measured the execution time of the following code on a (1800, 1024, 1024) dataset with chunk size 6400:

@numba.njit(parallel=True)
def numba_loop1(arr_in, num_chunks, chunk_size, remainder, arr_out):
    in_flat = arr_in.ravel()
    out_flat = arr_out.ravel()
    # Process the full chunks in parallel
    for i in numba.prange(num_chunks):
        start = i * chunk_size
        end = start + chunk_size
        out_flat[start:end] = -np.log(in_flat[start:end])

    # Process any leftover elements serially
    if remainder > 0:
        start = num_chunks * chunk_size
        end = start + remainder
        out_flat[start:end] = -np.log(in_flat[start:end])
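For reference, the same chunk-plus-remainder logic can be written as a serial pure-Python sketch (`neg_log_chunked` is a hypothetical name; the real code above uses numba and NumPy):

```python
import math

def neg_log_chunked(arr_in, chunk_size=6400):
    # Serial equivalent of the chunked loop: process the full chunks
    # first, then the remainder, writing -log(x) element-wise.
    n = len(arr_in)
    num_chunks = n // chunk_size
    remainder = n % chunk_size
    out = [0.0] * n
    for i in range(num_chunks):
        start = i * chunk_size
        for j in range(start, start + chunk_size):
            out[j] = -math.log(arr_in[j])
    if remainder > 0:
        for j in range(num_chunks * chunk_size, n):
            out[j] = -math.log(arr_in[j])
    return out
```

Every element is visited exactly once, whether or not the array size is a multiple of the chunk size.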

Measure the execution time as a function of the number of chunks

And compare serial versus accelerated execution time for the whole processor, where the CIL default number of threads is 10

Review thread on CHANGELOG.md (outdated, resolved).
chunk_size = 6400
num_chunks = data.size // chunk_size

if (self._accelerated) & (num_chunks > 5):
Reviewer (Member):
Why do we compare to 5 here?

hrobarts (Contributor Author):

Hi Laura, we don't bother making multiple threads if the dataset is very small, since the overhead outweighs the benefit of creating the threads. This will only be the case for very small datasets.

in_flat = arr_in.ravel()
out_flat = arr_out.ravel()
for i in numba.prange(num_chunks):
start = i * chunk_size
Reviewer (Contributor):
would start += chunk_size be faster than reallocating and multiplying at each iteration?

hrobarts (Contributor Author):

Hi Edo, yes, it seems to be quicker. I'll update the code.

Successfully merging this pull request may close these issues.

Accelerated version of TransmissionAbsorption processor
3 participants