from numba import jit, prange
import numpy as np

# Numpy array of 10k elements
input_ndarray = np.random.rand(10000).reshape(10000)

# This is the only extra line of code you need to add
# which is a decorator
@jit(nopython=True)
def go_fast(a):
    trace = 0
    for i in range(a.shape[0]):
        trace += np.tanh(a[i])
    return a + trace

%timeit go_fast(input_ndarray)
161 µs ± 2.62 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Without Numba, the same function is much slower: the time is measured in milliseconds, rather than the microseconds we get with @jit(nopython=True) or @njit.
# Without numba: notice how this is really slow
def go_normal(a):
    trace = 0
    for i in range(a.shape[0]):
        trace += np.tanh(a[i])
    return a + trace

%timeit go_normal(input_ndarray)
10.5 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
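Here, instead of the normal range() function we would use in for loops, we need to use prange(), which lets the loop iterations run in parallel on separate threads. Note that prange() only parallelizes the loop when the decorator is also given parallel=True; without it, prange() behaves like range().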
The parallel version runs slightly faster than the @jit(nopython=True) version.
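To make the prange() description concrete, here is a minimal sketch of what the parallel variant of go_fast might look like. The function name go_fast_parallel is made up for illustration, and the try/except fallback is only there so the snippet still runs where Numba is not installed (in that case it runs as plain Python, with no speedup and no threads). No timings are shown, since they depend on the machine and the number of cores.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Assumption for illustration: fall back to plain Python if Numba is missing.
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func


# parallel=True is what actually enables prange to split the loop across threads
@njit(parallel=True)
def go_fast_parallel(a):  # hypothetical name, not from the original post
    trace = 0.0
    # Each iteration is independent; Numba treats `trace += ...` as a reduction
    for i in prange(a.shape[0]):
        trace += np.tanh(a[i])
    return a + trace


input_ndarray = np.random.rand(10000)
result = go_fast_parallel(input_ndarray)

# Reference value computed with plain NumPy, to check the parallel result
expected = input_ndarray + np.tanh(input_ndarray).sum()
```

In a notebook you would time it the same way as before, with %timeit go_fast_parallel(input_ndarray). It is worth calling the function once first, so that compilation time is not part of the measurement.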