Optimize series.rolling.sum() #608

densmirn · 2020-02-16T16:45:15Z

Previous implementation results:

name	nthreads	type	size	median
Series.rolling.sum	4	Python	10000000	0.616
Series.rolling.sum	4	SDC	10000000	4.544

Optimized implementation results:

name	nthreads	type	size	median
Series.rolling.sum	4	Python	10000000	0.552
Series.rolling.sum	4	SDC	10000000	0.053

The optimized implementation runs up to ~85 times faster than the previous one and up to ~10 times faster than Python. It does not scale across threads yet: prange isn't used at all, because the variable nfinite (the number of finite values) is shared by all threads.
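
For readers following along, here is a minimal pure-Python sketch of the linear idea, not the SDC code itself: keep a running total and a running nfinite, and update both as values enter and leave the window. The function name `rolling_sum` and the use of `math.isfinite` to decide what counts as finite are assumptions made for illustration. Each iteration depends on the state left by the previous one, which is why this form cannot be handed to prange directly.

```python
import math


def rolling_sum(values, window, min_periods):
    """Rolling sum in O(n): update a running total instead of re-summing each window."""
    out = []
    total = 0.0   # sum of the finite values currently inside the window
    nfinite = 0   # number of finite values currently inside the window
    for i, value in enumerate(values):
        # Put the value that enters the window.
        if math.isfinite(value):
            total += value
            nfinite += 1
        # Pop the value that leaves the window once the window is full.
        if i >= window:
            old = values[i - window]
            if math.isfinite(old):
                total -= old
                nfinite -= 1
        # Emit NaN until enough finite values have been seen.
        out.append(total if nfinite >= min_periods else math.nan)
    return out


result = rolling_sum([1.0, 2.0, math.nan, 4.0, 5.0], window=3, min_periods=2)
# result is [nan, 3.0, 3.0, 6.0, 9.0]
```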

AlexanderKalistratov · 2020-02-17T21:18:41Z

sdc/datatypes/hpat_pandas_series_rolling_functions.py

+        output_arr = numpy.empty(length, dtype=float64)
+
+        chunks = get_chunks(length)
+        for i in prange(len(chunks)):


Did it help? Can't wait to know the result 😄

BTW, you are not going to write all this monstrous code for every rolling function, are you?

I'm expecting to see a generic implementation for most of the series methods, something like this:

    windows = [WindowKind(window_size)]
    for i in range(1, len(chunks)):
        windows.append(WindowKind(window_size))

    for i in prange(len(chunks)):
        chunk = chunks[i]
        window = windows[i]

        # Values that precede the chunk but still belong to its first windows
        prelude_start = max(0, chunk.start - window_size)
        prelude_stop = max(0, chunk.start)

        for j in range(prelude_start, prelude_stop):
            window.add(data, j)

        for j in range(chunk.start, chunk.stop):
            window.add(data, j)
            result[j] = window.get_result()

This is pseudocode. You need to think through the exact details.
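
To make the pseudocode above concrete, here is a small pure-Python sketch under stated assumptions. The class name `WindowSum` echoes the commit messages in this PR, but its body, the `split_chunks` helper, and the default `min_periods == window_size` behaviour are all hypothetical. A plain `for` loop stands in for prange; the point is only that each chunk warms up its own window from the values just before it, so chunks do not share state.

```python
import math


class WindowSum:
    """Hypothetical window kind: running sum of the last `size` values."""

    def __init__(self, size):
        self.size = size
        self.count = 0
        self.total = 0.0

    def add(self, data, j):
        # Take in data[j]; drop the oldest value once the window overflows.
        self.total += data[j]
        self.count += 1
        if self.count > self.size:
            self.total -= data[j - self.size]
            self.count -= 1

    def get_result(self):
        # Assumption: a result is only defined for a completely filled window.
        return self.total if self.count == self.size else math.nan


def split_chunks(length, nchunks):
    """Hypothetical helper: split range(length) into at most nchunks ranges."""
    step = max(1, -(-length // nchunks))  # ceiling division, at least 1
    return [range(s, min(s + step, length)) for s in range(0, length, step)]


def rolling_by_chunks(data, window_size, nchunks):
    result = [math.nan] * len(data)
    for chunk in split_chunks(len(data), nchunks):  # prange in the real code
        window = WindowSum(window_size)
        # Prelude: feed the values that precede the chunk, without output.
        prelude_start = max(0, chunk.start - window_size + 1)
        for j in range(prelude_start, chunk.start):
            window.add(data, j)
        for j in chunk:
            window.add(data, j)
            result[j] = window.get_result()
    return result


data = [float(i) for i in range(10)]
chunked = rolling_by_chunks(data, window_size=3, nchunks=4)
single = rolling_by_chunks(data, window_size=3, nchunks=1)
```

The chunked and single-chunk runs produce the same output, which is the property the parallel version relies on.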

accidental approval

pep8speaks · 2020-02-18T12:04:34Z

Hello @densmirn! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-02-20 05:35:37 UTC

densmirn

Current performance:

name	nthreads	type	size	median
Series.rolling.sum	1	Python	200000	1.219
Series.rolling.sum	1	SDC	200000	0.905
Series.rolling.sum	4	Python	200000	1.23
Series.rolling.sum	4	SDC	200000	0.409

Python 1 / SDC 4 = 2.98
Scalability is now enabled.

densmirn

Current performance:

name	nthreads	type	size	median
Series.rolling.sum	1	Python	800000	3.903
Series.rolling.sum	1	SDC	800000	0.517
Series.rolling.sum	4	Python	800000	3.947
Series.rolling.sum	4	SDC	800000	0.254

Python 1 / SDC 1 = 7.549
Python 1 / SDC 4 = 15.366

Remeasured linear implementation b2a4d9d:

name	nthreads	type	size	median
Series.rolling.sum	1	Python	800000	4.01
Series.rolling.sum	1	SDC	800000	0.401

SDC_LINEAR 1 / SDC_PARALLEL 1 = 0.776
SDC_LINEAR 1 / SDC_PARALLEL 4 = 1.579

I think it's a victory.
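
For anyone who wants to re-derive the ratios, a short sketch that recomputes them from the medians in the size-800000 tables above. The dictionary keys and the `ratio` helper are just illustrative names.

```python
# Medians copied from the size-800000 tables in this comment.
medians = {
    ("Python", 1): 3.903,
    ("SDC parallel", 1): 0.517,
    ("SDC parallel", 4): 0.254,
    ("SDC linear", 1): 0.401,
}


def ratio(numerator, denominator):
    """Ratio of two medians, rounded to three decimals as in the comment."""
    return round(medians[numerator] / medians[denominator], 3)


speedups = {
    "Python 1 / SDC 1": ratio(("Python", 1), ("SDC parallel", 1)),
    "Python 1 / SDC 4": ratio(("Python", 1), ("SDC parallel", 4)),
    "SDC_LINEAR 1 / SDC_PARALLEL 1": ratio(("SDC linear", 1), ("SDC parallel", 1)),
    "SDC_LINEAR 1 / SDC_PARALLEL 4": ratio(("SDC linear", 1), ("SDC parallel", 4)),
}
```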

AlexanderKalistratov · 2020-02-19T20:26:27Z

sdc/datatypes/hpat_pandas_series_rolling_functions.py

+    return nfinite, result
+
+
+def gen_sdc_pandas_series_rolling_impl(pop, put, init_result=numpy.nan):


Please consider the following option:

    @sdc_register_jitable
    def result_or_nan(nfinite, minp, result):
        if nfinite < minp:
            return numpy.nan

        return result


    def gen_sdc_pandas_series_rolling_impl(pop, put, init_result=numpy.nan):
        """Generate series rolling methods implementations based on pop/put funcs"""
        def impl(self):
            win = self._window
            minp = self._min_periods

            input_series = self._data
            input_arr = input_series._data
            length = len(input_arr)
            output_arr = numpy.empty(length, dtype=float64)

            chunks = parallel_chunks(length)
            for i in prange(len(chunks)):
                chunk = chunks[i]
                nfinite = 0
                result = init_result

                prelude_start = max(0, chunk.start - win + 1)
                prelude_stop = min(chunk.start, prelude_start + win)

                interlude_start = prelude_stop
                interlude_stop = min(prelude_start + win, chunk.stop)

                for idx in range(prelude_start, prelude_stop):
                    value = input_arr[idx]
                    nfinite, result = put(value, nfinite, result)

                for idx in range(interlude_start, interlude_stop):
                    value = input_arr[idx]
                    nfinite, result = put(value, nfinite, result)
                    output_arr[idx] = result_or_nan(nfinite, minp, result)

                for idx in range(interlude_stop, chunk.stop):
                    put_value = input_arr[idx]
                    pop_value = input_arr[idx - win]
                    nfinite, result = put(put_value, nfinite, result)
                    nfinite, result = pop(pop_value, nfinite, result)
                    output_arr[idx] = result_or_nan(nfinite, minp, result)

            return pandas.Series(output_arr, input_series._index, name=input_series._name)

        return impl

It's not the most elegant option, but it could give us some performance (by eliminating the condition inside the loop and the extra counter). If it doesn't, your solution is preferable.

Also, I've changed the order of put and pop (put first, then pop). It shouldn't affect sum, but it could be useful for min and max: if the value just added is the new min/max, we don't need to recalculate the result.
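
A pure-Python sketch of how the three loops behave for one chunk, without Numba or pandas. The `put_sum`/`pop_sum` bodies, the `rolling_chunk` driver, and the choice of `init_result=0.0` for sum are assumptions for illustration, not the functions from this PR. The warm-up loop writes nothing, the filling loop only puts, and the steady-state loop puts and then pops with no `if` on the index.

```python
import math


def put_sum(value, nfinite, result):
    """Hypothetical put for sum: take in a value entering the window."""
    if math.isfinite(value):
        return nfinite + 1, result + value
    return nfinite, result


def pop_sum(value, nfinite, result):
    """Hypothetical pop for sum: drop a value leaving the window."""
    if math.isfinite(value):
        return nfinite - 1, result - value
    return nfinite, result


def rolling_chunk(data, out, start, stop, win, minp, put, pop, init_result):
    """Fill out[start:stop] using the prelude / interlude / steady-state split."""
    nfinite, result = 0, init_result
    prelude_start = max(0, start - win + 1)
    prelude_stop = min(start, prelude_start + win)
    interlude_stop = min(prelude_start + win, stop)

    for idx in range(prelude_start, prelude_stop):    # warm-up, no output
        nfinite, result = put(data[idx], nfinite, result)
    for idx in range(prelude_stop, interlude_stop):   # window still filling
        nfinite, result = put(data[idx], nfinite, result)
        out[idx] = result if nfinite >= minp else math.nan
    for idx in range(interlude_stop, stop):           # full window: put, then pop
        nfinite, result = put(data[idx], nfinite, result)
        nfinite, result = pop(data[idx - win], nfinite, result)
        out[idx] = result if nfinite >= minp else math.nan


data = [1.0, 2.0, math.nan, 4.0, 5.0, 6.0, 7.0]
out = [math.nan] * len(data)
for start, stop in [(0, 4), (4, 7)]:  # two chunks, processed independently
    rolling_chunk(data, out, start, stop, 3, 2, put_sum, pop_sum, 0.0)
# out is [nan, 3.0, 3.0, 6.0, 9.0, 15.0, 18.0]
```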

I didn't get a visible performance gain, but I like the code, so let me apply the patch.

AlexanderKalistratov · 2020-02-19T20:28:58Z

Also, please keep in mind that for some functions you need to keep more than one result (e.g. variance).
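
One possible way to read this, sketched in plain Python: keep the same put/pop shape but let the "result" be a tuple of running quantities, here the sum and the sum of squares, and derive the variance only when writing the output. All names below (`put_var`, `pop_var`, `finalize_var`, `rolling_var`) and the `ddof=1` default are assumptions. The sum-of-squares formula is the simplest to show but can lose precision on large or nearly constant data, so treat it as an illustration of carrying several values, not as a recommended formula.

```python
import math


def put_var(value, nfinite, state):
    """Hypothetical put: state is (sum, sum of squares) of the window."""
    total, total_sq = state
    if math.isfinite(value):
        return nfinite + 1, (total + value, total_sq + value * value)
    return nfinite, state


def pop_var(value, nfinite, state):
    """Hypothetical pop: remove a value that leaves the window."""
    total, total_sq = state
    if math.isfinite(value):
        return nfinite - 1, (total - value, total_sq - value * value)
    return nfinite, state


def finalize_var(nfinite, minp, state, ddof=1):
    """Turn the running state into a variance, or NaN if too few values."""
    if nfinite < minp or nfinite <= ddof:
        return math.nan
    total, total_sq = state
    return (total_sq - total * total / nfinite) / (nfinite - ddof)


def rolling_var(data, win, minp):
    out = []
    nfinite, state = 0, (0.0, 0.0)
    for idx, value in enumerate(data):
        nfinite, state = put_var(value, nfinite, state)
        if idx >= win:
            nfinite, state = pop_var(data[idx - win], nfinite, state)
        out.append(finalize_var(nfinite, minp, state))
    return out


variances = rolling_var([1.0, 2.0, 3.0, 4.0, 5.0], win=3, minp=3)
```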

densmirn added the Waiting on CI label Feb 16, 2020

densmirn requested review from AlexanderKalistratov and kozlov-alexey February 16, 2020 16:45

densmirn added Ready for Review and removed Waiting on CI labels Feb 17, 2020

densmirn force-pushed the feature/series_rolling_sum_opt branch from 066cd4a to 178a4d9 Compare February 17, 2020 07:31

densmirn changed the title ~~Reimplement series.rolling.sum()~~ Optimize series.rolling.sum() Feb 17, 2020

Optimize series.rolling.sum()

b2a4d9d

densmirn force-pushed the feature/series_rolling_sum_opt branch from 178a4d9 to b2a4d9d Compare February 17, 2020 12:15

densmirn added 2 commits February 17, 2020 19:22

Enable scalability for series.rolling.sum

c22c208

Merge branch 'master' of https://github.com/IntelPython/sdc into feat…

ae90d1a

…ure/series_rolling_sum_opt

densmirn added [WIP] Work in progress and removed Ready for Review labels Feb 17, 2020

AlexanderKalistratov previously approved these changes Feb 17, 2020

densmirn added 2 commits February 18, 2020 15:03

Move rolling part to separate place

7c10fca

Merge branch 'master' of https://github.com/IntelPython/sdc into feat…

236ac8a

…ure/series_rolling_sum_opt

Fix style issues

be0229f

densmirn added Ready for Review and removed [WIP] Work in progress labels Feb 18, 2020

densmirn added 2 commits February 18, 2020 15:07

Minor fixes for series.rolling.sum()

b18f5d3

Change perf test for series.rolling.sum()

e0d92fd

densmirn force-pushed the feature/series_rolling_sum_opt branch from 8a4b6da to e0d92fd Compare February 18, 2020 12:53

densmirn added [WIP] Work in progress and removed Ready for Review labels Feb 18, 2020

densmirn added 4 commits February 18, 2020 18:27

Fix issue in case of multithreading

0e63d1a

Minor changes in WindowSum

a8572f8

Merge branch 'master' of https://github.com/IntelPython/sdc into feat…

6236a57

…ure/series_rolling_sum_opt

Enable scalability for series.rolling.sum()

713f623

densmirn added 2 commits February 19, 2020 13:39

Merge branch 'master' of https://github.com/IntelPython/sdc into feat…

111bf08

…ure/series_rolling_sum_opt

Change perf test on series.rolling.sum

5be165c

densmirn commented Feb 19, 2020

densmirn added Ready for Review and removed [WIP] Work in progress labels Feb 19, 2020

Refuse class WindowSum

ac675e5

densmirn force-pushed the feature/series_rolling_sum_opt branch from 41896f4 to ac675e5 Compare February 19, 2020 16:38

densmirn commented Feb 19, 2020

AlexanderKalistratov reviewed Feb 19, 2020

densmirn added 2 commits February 20, 2020 08:35

Refactor series.rolling.sum()

86f3d5a

Merge branch 'master' of https://github.com/IntelPython/sdc into feat…

48c88b6

…ure/series_rolling_sum_opt

AlexanderKalistratov approved these changes Feb 20, 2020

AlexanderKalistratov merged commit 632b554 into IntelPython:master Feb 20, 2020

densmirn deleted the feature/series_rolling_sum_opt branch June 9, 2020 12:12
