
Rewrite TPS functions to improve performance on large data sets and make them more readable

Lan Dam requested to merge improve_calculating_tps into develop

closes #253 (closed)

Performance is improved by reducing nested loops: np.split is now used to split the data into 5-minute blocks, replacing the separate loops and nested loops that did this before.
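A minimal sketch of the np.split idea, not the code in this MR. The helper name `split_into_5m_blocks`, the epoch-second time format, and the per-block mean of squares are all assumptions made for illustration; the input times are assumed to be sorted already.

```python
import numpy as np

SEC_5M = 300  # length of one block in seconds


def split_into_5m_blocks(times, data):
    """Split time-sorted samples into consecutive 5-minute blocks.

    Hypothetical helper for illustration only.
    times: epoch seconds, sorted ascending; data: same length as times.
    Returns (block_starts, list of data arrays, one per block).
    """
    times = np.asarray(times, dtype=float)
    data = np.asarray(data, dtype=float)
    # Start of the 5-minute block that contains the first sample.
    first_start = np.floor(times[0] / SEC_5M) * SEC_5M
    # Number of blocks needed to cover the last sample.
    n_blocks = int((times[-1] - first_start) // SEC_5M) + 1
    block_starts = first_start + SEC_5M * np.arange(n_blocks)
    # Index of the first sample at or after each interior boundary.
    split_idx = np.searchsorted(times, block_starts[1:], side="left")
    # One call replaces the nested loops; blocks inside a gap come back empty.
    return block_starts, np.split(data, split_idx)


times = [0, 100, 299, 300, 650]
data = [1, 2, 3, 4, 5]
block_starts, blocks = split_into_5m_blocks(times, data)
# Assumed per-block statistic: mean of squared samples, 0 for empty blocks.
mean_sq = [float(np.mean(np.square(b))) if b.size else 0.0 for b in blocks]
```

Computing all split indices with one np.searchsorted call keeps the per-sample work inside NumPy, which is where the speedup on large data sets would be expected to come from.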

This MR also sorts the data by time, which handles sections with gaps and overlaps more correctly.
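To illustrate why sorting helps with overlaps, here is a small sketch. The function name `sort_by_time` and the sample values are hypothetical, and it assumes times and data are held as parallel arrays built by concatenating segments.

```python
import numpy as np


def sort_by_time(times, data):
    """Reorder parallel arrays so that times are ascending.

    Hypothetical helper for illustration only. A stable sort keeps
    samples with equal timestamps in their original relative order.
    """
    times = np.asarray(times)
    data = np.asarray(data)
    order = np.argsort(times, kind="stable")
    return times[order], data[order]


# Two concatenated segments; the second overlaps the end of the first.
times = [0, 1, 2, 3, 2, 3, 4]
data = [10, 11, 12, 13, 22, 23, 24]
sorted_times, sorted_data = sort_by_time(times, data)
```

Once the times are ascending, block boundaries can be located with a single binary search per boundary instead of scanning each segment separately.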

Comparison on the large data set 5083 (12GB):

Plotting all channels (RAW, SOH, TPS), timed from the start of MainWindow.replot_loaded_data() until the last TPS plot is done:

Develop branch: 442.92s, already much reduced after several earlier merges (compare with 1842s, measured from the start of the TPS widget's plot_channels).

This branch: 49.13s

Old get_start_5mins_of_diff_days(): 0.10749197006225586s

New get_start_5min_blocks(): 0.00043511390686035156s

The absolute saving is small, but the new function is more readable, and its result serves the new get_tps_for_discontinuous_data() better.

For the changes in find_tps_idx(), e.g. 6407.sdr, channel LH1's TPS, timed from the start of on_pick_event() until the end of on_ctrl_cmd_click():

  • develop branch: 0.2269s
  • this branch: 0.1827s

The gain is small, but the code is more readable and consistent with the start_5min_blocks above.

Edited by Lan Dam
