Skip to content

Enhance compute_MVBS feasability #1470

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 3, 2025

Conversation

ctuguinay
Copy link
Collaborator

@ctuguinay ctuguinay commented Feb 24, 2025

Allow reindex to be visible flox parameter in compute_MVBS and default to reindex=False, allowing computations to be more memory feasible.

Added range_var_max optional parameter which allows a user to specify range variable maximum, which in turn allows compute_MVBS to be lazily computed.

Tested the new compute_MVBS code on a large 40 GiB Sv dataset on a Dask Local Cluster with 2 workers each with 2 threads and 2 GiB RAM (4 GiB RAM total), shown on the following Gist: https://gist.github.com/ctuguinay/4cafda0c604fe6873b8f18c79cccdd1c

@codecov-commenter
Copy link

codecov-commenter commented Feb 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.56%. Comparing base (9f56124) to head (d663ba3).
Report is 201 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1470       +/-   ##
===========================================
+ Coverage   83.52%   96.56%   +13.04%     
===========================================
  Files          64        3       -61     
  Lines        5686      204     -5482     
===========================================
- Hits         4749      197     -4552     
+ Misses        937        7      -930     
Flag Coverage Δ
unittests 96.56% <100.00%> (+13.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ctuguinay ctuguinay marked this pull request as ready for review February 25, 2025 00:12
@ctuguinay ctuguinay requested a review from leewujung February 25, 2025 00:12
@ctuguinay ctuguinay self-assigned this Feb 25, 2025
@ctuguinay ctuguinay added the enhancement This makes echopype better label Feb 25, 2025
@ctuguinay ctuguinay added this to the v0.10.1 milestone Feb 25, 2025
@ctuguinay
Copy link
Collaborator Author

@leewujung This should be ready for review now

leewujung
leewujung previously approved these changes Apr 3, 2025
Copy link
Member

@leewujung leewujung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ctuguinay : Sorry it took me forever to look through this. Your finding with reindex is interesting. I think this is ready to be merged. I only have a tiny change on a comment.

I read the flox docs though not sure if I fully understood everything. Was what you found that with reindex=True we typically run out of memory for larger datasets on a regular-sized machine, and when the machine has larger memory, we can set reindex=True to increase the speed?

@ctuguinay ctuguinay merged commit 7dd2743 into OSOceanAcoustics:main Apr 3, 2025
5 checks passed
@github-project-automation github-project-automation bot moved this to Done in Echopype Apr 3, 2025
ctuguinay added a commit to ctuguinay/echopype that referenced this pull request Apr 3, 2025
* add range var max and reindex to compute_MVBS parameters and drop flox maximum version

* add tests for compute MVBS changes

* add todo for compute NASC

* add np.nan fill value

* Update echopype/commongrid/api.py

Co-authored-by: Wu-Jung Lee <[email protected]>

---------

Co-authored-by: Wu-Jung Lee <[email protected]>
ctuguinay added a commit that referenced this pull request Apr 3, 2025
* use align vectors instead of from matrix for a channel-wise rotation

* use normalization logic, add warnings, and add tests

* simplify test

* small word change

* Update echopype/consolidate/ek_depth_utils.py

Co-authored-by: Wu-Jung Lee <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* replace previous APA7 citation with recent ICES APA7 citation (#1455)

* pin scipy to temporarily ensure that rotation matrix calculation does not fail (#1460)

* [pre-commit.ci] pre-commit autoupdate (#1461)

updates:
- [github.com/PyCQA/flake8: 7.1.1 → 7.1.2](PyCQA/flake8@7.1.1...7.1.2)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Check if there exist any swap files before cleaning them up [all tests ci] (#1451)

* check if zarr stores > 0

* add a test for echodata delete

* fix path

* add comments for readability

---------

Co-authored-by: Wu-Jung Lee <[email protected]>

* remove __setattr__ from EchoData (#1457)

* compute dask array before np array equal (#1452)

* Dataset.dims to Dataset.sizes [all tests ci] (#1453)

* dims to sizes

* dims to sizes when using ds.dims

* dims to sizes for test echodata combine

* Chunks as dictionaries in `_get_auto_chunk` [all tests ci] (#1454)

* chunks as dictionaries in _get_auto_chunk

* when zarr encoding get chunks as list-like and rename _get_auto_chunk to _get_dask_auto_chunk to show that there is a difference between Dask and Zarr chunking

* set decode timedelta to False since it will default to this in later xarray versions (#1462)

* Fix deprecation warning for truth value of an empty array [all tests ci] (#1450)

* fix deprecation warning for truth value of an empty array

* change test outcome from True to False when param is missing

* found logic error in checking parameter validity before loading XML

---------

Co-authored-by: Wu-Jung Lee <[email protected]>

* Fill in NaN for missing EK80 coefficients [all tests ci] (#1458)

* fill in nan for missing filter coeffs from all channels

* update test_convert_ek80_no_fil_coeff

* remove commented out section

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cast numpy float array to int

* add compute Sv test

* remove test mark

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ctuguinay <[email protected]>

* Backward compatibility of raw-converted dataset with `xr.DataTree` [all tests ci] (#1447)

* EP-1420 resolved path issue with ek60_missing_channel_power

* EP-1420 fixed path to use string concatenation

* formatting

* adding conditionals for checking platform nmea and exchange from time1 to time_nmea

* working datatree for v0.9.0 w o calibration

* Drop Ping Time Duplicates (#1382)

* init commit

* revert change to fix merge conflict

* test only one file

* use other file

* move test duplicate to test convert ek

* add extra line

* move test back to ek80 convert

* pin zarr and add check unique ping time duplicates and tests

* fix test message

* test remove zarr pin

* add back zarr pin

* Fix tests that fail from new Xarray variable and attribute assignment updates (#1433)

* add .values to change water level scalar value

* add .values to fillna

* fix assign attributes

* remove test1

* update echodata properly

* remove print statement

* Assemble AD2CP timestamp with nanosecond precision (#1436)

* assemble ad2cp timestamp at ns instead of us

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add noqa

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Use `import_resources.files` instead of the legacy `open_text` (#1434)

* use import_resources.files to open yaml files

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* import from importlib.resources

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* bump codespell version, add exceptions (#1438)

* Pin `zarr` and `netcdf4` temporarily [all tests ci] (#1429)

* pin zarr and netcdf4 in requirements.txt

* change np.alltrue to np.all

* use np.argmin(da.data) instead of da.argmin()

* use S instead s for MVBS ping_time_bin

* extract single element in computing nsamples and L in tapered_chirp

* change from S to s in testing.py

* Add `type-extensions` to requirements.txt (#1440)

* add type-extensions

* sort requirements.txt

* add manual trigger to pypi workflow (#1442)

* update cff with ices paper data (#1443)

* add assign actual range utility function (#1435)

* chore(deps): bump actions/setup-python from 5.3.0 to 5.4.0 (#1445)

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.3.0 to 5.4.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v5.3.0...v5.4.0)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#1446)

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/PyCQA/isort: 5.13.2 → 6.0.0](PyCQA/isort@5.13.2...6.0.0)
- [github.com/psf/black: 24.10.0 → 25.1.0](psf/black@24.10.0...25.1.0)
- [github.com/codespell-project/codespell: v2.4.0 → v2.4.1](codespell-project/codespell@v2.4.0...v2.4.1)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* added pooch and bound the test registry to the echodata test

* EP-1420 added legacy test files >=v0.8.4, created pooch registry, parameterized test

* EP-1420 checked tests

* EP-1420 debugging TestEchoData test failures, test_nbytes, test_setattr, & test_setitem

* EP-1447 troubleshooting bug still

* EP-1420 removed pooch and added normal test path resources

* EP-1420 fixed test_nbytes, commented out test_setattr

* EP-1420 changed "time_nmea" to "nmea_time", added ek80 legacy_datatree tests, fixed readme

* EP-1420 removed import

* trying to add sonar channel_all

* EP-1420 checked sonar for channel and renamed to channel_all

* check legacy data differently

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove v0.9.2 zarr test files

* remove test_setattr that is no longer needed, see #1457 that removes EchoData.__setattr__

* Update echopype/tests/echodata/test_echodata.py

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Caesar Tuguinay <[email protected]>
Co-authored-by: Wu-Jung Lee <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add v0.9.1 and v0.10.0 release notes to docs (#1465)

* add v0.9.1 and v0.10.0 whats new

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Update workflows to use python 3.12 and ubuntu 22.04 [all tests ci] (#1466)

* update to use python 3.12 and ubuntu 22.04

* update v0.10.0 release notes

* Replace `pkg_resources.resource_string` with `importlib.resources.files` [all tests ci] (#1468)

* replace pkg_resources with importlib.resources

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused pkg_resources import

* update __name__ to __package__ and use read_text

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused pkg_resources again

* update whats new

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* chore(deps): bump pypa/gh-action-pypi-publish from 1.12.3 to 1.12.4 (#1430)

Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.12.3 to 1.12.4.
- [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
- [Commits](pypa/gh-action-pypi-publish@v1.12.3...v1.12.4)

---
updated-dependencies:
- dependency-name: pypa/gh-action-pypi-publish
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump actions/cache from 4.2.0 to 4.2.1 (#1469)

Bumps [actions/cache](https://github.com/actions/cache) from 4.2.0 to 4.2.1.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](actions/cache@v4.2.0...v4.2.1)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* not -> no (#1471)

* [pre-commit.ci] pre-commit autoupdate (#1472)

updates:
- [github.com/PyCQA/isort: 6.0.0 → 6.0.1](PyCQA/isort@6.0.0...6.0.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* chore(deps): bump actions/cache from 4.2.1 to 4.2.2 (#1473)

Bumps [actions/cache](https://github.com/actions/cache) from 4.2.1 to 4.2.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](actions/cache@v4.2.1...v4.2.2)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump actions/setup-python from 5.4.0 to 5.5.0 (#1481)

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v5.4.0...v5.5.0)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: 5.5.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#1480)

updates:
- [github.com/PyCQA/flake8: 7.1.2 → 7.2.0](PyCQA/flake8@7.1.2...7.2.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* chore(deps): bump actions/cache from 4.2.2 to 4.2.3 (#1479)

Bumps [actions/cache](https://github.com/actions/cache) from 4.2.2 to 4.2.3.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](actions/cache@v4.2.2...v4.2.3)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* expand ping time preemptively and add corresponding chunking tests (#1483)

* Enhance `compute_MVBS` feasability  (#1470)

* add range var max and reindex to compute_MVBS parameters and drop flox maximum version

* add tests for compute MVBS changes

* add todo for compute NASC

* add np.nan fill value

* Update echopype/commongrid/api.py

Co-authored-by: Wu-Jung Lee <[email protected]>

---------

Co-authored-by: Wu-Jung Lee <[email protected]>

* indent line of code

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Wu-Jung Lee <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: oftfrfbf <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jan Meischner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This makes echopype better
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants