[29] Logging in json format #68


Merged (7 commits, Mar 17, 2025)
Conversation

@tjhunter (Collaborator) commented Mar 12, 2025

Closes #29

First part: prototyping the new format.


# TODO: performance: we re-open the file on every call. This is safer for
# multiprocessing, but we could likely do better, e.g. by relying on the logging module.
with open(os.path.join(self.path_run, "metrics.json"), "ab") as f:
@tjhunter (Collaborator, Author):

I suggest that we start with this simple version; we can always improve performance if it turns out to be a bottleneck.
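To make the discussion concrete, here is a minimal sketch of the append-per-call JSON-lines logging being proposed. The attribute name `path_run` and the file name `metrics.json` come from the snippet in the diff; the class name `MetricsLogger` and everything else are assumptions for illustration, not project code.

```python
import json
import os
import tempfile


class MetricsLogger:
    """Append one JSON object per line, re-opening the file on each call."""

    def __init__(self, path_run: str):
        self.path_run = path_run
        os.makedirs(path_run, exist_ok=True)

    def log(self, metrics: dict) -> None:
        # Encode the whole record first so a single write() call emits one line.
        # Opening in "ab" mode and writing one short line at a time is what makes
        # the re-open-per-call approach tolerable with multiple worker processes.
        line = (json.dumps(metrics) + "\n").encode("utf-8")
        with open(os.path.join(self.path_run, "metrics.json"), "ab") as f:
            f.write(line)


if __name__ == "__main__":
    run_dir = tempfile.mkdtemp()
    logger = MetricsLogger(run_dir)
    logger.log({"step": 1, "loss": 0.5})
    logger.log({"step": 2, "loss": 0.4})
    with open(os.path.join(run_dir, "metrics.json")) as f:
        rows = [json.loads(line) for line in f]
    print(len(rows))
```

The trade-off matches the TODO in the diff: re-opening per call costs a syscall pair per record, but avoids shared file handles across processes; a `logging.FileHandler` would amortize the open at the price of per-process handler setup.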

@tjhunter tjhunter marked this pull request as ready for review March 13, 2025 16:18
pyproject.toml Outdated

[tool.uv.sources]
flash-attn = { url = "https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl" }
@tjhunter (Collaborator, Author):

I had to make this change to use uv on the hpc2020 cluster. I am not sure whether this will be a breaking change for people. @clessig, do we assume that different HPCs can use different versions of CUDA? That sounds like a nightmare.

Collaborator:

We do not assume it, we know it ;) One can write a script that detects the available CUDA version (and the Python version, if that is a variable) and then assembles the string that defines the wheel to be downloaded. @tjhunter: to what extent could one integrate this into pyproject.toml?
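A rough sketch of the detection idea described above: assemble the flash-attn wheel URL from the CUDA, torch, and Python versions. The URL pattern follows the `[tool.uv.sources]` entry quoted in this PR; the function name and parameters are hypothetical, not something this script actually exists as in the repo (see the follow-up below about tracking it in an issue).

```python
import sys

RELEASE_BASE = "https://github.com/Dao-AILab/flash-attention/releases/download"


def flash_attn_wheel_url(version: str, cuda: str, torch: str, cxx11abi: bool) -> str:
    """Build the flash-attn wheel URL matching the release naming scheme,
    e.g. flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl
    """
    # Python tag of the running interpreter, e.g. "cp312" for Python 3.12.
    py = f"cp{sys.version_info.major}{sys.version_info.minor}"
    abi = "TRUE" if cxx11abi else "FALSE"
    wheel = (
        f"flash_attn-{version}+cu{cuda}torch{torch}"
        f"cxx11abi{abi}-{py}-{py}-linux_x86_64.whl"
    )
    return f"{RELEASE_BASE}/v{version}/{wheel}"


print(flash_attn_wheel_url("2.7.4.post1", "12", "2.6", cxx11abi=False))
```

Since pyproject.toml is static, such a script would have to run before `uv sync` and rewrite (or template) the `[tool.uv.sources]` entry; uv itself does not execute detection logic from pyproject.toml.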

Collaborator:

And could we open an issue to track this? :)

@tjhunter (Collaborator, Author):

I have the script in a branch of the private repo, but it is not committed yet:
#57

@tjhunter tjhunter merged commit 1dece82 into develop Mar 17, 2025
3 checks passed
@tjhunter (Collaborator, Author):

As discussed, this will be followed up by #90.

Labels: none yet
Projects: Status: Done
Successfully merging this pull request may close: Refactor TrainLogger
3 participants