[PRE REVIEW]: Balsa: A Fast C++ Random Forest Classifier with Commandline and Python Interface #7599

editorialbot · 2024-12-17T14:09:19Z

Submitting author: @tobiasborsdorff (Tobias Borsdorff)
Repository: https://github.com/SRON-Earth/Balsa
Branch with paper.md (empty if default branch):
Version: v1.0.0
Editor: @HaoZeke
Reviewers: Pending
Managing EiC: Chris Vernon

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/f324b8495db8e2983e97cb9692817b48"><img src="https://joss.theoj.org/papers/f324b8495db8e2983e97cb9692817b48/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/f324b8495db8e2983e97cb9692817b48/status.svg)](https://joss.theoj.org/papers/f324b8495db8e2983e97cb9692817b48)

Author instructions

Thanks for submitting your paper to JOSS @tobiasborsdorff. Currently, there isn't a JOSS editor assigned to your paper.

@tobiasborsdorff if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). You can search the list of people that have already agreed to review and may be suitable for this submission.

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands

editorialbot · 2024-12-17T14:09:22Z

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot · 2024-12-17T14:09:41Z

Software report:

github.com/AlDanial/cloc v 1.90  T=0.03 s (1658.5 files/s, 282255.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C/C++ Header                    24            702           1133           2515
C++                             16            347            231           1854
Markdown                         2            343              0            725
TeX                              1             12              0             82
Bourne Shell                     1             13             15             70
YAML                             1              0              0             55
CMake                            3             20              7             45
-------------------------------------------------------------------------------
SUM:                            48           1437           1386           5346
-------------------------------------------------------------------------------

Commit count by author:

   136	Joris van Zwieten
   104	Denis de Leeuw Duarte
     5	Tobias Borsdorff

editorialbot · 2024-12-17T14:09:42Z

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

✅ OK DOIs

- 10.3390/rs16071208 is OK
- 10.5281/zenodo.14186320 is OK
- 10.5281/zenodo.14186406 is OK
- 10.1023/A:1010933404324 is OK
- 10.5194/amt-14-665-2021 is OK
- 10.5194/amt-16-1597-2023 is OK

🟡 SKIP DOIs

- No DOI given, and none found for title: Balsa: A Fast C++ Random Forest Classifier
- No DOI given, and none found for title: Scikit-learn: Machine Learning in Python

❌ MISSING DOIs

- None

❌ INVALID DOIs

- None

editorialbot · 2024-12-17T14:09:45Z

Paper file info:

📄 Wordcount for paper.md is 861

✅ The paper includes a Statement of need section

editorialbot · 2024-12-17T14:09:48Z

License info:

✅ License found: BSD 3-Clause "New" or "Revised" License (Valid open source OSI approved license)

editorialbot · 2024-12-17T14:10:46Z

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

editorialbot · 2024-12-17T14:11:30Z

Five most similar historical JOSS papers:

ADaPT-ML: A Data Programming Template for Machine Learning
Submitting author: @nulberry
Handling editor: @jmschrei (Active)
Reviewers: @aaronpeikert, @wincowgerDEV
Similarity score: 0.7037

ASCENDS: Advanced data SCiENce toolkit for Non-Data Scientists
Submitting author: @ornlpmcp
Handling editor: @terrytangyuan (Retired)
Reviewers: @zhampel, @jrbourbeau
Similarity score: 0.6984

AutoClassWrapper: a Python wrapper for AutoClass C classification
Submitting author: @pierrepo
Handling editor: @trallard (Retired)
Reviewers: @rpetit3, @lowandrew
Similarity score: 0.6955

rFBP: Replicated Focusing Belief Propagation algorithm
Submitting author: @Nico-Curti
Handling editor: @arokem (Retired)
Reviewers: @justusschock, @DanielLenz
Similarity score: 0.6913

CRATE: A Python package to perform fast material simulations
Submitting author: @BernardoFerreira
Handling editor: @Kevin-Mattheus-Moerman (Active)
Reviewers: @RahulSundar, @atzberg, @Extraweich, @kingyin3613
Similarity score: 0.6906

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

crvernon · 2025-01-03T15:08:07Z

@editorialbot invite @HaoZeke as editor

👋 can you take this one on @HaoZeke?

editorialbot · 2025-01-03T15:08:09Z

Invitation to edit this submission sent!

tobiasborsdorff · 2025-01-16T09:07:23Z

> @editorialbot invite @HaoZeke as editor
>
> 👋 can you take this one on @HaoZeke?

Dear @crvernon,

I hope this message finds you well. I’m writing to kindly follow up on the paper I submitted over a month ago. It seems that the editor assignment has not yet been initiated. Could you please let me know when this step might take place?

I truly appreciate your time and assistance and look forward to your response.

Best regards, Tobias Borsdorff

crvernon · 2025-01-17T12:53:49Z

👋 @tobiasborsdorff - we have a large backlog of submissions right now so it may take a little longer than usual to get you set up with an editor. Thanks for your patience!

@HaoZeke - are you able to take this one on?

crvernon · 2025-01-28T15:25:27Z

@HaoZeke just checking back in on this one.

HaoZeke · 2025-01-29T11:23:38Z

@editorialbot assign @HaoZeke as editor

Thanks for the invite @crvernon

editorialbot · 2025-01-29T11:23:42Z

Assigned! @HaoZeke is now the editor

HaoZeke · 2025-02-10T20:31:38Z

Hi @dostuffthatmatters 👋 would you be interested in and available to review this JOSS submission? We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: joss.readthedocs.io/en/latest/review_criteria.html

HaoZeke · 2025-02-10T20:32:46Z

Hi @cpellet 👋 would you be interested in and available to review this JOSS submission? We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: joss.readthedocs.io/en/latest/review_criteria.html

HaoZeke · 2025-02-10T20:37:44Z

Hi @bcjaeger 👋 would you be interested in and available to review this JOSS submission? We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: joss.readthedocs.io/en/latest/review_criteria.html

dostuffthatmatters · 2025-02-10T21:14:03Z

Hi @HaoZeke,

Thank you for asking. I am busy with another review right now, so I have to kindly decline.

A general remark:

The main statement of need for this tool is about performance, but the paper contains no performance evaluation against existing tools. The scikit-learn implementations of decision trees and random forests are extremely efficient, and there are many ML libraries like scikit-learn with interfaces for different programming languages.

Maybe there is a good reason for the existence of Balsa, but I could not tell that from the given material. Good luck with the review!

bcjaeger · 2025-02-10T21:34:38Z

> Hi @bcjaeger 👋 would you be interested in and available to review this JOSS submission? We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: joss.readthedocs.io/en/latest/review_criteria.html

Hello! 👋 This looks very interesting, but I don't have enough availability to review at the moment. I agree that pointing to a formal benchmark of computational efficiency in the article would be a great addition.

tobiasborsdorff · 2025-04-03T07:09:48Z

Dear @HaoZeke,

I hope you're doing well. I wanted to follow up regarding the review process for my manuscript, as it has not yet started. If possible, could you kindly check whether potential reviewers are available?

I appreciate your time and assistance.

Best regards,
Tobias Borsdorff

HaoZeke · 2025-04-03T22:04:43Z

My apologies, @tobiasborsdorff. Have there been any updates addressing the comments of @dostuffthatmatters?

tobiasborsdorff · 2025-04-07T14:22:32Z

Dear @HaoZeke,

Thank you for your feedback. We have already conducted a performance analysis of the Balsa algorithm, which is documented in Section 6.1 of the ATBD (https://zenodo.org/records/14186320), which is also referenced in our paper.

As illustrated by the figures in that chapter, Balsa demonstrates clear advantages in both memory usage and runtime compared to the scikit-learn implementation as well as ranger—a C++ implementation of the random forest algorithm.

We would be happy to include a brief summary of these results in the manuscript to enhance clarity. However, we kindly ask that this addition be addressed during the official review process. While @dostuffthatmatters raised a valuable point, it is important to note that the suggestion was made outside the formal review framework and the manuscript is still not under review.

Best regards, Tobias Borsdorff

tobiasborsdorff · 2025-04-09T12:55:56Z

Dear @HaoZeke and @dostuffthatmatters,

I hope this message finds you well. I wanted to inform you that I have updated the manuscript to include a full paragraph, with figures, discussing the runtime and memory usage of the Balsa implementation in comparison with the scikit-learn (Python) and ranger (C++) implementations. This addition highlights the performance advantages of Balsa, particularly in memory efficiency and prediction speed, which make the software suitable for large-scale operations.

With this update, I believe the manuscript is now complete and ready for the next steps. As the manuscript has been awaiting review since December last year, I would greatly appreciate it if you could kindly restart the review process and initiate the search for reviewers at your earliest convenience.

Thank you for your time and consideration.

Best regards, Tobias

HaoZeke · 2025-04-09T19:12:23Z

Thanks @tobiasborsdorff, if you have any suggested reviewers, please let me know here without the @.

dostuffthatmatters · 2025-04-09T19:13:55Z

@editorialbot generate pdf

editorialbot · 2025-04-09T19:15:12Z

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

editorialbot · 2025-04-09T19:17:02Z

Five most similar historical JOSS papers:

None
Submitting author: None
Handling editor: None (None)
Reviewers: None
Similarity score: 0.6816

None
Submitting author: None
Handling editor: None (None)
Reviewers: None
Similarity score: 0.6762

MetObs - a Python toolkit for using non-traditional meteorological observations
Submitting author: @vergauwenthomas
Handling editor: @hugoledoux (Active)
Reviewers: @ashwinvis, @Zeitsperre
Similarity score: 0.6761

quantile-forest: A Python Package for Quantile Regression Forests
Submitting author: @reidjohnson
Handling editor: @jbytecode (Active)
Reviewers: @jncraton, @oparisot
Similarity score: 0.6750

rFBP: Replicated Focusing Belief Propagation algorithm
Submitting author: @Nico-Curti
Handling editor: @arokem (Retired)
Reviewers: @justusschock, @DanielLenz
Similarity score: 0.6710

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

dostuffthatmatters · 2025-04-09T19:19:00Z

@tobiasborsdorff great, thanks - looks promising! My other review is still ongoing though. Best of luck!

tobiasborsdorff · 2025-04-14T07:28:37Z

> Thanks @tobiasborsdorff, if you have any suggested reviewers please let me know without the @ here.

Dear @HaoZeke, I don't know many people who review for JOSS, but a colleague of mine suggested these two: https://github.com/mkhorton and https://github.com/dgasmith. Maybe they're a good fit! It would be great if you could also reach out to a few people you think might be interested. Regards, Tobias

editorialbot added pre-review Track: 5 (DSAIS) Data Science, Artificial Intelligence, and Machine Learning labels Dec 17, 2024

editorialbot added C++ TeX CMake labels Dec 17, 2024

editorialbot assigned HaoZeke Jan 29, 2025
