
[PRE REVIEW]: ParquetDB: A Lightweight Python Parquet-Based Database #7867


Closed
editorialbot opened this issue Mar 3, 2025 · 34 comments
Labels
pre-review, Track: 7 (CSISM) Computer science, Information Science, and Mathematics

Comments

@editorialbot
Collaborator

editorialbot commented Mar 3, 2025

Submitting author: @lllangWV (Logan Lang)
Repository: https://github.com/lllangWV/ParquetDB
Branch with paper.md (empty if default branch): main
Version: v0.25.1
Editor: @fabian-s
Reviewers: @ckoerber, @perdelt
Managing EiC: Daniel S. Katz

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/fb723cd2091d1e2c580937e31efe0d82"><img src="https://joss.theoj.org/papers/fb723cd2091d1e2c580937e31efe0d82/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/fb723cd2091d1e2c580937e31efe0d82/status.svg)](https://joss.theoj.org/papers/fb723cd2091d1e2c580937e31efe0d82)

Author instructions

Thanks for submitting your paper to JOSS @lllangWV. Currently, there isn't a JOSS editor assigned to your paper.

@lllangWV if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). You can search the list of people that have already agreed to review and may be suitable for this submission.

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands

editorialbot added the pre-review and Track: 7 (CSISM) Computer science, Information Science, and Mathematics labels on Mar 3, 2025

@editorialbot
Collaborator Author

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

✅ OK DOIs

- 10.1063/1.4812323 is OK

🟡 SKIP DOIs

- No DOI given, and none found for title: 15 Types of Databases and When to Use Them
- No DOI given, and none found for title: Customer Case Studies
- No DOI given, and none found for title: Well-Known Users Of SQLite
- No DOI given, and none found for title: Genomics Data
- No DOI given, and none found for title: MongoDB: The Developer Data Platform \textbar Mong...
- No DOI given, and none found for title: About SQLite

❌ MISSING DOIs

- None

❌ INVALID DOIs

- None

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.98  T=0.99 s (155.1 files/s, 488711.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                            31           1612             29         178968
Python                          47           2214           3447           6415
CSS                             10            262             87           1565
Jupyter Notebook                12              0         281639           1473
JavaScript                      11            180            285           1110
Markdown                         3            445              0            892
YAML                            10             72             15            336
reStructuredText                18            291            262            302
TOML                             1             18              0             67
TeX                              1              7              0             64
DOS Batch                        1             14              6             62
make                             1             12             11             36
SVG                              4              0              0             27
Text                             2              0              0             10
JSON                             1              0              0              1
-------------------------------------------------------------------------------
SUM:                           153           5127         285781         191328
-------------------------------------------------------------------------------

Commit count by author:

   228	lllangWV
    57	GitHub Action
     6	lllang
     2	Logan Lang
     2	clayote

@editorialbot
Collaborator Author

Paper file info:

⚠️ Wordcount for paper.md is 1655

✅ The paper includes a Statement of need section

@editorialbot
Collaborator Author

License info:

✅ License found: MIT License (Valid open source OSI approved license)

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@editorialbot
Collaborator Author

Five most similar historical JOSS papers:

EspressoDB: A scientific database for managing high-performance computing workflows
Submitting author: @ckoerber
Handling editor: @gkthiruvathukal (Active)
Reviewers: @remram44, @ixjlyons
Similarity score: 0.6548

The pdb2sql Python Package: Parsing, Manipulation and Analysis of PDB Files Using SQL Queries
Submitting author: @NicoRenaud
Handling editor: @lpantano (Active)
Reviewers: @i-mtz, @JoaoRodrigues, @joaomcteixeira
Similarity score: 0.6108

matbench-genmetrics: A Python library for benchmarking crystal structure generative models using time-based splits of Materials Project structures
Submitting author: @sgbaird
Handling editor: @phibeck (Active)
Reviewers: @ml-evs, @mkhorton, @jamesrhester
Similarity score: 0.5985

SampleDB: A sample and measurement metadata database
Submitting author: @FlorianRhiem
Handling editor: @arfon (Active)
Reviewers: @stuartcampbell, @dvanic
Similarity score: 0.5983

optimade-python-tools: a Python library for serving and consuming materials data via OPTIMADE APIs
Submitting author: @ml-evs
Handling editor: @jgostick (Active)
Reviewers: @hungpham2017, @jamesrhester
Similarity score: 0.5972

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

@danielskatz

👋 @lllangWV - thanks for your submission. Please add the country to the second affiliation, and note that we don't require addresses for affiliations, just organizations and their location (which could be city and country or just country).

Also, your paper is a bit long, though I'm not sure what I would suggest removing. Please do, however, consider if there's anything that could be in the repo or docs, and just pointed to from the paper, rather than being in the paper. Perhaps one of the two benchmarks?

Finally, please look at the references in your paper, and make sure they follow the example paper: there should be a space before a reference, and multiple references have a particular format.

You can use the command @editorialbot generate pdf after making changes to the .md file to make a new PDF. And the command @editorialbot check repository runs a bunch of checks, including the wordcount of the paper, which should be close to 1000 words. editorialbot commands need to be the first thing in a new comment.

@lllangWV

lllangWV commented Mar 3, 2025

@editorialbot generate pdf

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@editorialbot
Collaborator Author

Five most similar historical JOSS papers:

EspressoDB: A scientific database for managing high-performance computing workflows
Submitting author: @ckoerber
Handling editor: @gkthiruvathukal (Active)
Reviewers: @remram44, @ixjlyons
Similarity score: 0.7060

easyaccess: Enhanced SQL command line interpreter for astronomical surveys
Submitting author: @mgckind
Handling editor: @arfon (Active)
Reviewers: @ngoldbaum
Similarity score: 0.7007

DBMS-Benchmarker: Benchmark and Evaluate DBMS in Python
Submitting author: @perdelt
Handling editor: @gkthiruvathukal (Active)
Reviewers: @simon-lewis, @erik-whiting
Similarity score: 0.6912

MatD^3^: A Database and Online Presentation Package for Research Data Supporting Materials Discovery, Design, and Dissemination
Submitting author: @raullaasner
Handling editor: @majensen (Active)
Reviewers: @dgasmith, @mkhorton
Similarity score: 0.6880

Paicos: A Python package for analysis of (cosmological) simulations performed with Arepo
Submitting author: @tberlok
Handling editor: @JBorrow (Active)
Reviewers: @ttricco, @kyleaoman
Similarity score: 0.6877

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

@lllangWV

lllangWV commented Mar 3, 2025

@editorialbot check repository

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.98  T=0.98 s (155.4 files/s, 489844.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                            31           1612             29         178968
Python                          47           2214           3447           6415
CSS                             10            262             87           1565
Jupyter Notebook                12              0         281639           1473
JavaScript                      11            180            285           1110
Markdown                         3            438              0            873
YAML                            10             72             15            336
reStructuredText                18            291            262            302
TeX                              1              6              0            100
TOML                             1             18              0             67
DOS Batch                        1             14              6             62
make                             1             12             11             36
SVG                              4              0              0             27
Text                             2              0              0             10
JSON                             1              0              0              1
-------------------------------------------------------------------------------
SUM:                           153           5119         285781         191345
-------------------------------------------------------------------------------

Commit count by author:

   230	lllangWV
    59	GitHub Action
     6	lllang
     2	Logan Lang
     2	clayote

@editorialbot
Collaborator Author

Paper file info:

📄 Wordcount for paper.md is 1076

✅ The paper includes a Statement of need section

@editorialbot
Collaborator Author

License info:

✅ License found: MIT License (Valid open source OSI approved license)

@lllangWV

lllangWV commented Mar 3, 2025

Hey @danielskatz,

Thanks for the feedback! I made the suggested adjustments. I believe I fixed the references now.

As for the length of the paper, I reduced it to 1076 words. Do you think this length is acceptable? I can try to reduce it further if needed. For instance, I could remove the installation section.

@danielskatz

This length is fine. I'll work on finding an editor next. This may take a little while because we have too many papers and too few editors, though we are bringing on more editors soon.

@danielskatz

👋 @fabian-s - Would you be able to edit this submission?

@danielskatz

@editorialbot invite @fabian-s as editor

@editorialbot
Collaborator Author

Invitation to edit this submission sent!

@fabian-s

@editorialbot add @fabian-s as editor

@editorialbot
Collaborator Author

Assigned! @fabian-s is now the editor

@fabian-s

hi @ckoerber @perdelt

would either of you be willing to review this submission (https://github.com/lllangWV/ParquetDB) for JOSS? We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: https://joss.readthedocs.io/en/latest/review_criteria.html
Even if you are not interested or don't have the time, please reply to this message so we can start looking for other reviewers.

@ckoerber

Hello @fabian-s,

Thank you for reaching out. I'd be happy to review the submission. However, due to current commitments, I won’t be able to start the review process before the start of April. If that timeline is acceptable, I’m glad to contribute.

@fabian-s

@ckoerber sure, that's fine. it will take me a couple of days (or weeks...) to find another reviewer anyway...

@fabian-s

@editorialbot add @ckoerber as reviewer

@editorialbot
Collaborator Author

@ckoerber added to the reviewers list!

@fabian-s

hi @jcrobak @perdelt @wesm

would any of you be willing to review this submission (https://github.com/lllangWV/ParquetDB) for JOSS? We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: https://joss.readthedocs.io/en/latest/review_criteria.html
Even if you are not interested or don't have the time, please be so kind as to reply to this message so I can start looking for other reviewers.

@perdelt

perdelt commented Mar 21, 2025

Hi @fabian-s
Thank you for your inquiry. Yes, I'd be happy to review the submission.

@fabian-s

awesome, thank you @perdelt !

@fabian-s

@editorialbot add @perdelt as reviewer

@editorialbot
Collaborator Author

@perdelt added to the reviewers list!

@fabian-s

@editorialbot start review

@editorialbot
Collaborator Author

OK, I've started the review over in #7932.
