Skip to content

Submission: rsketball #29

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
9 of 30 tasks
carlinakim opened this issue Mar 17, 2020 · 4 comments
Open
9 of 30 tasks

Submission: rsketball #29

carlinakim opened this issue Mar 17, 2020 · 4 comments
Assignees

Comments

@carlinakim
Copy link

carlinakim commented Mar 17, 2020

Submitting Author: Andres Pitta (@AndresPitta ), Kenneth Foo (@kfoofw ), Anand Shankar(@vanandsh ), Carlina Kim (@carlinakim )
Repository: rsketball
Version submitted: 1.1.0
Editor: Varada Kolhatkar (@kvarada)
Reviewer 1: Aakanksha Dimri (@aakanksha023)
Reviewer 2: Subing Cao (@scao1)
Archive: TBD
Version accepted: TBD


Package: rsketball
Title: Scrape data from ESPN NBA and show summary statistics/visualisations
Version: 0.0.0.9000
Authors@R: 
    person(given = "Kenneth",
           family = "Foo",
           role = c("aut", "cre"),
           email = "[email protected]");
    person(given = "Anand",
           family = "Vemparala",
           role = c("aut", "cre"),
           email = "[email protected]");
    person(given = "Carlina",
           family = "Kim",
           role = c("aut", "cre"),
           email = "[email protected]");
    person(given = "Andres",
           family = "Pitta",
           role = c("aut", "cre"),
           email = "[email protected]")
Description: This package is designated for all NBA enthusiasts! The `rsketball` package works to scrape online tabular data from the ESPN NBA website into a csv file. It also includes various functions to create graphs and statistical analysis for your interest (such as boxplots, player rankings by stats, and a summary statistics table).
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.0.2
Suggests: 
    testthat (>= 2.1.0),
    covr,
    knitr,
    rmarkdown
Imports: 
    RSelenium,
    rvest,
    dplyr,
    forcats,
    ggplot2,
    stats,
    rlang,
    scales,
    magrittr,
    tibble,
    tidyr,
    xml2,
    utils,
    assertthat,
    checkmate,
    readr
URL: https://github.com/UBC-MDS/rsketball
BugReports: https://github.com/UBC-MDS/rsketball/issues
VignetteBuilder: knitr

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • workflow automataion
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

Our package works to scrape online tabular data from the ESPN NBA website into a csv file. It also includes various functions to create graphs and statistical analysis for your interest (such as boxplots, player rankings by stats, and a summary statistics table).

  • Who is the target audience and what are scientific applications of this package?

The target audience is data scientists and NBA enthusiasts. They can use our package for interest in basketball, to predict playoffs, fantasy drafts, etc.

nbastatR is another package that take data from other sources (NBA Stats API, Basketball Insiders, Basketball-Reference, HoopsHype, and RealGM), but no package that we currently know of takes data from ESPN NBA specifically.

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you intend for this package to go on Bioconductor?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
JOSS Options
  • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)
MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@scao1
Copy link

scao1 commented Mar 18, 2020

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

This package is well-designed and is a very nice tool for its target audience. The instructions for installation and preparation is very detailed and easy to follow. The functions are well-written and user-friendly. Here are some small suggestions you may consider for future improvements.

  1. For the nba_scraper() documentation, the input for the ‘port’ argument should not only be positive integer but also be consistent with the the docker container settings. Although the authors mentioned that in the docker setup in the readme file, but it’s better also to indicate in the function documentation. Some grammar errors and typos were found in the description of of nba_scraper() documentation, such as the first sentence is not a complete sentence and the typo “seaason” should be changed to “season”. Please also check the nba_boxplot() documentation for grammar errors.

  2. The nba_boxplot() can produce a proper error message if there are misspellings in the input to the stats_column argument. But testing the the function with misspellings in the input to the grouping_list argument doesn’t raise an error message, instead the function just deletes the misspelled columns and generates the plot using the rest columns. In this case, it’s better to raise a warning message like “ Column names not found in the data were removed” to let the users know what happened. This is also the same with the nba_team_stats().

  3. It is nice that the authors make a reference to the website explaining each columns to help the users to understand the meaning of the columns. But instead of putting the reference at the end, the authors should consider to put it at the very beginning of “Usage examples” to help the users to select the right columns to analyze.

  4. The description for nba_boxplot() in the readme file is not very accurate. It is not to analyze "how different teams and positions affect different scoring statistics" but to compare the scoring statistics for different teams or positions.

  5. To make the titles of the plots more clear, I suggest changing the title of nba_boxplot() plot to something like "Comparing FT% scores in selected TEAMs" and the title of nba_ranking() plot to "Top 3 NAME based on 3PA ranking".

  6. Dependencies and the specific versions should be listed in Readme file (not only in the DESCRIPTION file).

Overall, this is a wonderful software. I really like it! Please let me know if any of the suggestions is not clear.

@aakanksha023
Copy link

aakanksha023 commented Mar 19, 2020

Package Review

  • [X ] As the reviewer I confirm that there are no [conflicts of interest] for me to review this work

Documentation

The package includes all the following forms of documentation:

  • [X ] A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • [ X] Installation instructions: for the development version of package and any non-standard dependencies in README
  • [ X] Vignette(s) demonstrating major functionality that runs successfully locally
  • [ X] Function Documentation: for all exported functions in R help
  • [ X] Examples for all exported functions in R Help that run successfully locally
  • [ X] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • [ X] Installation: Installation succeeds as documented.
  • [ X] Functionality: Any functional claims of the software been confirmed.
  • [ X] Performance: Any performance claims of the software been confirmed.
  • [ X] Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • [ X] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 2.5 hrs

  • [ X] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Additional Comments:

  • Installation instructions were clear and to the point
  • Crisp description of the functions before installation made the experience intuitive and easy to follow
  • Few grammatical errors here and there, but nothing too significant
  • Not sure how important this is, but one of the TA's pointed it out to us that test msgs should be consistent
  • Dependencies is missing from the Readme.d file
  • Code coverage was an impressive 100%!
  • Minor suggest, head's up about the R version being >3.6 could be added to dependencies
  • test() -Tests were successful exclusively for Ubuntu with the exception of version 18.04 (as informed by the author. Appreciate it.)

Overall, the function documentation was crisp and easy to follow. Just faced a little difficulty with the function nba_boxplot. The wording could be simplified, If the author does not mind. One more minor suggestion common across all functions except scraping one (stelar work), was to reduce the number of examples to one to reduce the length of the documentation. The code for all 4 function were clean and quite easy to follow with appropriate comments.

Few glitches I noticed:

  1. nba_scraper()
  • Quite creative to display the msg "Data scraping of 2017 postseason season completed " at the end for non technical audience!
  • Impressive work with dynamic scrapping
  1. nba_boxplot()
  • An error msg in case of an unknown column in the argument grouping_list is missing. Rest all arguments produce appropriate error
  • The description of the function seems incorrect. The boxplot is helping the user compare stats for different teams/positions.
  1. nba_ranking()
  • Added functionality of rounding down the argument top automatically was a great touch instead of just displaying an error msg
  • With top=5, there seems to be some sizing issue. Unable to read the 5th name in both ascending and descending order
  • Would prefer a better colour scheme
  1. nba_team_stats()
  • An error msg in case of an invalid values for arguments positions_filter() and teams_filter() is missing. Argument stats_filter produces appropriate error msg with incorrect values

Overall, I thoroughly enjoyed the process. Look fwd to passing this along to few NBA enthusiasts post March 2020 release on CRAN (As promised in the Readme!)

@kfoofw
Copy link

kfoofw commented Mar 23, 2020

Hi @scao1 and @aakanksha023, thanks for helping the review. We are performing some changes with respect to your feedback so feel free to track it in the following links:

Issue for feedback from @scao1:
UBC-MDS/rsketball#54

Issue for feedback from @aakanksha023 :
UBC-MDS/rsketball#55

Do note that not everything in your original review will be addressed but most of it will be. Once the pull request for the changes are collected, we will update this review thread.

Thanks once again!

@kfoofw
Copy link

kfoofw commented Mar 25, 2020

Hi @scao1 and @aakanksha023, thank you for reviewing our package. We have addressed most of your suggested feedback so here is a summary of the actions taken:

Functions revision:

  • nba_scraper:
    Edited docstring documentation for grammar (1st sentence) and spelling correction [as per @scao1].
    Also included weblink to vignette for usage in docstring.
  • nba_boxplot:
    Edited docstring documentation for grammar correction (1st sentence) [as per @scao1]
    Also included weblink to vignette and dataset description for usage in docstring.
    Function improved with warning functionality to warn user if grouping_list argument has elements that do not exist in the data set. Warning statement to show that misspecified elements will be removed. [as per @scao1 and @aakanksha023 ]
    Edited boxplot title to be more explicitly descriptive [as per @scao1]
  • nba_ranking:
    Edited docstring documentation to include weblink to vignette and dataset description for usage.
    Edited ranking visualization title to be more explicitly descriptive [as per @scao1]
    Edited text font in visualization to default size as they were too large for a larger number of specified rankings (ie top = 5) [as per @aakanksha023 ]
  • nba_team_stats:
    Edited docstring documentation to include weblink to vignette and dataset description for usage.
    Function improved with warning functionality to warn user if stats_filter, teams_filter, positions_filter argument has elements that do not exist in the data set. Warning statement to show that misspecified elements will be removed. [as per @scao1 and @aakanksha023 ]

General Documentation revision:

  • Relocating the dataset description file to near the beginning of the subsection "Usage Examples" after nba_scraper to improve user onboarding experience when they are new to the package and have difficulty figuring out the other functions [as per @scao1]
  • Updated description for nba_boxplot in Readme as a comparison of statistics among teams and position groupings. [as per @scao1 and @aakanksha023 ]
  • Minor edits on vignette.rmd file due to some error in the "important statement" for the example of nba_boxplot (backtick formatting required for columns with "%" character).
  • Listing of dependencies in Readme documentation [as per @scao1 and @aakanksha023 ]

Feedback issues not addressed:

  • Additional Github Actions build test for R > 3.6 with respect to ubuntu-16.04 [as per @aakanksha023 ]
  • Colour scheme changes for nba_ranking [as per @aakanksha023 ]
  • Consistent error messages across different functions [as per @aakanksha023 ]
  • Suggestion on reduction of examples in vignette to reduce documentation length [as per @aakanksha023].

For a more relevant look at the pull requests made, please refer to the following:

We are proud to release a new version 1.2.0 of rsketball. Please enjoy the revamped package!

Yours sincerely,
@carlinakim
@AndresPitta
@vanandsh
@kfoofw

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants