Skip to content

Installation cleanup #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ivalaginja opened this issue Apr 10, 2025 · 7 comments
Closed

Installation cleanup #8

ivalaginja opened this issue Apr 10, 2025 · 7 comments

Comments

@ivalaginja
Copy link

ivalaginja commented Apr 10, 2025

This is part of openjournals/joss-reviews#7917.

Going through the installation instructions, the first thing I asked myself when seeing the optional LFS pull after a git clone was: What can I run with this package if I don't do that? Digging into this, I found a couple of issues that would need to be addressed to make the installation process more robust.

Some comments:

  • When cloning the repo for the files but installing the package from PyPI, I now have two separate instances of your code on my machine that are not in sync. Even if I were to go into the local repo clone and change some code there, it would still import it from the pip installation from PyPI, which could lead to confusion.
  • Since the assumption is that the package installation happens from PyPI, the default output path across the repository saves output data into the installed module, which is not desirable. Data outputs and code should be kept separate. It looks like @Jashcraf ran into the same issue: Error when running examples/visualize_throughput.py #7.
  • Your setup.py assumes Python versions that have reached or are about to reache end-of-life (EOL): https://devguide.python.org/versions/. This leads to some dependency confusion when running in newer Python versions.
  • It's unclear which examples require the provided data and which not.
  • it is unclear in which cases the optional dependencies astropy and hcipy are actually necessary.

Requested revisions:

PyPI installation vs. cloning the repo

  1. Separate your two installation pathways very clearly: one that installs from PyPI and allows only running your package without the provided data, and one that clones your repo and installs it from source.
  • Make sure to clearly state this difference in your installation instructions, and explain the difference between them.
  • Have the first option just list the pip install pystarshade option as that's sufficient for this case.
  • For the second option, provide the cloning instructions as you already to, and tell your users to install the package from source, in editable mode, so that any changes they do in the code is actually reflected at runtime. The command for this is (while in the top-level repository directory): pip install -e .

Output data management

  1. Implement a different default output path that separates the data from the code. I can suggest one of two options:
  • Instead of defaulting to inside the package location, default to a folder in the user's home directory that your code can create when this is done for the first time.
  • Alternatively, you could require an environment variable that the user needs to set with their desired default output path. In this case, you would need to specify this requirement in the installation instructions and also update your code base to use that environment variable.

Dependencies

  1. Simplify things and make astropy and hcipy normal dependencies. It takes time to figure out currently when I would want to install them and when not, and there is no disadvantage to adding them to the required dependencies (I think). In this way you eliminate a minor headache.
  2. Include a Python requirement of at least 3.12 to your package.
  • The module pkg_resources is not part of the Python standard library anymore since Python 3.12. This means that you need to list setuptools (which the module is a part of) as an explicit dependencies for Python versions >= 3.12. The simplest way is to bump your Python requirement to >= 3.12 and add setuptools to your dependency list.
  • The reason you haven't run into this issue before is because hcipy installs setuptools as part of its installation. However, all direct dependencies of your own package should always be listed.
  • For more info, see pkg_resources is deprecated and removed in Python 3.12 mu-editor/mu#2485
@ivalaginja ivalaginja changed the title Installation from PyPI vs. cloning repo, and output data management Installation cleanup Apr 10, 2025
@xiaziyna
Copy link
Owner

xiaziyna commented Apr 25, 2025

Hi,

Thanks for all the detail!

PyPI installation vs. cloning the repo

  1. I have amended the LFS install instructions to pip install in editable mode. I have also revised the installation instructions to clearly separate these two install instructions - with warnings.
    The diffraction tools/library certain can be used without the data. However, the user will need to supply data (such as a starshade mask) and depending on the size of the starshade/computational resources available, this could be computationally time-intensive to simulate from scratch. For this reason I have recommended the user install with git-LFS.

Output data management

  1. The output data path is currently not being used by any script (it was created for personal use). I have deleted this path and will add in an external output path if any script in the future needs one.

Dependencies

  1. I am hesitant to add HCIPy and Astropy as dependencies due to their limited use in the package/not wanting to install a bunch of extra packages on someones machine. 1) HCIPy is only used to generate example apertures using the 'apodization.pupil.make_pupil' script, the user can also provide their own aperture (as is used in the example notebook). 2) Astropy is only used to open an ExoVista fits file in example scripts. So perhaps a solution would be to include further instructions in the installation guide? I will add these as dependencies if you think it is better to do so.

  2. Thanks for pointing this out to me. I have updated the setup.py and added setuptools as a requirement. I believe this is always compatible with earlier Python 3 versions, so I have left in those versions for now.

@ivalaginja
Copy link
Author

@xiaziyna looking a lot better!

  1. The installation instructions are so much cleaner now, thanks! I will make sure to test this, but if everything works fine, nothing else to do here.
  2. From what I saw, you use plt.savefig() and np.savez_compressed() extensively throughout the repo though, at least in the examples. These outputs need to be directed to somewhere that is not inside your repository. What is your current output data management plan for these examples?
  3. If your packages has dependencies, then you should indeed list them as such, even if they are used "rarely" - even a single usage justifies this. Otherwise you are providing an incomplete installation. Users need to be able to manage their local installations themselves, so cluttering their machines is also not a good reason to withhold dependencies. On the contrary, the more streamlined your installation process, the easier it will be for your users. As a side note, nowadays people who run Python should be using virtual environment solutions anyway (e.g. conda or others), so installations for new packages and projects would be isolated enough that your installation would not be invasive into the user's setup at all. You cannot really dictate this part though, and this often remains up to the user.
  4. Indeed, this is backwards-compatibe so you should be fine. However, now you list support for Python versions 3, 3.6-3.9 and 3.12, wich makes it look like 3.10 and 3.11 are unsupported. Is this the case? If you support in fact everything starting from 3.6 and above, it might be less confusing if you just list "Programming Language :: Python :: 3", since you list python_requires=">=3.6" # Keep supporting 3.6 and above anyway.

@xiaziyna
Copy link
Owner

xiaziyna commented May 2, 2025

Thanks for taking the time to be thorough in your explanations and providing expertise!
2. True. I have added an output directory set to the users home: '~/PyStarshade' (set in the pystarshade/config.py file and instantiated when called for the first time). I maintain an internal data directory (for the files used by pystarshade, starshade masks, optical fields etc.). Is this ok?

3.4. Your explanation makes a lot of sense and I have updated the setup to include the dependencies. I have also replaced all the random Python versions with Python 3.

@ivalaginja
Copy link
Author

ivalaginja commented May 4, 2025

Looks good! With the updated dependencies, there would just need to be an update to that section in the readme:
https://github.com/xiaziyna/PyStarshade/tree/main?tab=readme-ov-file#dependencies

And I just realized that your installation instructions (https://pystarshade.readthedocs.io/en/latest/#installation) should also give an indication of the dependencies.
Edit: Well I guess the installation installs them automatically, so no need for that here.

Could I ask you do fix these two one (minor) things please? You're almost there!

@ivalaginja
Copy link
Author

And about the output details in comment/question 2.: This looks like a great solution! If I run into any issues while testing this, I can easily open a new issue so I will consider this done here!

@xiaziyna
Copy link
Owner

xiaziyna commented May 5, 2025

Hi! I have updated the README to list the updated dependencies.

@ivalaginja
Copy link
Author

Thanks a lot for addressing all of my requests! This issue can be closed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants