-
Notifications
You must be signed in to change notification settings - Fork 39
tickets/SP-2082: Support getting obsloctac visits from the pre-night sim archive #452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment and question that is optional, but the error message should be modified (I gave a suggestion) so that people don't think their misspelling means they should try installing lsst.resources.
rubin_sim/maf/utils/opsim_utils.py
Outdated
col_str = "" | ||
for colname in dbcols: | ||
col_str += colname + ", " | ||
col_str = col_str[0:-2] + " " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have recently found that
", ".join([colname for colname in dbcols])
should work for this in a one-liner.
Worth testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just moving what was there, without thinking about it too hard.
What's the list comprehension for? Why not just:
", ".join(dbcols)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh true .. dbcols is already a list here. (was remembering this from a different situation, where the columns were coming back from a query to the consdb schema, and so was not a list).
except ModuleNotFoundError: | ||
raise RuntimeError( | ||
f"Cannot read visits from {db_con}." | ||
"Maybe it does not exist, or maybe you need to install lsst.resources." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this error message may need some tweaking.
By far the most common error people are going to make is to try to use a mis-spelled version of the filename, or try to access a database that isn't where they think it is.
Most of these people (who include not-us) will not be trying to load a file from an S3 bucket, but may not be savvy enough to realize that lsst.resources is irrelevant to what they need.
The error message also slightly conflates two different issues - having lsst.resources versus the file existing anywhere (since it could still be a bad or inaccessible uri even with lsst.resources)
Can we tell if the uri is a local file vs. a uri that should be read with lsst.resources? And then give an error message that is as simple as "FileNotFoundError".
If not, then we could indicate why we were trying to use lsst.resources ..
try:
from lsst.resources import ResourcePath
except ModuleNotFoundError:
raise RuntimeError(f"Could not find {db_con} as a local file; please install lsst.resources in order to read remote file")
# then try to read the remote file
with ResourcePath(db_con).as_local() as local_db_path:
with closing(sqlite3.connect(local_db_path.ospath)) as con:
sim_data = pd.read_sql(query, con).to_records(index=False)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error messages lists the two issues as possibilities because it can arise from either cause.
Can we tell if the uri is a local file vs. a uri that should be read with lsst.resources?
I started to write code to do this, and concluded that it was more effort than it was worth, thinking that the user would know whether the file they are trying to read was local so diagnostics to be more specific wouldn't be particularly useful.
A test that is usually right shouldn't be that hard, though: if db_con
is a string that does not start with a URI "scheme" designation, its probably intended to be a file name. But, in principle this can fail, because linux files names can have colons in them, so really any URI can be a path to a local file. Naming a file or directory http:
or s3:
is a foolish and confusing thing to do, but it can happen. It can also fail if support is added for new schemes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, ok - I'm ok if we can't decide whether it's local or not ourselves, but the thing to do is to make it clear that we're using lsst.resources for reading remote files only (rather than someone who didn't know what lsst.resources is thinking that they needed to install it to read their misspelled database filename). Because the general users for rubin_sim may very well not know what lsst.resources is or how to install it.
|
||
sim_metadata = read_archived_sim_metadata( | ||
archive_uri, num_nights=max_simulation_age, compilation_resource=compilation_uri | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is sim_metadata a pandas dataframe? would it make sense to make it one, so that the section below becomes a query of the dataframe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sim_metadata is a list of dicts. It can be trivially turned into a dataframe, but doing that doesn't really simplify things here. Some of the elements we are concerned with are lists or dicts. Yes, a cell in a pandas DataFrame can be a dict or a list, but dealing with them is awkward, so using a DataFrame ends up making the code harder to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reworked it so it only tries using ResourcePath if db_con is either not a str, or it can be parsed with pythons URL parser to get a valid scheme. So, it will only mention ResourcePath if it actually needs it.
8939bb2
to
8ace21f
Compare
I've reworked the start and end times of the obsloctap function to match what @rhiannonlynne proposed to Will: