Skip to content

FIREFLY-148: Refactor file analysis to make it more general #834

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 18, 2019

Conversation

loitly
Copy link
Contributor

@loitly loitly commented Jul 16, 2019

https://jira.ipac.caltech.edu/browse/FIREFLY-148

Changes made:
Server-side

  • refactor AnyFileUpload.
    • much easier to read
    • dramatically improve table upload due to bad logic from previous implementation
    • added ability to ask for different types of analysis report (Brief, Normal, Details)
  • added FileAnalysisCmd to get an analysis report from a file already on the server
  • define a model for the analysis report
    • every data product returns the same model, this includes fits, votable, ipactable, csv, and tsv.
  • Remove error-prone guessing of file format based on file extension
  • tests added and updated where applicable.

Client-side

  • refactor FileUploadViewPanel
    • convert it into a functional component
    • change logic so it will take the new analysis model
    • display is consistent for all data types
    • table is much more advanced with sorting and filtering
  • added display of Table Meta information into TablePanel
    • this will show meta, params, and links.
    • groups and/or data links can be added later.
  • Update FileUpload to handle new analysis model

New analysis model:
image

Tests:
https://irsawebdev9.ipac.caltech.edu/FIREFLY-148_refactor_file_analysis/firefly/?__action=layout.showDropDown&visible=true&view=FileUploadDropDownCmd

Try uploading a variety of files to see if the behavior is acceptable.
A good source of sample files are in firefly_test_data/FileUpload-samples

If you upload a table with meta, params, or links, you should see that appears in the Upload Details

Also, please spot check places you think may be affected by this change.

As always, gradle :firefly:test to run unit tests.

Server-side
- refactor AnyFileUpload.
    - much easier to read
    - dramatically improve table upload due to bad logic from previous implementation
    - added ability to ask for different types of analysis report (Brief, Normal, Details)
- added FileAnalysisCmd to get an analysis report from a file already on the server
- define a model for the analysis report
    - every data product must returns the same model, this includes fits, votable, ipactable, csv, and tsv.
- Remove error-prone guessing of file format based on file extension
- tests added and updated where applicable.

Client-side
- refactor FileUploadViewPanel
    - convert it into a functional component
    - change logic so it will take the new analysis model
    - display is consistent for all data types
    - table is much more advanced with sorting and filtering
- added display of Table Meta information into TablePanel
    - this will show meta, params, and links.
    - groups and/or data links can be added later.
- Update FileUpload to handle new analysis model
@loitly loitly requested review from robyww and cwang2016 July 16, 2019 19:30
@cwang2016
Copy link
Contributor

cwang2016 commented Jul 17, 2019

looks very nice.
Just saw some issues from the following samples,

  • FileUpload-samples/fits/multiimage/n8t801pxq_cal.fits => some empty rows are shown on key/value table.
  • FileUpload-samples/json gaia_result.json => when clicking search, not sure if the error message addresses the clear meaning (for the file with unknown type? )
  • FileUpload-samples/TSV/gaia_result.tsv, the right side table shows the spinning (waiting for results). => this may not always happen. Just try it upload this file after a json file is uploaded first.
    (it looks like this problem happens when the file is uploaded right after a 'json' file is uploaded).

@robyww
Copy link
Contributor

robyww commented Jul 17, 2019

UI issues @loitly and I talked about:

  • showing file format type
  • Size using the WebUtil.getSizeAsString
  • Type in 0 index of FITS being 'HeaderOnly'
  • for FITS parts showing as Extensions

Copy link
Contributor

@robyww robyww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good a will provide us better flexibility. Also it will be easier to add more analysis features. Most of my comments are small. @loitly and I already talked about the Ui changes that should happen (show above). After the changes are done I think it is ready to go.

@@ -137,6 +137,10 @@ export function getFieldVal(groupKey, fldName, defval=undefined) {
return get(getGroupFields(groupKey), [fldName, 'value'], defval);
}

export function getField(groupKey, fldName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you JSDoc this? Not sure what it is for.

const analysisResult = result.slice(4).join('::');
return {status, message, cacheKey, fileFormat, analysisResult};
let [status, message, cacheKey, analysisResult, ...rest] = text.split('::');
if (rest.length > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep the original so you don't have to put it back together?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:: is used as a separator token for status, message, etc. But in some cases, analysisResult may also contain the character ::. Usually it's in one of the comments.
JavaScript split(s, limit) is different from Java where it would discard the remaining string when limit is reached. Therefore, I had to split the whole string, and then put back what was incorrectly split.

const analysisResult = result.slice(4).join('::');
return {status, message, cacheKey, fileFormat, analysisResult};
let [status, message, cacheKey, analysisResult, ...rest] = text.split('::');
if (rest.length > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue, why not just keep the original so you don't have to put it back together?

return Format.JSON;
}
}

int readAhead = 10;
Copy link
Contributor

@robyww robyww Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean the guessFormat ignores the file extension and now just looks inside?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It's error-prone. We would assume any .xml file is a table and then get a failed error when loading.

private long fileSize;
private String filePath;
private String fileName;
private List<Part> parts;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about putting the file format type here as well

public Part(Type type, int index, String desc) {
this.type = type;
this.index = index;
this.desc = desc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there could be a detailed description here that we could start using in certain use cases.

private static final List<String> NAXIS_SET = Arrays.asList("naxis", "naxis1", "naxis2", "naxis3");

public static FileAnalysis.Report analyze(File infile, FileAnalysis.ReportType type) throws Exception {
FileAnalysis.Report report = new FileAnalysis.Report(type, infile.length(), infile.getPath());
Copy link
Contributor

@robyww robyww Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am glad you broke this out. It needs work but it is clear and easy to add stuff in another ticket. Looking at this I think stil think the Part needs a optional forth parameter.

@cwang2016
Copy link
Contributor

Please check the votable file, SloanDssImage-10.68479041.269060--0.1388889--u--queryKey.xml, from firefly_test_data/VOTable/tabledata. It is a votable with columns defined and no cell data.
The upload reports that it has 1 cols x 0 rows and 'double' type under no name.

- change ImageNoData to HeaderOnly
- replace Types with Format
- use sizeAsString to display file size
- change Parts to Extensions when file is FITS
- add more prominent format error
- details paney spiney not going away
- fix bad Sloan votable not showing the right number of columns
@loitly
Copy link
Contributor Author

loitly commented Jul 18, 2019

All requests are fixed. k8s test instance is updated.

@loitly loitly merged commit fc8bd0c into dev Jul 18, 2019
@loitly loitly deleted the FIREFLY-148_refactor_file_analysis branch July 18, 2019 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants