Read dropped study folder on the submission interface (proof of concept) #1442

Draft
wants to merge 6 commits into
base: main
from

Conversation

hepcat72
Collaborator

@hepcat72 hepcat72 commented Feb 1, 2025

Summary Change Description

Proof of concept, to show that it's possible to drop and read a study folder and associate peak annotation files with sequence and scan directories.

@mneinast - This code is live for your testing on http://tracebase-dev.princeton.edu

Testing Notes

For usage, please see the screen recording I made.

slack
This POC code, however, is still useful for debugging. The logic it uses is the same logic that can be applied to the list of files from the directory reading, so it would be nice to test it out. Testing requires the ability to drag and drop from the MS data share. The focus should be on dropping study folders that contain mzXML files and confirming that the code identifies the sequence directories correctly and populates the dropdowns with those sequence and scan directories when you drop peak annotation files...

Using the "Debug" show button isn't necessary, but it would be good to make sure that the recorded data is correctly updated.

slack
It would also be good to test this on a Windows machine and in different browsers. The code uses ES6 (ECMAScript 2015) features, which are available in modern browsers. It might not work in older browsers, but I hope it degrades gracefully there, for example by just omitting the relative file paths. In that case the select lists would never populate, which could be mitigated by using text fields with autocomplete...
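
One way to make that fallback explicit is feature detection before any folder reading is attempted. The sketch below is an assumption, not the POC's actual code: the names `supportsWebkitGetAsEntry` and `supportsFileSystemAccessAPI` appear in the commit notes, but their bodies, the `scope` parameter, and `chooseFolderReadStrategy` are hypothetical.

```javascript
// Hypothetical sketch: the POC's real implementations are not shown in this PR.
// `scope` defaults to the global object so the checks can be exercised outside a browser.

// True when dropped items expose the (non-standard but widely shipped) entries API.
function supportsWebkitGetAsEntry(scope = globalThis) {
  const itemCtor = scope.DataTransferItem;
  return (
    typeof itemCtor === "function" &&
    typeof itemCtor.prototype?.webkitGetAsEntry === "function"
  );
}

// True when dropped items expose the File System Access API handle getter.
function supportsFileSystemAccessAPI(scope = globalThis) {
  const itemCtor = scope.DataTransferItem;
  return (
    typeof itemCtor === "function" &&
    typeof itemCtor.prototype?.getAsFileSystemHandle === "function"
  );
}

// Hypothetical helper: pick how the form should collect directory information.
function chooseFolderReadStrategy(scope = globalThis) {
  if (supportsFileSystemAccessAPI(scope)) return "file-system-access";
  if (supportsWebkitGetAsEntry(scope)) return "webkit-entries";
  // Neither API is available: fall back to text fields (e.g. with autocomplete).
  return "manual-text-entry";
}
```

With a check like this, an older browser would be routed to the text-field fallback up front instead of failing partway through a drop.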

Development Notes

  • The feature this is a POC for will not be implemented before the end-of-February publication deadline. I will proceed directly from creating this PR to resuming the extraction of legacy code. This POC took over a week because JavaScript is a bear.
  • No linting has been performed
  • No tests have been run
  • Debug code is present
  • Views and loaders have not been updated
  • It's possible to use this code with current study doc, but ideally, a new MS Files sheet will be added and schema changed
  • After implementation, I realized the possibility of 2 improvements:
    • Formsets could be eliminated by storing the sequence metadata in a hidden json field
    • Javascript performance can be improved by looping on the result of the directory entry reading instead of on file objects
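
The first improvement (replacing formsets with a hidden JSON field) could look roughly like the sketch below. Everything here is an assumption for illustration: `sequenceDirsHash` is named in the commit notes, but its structure is not shown, and the `operator` and `instrument` keys and the `serializeSequenceMetadata` helper are hypothetical.

```javascript
// Hypothetical sketch: serialize per-sequence-directory metadata into one JSON
// string that a single hidden form field could carry to the server.
// Assumes `sequenceDirsHash` maps a sequence directory path to a plain object
// of metadata values (the real structure is not shown in this PR).
function serializeSequenceMetadata(sequenceDirsHash) {
  const payload = {};
  // Sort the directory names so the output is stable between submissions.
  for (const dir of Object.keys(sequenceDirsHash).sort()) {
    payload[dir] = { ...sequenceDirsHash[dir] };
  }
  return JSON.stringify(payload);
}

// In the browser, usage might be (element ID is hypothetical):
//   document.getElementById("sequence_metadata_id").value =
//     serializeSequenceMetadata(sequenceDirsHash);
```

On the Django side, a single JSONField could then take the place of one form per sequence, at the cost of doing per-field validation by hand.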

Affected Issues/Pull Requests

Reviewer Notes/Requests

See comments in-line.

Checklist

This pull request will be merged once the following requirements are met. The
author and/or reviewers should uncheck any unmet requirements:

…anges.

Files checked in: DataRepo/templates/submission/includes/1_start.html DataRepo/templates/submission/submission.html static/css/drop_area.css static/js/submission_start.js
[skip ci]
… study folder and add mzXML files to the new MS Files tab and the mzXML File Name of the Peak Annotation Details sheet.

Details:

- Changes to submission_start.js
  - New functions:
    - getMzxmlsInDirectory
    - refreshMzxmlDisplayList
    - getParentMzxmlDirectories
    - getAllMzxmlDirectories
    - getAllMzxmlFilePaths
    - getMzxmlFileNamesString
    - refreshMzxmlMetadata
    - initMzxmlMetadataUploads
  - Renamed afterAddingFiles to afterAddingPeakAnnotFiles
  - Deleted commented code from clearPeakAnnotFiles
  - Renamed global variables to distinguish between ones associated with peak annotation files versus mzXML files
  - Added a DataTransfer argument to afterAddingPeakAnnotFiles to make it compatible with alternate methods called by handleFiles
  - Changed fileColumn ID to UnHideMe because there are multiple columns that must be hidden and unhidden for the peak annotation file metadata inputs (sequence directory and scan subdirectory)
  - Changed other IDs and added global variables for the mzXML-related variables
- Changes to file_list_drop_area.js
  - Make fileFunc optional, since there's nothing to do per file in the study folder addition.
  - Dropped files are now passed to postDropFunc
- Added style elements for mzXML-related elements to drop_area.css
- Changes to submission.py
  - Changed the metadata coming into BuildSubmissionView
    - Commented out code relating to self.annot_file_metadata
    - Added variables for new incoming metadata:
      - self.peak_annot_to_mzxml_metadata
      - self.sequence_metadata
      - self.mzxml_file_to_seq_dir
    - Added some initialization of the new metadata that is expected to come in
- Added exception classes to exceptions.py:
  - MzxmlsWithoutSequences
  - SequencesWithoutMzxmls
- Added calls to new initialization javascript methods (for elements related to mzXMLs) in submission.html
- Changes to 1_start.html
  - Added a study folder drop area and file upload button
  - Moved the sequence metadata fields to be next to the study folder drop area
  - Added new form fields
- Changes to forms.py (BuildSubmissionForm)
  - Changed peak_annotation_file to peak_annotation_files and made it a multiple file input.
  - Added peak_annot_to_mzxml_metadata as a JSONField
  - Added mzxml_file_list as a JSONField with an array decoder
  - Added sequence_dir as a CharField.  This will be associated with the sequence metadata
  - The peak_annot_to_mzxml_metadata will hold a sequence directory and scan subdirectory associated with each peak annotation file
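
As an illustration of what a helper such as getParentMzxmlDirectories might do, here is a minimal sketch. It assumes the input is a list of relative file paths with `/` separators and that a directory qualifies when it directly contains at least one file ending in `.mzXML` (case-insensitive). The actual function body is not shown in this PR, so the signature and behavior are guesses.

```javascript
// Hypothetical sketch: collect the unique parent directories of mzXML files
// from a list of relative paths such as "study/seq1/pos/sample1.mzXML".
function getParentMzxmlDirectories(relativePaths) {
  const dirs = new Set();
  for (const path of relativePaths) {
    // Skip anything that is not an mzXML file (extension match is case-insensitive).
    if (!/\.mzxml$/i.test(path)) continue;
    const lastSlash = path.lastIndexOf("/");
    // A file with no directory component is recorded under the empty string.
    dirs.add(lastSlash === -1 ? "" : path.slice(0, lastSlash));
  }
  return [...dirs].sort();
}
```

For example, the paths `study/seq1/pos/a.mzXML`, `study/seq1/neg/b.mzXML`, and `study/seq1/accucor.xlsx` would yield `study/seq1/neg` and `study/seq1/pos`.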

Files checked in: DataRepo/forms.py DataRepo/templates/submission/includes/1_start.html DataRepo/templates/submission/submission.html DataRepo/utils/exceptions.py DataRepo/views/upload/submission.py static/css/drop_area.css static/js/file_list_drop_area.js static/js/submission_start.js
[skip ci]
Adding peak annotation files now updates a "hidden" multiple files input element (that's not currently hidden for debugging purposes) and adds a row with the peak annotation file name and 2 (currently empty) select lists for the default sequence dir and the default scan dir.

This work involved changing the drop area globals into a "dict" so that multiple drop areas could do different things.
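
A rough sketch of the "dict of drop areas" idea follows. The names `fileFunc` and `postDropFunc` come from the commit notes (with `fileFunc` optional and the dropped files passed to `postDropFunc`), but the registry shape and the helpers `registerDropArea` and `handleFilesForArea` are hypothetical, not the code in file_list_drop_area.js.

```javascript
// Hypothetical sketch: per-drop-area state keyed by an area ID, replacing
// module-level globals so that several drop areas can behave differently.
const dropAreas = {};

function registerDropArea(id, { fileFunc = null, postDropFunc = null } = {}) {
  dropAreas[id] = { fileFunc, postDropFunc, files: [] };
}

function handleFilesForArea(id, files) {
  const area = dropAreas[id];
  if (!area) throw new Error(`Unknown drop area: ${id}`);
  for (const file of files) {
    area.files.push(file);
    // fileFunc is optional: a study folder drop has nothing to do per file.
    if (area.fileFunc) area.fileFunc(file);
  }
  // The dropped files are handed to postDropFunc once the batch is recorded.
  if (area.postDropFunc) area.postDropFunc(files);
}
```

Keying the state by ID keeps the peak annotation drop area and the study folder drop area from overwriting each other's file lists.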

I also added an alert about skipped files with duplicate names.
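
The duplicate-name check could be sketched as below. This assumes that files are compared by `name` only and that the first file with a given name wins. The helper names are hypothetical, and the real alert wording is not shown in this PR.

```javascript
// Hypothetical sketch: split incoming files into those that are accepted and
// those skipped because a file with the same name was already added.
function skipDuplicateNames(existingNames, incomingFiles) {
  const seen = new Set(existingNames);
  const accepted = [];
  const skipped = [];
  for (const file of incomingFiles) {
    if (seen.has(file.name)) {
      skipped.push(file.name);
    } else {
      seen.add(file.name);
      accepted.push(file);
    }
  }
  return { accepted, skipped };
}

// Build the alert text, or null when nothing was skipped.
function duplicateAlertMessage(skipped) {
  if (skipped.length === 0) return null;
  return `Skipped ${skipped.length} file(s) with duplicate names: ${skipped.join(", ")}`;
}

// In the browser, usage might be:
//   const { accepted, skipped } = skipDuplicateNames(currentNames, droppedFiles);
//   const message = duplicateAlertMessage(skipped);
//   if (message) alert(message);
```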

Files checked in: DataRepo/forms.py DataRepo/templates/submission/includes/1_start.html DataRepo/templates/submission/submission.html DataRepo/views/upload/submission.py static/js/file_list_drop_area.js static/js/submission_start.js
[skip ci]
…om a dropped folder.

This doesn't appear to be the same as the webkitRelativePath, so I'm not sure this will work with the file input element which uses that relative path...

There's also an issue with promises.  I'm not getting the paths back, but it took a while to get to this point where SOMETHING works, so I am checking this in.
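
For reference, a common pattern for getting paths back from a dropped folder is to wrap the callback-based `readEntries` in a promise and await it until it returns an empty batch. The sketch below is not the code in this PR. It reuses the names readEntriesPromise and readAllDirectoryEntries from the later commit notes, while `getAllFilePaths` and the leading-slash stripping are assumptions. The stripping reflects that an entry's `fullPath` starts with `/`, whereas `webkitRelativePath` does not.

```javascript
// Wrap the callback-style FileSystemDirectoryReader.readEntries in a promise.
function readEntriesPromise(directoryReader) {
  return new Promise((resolve, reject) => {
    directoryReader.readEntries(resolve, reject);
  });
}

// readEntries returns results in batches; keep calling until a batch is empty.
async function readAllDirectoryEntries(directoryReader) {
  const entries = [];
  let batch = await readEntriesPromise(directoryReader);
  while (batch.length > 0) {
    entries.push(...batch);
    batch = await readEntriesPromise(directoryReader);
  }
  return entries;
}

// Hypothetical helper: walk the dropped entries breadth-first and collect
// relative file paths without ever creating File objects.
async function getAllFilePaths(rootEntries) {
  const paths = [];
  const queue = [...rootEntries];
  while (queue.length > 0) {
    const entry = queue.shift();
    if (entry.isFile) {
      // Strip the leading "/" so the path resembles webkitRelativePath.
      paths.push(entry.fullPath.replace(/^\//, ""));
    } else if (entry.isDirectory) {
      queue.push(...(await readAllDirectoryEntries(entry.createReader())));
    }
  }
  return paths;
}
```

In a drop handler, the root entries would typically come from calling `webkitGetAsEntry()` on each item in `event.dataTransfer.items`. Those calls need to be made synchronously inside the handler, before the first `await`, because the item list is emptied once the handler returns.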

Files checked in: DataRepo/forms.py DataRepo/templates/submission/includes/1_start.html static/js/file_list_drop_area.js static/js/submission_start.js
[skip ci]
… the choose folder button and dropped folders and started work on the defaultSequenceDir and defaultScanDir select lists.

Details:

- added global variables to save extra data such as the name of the study directory and the paths of the possible peak annotation files.  Also saved possible study docs in the root directory.
- Renamed getAllMzxmlFilePaths to getStudyMetadata and returned a list of variables with all of the parsed data.
- Renamed clearPeakAnnotFiles to clearSubmissionForm and consolidated its functions (removing calls from the html template).
- renamed refreshMzxmlMetadata to refreshStudyMetadata
- Added method updateAnnotSelectLists
- added method arraysEqual
- Added method getFilePath
- Renamed disablePeakAnnotForm to disableSubmissionForm
- Renamed enablePeakAnnotForm to enableSubmissionForm
- Renamed initMzxmlMetadataUploads to initStudyMetadataUploads
- Added global variables:
  - possibleAnnotPaths
  - peakAnnotHash
  - sequenceDirsHash
- Added methods to the drop area code:
  - setResults
  - readEntryContentAsync
  - readEntriesPromise
  - readAllDirectoryEntries
  - getAllFileEntries
  - onDropItems
  - supportsWebkitGetAsEntry
  - supportsFileSystemAccessAPI
- Added CSS for ID: sequence_dir_input_id
- Temporarily changed form field widgets from hidden to another type for debugging

Files checked in: DataRepo/forms.py DataRepo/templates/submission/includes/1_start.html DataRepo/templates/submission/includes/3_validate.html DataRepo/templates/submission/submission.html static/css/drop_area.css static/js/file_list_drop_area.js static/js/submission_start.js
[skip ci]
…scan directories.

Details:

- I added code to gray out and present a popup about folder processing, but it doesn't work as intended.  That should either be ripped out or made to work.  The intention is that it takes some time to process the files and it's possible that if the user does something during that processing, it will silently fail.
- I should update the code to look at directory entries instead of looping through actual files, because that's what takes time.
- Added the ability to account for multiple peak annotation files with the same name in the folder, and to only auto-select a sequence directory when the peak annotation file is located inside one.
- Added a non-selection for each select list that tells the user to select a sequence/scan folder.
- Added change event listeners to update all fields.
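
The select-list behavior described above can be sketched as two small pure functions. This assumes that auto-selection applies when a peak annotation file's path lies under one of the known sequence directories. The function names, the option object shape, and the placeholder text are all hypothetical.

```javascript
// Hypothetical sketch: find the sequence directory (if any) that contains the
// given peak annotation file path. The longest match wins when directories nest.
function findEnclosingDir(annotPath, sequenceDirs) {
  let best = null;
  for (const dir of sequenceDirs) {
    if (annotPath.startsWith(dir + "/") && (best === null || dir.length > best.length)) {
      best = dir;
    }
  }
  return best;
}

// Build option descriptors for a select list. The first option is the
// "non-selection" prompt, selected unless a directory can be auto-selected.
function buildDirSelectOptions(dirs, placeholder, autoSelectDir = null) {
  const hasAuto = autoSelectDir !== null && dirs.includes(autoSelectDir);
  const options = [{ value: "", label: placeholder, selected: !hasAuto }];
  for (const dir of dirs) {
    options.push({ value: dir, label: dir, selected: hasAuto && dir === autoSelectDir });
  }
  return options;
}
```

A file at `study/seq1/accucor.xlsx` would auto-select `study/seq1`, while a file at `study/accucor.xlsx` would leave the prompt selected so the user has to choose.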

Files checked in: DataRepo/forms.py DataRepo/templates/submission/includes/1_start.html DataRepo/templates/submission/submission.html static/css/drop_area.css static/js/file_list_drop_area.js static/js/submission_start.js
[skip ci]
@hepcat72 hepcat72 added the following labels Feb 1, 2025: type:feature (New feature or request), priority:2-high (Must be implemented in order to meet release goals), status:wait:WIP (Do not review. Issue/PR content is in flux.), component:submission (Submission process page and logic), effort:2-over1wk (This is a complex effort involving about a week or more of focussed effort)
@hepcat72 hepcat72 self-assigned this Feb 1, 2025
@hepcat72 hepcat72 changed the title Read dropped folder poc Read dropped study folder on the submission interface (proof of concept) Feb 1, 2025
@hepcat72 hepcat72 marked this pull request as draft February 1, 2025 16:36