-
Notifications
You must be signed in to change notification settings - Fork 26
Description
Describe the Enhancement
The MetOffice often defines masking regions using the mask.sid option. That defines a list of station ID names to be included in the verification region. Those station name lists are stored in a StringArray object which in turn stores them as a vector of strings. For each point observation, the PairDataPoint class checks to see if the station names appears in the specified list. However, the StringArray::has() function searches the list linearly for a match which is O(n). This task is to optimize that search to be O(log(n)):
- Add "bool Sorted" as a member of the StringArray class (mimic logic of NumArray).
- Should be false by default and reset to false by all "add" fuctions.
- Add "StringArray::sort_array()" member function (naming consistent with existing NumArray::sort_array())... which sets Sorted = true after sorting. See StackOverflow for quick/easy sorting example:
std::sort( serverList.begin() , serverList.end() )
- Enhance 2 "StringArray::has()" member function variations to check if the StringArray has already been sorted, and, if so, search more efficiently. See StackOverflow for quick/easy searching example:
std::lower_bound(serverList.begin() , serverList.end() , valuetoFind) to find first matching
- Also look at NumArray::has() and TimeArray::has()/index() to see if similar enhancements could be made for sorted arrrays.
- In config_util.cc, update the parse_sid_mask() function so that just prior to returning, call StringArray::sort(). Storing the station ID list in a sorted order should always be a good idea.
- Lastly, devise a test before/after these changes to quantify how much time is saved using this strategy. Note that this does not need to be added as a new unit test... just to quantify the improvement this work provides.
Time Estimate
2 days (?)
Sub-Issues
Consider breaking the enhancement down into sub-issues.
No sub-issues needed.
Relevant Deadlines
List relevant project deadlines here or state NONE.
Funding Source
2799991
Define the Metadata
Assignee
- Select engineer(s) or no engineer required: @sethlinden
- Select scientist(s) or no scientist required: @robdarvell (would be good for advice on testing)
Labels
- Select component(s)
- Select priority
- Select requestor(s)
Projects and Milestone
- Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
- Select Milestone as the next official version or Future Versions
Define Related Issue(s)
Consider the impact to the other METplus components.
Enhancement Checklist
See the METplus Workflow for details.
- Complete the issue definition above, including the Time Estimate and Funding Source.
- Fork this repository or create a branch of develop.
Branch name:feature_<Issue Number>_<Description>
- Complete the development and test your changes.
- Add/update log messages for easier debugging.
- Add/update unit tests.
- Add/update documentation.
- Push local changes to GitHub.
- Submit a pull request to merge into develop.
Pull request:feature <Issue Number> <Description>
- Define the pull request metadata, as permissions allow.
Select: Reviewer(s) and Linked issues
Select: Repository level development cycle Project for the next official release
Select: Milestone as the next official version - Iterate until the reviewer(s) accept and merge your changes.
- Delete your fork or branch.
- Close this issue.