Skip to content

[WIP] Disk entry rework #67

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 30, 2020
Merged

[WIP] Disk entry rework #67

merged 14 commits into from
May 30, 2020

Conversation

ingrinder
Copy link
Collaborator

@ingrinder ingrinder commented May 5, 2020

Yet more rewriting code! 😨

Reason and / or context

This is a full rework of the Disk entry in an effort to implement #55 and fix #62. It may also potentially be faster as it relies on only a single call to df for input, and nothing else.
This also additionally fixes an (unreported) bug, where we don't actually calculate the correct space usage in Disk currently... oops!

How has this been tested ?

It seems to work fine so far... Some tests have been written for the new functionality, some are TBC. (This section to be completed properly before converting to a full PR 😁)

@HorlogeSkynet Could you let me know what you think of this approach? I'm up for changing it but basing all of our calculations around a single basic df call like this is what I came up with after some thinking on how to approach reworking this entry. TIA!

Behaviour changes

Since I'm not sure how df -l determines local filesystems, I've taken a guess at defining "local" filesystems as all /dev/xxx devices, excluding:
Loop devices:-

  • /dev(/...)/loop filesystems (Linux)
  • /dev(/...)/*vnd filesystems (BSD)
  • /dev(/...)/lofi filesystems (Solaris)

Device mappers:- (since their partitions are already included!)

  • /dev(/...)/dm filesystems (Linux)

This is highly likely to be a change in behaviour from our existing implementation (and may be erronous)! From what I can tell, this excludes all /dev/ devices that are not local, while including all local disks only once (since their device-paths are deduplicated). This logic also hinges on all local disks always having an entry in /dev/ on all Unix-like/POSIX OSes, which may possibly not be true.

Types of changes :

  • Bug fix (non-breaking change which fixes an issue)
  • Typo / style fix (non-breaking change which improves readability)
  • New feature (non-breaking change which adds functionality)
  • [SEE ABOVE] Breaking change (fix or feature that would cause existing functionality to change)

Checklist :

  • I have updated the README.md file accordingly ;
  • I have updated the test cases (which pass) accordingly ;
  • My changes looks good ;
  • I agree that my code may be modified in the future ;
  • My code follows the code style of this project (PEP8).

@ingrinder ingrinder added bug 🐛 A real glitch has been found enhancement ⬆️ Implements a new feature, fixes or improves existing ones question ❔ The OP is asking for something about this project ! labels May 5, 2020
@ingrinder ingrinder mentioned this pull request May 6, 2020
13 tasks
@HorlogeSkynet HorlogeSkynet added this to the v4.7.2 milestone May 10, 2020
@HorlogeSkynet HorlogeSkynet mentioned this pull request May 13, 2020
5 tasks
@HorlogeSkynet HorlogeSkynet modified the milestones: v4.7.2, v4.8.0 May 18, 2020
@ingrinder
Copy link
Collaborator Author

Since we merged #70, it's allowed a lot of simplification so I'm just going to (kind of) start afresh from master and force push over the commits already here - hope that's alright!

The entire entry has been rewritten, and now relies exclusively on a
single call to `df -P -k`.

* Adds support for multi-line disk output (fixes #55).
* Fixes #62 by using a standard `df` call.
* Adds configurability to disk labels, and which disks to show.
  * Modifies `config.json`, `configuration.py` and README accordingly.

More detail:

Our single call to `df` now saves the results in a table-like format,
so we can simply reuse this to extract our disks' mount-points, device
paths and used/total blocks.

This saves us on slow system calls, and the new structure allows for a
more suitable list of dicts (which each represent a filesystem) to be
stored in the `value` attribute of the entry. This in turn provides us
with a highly configurable entry!
@ingrinder ingrinder force-pushed the feature/multiple-disks branch from aac01b2 to d898133 Compare May 20, 2020 23:25
@HorlogeSkynet
Copy link
Owner

Yes that's perfect.
I'll try to take a look to this anytime soon (and a bunch of other stuffs).

Don't mind looking at/editing as needed the v4.8 milestone.

Bye 👋

+ Adds missing `,` in README configuration example
+ Sets `blocks_to_human_readable` as private method and improves/DRY its implementation
+ Simplifies SSV parsing implementation (without dialect definition/usage)
+ Renames some configuration attributes
+ Sets default attributes to `None` as `combine_total` is `True` by default
+ Improves documentation and code formatting
+ DRY `output`'s conditional branches

- TODO (BUG) : No entry _name_ with below disk configuration :
-{
-	"combine_disks": false,
-	"disk_labels": false,
-	"hide_entry_name": true`
-}
@HorlogeSkynet
Copy link
Owner

HorlogeSkynet commented May 26, 2020

I'm back !

So I've started looking at your changes, and if you can assure me that your solution is 100% working with BTRFS partitions too, we'll benefit a lot from it 😮 🚀

Two questions for you though :

  • So as to avoid one more iteration on self._df_table, couldn't we build a dict with mountpoint as keys (based on mounting-points uniqueness ?) instead of a list ? This way, we could directly retrieve them and compute their usage in O(1). Configuration ordering would be kept for Python >= 3.6.
  • If the user set "combine_total": false, "disk_labels": false, "hide_entry_name": true, entries without any name may be displayed. How do we handle that ?

I've pushed some "improvements" (from my PoV), feel free to override/revert/edit as needed 👌
We'll write test cases when this will be considered functionally-"defined" 😉

Bye Michael 👋

Basically rewrites everything in such a way that we now deal with dicts
whose keys are mount points, and values are dicts of other data for that
disk. Mount points are used as keys since they are globally unique.

This simplifies quite a lot of the internal logic of `Disk`, likely
resulting in faster execution (since it eradicates a few now-unneeded
iterations of `df`'s output).

Additionally:

+ Cleans up consistency in variable names and documentation, to use:
  * `mount_point` and "mount point"
  * `device_path` and "device path"
  * `filesystem` and "filesystem"

+ Modifies README to reflect usage of `null`.
@ingrinder
Copy link
Collaborator Author

...if you can assure me that your solution is 100% working with BTRFS partitions too...

I've tested it on three of my systems running BTRFS (one with a RAID-1 setup) and it seems to be working as expected on them. Since I also got the older core 2 duo machine running the other day, I'll try out some whacky configurations on that to test too 😄.

So as to avoid one more iteration on self._df_table, couldn't we build a dict with mountpoint as keys (based on mounting-points uniqueness ?) instead of a list ?

I've gone ahead and rewritten... a lot of the entry based on this, since I think it's a good idea 😅 - see 2017352. Let me know what you think of the changes I've made; it's removed at least two iterations of the df output (more for custom filesystems), I believe.

If the user set "combine_total": false, "disk_labels": false, "hide_entry_name": true, entries without any name may be displayed. How do we handle that ?

We could either let it happen, since I suppose the user did explicitly ask for it... Or, we could just add a check with hide_entry_name to only hide if labels are also enabled? I'll leave that up to you.

We'll write test cases when this will be considered functionally-"defined" 😉

Agreed! This kind of change is why I thought I'd wait until we discussed it 😄

Oh, also, about changing df -P -k to df -P -- this gives 512-byte blocks in the BSDs, so we should probably keep the -k. It's part of the Single Unix Specificatio, so hopefully that means we're okay using it everywhere? If there's somewhere it doesn't work, I guess we may have to resort to parsing the df header row to get the block size...

Anyway, see you for now!

Samuel FORESTIER added 3 commits May 27, 2020 10:01
+ Number of blocks is always a positive integer ; `abs` is irrelevant in our case
+ Optimizes list comprehensions when keys are not being used
+ Restores the `-k` flag passed to `df` call
```
{
"combine_total": false,
"hide_entry_name": true,
"disk_labels": false
}
```
@HorlogeSkynet
Copy link
Owner

I've tested it on three of my systems running BTRFS (one with a RAID-1 setup) and it seems to be working as expected on them. Since I also got the older core 2 duo machine running the other day, I'll try out some whacky configurations on that to test too smile.

Awesome !! I've just removed the optional dependency to btrfs-progs as we won't require it anymore.

I've gone ahead and rewritten... a lot of the entry based on this, since I think it's a good idea sweat_smile - see 2017352. Let me know what you think of the changes I've made; it's removed at least two iterations of the df output (more for custom filesystems), I believe.

Love this !

We could either let it happen, since I suppose the user did explicitly ask for it... Or, we could just add a check with hide_entry_name to only hide if labels are also enabled? I'll leave that up to you.

As it's very very ugly, I opted for one more conditional check to avoid it.

Agreed! This kind of change is why I thought I'd wait until we discussed it smile

If you're okay with latest changes, I'm all good to write/review/help you with them. I guess we will need many cases 😮

Oh, also, about changing df -P -k to df -P -- this gives 512-byte blocks in the BSDs, so we should probably keep the -k. It's part of the Single Unix Specificatio, so hopefully that means we're okay using it everywhere? If there's somewhere it doesn't work, I guess we may have to resort to parsing the df header row to get the block size...

Actually, your first patch did not use flags at all 🤣
Anyway, I've "restored" the -k... and pretty frustrated to learn that GNU & BSD implementations differ on the default block length too...

@ingrinder
Copy link
Collaborator Author

Sorry for the delay again, been a little busy recently!

If you're okay with latest changes, I'm all good to write/review/help you with them. I guess we will need many cases 😮

This implementation looks good to me - I'll get on with writing some tests now and then you can review them if you'd like?

Actually, your first patch did not use flags at all 🤣
Anyway, I've "restored" the -k... and pretty frustrated to learn that GNU & BSD implementations differ on the default block length too...

Ah... oops! I didn't even notice that 😆. Yeah, kinda annoying how it's not just a consistent block... but ah well.

By the way -- sorry for some of the changes I revert accidentally in commits (like line 48 in disk.py) - I end up doing stuff like rewriting comments, then changing my mind and rewriting them back without realising I removed your changes. Sorry about that!

Anyway, see you on the other side of test-writing 👍

@HorlogeSkynet
Copy link
Owner

No problem Michael ! I'm all good with the proposal. See you soon, and mostly, take your time/care 👌
Bye

ingrinder added 2 commits May 29, 2020 23:22
- remove all previous tests.
+ add framework for other disk tests.
@ingrinder
Copy link
Collaborator Author

So I've done the majority of the tests, I think - you can start looking over them if you'd like. Pretty sure there just needs to be another for the output entry labelling, then that's all of them done!

@HorlogeSkynet HorlogeSkynet marked this pull request as ready for review May 30, 2020 19:52
@HorlogeSkynet HorlogeSkynet merged commit 899042d into master May 30, 2020
@HorlogeSkynet HorlogeSkynet deleted the feature/multiple-disks branch May 30, 2020 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 A real glitch has been found enhancement ⬆️ Implements a new feature, fixes or improves existing ones question ❔ The OP is asking for something about this project !
Projects
Status: DONE
Development

Successfully merging this pull request may close these issues.

[BUG] [DISK] [BUSYBOX] Entry is not detected [FEATURE] Show multiple disks
2 participants