New function analyze.distribution #1473

kmurphy61 · 2024-03-07T03:06:46Z

Added new function analyze.distribution. Added documentation for analyze.distribution.

Describe your changes
This new function provides a spatial analysis of the pixel distribution of a mask along the X and Y dimensions. It will be most useful for minirhizotron root images.
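
As a rough illustration of the idea (not the PlantCV implementation), the sketch below counts foreground pixels per column and per row of a binary mask and sums them into fixed-width bins. The helper name `axis_histograms` and the toy mask are assumptions made for this example only.

```python
import numpy as np


def axis_histograms(mask, bin_size_x=100, bin_size_y=100):
    """Count foreground pixels in fixed-width bins along X and Y.

    Hypothetical helper for illustration; not the PlantCV implementation.
    """
    # Foreground pixels per column (X profile) and per row (Y profile)
    col_counts = np.count_nonzero(mask, axis=0)
    row_counts = np.count_nonzero(mask, axis=1)
    # Bin start positions; the last bin may be narrower than bin_size
    x_edges = np.arange(0, mask.shape[1], bin_size_x)
    y_edges = np.arange(0, mask.shape[0], bin_size_y)
    # Sum each profile between consecutive bin starts
    x_hist = np.add.reduceat(col_counts, x_edges)
    y_hist = np.add.reduceat(row_counts, y_edges)
    return x_hist, y_hist


# Toy 4x6 mask with a 2x3 block of foreground pixels
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1:3, 2:5] = 255
x_hist, y_hist = axis_histograms(mask, bin_size_x=2, bin_size_y=2)
print(x_hist.tolist(), y_hist.tolist())
```

Counting non-zero pixels rather than summing intensities keeps the result independent of whether the foreground is stored as 1 or 255.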

Type of update

New feature or feature enhancement
Update to documentation
Work in progress

Associated issues
Issue #1460

Additional context

For the reviewer
See this page for instructions on how to review the pull request.

PR functionality reviewed in a Jupyter Notebook
All tests pass
Test coverage remains 100%
Documentation tested
New documentation pages added to plantcv/mkdocs.yml
Changes to function input/output signatures added to updating.md
Code reviewed
PR approved

deepsource-io · 2024-03-07T03:07:59Z

Here's the code health analysis summary for commits baface0..ced92f0. View details on DeepSource ↗.

Analysis Summary

Analyzer	Status	Summary	Link
Python	✅ Success		View Check ↗
Test coverage	✅ Success		View Check ↗

Code Coverage Report

Metric	Aggregate	Python
Branch Coverage	100%	100%
Composite Coverage	94%	94%
Line Coverage	94%	94%
New Branch Coverage	100%	100%
New Composite Coverage	100%	100%
New Line Coverage	100%, ✅ Above Threshold	100%, ✅ Above Threshold

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

HaleySchuhl · 2024-03-08T16:25:29Z

Closer but not quite there still with this code:

"""Analyzes the X and Y spatial distribution of objects in an image."""
import os
import cv2
import numpy as np
from scipy import stats
from plantcv.plantcv import fatal_error
from plantcv.plantcv import params
from plantcv.plantcv._debug import _debug
from plantcv.plantcv import outputs
from plantcv.plantcv.visualize import histogram
from plantcv.plantcv._helpers import _iterate_analysis


def distribution(labeled_mask, img=None, n_labels=1, bin_size_x=100, bin_size_y=100, label=None):
    """A function that analyzes the X and Y distribution of objects and outputs data.

    Inputs:
    labeled_mask     = Labeled mask of objects (32-bit).
    img              = Optional image; if None, a binary image is built from labeled_mask.
    n_labels         = Total number of expected individual objects (default = 1).
    bin_size_x       = Width of each histogram bin in the X direction, in pixels (default = 100)
    bin_size_y       = Height of each histogram bin in the Y direction, in pixels (default = 100)
    label            = Optional label parameter, modifies the variable name of
                       observations recorded (default = pcv.params.sample_label).

    Returns:
    distribution_image   = histogram output

    :param labeled_mask: numpy.ndarray
    :param img: numpy.ndarray
    :param n_labels: int
    :param bin_size_x: int
    :param bin_size_y: int
    :param label: str
    :return distribution_images: list
    """
    # Set label to params.sample_label if None
    if label is None:
        label = params.sample_label
    # Build a binary image from the labeled mask if no image was provided
    if img is None:
        img = np.where(labeled_mask > 0, 255, 0).astype(np.uint8)

    _ = _iterate_analysis(img=img, labeled_mask=labeled_mask, n_labels=n_labels, label=label, function=_analyze_distribution,
                          **{"bin_size_x": bin_size_x, "bin_size_y": bin_size_y})
    gray_chart_x = outputs.plot_dists(variable="X_frequencies")
    gray_chart_y = outputs.plot_dists(variable="Y_frequencies")
    _debug(visual=gray_chart_x, filename=os.path.join(params.debug_outdir, str(params.device) + '_x_distribution_hist.png'))
    _debug(visual=gray_chart_y, filename=os.path.join(params.debug_outdir, str(params.device) + '_y_distribution_hist.png'))
    return gray_chart_x, gray_chart_y


def _analyze_distribution(img, mask, bin_size_x=100, bin_size_y=100, label=None):
    """Analyze the X and Y pixel distribution of an image object.

    Inputs:
    img              = Binary image of the objects
    mask             = Binary mask made from selected contours
    bin_size_x       = Width of each histogram bin in the X direction, in pixels
    bin_size_y       = Height of each histogram bin in the Y direction, in pixels
    label            = optional label parameter, modifies the variable name of observations recorded

    Returns:
    distribution_image   = histogram output

    :param img: numpy.ndarray
    :param mask: numpy.ndarray
    :param bin_size_x: int
    :param bin_size_y: int
    :param label: str
    :return distribution_images: list
    """

    # Save user debug setting
    debug = params.debug
    params.debug = None

    mask = img

    # Initialize output data
    # find the height and width, in pixels, for this image
    height, width = mask.shape[:2]
    num_bins_y = height // bin_size_y
    num_bins_x = width // bin_size_x

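    # Initialize output measurements
    Y_histogram = np.zeros(num_bins_y)
    X_histogram = np.zeros(num_bins_x)

    # Define the histogram axes here so they also exist when the mask is empty
    y_axis = np.arange(num_bins_y) * bin_size_y
    x_axis = np.arange(num_bins_x) * bin_size_x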

    # Undefined defaults
    X_distribution_mean = np.nan
    X_distribution_median = np.nan
    X_distribution_std = np.nan
    Y_distribution_mean = np.nan
    Y_distribution_median = np.nan
    Y_distribution_std = np.nan

    # Skip empty masks
    if np.count_nonzero(mask) != 0:

        # Calculate histogram
        params.debug = None
        for y in range(0, height, bin_size_y):
            y_slice = mask[y:min(y+bin_size_y, height), :]
            white_pixels_y = np.sum(y_slice == 255)  # Count white pixels
            bin_index_y = min(y // bin_size_y, num_bins_y - 1)  # Ensure index within range
            # Accumulate so a partial final slice adds to the last bin instead of overwriting it
            Y_histogram[bin_index_y] += white_pixels_y

        for x in range(0, width, bin_size_x):
            x_slice = mask[:, x:min(x+bin_size_x, width)]
            white_pixels_x = np.sum(x_slice == 255)  # Count white pixels
            bin_index_x = min(x // bin_size_x, num_bins_x - 1)  # Ensure index within range
            # Accumulate so a partial final slice adds to the last bin instead of overwriting it
            X_histogram[bin_index_x] += white_pixels_x

        # Restore user debug setting
        params.debug = debug

        # Determine the axes of the histograms
        y_axis = np.arange(len(Y_histogram)) * bin_size_y
        x_axis = np.arange(len(X_histogram)) * bin_size_x

        # Calculate the median X and Y value distribution
        X_distribution_median = np.median(x_axis)
        Y_distribution_median = np.median(y_axis)

        # Calculate the mean and standard deviation X and Y  value distribution
        X_distribution_mean = np.sum(X_histogram * x_axis) / np.sum(X_histogram)
        X_distribution_std = np.std(x_axis)
        Y_distribution_mean = np.sum(Y_histogram * y_axis) / np.sum(Y_histogram)
        Y_distribution_std = np.std(y_axis)

    # Convert numpy arrays to lists before adding to outputs
    X_histogram_list = X_histogram.tolist()
    Y_histogram_list = Y_histogram.tolist()
    
    # Save histograms
    outputs.add_observation(sample=label, variable='X_frequencies', trait='X frequencies',
                            method='plantcv.plantcv.analyze.distribution', scale='frequency', datatype=list,
                            value=X_histogram_list, label=x_axis.tolist())
    outputs.add_observation(sample=label, variable='Y_frequencies', trait='Y frequencies',
                            method='plantcv.plantcv.analyze.distribution', scale='frequency', datatype=list,
                            value=Y_histogram_list, label=y_axis.tolist())

    # Save average measurements
    outputs.add_observation(sample=label, variable='X_distribution_mean', trait='X distribution mean',
                            method='plantcv.plantcv.analyze.distribution', scale='pixels', datatype=float,
                            value=X_distribution_mean, label='pixel')
    outputs.add_observation(sample=label, variable='X_distribution_median', trait='X distribution median',
                            method='plantcv.plantcv.analyze.distribution', scale='pixels', datatype=float,
                            value=X_distribution_median, label='pixel')
    outputs.add_observation(sample=label, variable='X_distribution_std', trait='X distribution standard deviation',
                            method='plantcv.plantcv.analyze.distribution', scale='pixels', datatype=float,
                            value=X_distribution_std, label='pixel')
    outputs.add_observation(sample=label, variable='Y_distribution_mean', trait='Y distribution mean',
                            method='plantcv.plantcv.analyze.distribution', scale='pixels', datatype=float,
                            value=Y_distribution_mean, label='pixel')
    outputs.add_observation(sample=label, variable='Y_distribution_median', trait='Y distribution median',
                            method='plantcv.plantcv.analyze.distribution', scale='pixels', datatype=float,
                            value=Y_distribution_median, label='pixel')
    outputs.add_observation(sample=label, variable='Y_distribution_std', trait='Y distribution standard deviation',
                            method='plantcv.plantcv.analyze.distribution', scale='pixels', datatype=float,
                            value=Y_distribution_std, label='pixel')

    return mask
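
In the listing above, the median and standard deviation are taken over `x_axis` and `y_axis`, so they depend only on the bin positions and not on how many pixels fall in each bin. A minimal sketch of one way to weight all three statistics by the histogram counts is shown below; the helper name `weighted_histogram_stats` and the example values are assumptions for illustration, not the code that was merged.

```python
import numpy as np


def weighted_histogram_stats(counts, positions):
    """Return the count-weighted mean, median, and standard deviation.

    Hypothetical helper for illustration; not the PlantCV implementation.
    """
    counts = np.asarray(counts, dtype=float)
    positions = np.asarray(positions, dtype=float)
    total = counts.sum()
    if total == 0:
        # An empty mask has no defined distribution
        return np.nan, np.nan, np.nan
    mean = np.sum(counts * positions) / total
    # Weighted median: first bin where the cumulative count reaches half the total
    median = positions[np.searchsorted(np.cumsum(counts), total / 2)]
    # Population standard deviation of positions, weighted by counts
    std = np.sqrt(np.sum(counts * (positions - mean) ** 2) / total)
    return mean, median, std


# Four bins of width 100 with most pixels in the third bin
counts = [0, 1, 3, 0]
positions = [0, 100, 200, 300]
mean, median, std = weighted_histogram_stats(counts, positions)
print(mean, median, std)
```

For these example values `np.median(positions)` is 150 regardless of the counts, while the weighted median moves to the bin that holds most of the pixels.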

k034b363

Tested with labeled masks from the bean tutorial and with unlabeled masks from minirhizotron images; both worked. I used it to collect mean depth data from a set of minirhizotron images, and it's very fast and flexible. Clarified a few minor things in the docs and docstrings, but it looks good to me!

kmurphy61 added 2 commits March 6, 2024 21:02

Added new function analyze.distribution

3406928

Added new function analyze.distribution. Added documentation for analyze.distribution.

Merge branch 'main' into detection-issue1460

1321824

Update distribution.py

8ab857d

Updates to distribution.py to fix errors in numpy to python list for the histogram axes.

kmurphy61 and others added 3 commits March 8, 2024 10:53

Update distribution.py

9eac8d2

Fixed bugs to allow plotting, labels plots by X or Y distribution

update distribution function to plot

db38945

Merge branch 'detection-issue1460' of https://github.com/danforthcent…

3b1607c

…er/plantcv into detection-issue1460

HaleySchuhl added the new feature and work in progress labels Mar 8, 2024

HaleySchuhl added this to the PlantCV v4.3 milestone Mar 8, 2024

nfahlgren added 19 commits March 10, 2024 17:18

Add distribution to subpackage imports

9448045

Default n_labels to 1

962097a

Remove unused import

86338cd

Add space between arguments

df1d79c

Remove trailing whitespace

93b3aeb

Remove blank line after docstring

279bf30

Remove debug on/off as no plantcv functions are used

7cd8b5c

Python convention uses lowercase variable names

17a4d86

Count non-zero to avoid 8-bit math

28dee95

Use stored values instead of recalculating

a0a1110

Delete img as it is only passed for compatibility

3ad05d0

Add test for analyze distribution

0ef68ce

Remove unused import

12adebd

Remove delete and divide by 1

593e78e

Do valid math

d882dbe

Add distribution to TOC

3a45e4c

Add distribution to the changelog

8a1ba29

Update function parameters

cdd68e9

Merge branch 'main' into detection-issue1460

e810854

HaleySchuhl and others added 11 commits March 12, 2024 12:58

Merge branch 'main' into detection-issue1460

2f91930

Merge branch 'detection-issue1460' of https://github.com/danforthcent…

3ed9e54

…er/plantcv into detection-issue1460

Refactor to do x and y separately

4898bcc

Update parameters in the docs

df8ebcb

Add device incrementer

fb7d32f

Temporarily turn debug off

7f7943b

Fix minor spacing issues

5c4db36

Remove second debug image that causes issue

3ebad23

The debug image of the bounding rectangle was not needed and caused an issue when used in analyze.distribution

Use auto_crop to implement relative scale histograms

28c3dcd

Add images and source to distribution docs

7cbd61c

Fix number of outputs in changelog

9325644

nfahlgren added the ready to review label and removed the work in progress label Mar 13, 2024

Rename var to avoid built-in name

296911b

nfahlgren requested a review from k034b363 March 13, 2024 02:10

k034b363 added 3 commits March 13, 2024 16:23

updated docstring bin size description

8c7a90f

match stdev output to other stats

99d9fd9

Added binary image note in docs

5979ab2

k034b363 approved these changes Mar 15, 2024

fix trailing whitespace

ced92f0

nfahlgren merged commit cc1e872 into main Mar 15, 2024

nfahlgren deleted the detection-issue1460 branch March 15, 2024 15:42
