pyprep.NoisyChannels
- class pyprep.NoisyChannels(raw, do_detrend=True, random_state=None, matlab_strict=False, *, ransac=True, correlation=True, bad_by_manual=None, reject_by_annotation=None)
Bases: object
Detect bad channels in an EEG recording using a range of methods.
This class provides a number of methods for detecting bad channels across a full-session EEG recording. Specifically, it implements all of the noisy channel detection methods used in the PREP pipeline, as described in [1]. The detection methods can be run independently, or all at once using the find_all_bads() method.
At present, only EEG channels are supported, and any non-EEG channels in the provided data will be ignored.
- Parameters:
- raw : mne.io.Raw
An MNE Raw object to check for bad EEG channels. Channels set to bad in raw.info["bads"] will not be used to find additional bad channels.
- do_detrend : bool
Whether or not low-frequency (<1.0 Hz) trends should be removed from the EEG signal prior to bad channel detection. This should always be set to True unless the signal has already had low-frequency trends removed. Defaults to True.
- random_state : {int, None, np.random.RandomState} | None
The seed to use for random number generation within RANSAC. This can be None, an integer, or a RandomState object. If None, a random seed will be obtained from the operating system. Defaults to None.
- matlab_strict : bool
Whether or not PyPREP should strictly follow MATLAB PREP’s internal math, ignoring any improvements made in PyPREP over the original code (see Deliberate Differences from MATLAB PREP for more details). Defaults to False.
- ransac : bool
Whether RANSAC should be used for bad channel detection, in addition to other methods. RANSAC can detect bad channels that other methods are unable to catch, but also slows down noisy channel detection considerably. Defaults to True.
- correlation : bool
Whether correlation should be used for bad channel detection, in addition to other methods. Defaults to True.
- bad_by_manual : list of str | None
List of channels that are bad. These channels will be excluded when trying to find additional bad channels. Note that the union of these channels and those declared in raw.info["bads"] will be used. Defaults to None.
- reject_by_annotation : {None, 'omit'} | None
How to handle BAD-annotated time segments (annotations starting with “BAD” or “bad”) during channel quality assessment. If 'omit', annotated segments are excluded from analysis (clean segments are concatenated). If None (default), annotations are ignored and the full recording is used. This is useful when recordings contain breaks or movement artifacts that shouldn’t influence channel rejection decisions.
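The 'omit' behaviour described for reject_by_annotation can be pictured as masking out the annotated samples and joining whatever is left. The following is a minimal NumPy sketch of that idea only, not PyPREP's own code; the helper name omit_bad_spans and the (start, stop) sample-index representation of the annotated segments are assumptions made for illustration.

```python
import numpy as np


def omit_bad_spans(data, bad_spans):
    """Drop annotated sample ranges and concatenate what remains.

    data: array of shape (n_channels, n_samples).
    bad_spans: iterable of (start, stop) sample indices, stop exclusive
    (a hypothetical stand-in for BAD-annotated segments).
    """
    keep = np.ones(data.shape[1], dtype=bool)
    for start, stop in bad_spans:
        keep[start:stop] = False
    return data[:, keep]


data = np.arange(20).reshape(2, 10)
clean = omit_bad_spans(data, [(2, 4), (7, 9)])
print(clean[0])  # samples 0, 1, 4, 5, 6 and 9 of the first channel remain
```

Because the clean segments are simply joined, the resulting signal is shorter than the recording and contains discontinuities at the joins.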
References
[1] Bigdely-Shamlo, N., Mullen, T., Kothe, C., Su, K. M., Robbins, K. A. (2015). The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Frontiers in Neuroinformatics, 9, 16.
- find_all_bads(*, ransac=None, channel_wise=False, max_chunk_size=None, correlation=None, reject_by_annotation=None)
Call all of the bad channel detection functions.
- Parameters:
- ransac : bool | None
Whether RANSAC should be used for bad channel detection, in addition to the other methods. RANSAC can detect bad channels that other methods are unable to catch, but also slows down noisy channel detection considerably. If None (default), then the value at instantiation of the NoisyChannels class is taken (defaults to True), else the instantiation value is overwritten.
- channel_wise : bool | None
Whether RANSAC should predict signals for chunks of channels over the entire signal length (“channel-wise RANSAC”, see max_chunk_size parameter). If False, RANSAC will instead predict signals for all channels at once but over a number of smaller time windows instead of over the entire signal length (“window-wise RANSAC”). Channel-wise RANSAC generally has higher RAM demands than window-wise RANSAC (especially if max_chunk_size is None), but can be faster on systems with lots of RAM to spare. Has no effect if not using RANSAC. Defaults to False.
- max_chunk_size : {int, None} | None
The maximum number of channels to predict at once during channel-wise RANSAC. If None, RANSAC will use the largest chunk size that will fit into the available RAM, which may slow down other programs on the host system. If using window-wise RANSAC (the default) or not using RANSAC at all, this parameter has no effect. Defaults to None.
- correlation : bool | None
Whether correlation should be used for bad channel detection, in addition to the other methods. If None (default), then the value at instantiation of the NoisyChannels class is taken (defaults to True), else the instantiation value is overwritten.
- reject_by_annotation : {None, 'omit'} | None
This parameter is accepted for compatibility but is ignored here. Annotation rejection is applied during NoisyChannels initialization, not during find_all_bads. To use annotation rejection, pass reject_by_annotation to the NoisyChannels constructor.
- find_bad_by_PSD(zscore_threshold=3.0, fmin=1.0, fmax=45.0)
Detect channels with abnormally high or low power spectral density.
This is a PyPREP-only method not present in the original MATLAB PREP.
A channel is considered “bad-by-psd” if either of the following holds:
- Its power in any frequency band (low: 1-15 Hz, mid: 15-30 Hz, high: 30-45 Hz) is abnormally HIGH compared to other channels.
- Its high-frequency band has more power than its low-frequency band (violating the typical 1/f spectral profile of EEG).
Note: Only excess power (positive z-scores) is flagged, as abnormally low power could reflect normal topographic variation.
PSD is computed using Welch’s method over the specified frequency range. The default range (1-45 Hz) excludes line noise frequencies (50/60 Hz).
- Parameters:
- zscore_threshold : float, optional
The minimum absolute z-score of a channel for it to be considered bad-by-psd. Defaults to 3.0.
- fmin : float, optional
The lower frequency bound (in Hz) for PSD computation. Defaults to 1.0.
- fmax : float, optional
The upper frequency bound (in Hz) for PSD computation. The default of 45.0 excludes 50/60 Hz line noise from the analysis.
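The two bad-by-psd criteria can be illustrated with a rough NumPy sketch. It is not PyPREP's implementation and makes several assumptions that the text above does not confirm: a plain FFT periodogram stands in for Welch's method, band power is summed rather than averaged, and the z-score is a robust one built from the median and the scaled MAD. All function names here are hypothetical.

```python
import numpy as np

BANDS = {"low": (1.0, 15.0), "mid": (15.0, 30.0), "high": (30.0, 45.0)}


def band_powers(data, sfreq):
    """Summed periodogram power per band for a (n_channels, n_samples) array."""
    freqs = np.fft.rfftfreq(data.shape[1], d=1.0 / sfreq)
    power = np.abs(np.fft.rfft(data, axis=1)) ** 2
    return {
        name: power[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
        for name, (lo, hi) in BANDS.items()
    }


def robust_zscore(values):
    """Z-score from the median and the MAD scaled to SD units (an assumption)."""
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return (values - median) / (1.4826 * mad)


def bad_by_psd_sketch(data, sfreq, zscore_threshold=3.0):
    """Indices of channels with excess band power or an inverted spectrum."""
    powers = band_powers(data, sfreq)
    too_strong = np.zeros(data.shape[0], dtype=bool)
    for band_power in powers.values():
        # Only positive z-scores (excess power) count against a channel
        too_strong |= robust_zscore(band_power) > zscore_threshold
    inverted = powers["high"] > powers["low"]  # violates the 1/f profile
    return np.flatnonzero(too_strong | inverted)


rng = np.random.default_rng(0)
sfreq, n_channels = 250.0, 8
t = np.arange(5000) / sfreq
# Tone amplitudes fall with frequency to mimic a 1/f profile, with small jitter
amps = np.array([1.0, 0.5, 0.25]) * rng.uniform(0.95, 1.05, size=(n_channels, 3))
amps[6, 2] = 3.0  # channel 6 gets far too much 40 Hz power
tones = np.stack([np.sin(2 * np.pi * f * t) for f in (8.0, 20.0, 40.0)])
data = amps @ tones + 0.01 * rng.standard_normal((n_channels, t.size))
bad_idx = bad_by_psd_sketch(data, sfreq)
print(bad_idx)  # includes channel 6
```

With only a handful of channels, a median/MAD z-score can occasionally flag a normal channel as well, so the threshold should be judged against realistic channel counts.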
- find_bad_by_SNR()
Detect channels that have a low signal-to-noise ratio.
Channels are considered “bad-by-SNR” if they are bad by both high-frequency noise and bad by low correlation.
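Put differently, the bad-by-SNR set is the intersection of two other sets. A tiny illustration with made-up channel names (the variable names are hypothetical, not PyPREP attributes):

```python
# Hypothetical results from the two underlying detection methods
bad_by_hf_noise = {"Fp1", "T7", "O2"}
bad_by_correlation = {"T7", "Cz", "O2"}

# Only channels flagged by both methods count as bad-by-SNR
bad_by_snr = sorted(bad_by_hf_noise & bad_by_correlation)
print(bad_by_snr)  # ['O2', 'T7']
```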
- find_bad_by_correlation(correlation_secs=1.0, correlation_threshold=0.4, frac_bad=0.01)
Detect channels that sometimes don’t correlate with any other channels.
Channel correlations are calculated by splitting the recording into non-overlapping windows of time (default: 1 second), getting the absolute correlations of each usable channel with every other usable channel for each window, and then finding the highest correlation each channel has with another channel for each window (by taking the 98th percentile of the absolute correlations).
A correlation window is considered “bad” for a channel if its maximum correlation with another channel is below the provided correlation threshold (default: 0.4). A channel is considered “bad-by-correlation” if its fraction of bad correlation windows is above the bad fraction threshold (default: 0.01).
This method also detects channels with intermittent dropouts (i.e., regions of flat signal). A channel is considered “bad-by-dropout” if its fraction of correlation windows with a completely flat signal is above the bad fraction threshold (default: 0.01).
- Parameters:
- correlation_secs : float | None
The length (in seconds) of each correlation window. Defaults to 1.0.
- correlation_threshold : float | None
The lowest maximum inter-channel correlation for a channel to be considered “bad” within a given window. Defaults to 0.4.
- frac_bad : float | None
The minimum proportion of bad windows for a channel to be considered “bad-by-correlation” or “bad-by-dropout”. Defaults to 0.01 (1% of all windows).
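The windowed correlation procedure described above can be sketched in a few lines of NumPy. This is an illustrative approximation rather than PyPREP's code: the function name is hypothetical, the dropout check is left out, and zeroing the diagonal of the correlation matrix (so that a channel's correlation with itself is ignored) is an assumption of this sketch.

```python
import numpy as np


def bad_window_fraction(data, sfreq, correlation_secs=1.0, correlation_threshold=0.4):
    """Fraction of windows in which each channel correlates poorly with all others."""
    n_channels, n_samples = data.shape
    win_size = int(correlation_secs * sfreq)
    n_windows = n_samples // win_size
    max_corr = np.zeros((n_windows, n_channels))
    for w in range(n_windows):
        window = data[:, w * win_size:(w + 1) * win_size]
        abs_corr = np.abs(np.corrcoef(window))
        np.fill_diagonal(abs_corr, 0.0)  # ignore each channel's self-correlation
        # The 98th percentile serves as a robust stand-in for the maximum
        max_corr[w] = np.percentile(abs_corr, 98, axis=0)
    return np.mean(max_corr < correlation_threshold, axis=0)


rng = np.random.default_rng(0)
sfreq = 250.0
shared = rng.standard_normal(5000)  # activity common to most channels
data = shared + 0.3 * rng.standard_normal((8, 5000))
data[5] = rng.standard_normal(5000)  # channel 5 shares nothing with the rest
frac = bad_window_fraction(data, sfreq)
bad_idx = np.flatnonzero(frac > 0.01)  # frac_bad threshold of 1%
print(bad_idx)
```

In this synthetic recording only channel 5 fails to correlate with its neighbours, so it is the only channel whose bad-window fraction exceeds the threshold.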
- find_bad_by_deviation(deviation_threshold=5.0)
Detect channels with abnormally high or low overall amplitudes.
A channel is considered “bad-by-deviation” if its amplitude deviates considerably from the median channel amplitude, as calculated using a robust Z-scoring method and the given deviation threshold.
Amplitude Z-scores are calculated using the formula (channel_amplitude - median_amplitude) / amplitude_sd, where channel amplitudes are calculated using a robust outlier-resistant estimate of the signals’ standard deviations (IQR scaled to units of SD), and the amplitude SD is the IQR-based SD of those amplitudes.
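As a worked illustration of the formula, the sketch below computes the robust amplitude Z-scores with NumPy. It follows the description above but is not taken from PyPREP's source; the function names are hypothetical, and 0.7413 is the standard factor that converts an IQR to SD units for normally distributed data.

```python
import numpy as np

IQR_TO_SD = 0.7413  # scales an IQR to SD units for normally distributed data


def iqr(values, axis=None):
    """Interquartile range (75th minus 25th percentile)."""
    q75, q25 = np.percentile(values, [75, 25], axis=axis)
    return q75 - q25


def deviation_zscores(data):
    """Robust amplitude z-score per channel for a (n_channels, n_samples) array."""
    channel_amplitude = iqr(data, axis=1) * IQR_TO_SD
    median_amplitude = np.median(channel_amplitude)
    amplitude_sd = iqr(channel_amplitude) * IQR_TO_SD
    return (channel_amplitude - median_amplitude) / amplitude_sd


rng = np.random.default_rng(0)
data = rng.standard_normal((8, 5000))
data[3] *= 20.0  # simulate one channel with a much larger amplitude
z = deviation_zscores(data)
bad_idx = np.flatnonzero(np.abs(z) > 5.0)  # deviation_threshold of 5.0
print(bad_idx)  # includes channel 3
```

Taking the absolute value of the Z-score flags channels whose amplitude is unusually low as well as unusually high.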
- find_bad_by_hfnoise(HF_zscore_threshold=5.0)
Detect channels with abnormally high amounts of high-frequency noise.
The noisiness of a channel is defined as the amplitude of its high-frequency (>50 Hz) components divided by its overall amplitude. A channel is considered “bad-by-high-frequency-noise” if its noisiness is considerably higher than the median channel noisiness, as determined by a robust Z-scoring method and the given Z-score threshold.
Due to the Nyquist theorem, this method will only attempt bad channel detection if the sample rate of the given signal is above 100 Hz.
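The noisiness ratio and the sample-rate guard can be sketched as follows. This is a simplified illustration under stated assumptions, not PyPREP's implementation: the high-frequency component is isolated by zeroing FFT bins at or below 50 Hz instead of using a proper filter, "amplitude" is taken to be the median absolute deviation (MAD), and the function names are hypothetical.

```python
import numpy as np


def mad(values, axis=None):
    """Median absolute deviation from the median."""
    median = np.median(values, axis=axis, keepdims=True)
    return np.median(np.abs(values - median), axis=axis)


def hf_noise_zscores(data, sfreq, cutoff_hz=50.0):
    """Robust z-score of each channel's high-frequency noisiness, or None."""
    if sfreq <= 2 * cutoff_hz:
        return None  # the signal cannot represent frequencies above the cutoff
    freqs = np.fft.rfftfreq(data.shape[1], d=1.0 / sfreq)
    spectrum = np.fft.rfft(data, axis=1)
    spectrum[:, freqs <= cutoff_hz] = 0.0  # keep only components above the cutoff
    high_freq = np.fft.irfft(spectrum, n=data.shape[1], axis=1)
    noisiness = mad(high_freq, axis=1) / mad(data, axis=1)
    return (noisiness - np.median(noisiness)) / (1.4826 * mad(noisiness))


rng = np.random.default_rng(0)
sfreq = 250.0
t = np.arange(5000) / sfreq
data = np.sin(2 * np.pi * 10.0 * t) + 0.1 * rng.standard_normal((8, t.size))
data[2] += rng.standard_normal(t.size)  # channel 2 gets strong broadband noise
z = hf_noise_zscores(data, sfreq)
bad_idx = np.flatnonzero(z > 5.0)  # HF_zscore_threshold of 5.0
print(bad_idx)  # includes channel 2
```

At a sample rate of 100 Hz or lower the Nyquist frequency is at most 50 Hz, so nothing above the cutoff exists and the sketch returns None rather than attempting detection.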
- find_bad_by_nan_flat(flat_threshold=1e-15)
Detect channels that contain NaN values or have near-flat signals.
A channel is considered flat if its standard deviation or its median absolute deviation from the median (MAD) is below the provided flat threshold (default: 1e-15 volts).
This method is run automatically when a NoisyChannels object is initialized, preventing flat or NaN-containing channels from interfering with the detection of other types of bad channels. The reject_by_annotation setting of the NoisyChannels instance is respected when retrieving the data.
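A minimal NumPy sketch of the NaN and flatness checks is shown below. It mirrors the description above rather than PyPREP's source, and the function name is hypothetical. The last test channel shows why both criteria are used: a signal that is flat apart from occasional spikes has a non-zero standard deviation but a MAD of zero.

```python
import numpy as np


def find_nan_and_flat(data, flat_threshold=1e-15):
    """Return index arrays of NaN-containing and near-flat channels."""
    has_nan = np.isnan(data).any(axis=1)
    candidates = np.flatnonzero(~has_nan)  # flatness is only checked without NaNs
    clean = data[candidates]
    sd = np.std(clean, axis=1)
    median = np.median(clean, axis=1, keepdims=True)
    mad = np.median(np.abs(clean - median), axis=1)
    is_flat = (sd < flat_threshold) | (mad < flat_threshold)
    return np.flatnonzero(has_nan), candidates[is_flat]


rng = np.random.default_rng(0)
data = 1e-5 * rng.standard_normal((5, 1000))  # amplitudes around 10 microvolts
data[1, 100] = np.nan  # one missing sample
data[3] = 0.0  # completely flat channel
data[4] = 0.0
data[4, ::100] = 1e-4  # flat apart from occasional spikes: SD > 0, MAD = 0
nan_idx, flat_idx = find_nan_and_flat(data)
print(nan_idx, flat_idx)  # [1] [3 4]
```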
- find_bad_by_ransac(n_samples=50, sample_prop=0.25, corr_thresh=0.75, frac_bad=0.4, corr_window_secs=5.0, channel_wise=False, max_chunk_size=None)
Detect channels that are predicted poorly by other channels.
This method uses a random sample consensus approach (RANSAC, see [1], and a short discussion in [2]) to try to predict what the signal should be for each channel based on the signals and spatial locations of other currently-good channels. RANSAC correlations are calculated by splitting the recording into non-overlapping windows of time (default: 5 seconds) and correlating each channel’s RANSAC-predicted signal with its actual signal within each window.
A RANSAC window is considered “bad” for a channel if its predicted signal vs. actual signal correlation falls below the given correlation threshold (default: 0.75). A channel is considered “bad-by-RANSAC” if its fraction of bad RANSAC windows is above the given threshold (default: 0.4).
Due to its random sampling component, the channels identified as “bad-by-RANSAC” may vary slightly between calls of this method. Additionally, bad channels may vary between different montages given that RANSAC’s signal predictions are based on the spatial coordinates of each electrode.
This method is a wrapper for the find_bad_by_ransac() function.
Warning
For optimal performance, RANSAC requires that channels bad by deviation, correlation, and/or dropout have already been flagged. Otherwise RANSAC will attempt to use those channels when making signal predictions, decreasing accuracy and thus increasing the likelihood of false positives.
- Parameters:
- n_samples : int | None
Number of random channel samples to use for RANSAC. Defaults to 50.
- sample_prop : float | None
Proportion of total channels to use for signal prediction per RANSAC sample. This needs to be in the range [0, 1], where 0 would mean no channels would be used and 1 would mean all channels would be used (neither of which would be useful values). Defaults to 0.25 (e.g., 16 channels per sample for a 64-channel dataset).
- corr_thresh : float | None
The minimum predicted vs. actual signal correlation for a channel to be considered good within a given RANSAC window. Defaults to 0.75.
- frac_bad : float | None
The minimum fraction of bad (i.e., below-threshold) RANSAC windows for a channel to be considered bad-by-RANSAC. Defaults to 0.4.
- corr_window_secs : float | None
The duration (in seconds) of each RANSAC correlation window. Defaults to 5 seconds.
- channel_wise : bool | None
Whether RANSAC should predict signals for chunks of channels over the entire signal length (“channel-wise RANSAC”, see max_chunk_size parameter). If False, RANSAC will instead predict signals for all channels at once but over a number of smaller time windows instead of over the entire signal length (“window-wise RANSAC”). Channel-wise RANSAC generally has higher RAM demands than window-wise RANSAC (especially if max_chunk_size is None), but can be faster on systems with lots of RAM to spare. Defaults to False.
- max_chunk_size : {int, None} | None
The maximum number of channels to predict at once during channel-wise RANSAC. If None, RANSAC will use the largest chunk size that will fit into the available RAM, which may slow down other programs on the host system. If using window-wise RANSAC (the default), this parameter has no effect. Defaults to None.
References
[1] Fischler, M.A., Bolles, R.C. (1981). Random sample consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24, 381-395.
[2] Jas, M., Engemann, D.A., Bekhti, Y., Raimondo, F., Gramfort, A. (2017). Autoreject: Automated Artifact Rejection for MEG and EEG Data. NeuroImage, 159, 417-429.
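The scoring step of this method (windowed correlation between predicted and actual signals, followed by the bad-window fraction) can be sketched independently of how the predictions are produced. The example below assumes the RANSAC-predicted signals are already available as an array and does not reproduce the channel sampling or spatial interpolation; the function name is hypothetical, and this is not PyPREP's implementation.

```python
import numpy as np


def ransac_bad_fraction(actual, predicted, sfreq, corr_window_secs=5.0, corr_thresh=0.75):
    """Fraction of windows in which prediction and signal correlate poorly."""
    n_channels, n_samples = actual.shape
    win_size = int(corr_window_secs * sfreq)
    n_windows = n_samples // win_size
    bad = np.zeros((n_windows, n_channels), dtype=bool)
    for w in range(n_windows):
        sl = slice(w * win_size, (w + 1) * win_size)
        a = actual[:, sl] - actual[:, sl].mean(axis=1, keepdims=True)
        p = predicted[:, sl] - predicted[:, sl].mean(axis=1, keepdims=True)
        # Pearson correlation of each channel with its own prediction
        corr = (a * p).sum(axis=1) / np.sqrt((a**2).sum(axis=1) * (p**2).sum(axis=1))
        bad[w] = corr < corr_thresh
    return bad.mean(axis=0)


# With the defaults, a 64-channel dataset uses 16 predictor channels per sample
n_predictors = int(round(0.25 * 64))

rng = np.random.default_rng(0)
sfreq = 100.0
predicted = rng.standard_normal((4, 5000))
actual = predicted + 0.1 * rng.standard_normal((4, 5000))
actual[1, :3000] = rng.standard_normal(3000)  # channel 1 is unpredictable for 30 s
frac = ransac_bad_fraction(actual, predicted, sfreq)
bad_idx = np.flatnonzero(frac > 0.4)  # frac_bad threshold of 0.4
print(n_predictors, bad_idx)
```

Channel 1 is poorly predicted in six of the ten 5-second windows, which puts its bad fraction above the 0.4 threshold.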
- get_bads(verbose=False, as_dict=False)
Get the names of all channels currently flagged as bad.
Note that this method does not perform any bad channel detection itself, and only reports channels already detected as bad by other methods.
- Parameters:
- verbose : bool | None
If True, a summary of the channels currently flagged as bad per category is printed. Defaults to False.
- as_dict : bool | None
If True, this method will return a dict of the channels currently flagged as bad by each individual bad channel type. If False, this method will return a list of all unique bad channels detected so far. Defaults to False.
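The relationship between the two return shapes selected by as_dict can be illustrated with plain Python. The dictionary keys and channel names below are invented for the example and are not guaranteed to match the keys PyPREP uses.

```python
# Hypothetical per-category results, shaped like an as_dict=True return value
bads_by_type = {
    "bad_by_nan": [],
    "bad_by_flat": ["Oz"],
    "bad_by_deviation": ["T7", "Fp1"],
    "bad_by_correlation": ["T7"],
}

# The as_dict=False form corresponds to the unique channels across categories;
# sorting is done here only to make the printed output reproducible
all_bads = sorted(set().union(*bads_by_type.values()))
print(all_bads)  # ['Fp1', 'Oz', 'T7']
```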
- Returns: