ransac.find_bad_by_ransac
- ransac.find_bad_by_ransac(data, sample_rate, complete_chn_labs, chn_pos, exclude, n_samples=50, sample_prop=0.25, corr_thresh=0.75, frac_bad=0.4, corr_window_secs=5.0, channel_wise=False, max_chunk_size=None, random_state=None, matlab_strict=False)
Detect channels that are not predicted well by other channels.
Here, a RANSAC approach (see [1], and a short discussion in [2]) is adopted to predict a “clean EEG” dataset. After clean EEG channels have been identified through the other methods, the clean EEG dataset is constructed by repeatedly sampling a small subset of clean EEG channels and interpolating the complete data from that subset. The median of all those repetitions forms the clean EEG dataset. In a second step, the original and the RANSAC-predicted data are correlated, and channels that do not correlate well with themselves across the two datasets are considered bad_by_ransac.
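The second step described above can be sketched as follows. This is a minimal illustration, not PyPREP's implementation: the helper name `windowed_correlations` is hypothetical, the interpolation step is skipped by assuming the per-sample predictions are already available as an array, and trailing samples that do not fill a whole window are simply dropped.

```python
import numpy as np


def windowed_correlations(data, predictions, sample_rate, corr_window_secs=5.0):
    """Correlate each channel with the median of its predictions, per window.

    data:        (n_channels, n_times) array of recorded signals
    predictions: (n_samples, n_channels, n_times) array, one prediction of
                 the full dataset per random channel sample (assumed given)
    Returns an array of shape (n_windows, n_channels).
    """
    # The median across all random samples forms the predicted "clean" data
    predicted = np.median(predictions, axis=0)

    win_len = int(corr_window_secs * sample_rate)
    n_windows = data.shape[1] // win_len  # incomplete final window is dropped
    corrs = np.zeros((n_windows, data.shape[0]))

    for w in range(n_windows):
        sl = slice(w * win_len, (w + 1) * win_len)
        # Pearson correlation between actual and predicted signal, per channel
        a = data[:, sl] - data[:, sl].mean(axis=1, keepdims=True)
        p = predicted[:, sl] - predicted[:, sl].mean(axis=1, keepdims=True)
        denom = np.sqrt((a ** 2).sum(axis=1) * (p ** 2).sum(axis=1))
        corrs[w] = (a * p).sum(axis=1) / denom  # assumes no flat channels
    return corrs


# Synthetic demo: channels 0-2 are predicted well, channel 3 is not
rng = np.random.RandomState(0)
data = rng.randn(4, 1000)
predictions = data[np.newaxis] + 0.1 * rng.randn(5, 4, 1000)
predictions[:, 3, :] = rng.randn(5, 1000)  # unrelated predictions
corrs = windowed_correlations(data, predictions, sample_rate=100.0)
print(corrs.shape)  # (2, 4): two 5-second windows, four channels
```

In the synthetic demo, the first three channels correlate strongly with their predictions in every window, while the fourth does not and would be a candidate for bad_by_ransac.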
- Parameters:
- data
np.ndarray
A 2-D array of detrended EEG data, with bad-by-flat and bad-by-NaN channels removed.
- sample_rate
float
The sample rate (in Hz) of the EEG data.
- complete_chn_labs
array_like
Labels for all channels in data, in the same order as they appear in data.
- chn_pos
np.ndarray
3-D electrode coordinates for all channels in data, in the same order as they appear in data.
- exclude
list
Labels of channels to exclude as signal predictors during RANSAC (i.e., channels already flagged as bad by metrics other than HF noise).
- n_samples
int, optional
Number of random channel samples to use for RANSAC. Defaults to 50.
- sample_prop
float, optional
Proportion of total channels to use for signal prediction per RANSAC sample. This needs to be in the range [0, 1], where 0 would mean no channels would be used and 1 would mean all channels would be used (neither of which would be useful values). Defaults to 0.25 (e.g., 16 channels per sample for a 64-channel dataset).
- corr_thresh
float, optional
The minimum predicted vs. actual signal correlation for a channel to be considered good within a given RANSAC window. Defaults to 0.75.
- frac_bad
float, optional
The minimum fraction of bad (i.e., below-threshold) RANSAC windows for a channel to be considered bad-by-RANSAC. Defaults to 0.4.
- corr_window_secs
float, optional
The duration (in seconds) of each RANSAC correlation window. Defaults to 5 seconds.
- channel_wise
bool, optional
Whether RANSAC should predict signals for chunks of channels over the entire signal length (“channel-wise RANSAC”, see max_chunk_size parameter). If False, RANSAC will instead predict signals for all channels at once but over a number of smaller time windows instead of over the entire signal length (“window-wise RANSAC”). Channel-wise RANSAC generally has higher RAM demands than window-wise RANSAC (especially if max_chunk_size is None), but can be faster on systems with lots of RAM to spare. Defaults to False.
- max_chunk_size
{int, None}, optional
The maximum number of channels to predict at once during channel-wise RANSAC. If None, RANSAC will use the largest chunk size that will fit into the available RAM, which may slow down other programs on the host system. If using window-wise RANSAC (the default), this parameter has no effect. Defaults to None.
- random_state
{int, None, np.random.RandomState}, optional
The random seed with which to generate random samples of channels during RANSAC. If random_state is an int, it will be used as a seed for RandomState. If None, the seed will be obtained from the operating system (see RandomState for details). Defaults to None.
- matlab_strict
bool, optional
Whether or not RANSAC should strictly follow MATLAB PREP’s internal math, ignoring any improvements made in PyPREP over the original code (see Deliberate Differences from MATLAB PREP for more details). Defaults to False.
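To make the interplay of exclude, n_samples, sample_prop, and random_state concrete, here is a rough sketch of how random predictor subsets could be drawn. The helper `draw_predictor_subsets` is hypothetical and not part of PyPREP; in particular, rounding `sample_prop * n_channels` to the nearest integer and raising an error when too few candidate channels remain are assumptions made for illustration.

```python
import numpy as np


def draw_predictor_subsets(chn_labs, exclude, n_samples=50, sample_prop=0.25,
                           random_state=None):
    """Draw n_samples random subsets of non-excluded channel labels."""
    # Accept an existing RandomState, or build one from an int seed / None
    if isinstance(random_state, np.random.RandomState):
        rng = random_state
    else:
        rng = np.random.RandomState(random_state)

    excluded = set(exclude)
    candidates = [ch for ch in chn_labs if ch not in excluded]

    # Subset size is a proportion of *all* channels (assumed rounding rule)
    n_pick = int(round(sample_prop * len(chn_labs)))
    if n_pick > len(candidates):
        raise ValueError("Too few usable channels for the requested sample_prop.")

    # Each subset is drawn without replacement from the candidate channels
    return [
        rng.choice(candidates, size=n_pick, replace=False).tolist()
        for _ in range(n_samples)
    ]


labels = ["Ch{}".format(i) for i in range(64)]
subsets = draw_predictor_subsets(labels, exclude=["Ch0", "Ch1"], random_state=42)
print(len(subsets), len(subsets[0]))  # 50 16
```

With the default sample_prop of 0.25 and 64 channels, each of the 50 subsets contains 16 channels, matching the example given for sample_prop. Passing the same integer random_state reproduces the same subsets.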
- Returns:
- bad_by_ransac
list
List containing the labels of all channels flagged as bad by RANSAC.
- channel_correlations
np.ndarray
Array of shape (windows, channels) containing the correlations of the channels with their predicted RANSAC values for each window.
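As an illustration of how the two return values relate, the sketch below derives a list of bad channels from an array shaped like channel_correlations using corr_thresh and frac_bad. The helper `flag_bad_channels` is hypothetical; treating a channel whose bad-window fraction exactly equals frac_bad as bad is an assumption based on the word "minimum" in the parameter description, and the library's exact boundary handling may differ.

```python
import numpy as np


def flag_bad_channels(channel_correlations, chn_labs, corr_thresh=0.75,
                      frac_bad=0.4):
    """Return labels of channels with too many below-threshold windows.

    channel_correlations: array of shape (windows, channels)
    """
    # A window is "bad" for a channel if its correlation is below corr_thresh
    below = np.asarray(channel_correlations) < corr_thresh
    # Fraction of bad windows per channel
    frac_below = below.mean(axis=0)
    # Inclusive comparison is an assumption (see lead-in)
    return [lab for lab, frac in zip(chn_labs, frac_below) if frac >= frac_bad]


# Five windows, three channels (made-up correlations)
example_corrs = np.array([
    [0.9, 0.5, 0.9],
    [0.9, 0.6, 0.9],
    [0.9, 0.4, 0.7],
    [0.9, 0.9, 0.9],
    [0.9, 0.9, 0.9],
])
bad = flag_bad_channels(example_corrs, ["A", "B", "C"])
print(bad)  # ['B']
```

Channel B falls below the correlation threshold in three of five windows (0.6), which exceeds frac_bad, whereas channel C does so in only one window (0.2) and is kept.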
References
[1] Fischler, M.A., Bolles, R.C. (1981). Random sample consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24, 381-395.
[2] Jas, M., Engemann, D.A., Bekhti, Y., Raimondo, F., Gramfort, A. (2017). Autoreject: Automated Artifact Rejection for MEG and EEG Data. NeuroImage, 159, 417-429.