ransac.find_bad_by_ransac

ransac.find_bad_by_ransac(data, sample_rate, complete_chn_labs, chn_pos, exclude, n_samples=50, sample_prop=0.25, corr_thresh=0.75, frac_bad=0.4, corr_window_secs=5.0, channel_wise=False, max_chunk_size=None, random_state=None, matlab_strict=False)

Detect channels that are not predicted well by other channels.

This function uses a RANSAC approach (see [1], and a short discussion in [2]) to predict a “clean EEG” dataset. After clean EEG channels have been identified through the other methods, the clean EEG dataset is constructed by repeatedly sampling a small subset of clean EEG channels and interpolating the complete data from that subset. The median of all those repetitions forms the clean EEG dataset. In a second step, the original and the RANSAC-predicted data are correlated, and channels that do not correlate well with themselves across the two datasets are considered bad_by_ransac.
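The two steps described above can be sketched in NumPy roughly as follows. This is an illustrative toy under stated assumptions, not PyPREP's implementation: inverse-distance weighting stands in for the interpolation used in practice, the correlation is taken over the whole signal rather than per window, and the helper names (`idw_weights`, `predict_by_ransac`, `channel_corr`) are hypothetical.

```python
import numpy as np


def idw_weights(pos_from, pos_to, eps=1e-12):
    """Unnormalised inverse-distance weights, shape (n_to, n_from).

    Assumption: a crude stand-in for a proper scalp interpolation method.
    """
    dists = np.linalg.norm(pos_to[:, None, :] - pos_from[None, :, :], axis=2)
    return 1.0 / (dists + eps)


def predict_by_ransac(data, chn_pos, good_idx, n_samples=50, sample_prop=0.25, seed=0):
    """Median over n_samples interpolations from random subsets of good channels."""
    rng = np.random.RandomState(seed)
    n_pred = int(np.around(sample_prop * data.shape[0]))
    predictions = np.empty((n_samples,) + data.shape)
    for i in range(n_samples):
        picks = rng.choice(good_idx, size=n_pred, replace=False)
        weights = idw_weights(chn_pos[picks], chn_pos)
        # Never let a channel predict itself: zero its own weight.
        weights[picks, np.arange(n_pred)] = 0.0
        weights /= weights.sum(axis=1, keepdims=True)
        predictions[i] = weights @ data[picks]
    return np.median(predictions, axis=0)


def channel_corr(a, b):
    """Pearson correlation between matching rows of a and b."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    return (a * b).sum(axis=1) / np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))


# Toy data: 16 channels share one source, except channel 3, which is pure noise.
rng = np.random.RandomState(42)
n_chan, n_times = 16, 1000
chn_pos = rng.randn(n_chan, 3)
chn_pos /= np.linalg.norm(chn_pos, axis=1, keepdims=True)
source = np.sin(np.linspace(0, 40 * np.pi, n_times))
data = source + 0.1 * rng.randn(n_chan, n_times)
data[3] = rng.randn(n_times)

good_idx = np.array([i for i in range(n_chan) if i != 3])
predicted = predict_by_ransac(data, chn_pos, good_idx)
corr = channel_corr(data, predicted)
print(np.round(corr, 2))
```

In this toy setup, the noisy channel correlates poorly with its prediction while the remaining channels correlate well, which is the contrast that the second step relies on.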

Parameters:
data : np.ndarray

A 2-D array of detrended EEG data, with bad-by-flat and bad-by-NaN channels removed.

sample_rate : float

The sample rate (in Hz) of the EEG data.

complete_chn_labs : array_like

Labels for all channels in data, in the same order as they appear in data.

chn_pos : np.ndarray

3-D electrode coordinates for all channels in data, in the same order as they appear in data.

exclude : list

Labels of channels to exclude as signal predictors during RANSAC (i.e., channels already flagged as bad by metrics other than HF noise).

n_samples : int, optional

Number of random channel samples to use for RANSAC. Defaults to 50.

sample_prop : float, optional

Proportion of total channels to use for signal prediction per RANSAC sample. Must be in the range [0, 1]; neither endpoint is a useful value, since 0 would use no channels and 1 would use all of them. Defaults to 0.25 (e.g., 16 channels per sample for a 64-channel dataset).

corr_thresh : float, optional

The minimum predicted vs. actual signal correlation for a channel to be considered good within a given RANSAC window. Defaults to 0.75.

frac_bad : float, optional

The minimum fraction of bad (i.e., below-threshold) RANSAC windows for a channel to be considered bad-by-RANSAC. Defaults to 0.4.

corr_window_secs : float, optional

The duration (in seconds) of each RANSAC correlation window. Defaults to 5 seconds.

channel_wise : bool, optional

Whether RANSAC should predict signals for chunks of channels over the entire signal length (“channel-wise RANSAC”, see max_chunk_size parameter). If False, RANSAC will instead predict signals for all channels at once but over a number of smaller time windows instead of over the entire signal length (“window-wise RANSAC”). Channel-wise RANSAC generally has higher RAM demands than window-wise RANSAC (especially if max_chunk_size is None), but can be faster on systems with lots of RAM to spare. Defaults to False.

max_chunk_size : {int, None}, optional

The maximum number of channels to predict at once during channel-wise RANSAC. If None, RANSAC will use the largest chunk size that will fit into the available RAM, which may slow down other programs on the host system. If using window-wise RANSAC (the default), this parameter has no effect. Defaults to None.

random_state : {int, None, np.random.RandomState}, optional

The random seed with which to generate random samples of channels during RANSAC. If random_state is an int, it will be used as a seed for RandomState. If None, the seed will be obtained from the operating system (see RandomState for details). Defaults to None.

matlab_strict : bool, optional

Whether or not RANSAC should strictly follow MATLAB PREP’s internal math, ignoring any improvements made in PyPREP over the original code (see Deliberate Differences from MATLAB PREP for more details). Defaults to False.
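To make the interplay of sample_prop, n_samples, exclude and random_state more concrete, the following sketch draws reproducible channel samples with NumPy. It is a hedged illustration rather than the library's internals: the helpers `n_predictors`, `as_random_state` and `draw_channel_samples` are hypothetical, rounding to the nearest integer is an assumption, and the excluded channel indices are made up for the example.

```python
import numpy as np


def n_predictors(n_channels, sample_prop=0.25):
    """Number of predictor channels per RANSAC sample (hypothetical helper)."""
    if not 0 < sample_prop < 1:
        raise ValueError("sample_prop should lie strictly between 0 and 1")
    # Assumption: the product is rounded to the nearest integer.
    return int(np.around(sample_prop * n_channels))


def as_random_state(random_state=None):
    """Turn an int seed, None, or a RandomState instance into a RandomState."""
    if isinstance(random_state, np.random.RandomState):
        return random_state
    # RandomState(None) seeds itself from the operating system.
    return np.random.RandomState(random_state)


def draw_channel_samples(good_idx, n_pred, n_samples=50, random_state=None):
    """Draw n_samples random subsets (without replacement) of the usable channels."""
    rng = as_random_state(random_state)
    return np.array(
        [rng.choice(good_idx, size=n_pred, replace=False) for _ in range(n_samples)]
    )


n_channels = 64
exclude_idx = [3, 17]  # hypothetical indices of channels listed in `exclude`
good_idx = np.setdiff1d(np.arange(n_channels), exclude_idx)

n_pred = n_predictors(n_channels)  # 16 predictors for 64 channels
first = draw_channel_samples(good_idx, n_pred, random_state=435656)
second = draw_channel_samples(good_idx, n_pred, random_state=435656)
print(first.shape, np.array_equal(first, second))
```

Passing the same integer seed twice yields identical samples, which is what makes a run repeatable; passing None would give different samples on each call.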

Returns:
bad_by_ransac : list

List containing the labels of all channels flagged as bad by RANSAC.

channel_correlations : np.ndarray

Array of shape (windows, channels) containing the correlations of the channels with their predicted RANSAC values for each window.
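As a rough illustration of how the two return values relate, the sketch below builds a (windows, channels) correlation array from actual and predicted signals and then applies corr_thresh and frac_bad to it. The helper names `windowed_correlations` and `flag_bad_channels` are hypothetical, and two details are assumptions: trailing samples that do not fill a whole window are dropped, and a channel is flagged once its fraction of bad windows reaches frac_bad.

```python
import numpy as np


def windowed_correlations(actual, predicted, sample_rate, corr_window_secs=5.0):
    """Per-window Pearson correlations, shape (windows, channels)."""
    win_len = int(sample_rate * corr_window_secs)
    n_windows = actual.shape[1] // win_len  # assumption: leftover samples are dropped
    corrs = np.empty((n_windows, actual.shape[0]))
    for w in range(n_windows):
        window = slice(w * win_len, (w + 1) * win_len)
        a = actual[:, window] - actual[:, window].mean(axis=1, keepdims=True)
        p = predicted[:, window] - predicted[:, window].mean(axis=1, keepdims=True)
        corrs[w] = (a * p).sum(axis=1) / np.sqrt(
            (a ** 2).sum(axis=1) * (p ** 2).sum(axis=1)
        )
    return corrs


def flag_bad_channels(channel_correlations, labels, corr_thresh=0.75, frac_bad=0.4):
    """Labels of channels with too many below-threshold windows."""
    below = channel_correlations < corr_thresh
    frac_bad_windows = below.mean(axis=0)
    # Assumption: flag a channel once the fraction reaches frac_bad.
    return [lab for lab, frac in zip(labels, frac_bad_windows) if frac >= frac_bad]


rng = np.random.RandomState(0)
actual = rng.randn(3, 2000)       # 3 channels, 20 s at 100 Hz -> 4 windows of 5 s
predicted = actual.copy()
predicted[2] = rng.randn(2000)    # the third channel is predicted poorly everywhere

channel_correlations = windowed_correlations(actual, predicted, sample_rate=100.0)
bad_by_ransac = flag_bad_channels(channel_correlations, ["Fz", "Pz", "Cz"])
print(channel_correlations.shape, bad_by_ransac)
```

Inspecting the per-window array, rather than only the list of flagged labels, can help when deciding whether a channel is bad throughout a recording or only during part of it.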

References

[1]

Fischler, M.A., Bolles, R.C. (1981). Random sample consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24, 381-395.

[2]

Jas, M., Engemann, D.A., Bekhti, Y., Raimondo, F., Gramfort, A. (2017). Autoreject: Automated Artifact Rejection for MEG and EEG Data. NeuroImage, 159, 417-429.