scatcluster.analysis.waveform_correlations

Waveform Correlations Analysis module.

Classes

WaveformCorrelations

Module Contents

class scatcluster.analysis.waveform_correlations.WaveformCorrelations
process_cluster_trace_correlation(df_preds: pandas.DataFrame, cluster: int = 1, sort_type: str = 'all', sort_filter: int = None, time_second: int = 60, channel: str = 'HHZ', envelope: bool = False)

Calculate the waveform trace correlation for a given cluster.

Parameters:
  • df_preds (pd.DataFrame) – The DataFrame containing the predictions.

  • cluster (int, optional) – The cluster number. Defaults to 1.

  • sort_type (str, optional) – The type of sorting to apply. Options are ‘all’, ‘xcorr_filter’, and ‘distance_filter’. Defaults to ‘all’.

  • sort_filter (int, optional) – The filter value for sorting. Defaults to None.

  • time_second (int, optional) – The time window in seconds. Defaults to 60.

  • channel (str, optional) – The channel to use. Defaults to ‘HHZ’.

  • envelope (bool, optional) – Whether to use the envelope. Defaults to False.

Returns:

A dictionary containing the centre waveform time, centre waveform, and correlations.

Return type:

dict
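The idea of correlating a trace against a centre waveform can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the method's actual implementation: the helper name best_lag_and_coefficient, the mean removal, and the normalisation are all hypothetical choices made for the example.

```python
import numpy as np


def best_lag_and_coefficient(centre, trace):
    """Return (lag, coefficient) at the peak of the normalised cross-correlation.

    Hypothetical helper for illustration only; not part of scatcluster.
    """
    centre = np.asarray(centre, dtype=float)
    trace = np.asarray(trace, dtype=float)
    # Remove the mean so the coefficient reflects waveform shape, not offset.
    centre = centre - centre.mean()
    trace = trace - trace.mean()
    norm = np.linalg.norm(centre) * np.linalg.norm(trace)
    if norm == 0.0:
        return 0, 0.0
    # "full" mode evaluates every relative shift between the two signals.
    xcorr = np.correlate(trace, centre, mode="full") / norm
    peak = int(np.argmax(np.abs(xcorr)))
    # In "full" mode, index len(centre) - 1 corresponds to zero lag.
    lag = peak - (len(centre) - 1)
    return lag, float(xcorr[peak])


# Synthetic data: a short pulse and a copy delayed by five samples.
centre = np.zeros(64)
centre[20:25] = [1.0, 2.0, 3.0, 2.0, 1.0]
trace = np.roll(centre, 5)
lag, coefficient = best_lag_and_coefficient(centre, trace)
```

With this sign convention, a positive lag means the trace arrives later than the centre waveform.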

calculate_waveform_correlations(df_preds: pandas.DataFrame, sort_type='distance_filter', sort_filter=100, time_second=60, channel='HHZ', envelope=False)

Calculate the waveform correlations for each cluster based on the input DataFrame of predictions.

Parameters:
  • df_preds (pd.DataFrame) – The DataFrame containing the predictions.

  • sort_type (str, optional) – The type of sorting to apply. Defaults to ‘distance_filter’.

  • sort_filter (int, optional) – The filter value for sorting. Defaults to 100.

  • time_second (int, optional) – The time window in seconds. Defaults to 60.

  • channel (str, optional) – The channel to use. Defaults to ‘HHZ’.

  • envelope (bool, optional) – Whether to use the envelope. Defaults to False.

Returns:

A dictionary containing the waveform correlations for each cluster.

Return type:

dict

load_correlations(sort_type='distance_filter', sort_filter=100, envelope=False)

Load the waveform correlations from a pickle file.

Parameters:
  • sort_type (str, optional) – The type of sorting to be applied to the correlations. Defaults to ‘distance_filter’.

  • sort_filter (int, optional) – The filter to be applied to the sorted correlations. Defaults to 100.

  • envelope (bool, optional) – Whether to use the envelope of the waveform. Defaults to False.

Returns:

A dictionary containing the waveform correlations.

Return type:

dict
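Reading a dictionary back from a pickle file follows the standard library pattern sketched below. The file name, the helper load_pickled_dict, and the dictionary layout are hypothetical; the real method builds its path from instance attributes that are not documented here.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickled_dict(path):
    """Read a dictionary from a pickle file (hypothetical helper)."""
    with open(path, "rb") as handle:
        data = pickle.load(handle)
    if not isinstance(data, dict):
        raise TypeError(f"expected a dict, got {type(data).__name__}")
    return data


# Round-trip demo with a made-up structure keyed by cluster number.
example = {1: [{"correlation_value": 0.82}], 2: []}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "correlations_demo.pickle"
    with open(demo_path, "wb") as handle:
        pickle.dump(example, handle)
    loaded = load_pickled_dict(demo_path)
```

Pickle files can execute arbitrary code when loaded, so only open files that you or your own pipeline produced.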

stack_correlations(correlations, cluster)

Calculate the stacked correlations for a given cluster.

Parameters:
  • correlations (dict) – A dictionary containing the correlations for different clusters.

  • cluster (int) – The cluster number.

Returns:

The stacked correlations for the given cluster, or None if the cluster is not present in the correlations dictionary.

Return type:

numpy.ndarray or None
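The return contract (an array, or None for an unknown cluster) can be sketched as follows. The layout of each entry, the "waveform" key, and the helper name stack_cluster_waveforms are assumptions for illustration. Whether the library stacks rows into a 2-D array or averages them is not stated here, so the sketch stacks and shows the mean separately.

```python
import numpy as np


def stack_cluster_waveforms(correlations, cluster, key="waveform"):
    """Stack one cluster's waveforms into a 2-D array, or return None.

    Hypothetical helper; the entry layout and the key name are assumptions.
    """
    if cluster not in correlations:
        return None
    entries = correlations[cluster]
    # An empty cluster is also treated as missing here (an assumption).
    if len(entries) == 0:
        return None
    # One row per trace; all traces must share the same length.
    return np.vstack([np.asarray(entry[key], dtype=float) for entry in entries])


demo = {
    1: [{"waveform": [0.0, 1.0, 0.0]}, {"waveform": [0.0, 3.0, 0.0]}],
}
stacked = stack_cluster_waveforms(demo, 1)
mean_waveform = stacked.mean(axis=0)
missing = stack_cluster_waveforms(demo, 7)
```

Callers should check for None before indexing the result, since a cluster with no correlations yields no array.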

process_waveform_correlations_stacked_waveform(df_preds, correlations, sort_type, sort_filter, envelope=False)

Process the waveform correlations and stack the correlated waveforms for each cluster.

Parameters:
  • df_preds (pandas.DataFrame) – The DataFrame containing the predictions.

  • correlations (dict) – A dictionary containing the correlations for different clusters.

  • sort_type (str) – The type of sorting to apply.

  • sort_filter (int) – The filter value for sorting.

  • envelope (bool, optional) – Whether to use the envelope. Defaults to False.

Returns:

A dictionary containing the stacked correlated waveforms for each cluster.

Return type:

dict

This function iterates over the unique clusters in the predictions DataFrame and calculates the stacked correlated waveforms for each cluster using the stack_correlations method. The resulting stacked correlated waveforms are stored in the correlation_waveform dictionary.

The _wvf_type variable is set to ‘waveform’ by default. If the envelope parameter is True, _wvf_type is set to ‘envelope’.

The correlation_waveform dictionary is then saved as a NumPy binary file using the np.save function. The file name is constructed using various attributes of the instance (self) and the input parameters.

Finally, the correlation_waveform dictionary is returned.
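Saving a dictionary with np.save has one practical consequence worth showing: NumPy wraps the dictionary in a zero-dimensional object array, so reading it back needs allow_pickle=True and a call to .item(). The sketch below uses a made-up file name and made-up contents; only the np.save and np.load behaviour is taken as given.

```python
import tempfile
from pathlib import Path

import numpy as np

envelope = True
# Mirrors the waveform-type switch described above.
wvf_type = "envelope" if envelope else "waveform"

# Made-up contents: one stacked waveform and one empty cluster.
correlation_waveform = {1: np.array([0.0, 0.5, 1.0]), 2: None}

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; the real one is built from instance attributes.
    path = Path(tmp) / f"stacked_{wvf_type}_demo.npy"
    np.save(path, correlation_waveform, allow_pickle=True)
    # np.load returns a 0-d object array; .item() recovers the dictionary.
    restored = np.load(path, allow_pickle=True).item()
```

Without allow_pickle=True, np.load raises a ValueError for object arrays, which is a common stumbling block when reloading dictionaries saved this way.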

plot_correlation_waveforms(df_preds, correlations, sort_type, sort_filter, envelope=False)

Plot the correlation waveforms for each cluster.

Parameters:
  • df_preds (pandas.DataFrame) – The DataFrame containing the predictions.

  • correlations (dict) – A dictionary containing the correlations for different clusters.

  • sort_type (str) – The type of sorting to apply.

  • sort_filter (int) – The filter value for sorting.

  • envelope (bool, optional) – Whether to use the envelope. Defaults to False.

This function plots the correlation waveforms for each cluster. It first calculates the correlation waveform using the process_waveform_correlations_stacked_waveform method, then creates a figure with one subplot per cluster.

If a cluster has no correlations, its subplot title is set to ‘Cluster {cluster_number} - Empty Traces’. Otherwise, the subplot shows the shifted and corrected waveforms, the centroid waveform, and the correlation waveform for that cluster. The subplot title includes the number of traces, the average cross-correlation coefficient, and the type of waveform. The figure legend labels the shifted and corrected waveforms, the centroid waveform, and the correlation waveform.

The figure is saved as a PNG image with a unique file name based on the data network, station, location, network name, ICA components, clustering method, and waveform type. The plot is then displayed using plt.show().

plot_correlation_frequency(df_preds, correlations_waveforms_distance_all)

Plot the correlation frequency for each cluster.

Parameters:
  • df_preds (pandas.DataFrame) – The DataFrame containing the predictions.

  • correlations_waveforms_distance_all (dict) – A dictionary containing the correlations and waveforms distance for different clusters.

This function plots the correlation frequency for each cluster. It creates a figure with subplots for each cluster. If the cluster has no correlations, the subplot title is set to ‘Cluster {cluster_number} - Empty Traces’. Otherwise, it calculates the absolute value of the correlation_value for each correlation in the cluster and plots a histogram of the data using seaborn. The subplot title includes the cluster number. The figure is displayed using plt.show().
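The data preparation behind such a histogram can be sketched without any plotting. The example below takes the absolute value of each correlation_value and bins the results with np.histogram in place of seaborn. The list-of-dictionaries layout and the sample values are made up for illustration.

```python
import numpy as np

# Made-up correlations for a single cluster.
cluster_correlations = [
    {"correlation_value": -0.91},
    {"correlation_value": 0.15},
    {"correlation_value": 0.88},
    {"correlation_value": -0.42},
]

# The sign is discarded, so strong negative correlations count as strong.
abs_values = np.abs([c["correlation_value"] for c in cluster_correlations])

# Five equal-width bins over the possible range of |coefficient|.
counts, edges = np.histogram(abs_values, bins=5, range=(0.0, 1.0))
```

Fixing the range to 0 through 1 keeps the bins comparable across clusters, which matters when the subplots are read side by side.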

plot_correlation_shift(correlations: dict, clusters: list, within_cluster_number: int = None)

Plot the correlation shift for each cluster.

Parameters:
  • correlations (dict) – A dictionary containing the correlations for each cluster.

  • clusters (list) – A list of clusters.

  • within_cluster_number (int, optional) – The number of correlations to plot within each cluster. Defaults to None.