scatcluster.processing.ica

ICA processing module

Classes

ICA

Functions

round_nearest(x, a)

Rounds a number x to the nearest multiple of a.

Module Contents

scatcluster.processing.ica.round_nearest(x, a)

Rounds a number x to the nearest multiple of a.

Parameters:
  • x (float) – The number to be rounded.

  • a (float) – The multiple to round to.

Returns:

The rounded number.

Return type:

float
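A minimal sketch of the behaviour described above, assuming the common divide, round, and multiply approach. The name round_nearest_sketch is hypothetical, and the actual implementation in scatcluster may differ.

```python
def round_nearest_sketch(x: float, a: float) -> float:
    """Round x to the nearest multiple of a (illustrative only)."""
    # Express x in units of a, round to the nearest integer, then scale back.
    return float(round(x / a) * a)


print(round_nearest_sketch(7, 5))
print(round_nearest_sketch(8, 5))
```

Python's built-in round() resolves exact ties to the nearest even integer, so values that fall exactly halfway between two multiples may not round up in this sketch.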

class scatcluster.processing.ica.ICA

_get_index_from_UTC_timestamp(utc_timestamp: str)

Get the index of the latest entry in the data_times list that is earlier than the given UTC timestamp.

Parameters:

utc_timestamp (str) – The UTC timestamp to compare against the data_times list.

Returns:

The index of the latest entry in the data_times list that is earlier than the given UTC timestamp.

Return type:

int
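The lookup described above can be sketched with a binary search. This is an illustration under stated assumptions, not the library's code: it assumes the times are ISO 8601 strings sorted in ascending order, and the function name index_before and the ValueError for "no earlier entry" are choices made here.

```python
from bisect import bisect_left
from datetime import datetime
from typing import List


def index_before(data_times: List[str], utc_timestamp: str) -> int:
    """Index of the latest entry strictly earlier than utc_timestamp."""
    # Assumption: data_times is sorted ascending and ISO 8601 formatted.
    times = [datetime.fromisoformat(t) for t in data_times]
    target = datetime.fromisoformat(utc_timestamp)
    # bisect_left returns the first position whose value is >= target,
    # so the entry just before it is the latest one that is < target.
    position = bisect_left(times, target) - 1
    if position < 0:
        raise ValueError("no entry earlier than the given timestamp")
    return position


example_times = [
    "2020-01-01T00:00:00",
    "2020-01-01T00:10:00",
    "2020-01-01T00:20:00",
]
print(index_before(example_times, "2020-01-01T00:15:00"))
```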

process_ICA_single(num_ICA: int, return_data: bool = False, exclude_timestamps: List[str] | None = None, exclude_timestamps_skip: int = 5, **kwargs)

Process the data for a single run of ICA.

This method performs Independent Component Analysis (ICA) for a specified number of components. It fits the ICA model to the data after optionally excluding certain timestamps, then saves the trained model and the resulting features. The fit metrics, model, and features can optionally be returned.

Parameters:
  • num_ICA (int) – The number of Independent Components to reduce the data to.

  • return_data (bool) – Flag indicating whether to return data after processing. Default is False.

  • exclude_timestamps (Optional[List[str]]) – List of timestamps to exclude from the data before fitting.

  • exclude_timestamps_skip (int) – The number of data points to skip for each excluded timestamp.

  • **kwargs – Additional keyword arguments that can be passed to the FastICA model.

Returns:

If return_data is True, a tuple containing:
  • The Explained Variance (%) of the ICA.

  • The Mean Squared Error (MSE) of the ICA.

  • The trained ICA model.

  • The extracted features after transformation.

Return type:

Tuple

Side Effects:
  • Saves the trained ICA model in a pickle file.

  • Saves the features in a numpy .npz file.
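The fit, transform, and save steps described above can be sketched with scikit-learn's FastICA, which the parameter list names as the underlying model. Everything else here is an assumption made for illustration: the function name fit_ica_sketch, the output file names, and the exact definitions of explained variance and MSE (both computed from the reconstruction error) are not taken from the library.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.decomposition import FastICA


def fit_ica_sketch(data, num_ica, out_dir=None, **kwargs):
    """Fit FastICA, save model and features, return (EV %, MSE, model, features)."""
    # Extra keyword arguments are forwarded to FastICA unchanged.
    model = FastICA(n_components=num_ica, **kwargs)
    features = model.fit_transform(data)  # shape: (n_samples, num_ica)
    reconstructed = model.inverse_transform(features)

    # Assumed metric definitions, based on the reconstruction error.
    residual = np.sum((data - reconstructed) ** 2)
    total = np.sum((data - data.mean(axis=0)) ** 2)
    explained_variance = 100.0 * (1.0 - residual / total)
    mse = float(np.mean((data - reconstructed) ** 2))

    # Hypothetical file names; a temporary directory is used if none is given.
    out_dir = Path(out_dir or tempfile.mkdtemp())
    with open(out_dir / f"ica_{num_ica}_model.pickle", "wb") as handle:
        pickle.dump(model, handle)
    np.savez(out_dir / f"ica_{num_ica}_features.npz", features=features)
    return explained_variance, mse, model, features


# Synthetic data: three non-Gaussian sources mixed into six channels.
rng = np.random.default_rng(0)
sources = rng.laplace(size=(500, 3))
data = sources @ rng.normal(size=(3, 6)) + 0.01 * rng.normal(size=(500, 6))
ev, mse, model, features = fit_ica_sketch(data, 3, random_state=0, max_iter=1000)
print(features.shape)
```

Passing random_state through **kwargs, as in the usage lines, is one way to make repeated runs comparable.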

process_ICA_range(exclude_timestamps: List[str] | None = None, exclude_timestamps_skip: int = 3, **kwargs) -> None

Run Independent Component Analysis (ICA) for a range of component counts.

Parameters:
  • exclude_timestamps (Optional[List[str]]) – List of timestamps to exclude from the data.

  • exclude_timestamps_skip (int) – The number of data points to skip for each excluded timestamp.

  • **kwargs – Additional keyword arguments.
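One possible reading of the exclusion parameters is sketched below. This is an assumption, not the documented behaviour: it supposes that each excluded timestamp is mapped to the index of the latest earlier sample and that exclude_timestamps_skip consecutive samples are dropped from there. The function name build_keep_mask is hypothetical, and the times are assumed to be sorted ISO 8601 strings.

```python
from bisect import bisect_left
from datetime import datetime
from typing import List, Optional


def build_keep_mask(
    data_times: List[str],
    exclude_timestamps: Optional[List[str]],
    skip: int,
) -> List[bool]:
    """Return a per-sample flag: True to keep the sample, False to exclude it."""
    times = [datetime.fromisoformat(t) for t in data_times]
    keep = [True] * len(times)
    for stamp in exclude_timestamps or []:
        target = datetime.fromisoformat(stamp)
        # Latest sample earlier than the excluded timestamp (clamped to 0).
        start = max(bisect_left(times, target) - 1, 0)
        # Assumed meaning of "skip": drop this many samples from start.
        for i in range(start, min(start + skip, len(times))):
            keep[i] = False
    return keep


sample_times = [f"2020-01-01T00:{minute:02d}:00" for minute in range(0, 60, 10)]
mask = build_keep_mask(sample_times, ["2020-01-01T00:15:00"], skip=2)
print(mask)
```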

preload_ICA(num_ICA: int) -> None

Load a pre-calculated ICA and set the variables that depend on it.

Parameters:

num_ICA (int) – The number of independent components of the pre-calculated ICA to load.

plot_ICA(**kwargs) -> None

Visualise the independent components.

plot_ICA_zoom(ICA_letter: str, zoom_time_start: str, zoom_time_end: str, **kwargs) -> None

Plots a zoomed view of a specific ICA component.

Parameters:
  • ICA_letter (str) – The letter corresponding to the ICA component to plot.

  • zoom_time_start (str) – The start time of the zoomed view.

  • zoom_time_end (str) – The end time of the zoomed view.

  • **kwargs – Additional keyword arguments to pass to plt.subplots().

Raises:

IndexError – If the specified ICA component does not exist.

Note

The ICA_letter argument should be an uppercase letter corresponding to the ICA component to plot.
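The relationship between the letter and the component can be sketched as follows, assuming that "A" refers to the first component, "B" to the second, and so on. The helper name ica_letter_to_index is hypothetical, and the mapping itself is an assumption based on the note above.

```python
import string


def ica_letter_to_index(ica_letter: str, num_ica: int) -> int:
    """Map an uppercase letter to a zero-based component index."""
    if len(ica_letter) != 1 or ica_letter not in string.ascii_uppercase:
        raise ValueError("ICA_letter must be a single uppercase letter")
    index = ord(ica_letter) - ord("A")
    # Mirror the documented IndexError for a component that does not exist.
    if index >= num_ica:
        raise IndexError(f"component {ica_letter} does not exist")
    return index


print(ica_letter_to_index("C", 5))
```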

plot_ica_contribution(**kwargs) -> None

Visualise the ICA contribution to each cluster.

list_linkages()

Lists all the linkage files for the given data network, station, location, and network name.

This function uses the glob function to find all the linkage files in the clustering directory that match the pattern f'{self.data_savepath}clustering/{self.data_network}_{self.data_station}_{self.data_location}_' f'{self.network_name}_ICA_*linkage*'. It prints the list of linkage files found.

Parameters:

self (object) – The instance of the class.
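The pattern matching described for list_linkages can be sketched as a standalone function. The function name find_linkage_files, the example attribute values, and the example file names are hypothetical; only the pattern itself comes from the description above. Because the pattern joins data_savepath and "clustering/" directly, the save path needs to end with a path separator.

```python
import glob
import os
import tempfile


def find_linkage_files(data_savepath, data_network, data_station,
                       data_location, network_name):
    """Return the sorted linkage files matching the documented pattern."""
    pattern = (f'{data_savepath}clustering/{data_network}_{data_station}_'
               f'{data_location}_{network_name}_ICA_*linkage*')
    return sorted(glob.glob(pattern))


# Build a throwaway directory with one matching and one non-matching file.
savepath = tempfile.mkdtemp() + os.sep
os.makedirs(os.path.join(savepath, "clustering"))
for name in ("XX_STA_00_net_ICA_5_linkage.npy", "XX_STA_00_net_ICA_5_features.npz"):
    open(os.path.join(savepath, "clustering", name), "w").close()

found = find_linkage_files(savepath, "XX", "STA", "00", "net")
print([os.path.basename(path) for path in found])
```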