Feature Request: Add Built-in Subsampling Methods

**Is your feature request related to a problem? Please describe.**
I'm working with periodic time series data (e.g., EGG data) that is sampled at a high frequency, so I often need to lower the sampling frequency or reduce the number of time points. It would be great to have built-in subsampling methods for periodic time series in `tslearn`. This would make it much easier to preprocess data like EGG signals directly within the library.

**Describe the solution you'd like**
I would like to see built-in subsampling methods for time series in `tslearn`.
Requested methods include:

- Simple sampling frequency change by applying functions such as 'mean', 'max', 'overlapping mean', or 'uniform' (selecting every N-th point). For example:

```python
import numpy as np


def subsample_data(time_series, reduction_factor=2, method='nth'):
    """
    Subsamples a 1D time series by an integer reduction factor.

    Parameters:
        time_series (numpy array): The original 1D data.
        reduction_factor (int): Number of original samples per output sample
            (e.g., 250 to go from 250 Hz to 1 Hz).
        method (str): The method to use for subsampling. Options are 'mean', 'max', 'overlapping_mean', or 'nth'.

    Returns:
        numpy array: The subsampled data.
    """
    time_series = np.asarray(time_series)
    # Drop trailing samples that do not fill a complete window, so reshape cannot fail
    n_complete = (len(time_series) // reduction_factor) * reduction_factor
    windows = time_series[:n_complete].reshape(-1, reduction_factor)

    if method == 'mean':
        # Take the mean of every 'reduction_factor' samples
        subsampled_time_series = np.mean(windows, axis=1)
    elif method == 'max':
        # Take the max of every 'reduction_factor' samples
        subsampled_time_series = np.max(windows, axis=1)
    elif method == 'overlapping_mean':
        # Mean over windows of 1.5 * reduction_factor samples, stepping by
        # reduction_factor, so consecutive windows overlap; the last windows
        # are truncated at the end of the series
        window_size = reduction_factor + reduction_factor // 2
        subsampled_time_series = np.array([np.mean(time_series[i: i + window_size]) for i in range(0, len(time_series), reduction_factor)])
    elif method == 'nth':
        # Take every 'reduction_factor'-th sample
        subsampled_time_series = time_series[::reduction_factor]
    else:
        raise ValueError(f"Unknown subsampling method: {method!r}")

    return subsampled_time_series
```
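The function above works on a single 1D series. As a rough sketch of how the same block-wise aggregation might extend to a whole dataset, the example below assumes a 3D array of shape `(n_ts, sz, d)` (number of series, length, dimensions). The name `block_reduce_dataset` and its `reducer` parameter are hypothetical and only illustrate the idea; they are not part of `tslearn`.

```python
import numpy as np


def block_reduce_dataset(X, reduction_factor=2, reducer=np.mean):
    """Aggregate non-overlapping windows along the time axis of a 3D dataset.

    Hypothetical helper: assumes X has shape (n_ts, sz, d).
    """
    X = np.asarray(X, dtype=float)
    n_ts, sz, d = X.shape
    # Keep only the samples that fill complete windows
    n_blocks = sz // reduction_factor
    trimmed = X[:, :n_blocks * reduction_factor, :]
    # Split the time axis into (n_blocks, reduction_factor) and reduce each window
    blocks = trimmed.reshape(n_ts, n_blocks, reduction_factor, d)
    return reducer(blocks, axis=2)


# Two univariate series of length 10, reduced by a factor of 5
X = np.arange(20, dtype=float).reshape(2, 10, 1)
X_mean = block_reduce_dataset(X, reduction_factor=5)             # shape (2, 2, 1)
X_max = block_reduce_dataset(X, reduction_factor=5, reducer=np.max)
```

Passing the reducer as a callable keeps 'mean' and 'max' in one code path; any NumPy reduction that accepts an `axis` argument should work the same way.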

- Amplitude-preserving subsampling methods specific to periodic time series, inspired by signal processing, such as applying a low-pass (anti-aliasing) 'Nyquist' filter before subsampling. This option would make it easier to preprocess and analyze periodic time series data directly with `tslearn`. For example:

```python
import numpy as np
from scipy import signal


def nyquist_subsample(time_series, reduction_factor, sampling_rate):
    """
    Downsample a time series using the Nyquist principle with anti-aliasing.

    Parameters
    ----------
    time_series : np.ndarray
        The input time series (1D or 2D), with time along the first axis.
    reduction_factor : int
        The factor by which to reduce the number of samples. Must be greater
        than 1, otherwise the normalized cutoff below is not a valid value.
    sampling_rate : float
        The original sampling rate (Hz).

    Returns
    -------
    np.ndarray
        The subsampled data.
    """
    time_series = np.asarray(time_series, dtype=float)

    # Calculate the new sampling rate after reduction
    new_sampling_rate = sampling_rate / reduction_factor

    # Design a low-pass Butterworth filter to prevent aliasing:
    # the cutoff is the Nyquist frequency of the new sampling rate
    nyquist = 0.5 * sampling_rate
    cutoff = 0.5 * new_sampling_rate
    normalized_cutoff = cutoff / nyquist
    b, a = signal.butter(N=5, Wn=normalized_cutoff, btype='low')

    # Apply the zero-phase filter along the time axis
    filtered = signal.filtfilt(b, a, time_series, axis=0)

    # Select every Nth sample along the time axis
    subsampled_time_series = filtered[::reduction_factor]

    return subsampled_time_series
```
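For comparison, SciPy already offers `scipy.signal.decimate`, which combines a low-pass anti-aliasing filter with downsampling in one call. The sketch below is only an illustration of that existing function on a synthetic 1 Hz sine; the sampling rate, reduction factor, and signal are made-up example values, and whether a built-in `tslearn` method should wrap it is an open design question.

```python
import numpy as np
from scipy import signal

fs = 250.0                          # example original sampling rate (Hz)
q = 5                               # example reduction factor -> 50 Hz
t = np.arange(0, 4, 1 / fs)         # 4 seconds, 1000 samples
x = np.sin(2 * np.pi * 1.0 * t)     # synthetic 1 Hz sine, amplitude 1

# Zero-phase IIR anti-aliasing filter followed by keeping every q-th sample
y = signal.decimate(x, q, ftype='iir', zero_phase=True, axis=0)

# Naive "every q-th sample" subsampling for comparison
x_nth = x[::q]

print(y.shape)                      # same length as x_nth
print(np.max(np.abs(y[50:150])))    # amplitude away from the edges stays close to 1
```

Because the 1 Hz component lies far below the new Nyquist frequency (25 Hz), the filtered and naively subsampled signals should nearly coincide here; the difference becomes relevant when the input contains energy above the new Nyquist frequency.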

This functionality would be especially useful for biomedical signals (e.g., EGG or EKG data) and other periodic time series where both frequency and amplitude information are important.
