Overview
SimSort is a deep learning-based framework for spike sorting, pre-trained with large-scale biologically realistic simulations of extracellular recordings.
Currently, SimSort supports tetrode (4-channel) recordings. SimSort 2.0 for Neuropixels is on the way!
How To Install SimSort
Installation (Linux/macOS/Windows)
We recommend installing SimSort in a conda environment with PyTorch + CUDA for optimal performance. CPU-only inference is supported but significantly slower.
Step 1. Clone the repository
git clone https://github.com/SimSortTool/SimSort-Tetrode.git
cd SimSort-Tetrode
Step 2. Set up environment
# Create and activate conda environment
conda create -n SimSort python=3.10 -y
conda activate SimSort
# Install Python dependencies
pip install -r model/requirements.txt
Note: you also need to install PyTorch by following the official setup guide for your system configuration (OS, CUDA version).
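After installing PyTorch, you can quickly verify whether the GPU build is usable; False simply means SimSort will fall back to (slower) CPU-only inference:
# Quick sanity check of the PyTorch install and CUDA availability
import torch
print(torch.__version__)
print(torch.cuda.is_available())   # False -> CPU-only inference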
Step 3. Download pre-trained models (Manual)
Please manually download the two pre-trained model zip files (detector and extractor), then extract both into the folder:
SimSort-Tetrode/model/simsort_pretrained/
After extraction, the directory structure should look like this:
model/simsort_pretrained/
├── detector_bbp_L1-L5-8192/
│   ├── detection_config.yaml
│   ├── detection_aug_config.yaml
│   └── saved_models/
│       ├── checkpoint.pth
│       └── args.yaml
└── extractor_bbp_L1-L5-8192/
    ├── config.yaml
    ├── aug_config.yaml
    └── saved_models/
        ├── checkpoint.pth
        └── args.yaml
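As an optional sanity check, the following short script (run from the repository root, relying only on the layout shown above) verifies that both checkpoints are in place:
# Check that the pre-trained checkpoints match the directory layout above
from pathlib import Path
base = Path('model/simsort_pretrained')
for name in ('detector_bbp_L1-L5-8192', 'extractor_bbp_L1-L5-8192'):
    ckpt = base / name / 'saved_models' / 'checkpoint.pth'
    print(ckpt, '->', 'found' if ckpt.exists() else 'MISSING')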
Running SimSort on Your Own Neural Data
A worked example is provided in model/SimSort_Tool_Demo.ipynb. You can run the Jupyter notebook directly from that location.
Step 1. Load Your Recording
SimSort uses SpikeInterface to load neural recordings, supporting common formats such as .plx, .mda, .npy, and binary .bin files.
Here's an example of loading a .plx recording:
import spikeinterface.extractors as se
from task.custom_sorting import SortingTask
recording_file = r'custom_data/4chTetrodeDemoPLX.plx'
recording = se.PlexonRecordingExtractor(recording_file, stream_name='TETWB')
For more formats, see: Supported File Formats
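If your data lives in a plain NumPy array rather than a vendor format, a minimal sketch like the following also works; the file name and sampling rate here are placeholders, and the array is assumed to have shape (num_samples, num_channels):
# Minimal sketch (hypothetical file and sampling rate): wrap raw traces
# of shape (num_samples, num_channels) as a SpikeInterface recording
import numpy as np
from spikeinterface.core import NumpyRecording
traces = np.load('custom_data/my_tetrode.npy')        # e.g. a 4-channel tetrode array
recording = NumpyRecording(traces, sampling_frequency=30000)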
Step 2. Spike Sorting
sorting = SortingTask(recording=recording, verbose=False)
sorting.run()
SimSort's sorting pipeline includes:
- Data Preprocessing: Filters, normalizes, and whitens the data (see the sketch after this list).
- Spike Detection: Identifies spike events using a pre-trained model.
- Spike Identification: Extracts latent features of waveforms using a pre-trained model, reduces dimensionality, and clusters spikes into neuronal units.
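These stages run inside SortingTask, so you do not need to call them yourself. For intuition only, the preprocessing step corresponds roughly to the following SpikeInterface operations; the cutoff frequencies are illustrative assumptions, not SimSort's internal settings:
# Rough SpikeInterface equivalent of the preprocessing stage (illustrative only)
import spikeinterface.preprocessing as spre
rec_filtered = spre.bandpass_filter(recording, freq_min=300, freq_max=6000)  # filter
rec_normalized = spre.zscore(rec_filtered)                                   # normalize
rec_whitened = spre.whiten(rec_normalized)                                   # whiten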
Step 3. Save Results
sorting.save_results(output_dir='./simsort_results')
This saves the spike waveforms, timestamps, and unit IDs to disk.
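To see exactly which files were produced, you can list the output directory; the file names depend on SimSort's output format:
# Inspect the files written by save_results
import os
print(os.listdir('./simsort_results'))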
Advanced Usage
For more advanced usage such as model training and benchmarking (available on Linux/macOS or Windows via WSL):
- Training:
  # Train spike detection model
  bash scripts/train_detector.sh
  # Train spike identification model
  bash scripts/train_extractor.sh
- Benchmarking with Pre-trained Models:
  # Run spike sorting with pre-trained models
  bash scripts/run_sorting.sh
Questions?
Feel free to open issues on GitHub if you run into any problems!