Keypoint-MoSeq: parsing behavior by linking point tracking to pose dynamics

authors: Caleb Weinreb, Jonah E. Pearl, Sherry Lin, Mohammed Abdal Monium Osman, Libby Zhang, Sidharth Annapragada, Eli Conlin, Red Hoffmann, Sofia Makowska, Winthrop F. Gillis, Maya Jay, Shaokai Ye, Alexander Mathis, Mackenzie W. Mathis, Talmo Pereira, Scott W. Linderman, Sandeep Robert Datta
doi: 10.1038/s41592-024-02318-2

CITATION

Weinreb, C., Pearl, J. E., Lin, S., Osman, M. A. M., Zhang, L., Annapragada, S., Conlin, E., Hoffmann, R., Makowska, S., Gillis, W. F., Jay, M., Ye, S., Mathis, A., Mathis, M. W., Pereira, T., Linderman, S. W., & Datta, S. R. (2024). Keypoint-MoSeq: Parsing behavior by linking point tracking to pose dynamics. Nature Methods, 21(7), 1329–1339. https://doi.org/10.1038/s41592-024-02318-2

ABSTRACT

Keypoint tracking algorithms can flexibly quantify animal movement from videos obtained in a wide variety of settings. However, it remains unclear how to parse continuous keypoint data into discrete actions. This challenge is particularly acute because keypoint data are susceptible to high-frequency jitter that clustering algorithms can mistake for transitions between actions. Here we present keypoint-MoSeq, a machine learning-based platform for identifying behavioral modules (‘syllables’) from keypoint data without human supervision. Keypoint-MoSeq uses a generative model to distinguish keypoint noise from behavior, enabling it to identify syllables whose boundaries correspond to natural sub-second discontinuities in pose dynamics. Keypoint-MoSeq outperforms commonly used alternative clustering methods at identifying these transitions, at capturing correlations between neural activity and behavior and at classifying either solitary or social behaviors in accordance with human annotations. Keypoint-MoSeq also works in multiple species and generalizes beyond the syllable timescale, identifying fast sniff-aligned movements in mice and a spectrum of oscillatory behaviors in fruit flies. Keypoint-MoSeq, therefore, renders accessible the modular structure of behavior through standard video recordings.

fleeting notes


keypoint data are susceptible to high-frequency jitter that clustering algorithms mistake for transitions between actions

keypoint moseq finds syllables from keypoint data - unsupervised

uses a generative model to distinguish noise from behavior
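
rough sketch of the kind of generative model this refers to (my own paraphrase, not the paper's exact formulation): discrete syllable states switch between autoregressive pose dynamics, and the observed keypoints are the latent pose plus observation noise. all names and parameter values below are illustrative assumptions.

```python
# Toy simulation of a switching autoregressive model with noisy keypoint
# observations (illustrative only; not the paper's exact model).
import numpy as np

rng = np.random.default_rng(0)

T, D = 500, 4                 # timesteps, pose dimensions (e.g. 2 keypoints x 2 coords)
n_syllables = 3

# Each syllable has its own AR(1) dynamics and drift (assumed values).
A = [np.eye(D) * rng.uniform(0.85, 0.99) for _ in range(n_syllables)]
b = [rng.normal(scale=0.05, size=D) for _ in range(n_syllables)]

# "Sticky" transition matrix: high self-transition probability keeps
# each syllable lasting many frames.
P = np.full((n_syllables, n_syllables), 0.01)
np.fill_diagonal(P, 0.98)

z = np.zeros(T, dtype=int)        # discrete syllable sequence
x = np.zeros((T, D))              # latent (true) pose trajectory
y = np.zeros((T, D))              # observed keypoints = pose + jitter

for t in range(1, T):
    z[t] = rng.choice(n_syllables, p=P[z[t - 1]])
    x[t] = A[z[t]] @ x[t - 1] + b[z[t]] + rng.normal(scale=0.02, size=D)  # process noise
    y[t] = x[t] + rng.normal(scale=0.1, size=D)                           # keypoint jitter

# Inference goes the other way: given y, jointly infer x (denoised pose) and
# z (syllables), so jitter in y is explained as observation noise rather than
# as extra syllable transitions.
```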

MoSeq

  • uses unsupervised classification of 3D depth videos into motifs
  • to find syllables, it detects discontinuities in behavior at a timescale set by the user (see the sketch after this list)
  • this timescale influences how frequently syllables can transition
  • in mice, sub-second (to second) timescales work well for syllable boundaries
  • depth cameras are difficult to use!
  • using keypoints as inputs has been tricky because jitter in the pose estimates gets mistaken for transitions
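
the "timescale set by the user" comes down to how sticky the discrete state transitions are; toy numbers below (not keypoint-MoSeq's actual hyperparameter names) show how a self-transition probability maps onto an expected syllable duration.

```python
# Toy relation between a self-transition ("stickiness") probability and the
# expected syllable duration: in a Markov chain, expected dwell time is
# 1 / (1 - p_self) frames. Frame rate is an assumed value.
fps = 30

for p_self in (0.90, 0.97, 0.99):
    expected_frames = 1.0 / (1.0 - p_self)
    print(f"p_self={p_self:.2f} -> ~{expected_frames:5.1f} frames "
          f"(~{expected_frames / fps:.2f} s per syllable)")

# p_self=0.90 -> ~10 frames (~0.33 s): sub-second syllables, as in mice
# p_self=0.99 -> ~100 frames (~3.3 s): multi-second states instead
```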

redeveloped the moseq model to capture behavioral syllables at multiple timescales

similar sub-second discontinuities were present in both the depth and keypoint data
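
rough sketch of one generic way to look for such discontinuities (a simple changepoint score, not the paper's exact analysis): score each frame by the smoothed magnitude of frame-to-frame pose change and take peaks as candidate transitions.

```python
# Generic changepoint score for pose data: smoothed frame-to-frame change
# magnitude, with peaks suggesting candidate behavioral transitions.
# (Window sizes and the 90th-percentile threshold are assumptions.)
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

def changepoint_score(poses, sigma=2):
    """poses: (T, D) array of pose features (e.g. egocentric keypoints)."""
    velocity = np.linalg.norm(np.diff(poses, axis=0), axis=1)  # (T-1,)
    return gaussian_filter1d(velocity, sigma=sigma)

def candidate_transitions(poses, min_gap=5):
    score = changepoint_score(poses)
    peaks, _ = find_peaks(score, distance=min_gap, height=np.percentile(score, 90))
    return peaks  # frame indices of putative transitions
```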

running the original moseq model on keypoints produced a lot of high-frequency state switching driven by tracking artifacts

  • smoothing did not get rid of this; it blurred true transitions and made syllable boundaries harder to detect (toy illustration below)
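
toy illustration of that trade-off (my own example, not from the paper): a smoothing filter that suppresses jitter also smears a genuine step-like transition, so the boundary becomes harder to localize.

```python
# Toy example: low-pass smoothing suppresses keypoint jitter but also blurs
# a true behavioral transition (a step change in a pose coordinate).
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(1)

T = 200
true_pose = np.where(np.arange(T) < 100, 0.0, 1.0)     # sharp transition at t=100
observed = true_pose + rng.normal(scale=0.15, size=T)  # add keypoint jitter

smoothed = gaussian_filter1d(observed, sigma=5)

# Frame-to-frame change: the raw trace has noise spikes everywhere, while the
# smoothed trace spreads the single true jump over many frames, so a simple
# threshold on the derivative misses the exact boundary either way.
raw_jump = np.abs(np.diff(observed))
smooth_jump = np.abs(np.diff(smoothed))
print("frames where smoothed derivative > half its max:",
      np.sum(smooth_jump > 0.5 * smooth_jump.max()))
```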

keypoint moseq produces different syllables each time it runs

  • so run multiple fits and use a likelihood metric to compare and select among them (sketch below)
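
sketch of that practice; `fit_keypoint_model` below is a hypothetical stand-in for the actual fitting call, the real keypoint-MoSeq API will look different.

```python
# Run several model fits with different random seeds and keep the one with
# the best log-likelihood. `fit_keypoint_model` is a hypothetical placeholder
# for the actual fitting routine.
def fit_keypoint_model(data, seed):
    """Placeholder: fit the model and return (model, log_likelihood)."""
    raise NotImplementedError

def select_best_fit(data, n_restarts=10):
    fits = []
    for seed in range(n_restarts):
        model, log_lik = fit_keypoint_model(data, seed=seed)
        fits.append((log_lik, seed, model))
    fits.sort(key=lambda f: f[0], reverse=True)   # highest likelihood first
    best_log_lik, best_seed, best_model = fits[0]
    return best_model, best_seed, best_log_lik
```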

unsupervised classification of behavior - a way to learn how the brain generates self-motivated behaviors

  • boundaries serve as timestamps for alignment
  • found that dopamine dynamics increased at onset of syllables defined by moseq
    • this suggests that moseq syllables could be used as landmarks for neural data analysis
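
sketch of using syllable onsets as alignment events for a neural trace (e.g., dopamine photometry); the function names, window sizes, and fake data are illustrative.

```python
# Align a neural signal (e.g., dopamine photometry) to syllable onsets.
import numpy as np

def onset_indices(syllable_labels, syllable_id):
    """Frames where `syllable_id` starts (label changes into it)."""
    labels = np.asarray(syllable_labels)
    return np.flatnonzero((labels[1:] == syllable_id) & (labels[:-1] != syllable_id)) + 1

def onset_triggered_average(signal, onsets, pre=15, post=30):
    """Average `signal` in a window of (pre + post) frames around each onset."""
    windows = [signal[t - pre:t + post]
               for t in onsets if t - pre >= 0 and t + post <= len(signal)]
    return np.mean(windows, axis=0) if windows else None

# Example with fake data: 1000 frames of labels and a matching trace.
rng = np.random.default_rng(2)
labels = rng.integers(0, 5, size=1000)
trace = rng.normal(size=1000)
avg = onset_triggered_average(trace, onset_indices(labels, syllable_id=3))
```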

keypoint moseq generalized across experimental setups and species

keypoint moseq handles tracking errors for behavioral classification

  • infers noise from learned patterns of animal motion

highlights


📚