Convolutional Tensor-Train LSTM for Spatio-temporal Learning
Abstract
Learning from spatio-temporal data has numerous applications such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forecasting, because these tasks require learning long-term spatio-temporal correlations in the video sequence. In this paper, we propose a higher-order convolutional LSTM model that can efficiently learn these correlations, along with a succinct representation of the history. This is accomplished through a novel tensor-train module that performs prediction by combining convolutional features across time. To make this feasible in terms of computation and memory requirements, we propose a novel convolutional tensor-train decomposition of the higher-order model. This decomposition reduces the model complexity by jointly approximating a sequence of convolutional kernels as a low-rank tensor-train factorization. As a result, our model outperforms existing approaches, including the baseline models, while using only a fraction of their parameters. We achieve state-of-the-art performance across a wide range of applications and datasets, including multi-step video prediction on the Moving-MNIST-2 and KTH action datasets as well as early activity recognition on the Something-Something V2 dataset.
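As a rough illustration of why a low-rank tensor-train factorization saves parameters, the sketch below builds a tensor in the standard (non-convolutional) tensor-train format with NumPy and compares parameter counts. It is not the paper's convolutional tensor-train decomposition; the mode sizes, the rank, and the helper names `tt_reconstruct` and `tt_num_params` are hypothetical choices made for this example only.

```python
import numpy as np


def tt_reconstruct(cores):
    """Contract TT cores, each of shape (r_prev, n, r_next), into the full tensor."""
    result = cores[0]  # shape (1, n_1, r_1)
    for core in cores[1:]:
        # Contract the trailing rank axis of `result` with the leading rank axis of `core`.
        result = np.tensordot(result, core, axes=([-1], [0]))
    # Drop the two boundary rank axes, which have size 1 by construction.
    return result.squeeze(axis=(0, -1))


def tt_num_params(cores):
    """Total number of entries stored across all TT cores."""
    return sum(core.size for core in cores)


# Hypothetical sizes, chosen only for illustration.
dims = [8, 8, 8, 8]
rank = 3
ranks = [1] + [rank] * (len(dims) - 1) + [1]

rng = np.random.default_rng(0)
cores = [
    rng.standard_normal((ranks[k], dims[k], ranks[k + 1]))
    for k in range(len(dims))
]

full = tt_reconstruct(cores)
full_params = full.size          # 8**4 = 4096 entries in the dense tensor
tt_params = tt_num_params(cores)  # 24 + 72 + 72 + 24 = 192 entries in the cores

print(full.shape, full_params, tt_params)
```

For these sizes the dense tensor stores 4096 entries while the cores store 192, and the gap widens as the order of the tensor grows. The paper's model applies a convolutional analogue of this idea to a sequence of convolutional kernels; see the published PDF for the actual construction.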
Additional Information
This work was done while the first author, Jiahao Su, was an intern at NVIDIA. Project page: https://sites.google.com/nvidia.com/conv-tt-lstm. Su was also partially supported by the startup fund from the Department of Computer Science of the University of Maryland and by National Science Foundation IIS-1850220 CRII Award 030742-00001. Furong Huang was supported by Adobe, Capital One, and JP Morgan faculty fellowships.
Attached Files
Published - NeurIPS-2020-convolutional-tensor-train-lstm-for-spatio-temporal-learning-Paper.pdf
Supplemental Material - NeurIPS-2020-convolutional-tensor-train-lstm-for-spatio-temporal-learning-Supplemental.pdf
Files
Name | Size |
---|---|
md5:2ff6573258290550d09a54344d1a36b9 | 2.3 MB |
md5:c7552a4697986791f391008c235d073a | 3.1 MB |
Additional details
- Eprint ID
- 102272
- Resolver ID
- CaltechAUTHORS:20200402-134911700
- University of Maryland
- NSF
- IIS-1850220
- NSF
- 030742-00001
- Adobe
- Capital One
- JP Morgan
- Created
- 2020-04-02 (created from EPrint's datestamp field)
- Updated
- 2023-06-02 (created from EPrint's last_modified field)