Published 2024 | Submitted
Discussion Paper | Open Access

TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis

  • California Institute of Technology

Abstract

The field of general time series analysis has recently begun to explore unified modeling, where a common architectural backbone can be retrained on a specific task for a specific dataset. In this work, we approach unification from a complementary vantage point: unification across tasks and domains. To this end, we explore the impact of discrete, learned time series representations that enable generalist, cross-domain training. Our method, TOTEM, or TOkenized Time Series EMbeddings, proposes a simple tokenizer architecture that embeds time series data from varying domains using a discrete vectorized representation learned in a self-supervised manner. TOTEM works across multiple tasks and domains with minimal to no tuning. We study the efficacy of TOTEM with an extensive evaluation on 17 real-world time series datasets across 3 tasks. We evaluate both the specialist (i.e., training a model on each domain) and generalist (i.e., training a single model on many domains) settings, and show that TOTEM matches or outperforms previous best methods on several popular benchmarks. The code can be found at: https://github.com/SaberaTalukder/TOTEM.
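To make the core idea of the abstract concrete, the following is a minimal, hypothetical sketch of codebook-based time series tokenization in the spirit of TOTEM's discrete vectorized representation. The function names, patch length, and codebook size are illustrative assumptions, not the paper's actual architecture (which learns its codebook in a self-supervised manner rather than sampling it at random).

```python
import numpy as np

def tokenize(series, codebook, patch_len=4):
    """Split a 1-D series into non-overlapping patches and map each patch
    to the index of its nearest codebook vector (Euclidean distance).
    This mirrors the quantization step of a VQ-style tokenizer."""
    n_patches = len(series) // patch_len
    patches = series[: n_patches * patch_len].reshape(n_patches, patch_len)
    # Pairwise distances between patches and code vectors: (n_patches, codebook_size)
    dists = np.linalg.norm(patches[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def detokenize(tokens, codebook):
    """Reconstruct an approximate series by concatenating the code vectors
    selected by each discrete token."""
    return codebook[tokens].reshape(-1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))  # 16 code vectors of length 4 (toy stand-in for a learned codebook)
series = rng.normal(size=32)         # toy univariate time series
tokens = tokenize(series, codebook)  # 8 discrete tokens for 32 timesteps
recon = detokenize(tokens, codebook)
```

Because every domain's data is mapped into the same finite vocabulary of token indices, a single downstream model can be trained across domains, which is the generalist setting evaluated in the paper.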

Copyright and License

CC BY 4.0

Files

2402.16412v1.pdf (1.5 MB)
md5:a669da2044be1fe6fc475e4e6f4b8ed3
Created: June 17, 2024
Modified: June 17, 2024