Published 2024 | Submitted
Discussion Paper | Open Access

TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis

  • California Institute of Technology

Abstract

The field of general time series analysis has recently begun to explore unified modeling, where a common architectural backbone can be retrained on a specific task for a specific dataset. In this work, we approach unification from a complementary vantage point: unification across tasks and domains. To this end, we explore the impact of discrete, learned time series representations that enable generalist, cross-domain training. Our method, TOTEM, or TOkenized Time Series EMbeddings, proposes a simple tokenizer architecture that embeds time series data from varying domains using a discrete vectorized representation learned in a self-supervised manner. TOTEM works across multiple tasks and domains with minimal to no tuning. We study the efficacy of TOTEM with an extensive evaluation on 17 real-world time series datasets across 3 tasks. We evaluate both the specialist (i.e., training a model on each domain) and generalist (i.e., training a single model on many domains) settings, and show that TOTEM matches or outperforms previous best methods on several popular benchmarks. The code can be found at: https://github.com/SaberaTalukder/TOTEM.
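To make the core idea of the abstract concrete, the following is a minimal, hypothetical sketch of codebook-based time series tokenization in the spirit of TOTEM's discrete vectorized representation. The function names, patch length, and codebook size are illustrative assumptions, not the paper's actual architecture (which learns its codebook in a self-supervised manner rather than sampling it at random).

```python
import numpy as np

def tokenize(series, codebook, patch_len=4):
    """Split a 1-D series into non-overlapping patches and map each patch
    to the index of its nearest codebook vector (Euclidean distance).
    This mirrors the quantization step of a VQ-style tokenizer."""
    n_patches = len(series) // patch_len
    patches = series[: n_patches * patch_len].reshape(n_patches, patch_len)
    # Pairwise distances between patches and code vectors: (n_patches, codebook_size)
    dists = np.linalg.norm(patches[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def detokenize(tokens, codebook):
    """Reconstruct an approximate series by concatenating the code vectors
    selected by each discrete token."""
    return codebook[tokens].reshape(-1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))  # 16 code vectors of length 4 (toy stand-in for a learned codebook)
series = rng.normal(size=32)         # toy univariate time series
tokens = tokenize(series, codebook)  # 8 discrete tokens for 32 timesteps
recon = detokenize(tokens, codebook)
```

Because every domain's data is mapped into the same finite vocabulary of token indices, a single downstream model can be trained across domains, which is the generalist setting evaluated in the paper.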

Copyright and License

CC BY 4.0

Files

2402.16412v1.pdf (1.5 MB)
md5:a669da2044be1fe6fc475e4e6f4b8ed3
Created: June 17, 2024
Modified: June 17, 2024