Barnum, George and Talukder, Sabera and Yue, Yisong (2020) On the Benefits of Early Fusion in Multimodal Representation Learning. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20210119-161629149
PDF - Submitted Version. Creative Commons Attribution. 1MB
Abstract
Intelligently reasoning about the world often requires integrating data from multiple modalities, as any individual modality may contain unreliable or incomplete information. Prior work in multimodal learning fuses input modalities only after significant independent processing. On the other hand, the brain performs multimodal processing almost immediately. This divide between conventional multimodal learning and neuroscience suggests that a detailed study of early multimodal fusion could improve artificial multimodal representations. To facilitate the study of early multimodal fusion, we create a convolutional LSTM (C-LSTM) network architecture that simultaneously processes both audio and visual inputs, and allows us to select the layer at which audio and visual information is combined. Our results demonstrate that immediate fusion of audio and visual inputs in the initial C-LSTM layer results in higher-performing networks that are more robust to the addition of white noise in both audio and visual inputs.
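The idea of a selectable fusion layer can be illustrated with a small sketch. This is not the authors' implementation: plain dense `tanh` layers stand in for the C-LSTM layers, and every name here (`build_network`, `forward`, `add_white_noise`, `fusion_layer`, the input sizes) is a hypothetical choice made for illustration. It only shows how one parameter can move the point at which the audio and visual streams are concatenated, and how white noise might be added to both inputs.

```python
import numpy as np


def make_layer(rng, n_in, n_out):
    """Random dense weights; a simple stand-in for one C-LSTM layer (assumption)."""
    return rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)


def build_network(rng, audio_dim, visual_dim, hidden, n_layers, fusion_layer):
    """Build weights for a network whose two streams merge at `fusion_layer`.

    fusion_layer=0 concatenates the raw inputs before the first layer
    (immediate fusion); fusion_layer=k runs k separate layers per modality first.
    """
    if not 0 <= fusion_layer < n_layers:
        raise ValueError("fusion_layer must be in [0, n_layers)")
    audio, visual, joint = [], [], []
    a_dim, v_dim = audio_dim, visual_dim
    # Modality-specific layers before the fusion point.
    for _ in range(fusion_layer):
        audio.append(make_layer(rng, a_dim, hidden))
        visual.append(make_layer(rng, v_dim, hidden))
        a_dim = v_dim = hidden
    # Joint layers operating on the concatenated representation.
    j_dim = a_dim + v_dim
    for _ in range(n_layers - fusion_layer):
        joint.append(make_layer(rng, j_dim, hidden))
        j_dim = hidden
    return {"audio": audio, "visual": visual, "joint": joint}


def forward(net, audio, visual):
    """Run both streams separately, concatenate, then run the joint layers."""
    for w_a, w_v in zip(net["audio"], net["visual"]):
        audio = np.tanh(audio @ w_a)
        visual = np.tanh(visual @ w_v)
    fused = np.concatenate([audio, visual], axis=-1)
    for w in net["joint"]:
        fused = np.tanh(fused @ w)
    return fused


def add_white_noise(rng, x, std):
    """Add zero-mean Gaussian (white) noise with standard deviation `std`."""
    return x + rng.normal(0.0, std, size=x.shape)


rng = np.random.default_rng(0)
audio = rng.standard_normal((4, 16))   # batch of 4 hypothetical audio feature vectors
visual = rng.standard_normal((4, 64))  # batch of 4 hypothetical visual feature vectors

early = build_network(rng, 16, 64, hidden=32, n_layers=3, fusion_layer=0)
late = build_network(rng, 16, 64, hidden=32, n_layers=3, fusion_layer=2)

out_early = forward(early, audio, visual)
out_late = forward(late, audio, visual)

# Perturb both modalities, as one might when probing robustness to noise.
noisy_out = forward(early,
                    add_white_noise(rng, audio, 0.5),
                    add_white_noise(rng, visual, 0.5))
```

With `fusion_layer=0` all three layers see both modalities; with `fusion_layer=2` only the final layer does. The sketch uses untrained random weights, so it says nothing about which setting performs better.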
| Item Type: | Report or Paper (Discussion Paper) |
|---|---|
| Additional Information: | Attribution 4.0 International (CC BY 4.0). |
| Record Number: | CaltechAUTHORS:20210119-161629149 |
| Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20210119-161629149 |
| Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 107564 |
| Collection: | CaltechAUTHORS |
| Deposited By: | George Porter |
| Deposited On: | 20 Jan 2021 15:58 |
| Last Modified: | 20 Jan 2021 15:58 |