CaltechAUTHORS
  A Caltech Library Service

Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling

Maire, Michael and Yu, Stella X. and Perona, Pietro (2015) Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling. In: Computer Vision -- ACCV 2014. Lecture Notes in Computer Science. No. 9006. Springer, Cham, Switzerland, pp. 273-287. ISBN 978-3-319-16816-6. https://resolver.caltech.edu/CaltechAUTHORS:20151030-125936998

Abstract

We frame the task of predicting a semantic labeling as a sparse reconstruction procedure that applies a target-specific learned transfer function to a generic deep sparse code representation of an image. This strategy partitions training into two distinct stages. First, in an unsupervised manner, we learn a set of dictionaries optimized for sparse coding of image patches. These generic dictionaries minimize error with respect to representing image appearance and are independent of any particular target task. We train a multilayer representation via recursive sparse dictionary learning on pooled codes output by earlier layers. Second, we encode all training images with the generic dictionaries and learn a transfer function that optimizes reconstruction of patches extracted from annotated ground-truth given the sparse codes of their corresponding image patches. At test time, we encode a novel image using the generic dictionaries and then reconstruct using the transfer function. The output reconstruction is a semantic labeling of the test image. Applying this strategy to the task of contour detection, we demonstrate performance competitive with state-of-the-art systems. Unlike almost all prior work, our approach obviates the need for any form of hand-designed features or filters. Our model is entirely learned from image and ground-truth patches, with only patch sizes, dictionary sizes and sparsity levels, and depth of the network as chosen parameters. To illustrate the general applicability of our approach, we also show initial results on the task of semantic part labeling of human faces. The effectiveness of our data-driven approach opens new avenues for research on deep sparse representations. Our classifiers utilize this representation in a novel manner. Rather than acting on nodes in the deepest layer, they attach to nodes along a slice through multiple layers of the network in order to make predictions about local patches. 
Our flexible combination of a generatively learned sparse representation with discriminatively trained transfer classifiers extends the notion of sparse reconstruction to encompass arbitrary semantic labeling tasks.
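
The two training stages and the test-time procedure described in the abstract can be illustrated with a minimal single-layer sketch. This is not the authors' implementation. It assumes orthogonal matching pursuit for sparse coding, a method-of-optimal-directions update for dictionary learning, and a ridge-regularized linear map as a stand-in for the transfer function; it omits the multilayer pooling and the classifiers mentioned above. All function names, sizes, and the toy data are hypothetical.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x with at most k atoms (columns) of D."""
    code = np.zeros(D.shape[1])
    support, coef = [], np.zeros(0)
    residual = x.astype(float)
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:  # no new atom reduces the residual; stop early
            break
        support.append(j)
        # Refit the coefficients of all selected atoms by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    code[support] = coef
    return code

def encode(D, X, k):
    """Sparse-code every column (patch) of X; returns an (n_atoms, n_patches) array."""
    return np.stack([omp(D, x, k) for x in X.T], axis=1)

def learn_dictionary(X, n_atoms, k, n_iter=5, seed=0):
    """Stage 1 (unsupervised, assumed MOD update): fit a dictionary to image patches only."""
    rng = np.random.default_rng(seed)
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    for _ in range(n_iter):
        C = encode(D, X, k)
        D = X @ np.linalg.pinv(C)  # least-squares dictionary for the current codes
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    return D

def learn_transfer(C, Y, lam=1e-3):
    """Stage 2 (supervised, assumed ridge map): W minimizing ||Y - W C||^2 + lam ||W||^2."""
    G = C @ C.T + lam * np.eye(C.shape[0])
    return np.linalg.solve(G, C @ Y.T).T

# Toy data: columns are flattened 4x4 "image" patches and 3x3 "label" patches.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(16, 200))
M = rng.normal(size=(9, 16))  # hypothetical rule used only to generate toy labels
Y_train = M @ X_train

D = learn_dictionary(X_train, n_atoms=32, k=3)      # generic, task-independent
W = learn_transfer(encode(D, X_train, 3), Y_train)  # target-specific

# Test time: encode novel patches with the generic dictionary, then reconstruct labels.
X_test = rng.normal(size=(16, 5))
Y_pred = W @ encode(D, X_test, 3)
```

In this sketch the dictionary `D` never sees the labels, so the same codes could be reused with a different `W` for another labeling task, which mirrors the separation of stages the abstract describes.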


Item Type:Book Section
Related URLs:
URL | URL Type | Description
http://dx.doi.org/10.1007/978-3-319-16817-3_18 | DOI | Article
https://rdcu.be/b4p2H | Publisher | Free ReadCube access
https://arxiv.org/abs/1410.4521 | arXiv | Discussion paper
ORCID:
Author | ORCID
Perona, Pietro | 0000-0002-7583-5809
Additional Information:© 2015 Springer International Publishing Switzerland. ARO/JPL-NASA Stennis NAS7.03001 supported Michael Maire’s work.
Funders:
Funding Agency | Grant Number
Army Research Office (ARO) | UNSPECIFIED
NASA | NAS7.03001
Series Name:Lecture Notes in Computer Science
Issue or Number:9006
Record Number:CaltechAUTHORS:20151030-125936998
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20151030-125936998
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:61746
Collection:CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On:02 Nov 2015 23:46
Last Modified:26 May 2020 16:52