
Open Vocabulary Learning on Source Code with a Graph-Structured Cache

Cvitkovic, Milan and Singh, Badal and Anandkumar, Anima (2018) Open Vocabulary Learning on Source Code with a Graph-Structured Cache. (Unpublished) http://resolver.caltech.edu/CaltechAUTHORS:20190327-085810844

PDF - Submitted Version (1298Kb). See Usage Policy.

Abstract

Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques. However, a major challenge is that code is written using an open, rapidly changing vocabulary due to, e.g., the coinage of new variable and method names. Reasoning over such a vocabulary is not something for which most NLP methods are designed. We introduce a Graph-Structured Cache to address this problem; this cache contains a node for each new word the model encounters, with edges connecting each word to its occurrences in the code. We find that combining this graph-structured cache strategy with recent Graph-Neural-Network-based models for supervised learning on code improves the models' performance on a code completion task and a variable naming task (with over 100% relative improvement on the latter) at the cost of a moderate increase in computation time.
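
The cache structure described in the abstract (one node per word, with edges to that word's occurrences in the code) can be illustrated with a minimal sketch. This is not the authors' implementation: the regex tokenizer, the splitting of identifiers into lowercase subwords on camelCase and underscores, and the names `split_words` and `build_cache_graph` are all assumptions made for illustration, and language keywords are treated like any other word.

```python
import re
from collections import defaultdict

# Assumption: identifiers are found with a simple regex rather than a real parser.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Assumption: identifiers are split into subwords on camelCase and underscores.
SUBWORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_words(identifier):
    """Split an identifier such as 'maxCount' into lowercase words."""
    return [w.lower() for w in SUBWORD.findall(identifier)]


def build_cache_graph(source):
    """Return (occurrences, edges).

    occurrences: list of (identifier, character offset), one per occurrence node.
    edges: maps each word (a cache node) to the indices of the occurrence
           nodes in which that word appears.
    """
    occurrences = []
    edges = defaultdict(list)
    for match in IDENTIFIER.finditer(source):
        occ_id = len(occurrences)
        occurrences.append((match.group(), match.start()))
        for word in split_words(match.group()):
            # Add one edge per (word, occurrence) pair, even if a word repeats.
            if occ_id not in edges[word]:
                edges[word].append(occ_id)
    return occurrences, dict(edges)


occurrences, edges = build_cache_graph("int maxCount = max_count + count;")
for word, occ_ids in edges.items():
    print(word, "->", [occurrences[i][0] for i in occ_ids])
```

In this sketch a word never seen before simply becomes a new key in `edges`, so no fixed vocabulary is needed; a graph neural network would then pass messages along these word-to-occurrence edges.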


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
http://arxiv.org/abs/1810.08305 | arXiv | Discussion Paper
Additional Information: Many thanks to Miltos Allamanis, Hyokun Yun, and Haibin Lin for their advice and useful conversations.
Record Number: CaltechAUTHORS:20190327-085810844
Persistent URL: http://resolver.caltech.edu/CaltechAUTHORS:20190327-085810844
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 94180
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 28 Mar 2019 14:51
Last Modified: 28 Mar 2019 14:51
