Can Sequence to sequence models be used to convert code from one programming language to another?

1

Is it possible to convert a code from one language to another using sequence to sequence programming. if not sequence to sequence, which algorithm can i use to do this?

InAFlash

Posted 2018-10-04T19:29:44.960

Reputation: 641

Is it important that the converted code actually work? Are you trying to convert simple code snippets - in the raw language without libraries - or entire applications including core and maybe third-party libraries? – Neil Slater – 2018-10-04T19:53:06.723

initially, i don't expect it to work. i want to be able to see a conversion which is atleast 20% correct – InAFlash – 2018-10-04T19:55:27.923

I just want to get started with some idea. – InAFlash – 2018-10-04T19:56:05.817

1It is important to understand whether the end goal is to have working code, and what level it should operate at. If you expect inside the next 10 years to be converting more than a simple function with no dependencies, then that changes how you think about the problem. Your question is phrased as you want to have the end result, as opposed to learn the limits of a technique? – Neil Slater – 2018-10-04T19:58:46.140

Answers

2

Yes, it is possible to convert code from one programming language to another using sequence-to-sequence neural networks. Sequence-to-sequence can learn to translate anything from anything if there are consistent patterns and enough paired examples.

However, it would not be efficient because the models would only indirectly learn the semantics (i.e., the meaning of the code) through the observed string literals (i.e., the characters that appear). It would be more efficient to map one observed sequence of string literals to semantics, then use the latent semantics representation to generate a new observed sequence of string literals.

There are a variety of ways of doing that depending on much you handicap yourself. If you give the model access to abstract syntax tree (AST), it becomes much easier. Then a tree-based Long Short-Term Memory (LSTM) could learn the semantic space.

Overall, this is a field of study that tries to build a Neural Network Transpiler.

Brian Spiering

Posted 2018-10-04T19:29:44.960

Reputation: 10 864