Facebook’s AI research team has developed a neural transcompiler called TransCoder. The system was trained on source code from more than 2.8 million open-source GitHub repositories and translates code between three popular languages: C++, Java, and Python.
Keeping systems up to date with the latest programming languages and technologies costs millions of dollars every year. To tackle the issue, the team at Facebook AI Research (FAIR) has built an automated translation system on a deep learning architecture.
Migrating an existing codebase to a modern, more efficient language requires people with expertise in both the source and the target language, so it is both tedious and costly. For example, the Commonwealth Bank of Australia spent around $750 million to move its platform from COBOL to Java.
According to FAIR, the new system is unsupervised. This means it does not need labeled examples, such as pairs of equivalent functions written in two languages. Instead, it learns to translate from patterns it finds in unlabeled source code. The researchers claim that the system outperforms rule-based baselines by a “significant” margin.
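The unsupervised translation is achieved by first initializing TransCoder with a cross-lingual language model. This pretraining maps pieces of code that express the same instructions to similar representations regardless of the language. The alignment comes from the large number of tokens the languages have in common, such as “for,” “while,” and “if,” along with digits and mathematical operators. The model is then trained with back-translation: a target-to-source model translates target-language sequences into noisy source-language versions, and the source-to-target model learns to reconstruct the originals from them.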
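The role of shared tokens can be illustrated with a small Python sketch. It is only a toy: the two snippets, the regex tokenizer, and the function names are assumptions made for illustration, not TransCoder's actual tokenization. It simply lists the tokens that a Python function and its Java equivalent have in common.

```python
import re

# Hypothetical toy snippets: the same function written in Python and in Java.
PYTHON_SRC = """
def max_value(values):
    best = values[0]
    for v in values:
        if v > best:
            best = v
    return best
"""

JAVA_SRC = """
static int maxValue(int[] values) {
    int best = values[0];
    for (int v : values) {
        if (v > best) {
            best = v;
        }
    }
    return best;
}
"""

# Simplistic tokenizer (an assumption): identifiers/keywords, integers,
# or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")


def tokenize(source):
    """Split source code into a flat list of tokens."""
    return TOKEN_RE.findall(source)


def shared_tokens(source_a, source_b):
    """Return the set of tokens that appear in both snippets."""
    return set(tokenize(source_a)) & set(tokenize(source_b))


anchors = shared_tokens(PYTHON_SRC, JAVA_SRC)
print(sorted(anchors))
```

Keywords such as “for” and “if” and operators such as “>” appear in both token sets, while language-specific tokens such as “def” and “static” do not. Overlap of this kind is what gives a cross-lingual model anchor points between languages.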
Although the transcompiler’s output is not always correct, it is much more accurate than other code translators available today.
According to Facebook,
TransCoder can easily be generalized to any programming language, does not require any expert knowledge, and outperforms commercial solutions by a large margin. Our results suggest that a lot of mistakes made by the model could easily be fixed by adding simple constraints to the decoder to ensure that the generated functions are syntactically correct, or by using dedicated architectures.
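A rough Python sketch of what a syntax check could look like: it filters candidate translations after generation and keeps the first one that parses. This is a simpler stand-in for the decoder constraints mentioned in the quote, not Facebook's method, and the candidate list and function names are hypothetical.

```python
import ast


def is_valid_python(source):
    """Return True if the source parses as Python, False otherwise."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def first_valid(candidates):
    """Return the first candidate that is syntactically valid, or None."""
    for candidate in candidates:
        if is_valid_python(candidate):
            return candidate
    return None


# Hypothetical model outputs, ordered from most to least likely.
candidates = [
    "def add(a, b) return a + b",          # missing colon: rejected
    "def add(a, b):\n    return a + b\n",  # parses: accepted
]

best = first_valid(candidates)
print(best)
```

A check like this only guarantees that the output parses. It says nothing about whether the translated function behaves like the original.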