
AI code generators have surged in popularity in recent years, from OpenAI's Codex to DeepMind's AlphaCode. Neither model is open source, however: AlphaCode has released only a handful of test examples, while Codex is available only through an API.

"Despite the great success of large language models of code, none of the strongest models is publicly available," the Carnegie Mellon researchers wrote. "This prevents the adoption of these models outside of well-resourced companies and limits research in this field by low-resourced organizations."

To address this, several researchers from Carnegie Mellon University have released PolyCoder, an open-source code generation model with 2.7B parameters. It is based on the GPT-2 architecture and was trained on a 249 GB corpus of code spanning 12 programming languages.

The 12 programming languages are: C, C#, C++, Go, Java, JavaScript, PHP, Python, Ruby, Rust, Scala, and TypeScript.
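Since the model and its weights are open source, it can in principle be loaded like any other causal language model. The sketch below is an assumption-laden illustration using the HuggingFace `transformers` library; the checkpoint name `NinedayWang/PolyCoder-2.7B` and the exact loading calls are not taken from the article and may differ from the official release.

```python
# Hypothetical usage sketch for an open-source code model such as PolyCoder.
# MODEL_NAME is an assumed HuggingFace checkpoint name, not confirmed by the article.
MODEL_NAME = "NinedayWang/PolyCoder-2.7B"

def complete_code(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate a code completion from a causal language model checkpoint."""
    # Imports are deferred so the sketch can be read and imported without
    # triggering the multi-gigabyte model download.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

A call such as `complete_code("int factorial(int n) {")` would then return the prompt plus a model-generated continuation in C, one of the 12 languages in the training set.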

Evaluation results show that PolyCoder outperforms all known models, including Codex, at writing C code. Among open-source models, it also beats the similarly sized GPT-Neo 2.7B in C, JavaScript, Rust, Scala, and TypeScript. In the remaining languages, however, Codex still outperforms PolyCoder.


六一

SegmentFault new media operations