Wang Yuqi's Blog

Papers About Code Generation

April 01, 2023

LiveCodeBench

Evaluation problem from codeforce, atcoder, leetcode(only small number of problems from codeforce). LLMs struggle on these problems.

Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering

CodeGeeX

language models related to programming languages

Meet in the Middle: A New Pre-training Paradigm

great at code generation

a decoder-only transformer model trained both forward and backward.
- next word prediction loss(forward)
- previous word prediction loss(backward)
- agreement loss(distance between forward and backward represetations of tokens)
When inferring, synchronously generate tokens from both ends. (meet in the middle)

CodeGen2: Lessons for Training LLMs on Programming and Natural Languages

StarCoder: may the source be with you!

TODO
Use JupyText to deal with Jupyter notebooks
evaluate on DS-1000 dataset