This repository contains the data and inference code for the paper "VersiCode: Towards Version-controllable Code Generation."
- Clone the repository:

```shell
git clone https://github.com/wutong8023/VersiCode.git
```
- Please email us if you need the raw data.
- Install dependencies for the GPT scripts and the evaluation scripts:

```shell
pip install -r requirements_gpt.txt
```

- Install dependencies for the vLLM scripts:

```shell
pip install -r requirements_vllm.txt
```

- Install dependencies for the Together AI scripts:

```shell
pip install -r requirements_togetherai.txt
```
Unzip `VersiCode_Benchmark.zip` to get the data, and place it in the `data` folder.
Our evaluation consists of two steps: generation and metrics calculation.
For open-source models such as StarCoder and DeepSeek-Coder, we download the weights from Hugging Face and use vLLM for inference. Taking the token experiment as an example:

```shell
python test_token.py
```
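For orientation, the generation step can be sketched as follows. This is a minimal illustration, not the repository's actual script: the prompt template, the benchmark field names (`dependency`, `version`, `masked_code`), the model name, and the sampling settings are all assumptions, and running the vLLM part requires a GPU.

```python
def make_token_prompt(example: dict) -> str:
    # Hypothetical benchmark fields; check the unzipped JSON for the real schema.
    return (
        f"Library: {example['dependency']} (version {example['version']})\n"
        "Fill in the masked token in the code below; answer with the token only.\n"
        f"{example['masked_code']}"
    )


def generate_with_vllm(prompts, model_name="deepseek-ai/deepseek-coder-6.7b-base"):
    # Lazy import: vLLM needs a GPU and `pip install -r requirements_vllm.txt`.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_name)
    params = SamplingParams(temperature=0.0, max_tokens=32)
    outputs = llm.generate(prompts, params)
    return [o.outputs[0].text for o in outputs]


sample = {
    "dependency": "torch",
    "version": "1.9.0",
    "masked_code": "x = torch.<token_mask>(3, 3)",
}
print(make_token_prompt(sample))
```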
OpenAI models are accessed through an API. Taking the token experiment as an example, set your API key and the dataset path, then run:

```shell
python test_token_generate_chunk.py
```
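The API call boils down to posting a chat-completions request. The sketch below is an assumption-laden illustration, not the repository's script: the prompt template and model name are hypothetical, and the request itself only fires if an `OPENAI_API_KEY` environment variable is set.

```python
import json
import os
import urllib.request


def build_payload(code_with_mask: str, model: str = "gpt-4o") -> dict:
    # Hypothetical prompt template; adapt it to the experiment you run.
    prompt = (
        "Fill in the masked token in the following code snippet. "
        "Reply with the token only.\n\n" + code_with_mask
    )
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,
    }


payload = build_payload("import torch\nx = torch.<token_mask>(3, 3)")

# The actual request needs an API key; guarded so the sketch is safe to dry-run.
api_key = os.environ.get("OPENAI_API_KEY")
if api_key:
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Together AI exposes a very similar chat-completions API, so the same pattern applies there with a different base URL and key.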
If your hardware is not sufficient to run a large model locally, Together AI can be used instead; its models are also accessed through an API. Taking the token experiment as an example, set your API key and the dataset path, then run:

```shell
python test_token_generate_chunk.py
```
After obtaining the generations, clean up the `model_output` field, then calculate the final metrics. Taking the token experiment as an example:

```shell
python test_token.py
```
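The metrics script above computes VersiCode's own scores; as general background, k-sample code-generation benchmarks commonly report the unbiased Pass@k estimator of Chen et al. (2021), which for `n` samples with `c` correct can be computed as:

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: probability that at least one of k
    samples drawn from n (of which c are correct) passes."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with n = 2 samples and c = 1 correct, pass@1 is 0.5.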
```bibtex
@article{versicode,
  author  = {Tongtong Wu and Weigang Wu and Xingyu Wang and Kang Xu and Suyu Ma and Bo Jiang and Ping Yang and Zhenchang Xing and Yuan-Fang Li and Gholamreza Haffari},
  title   = {VersiCode: Towards Version-controllable Code Generation},
  journal = {CoRR},
  volume  = {abs/2406.07411},
  year    = {2024},
  url     = {https://arxiv.org/abs/2406.07411},
}
```
If you have any questions, please feel free to open an issue in this repository.