Add initial support for Ada #162
Thanks for this! Just so we're on the same page: you're adding support for sets, which we had omitted since they didn't appear in HumanEval. But I see that there are two MBPP problems that require them:

arjun@arjun-laptop datasets % pwd
/Users/arjun/repos/nuprl/MultiPL-E/datasets
arjun@arjun-laptop datasets % grep -F "Set[" */*.py
mbpp-typed/mbpp_473_tuple_intersection.py:def tuple_intersection(test_list1: List[Tuple[int, int]], test_list2: List[Tuple[int, int]]) -> Set[Tuple[int, int]]:
mbpp-typed/mbpp_582_my_dict.py:def my_dict(dict1: Set[int]) -> bool:

All the other translators should be updated to support sets if they want to support these two problems.
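For anyone who wants to repeat this check from Python rather than with grep, here is a minimal sketch that inspects function signatures with the standard `ast` module. The helper name `uses_set_annotation` and the `EXAMPLE` string are illustrative assumptions and are not part of MultiPL-E; only positional parameters and return annotations are examined.

```python
import ast


def uses_set_annotation(source: str) -> bool:
    """Return True if any function signature in `source` mentions Set[...].

    Hypothetical helper for illustration; it only looks at positional
    parameters and return annotations, not keyword-only parameters.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Collect every annotation attached to the signature.
        annotations = [a.annotation for a in node.args.args if a.annotation is not None]
        if node.returns is not None:
            annotations.append(node.returns)
        for annotation in annotations:
            # Walk nested annotations such as List[Set[int]].
            for sub in ast.walk(annotation):
                if (
                    isinstance(sub, ast.Subscript)
                    and isinstance(sub.value, ast.Name)
                    and sub.value.id == "Set"
                ):
                    return True
    return False


# Signature copied from the grep output above; the body is a placeholder.
EXAMPLE = '''
from typing import List, Set, Tuple

def tuple_intersection(test_list1: List[Tuple[int, int]], test_list2: List[Tuple[int, int]]) -> Set[Tuple[int, int]]:
    ...
'''

print(uses_set_annotation(EXAMPLE))
```

Unlike `grep -F "Set["`, this ignores occurrences inside docstrings or comments, which may or may not matter for these datasets.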
Yes, other translators would need to be updated if they want to support these two problems. While I have defined a Just to confirm though, the

Previously the exceptions raised looked like:

...
  File ".../projects/ai/MultiPL-E/dataset_builder/generic_translator.py", line 44, in translate_expr
    raise Exception(f"Unhandled expression: {py_expr}")
Exception: Unhandled expression: <ast.Set object at 0x1054e5750>

Now the exceptions will look like:

...
  File ".../MultiPL-E/dataset_builder/generic_translator.py", line 31, in translate_expr
    return translator.gen_set([translate_expr(translator, e) for e in elts])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../MultiPL-E/dataset_builder/humaneval_to_py.py", line 103, in gen_set
    raise NotImplementedError("This translator does not currently support translating sets")
NotImplementedError: This translator does not currently support translating sets

or

...
  File ".../MultiPL-E/dataset_builder/generic_translator.py", line 31, in translate_expr
    return translator.gen_set([translate_expr(translator, e) for e in elts])
           ^^^^^^^^^^^^^^^^^^
AttributeError: 'Translator' object has no attribute 'gen_set'. Did you mean: 'gen_dict'?

Note also that while MBPP 473 is a well-formed problem, MBPP 582 needs a couple of minor changes. Its current signature is:

def my_dict(dict1: Set[int]) -> bool:
    """
    Write a function to check if a dictionary is empty
    """
    ...

So the function name and docstring suggest that it takes a dictionary, but it's currently typed to take a set. Two of its test cases pass a set, while the third passes a dictionary. I'm not entirely sure which combination of type hint, test case, and docstring changes should be made, but I do think it should be updated.
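To make the two new failure modes concrete, here is a minimal sketch of the dispatch pattern the tracebacks suggest: `translate_expr` forwards `ast.Set` nodes to `translator.gen_set`, and a translator either implements that method or raises `NotImplementedError`. The classes `BaseTranslator` and `SetAwareTranslator`, the helper `parse_expr`, and the output syntax are all hypothetical; this is not the actual code in `generic_translator.py` or `humaneval_to_py.py`.

```python
import ast


class BaseTranslator:
    """Hypothetical translator; not the project's real class hierarchy."""

    def gen_literal(self, value):
        return repr(value)

    def gen_list(self, elements):
        return "[" + ", ".join(elements) + "]"

    def gen_set(self, elements):
        # A translator that has not opted in to sets fails with a clear message.
        raise NotImplementedError(
            "This translator does not currently support translating sets"
        )


class SetAwareTranslator(BaseTranslator):
    """Hypothetical translator that opts in to set support."""

    def gen_set(self, elements):
        return "{" + ", ".join(elements) + "}"


def translate_expr(translator, py_expr):
    """Dispatch on the AST node type and delegate to the translator."""
    if isinstance(py_expr, ast.Constant):
        return translator.gen_literal(py_expr.value)
    if isinstance(py_expr, ast.List):
        return translator.gen_list([translate_expr(translator, e) for e in py_expr.elts])
    if isinstance(py_expr, ast.Set):
        return translator.gen_set([translate_expr(translator, e) for e in py_expr.elts])
    raise Exception(f"Unhandled expression: {py_expr}")


def parse_expr(source: str) -> ast.expr:
    """Parse a single Python expression into its AST node."""
    return ast.parse(source, mode="eval").body


print(translate_expr(SetAwareTranslator(), parse_expr("{1, 2}")))
```

A translator object that defines no `gen_set` at all would hit the `AttributeError` case instead, since the dispatch calls the method unconditionally.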
Add pass@1 metric to pass_k.py
Update pass_k.py to load the results file from .gz or .json
Added basic support for Sets, enabling the translation of mbpp_473_tuple_intersection.py
Co-authored-by: Rowan Walshe <[email protected]>
Co-authored-by: Fabien Chouteau <[email protected]>
48a408e to 8a94612
Thanks! Yeah, some of these problems are a mess. We have generally erred on the side of letting problems be faulty if the fault was in the original Python problem, but fixing problems in the translators. See EvalPlus (HumanEvalPlus) for a project that actually fixes the faults in the original Python problems.

Anyway, I'll have this merged within the week. The container for execution is getting very large. I may create a new container for Ada (and do so for other PLs going forward).
I tuned this out over the break, but I'll get to it this week. Thanks for your patience.
I've merged this in and updated the dataset README on the Hub: https://huggingface.co/datasets/nuprl/MultiPL-E

I built a separate evaluation container for Ada, which is also pushed to the GitHub Container Registry:

I do need to document that this container is here. I'll get to that next, and I do want to update the rather unwieldy directions for supporting a new language.
Also, here is an Ada result on MultiPL-HumanEval with Llama 3.1 8b: 11.3% at temperature 0.2 (just 20 completions, so not totally stable, but very close to the true value).
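For context on how a figure like this is typically computed from 20 completions per problem, here is a minimal sketch of the unbiased pass@k estimator commonly used for HumanEval-style benchmarks, 1 - C(n-c, k) / C(n, k), averaged over problems. Whether `pass_k.py` computes it exactly this way is an assumption, and the function names and sample counts below are made up for illustration.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem with n completions, c of them correct."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Every size-k sample must contain at least one correct completion.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)


# Made-up counts for three problems, each with 20 completions.
example_results = [(20, 2), (20, 0), (20, 20)]
print(mean_pass_at_k(example_results, k=1))
```

For k = 1 the estimator reduces to c / n per problem, which is why a small number of completions already gives a reasonable, if noisy, estimate.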
Hi. @Fabien-Chouteau and I have worked on a patch to add support for translating prompts into Ada.
It's able to translate ~97% of HumanEval and ~90% of MBPP problems (I haven't included the generated prompts in the PR, but let me know if I should). The only types that we haven't yet tried to translate are Any and Union.
We've also included a small change that should add basic support for translating problems that use Sets (which I believe is only mbpp_473_tuple_intersection.py at this time).
As a sense check, we've performed an initial run against one model (Qwen2.5-Coder-7B-Instruct):
Pass@k
Please let me know if you have any feedback, as we'd love to see support for Ada included in the project. Thanks :)