How to reproduce the performance comparison between PandaLM and ChatGPT #8

Closed
480284856 opened this issue May 10, 2023 · 1 comment
@480284856
Thanks for releasing the project!
I want to reproduce the performance comparison between PandaLM and ChatGPT, like:
[image: performance comparison table]
To be clear, I want to obtain the same PandaLM performance on my own machine.
Was it computed on ./data/testset-v1.json? If so, what should I run to reproduce the second line of the performance table?

@480284856 480284856 changed the title how to run on ./data/testset-v1.json How to reproduce the performance comparison between PandaLM and ChatGPT May 10, 2023
@zhuohaoyu
Member

Feel free to reproduce PandaLM's output with pandalm/utils/pandalm_inference.py: pass in our human-annotated test dataset with labels (data/testset-v1.json), run inference, and compute any metric you want.
We provide the outputs from gpt-3.5-turbo, parsed from the raw responses into JSON format, in data/gpt-3.5-turbo-testset-v1.json, as described here: https://github.com/WeOpenML/PandaLM/tree/main#test-data. You could also call OpenAI's API with your own script, or collect data from ChatGPT manually into .json format, for comparison.
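To illustrate the "calculate any metric you want" step, here is a minimal sketch of comparing a model's predictions against the human labels. The label field names and the 0/1/2 encoding (tie / response 1 wins / response 2 wins) are assumptions for illustration, not the repo's confirmed schema; in practice you would load the two JSON files above instead of the inline lists.

```python
# Hypothetical sketch: agreement between human labels (e.g. from
# data/testset-v1.json) and a model's parsed outputs (e.g. from
# data/gpt-3.5-turbo-testset-v1.json). The 0/1/2 label encoding
# below is an assumption, not taken from the repo docs.

def accuracy(human_labels, model_labels):
    """Fraction of examples where the model agrees with the human label."""
    assert len(human_labels) == len(model_labels)
    matches = sum(h == m for h, m in zip(human_labels, model_labels))
    return matches / len(human_labels)

# Tiny inline stand-in for the two parsed label lists.
human = [1, 2, 0, 1]
model = [1, 2, 1, 1]
print(accuracy(human, model))  # 0.75
```

From here you can swap in precision/recall per class or any other metric over the same paired label lists.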
