How to reproduce the performance comparison between PandaLM and ChatGPT #8

Closed
480284856 opened this issue May 10, 2023 · 1 comment
@480284856
Thanks for releasing the project!
I want to reproduce the performance comparison between PandaLM and ChatGPT, like:
[image: performance comparison table]
To be clear, I want to obtain the same PandaLM performance on my own machine.
Was it computed on ./data/testset-v1.json? If so, what should I run to reproduce the second line of the performance table?

@480284856 480284856 changed the title how to run on ./data/testset-v1.json How to reproduce the performance comparison between PandaLM and ChatGPT May 10, 2023
@zhuohaoyu
Member

Feel free to reproduce PandaLM's output with pandalm/utils/pandalm_inference.py: pass in our human-annotated test dataset with labels (data/testset-v1.json), run inference, and compute any metric you want.
We provide the outputs from gpt-3.5-turbo, parsed from the raw responses into JSON format, in data/gpt-3.5-turbo-testset-v1.json, as described here: https://github.com/WeOpenML/PandaLM/tree/main#test-data. You could also call OpenAI's API with your own script, or collect data from ChatGPT manually into .json format, for comparison.
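To illustrate the "calculate any metric you want" step, here is a minimal sketch of comparing a model's predictions against the human labels. The label field names and the 0/1/2 encoding (tie / response 1 wins / response 2 wins) are assumptions for illustration, not the repo's confirmed schema; in practice you would load the two JSON files above instead of the inline lists.

```python
# Hypothetical sketch: agreement between human labels (e.g. from
# data/testset-v1.json) and a model's parsed outputs (e.g. from
# data/gpt-3.5-turbo-testset-v1.json). The 0/1/2 label encoding
# below is an assumption, not taken from the repo docs.

def accuracy(human_labels, model_labels):
    """Fraction of examples where the model agrees with the human label."""
    assert len(human_labels) == len(model_labels)
    matches = sum(h == m for h, m in zip(human_labels, model_labels))
    return matches / len(human_labels)

# Tiny inline stand-in for the two parsed label lists.
human = [1, 2, 0, 1]
model = [1, 2, 1, 1]
print(accuracy(human, model))  # 0.75
```

From here you can swap in precision/recall per class or any other metric over the same paired label lists.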
