This repository was created to accompany a review of *Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor*. The paper includes a comparison table measuring the similarity between the unnatural (model-generated) dataset and the natural (human-written) dataset, computed with BERTScore. When generating datasets with an LLM (e.g. ChatGPT, Bard, HyperClovaX), this library can be used to verify the similarity between the generated and reference datasets in the same way. If you encounter any issues or have suggestions while using this library, please feel free to contact me.
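As a sketch of the comparison idea: each generated example is scored against every reference example, and the maximum similarity is kept. The helper below is a minimal, self-contained illustration; the `jaccard` token-overlap scorer is a stand-in assumption for a real BERTScore call (e.g. via the `bert-score` package), which would be plugged in the same way.

```python
from typing import Callable, List

def max_similarity(generated: List[str], references: List[str],
                   scorer: Callable[[str, str], float]) -> List[float]:
    """For each generated example, return its highest similarity
    to any example in the reference (natural) dataset."""
    return [max(scorer(g, r) for r in references) for g in generated]

# Toy token-overlap scorer used here only for illustration.
# In practice, a BERTScore F1 scorer would be plugged in instead
# (assumed usage of the bert-score package, not run here):
#   from bert_score import score
#   def bert_scorer(cand, ref):
#       _, _, f1 = score([cand], [ref], lang="en")
#       return f1.item()
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

gen = ["translate the sentence to French", "summarize the article"]
ref = ["translate this sentence into French", "write a poem"]
print(max_similarity(gen, ref, jaccard))
```

Swapping `jaccard` for a BERTScore-based scorer reproduces the kind of dataset-level comparison reported in the paper's similarity table.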
To be updated.
To be updated.
Or Honovich, Thomas Scialom, Omer Levy, Timo Schick. *Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor*.