Commit 626d860

Release v0.2.4 (stanford-crfm#1849)
1 parent 2822138 commit 626d860

2 files changed (+46 -2 lines changed)

CHANGELOG.md

+45 -1
@@ -2,6 +2,49 @@
 
 ## [Upcoming]
 
+## [v0.2.4] - 2023-09-20
+
+### Models
+
+- Added Meta LLaMA, Meta Llama 2, EleutherAI Pythia, Together RedPajama on Together (#1821)
+- Removed the unofficial chat-gpt client in favor of the official API (#1809)
+- Added support for models for the NeurIPS Efficiency Challenge (#1693)
+
+### Frontend
+
+- Added support for rendering train-test overlap stats in the frontend (#1747)
+- Fixed a bug where stats with NaN values would cause the frontend to fail to render tables (#1784)
+
+### Framework
+
+- Moved many dependencies, especially those only used by a single model provider or a small number of runs, to optional extra dependencies (#1798, #1844)
+- Widened some dependencies (e.g. PyTorch) to reduce dependency conflicts with other packages (#1759)
+- Added `MaxEvalInstancesRunExpander` to allow overriding the number of eval instances at the run level (#1837)
+- Updated human critique evaluation on Amazon Mechanical Turk to support emoji and other special characters (#1773)
+- Fixed a bug where in-context learning examples with multiple correct references were adapted to prompts where all the correct references are concatenated together as the output, which was not intended for some scenarios (e.g. narrative_qa, natural_qa, quac and wikifact) (#1785)
+- Fixed a bug where ObjectSpec is not hashable if any arg is a list (#1771)
+
+### Evaluations
+
+- Added evaluation results for Meta LLaMA, Meta Llama 2, EleutherAI Pythia, Together RedPajama on Together
+- Corrected evaluation results for AI21 Jurassic-2 and Writer Palmyra for the scenarios narrative_qa, natural_qa, quac and wikifact, as they were affected by the bug fixed by #1785
+
+### Contributors
+
+Thank you to the following contributors for your contributions to this HELM release!
+
+- @AndrewJGaut
+- @andyzorigin
+- @bidyapati-p
+- @drisspg
+- @mkly
+- @msaroufim
+- @percyliang
+- @teetone
+- @timothylimyl
+- @unnawut
+- @yifanmai
+
 ## [v0.2.3] - 2023-07-25
 
 ### Models
@@ -134,7 +177,8 @@
 
 - Initial release
 
-[upcoming]: https://github.com/stanford-crfm/helm/compare/v0.2.3...HEAD
+[upcoming]: https://github.com/stanford-crfm/helm/compare/v0.2.4...HEAD
+[v0.2.3]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.4
 [v0.2.3]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.3
 [v0.2.2]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.2
 [v0.2.1]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.1

setup.cfg

+1 -1
@@ -1,6 +1,6 @@
 [metadata]
 name = crfm-helm
-version = 0.2.3
+version = 0.2.4
 author = Stanford CRFM
 author_email = [email protected]
 description = Benchmark for language models
