FAR.AI

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

125 followers
https://far.ai
@FARAIResearch
company/far-ai
hello@far.ai


Popular repositories

tuned-lens (Public)
Tools for understanding how transformer predictions are built layer-by-layer
Python | 480 stars | 51 forks

go_attack (Public)
Python | 84 stars | 7 forks

vlmrm (Public)
Python | 51 stars | 14 forks

gpt-4-novel-apis-attacks (Public)
19 stars | 1 fork

learned-planner (Public)
Interpretability tools for recurrent convolutional networks (DRC) that play Sokoban
Python | 11 stars | 3 forks

scaling-poisoning (Public)
Python | 8 stars | 1 fork

Repositories

Showing 10 of 39 repositories

refusal_direction (Public, forked from andyrdt/refusal_direction)
Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".
Python | 0 stars | Apache-2.0 | 45 forks | 0 issues | 0 pull requests | Updated Mar 19, 2025

kueue (Public, forked from kubernetes-sigs/kueue)
Kubernetes-native Job Queueing
Go | 0 stars | Apache-2.0 | 303 forks | 0 issues | 0 pull requests | Updated Mar 18, 2025

scaling-llm-robustness-paper (Public)
Code used for the paper `Scaling Trends in Language Model Robustness`
Python | 0 stars | 0 forks | 0 issues | 0 pull requests | Updated Mar 15, 2025

HarmBench (Public, forked from centerforaisafety/HarmBench)
Fork of HarmBench for getting R2D2 working
Jupyter Notebook | 0 stars | MIT | 82 forks | 0 issues | 1 pull request | Updated Mar 14, 2025

emergent-misalignment (Public, forked from emergent-misalignment/emergent-misalignment)
Python | 0 stars | MIT | 28 forks | 0 issues | 0 pull requests | Updated Mar 12, 2025

learned-planner (Public)
Interpretability tools for recurrent convolutional networks (DRC) that play Sokoban
Python | 11 stars | Apache-2.0 | 3 forks | 0 issues | 0 pull requests | Updated Mar 6, 2025

kubespray (Public, forked from kubernetes-sigs/kubespray)
Deploy a Production Ready Kubernetes Cluster
Jinja | 0 stars | Apache-2.0 | 6,693 forks | 0 issues | 0 pull requests | Updated Mar 6, 2025

train-learned-planner (Public)
Experimenting with CleanRL for learned-planners
Python | 5 stars | 1 fork | 1 issue | 1 pull request | Updated Mar 5, 2025

harmtune (Public)
Python | 3 stars | 0 forks | 2 issues | 0 pull requests | Updated Mar 3, 2025

envpool (Public, forked from sail-sg/envpool)
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
C++ | 1 star | Apache-2.0 | 116 forks | 0 issues | 1 pull request | Updated Mar 3, 2025


People

This organization has no public members. You must be a member to see who’s a part of this organization.
