
Revamp template capabilities + inject tools in system message when not supported by template #33

Merged (31 commits, Jan 29, 2025)

Conversation


@ochafik ochafik commented Jan 28, 2025

This only affects the chat-template.hpp experimental header (used to incubate ggml-org/llama.cpp#9639), not core minja.hpp.

  • Refreshed template capabilities detection code in C++ and Python, now kept in lockstep by tests (with some ground-truth expectations in the new test-capabilities.cpp)
  • More granular fallbacks: some templates don't inject tool definitions into the prompt but do support tool calls, e.g. DeepSeek R1
  • Fixed detection of parallel tool calls, etc.
  • Skip fewer tests from the Python goldens (more work needed to align fallback mechanisms 1:1)
  • More extensible API

All templates can now be used the same way, as if they supported every feature and required no template-specific adjustments from the user of Minja:

  struct chat_template_caps {
      bool supports_tools = false;
      bool supports_tool_calls = false;
      bool supports_tool_responses = false;
      bool supports_system_role = false;
      bool supports_parallel_tool_calls = false;
      bool supports_tool_call_id = false;
      // meta-llama/Llama-3.1-8B-Instruct expects arguments to be an object.
      // Most other templates (and OpenAI's API) expect the arguments object to be stringified.
      bool requires_object_arguments = false;
      // CohereForAI/c4ai-command-r-plus simple variant
      bool requires_non_null_content = false;
      // MiniMaxAI/MiniMax-Text-01 special
      bool requires_typed_content = false;
  };

This makes it possible to use deepseek-ai/DeepSeek-R1-Distill-Llama-8B with tools without having to pass a different system message than with most other models.

Possible follow-ups:

  • allow controlling which capabilities are emulated (e.g. to disable tools inlining)
  • pass a template to control how the tools are inlined, etc?
Capability matrix per template (columns: requires object arguments · requires typed content · supports parallel tool calls · supports system role · supports tool call id · supports tool calls · supports tool responses · supports tools):
mistralai-Mistral-Large-Instruct-2407 ⚠️
mistralai-Mistral-Nemo-Instruct-2407 ⚠️
NousResearch-Hermes-2-Pro-Mistral-7B-tool_use
NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use
meetkai-functionary-medium-v3.2
NousResearch-Hermes-3-Llama-3.1-70B-tool_use
CohereForAI-c4ai-command-r-plus-tool_use ⚠️
Qwen-QwQ-32B-Preview ⚠️
Qwen-Qwen2.5-Math-7B-Instruct ⚠️
Qwen-Qwen2.5-7B-Instruct ⚠️
deepseek-ai-DeepSeek-R1-Distill-Qwen-7B
deepseek-ai-DeepSeek-R1-Distill-Qwen-32B
deepseek-ai-DeepSeek-R1-Distill-Llama-8B
deepseek-ai-DeepSeek-V2.5
meta-llama-Llama-3.2-3B-Instruct ⚠️
nvidia-Llama-3.1-Nemotron-70B-Instruct-HF ⚠️
meta-llama-Llama-3.3-70B-Instruct ⚠️
meta-llama-Meta-Llama-3.1-8B-Instruct ⚠️
meta-llama-Llama-3.1-8B-Instruct ⚠️
mistralai-Mistral-7B-Instruct-v0.2
databricks-dbrx-instruct
microsoft-Phi-3.5-vision-instruct
openchat-openchat-3.5-0106
bofenghuang-vigogne-2-70b-chat
microsoft-Phi-3-small-8k-instruct
mattshumer-Reflection-Llama-3.1-70B
teknium-OpenHermes-2.5-Mistral-7B
mistralai-Mixtral-8x7B-Instruct-v0.1
NousResearch-Hermes-2-Pro-Mistral-7B-default
Qwen-Qwen2-VL-7B-Instruct
CohereForAI-c4ai-command-r-plus-rag
mlabonne-AlphaMonarch-7B
microsoft-Phi-3-mini-4k-instruct
microsoft-Phi-3.5-mini-instruct
NousResearch-Hermes-3-Llama-3.1-70B-default
NousResearch-Hermes-2-Pro-Llama-3-8B-default
deepseek-ai-DeepSeek-Coder-V2-Instruct
CohereForAI-c4ai-command-r-plus-default
deepseek-ai-deepseek-coder-33b-instruct
mistralai-Mistral-Large-Instruct-2411
Qwen-Qwen2-7B-Instruct
indischepartij-MiniCPM-3B-OpenHermes-2.5-v2
abacusai-Fewshot-Metamath-OrcaVicuna-Mistral
TheBloke-FusionNet_34Bx2_MoE-AWQ
MiniMaxAI-MiniMax-Text-01 ⚠️
microsoft-Phi-3-medium-4k-instruct
google-gemma-7b-it
NexaAIDev-Octopus-v2
google-gemma-2-2b-it
OrionStarAI-Orion-14B-Chat

ochafik added a commit to ochafik/llama.cpp that referenced this pull request Jan 28, 2025
@ochafik ochafik merged commit f21e3e8 into main Jan 29, 2025
11 checks passed
@ochafik ochafik deleted the tools-system branch January 29, 2025 20:25
@ochafik ochafik restored the tools-system branch February 4, 2025 00:38