Commit 9f5dcf2

feat(aio): update AIO image defaults (#5002)

* feat(aio): update AIO image defaults

  cpu:
  - text-to-text: llama3.1
  - embeddings: granite-embeddings
  - vision: moondream2

  gpu/intel:
  - text-to-text: localai-functioncall-qwen2.5-7b-v0.5
  - embeddings: granite-embeddings
  - vision: minicpm

  Signed-off-by: Ettore Di Giacinto <[email protected]>

* feat(aio): use minicpm, as moondream2 stopped working (ggml-org/llama.cpp#12322 (comment))

  Signed-off-by: Ettore Di Giacinto <[email protected]>

---------

Signed-off-by: Ettore Di Giacinto <[email protected]>

1 parent e878556 · commit 9f5dcf2

9 files changed: +244 −339 lines
aio/cpu/embeddings.yaml

+2 −2

@@ -1,7 +1,7 @@
-name: text-embedding-ada-002
 embeddings: true
+name: text-embedding-ada-002
 parameters:
-  model: huggingface://hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF/llama-3.2-1b-instruct-q4_k_m.gguf
+  model: huggingface://bartowski/granite-embedding-107m-multilingual-GGUF/granite-embedding-107m-multilingual-f16.gguf
 
 usage: |
   You can test this model with curl like this:
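The usage note above is truncated before the actual curl command. As a hedged sketch only (the /v1/embeddings endpoint and localhost:8080 port are assumed from LocalAI's OpenAI-compatible defaults; the input string is invented), the request body would look like:

```python
import json

# Hypothetical request body for an OpenAI-compatible embeddings call;
# the model name matches the config above, the input text is made up.
payload = {
    "model": "text-embedding-ada-002",
    "input": "Your text string goes here",
}

# A curl test would POST this JSON, e.g.:
#   curl http://localhost:8080/v1/embeddings \
#     -H "Content-Type: application/json" -d "$(cat payload.json)"
body = json.dumps(payload)
print(body)
```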

aio/cpu/text-to-text.yaml

+49 −93

@@ -1,101 +1,57 @@
-name: gpt-4
-mmap: true
-parameters:
-  model: huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
 context_size: 8192
-
-stopwords:
-- "<|im_end|>"
-- "<dummy32000>"
-- "</tool_call>"
-- "<|eot_id|>"
-- "<|end_of_text|>"
-
+f16: true
 function:
-  # disable injecting the "answer" tool
-  disable_no_action: true
-
   grammar:
-    # This allows the grammar to also return messages
-    mixed_mode: true
-    # Suffix to add to the grammar
-    #prefix: '<tool_call>\n'
-    # Force parallel calls in the grammar
-    # parallel_calls: true
-
-  return_name_in_function_response: true
-  # Without grammar uncomment the lines below
-  # Warning: this is relying only on the capability of the
-  # LLM model to generate the correct function call.
-  json_regex_match:
-  - "(?s)<tool_call>(.*?)</tool_call>"
-  - "(?s)<tool_call>(.*?)"
-  replace_llm_results:
-  # Drop the scratchpad content from responses
-  - key: "(?s)<scratchpad>.*</scratchpad>"
-    value: ""
-  replace_function_results:
-  # Replace everything that is not JSON array or object
-  #
-  - key: '(?s)^[^{\[]*'
-    value: ""
-  - key: '(?s)[^}\]]*$'
-    value: ""
-  - key: "'([^']*?)'"
-    value: "_DQUOTE_${1}_DQUOTE_"
-  - key: '\\"'
-    value: "__TEMP_QUOTE__"
-  - key: "\'"
-    value: "'"
-  - key: "_DQUOTE_"
-    value: '"'
-  - key: "__TEMP_QUOTE__"
-    value: '"'
-  # Drop the scratchpad content from responses
-  - key: "(?s)<scratchpad>.*</scratchpad>"
-    value: ""
-
+    no_mixed_free_string: true
+    schema_type: llama3.1 # or JSON is supported too (json)
+  response_regex:
+  - <function=(?P<name>\w+)>(?P<arguments>.*)</function>
+mmap: true
+name: gpt-4
+parameters:
+  model: Hermes-3-Llama-3.2-3B-Q4_K_M.gguf
+stopwords:
+- <|im_end|>
+- <dummy32000>
+- <|eot_id|>
+- <|end_of_text|>
 template:
   chat: |
-    {{.Input -}}
-    <|im_start|>assistant
+    <|begin_of_text|><|start_header_id|>system<|end_header_id|>
+    You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>
+    {{.Input }}
+    <|start_header_id|>assistant<|end_header_id|>
   chat_message: |
-    <|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
-    {{- if .FunctionCall }}
-    <tool_call>
-    {{- else if eq .RoleName "tool" }}
-    <tool_response>
-    {{- end }}
-    {{- if .Content}}
-    {{.Content }}
-    {{- end }}
-    {{- if .FunctionCall}}
-    {{toJson .FunctionCall}}
-    {{- end }}
-    {{- if .FunctionCall }}
-    </tool_call>
-    {{- else if eq .RoleName "tool" }}
-    </tool_response>
-    {{- end }}<|im_end|>
+    <|start_header_id|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}<|end_header_id|>
+    {{ if .FunctionCall -}}
+    {{ else if eq .RoleName "tool" -}}
+    The Function was executed and the response was:
+    {{ end -}}
+    {{ if .Content -}}
+    {{.Content -}}
+    {{ else if .FunctionCall -}}
+    {{ range .FunctionCall }}
+    [{{.FunctionCall.Name}}({{.FunctionCall.Arguments}})]
+    {{ end }}
+    {{ end -}}
+    <|eot_id|>
   completion: |
     {{.Input}}
-  function: |-
-    <|im_start|>system
-    You are a function calling AI model.
-    Here are the available tools:
-    <tools>
-    {{range .Functions}}
-    {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
-    {{end}}
-    </tools>
-    You should call the tools provided to you sequentially
-    Please use <scratchpad> XML tags to record your reasoning and planning before you call the functions as follows:
-    <scratchpad>
-    {step-by-step reasoning and plan in bullet points}
-    </scratchpad>
-    For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
-    <tool_call>
-    {"arguments": <args-dict>, "name": <function-name>}
-    </tool_call><|im_end|>
-    {{.Input -}}
-    <|im_start|>assistant
+  function: |
+    <|start_header_id|>system<|end_header_id|>
+    You are an expert in composing functions. You are given a question and a set of possible functions.
+    Based on the question, you will need to make one or more function/tool calls to achieve the purpose.
+    If none of the functions can be used, point it out. If the given question lacks the parameters required by the function, also point it out. You should only return the function call in tools call sections.
+    If you decide to invoke any of the function(s), you MUST put it in the format as follows:
+    [func_name1(params_name1=params_value1,params_name2=params_value2,...),func_name2(params_name1=params_value1,params_name2=params_value2,...)]
+    You SHOULD NOT include any other text in the response.
+    Here is a list of functions in JSON format that you can invoke.
+    {{toJson .Functions}}
+    <|eot_id|><|start_header_id|>user<|end_header_id|>
+    {{.Input}}
+    <|eot_id|><|start_header_id|>assistant<|end_header_id|>
+
+download_files:
+- filename: Hermes-3-Llama-3.2-3B-Q4_K_M.gguf
+  sha256: 2e220a14ba4328fee38cf36c2c068261560f999fadb5725ce5c6d977cb5126b5
+  uri: huggingface://bartowski/Hermes-3-Llama-3.2-3B-GGUF/Hermes-3-Llama-3.2-3B-Q4_K_M.gguf
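The new `response_regex` entry is how LocalAI pulls a llama3.1-style tool call out of raw model text. A rough sketch of that extraction, using the exact regex from the config (the `get_weather` call and its arguments are invented for illustration):

```python
import json
import re

# Same pattern as the response_regex entry in the config above.
RESPONSE_REGEX = r"<function=(?P<name>\w+)>(?P<arguments>.*)</function>"

# Invented model output in the llama3.1 function-call format.
raw = '<function=get_weather>{"city": "Rome"}</function>'

match = re.search(RESPONSE_REGEX, raw)
name = match.group("name")                        # function name
arguments = json.loads(match.group("arguments"))  # JSON arguments
```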

aio/cpu/vision.yaml

+39 −21

@@ -1,31 +1,49 @@
-backend: llama-cpp
 context_size: 4096
 f16: true
 mmap: true
+mmproj: minicpm-v-2_6-mmproj-f16.gguf
 name: gpt-4o
-
-roles:
-  user: "USER:"
-  assistant: "ASSISTANT:"
-  system: "SYSTEM:"
-
-mmproj: bakllava-mmproj.gguf
 parameters:
-  model: bakllava.gguf
-
+  model: minicpm-v-2_6-Q4_K_M.gguf
+stopwords:
+- <|im_end|>
+- <dummy32000>
+- </s>
+- <|endoftext|>
 template:
   chat: |
-    A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.
+    {{.Input -}}
+    <|im_start|>assistant
+  chat_message: |
+    <|im_start|>{{ .RoleName }}
+    {{ if .FunctionCall -}}
+    Function call:
+    {{ else if eq .RoleName "tool" -}}
+    Function response:
+    {{ end -}}
+    {{ if .Content -}}
+    {{.Content }}
+    {{ end -}}
+    {{ if .FunctionCall -}}
+    {{toJson .FunctionCall}}
+    {{ end -}}<|im_end|>
+  completion: |
     {{.Input}}
-    ASSISTANT:
+  function: |
+    <|im_start|>system
+    You are a function calling AI model. You are provided with functions to execute. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
+    {{range .Functions}}
+    {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
+    {{end}}
+    For each function call return a json object with function name and arguments
+    <|im_end|>
+    {{.Input -}}
+    <|im_start|>assistant
 
 download_files:
-- filename: bakllava.gguf
-  uri: huggingface://mys/ggml_bakllava-1/ggml-model-q4_k.gguf
-- filename: bakllava-mmproj.gguf
-  uri: huggingface://mys/ggml_bakllava-1/mmproj-model-f16.gguf
-
-usage: |
-  curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
-    "model": "gpt-4-vision-preview",
-    "messages": [{"role": "user", "content": [{"type":"text", "text": "What is in the image?"}, {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" }}], "temperature": 0.9}]}'
+- filename: minicpm-v-2_6-Q4_K_M.gguf
+  sha256: 3a4078d53b46f22989adbf998ce5a3fd090b6541f112d7e936eb4204a04100b1
+  uri: huggingface://openbmb/MiniCPM-V-2_6-gguf/ggml-model-Q4_K_M.gguf
+- filename: minicpm-v-2_6-mmproj-f16.gguf
+  uri: huggingface://openbmb/MiniCPM-V-2_6-gguf/mmproj-model-f16.gguf
+  sha256: 4485f68a0f1aa404c391e788ea88ea653c100d8e98fe572698f701e5809711fd
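This commit drops the old `usage` curl from the vision config, but the request shape against LocalAI's OpenAI-compatible /v1/chat/completions stays the same for the MiniCPM-V backend. A sketch of the multimodal message body (the model name follows the config's `name: gpt-4o`; the image URL is the one from the removed example; note the old curl put `temperature` inside the messages array, which is corrected here):

```python
import json

# Multimodal chat-completions payload: one user message mixing a text
# part and an image_url part, as the OpenAI-style API expects.
payload = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                    },
                },
            ],
        }
    ],
    "temperature": 0.9,
}
body = json.dumps(payload)
```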

aio/gpu-8g/embeddings.yaml

+2 −2

@@ -1,7 +1,7 @@
+embeddings: true
 name: text-embedding-ada-002
-backend: sentencetransformers
 parameters:
-  model: all-MiniLM-L6-v2
+  model: huggingface://bartowski/granite-embedding-107m-multilingual-GGUF/granite-embedding-107m-multilingual-f16.gguf
 
 usage: |
   You can test this model with curl like this:

aio/gpu-8g/text-to-text.yaml

+35 −83

@@ -1,101 +1,53 @@
-name: gpt-4
-mmap: true
-parameters:
-  model: huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
-context_size: 8192
-
-stopwords:
-- "<|im_end|>"
-- "<dummy32000>"
-- "</tool_call>"
-- "<|eot_id|>"
-- "<|end_of_text|>"
-
+context_size: 4096
+f16: true
 function:
-  # disable injecting the "answer" tool
-  disable_no_action: true
-
+  capture_llm_results:
+  - (?s)<Thought>(.*?)</Thought>
   grammar:
-    # This allows the grammar to also return messages
-    mixed_mode: true
-    # Suffix to add to the grammar
-    #prefix: '<tool_call>\n'
-    # Force parallel calls in the grammar
-    # parallel_calls: true
-
-  return_name_in_function_response: true
-  # Without grammar uncomment the lines below
-  # Warning: this is relying only on the capability of the
-  # LLM model to generate the correct function call.
-  json_regex_match:
-  - "(?s)<tool_call>(.*?)</tool_call>"
-  - "(?s)<tool_call>(.*?)"
+    properties_order: name,arguments
+  json_regex_match:
+  - (?s)<Output>(.*?)</Output>
   replace_llm_results:
-  # Drop the scratchpad content from responses
-  - key: "(?s)<scratchpad>.*</scratchpad>"
-    value: ""
-  replace_function_results:
-  # Replace everything that is not JSON array or object
-  #
-  - key: '(?s)^[^{\[]*'
+  - key: (?s)<Thought>(.*?)</Thought>
     value: ""
-  - key: '(?s)[^}\]]*$'
-    value: ""
-  - key: "'([^']*?)'"
-    value: "_DQUOTE_${1}_DQUOTE_"
-  - key: '\\"'
-    value: "__TEMP_QUOTE__"
-  - key: "\'"
-    value: "'"
-  - key: "_DQUOTE_"
-    value: '"'
-  - key: "__TEMP_QUOTE__"
-    value: '"'
-  # Drop the scratchpad content from responses
-  - key: "(?s)<scratchpad>.*</scratchpad>"
-    value: ""
-
+mmap: true
+name: gpt-4
+parameters:
+  model: localai-functioncall-qwen2.5-7b-v0.5-q4_k_m.gguf
+stopwords:
+- <|im_end|>
+- <dummy32000>
+- </s>
 template:
   chat: |
     {{.Input -}}
     <|im_start|>assistant
   chat_message: |
-    <|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
-    {{- if .FunctionCall }}
-    <tool_call>
-    {{- else if eq .RoleName "tool" }}
-    <tool_response>
-    {{- end }}
-    {{- if .Content}}
+    <|im_start|>{{ .RoleName }}
+    {{ if .FunctionCall -}}
+    Function call:
+    {{ else if eq .RoleName "tool" -}}
+    Function response:
+    {{ end -}}
+    {{ if .Content -}}
     {{.Content }}
-    {{- end }}
-    {{- if .FunctionCall}}
+    {{ end -}}
+    {{ if .FunctionCall -}}
     {{toJson .FunctionCall}}
-    {{- end }}
-    {{- if .FunctionCall }}
-    </tool_call>
-    {{- else if eq .RoleName "tool" }}
-    </tool_response>
-    {{- end }}<|im_end|>
+    {{ end -}}<|im_end|>
   completion: |
     {{.Input}}
-  function: |-
+  function: |
     <|im_start|>system
-    You are a function calling AI model.
-    Here are the available tools:
-    <tools>
+    You are an AI assistant that executes function calls, and these are the tools at your disposal:
     {{range .Functions}}
     {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
     {{end}}
-    </tools>
-    You should call the tools provided to you sequentially
-    Please use <scratchpad> XML tags to record your reasoning and planning before you call the functions as follows:
-    <scratchpad>
-    {step-by-step reasoning and plan in bullet points}
-    </scratchpad>
-    For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
-    <tool_call>
-    {"arguments": <args-dict>, "name": <function-name>}
-    </tool_call><|im_end|>
+    <|im_end|>
     {{.Input -}}
-    <|im_start|>assistant
+    <|im_start|>assistant
+
+download_files:
+- filename: localai-functioncall-phi-4-v0.3-q4_k_m.gguf
+  sha256: 23fee048ded2a6e2e1a7b6bbefa6cbf83068f194caa9552aecbaa00fec8a16d5
+  uri: huggingface://mudler/LocalAI-functioncall-phi-4-v0.3-Q4_K_M-GGUF/localai-functioncall-phi-4-v0.3-q4_k_m.gguf
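The GPU config replaces the old scratchpad cleanup with a `<Thought>`/`<Output>` convention: `capture_llm_results` captures the reasoning, `replace_llm_results` strips it from the visible reply, and `json_regex_match` extracts the tool call. A sketch of that pipeline with the exact regexes from the config (the raw model output below is invented for illustration):

```python
import json
import re

# Same patterns as capture_llm_results / replace_llm_results /
# json_regex_match in the config above.
THOUGHT = r"(?s)<Thought>(.*?)</Thought>"
OUTPUT = r"(?s)<Output>(.*?)</Output>"

raw = (
    "<Thought>The user asked for the weather, so call get_weather.</Thought>\n"
    '<Output>{"name": "get_weather", "arguments": {"city": "Rome"}}</Output>'
)

thought = re.search(THOUGHT, raw).group(1)          # captured reasoning
visible = re.sub(THOUGHT, "", raw)                  # reasoning dropped from the reply
call = json.loads(re.search(OUTPUT, raw).group(1))  # extracted tool call
```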
