[ReOpen] Segmentation Recognition Fails When Using create_by_text in Parent-Child Mode in the Knowledge Base API #13013
Comments
The issue you're experiencing with segmentation recognition in the
{
  "name": "test-doc1",
  "text": "content_part_1<sep>content_part_2\tcontent_part_3\n\ncontent_part_4\ncontent_part_5",
  "indexing_technique": "high_quality",
  "doc_form": "hierarchical_model",
  "process_rule": {
    "mode": "hierarchical",
    "rules": {
      "pre_processing_rules": [
        {"id": "remove_extra_spaces", "enabled": false},
        {"id": "remove_urls_emails", "enabled": false}
      ],
      "segmentation": {
        "separator": "whatever",
        "max_tokens": 2000
      },
      "parent_mode": "paragraph",
      "subchunk_segmentation": {
        "separator": "***",
        "max_tokens": 500
      }
    }
  }
}
Make sure to update your payload accordingly and try again. This should help in applying the custom segmentation settings correctly [1].
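For readers who want to try a payload of this shape, here is a minimal Python sketch that assembles it and prepares the POST request without sending it. The base URL, the endpoint path `/datasets/{dataset_id}/document/create_by_text`, the bearer-token header, and the placeholder dataset ID and API key are all assumptions for illustration, not values confirmed in this thread; the helper names `build_payload` and `build_request` are hypothetical.

```python
import json
import urllib.request

# Assumed, hypothetical deployment details: replace with your own.
API_BASE = "https://api.dify.ai/v1"
DATASET_ID = "your-dataset-id"
API_KEY = "your-dataset-api-key"


def build_payload(name, text, parent_sep, child_sep,
                  parent_max_tokens=2000, child_max_tokens=500):
    """Assemble a parent-child (hierarchical) payload in the shape quoted above."""
    return {
        "name": name,
        "text": text,
        "indexing_technique": "high_quality",
        "doc_form": "hierarchical_model",
        "process_rule": {
            "mode": "hierarchical",
            "rules": {
                "pre_processing_rules": [
                    {"id": "remove_extra_spaces", "enabled": False},
                    {"id": "remove_urls_emails", "enabled": False},
                ],
                # Parent chunk settings
                "segmentation": {
                    "separator": parent_sep,
                    "max_tokens": parent_max_tokens,
                },
                "parent_mode": "paragraph",
                # Child chunk settings
                "subchunk_segmentation": {
                    "separator": child_sep,
                    "max_tokens": child_max_tokens,
                },
            },
        },
    }


def build_request(payload, api_base=API_BASE, dataset_id=DATASET_ID, api_key=API_KEY):
    """Prepare (but do not send) the POST request; the URL path is an assumption."""
    url = f"{api_base}/datasets/{dataset_id}/document/create_by_text"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


sample_text = "content_part_1<sep>content_part_2\tcontent_part_3\n\ncontent_part_4\ncontent_part_5"
payload = build_payload("test-doc1", sample_text, "whatever", "***")
request = build_request(payload)

# Sending requires valid credentials, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to print `request.data` and compare the exact JSON body against what the server reports when segmentation settings appear to be ignored.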
@dosu Awesome 👍
Thank you! If you have any other questions or need further help, feel free to let me know. Good luck!
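@dosu Please help me look through the Dify project code: where is the token count calculated? Specifically, during document upload through the API. The endpoint I use for uploading is create_by_text.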
In the Dify project, as for how the token count is calculated during document upload, no explicit documentation or comments explaining this process were found in the codebase, especially for the hierarchical model [1]. If you need more detailed implementation information, I suggest reviewing the relevant code or the discussions in related GitHub issues.
Self Checks
Dify version
0.15.2
Cloud or Self Hosted
Cloud
Steps to reproduce
@crazywoola Hi, I have reproduced the same issue on BOTH the cloud version and the latest version 0.15.2.
If there is a problem with my POST call, please point it out for me. Thank you.
Happy New Year.
POST:
Info: #13007
✔️ Expected Behavior
No response
❌ Actual Behavior