I found a similar open issue regarding the extraction order of paragraphs and tables from DOCX files. The problem arises because the code processes paragraphs and tables separately, which disrupts their original order. The suggested solution is to traverse the elements in the exact order they appear in the document, processing each paragraph and table as encountered. This approach ensures the extracted content retains the original sequence of the document [1].
Additionally, the WordExtractor class in the Dify platform's codebase provides a method to maintain the order by iterating over the document's elements and processing them based on their type (paragraph or table) [2][3].
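The in-order traversal described above can be sketched with only the Python standard library. This is a minimal illustration, not the Dify `WordExtractor` code: the names `iter_blocks_in_order` and `make_sample_docx` are hypothetical, and the sketch assumes a plain DOCX whose paragraphs and tables are direct children of the `w:body` element (content wrapped in other containers, such as content controls, would be skipped).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a DOCX archive.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def _text(element):
    """Concatenate the text of all w:t runs below the given element."""
    return "".join(t.text or "" for t in element.iter(f"{W}t"))


def iter_blocks_in_order(docx_file):
    """Yield ("paragraph", text) or ("table", rows) in document order.

    Hypothetical helper: walks the direct children of w:body once instead
    of collecting all paragraphs first and all tables afterwards.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    body = root.find(f"{W}body")
    for child in body:
        if child.tag == f"{W}p":
            yield "paragraph", _text(child)
        elif child.tag == f"{W}tbl":
            rows = [
                [_text(cell) for cell in row.findall(f"{W}tc")]
                for row in child.findall(f"{W}tr")
            ]
            yield "table", rows


# A tiny document for demonstration: paragraph, table, paragraph.
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Intro</w:t></w:r></w:p>"
    "<w:tbl><w:tr>"
    "<w:tc><w:p><w:r><w:t>A</w:t></w:r></w:p></w:tc>"
    "<w:tc><w:p><w:r><w:t>B</w:t></w:r></w:p></w:tc>"
    "</w:tr></w:tbl>"
    "<w:p><w:r><w:t>Outro</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def make_sample_docx():
    """Build an in-memory archive holding only word/document.xml (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", SAMPLE_XML)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for kind, content in iter_blocks_in_order(make_sample_docx()):
        print(kind, content)
```

Because each body child is handled as it is encountered, the table in the sample stays between the two paragraphs instead of being moved to the end.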
Self Checks
Dify version
0.15.3
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
✔️ Expected Behavior
The order of paragraphs and tables should be consistent with the original document.
❌ Actual Behavior
The order of paragraphs and tables is not consistent with the original document.