process_rule mode does not contemplate the hierarchical option #12506

Closed
4 tasks done
pr-maia opened this issue Jan 8, 2025 · 3 comments
Labels
  • 📚 documentation: Improvements or additions to documentation
  • stale: Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed

Comments

pr-maia commented Jan 8, 2025

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Provide a description of requested docs changes

The documentation states that the mode parameter of process_rules accepts two values (automatic or custom). In fact, to use parent-child chunking, mode must also be changed to hierarchical.

Even when all the other parameters are set correctly, for example doc_form set to hierarchical_model and parent_mode set to paragraph, the document is still processed as general rather than parent/child when mode is not hierarchical.
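
For reference, a minimal sketch of the request body this report describes. Only mode, doc_form and parent_mode (and their values) come from this issue. The key name process_rule follows the issue title, and the nesting of parent_mode under rules, the name and text fields, and the sample text are assumptions for illustration, not confirmed by the docs:

```python
import json

# Sketch of a document-creation payload for parent-child chunking.
# Only mode, doc_form and parent_mode are taken from this issue;
# the field nesting and the name/text fields are assumed.
payload = {
    "name": "example-doc",  # hypothetical
    "text": "First paragraph.\n\nSecond paragraph.",  # hypothetical
    "doc_form": "hierarchical_model",
    "process_rule": {
        "mode": "hierarchical",  # not "automatic" or "custom"
        "rules": {
            "parent_mode": "paragraph",
        },
    },
}

body = json.dumps(payload, indent=2)
print(body)
```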


dosubot bot commented Jan 8, 2025

The process_rules function in the Dify codebase does indeed support a hierarchical mode. This mode is specifically configured within the getProcessRule function. When the hierarchical mode is selected, the function sets up a ProcessRule object that includes pre-processing rules, segmentation for parent chunks, and subchunk segmentation for child chunks. The configuration uses delimiters and maximum token lengths specified in parentChildConfig for both parent and child segments, and the parent_mode is determined by parentChildConfig.chunkForContext [1].
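
A rough Python sketch of the mapping described above. The real getProcessRule is not reproduced here; build_process_rule, the keys of the config dictionary (other than chunkForContext), the output field names, and the sample values are hypothetical stand-ins chosen to mirror the description:

```python
from typing import Any


def build_process_rule(parent_child_config: dict[str, Any],
                       pre_processing_rules: list[dict[str, Any]]) -> dict[str, Any]:
    """Assemble a hierarchical process rule from a parent-child config (illustrative only)."""
    parent = parent_child_config["parent"]
    child = parent_child_config["child"]
    return {
        "mode": "hierarchical",
        "rules": {
            "pre_processing_rules": pre_processing_rules,
            # Parent chunks: delimiter and maximum token length.
            "segmentation": {
                "separator": parent["delimiter"],
                "max_tokens": parent["max_length"],
            },
            # parent_mode follows chunkForContext ("paragraph" in this issue).
            "parent_mode": parent_child_config["chunkForContext"],
            # Child chunks: their own delimiter and maximum token length.
            "subchunk_segmentation": {
                "separator": child["delimiter"],
                "max_tokens": child["max_length"],
            },
        },
    }


# Sample values below are made up for the example.
config = {
    "chunkForContext": "paragraph",
    "parent": {"delimiter": "\n\n", "max_length": 500},
    "child": {"delimiter": "\n", "max_length": 200},
}
rule = build_process_rule(config, [{"id": "remove_extra_spaces", "enabled": True}])
```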

If you're experiencing issues with the hierarchical mode not functioning as expected, it might be worth checking the configuration of parentChildConfig and ensuring that all related parameters are correctly set. This includes verifying that doc_form is set to hierarchical_model and parent_mode is set to the desired value, such as paragraph. If these configurations are correct and the issue persists, it might be necessary to review the implementation details or any related modules that interact with process_rules.
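
One way to run that check before sending a request is a small consistency test. This is a sketch based on the symptoms reported in this issue: check_parent_child_request is a hypothetical helper, and the payload layout (process_rule.rules.parent_mode) is an assumption, not something this thread confirms:

```python
def check_parent_child_request(payload: dict) -> list[str]:
    """List settings that look inconsistent with parent-child processing (illustrative)."""
    problems = []
    process_rule = payload.get("process_rule", {})
    rules = process_rule.get("rules", {})

    if payload.get("doc_form") != "hierarchical_model":
        problems.append("doc_form should be 'hierarchical_model'")
    if process_rule.get("mode") != "hierarchical":
        problems.append("process_rule.mode should be 'hierarchical'")
    if not rules.get("parent_mode"):
        problems.append("parent_mode is not set")
    return problems


# The situation reported in this issue: everything set except mode.
reported = {
    "doc_form": "hierarchical_model",
    "process_rule": {"mode": "custom", "rules": {"parent_mode": "paragraph"}},
}
print(check_parent_child_request(reported))
```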

dosubot bot added the 📚 documentation label on Jan 8, 2025
crazywoola (Member) commented

We had a fix for this in #12495, and we will update the docs soon.

dosubot bot commented Feb 8, 2025

Hi, @pr-maia. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • You raised a concern about the missing "hierarchical" option in the process_rules function documentation.
  • I confirmed the support for the hierarchical mode and provided guidance.
  • @crazywoola mentioned a fix in a pull request to update the documentation.
  • You acknowledged the resolution with a thumbs-up reaction.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the Dify repository. If so, feel free to comment to keep the discussion open.
  • If there are no further updates, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

dosubot bot added the stale label on Feb 8, 2025