[Feature] Connector prepare for RAG #9713

@Hisoka-X

Search before asking

  • I have searched the existing feature requests and found no similar feature requirement.

Description

As a multimodal data integration tool, SeaTunnel should be able to parse complex file types, convert their contents into structured data streams, and finally write them into a vector database after embedding. This issue tracks the related tasks.

For chunking, please refer to https://docs.dify.ai/en/guides/knowledge-base/create-knowledge-and-upload-documents/chunking-and-cleaning-text and https://docs.llamaindex.ai/en/stable/examples/node_parsers/semantic_chunking/
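
To make the chunking step concrete, here is a rough sketch of one simple strategy: fixed-size character chunks with overlap. This is only an illustration, not a proposed SeaTunnel API; the function name `chunk_text` and the parameters `chunk_size` and `overlap` are hypothetical, and a real connector would more likely split on sentence or semantic boundaries.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper for illustration only; not part of SeaTunnel.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    # Each new chunk starts `step` characters after the previous one,
    # so consecutive chunks share `overlap` characters.
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be passed to an embedding step and written to the vector database as one row. The overlap keeps some shared context between neighbouring chunks, at the cost of storing some text twice.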

Usage Scenario

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct
