torchv/torchv-unstructured

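TorchV Unstructured is a Java document-parsing library optimized for RAG applications. Built on standard libraries such as Apache Tika, it supports multiple formats including DOC and PDF, and provides intelligent table recognition, content extraction, and Markdown/HTML export. Its goal is to supply AI/ML pipelines with structured data output efficiently and with a low memory footprint.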
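To make the Markdown export of recognized tables concrete, here is a minimal sketch of how rows of cell text could be rendered as a Markdown table. It uses only the Java standard library and is not torchv-unstructured's API: the class `MarkdownTableSketch` and the methods `toMarkdown` and `escape` are hypothetical names introduced for illustration, and the first row is assumed to be the header.

```java
import java.util.List;

// Hypothetical illustration only; not part of torchv-unstructured.
public class MarkdownTableSketch {

    // Renders rows as a Markdown table, treating the first row as the header (an assumption).
    static String toMarkdown(List<List<String>> rows) {
        if (rows.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        int cols = rows.get(0).size();
        appendRow(sb, rows.get(0), cols);
        // Separator line between the header and the body.
        sb.append("|");
        for (int i = 0; i < cols; i++) {
            sb.append(" --- |");
        }
        sb.append("\n");
        for (List<String> row : rows.subList(1, rows.size())) {
            appendRow(sb, row, cols);
        }
        return sb.toString();
    }

    // Writes one table row, padding short rows with empty cells.
    private static void appendRow(StringBuilder sb, List<String> row, int cols) {
        sb.append("|");
        for (int i = 0; i < cols; i++) {
            String cell = i < row.size() ? row.get(i) : "";
            sb.append(" ").append(escape(cell)).append(" |");
        }
        sb.append("\n");
    }

    // Escapes pipes and flattens newlines so a cell cannot break the table layout.
    static String escape(String cell) {
        return cell.replace("|", "\\|").replace("\n", " ");
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("Name", "Pages"),
                List.of("report.pdf", "12"));
        System.out.print(toMarkdown(rows));
    }
}
```

Markdown tables suit RAG pipelines because they keep row and column relationships in plain text that can be chunked and embedded directly.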

Tech stack

Root directory: java

Dependencies

cn.hutool:hutool-all
junit:junit
org.apache.poi:poi-ooxml-full
org.apache.tika:tika-core
org.apache.tika:tika-parsers-standard-package
org.bouncycastle:bcpkix-jdk18on
org.bouncycastle:bcprov-jdk18on
org.bouncycastle:bcutil-jdk18on
org.projectlombok:lombok
org.slf4j:slf4j-simple
