I am using Milvus as the document store with Haystack.
The MilvusDocumentStore connection object is created with:
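from functools import lru_cache

from milvus_haystack import MilvusDocumentStore


@lru_cache
def get_vector_db():
    # Get document store from database
    # get_settings() is the application's own settings helper (not shown)
    return MilvusDocumentStore(
        connection_args={
            "uri": get_settings().milvus_db_path
        },  # Milvus Lite
        drop_old=True
    )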
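Because get_vector_db is wrapped in lru_cache, the constructor should run only once per process, and later calls should return the cached object. Here is a minimal sketch of that behaviour. FakeStore and the "./example.db" path are hypothetical stand-ins, used so the snippet runs without Milvus; they are not part of the real setup:

```python
from functools import lru_cache


class FakeStore:
    """Hypothetical stand-in for MilvusDocumentStore; counts constructions."""

    created = 0

    def __init__(self, uri: str, drop_old: bool = False) -> None:
        FakeStore.created += 1
        self.uri = uri
        self.drop_old = drop_old


@lru_cache
def get_fake_store() -> FakeStore:
    # The path is an assumption for illustration only.
    return FakeStore(uri="./example.db", drop_old=True)


first = get_fake_store()
second = get_fake_store()

# Both calls return the same cached object; the constructor ran once.
same_instance = first is second
```

The cache lives inside one Python process, so each worker process of an ASGI server would still construct its own store.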
The Haystack pipeline is defined as follows:
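from haystack import Pipeline
from haystack.components.converters import TextFileToDocument
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.joiners import DocumentJoiner
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.routers import FileTypeRouter
from haystack.components.writers import DocumentWriter
from haystack.document_stores.types import DuplicatePolicy

# Route incoming files by MIME type
file_type_router = FileTypeRouter(
    mime_types=[
        "text/plain"
    ]
)
# Convert plain text files to Document objects
text_converter = TextFileToDocument()
# Join Documents coming from different branches of a pipeline
document_joiner = DocumentJoiner()
# Clean the text of the documents
document_cleaner = DocumentCleaner()
# Split the documents into smaller documents
document_splitter = DocumentSplitter(split_by="sentence", split_length=2)
# Create embeddings from the Documents
document_embedder = SentenceTransformersDocumentEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2"
)
# Write the documents to the DocumentStore
# (document_store is assumed to be the object returned by get_vector_db())
document_writer = DocumentWriter(document_store, policy=DuplicatePolicy.NONE)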
# Build the Indexing pipeline
preprocessing_pipeline = Pipeline()
preprocessing_pipeline.add_component(
    name="file_type_router", instance=file_type_router
)
preprocessing_pipeline.add_component(name="text_converter", instance=text_converter)
preprocessing_pipeline.add_component(
    name="document_joiner", instance=document_joiner
)
preprocessing_pipeline.add_component(
    name="document_cleaner", instance=document_cleaner
)
preprocessing_pipeline.add_component(
    name="document_splitter", instance=document_splitter
)
preprocessing_pipeline.add_component(
    name="document_embedder", instance=document_embedder
)
preprocessing_pipeline.add_component(
    name="document_writer", instance=document_writer
)
# Connect components
# The router's output socket is named after the MIME type ("text/plain")
preprocessing_pipeline.connect(
    "file_type_router.text/plain", "text_converter.sources"
)
preprocessing_pipeline.connect("text_converter", "document_joiner")
preprocessing_pipeline.connect("document_joiner", "document_cleaner")
preprocessing_pipeline.connect("document_cleaner", "document_splitter")
preprocessing_pipeline.connect("document_splitter", "document_embedder")
# NOTE: "duplicate_checker" is not added to the pipeline in this snippet;
# it is presumably a custom component defined elsewhere.
preprocessing_pipeline.connect("document_embedder", "duplicate_checker")
preprocessing_pipeline.connect(
    "duplicate_checker.documents_to_index", "document_writer.documents"
)
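The connections refer to a duplicate_checker component that never appears in an add_component call in this snippet. A quick way to catch this kind of mismatch is to compare the names used in connect against the names that were registered. The following is only a sketch in plain Python: unregistered_endpoints is a hypothetical helper, not part of Haystack, and the two lists mirror the names used in the pipeline code.

```python
def unregistered_endpoints(component_names, connections):
    """Return component names used in connections but never registered."""
    known = set(component_names)
    missing = []
    for sender, receiver in connections:
        for endpoint in (sender, receiver):
            # "component.socket" -> "component"
            name = endpoint.split(".", 1)[0]
            if name not in known and name not in missing:
                missing.append(name)
    return missing


# Names passed to add_component in the pipeline code.
component_names = [
    "file_type_router", "text_converter", "document_joiner",
    "document_cleaner", "document_splitter", "document_embedder",
    "document_writer",
]
# (sender, receiver) pairs passed to connect in the pipeline code.
connections = [
    ("file_type_router.text/plain", "text_converter.sources"),
    ("text_converter", "document_joiner"),
    ("document_joiner", "document_cleaner"),
    ("document_cleaner", "document_splitter"),
    ("document_splitter", "document_embedder"),
    ("document_embedder", "duplicate_checker"),
    ("duplicate_checker.documents_to_index", "document_writer.documents"),
]

missing = unregistered_endpoints(component_names, connections)
```

Here missing contains only "duplicate_checker", so that component would need to be added to the pipeline before the last two connections can be made.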
When I try to write to the database, I get the following error:
Failed to create collection: HaystackCollection error: <MilvusException:
(code=2000, message=Assert "!name_ids_.count(field_name)" at
/Users/zilliz/milvus-lite/thirdparty/milvus/internal/core/src/common/Schema.h:172
=> duplicated field name: segcore error)>
ERROR: Exception in ASGI application
Stepping through the error with the debugger, it looks like there is an attempt to recreate the default database collection.
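The assertion in the message (!name_ids_.count(field_name)) appears to fire when a field name is added to a schema that already contains it. One way to narrow this down while debugging would be to collect the field names being sent to Milvus and look for repeats. The following is a minimal sketch of such a check; duplicated_names is a hypothetical helper, and the field list is a made-up placeholder, not the real collection schema.

```python
from collections import Counter


def duplicated_names(field_names):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(field_names)
    return [name for name in counts if counts[name] > 1]


# Hypothetical field list for illustration; not taken from the actual schema.
example_fields = ["id", "text", "vector", "text"]
repeated = duplicated_names(example_fields)
```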