llama index - Search Milvus db before re-indexing a document - Stack Overflow

Updated: 2025-01-08

I am writing a function to index some resources using LlamaIndex, with Milvus as the vector database.

When storing the data, I also include metadata for each resource that is ingested. I am trying to understand the correct way to avoid re-indexing every document each time I call my function: only the documents missing from the index should be added. My idea is to use an id that I keep in the metadata.

This is how I ingest and persist my data without checking if a document is already indexed:

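from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# get_content_paths_list, get_metadata_paths_list and get_or_create_collection
# are my own helper functions (not shown here).
documents = SimpleDirectoryReader(
    input_files=get_content_paths_list(),
    file_metadata=get_metadata_paths_list(),
).load_data()

# Embedding model used to vectorize the documents
Settings.embed_model = HuggingFaceEmbedding(model_name="dunzhang/stella_en_1.5B_v5")

# ollama
Settings.llm = Ollama(model="llama3.2", request_timeout=360.0)

# Store the vectors in the Milvus collection
storage_context = StorageContext.from_defaults(
    vector_store=get_or_create_collection(dim=1024, collection_name="my_collection")
)

# Embeds and inserts every loaded document, whether or not it is already indexed
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, show_progress=True
)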
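To make the idea of skipping already-indexed documents concrete, here is a minimal sketch of the filtering step only. It assumes each document exposes a `metadata` dict, as the loaded documents above do, and that the id is stored under a key named `resource_id`. That key name, the `Doc` stand-in class and the `select_unindexed` helper are hypothetical and used only for illustration. How the set of already-indexed ids is read back from the Milvus collection is not shown, since it depends on how the collection and its metadata fields are set up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Set


@dataclass
class Doc:
    """Hypothetical stand-in for a loaded document; only `metadata` is used."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def select_unindexed(
    documents: Iterable[Any],
    indexed_ids: Set[str],
    id_key: str = "resource_id",  # assumed metadata key, not from the original post
) -> List[Any]:
    """Return the documents whose metadata id is not already indexed.

    Documents without an id are kept, because there is no way to tell
    whether they were indexed before. Ids repeated within the same batch
    are kept only once.
    """
    seen = set(indexed_ids)  # copy, so the caller's set is not modified
    missing = []
    for doc in documents:
        doc_id = doc.metadata.get(id_key)
        if doc_id is None:
            missing.append(doc)
        elif doc_id not in seen:
            missing.append(doc)
            seen.add(doc_id)
    return missing


docs = [
    Doc("first", {"resource_id": "r1"}),
    Doc("second", {"resource_id": "r2"}),
    Doc("third", {"resource_id": "r3"}),
]
# Pretend r1 and r3 were found in the vector store by an earlier lookup.
new_docs = select_unindexed(docs, {"r1", "r3"})
print([d.metadata["resource_id"] for d in new_docs])
```

The resulting list could then be passed to the indexing step in place of `documents`, so that only the missing resources are embedded and stored.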
