I'm using the following code to test inserting data into ChromaDB. I want to insert embeddings manually, so I pass them through the embeddings parameter of the add method, but the embeddings I get back from the database are always None.

import chromadb
import numpy as np

client = chromadb.Client()
collection = client.create_collection(name="test")

def get_embedding(text):
    a = np.random.rand(384) 
    print(f"Generated embedding: {a}")
    return a 

documents = ["This is a document.", "Another document.", "And a third document."]
embeddings = [get_embedding(doc) for doc in documents]

print(f"Embeddings to be added: {embeddings}")

collection.add(
    documents=documents,
    embeddings=embeddings,  
    ids=["doc1", "doc2", "doc3"]
)

query_embedding = get_embedding("This is a query.")
results = collection.query(query_embeddings=[query_embedding], n_results=1)
print("Query results:", results)

output:

Generated embedding: [5.94073439e-01 3.85563925e-01 ......... ]
Query results: {'ids': [['doc3']], 'embeddings': None, 'documents': [['And a third document.']], 'uris': None, 'data': None, 'metadatas': [[None]], 'distances': [[62.61347579956055]], 'included': [<IncludeEnum.distances: 'distances'>, <IncludeEnum.documents: 'documents'>, <IncludeEnum.metadatas: 'metadatas'>]}

Asked Feb 24 at 10:28 by Ethan (edited Feb 24 at 10:43)

1 Answer

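The embeddings are stored correctly; query just does not return them unless asked. As the 'included' field in the output shows, only distances, documents, and metadatas are returned by default. Change the query call from

results = collection.query(query_embeddings=[query_embedding], n_results=1)

to

results = collection.query(query_embeddings=[query_embedding], n_results=1, include=["embeddings"])

and the 'embeddings' field will be populated. Note that include replaces the default list, so add "documents", "metadatas", or "distances" to it if you still need those fields.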
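As a rough sketch of the same idea, the two helpers below wrap the calls that return stored embeddings. The function names query_with_embeddings and get_stored_embeddings are hypothetical (they are not part of ChromaDB); they only forward to collection.query and collection.get with an explicit include list, and they assume the collection object was created as in the question.

```python
def query_with_embeddings(collection, query_embedding, n_results=1):
    """Hypothetical helper: run a nearest-neighbour query and request the
    stored embeddings in addition to the fields returned by default."""
    return collection.query(
        query_embeddings=[query_embedding],
        n_results=n_results,
        # include replaces the default list, so every wanted field is named.
        include=["embeddings", "documents", "metadatas", "distances"],
    )


def get_stored_embeddings(collection, ids):
    """Hypothetical helper: fetch records by id, without running a
    similarity search, to check that embeddings were actually inserted."""
    return collection.get(ids=ids, include=["embeddings"])
```

With the collection from the question, query_with_embeddings(collection, query_embedding)["embeddings"] should no longer be None, and get_stored_embeddings(collection, ["doc1"]) is a quick way to confirm what was written by add.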

Tags: python, Embedding found None After ChromaDB insertion, Stack Overflow