chromadb - InvalidDimensionException: Embedding dimension 384 does not match collection dimensionality 1024 - Stack Overflow

Updated: 2025-04-14

Every time I add my JSONL to a new ChromaDB collection, it says that my vector has dimension 384 when it should be 1024.

Something seems to be going wrong with the chromadb insertion but I can't figure it out.

I start with a JSONL file in which each line has an embedding (size 1024), id, metadata, and document.

I have checked that the embedding on every line has length 1024.
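For reference, a check like the one described above can be sketched as follows. This is only an illustration: the helper name `embedding_dimensions` and the in-memory sample data are made up, and the `embedding` key is assumed to match the field read by the insertion code.

```python
import io
import json
from collections import Counter


def embedding_dimensions(lines, key="embedding"):
    """Count how many records have each embedding length."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts[len(json.loads(line)[key])] += 1
    return counts


# Tiny in-memory stand-in for the JSONL file (3-dimensional vectors for brevity).
sample = io.StringIO(
    '{"id": 1, "document": "a", "metadata": {}, "embedding": [0.1, 0.2, 0.3]}\n'
    '{"id": 2, "document": "b", "metadata": {}, "embedding": [0.4, 0.5, 0.6]}\n'
)
dims = embedding_dimensions(sample)
print(dims)
```

Passing the real file handle instead of `sample` should report a single length (1024) if every line is consistent.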

Then I check which ChromaDB collections exist (ensuring there are none), create a collection, and insert my JSONL into it. But when I attempt to query, I get: InvalidDimensionException: Embedding dimension 384 does not match collection dimensionality 1024.

I've deleted everything and restarted from scratch. What else should I try?

This is the insertion code:

import chromadb
import json

# Initialize a persistent client (data is stored under ./chromadb_store)
client = chromadb.PersistentClient(path="./chromadb_store")

# Load the JSONL file
documents = []
embeddings = []
metadatas = []
ids = []

# Read JSONL and extract data
with open('810_and_embeddings.jsonl', 'r') as f:
    for line in f:
        data = json.loads(line)
        documents.append(data['document'])
        embeddings.append(data['embedding'])
        metadatas.append(data['metadata'])
        # Convert id to string
        ids.append(str(data['id']))  # Ensure the ID is a string

# Load the collection if it already exists, otherwise create it
collection_name = "df_810"
collection = client.get_or_create_collection(name=collection_name)
print(f"Collection '{collection_name}' ready.")

# Set batch size for processing
batch_size = 100  # Adjust this based on memory constraints

# Function to insert data in batches
def insert_in_batches(documents, embeddings, metadatas, ids, batch_size):
    for i in range(0, len(documents), batch_size):
        # Get the batch slice
        batch_docs = documents[i:i + batch_size]
        batch_embeddings = embeddings[i:i + batch_size]
        batch_metadatas = metadatas[i:i + batch_size]
        batch_ids = ids[i:i + batch_size]
        
        # Add the batch to the collection
        collection.add(
            documents=batch_docs,
            embeddings=batch_embeddings,
            metadatas=batch_metadatas,
            ids=batch_ids
        )
        print(f"Inserted batch {i // batch_size + 1} of {(len(documents) + batch_size - 1) // batch_size}")

# Insert data in batches
insert_in_batches(documents, embeddings, metadatas, ids, batch_size)

print("All documents and embeddings inserted successfully.")
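The query code is not included in the post, so what follows is an assumption about the cause rather than a confirmed diagnosis. 384 is the output size of all-MiniLM-L6-v2, the model behind Chroma's default embedding function. That function is used when a collection created without an `embedding_function` is queried with `query_texts`. If that is what is happening, one option is to embed the query with the same 1024-dimensional model that produced the stored vectors and pass the result via `query_embeddings`. A minimal sketch follows; the helper name `query_by_embedding` and its parameters are hypothetical.

```python
def query_by_embedding(collection, query_embedding, expected_dim=1024, n_results=5):
    """Query a collection with a precomputed vector after checking its length."""
    if len(query_embedding) != expected_dim:
        raise ValueError(
            f"query embedding has {len(query_embedding)} dimensions, "
            f"expected {expected_dim}"
        )
    # Passing query_embeddings means Chroma does not embed the query itself,
    # so the default 384-dimensional model is never involved.
    return collection.query(query_embeddings=[query_embedding], n_results=n_results)
```

Here `collection` would be the object returned by the client in the insertion code, and `query_embedding` a plain list of floats produced by the model used to build the JSONL file.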
