I'm building a chatbot with LlamaIndex on top of an Ollama LLM. It reads a set of PDF files and answers queries about their contents. Initially, I used this embedding model:
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
but it took a long time to build the index.
Now I'm using this one:
model_name="distilbert-base-uncased"
This "distilbert-base-uncased" model reduced my indexing time, but the answers I'm getting are not great, and query processing now takes longer (1 to 3 minutes).
I want to improve both the query processing time and the answer quality. I'm new to this field (LLMs), so I apologize if my explanation is unclear.
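One thing I'm considering trying (not sure if it's the right call): the log below warns that distilbert-base-uncased isn't a sentence-transformers model, so a smaller model that was actually trained for sentence embeddings might give better retrieval without the bge-base indexing cost. A sketch of the settings change, assuming the same HuggingFaceEmbedding API I'm already using:

```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Sketch only: "BAAI/bge-small-en-v1.5" is a smaller sibling of the bge-base
# model I started with, trained specifically for sentence embeddings, so it
# should avoid the mean-pooling fallback that distilbert-base-uncased triggers.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
```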
My output with timing:
Starting to read documents...
Time taken for document loading: 38.18 seconds
Document reading completed.
Setting up the embedding and language models...
No sentence-transformers model found with name distilbert-base-uncased. Creating a new one with mean pooling.
Time taken for embedding model and language model setup: 1.84 seconds
Model setup completed.
Starting index creation...
Time taken for index creation: 305.35 seconds
Index creation completed.
Enter your query (or type 'exit' to quit): what is Generics and give me the example program for Generic Classes?
Do you want to proceed with this query? (yes to proceed, no to skip): yes
Response: The ability .....
Query executed in 100.83 seconds.
Start Time: 2024-11-22 16:59:49
End Time: 2024-11-22 17:01:29
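Since the 305-second index build above is paid again on every run, I'm also wondering about a build-once, cache-on-disk pattern. A generic stdlib sketch of the idea (`load_or_build` is a name I made up; a real VectorStoreIndex may not be picklable, and llama_index apparently has its own persistence API, so this only illustrates the shape):

```python
import os
import pickle

def load_or_build(cache_path, build_fn):
    """Return the cached object from disk if present; otherwise build and cache it.

    Sketch of the build-once idea only: for an actual VectorStoreIndex,
    llama_index's own persistence API is likely the right tool, not pickle.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    obj = build_fn()  # pay the expensive build exactly once
    with open(cache_path, "wb") as f:
        pickle.dump(obj, f)
    return obj
```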
import time

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama


def log_time(start_time, action):
    """Helper function to log the time taken for a specific action."""
    elapsed_time = time.time() - start_time
    print(f"Time taken for {action}: {elapsed_time:.2f} seconds")
    return elapsed_time


# Load documents from the 'data' directory
def load_documents(directory):
    print("Starting to read documents...")
    start_time = time.time()  # Start time for document loading
    documents = SimpleDirectoryReader(directory).load_data()
    log_time(start_time, "document loading")
    print("Document reading completed.")
    return documents


# Set up embedding model and language model
def setup_models():
    print("Setting up the embedding and language models...")
    start_time = time.time()  # Start time for model setup
    Settings.embed_model = HuggingFaceEmbedding(model_name="distilbert-base-uncased")
    Settings.llm = Ollama(model="llama3", request_timeout=3600.0)
    log_time(start_time, "embedding model and language model setup")
    print("Model setup completed.")


# Create an index from the loaded documents
def create_index(documents):
    print("Starting index creation...")
    start_time = time.time()  # Start time for index creation
    index = VectorStoreIndex.from_documents(documents)
    log_time(start_time, "index creation")
    print("Index creation completed.")
    return index


# Query the index in a loop until the user exits
def query_engine_loop(query_engine):
    while True:
        user_query = input("Enter your query (or type 'exit' to quit): ")
        if user_query.lower() == 'exit':
            print("Exiting the query engine.")
            break
        confirm = input("Do you want to proceed with this query? (yes to proceed, no to skip): ").strip().lower()
        if confirm == 'yes':
            start_time = time.time()  # Start time for query execution
            response = query_engine.query(user_query)
            end_time = time.time()  # End time for query execution
            print("Response:", response)
            print(f"Query executed in {end_time - start_time:.2f} seconds.")
            print(f"Start Time: {time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(start_time))}")
            print(f"End Time: {time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(end_time))}")
        elif confirm == 'no':
            print("Query skipped.")
        else:
            print("Invalid input. Please type 'yes' or 'no'.")


def main():
    # Load documents
    documents = load_documents("data")
    # Set up the embedding model and LLM
    setup_models()
    # Create the index
    index = create_index(documents)
    # Create a query engine from the index
    query_engine = index.as_query_engine()
    # Start the querying loop
    query_engine_loop(query_engine)


if __name__ == "__main__":
    main()
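As an aside on the log_time helper in the code above: the repeated start_time/log_time pairs could be folded into a context manager. A pure-stdlib sketch (`timed` is a name I made up), mirroring log_time's output format:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(action):
    """Print how long the enclosed block took, in log_time's format."""
    start = time.time()
    try:
        yield
    finally:
        # Runs even if the block raises, so the timing is always logged
        print(f"Time taken for {action}: {time.time() - start:.2f} seconds")
```

Usage would be e.g. `with timed("index creation"): index = VectorStoreIndex.from_documents(documents)`, replacing the explicit start/stop bookkeeping in each function.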
Thanks!!!
(Original question: "python - How to improve query execution timing and indexing better in Ollama with llama_index Using a Local Model?" on Stack Overflow)