OS: Ubuntu 20.04.6 LTS

Python library mariadb versions: 1.1.10 and 1.1.12 (current)

MariaDB server versions: 10.3 and 11.7.1-rc

The following script raises the error mariadb.OperationalError: Can't create more than max_prepared_stmt_count statements (current value: 16382), which arises when the number of prepared statements held open on the backend database server exceeds a defined limit.

I could reproduce and would expect this error if I kept a list of cursor objects, one per call, and deliberately did not close the cursors.

However, since I am using a with block to manage the cursor object, it should be closed automatically (and indeed the error persists even when I explicitly add cursor.close() to the for loop).

(The error occurs after 16382 calls to cursor.executemany.)
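
For reference, the limit and the number of statements the server is currently holding open can be checked from Python; a minimal sketch, assuming the same test_connection_params dict defined below:

import mariadb

def show_prepared_stmt_info(params):
    # Sketch: print the server-side limit and the number of prepared
    # statements the server currently holds open.
    conn = mariadb.connect(**params)
    with conn.cursor() as cursor:
        cursor.execute("SHOW VARIABLES LIKE 'max_prepared_stmt_count'")
        print(cursor.fetchone())   # e.g. ('max_prepared_stmt_count', '16382')
        cursor.execute("SHOW GLOBAL STATUS LIKE 'Prepared_stmt_count'")
        print(cursor.fetchone())   # grows while statements leak
    conn.close()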

import mariadb
import json

test_connection_params = {
    'user': MARIADB_USERNAME,
    'password': MARIADB_PASSWORD,
    'host': MARIADB_HOST_ADDRESS,
    'port': 3306,
    'database': MARIADB_DATABASE_NAME
}

def main():
    conn = mariadb.connect(**test_connection_params)

    with conn.cursor() as cursor:
        cursor.execute("""
            CREATE TABLE IF NOT EXISTS my_table (
                id INT AUTO_INCREMENT PRIMARY KEY,
                name VARCHAR(128),
                description VARCHAR(2000),
                embedding TEXT NOT NULL
            );
        """)
        conn.commit()

    data = [(f"Product {i}", f"Description for product {i}", json.dumps([i*0.01, i*0.02, i*0.03, i*0.04])) for i in range(1_000_000)]
    query = """
    INSERT INTO my_table (name, description, embedding)
    VALUES (?, ?, ?)
    """

    chunk_size = 10

    for i in range(0, len(data), chunk_size):
        with conn.cursor() as cursor:
            cursor.executemany(query, data[i:i + chunk_size])
            conn.commit()

    conn.close()

This error does not occur when using the PyMySQL connector with similar logic.
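
For comparison, the PyMySQL version looks roughly like this (a sketch; PyMySQL uses %s placeholders and, as far as I know, sends plain text queries rather than server-side prepared statements, which would explain why the limit is never reached):

import pymysql

# Rough PyMySQL equivalent (sketch): same chunked insert against my_table.
conn = pymysql.connect(user=MARIADB_USERNAME, password=MARIADB_PASSWORD,
                       host=MARIADB_HOST_ADDRESS, port=3306,
                       database=MARIADB_DATABASE_NAME)

query = "INSERT INTO my_table (name, description, embedding) VALUES (%s, %s, %s)"

for i in range(0, len(data), chunk_size):
    with conn.cursor() as cursor:
        cursor.executemany(query, data[i:i + chunk_size])
    conn.commit()

conn.close()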

The error also does not occur when I swap the TEXT column for a VECTOR column, using the new MariaDB 11.7.1-rc vector feature:

def main():
    conn = mariadb.connect(**test_connection_params)

    with conn.cursor() as cursor:
        query = """
        CREATE TABLE IF NOT EXISTS products (
            id INT AUTO_INCREMENT PRIMARY KEY,
            name VARCHAR(128),
            description VARCHAR(2000),
            embedding VECTOR(4) NOT NULL
        );
        """
        cursor.execute(query)
        conn.commit()

    conn = mariadb.connect(**test_connection_params)

    data = [(f"Product {i}", f"Description for product {i}", json.dumps([i*0.01, i*0.02, i*0.03, i*0.04])) for i in range(1_000_000)]
    query = "INSERT INTO products (name, description, embedding) VALUES (?, ?, VEC_FromText(?))"

    chunk_size = 10
    for i in range(0, len(data), chunk_size):
        with conn.cursor() as cursor:
            cursor.executemany(query, data[i:i + chunk_size])
            conn.commit()

    conn.close()

Oddly, however, the error returns when I wrap the MariaDB part of the code (the same logic that was just working) in a Python class:

class DatabaseManager:
    def __init__(self, config):
        self.connection = mariadb.connect(**config)

    def create_table_if_not_exists(self):
        query = """
        CREATE TABLE IF NOT EXISTS products (
            id INT AUTO_INCREMENT PRIMARY KEY,
            name VARCHAR(128),
            description VARCHAR(2000),
            embedding VECTOR(4) NOT NULL
        );
        """
        with self.connection.cursor() as cursor:
            cursor.execute(query)
            self.connection.commit()

    def insert_rows(self, chunk):
        query = """
        INSERT INTO products (name, description, embedding)
        VALUES (?, ?, VEC_FromText(?))
        """
        with self.connection.cursor() as cursor:
            cursor.executemany(query, chunk)

    def close_connection(self):
        if self.connection:
            self.connection.close()

def main():
    db_manager = DatabaseManager(test_connection_params)
    db_manager.create_table_if_not_exists()

    data = [(f"Product {i}", f"Description for product {i}", json.dumps([i * 0.01, i * 0.02, i * 0.03, i * 0.04])) for i
            in range(1_000_000)]

    chunk_size = 10
    for i in range(0, len(data), chunk_size):
        db_manager.insert_rows(data[i:i + chunk_size])

    db_manager.close_connection()
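
A blunt mitigation (a sketch, not a fix) is to periodically tear down and reopen the connection, since the prepared statements are held per connection and are freed when it closes; reconnect_every is an arbitrary threshold chosen to stay below the server limit:

# Sketch: reopen the connection every reconnect_every chunks so the leaked
# statements are released before max_prepared_stmt_count is reached.
reconnect_every = 10_000

def insert_all(data, chunk_size):
    db_manager = DatabaseManager(test_connection_params)
    for n, i in enumerate(range(0, len(data), chunk_size)):
        if n and n % reconnect_every == 0:
            db_manager.close_connection()
            db_manager = DatabaseManager(test_connection_params)
        db_manager.insert_rows(data[i:i + chunk_size])
    db_manager.close_connection()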

asked Feb 24 at 18:30 by waveform_tinker
  • Database server resources are per-connection, not per-cursor. – Barmar Commented Feb 24 at 20:30
  • Cursors are client-side only, the server doesn't know anything about them. – Barmar Commented Feb 24 at 20:31
  • @Barmar Yes, I think this is a problem with the MariaDB Connector/Python, or MariaDB Connector/C. But I'm unsure if this behavior is intended, and if so, how I'm meant to use the Connector to release the prepared statements without resetting the connection. – waveform_tinker Commented Feb 24 at 21:23
  • I'm pretty sure it's by design and works similarly for all connectors. Cursors keep the state of a particular query, not a connection. – Barmar Commented Feb 24 at 21:25
  • @Barmar I think you're right and I've submitted a bug report on the MariaDB Jira site. – waveform_tinker Commented Feb 24 at 21:45

1 Answer

Update: Sorry, I had to rewrite my answer, since the initial answer was not correct.

You're right, for some reason the statement handle in the context manager is not released properly. I already have a fix; however, I'm dealing with some problems for the case where a reconnect occurred or the underlying connection was closed or dropped (here MariaDB Connector/C invalidates the statements, which makes it difficult to free the resources). I will update my answer as soon as the fix has been pushed.

As a workaround (and a more efficient solution), you should put the for loop inside the context manager:

with conn.cursor() as cursor:
    for i in range(0, len(data), chunk_size):
        cursor.executemany(query, data[i:i + chunk_size])
        conn.commit()

This will save memory and reduce network traffic. In the case of execute(), it would also avoid re-parsing the same statement on both the client and the server side.
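
For the class-based version from the question, the same idea might look like this (a sketch; the self._cursor attribute and the commit in insert_rows are not part of the original code):

class DatabaseManager:
    def __init__(self, config):
        self.connection = mariadb.connect(**config)
        # Reuse one cursor (and therefore one server-side statement handle)
        # for all chunks instead of creating a new cursor per call.
        self._cursor = self.connection.cursor()

    def insert_rows(self, chunk):
        query = """
        INSERT INTO products (name, description, embedding)
        VALUES (?, ?, VEC_FromText(?))
        """
        self._cursor.executemany(query, chunk)
        self.connection.commit()

    def close_connection(self):
        self._cursor.close()
        self.connection.close()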
