postgresql - serialization error when bulk insert in "SERIALIZABLE" transaction

What I did

  • I created two tables participants and participant_dates
  • in one transaction, I insert one participant "A" and 20 participant_dates rows for A

Below is my Python code (full):

import asyncio
from datetime import date
from uuid import uuid4

import asyncpg

async def get_connection():
    return await asyncpg.connect(
        user='postgres',
        password='1234',
        database='postgres',
        host='127.0.0.1',
        port='5432'
    )


async def bulk() -> None:
    """
    CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

    CREATE TABLE IF NOT EXISTS participants (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        name VARCHAR(255) NOT NULL
    );

    CREATE TABLE IF NOT EXISTS participant_dates (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        participant_id UUID REFERENCES participants(id),
        date DATE
    );
    """
    conn = await get_connection()
    try:
        async with conn.transaction(isolation="serializable"):
            id_ = await conn.fetchval(
                f""" 
            INSERT INTO participants (name) VALUES ('test_{str(uuid4())}') RETURNING id;
            """
            )
            await conn.executemany(
                """
                INSERT INTO participant_dates (participant_id, date) VALUES (
                    $1, 
                    $2
                ); 
                """,
                [(id_, date(2025, 1, day)) for day in range(1, 21)]
            )

    except Exception as e:
        print(e)
    finally:
        await conn.close()  # avoid leaking one connection per bulk() call


async def main() -> None:
    # 50 concurrent calls
    await asyncio.gather(*(bulk() for _ in range(50)))

asyncio.run(main())

What happened

A bunch of the errors below appeared:

could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on conflict out to pivot 882, during read.
HINT:  The transaction might succeed if retried.

What I want to do

  • I cannot lower the transaction isolation level (I'm using EdgeDB and it forces SERIALIZABLE)
  • I want to avoid the error.
  • I want to maintain full concurrency (50 or more concurrent calls)

Question

  • Could you please explain why there are conflicts? I use UUIDs (not auto-incremented integers, so I assumed no lock is required), and the transactions all insert different rows. Where do the (lock) conflicts come from?

P.S.

My problem is similar to "Why does PostgreSQL serializable transaction think this as conflict?", but even after I added

            await conn.execute(
                "SET enable_seqscan = off;"
            )
            id_ = await conn.fetchval(
                f"""
            INSERT INTO participants (name) VALUES ('test_{str(uuid4())}') RETURNING id;
            """
...
            )

the error stays the same:

could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on commit attempt with conflict in from prepared pivot.
HINT:  The transaction might succeed if retried.

  • 1) In Postgres, autoincrement (sequence/identity) does not use a lock. 2) I'm going to say using async is the issue: mixing a non-synchronous client with a sequential transaction level (SERIALIZABLE) is asking for issues. 3) Look at the Postgres log to see what is actually hitting the database and what the specific error is. – Adrian Klaver

2 Answers

An INSERT into participant_dates will look up and lock the referenced row in participants. With SERIALIZABLE isolation, this read will put a predicate lock on participants. This predicate lock is probably an SIRead lock on the index leaf page of participants' primary key index that contains the id in question.

As soon as a concurrent transaction inserts rows for a different id that happens to be in the same index page, you will get a (false positive) serialization error. These false positive serialization errors are to be expected:

While PostgreSQL's Serializable transaction isolation level only allows concurrent transactions to commit if it can prove there is a serial order of execution that would produce the same effect, it doesn't always prevent errors from being raised that would not occur in true serial execution.

With SERIALIZABLE isolation, you have to be ready to repeat transactions if they fail with a serialization error. It makes sense to reduce these errors, but the attempt to avoid them completely is a fool's errand. As a consolation, you will find that the errors will become less frequent as the tables grow.
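
A minimal retry sketch along those lines (my illustration, not part of the answer: it assumes bulk() is changed to re-raise instead of just printing the exception, and it matches SQLSTATE 40001, which asyncpg exposes as the exception's sqlstate attribute):

import asyncio
import random

import asyncpg


async def bulk_with_retry(max_attempts: int = 10) -> None:
    # Retry the whole transaction on serialization failures (SQLSTATE 40001);
    # anything else is re-raised immediately.
    for attempt in range(max_attempts):
        try:
            await bulk()  # the transaction from the question, re-raising errors
            return
        except asyncpg.PostgresError as e:
            if e.sqlstate != '40001':
                raise
            # Short randomized backoff so the retries don't all collide again.
            await asyncio.sleep(random.uniform(0, 0.05 * (attempt + 1)))
    raise RuntimeError(f"gave up after {max_attempts} serialization failures")

main() would then gather bulk_with_retry() calls instead of bulk(); the concurrency stays at 50, with occasional transparent retries.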

This is more of a lengthy comment, or a small expansion on the answer from @Laurenz Albe:

A slightly controversial solution might be to lift the REFERENCES participants(id) constraint. If it's there just as a good practice, demoting that to a soft reference shouldn't hurt - the tables still join just fine because values match, there's an index on participants.id and participant_dates.participant_id can be indexed independently, but there's no referential trigger guarding the link, messing with serializability.
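
For illustration, a sketch of that demotion (my addition; the constraint name below is Postgres's default for this column and is an assumption, so check pg_constraint if your schema names it differently):

async def demote_fk(conn: asyncpg.Connection) -> None:
    # Drop the referential trigger; the column and its values stay as-is.
    await conn.execute(
        "ALTER TABLE participant_dates "
        "DROP CONSTRAINT IF EXISTS participant_dates_participant_id_fkey;"
    )
    # Index the column independently so the join stays fast without the FK.
    await conn.execute(
        "CREATE INDEX IF NOT EXISTS participant_dates_participant_id_idx "
        "ON participant_dates (participant_id);"
    )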

With the constraint in place, that's what you get in pg_locks (note the SIReadLocks in question) when you insert the new participant_dates record:
demo at db<>fiddle

relation           locktype       mode              page  tuple  transactionid  virtualxid  fastpath
participants_pkey  relation       RowShareLock      null  null   null           null        t
participants       relation       RowShareLock      null  null   null           null        t
pg_locks           relation       AccessShareLock   null  null   null           null        t
participant_dates  relation       RowExclusiveLock  null  null   null           null        t
null               virtualxid     ExclusiveLock     null  null   null           4/7         t
null               transactionid  ExclusiveLock     null  null   740            null        f
participants_pkey  page           SIReadLock        1     null   null           null        f
participants       tuple          SIReadLock        0     1      null           null        f

Without the constraint, there's just no need to bother participants:

relation           locktype       mode              page  tuple  transactionid  virtualxid  fastpath
pg_locks           relation       AccessShareLock   null  null   null           null        t
participant_dates  relation       RowExclusiveLock  null  null   null           null        t
null               virtualxid     ExclusiveLock     null  null   null           4/9         t
null               transactionid  ExclusiveLock     null  null   742            null        f

It disables a bunch of things, like cascading update/delete and protection against orphaned entries, but it's not uncommon that none of that is actually in use, or that it's a SET NULL/SET DEFAULT action that nobody needs to happen right away. If it doesn't need to take effect immediately, an independent process could run periodically and emulate that and other behaviours of a proper FK (a sketch follows below), and you can set up assertion checks wherever something relies on the possibly delayed and otherwise unguarded integrity.
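
A sketch of such a periodic assertion check (a hypothetical helper of mine; what to do with the orphans it finds is a policy decision and not shown):

import asyncpg


async def find_orphaned_dates(conn: asyncpg.Connection) -> list:
    # Anti-join: participant_dates rows whose parent participant is missing.
    # With the FK gone, this query is the delayed integrity guard.
    return await conn.fetch(
        """
        SELECT pd.id
        FROM participant_dates pd
        LEFT JOIN participants p ON p.id = pd.participant_id
        WHERE p.id IS NULL;
        """
    )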

Some of the potential problems of not having the constraint can be mitigated by how you operate on these tables. Pairing up both of your inserts in one transaction, the way you already do, is a good example: you won't get orphaned participant_dates rows due to a failed insert of their parent participants row (or a participant without any dates), because if one insert fails, they both get rolled back.


Minor remark: since Postgres 13 you no longer need "uuid-ossp" to get random UUIDs; the built-in gen_random_uuid() does the same thing as uuid_generate_v4(). I'm keeping the extension in my demo only to get the same stable, reproducible v3 UUIDs on each re-run, based on a seed.
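
For example, the question's DDL could drop the extension entirely (a sketch under that assumption; the generated values are equivalent v4 UUIDs):

async def create_participants(conn: asyncpg.Connection) -> None:
    # Same table as the question's docstring, minus the uuid-ossp extension.
    await conn.execute(
        """
        CREATE TABLE IF NOT EXISTS participants (
            id UUID PRIMARY KEY DEFAULT gen_random_uuid(),  -- built-in since PG 13
            name VARCHAR(255) NOT NULL
        );
        """
    )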
