What I did
- I created two tables, participants and participant_dates.
- In one transaction, I insert one participant "A" and 20 participant_dates rows for A.
Below is my Python code (full):
import asyncio
from datetime import date
from uuid import uuid4

import asyncpg


async def get_connection():
    return await asyncpg.connect(
        user='postgres',
        password='1234',
        database='postgres',
        host='127.0.0.1',
        port='5432'
    )


async def bulk() -> None:
    """
    CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

    CREATE TABLE IF NOT EXISTS participants (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        name VARCHAR(255) NOT NULL
    );

    CREATE TABLE IF NOT EXISTS participant_dates (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        participant_id UUID REFERENCES participants(id),
        date DATE
    );
    """
    conn = await get_connection()
    try:
        async with conn.transaction(isolation="serializable"):
            id_ = await conn.fetchval(
                f"""
                INSERT INTO participants (name) VALUES ('test_{str(uuid4())}') RETURNING id;
                """
            )
            await conn.executemany(
                """
                INSERT INTO participant_dates (participant_id, date) VALUES (
                    $1,
                    $2
                );
                """,
                [
                    (id_, date(2025, 1, 1)),
                    (id_, date(2025, 1, 2)),
                    (id_, date(2025, 1, 3)),
                    (id_, date(2025, 1, 4)),
                    (id_, date(2025, 1, 5)),
                    (id_, date(2025, 1, 6)),
                    (id_, date(2025, 1, 7)),
                    (id_, date(2025, 1, 8)),
                    (id_, date(2025, 1, 9)),
                    (id_, date(2025, 1, 10)),
                    (id_, date(2025, 1, 11)),
                    (id_, date(2025, 1, 12)),
                    (id_, date(2025, 1, 13)),
                    (id_, date(2025, 1, 14)),
                    (id_, date(2025, 1, 15)),
                    (id_, date(2025, 1, 16)),
                    (id_, date(2025, 1, 17)),
                    (id_, date(2025, 1, 18)),
                    (id_, date(2025, 1, 19)),
                    (id_, date(2025, 1, 20)),
                ]
            )
    except Exception as e:
        print(e)


async def main() -> None:
    # 50 concurrent calls
    await asyncio.gather(
        bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(),
        bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(),
        bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(),
        bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(),
        bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(), bulk(),
    )


asyncio.run(main())
What happened
A bunch of the errors below appeared:
could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on conflict out to pivot 882, during read.
HINT: The transaction might succeed if retried.
What I want to do
- I cannot lower the transaction isolation level (I'm using EdgeDB and it forces SERIALIZABLE).
- I want to avoid the error.
- And I want to maintain full concurrency (50 or more concurrent calls).
Question
- Could you please explain why there are conflicts? I use UUIDs (not auto-incremented integers, so I guess they don't require a lock), and the transactions all insert different rows. Where do the (lock) conflicts come from?
P.S.
My problem is similar to "Why does PostgreSQL serializable transaction think this as conflict?", but even after I added

await conn.execute(
    "SET enable_seqscan = off;"
)
id_ = await conn.fetchval(
    f"""
    INSERT INTO participants (name) VALUES ('test_{str(uuid4())}') RETURNING id;
    """
    ...
)

the error stays the same:
could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on commit attempt with conflict in from prepared pivot.
HINT: The transaction might succeed if retried.
2 Answers
An INSERT into participant_dates will look up and lock the referenced row in participants. With SERIALIZABLE isolation, this read will put a predicate lock on participants. This predicate lock is probably an SIRead lock on the leaf page of participants' primary key index that contains the id in question.

As soon as a concurrent transaction inserts rows for a different id that happens to be in the same index page, you will get a (false positive) serialization error. These false positive serialization errors are to be expected:

"While PostgreSQL's Serializable transaction isolation level only allows concurrent transactions to commit if it can prove there is a serial order of execution that would produce the same effect, it doesn't always prevent errors from being raised that would not occur in true serial execution."

With SERIALIZABLE isolation, you have to be ready to repeat transactions if they fail with a serialization error. It makes sense to reduce these errors, but the attempt to avoid them completely is a fool's errand. As a consolation, you will find that the errors become less frequent as the tables grow.
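To illustrate, here is a minimal retry sketch (not part of the original post): the function name bulk_with_retry, the MAX_RETRIES cap and the backoff are my own assumptions, get_connection() is the helper from the question, and serialization failures are recognised by SQLSTATE 40001 (serialization_failure).

import asyncio
from datetime import date
from uuid import uuid4

import asyncpg

MAX_RETRIES = 10  # arbitrary cap; tune for your workload


async def bulk_with_retry() -> None:
    conn = await get_connection()  # reuses the helper from the question
    try:
        for attempt in range(MAX_RETRIES):
            try:
                async with conn.transaction(isolation="serializable"):
                    id_ = await conn.fetchval(
                        "INSERT INTO participants (name) VALUES ($1) RETURNING id;",
                        f"test_{uuid4()}",
                    )
                    await conn.executemany(
                        "INSERT INTO participant_dates (participant_id, date)"
                        " VALUES ($1, $2);",
                        [(id_, date(2025, 1, day)) for day in range(1, 21)],
                    )
                return  # committed successfully
            except asyncpg.PostgresError as exc:
                if exc.sqlstate != "40001":  # not a serialization failure
                    raise
                await asyncio.sleep(0.01 * (attempt + 1))  # small backoff, then retry
        raise RuntimeError("gave up after repeated serialization failures")
    finally:
        await conn.close()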
It's more of a lengthy comment or a small expansion on the answer from @Laurenz Albe:

A slightly controversial solution might be to lift the REFERENCES participants(id) constraint. If it's there just as a good practice, demoting it to a soft reference shouldn't hurt - the tables still join just fine because the values match, there's an index on participants.id, and participant_dates.participant_id can be indexed independently, but there's no referential trigger guarding the link and messing with serializability.
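For illustration only (this DDL isn't spelled out in the answer, and the index name is made up), the soft-reference variant of the child table could look roughly like this, using a conn as in the question:

await conn.execute("""
    CREATE TABLE IF NOT EXISTS participant_dates (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        participant_id UUID,  -- soft reference: no REFERENCES participants(id)
        date DATE
    );
    -- index the soft reference independently so lookups and joins stay fast
    CREATE INDEX IF NOT EXISTS participant_dates_participant_id_idx
        ON participant_dates (participant_id);
""")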
With the constraint in place, that's what you get in pg_locks (note the SIReadLocks in question) when you insert the new participant_dates record:

demo at db<>fiddle

| relation          | locktype      | mode             | page | tuple | transactionid | virtualxid | fastpath |
|-------------------|---------------|------------------|------|-------|---------------|------------|----------|
| participants_pkey | relation      | RowShareLock     | null | null  | null          | null       | t        |
| participants      | relation      | RowShareLock     | null | null  | null          | null       | t        |
| pg_locks          | relation      | AccessShareLock  | null | null  | null          | null       | t        |
| participant_dates | relation      | RowExclusiveLock | null | null  | null          | null       | t        |
| null              | virtualxid    | ExclusiveLock    | null | null  | null          | 4/7        | t        |
| null              | transactionid | ExclusiveLock    | null | null  | 740           | null       | f        |
| participants_pkey | page          | SIReadLock       | 1    | null  | null          | null       | f        |
| participants      | tuple         | SIReadLock       | 0    | 1     | null          | null       | f        |
Without the constraint, there's just no need to bother participants:

| relation          | locktype      | mode             | page | tuple | transactionid | virtualxid | fastpath |
|-------------------|---------------|------------------|------|-------|---------------|------------|----------|
| pg_locks          | relation      | AccessShareLock  | null | null  | null          | null       | t        |
| participant_dates | relation      | RowExclusiveLock | null | null  | null          | null       | t        |
| null              | virtualxid    | ExclusiveLock    | null | null  | null          | 4/9        | t        |
| null              | transactionid | ExclusiveLock    | null | null  | 742           | null       | f        |
It disables a bunch of things like cascading update/delete and protection against orphaned entries, but it's not uncommon that none of that is actually in use, or that it's a set null/set default behaviour that nobody needs to happen right away. If it doesn't need to be immediately in effect, an independent process could run periodically and emulate that and other behaviours of a proper FK, and you can set up assertion checks whenever something relies on the possibly delayed and otherwise unguarded integrity.
Some of the potential problems of not having the constraint can be mitigated by how you operate on these tables. Pairing up both of your inserts in one transaction, the way you already do, is a good example of that: you won't get an orphaned participant_dates row due to a failed insert of its parent participants row (or a participant without any dates), because if one fails, they both get rolled back.
Minor remark: since Postgres 13 you no longer need "uuid-ossp" to get a uuid_generate_v4(); there's a built-in gen_random_uuid() that does the same thing. I'm using the extension in my demo only to get the same stable, reproducible v3 UUIDs on each re-run, based on a seed.
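For reference, a sketch of the same default without the extension (assuming Postgres 13 or later):

await conn.execute("""
    -- no CREATE EXTENSION "uuid-ossp" needed on Postgres 13+
    CREATE TABLE IF NOT EXISTS participants (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name VARCHAR(255) NOT NULL
    );
""")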
1) Autoincrement (sequence/identity) does not use a lock. 2) I'm going to say using async is the issue. Mixing a non-synchronous client with a sequential transaction level (serializable) is asking for issues. 3) Look at the Postgres log to see what is actually hitting the database and what the specific error is. – Adrian Klaver Commented 23 hours ago