I want to reduce my upload time by using TPL Dataflow. I receive files that each contain multiple rows, parse them, and insert the results into the database. The pipeline has two Dataflow blocks:

Receive & parse file with rows
  |
  V
Insert in Database in FileTable, RowTable

Using two dataflow blocks improved the time slightly. To improve it further, I wanted to raise the DOP (degree of parallelism) above 1. That worked on the first block, which now parses multiple files in parallel.
On the second block I ran into EF Core DbContext problems, because several threads tried to insert into the database through the same context at the same time:

Entity Framework Core: A second operation started on this context before a previous operation completed

ParseFilesBlock - DOP > 1 (OK)
InsertToDatabaseBlock - DOP > 1 (DbContext errors, possible conflicts between related data)
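
For reference, a minimal sketch of the two-block layout described above, with parsing parallelized and the insert stage kept at a DOP of 1. The names `ParsedFile`, `UploadPipeline`, and `RunAsync`, the DOP value of 4, and the newline-split "parser" are placeholders for illustration, and the insert stage only counts rows instead of calling EF Core.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-in for a parsed file; not the real entity types.
public sealed record ParsedFile(string Name, IReadOnlyList<string> Rows);

public static class UploadPipeline
{
    // Parses files in parallel, then passes them to a single-threaded insert stage.
    // Returns the total number of rows that reached the insert stage.
    public static async Task<int> RunAsync(
        IEnumerable<(string Name, string Content)> files, int parseDop = 4)
    {
        var insertedRows = 0;

        // ParseFilesBlock: safe to parallelize because each file is parsed independently.
        var parseFilesBlock = new TransformBlock<(string Name, string Content), ParsedFile>(
            file => new ParsedFile(
                file.Name,
                file.Content.Split('\n', StringSplitOptions.RemoveEmptyEntries)),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = parseDop });

        // InsertToDatabaseBlock: DOP of 1, so only one message is processed at a time.
        var insertToDatabaseBlock = new ActionBlock<ParsedFile>(
            parsed =>
            {
                // Placeholder for the real EF Core transaction.
                insertedRows += parsed.Rows.Count;
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });

        parseFilesBlock.LinkTo(
            insertToDatabaseBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var file in files)
            await parseFilesBlock.SendAsync(file);

        parseFilesBlock.Complete();
        await insertToDatabaseBlock.Completion;
        return insertedRows;
    }
}
```

With this layout the second block never runs two inserts concurrently, which is why the error only appears once its `MaxDegreeOfParallelism` is raised.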

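According to the official documentation, a DbContext instance is not thread-safe, so each parallel unit of work needs its own DbContext, typically by creating a dependency injection scope and resolving the services from that scope.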
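
A minimal sketch of that scope-per-message pattern, assuming the `Microsoft.Extensions.DependencyInjection` package. `IFileWriter`, `FileWriter`, and `ScopedInsert` are hypothetical names; `FileWriter` stands in for a scoped service that would wrap the DbContext, and it exposes an `InstanceId` only so the sketch can show that each message receives its own instance.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical scoped service standing in for a class that wraps the DbContext.
public interface IFileWriter
{
    Guid InstanceId { get; }
    Task WriteAsync(string fileName);
}

public sealed class FileWriter : IFileWriter
{
    public Guid InstanceId { get; } = Guid.NewGuid();

    // Real code would add entities and call SaveChangesAsync here.
    public Task WriteAsync(string fileName) => Task.CompletedTask;
}

public static class ScopedInsert
{
    // Processes each file name inside its own DI scope.
    // Returns how many distinct IFileWriter instances were used.
    public static async Task<int> RunAsync(
        IServiceScopeFactory scopeFactory, IReadOnlyList<string> fileNames, int dop)
    {
        var writerIds = new ConcurrentBag<Guid>();

        var block = new ActionBlock<string>(async fileName =>
        {
            // One scope per message: scoped services are never shared between threads.
            using var scope = scopeFactory.CreateScope();
            var writer = scope.ServiceProvider.GetRequiredService<IFileWriter>();
            await writer.WriteAsync(fileName);
            writerIds.Add(writer.InstanceId);
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = dop });

        foreach (var fileName in fileNames)
            block.Post(fileName);

        block.Complete();
        await block.Completion;
        return writerIds.Distinct().Count();
    }
}
```

Creating the scope at the block boundary means the services below it can use ordinary constructor injection, since everything resolved from that scope shares one scoped instance per message.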

However, I ended up resolving services inside every method:

void MyMethod1() {
  // _serviceProvider: the scoped IServiceProvider instance (field name is illustrative)
  var service1 = _serviceProvider.GetRequiredService<IMyService>();
  ...
}

I had to write this in too many methods. Every public and private method that the Upload feature runs inside the Dataflow resolves between 1 and 7 services. When one of those methods was also used by another API, it became even more error-prone, because that caller still had to wrap the call in a scope.

  • I have been told multiple times that there are no advantages in parallelizing Update/Insert database operations. The database creates locks for data integrity, has connection limits, contention, etc.
  • My InsertToDatabaseBlock uses EF transactions, and each one performs multiple insert/update operations on the file and its content. Files may contain data related to each other (for example, a row is marked as complete based on content from multiple files), so multiple threads may produce inconsistent data.

Should I consider something other than DOP > 1 on the InsertToDatabaseBlock to improve my upload time, such as bulk updates?
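
To make the batching idea concrete, here is a rough sketch that groups rows with a `BatchBlock` and hands each batch to a single writer. `BatchedWriter` and `RunAsync` are hypothetical names, and the write step only records the batch size where a real implementation would issue one bulk insert per batch.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BatchedWriter
{
    // Groups rows into batches and passes each batch to a single writer.
    // Returns the size of every batch that reached the writer, in order.
    public static async Task<List<int>> RunAsync(IEnumerable<string> rows, int batchSize)
    {
        var batchSizes = new List<int>();

        var batchBlock = new BatchBlock<string>(batchSize);

        // DOP of 1: batches are written one after another, never concurrently.
        var writeBlock = new ActionBlock<string[]>(
            batch => batchSizes.Add(batch.Length), // placeholder for one bulk insert
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });

        batchBlock.LinkTo(
            writeBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var row in rows)
            await batchBlock.SendAsync(row);

        // Completing the BatchBlock also emits any final partial batch.
        batchBlock.Complete();
        await writeBlock.Completion;
        return batchSizes;
    }
}
```

This keeps writes sequential and reduces the number of round trips, rather than increasing the number of concurrent writers.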

Are database write operations so problematic when parallelized that they should always be avoided?
