admin管理员组

文章数量:1425879

It seems that MongoDB insertMany() function with ordered option set to false can insert documents more efficiently than with ordered option set to true. And it can continue inserting documents even if several documents fail to insert.

But I found that there is no clean way to get both each failed document's error and overall mand result.

(BTW, I'm using Node.js driver API 2.2. I will refer the driver's source code from now on: .2/api/lib_collection.js.html)

First, If Promise is used, there is no way to get both the errors and the result. On source code line 540, insertMany() returns either the error or the result - not both, while bulkWrite() callback returns both on source code line 703.

Second, If callback is used, things get worse. When bulkWrite() calls the callback with both the error and the result, insertMany() calls the callback with both the error and the result, but the result is the BulkWrite's result, not the properly converted one to InsertManyResults. Please see the source code line 535. I think this is a bug.

And even for bulkWrite(), when the number of the errors is 1, it does not convert the result properly to its format. Please see the source code line 669. I think this is also a bug.

Now I think that the Node.js driver is simply not prepared to handle the case.

For now, it seems that there is no way to get both the errors and results properly.

Am I right?

UPDATE

I've run a test code which is based on Neil Lunn's answer. Since my Node.js(4.4.5) does not understand async/await, I had to rewrite the test code with explicit Promises.

The test code is as follows:

function doTest()
{
    var MongoClient = require('mongodb').MongoClient;
    var testData = [ 1,2,2,3,3,4,5,6,7,8,9 ];
    var db;

    return MongoClient.connect('mongodb://127.0.0.1/test')
    .then(function (_db)
    {
        db = _db;
        return db.createCollection('test');
    })
    .then(function ()
    {
        return db.collection('test').deleteMany({})
        .then(function ()
        {
            return db.collection('test').insertMany(
                testData.map(function (_id)
                {
                    return { _id: _id };
                }),
                { ordered: false })
            .then(function (result)
            {
                console.log('Promise: result', result);
            }, function (err)
            {
                console.log('Promise: error', err);
            });
        })
        .then(function ()
        {
            return db.collection('test').deleteMany({});
        })
        .then(function ()
        {
            return new Promise(function (resolve, reject)
            {
                return db.collection('test').insertMany(
                    testData.map(function (_id)
                    {
                        return { _id: _id };
                    }),
                    { ordered: false },
                    function (err, result)
                {
                    console.log('callback: error', err);
                    console.log('callback: result', result);
                    console.log('callback: result.hasWriteErrors', result.hasWriteErrors());
                    console.log('callback: result.getWriteErrors',
                        JSON.stringify(result.getWriteErrors(), null, 2));
                    resolve();
                });
            });
        });
    })
    .catch(function (err)
    {
        console.log('catch', err);
    })
    .then(function ()
    {
        db.close();
    });
}
doTest();

And here is the result:

Promise: error { [MongoError: write operation failed]
  name: 'MongoError',
  message: 'write operation failed',
  driver: true,
  code: 11000,
  writeErrors: 
   [ { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] },
     { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] } ] }
callback: error { [MongoError: write operation failed]
  name: 'MongoError',
  message: 'write operation failed',
  driver: true,
  code: 11000,
  writeErrors: 
   [ { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] },
     { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] } ] }
callback: result { ok: [Getter],
  nInserted: [Getter],
  nUpserted: [Getter],
  nMatched: [Getter],
  nModified: [Getter],
  nRemoved: [Getter],
  getInsertedIds: [Function],
  getUpsertedIds: [Function],
  getUpsertedIdAt: [Function],
  getRawResponse: [Function],
  hasWriteErrors: [Function],
  getWriteErrorCount: [Function],
  getWriteErrorAt: [Function],
  getWriteErrors: [Function],
  getLastOp: [Function],
  getWriteConcernError: [Function],
  toJSON: [Function],
  toString: [Function],
  isOk: [Function],
  insertedCount: 9,
  matchedCount: 0,
  modifiedCount: 0,
  deletedCount: 0,
  upsertedCount: 0,
  upsertedIds: {},
  insertedIds: 
   { '0': 1,
     '1': 2,
     '2': 2,
     '3': 3,
     '4': 3,
     '5': 4,
     '6': 5,
     '7': 6,
     '8': 7,
     '9': 8,
     '10': 9 },
  n: 9 }
callback: result.hasWriteErrors true
callback: result.getWriteErrors [
  {
    "code": 11000,
    "index": 2,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 2 }",
    "op": {
      "_id": 2
    }
  },
  {
    "code": 11000,
    "index": 4,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }",
    "op": {
      "_id": 3
    }
  }
]

Now, I've run the code again, with the testData variable modified as follows:

var testData = [ 1,2,3,3,4,5,6,7,8,9 ];

In this case, the number of errors will be 1, instead of 2, since the duplicate '2' is removed.

And here is the result:

Promise: error { [MongoError: E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }]
  name: 'MongoError',
  message: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  driver: true,
  code: 11000,
  index: 3,
  errmsg: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  getOperation: [Function],
  toJSON: [Function],
  toString: [Function] }
callback: error { [MongoError: E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }]
  name: 'MongoError',
  message: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  driver: true,
  code: 11000,
  index: 3,
  errmsg: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  getOperation: [Function],
  toJSON: [Function],
  toString: [Function] }
callback: result { ok: [Getter],
  nInserted: [Getter],
  nUpserted: [Getter],
  nMatched: [Getter],
  nModified: [Getter],
  nRemoved: [Getter],
  getInsertedIds: [Function],
  getUpsertedIds: [Function],
  getUpsertedIdAt: [Function],
  getRawResponse: [Function],
  hasWriteErrors: [Function],
  getWriteErrorCount: [Function],
  getWriteErrorAt: [Function],
  getWriteErrors: [Function],
  getLastOp: [Function],
  getWriteConcernError: [Function],
  toJSON: [Function],
  toString: [Function],
  isOk: [Function] }
callback: result.hasWriteErrors true
callback: result.getWriteErrors [
  {
    "code": 11000,
    "index": 3,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }",
    "op": {
      "_id": 3
    }
  }
]

The format of error and result is fairly different from the first run.

  • the error has no writeErrors field.
  • the result has no 'converted' fields. (insertedCount, matchedCount, etc.) This is a 'bug' on the driver source code line 669 as I've told above.

And in both test runs, the types of the result argument are NOT Collection~insertWriteOpResult. The first one is Collection~bulkWriteOpCallback, and the second one is more internal one. So, the API document is wrong in this case. This is caused by the 'bugs' on line 535 and 669 as I've told above.

So, even if the result can be used (indeed, the result has hasWriteErrors() and getWriteErrors() as Neil Lunn told), since this behavior is not documented, I suspect that it can be changed on later versions without notice, and my code will break.

It seems that MongoDB insertMany() function with ordered option set to false can insert documents more efficiently than with ordered option set to true. And it can continue inserting documents even if several documents fail to insert.

But I found that there is no clean way to get both each failed document's error and overall mand result.

(BTW, I'm using Node.js driver API 2.2. I will refer the driver's source code from now on: http://mongodb.github.io/node-mongodb-native/2.2/api/lib_collection.js.html)

First, If Promise is used, there is no way to get both the errors and the result. On source code line 540, insertMany() returns either the error or the result - not both, while bulkWrite() callback returns both on source code line 703.

Second, If callback is used, things get worse. When bulkWrite() calls the callback with both the error and the result, insertMany() calls the callback with both the error and the result, but the result is the BulkWrite's result, not the properly converted one to InsertManyResults. Please see the source code line 535. I think this is a bug.

And even for bulkWrite(), when the number of the errors is 1, it does not convert the result properly to its format. Please see the source code line 669. I think this is also a bug.

Now I think that the Node.js driver is simply not prepared to handle the case.

For now, it seems that there is no way to get both the errors and results properly.

Am I right?

UPDATE

I've run a test code which is based on Neil Lunn's answer. Since my Node.js(4.4.5) does not understand async/await, I had to rewrite the test code with explicit Promises.

The test code is as follows:

function doTest()
{
    var MongoClient = require('mongodb').MongoClient;
    var testData = [ 1,2,2,3,3,4,5,6,7,8,9 ];
    var db;

    return MongoClient.connect('mongodb://127.0.0.1/test')
    .then(function (_db)
    {
        db = _db;
        return db.createCollection('test');
    })
    .then(function ()
    {
        return db.collection('test').deleteMany({})
        .then(function ()
        {
            return db.collection('test').insertMany(
                testData.map(function (_id)
                {
                    return { _id: _id };
                }),
                { ordered: false })
            .then(function (result)
            {
                console.log('Promise: result', result);
            }, function (err)
            {
                console.log('Promise: error', err);
            });
        })
        .then(function ()
        {
            return db.collection('test').deleteMany({});
        })
        .then(function ()
        {
            return new Promise(function (resolve, reject)
            {
                return db.collection('test').insertMany(
                    testData.map(function (_id)
                    {
                        return { _id: _id };
                    }),
                    { ordered: false },
                    function (err, result)
                {
                    console.log('callback: error', err);
                    console.log('callback: result', result);
                    console.log('callback: result.hasWriteErrors', result.hasWriteErrors());
                    console.log('callback: result.getWriteErrors',
                        JSON.stringify(result.getWriteErrors(), null, 2));
                    resolve();
                });
            });
        });
    })
    .catch(function (err)
    {
        console.log('catch', err);
    })
    .then(function ()
    {
        db.close();
    });
}
doTest();

And here is the result:

Promise: error { [MongoError: write operation failed]
  name: 'MongoError',
  message: 'write operation failed',
  driver: true,
  code: 11000,
  writeErrors: 
   [ { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] },
     { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] } ] }
callback: error { [MongoError: write operation failed]
  name: 'MongoError',
  message: 'write operation failed',
  driver: true,
  code: 11000,
  writeErrors: 
   [ { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] },
     { code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] } ] }
callback: result { ok: [Getter],
  nInserted: [Getter],
  nUpserted: [Getter],
  nMatched: [Getter],
  nModified: [Getter],
  nRemoved: [Getter],
  getInsertedIds: [Function],
  getUpsertedIds: [Function],
  getUpsertedIdAt: [Function],
  getRawResponse: [Function],
  hasWriteErrors: [Function],
  getWriteErrorCount: [Function],
  getWriteErrorAt: [Function],
  getWriteErrors: [Function],
  getLastOp: [Function],
  getWriteConcernError: [Function],
  toJSON: [Function],
  toString: [Function],
  isOk: [Function],
  insertedCount: 9,
  matchedCount: 0,
  modifiedCount: 0,
  deletedCount: 0,
  upsertedCount: 0,
  upsertedIds: {},
  insertedIds: 
   { '0': 1,
     '1': 2,
     '2': 2,
     '3': 3,
     '4': 3,
     '5': 4,
     '6': 5,
     '7': 6,
     '8': 7,
     '9': 8,
     '10': 9 },
  n: 9 }
callback: result.hasWriteErrors true
callback: result.getWriteErrors [
  {
    "code": 11000,
    "index": 2,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 2 }",
    "op": {
      "_id": 2
    }
  },
  {
    "code": 11000,
    "index": 4,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }",
    "op": {
      "_id": 3
    }
  }
]

Now, I've run the code again, with the testData variable modified as follows:

var testData = [ 1,2,3,3,4,5,6,7,8,9 ];

In this case, the number of errors will be 1, instead of 2, since the duplicate '2' is removed.

And here is the result:

Promise: error { [MongoError: E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }]
  name: 'MongoError',
  message: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  driver: true,
  code: 11000,
  index: 3,
  errmsg: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  getOperation: [Function],
  toJSON: [Function],
  toString: [Function] }
callback: error { [MongoError: E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }]
  name: 'MongoError',
  message: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  driver: true,
  code: 11000,
  index: 3,
  errmsg: 'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }',
  getOperation: [Function],
  toJSON: [Function],
  toString: [Function] }
callback: result { ok: [Getter],
  nInserted: [Getter],
  nUpserted: [Getter],
  nMatched: [Getter],
  nModified: [Getter],
  nRemoved: [Getter],
  getInsertedIds: [Function],
  getUpsertedIds: [Function],
  getUpsertedIdAt: [Function],
  getRawResponse: [Function],
  hasWriteErrors: [Function],
  getWriteErrorCount: [Function],
  getWriteErrorAt: [Function],
  getWriteErrors: [Function],
  getLastOp: [Function],
  getWriteConcernError: [Function],
  toJSON: [Function],
  toString: [Function],
  isOk: [Function] }
callback: result.hasWriteErrors true
callback: result.getWriteErrors [
  {
    "code": 11000,
    "index": 3,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }",
    "op": {
      "_id": 3
    }
  }
]

The format of error and result is fairly different from the first run.

  • the error has no writeErrors field.
  • the result has no 'converted' fields. (insertedCount, matchedCount, etc.) This is a 'bug' on the driver source code line 669 as I've told above.

And in both test runs, the types of the result argument are NOT Collection~insertWriteOpResult. The first one is Collection~bulkWriteOpCallback, and the second one is more internal one. So, the API document is wrong in this case. This is caused by the 'bugs' on line 535 and 669 as I've told above.

So, even if the result can be used (indeed, the result has hasWriteErrors() and getWriteErrors() as Neil Lunn told), since this behavior is not documented, I suspect that it can be changed on later versions without notice, and my code will break.

Share Improve this question edited Oct 20, 2017 at 8:24 Tom 1,4093 gold badges20 silver badges30 bronze badges asked Oct 20, 2017 at 1:31 zeodtrzeodtr 11.3k15 gold badges46 silver badges65 bronze badges 3
  • NODE-1158 which does have links to recent mits in the 3.x branch related to this. There are workarounds to this as mentioned on MongoDB Node.js native driver silently swallows bulkWrite exception. Perhaps you should show the code you have a problem with implementing so you can be pointed to the correct way to handle it if you don't understand what the answer there describes. – Neil Lunn Commented Oct 20, 2017 at 1:58
  • I should note that as mented on several places there, your statement that "you cannot get both" is intrinsically incorrect. The core issue addressed in the refactor for the new branch is that the underlying "Bulk" API calls actually do return "both" the error as well a BulkWriteResult which contains the error information in the "callback". The issue corrects this behavior to remove the "result" and make a BulkWriteError response only. Which for "unordered" includes the "success" information, as well as the array of "errors". – Neil Lunn Commented Oct 20, 2017 at 2:05
  • @NeilLunn Actually I didn't write any code yet. It's the first time for me to write a code to use any bulk-write API of MongoDB JavaScript driver. I've read the document (mongodb.github.io/node-mongodb-native/2.2/api/…) and found the document does not mention the case properly. So I began to explore the source code, and found the source code's current state. I don't want to write any code for this state. – zeodtr Commented Oct 20, 2017 at 2:05
Add a ment  | 

1 Answer 1

Reset to default 4

The issue is really only with how a "Promise" resolves and how the error information is passed, but of course the real core issue is that there is in fact a return of "both" the error and result information when any "Bulk" operation is set to { ordered: false }. This is being addressed in the driver for the 3.x release as noted on NODE-1158 which also contains links to the mits in the future branch that fix the issue.

The "workaround" for this is to be aware that "both" the result an error information are present in the BulkWriteResult object returned as the result in the "callback" invocation of any such methods ( note insertMany() and even bulkWrite() actually wrap an underlying Bulk API Implementation ).

To demonstrate with a listing:

const MongoClient = require('mongodb').MongoClient;

const uri = 'mongodb://localhost/test';
const testData = [1,2,3,3,4,5,6,6,7,8,9];

(async function() {

  let db;

  try {

    db = await MongoClient.connect(uri);

    await db.collection('test').remove();

    // Expect an error here - but it's just the errors
    try {
      let result = await db.collection('test').insertMany(
        testData.map( _id => ({ _id }) ),
        { "ordered": false }
      );
      console.log(result);   // never gets here
    } catch(e) {
      console.dir(e);
      console.log(JSON.stringify(e.writeErrors,undefined,2));
    }

    await db.collection('test').remove();
    // Wrapped callback so we see what happens

    try {
      let result = await new Promise((resolve,reject) => 
        db.collection('test').insertMany(
          testData.map( _id => ({ _id }) ),
          { "ordered": false },
          (err,result) => {
            if (err) reject(result);    // Because the errors are here as well
            resolve(result);
          }
        )
      );
      console.log(result);  // Never gets here
    } catch(e) {
      console.dir(e);
      console.log(e.hasWriteErrors());
      console.log(JSON.stringify(e.getWriteErrors(),undefined,2));
    }

  } catch(e) {
    console.error(e);
  } finally {
    db.close();
  }

})();

So there are two blocks of code there that attempt to use insertMany() with a list of values that are going to produce duplicate key errors for some of the values.

In the first attempt, we use the default Promise return which as should be indicated by the implemented code of the driver is simply going to pass the err callback result in the method it wraps into it's reject() statement. This means we go to the catch block here and produce the error information as output:

{ MongoError: [object Object]
    at Function.MongoError.create (/home/neillunn/projects/bulkerror/node_modules/mongodb-core/lib/error.js:31:11)
    at toError (/home/neillunn/projects/bulkerror/node_modules/mongodb/lib/utils.js:139:22)
    at /home/neillunn/projects/bulkerror/node_modules/mongodb/lib/collection.js:701:23
    at handleCallback (/home/neillunn/projects/bulkerror/node_modules/mongodb/lib/utils.js:120:56)
    at /home/neillunn/projects/bulkerror/node_modules/mongodb/lib/bulk/unordered.js:465:9
    at handleCallback (/home/neillunn/projects/bulkerror/node_modules/mongodb/lib/utils.js:120:56)
    at resultHandler (/home/neillunn/projects/bulkerror/node_modules/mongodb/lib/bulk/unordered.js:413:5)
    at /home/neillunn/projects/bulkerror/node_modules/mongodb-core/lib/connection/pool.js:469:18
    at _binedTickCallback (internal/process/next_tick.js:131:7)
    at process._tickCallback (internal/process/next_tick.js:180:9)
  name: 'MongoError',
  message: 'write operation failed',
  driver: true,
  code: 11000,
  writeErrors:
   [ WriteError {
       code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] },
     WriteError {
       code: [Getter],
       index: [Getter],
       errmsg: [Getter],
       getOperation: [Function],
       toJSON: [Function],
       toString: [Function] } ] }
[
  {
    "code": 11000,
    "index": 3,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }",
    "op": {
      "_id": 3
    }
  },
  {
    "code": 11000,
    "index": 7,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 6 }",
    "op": {
      "_id": 6
    }
  }
]

Note that this is a wrapped MongoError and despite the fact we set { ordered: false } there is no "result" information in the response. We can see on detailed inspection of the error information that the list of WriteError does have details on each duplicate key error produced.

So there was successful writes for everything in the batch that did not throw an error, but it's not reported in anything obtainable from the Promise. But this is not true of the underlying method, which is still implemented using a callback.

The second attempt wraps this callback "manually", so we can in fact see the result by altering the behavior and passing the result object to reject when an err is present. This tells us a different story:

BulkWriteResult {
  ok: [Getter],
  nInserted: [Getter],
  nUpserted: [Getter],
  nMatched: [Getter],
  nModified: [Getter],
  nRemoved: [Getter],
  getInsertedIds: [Function],
  getUpsertedIds: [Function],
  getUpsertedIdAt: [Function],
  getRawResponse: [Function],
  hasWriteErrors: [Function],
  getWriteErrorCount: [Function],
  getWriteErrorAt: [Function],
  getWriteErrors: [Function],
  getLastOp: [Function],
  getWriteConcernError: [Function],
  toJSON: [Function],
  toString: [Function],
  isOk: [Function],
  insertedCount: 9,
  matchedCount: 0,
  modifiedCount: 0,
  deletedCount: 0,
  upsertedCount: 0,
  upsertedIds: {},
  insertedIds:
   { '0': 1,
     '1': 2,
     '2': 3,
     '3': 3,
     '4': 4,
     '5': 5,
     '6': 6,
     '7': 6,
     '8': 7,
     '9': 8,
     '10': 9 },
  n: 9 }
true
[
  {
    "code": 11000,
    "index": 3,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 3 }",
    "op": {
      "_id": 3
    }
  },
  {
    "code": 11000,
    "index": 7,
    "errmsg": "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 6 }",
    "op": {
      "_id": 6
    }
  }
]

Since we did not pass back the err we now see the BulkWriteResult in the catch block. We know we got there because of the specific code we are running in that block to inspect the result.

The regular result does indeed have some things as the modified or inserted count, as well as the list of insertedIds. We can also see from inspection that hasWriteErrors() returns true, and we can obtain the list of WriteError which we have serialized for better viewing.

Fixed in 3.x

As noted on the linked issue, the actual fix will only be present in the 3.x driver release which will support MongoDB 3.6. The "fix" is essentially done at a "lower level" to not return a BulkWriteResult at all, but instead make err return a BulkWriteError.

This actually makes things more consistent with how some other drivers have correctly implemented this already. It's honestly a bit of a "hangover" from traditional "node style" callbacks, where "both" error and response are always returned.

So bringing this into "just an error" makes things more consistent and works as you would generally expect.

As a bit of a side note, the related issue in MongoDB Node.js native driver silently swallows bulkWrite exception. referenced in the JIRA issue shows the actual bulkWrite() method in the implementation ( which is not "directly" what insertMany() wraps ) has a slightly different problem in that "no error" actually gets thrown at all, because the code is expecting result to be null and as described it is not.

So the reverse case is true there where we never get to catch for the exception in the default implementation using a Promise. However the exact same method of handling is what should be applied, by manually wrapping the callback and sending the result through reject in preference to err when both those are returned as not null.

Work around them as described, and preferably move to the new driver as soon as it bees available. Which should be fairly soon anyway.

本文标签: javascriptFunction insertMany() unordered proper way to get both the errors and the resultStack Overflow