mongodb 异常停止 Too many open files

今天 mongodb 出现自动停止的情况,日志如下:
`2020-01-31T13:54:18.238+0800 I INDEX [conn35923] build index on: customer.customer_19652 properties: { v: 2, key: { wechat.openid: 1 }, name: "wechat.openid_1", ns: "customer.customer_19652", background: true }
2020-01-31T13:54:18.275+0800 I INDEX [conn35923] build index done. scanned 1 total records. 0 secs
2020-01-31T13:54:18.275+0800 I COMMAND [conn35923] command customer.$cmd command: createIndexes { createIndexes: "customer_19652", indexes: [ { name: "wechat.openid_1", key: { wechat.openid: 1 }, background: true } ], writeConcern: { w: 1 } } numYields:1 reslen:113 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2, W: 2 } }, Collection: { acquireCount: { w: 2 } } } protocol:op_query 101ms
2020-01-31T13:54:18.360+0800 I INDEX [conn35881] build index on: customer.customer_19652 properties: { v: 2, key: { addtime: -1 }, name: "addtime_-1", ns: "customer.customer_19652", background: true }
2020-01-31T13:54:18.399+0800 I INDEX [conn35881] build index done. scanned 1 total records. 0 secs
2020-01-31T13:54:18.399+0800 I COMMAND [conn35881] command customer.$cmd command: createIndexes { createIndexes: "customer_19652", indexes: [ { name: "addtime_-1", key: { addtime: -1 }, background: true } ], writeConcern: { w: 1 } } numYields:1 reslen:113 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2, W: 2 } }, Collection: { acquireCount: { w: 2 } } } protocol:op_query 118ms
2020-01-31T13:54:18.569+0800 I WRITE [conn35904] update message.message_16640 query: { _id: ObjectId('5e33c1090320a34ebc4eed93') } planSummary: IDHACK update: { $set: { state: "read" } } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 numYields:0 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 146597 } }, Collection: { acquireCount: { w: 1 } } } 146ms
2020-01-31T13:54:18.569+0800 I WRITE [conn35843] update message.message_12629 query: { _id: ObjectId('5e33b6010320a34ebc4b3563') } planSummary: IDHACK update: { $set: { state: "read" } } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 numYields:0 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 127366 } }, Collection: { acquireCount: { w: 1 } } } 127ms
2020-01-31T13:54:18.569+0800 I COMMAND [conn35927] command message.message_16589 command: find { find: "message_16589", filter: { customerId: ObjectId('5e33c1080320a34ebc4eed12') }, sort: { time: -1 }, projection: { __v: 0 }, skip: 2, limit: 20 } planSummary: IXSCAN { customerId: 1, time: -1 } keysExamined:3 docsExamined:3 cursorExhausted:1 numYields:0 nreturned:1 reslen:311 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 127372 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 127ms
2020-01-31T13:54:18.569+0800 I COMMAND [conn35904] command message.$cmd command: update { update: "message_16640", updates: [ { q: { _id: ObjectId('5e33c1090320a34ebc4eed93') }, u: { $set: { state: "read" } }, upsert: false, multi: false } ], ordered: true, writeConcern: { w: 1 } } numYields:0 reslen:59 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 146597 } }, Collection: { acquireCount: { w: 1 } } } protocol:op_query 147ms
2020-01-31T13:54:18.569+0800 I COMMAND [conn35912] command message.message_19652 command: listIndexes { listIndexes: "message_19652", cursor: {} } numYields:0 reslen:205 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 144433 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 144ms
2020-01-31T13:54:18.569+0800 I COMMAND [conn35843] command message.$cmd command: update { update: "message_12629", updates: [ { q: { _id: ObjectId('5e33b6010320a34ebc4b3563') }, u: { $set: { state: "read" } }, upsert: false, multi: false } ], ordered: true, writeConcern: { w: 1 } } numYields:0 reslen:59 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 127366 } }, Collection: { acquireCount: { w: 1 } } } protocol:op_query 127ms
2020-01-31T13:54:18.646+0800 I COMMAND [conn35880] command message.message_19652 command: insert { insert: "message_19652", documents: [ { customerId: ObjectId('5e33c1090320a34ebc4eeda1'), serviceId: ObjectId('5e33c0f50320a34ebc4ee709'), contents: "您好,有什么可以帮您的吗?", sendType: "service", system: true, type: "text", _id: ObjectId('5e33c10a0320a34ebc4eedc9'), state: "read", time: new Date(1580450058415), __v: 0 } ], ordered: false, writeConcern: { w: 1 } } ninserted:1 keysInserted:1 numYields:0 reslen:44 locks:{ Global: { acquireCount: { r: 3, w: 3 } }, Database: { acquireCount: { w: 2, W: 1 } }, Collection: { acquireCount: { w: 2 } } } protocol:op_query 224ms
2020-01-31T13:54:18.728+0800 I INDEX [conn35904] build index on: message.message_19652 properties: { v: 2, key: { customerId: 1, time: -1 }, name: "customerId_1_time_-1", ns: "message.message_19652", background: true }
2020-01-31T13:54:18.728+0800 I WRITE [conn35927] update message.message_12629 query: { _id: ObjectId('5e33b5f80320a34ebc4b31e7') } planSummary: IDHACK update: { $set: { state: "read" } } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 numYields:0 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 101009 } }, Collection: { acquireCount: { w: 1 } } } 101ms
2020-01-31T13:54:18.728+0800 I COMMAND [conn35927] command message.$cmd command: update { update: "message_12629", updates: [ { q: { _id: ObjectId('5e33b5f80320a34ebc4b31e7') }, u: { $set: { state: "read" } }, upsert: false, multi: false } ], ordered: true, writeConcern: { w: 1 } } numYields:0 reslen:59 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 101009 } }, Collection: { acquireCount: { w: 1 } } } protocol:op_query 101ms
2020-01-31T13:54:18.766+0800 I INDEX [conn35904] build index done. scanned 1 total records. 0 secs
2020-01-31T13:54:18.766+0800 I COMMAND [conn35904] command message.$cmd command: createIndexes { createIndexes: "message_19652", indexes: [ { name: "customerId_1_time_-1", key: { customerId: 1, time: -1 }, background: true } ], writeConcern: { w: 1 } } numYields:2 reslen:113 locks:{ Global: { acquireCount: { r: 3, w: 3 } }, Database: { acquireCount: { w: 3, W: 2 }, acquireWaitCount: { W: 1 }, timeAcquiringMicros: { W: 34340 } }, Collection: { acquireCount: { w: 3 } } } protocol:op_query 155ms
2020-01-31T13:54:23.146+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1634 #36031 (128 connections now open)
2020-01-31T13:54:23.147+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1630 #36032 (129 connections now open)
2020-01-31T13:54:23.171+0800 I NETWORK [conn36031] received client metadata from 113.247.178.184:1634 conn36031: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:23.172+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1631 #36033 (130 connections now open)
2020-01-31T13:54:23.172+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1632 #36034 (131 connections now open)
2020-01-31T13:54:23.172+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1633 #36035 (132 connections now open)
2020-01-31T13:54:23.172+0800 I NETWORK [conn36033] received client metadata from 113.247.178.184:1631 conn36033: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:23.173+0800 I NETWORK [conn36034] received client metadata from 113.247.178.184:1632 conn36034: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:23.173+0800 I NETWORK [conn36032] received client metadata from 113.247.178.184:1630 conn36032: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:23.173+0800 I NETWORK [conn36035] received client metadata from 113.247.178.184:1633 conn36035: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:23.583+0800 I ACCESS [conn36031] Successfully authenticated as principal hecong on order
2020-01-31T13:54:23.602+0800 I ACCESS [conn36033] Successfully authenticated as principal hecong on company
2020-01-31T13:54:23.602+0800 I ACCESS [conn36034] Successfully authenticated as principal hecong on dialog
2020-01-31T13:54:23.602+0800 I ACCESS [conn36035] Successfully authenticated as principal hecong on history
2020-01-31T13:54:23.603+0800 I ACCESS [conn36032] Successfully authenticated as principal hecong on dialog
2020-01-31T13:54:31.632+0800 I COMMAND [PeriodicTaskRunner] task: UnusedLockCleaner took: 108ms
2020-01-31T13:54:32.641+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1661 #36036 (133 connections now open)
2020-01-31T13:54:32.642+0800 I NETWORK [conn36036] received client metadata from 113.247.178.184:1661 conn36036: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:33.022+0800 I ACCESS [conn36036] Successfully authenticated as principal hecong on agent
2020-01-31T13:54:47.743+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1748 #36037 (134 connections now open)
2020-01-31T13:54:47.743+0800 I NETWORK [conn36037] received client metadata from 113.247.178.184:1748 conn36037: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:47.745+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1747 #36038 (135 connections now open)
2020-01-31T13:54:47.745+0800 I NETWORK [conn36038] received client metadata from 113.247.178.184:1747 conn36038: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:47.745+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1744 #36039 (136 connections now open)
2020-01-31T13:54:47.746+0800 I NETWORK [conn36039] received client metadata from 113.247.178.184:1744 conn36039: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:47.749+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1749 #36040 (137 connections now open)
2020-01-31T13:54:47.749+0800 I NETWORK [conn36040] received client metadata from 113.247.178.184:1749 conn36040: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:47.752+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1751 #36041 (138 connections now open)
2020-01-31T13:54:47.752+0800 I NETWORK [conn36041] received client metadata from 113.247.178.184:1751 conn36041: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:47.755+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1745 #36042 (139 connections now open)
2020-01-31T13:54:47.755+0800 I NETWORK [conn36042] received client metadata from 113.247.178.184:1745 conn36042: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:47.756+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1750 #36043 (140 connections now open)
2020-01-31T13:54:47.756+0800 I NETWORK [conn36043] received client metadata from 113.247.178.184:1750 conn36043: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:48.289+0800 I ACCESS [conn36040] Successfully authenticated as principal hecong on loginlogs
2020-01-31T13:54:48.290+0800 I ACCESS [conn36037] Successfully authenticated as principal hecong on service
2020-01-31T13:54:48.290+0800 I ACCESS [conn36041] Successfully authenticated as principal hecong on history
2020-01-31T13:54:48.294+0800 I ACCESS [conn36038] Successfully authenticated as principal hecong on service
2020-01-31T13:54:48.302+0800 I ACCESS [conn36039] Successfully authenticated as principal hecong on history
2020-01-31T13:54:48.303+0800 I ACCESS [conn36042] Successfully authenticated as principal hecong on statistics
2020-01-31T13:54:48.303+0800 I ACCESS [conn36043] Successfully authenticated as principal hecong on loginlogs
2020-01-31T13:54:52.957+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1757 #36044 (141 connections now open)
2020-01-31T13:54:52.957+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1756 #36045 (142 connections now open)
2020-01-31T13:54:52.960+0800 I NETWORK [conn36045] received client metadata from 113.247.178.184:1756 conn36045: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:52.961+0800 I NETWORK [conn36044] received client metadata from 113.247.178.184:1757 conn36044: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:54:53.484+0800 I ACCESS [conn36045] Successfully authenticated as principal hecong on risk
2020-01-31T13:54:53.485+0800 I ACCESS [conn36044] Successfully authenticated as principal hecong on statistics_dialog
2020-01-31T13:55:06.729+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1773 #36046 (143 connections now open)
2020-01-31T13:55:06.729+0800 I NETWORK [conn36046] received client metadata from 113.247.178.184:1773 conn36046: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:55:06.918+0800 I ACCESS [conn36046] Successfully authenticated as principal hecong on company
2020-01-31T13:55:13.640+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1777 #36047 (144 connections now open)
2020-01-31T13:55:13.641+0800 I NETWORK [conn36047] received client metadata from 113.247.178.184:1777 conn36047: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:55:14.669+0800 I ACCESS [conn36047] Successfully authenticated as principal hecong on service
2020-01-31T13:55:21.890+0800 I NETWORK [thread1] connection accepted from 113.247.178.184:1787 #36048 (145 connections now open)
2020-01-31T13:55:21.890+0800 I NETWORK [conn36048] received client metadata from 113.247.178.184:1787 conn36048: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.3-8-g05b19c6" }, os: { type: "Darwin", name: "Mac OS X", architecture: "x86_64", version: "18.6.0" } }
2020-01-31T13:55:21.942+0800 E - [conn36048] cannot open /dev/urandom Too many open files
2020-01-31T13:55:21.942+0800 I - [conn36048] Fatal Assertion 28839 at src/mongo/platform/random.cpp 158
2020-01-31T13:55:21.942+0800 I - [conn36048] `

我的紧急处理方式:
1、根据 E 级别的错误提示:cannot open /dev/urandom Too many open files,发现是文件打开最大值的问题。
2、从网上找了办法修改了 /etc/security/limits.conf 文件将里面的65535修改成了655350,然后重启服务器
3、重启 mongodb 失败,想用 --repair 修复后重启但是数据库上百G怕需要太久不敢用。
4、我最后是回滚磁盘数据,修改 ulimit 再重启服务器,然后再重新启动 mongodb 用了 20 分钟后启动成功。

我的疑问:
1、mongodb 对于文件打开实际使用的最大值从哪里可以看到,避免再次出现这个问题。
2、我不清楚是什么原因造成文件打开超过 65535 ,我此前服务器一直稳定,近期压力也并没有增加,我该如何找出原因。
3、我最后回滚数据后启动 mongodb 用了20分钟,是数据量越大启动就越久吗。

我网上查了一堆的资料,貌似是 mongodb 是需要分片+副本集的方式,我现在单台服务器有100G数据的情况还能转成分片副本集的方式吗。

今天被吓得不轻,有没有大神给小白的我提供一些思路,避免后续出现类似问题,感激不尽。

阅读 4.5k
1 个回答

自己解决了
1、mongodb 日志里可以看到集合文件打开数
2、我的单库集合数量太多,集合文件打开数也就很多,加上长期不重启数据库(重启数据库集合打开数会清零)。
3、集合数越多mongodb启动时间越长
只能说之前没搞明白没设计好 mongodb 其实尽可能少量集合,分片集群的方式很完美。

撰写回答
你尚未登录,登录后可以
  • 和开发者交流问题的细节
  • 关注并接收问题和回答的更新提醒
  • 参与内容的编辑和改进,让解决方法与时俱进