monitoring - Sudden Mongodb high connections/queues, db completely freezes -
the issue
we have strange issue on our mongodb setup. peaks of high connections , high queues , mongodb process stops responding if let queues , connections increase. need restart instance using sigkill htop.
it seems there system limit / mongodb configuration blocking mongodb operating, because hardware resources ok. versions of issue happening on stand alone , replica set on production servers. details ahead.
about software environment
this stand alone mongodb instance (not sharded nor replica sets), it's operating on dedicated machine, , it's queried other machines. i'm using mongodb-linux-x86_64-2.6.11 under debian 7.7.
the machines querying mongo using django==1.7.4, mongoengine=0.10.1 pymongo==2.8.
on django settings.py file i'm connecting database using following lines:
from mongoengine import connect connect( mongo_db, username = mongo_user, password = mongo_pwd, host = mongo_host, port = mongo_port )
mms stats
as can see in following img mms service have peaks on connections , queques:
when happens, our mongodb process freezes. must use sigkill restart mongodb, bad.
in image there 3 freeze events.
as img shows, when happens, have peak on non-mapped virtual memory too.
also spotted increase on btree chart around 2nd , 3rd freeze.
we have checked logs, there no suspicious query, opcounters don't skyrocket, seems there no more queries usual.
here screenshot on same bug on day/time:
on cases, lock on db not increasing, has peak not reaching 4%:
opcounter drops zero, seems every op goes mongodb queque, database creates new connections try execute new requests, of them going queue well.
machine resources
regarding hardware, machine google cloud compute instance 4 intel xeon cores, 16 gb ram, 100 gb ssd disk.
no noticeable high network/io/cpu/ram issues detected, no peaks on resources, when mongod process frozen.
mysql on machine gets affected
also detect @ same time of mongod peak on queques , connections, spike on mysql connections, running on another machine. when kill mongodb process, mysql connections released (without doing mysql restart).
ulimit
i increased system limits, see if cause of issue seems did not fix problem.
i set recommended on this mongodb article. spike on connections continue. i'm trying find way debug connections coming from.
$ ulimit -a core file size (blocks, -c) unlimited data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 60240 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 409600 pipe size (512 bytes, -p) 8 posix message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 60240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited
db.currentop
i added shell scripts runs every 1 second following:
var ops = db.currentop().inprog if (ops !== undefined && ops.length > 0){ ops.foreach(function(op){ if(op.secs_running > 0) printjson(op); }) }
the log not report operation taking more 1 second execute. thinking process taking long time on seems not case.
mongodb logs
regarding mongodb.log, here full mongodb log around problem. happens on log line 361. there connections start go up, , no more queries executed. cant call mongo shell, says:
[wed feb 10 15:46:01 utc 2016] 2016-02-10t15:48:31.940+0000 dbclientcursor::init call() failed 2016-02-10t15:48:31.941+0000 error: dbclientbase::findn: transport error: 127.0.0.1:27000 ns: admin.$cmd query: { whatsmyuri: 1 } @ src/mongo/shell/mongo.js:148
log extract
2016-02-10t15:41:39.930+0000 [initandlisten] connection accepted 10.240.0.3:56611 #3665 (79 connections open) 2016-02-10t15:41:39.930+0000 [conn3665] command admin.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:39.930+0000 [conn3665] command admin.$cmd command: ping { ping: 1 } keyupdates:0 numyields:0 reslen:37 0ms 2016-02-10t15:41:39.992+0000 [conn3529] command db.$cmd command: count { count: "notification", fields: null, query: { read: false, recipient: 310 } } plansummary: ixscan { recipient: 1 } keyupdates:0 numyields:0 locks(micros) r:215 reslen:48 0ms 2016-02-10t15:41:40.038+0000 [conn2303] query db.column query: { _id: objectid('56b395dfbe66324cbee550b8'), client_id: 20 } plansummary: ixscan { _id: 1 } ntoreturn:2 ntoskip:0 nscanned:1 nscannedobjects:1 keyupdates:0 numyields:0 locks(micros) r:116 nreturned:1 reslen:470 0ms 2016-02-10t15:41:40.044+0000 [conn1871] update db.column query: { _id: objectid('56b395dfbe66324cbee550b8') } update: { $set: { last_request: new date(1455118900040) } } nscanned:1 nscannedobjects:1 nmatched:1 nmodified:1 fastmod:1 keyupdates:0 numyields:0 locks(micros) w:126 0ms 2016-02-10t15:41:40.044+0000 [conn1871] command db.$cmd command: update { update: "column", writeconcern: { w: 1 }, updates: [ { q: { _id: objectid('56b395dfbe66324cbee550b8') }, u: { $set: { last_request: new date(1455118900040) } }, multi: false, upsert: true } ] } keyupdates:0 numyields:0 reslen:55 0ms 2016-02-10t15:41:40.048+0000 [conn1875] query db.user query: { sn: "mobile", client_id: 20, uid: "56990023700" } plansummary: ixscan { client_id: 1, uid: 1, sn: 1 } ntoreturn:2 ntoskip:0 nscanned:1 nscannedobjects:1 keyupdates:0 numyields:0 locks(micros) r:197 nreturned:1 reslen:303 0ms 2016-02-10t15:41:40.056+0000 [conn2303] winning plan had 0 results. not caching. ns: db.case query: { sn: "mobile", client_id: 20, created: { $gt: new date(1454295600000), $lt: new date(1456800900000) }, deleted: false, establishment_users: { $all: [ objectid('5637640afefa2654b5d863e3') ] }, is_closed: true, updated_time: { $gt: new date(1455045840000) } } sort: { updated_time: 1 } projection: {} skip: 0 limit: 15 winner score: 1.0003 winner summary: ixscan { client_id: 1, is_closed: 1, deleted: 1, updated_time: 1 } 2016-02-10t15:41:40.057+0000 [conn2303] query db.case query: { $query: { sn: "mobile", client_id: 20, created: { $gt: new date(1454295600000), $lt: new date(1456800900000) }, deleted: false, establishment_users: { $all: [ objectid('5637640afefa2654b5d863e3') ] }, is_closed: true, updated_time: { $gt: new date(1455045840000) } }, $orderby: { updated_time: 1 } } plansummary: ixscan { client_id: 1, is_closed: 1, deleted: 1, updated_time: 1 } ntoreturn:15 ntoskip:0 nscanned:26 nscannedobjects:26 keyupdates:0 numyields:0 locks(micros) r:5092 nreturned:0 reslen:20 5ms 2016-02-10t15:41:40.060+0000 [conn300] command db.$cmd command: count { count: "notification", fields: null, query: { read: false, recipient: 309 } } plansummary: ixscan { recipient: 1 } keyupdates:0 numyields:0 locks(micros) r:63 reslen:48 0ms 2016-02-10t15:41:40.133+0000 [initandlisten] connection accepted 127.0.0.1:43266 #3666 (80 connections open) 2016-02-10t15:41:40.133+0000 [conn3666] command admin.$cmd command: whatsmyuri { whatsmyuri: 1 } ntoreturn:1 keyupdates:0 numyields:0 reslen:62 0ms 2016-02-10t15:41:40.134+0000 [conn3666] command db.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:40.134+0000 [conn3666] authenticate db: db { authenticate: 1, nonce: "xxx", user: "xxx", key: "xxx" } 2016-02-10t15:41:40.134+0000 [conn3666] command db.$cmd command: authenticate { authenticate: 1, nonce: "xxx", user: "xxx", key: "xxx" } ntoreturn:1 keyupdates:0 numyields:0 reslen:82 0ms 2016-02-10t15:41:40.136+0000 [conn3666] end connection 127.0.0.1:43266 (79 connections open) 2016-02-10t15:41:40.146+0000 [conn3051] command db.$cmd command: count { count: "notification", fields: null, query: { read: false, recipient: 301 } } plansummary: ixscan { recipient: 1 } keyupdates:0 numyields:0 locks(micros) r:284 reslen:48 0ms 2016-02-10t15:41:40.526+0000 [conn3529] query db.column query: { _id: objectid('56a8d864be6632718f9fb087'), client_id: 1 } plansummary: ixscan { _id: 1 } ntoreturn:2 ntoskip:0 nscanned:1 nscannedobjects:1 keyupdates:0 numyields:0 locks(micros) r:176 nreturned:1 reslen:440 0ms 2016-02-10t15:41:40.529+0000 [conn3529] update db.column query: { _id: objectid('56a8d864be6632718f9fb087') } update: { $set: { last_request: new date(1455118900527) } } nscanned:1 nscannedobjects:1 nmatched:1 nmodified:1 fastmod:1 keyupdates:0 numyields:0 locks(micros) w:61 0ms 2016-02-10t15:41:40.529+0000 [conn3529] command db.$cmd command: update { update: "column", writeconcern: { w: 1 }, updates: [ { q: { _id: objectid('56a8d864be6632718f9fb087') }, u: { $set: { last_request: new date(1455118900527) } }, multi: false, upsert: true } ] } keyupdates:0 numyields:0 reslen:55 0ms 2016-02-10t15:41:40.531+0000 [conn3529] query db.user query: { sn: "email", client_id: 1, uid: "asdasdasdasdas" } plansummary: ixscan { client_id: 1, uid: 1, sn: 1 } ntoreturn:2 ntoskip:0 nscanned:1 nscannedobjects:1 keyupdates:0 numyields:0 locks(micros) r:278 nreturned:1 reslen:285 0ms 2016-02-10t15:41:40.546+0000 [conn3529] winning plan had 0 results. not caching. ns: db.case query: { answered: true, sn: "email", client_id: 1, establishment_users: { $all: [ objectid('5669b930fefa2626db389c0e') ] }, deleted: false, is_closed: { $ne: true } } sort: { updated_time: -1 } projection: {} skip: 0 limit: 1 winner score: 1.0003 winner summary: ixscan { client_id: 1, establishment_users: 1, updated_time: 1 } 2016-02-10t15:41:40.547+0000 [conn3529] query db.case query: { $query: { answered: true, sn: "email", client_id: 1, establishment_users: { $all: [ objectid('5669b930fefa2626db389c0e') ] }, deleted: false, is_closed: { $ne: true } }, $orderby: { updated_time: -1 } } plansummary: ixscan { client_id: 1, establishment_users: 1, updated_time: 1 } ntoskip:0 nscanned:103 nscannedobjects:103 keyupdates:0 numyields:0 locks(micros) r:9410 nreturned:0 reslen:20 9ms 2016-02-10t15:41:40.557+0000 [conn3529] winning plan had 0 results. not caching. ns: db.case query: { answered: true, sn: "email", client_id: 1, establishment_users: { $all: [ objectid('5669b930fefa2626db389c0e') ] }, deleted: false, is_closed: { $ne: true } } sort: { updated_time: -1 } projection: {} skip: 0 limit: 15 winner score: 1.0003 winner summary: ixscan { client_id: 1, establishment_users: 1, updated_time: 1 } 2016-02-10t15:41:40.558+0000 [conn3529] query db.case query: { $query: { answered: true, sn: "email", client_id: 1, establishment_users: { $all: [ objectid('5669b930fefa2626db389c0e') ] }, deleted: false, is_closed: { $ne: true } }, $orderby: { updated_time: -1 } } plansummary: ixscan { client_id: 1, establishment_users: 1, updated_time: 1 } ntoreturn:15 ntoskip:0 nscanned:103 nscannedobjects:103 keyupdates:0 numyields:0 locks(micros) r:7572 nreturned:0 reslen:20 7ms 2016-02-10t15:41:40.569+0000 [conn3028] command db.$cmd command: count { count: "notification", fields: null, query: { read: false, recipient: 145 } } plansummary: ixscan { recipient: 1 } keyupdates:0 numyields:0 locks(micros) r:237 reslen:48 0ms 2016-02-10t15:41:40.774+0000 [conn3053] command db.$cmd command: count { count: "notification", fields: null, query: { read: false, recipient: 143 } } plansummary: ixscan { recipient: 1 } keyupdates:0 numyields:0 locks(micros) r:372 reslen:48 0ms 2016-02-10t15:41:41.056+0000 [conn22] command admin.$cmd command: ping { ping: 1 } keyupdates:0 numyields:0 reslen:37 0ms ######################### here problem starts ######################### 2016-02-10t15:41:41.175+0000 [initandlisten] connection accepted 127.0.0.1:43268 #3667 (80 connections open) 2016-02-10t15:41:41.212+0000 [initandlisten] connection accepted 10.240.0.6:46021 #3668 (81 connections open) 2016-02-10t15:41:41.213+0000 [conn3668] command db.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:41.213+0000 [conn3668] authenticate db: db { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } 2016-02-10t15:41:41.213+0000 [conn3668] command db.$cmd command: authenticate { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } keyupdates:0 numyields:0 reslen:82 0ms 2016-02-10t15:41:41.348+0000 [initandlisten] connection accepted 10.240.0.6:46024 #3669 (82 connections open) 2016-02-10t15:41:41.349+0000 [conn3669] command db.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:41.349+0000 [conn3669] authenticate db: db { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } 2016-02-10t15:41:41.349+0000 [conn3669] command db.$cmd command: authenticate { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } keyupdates:0 numyields:0 reslen:82 0ms 2016-02-10t15:41:43.620+0000 [initandlisten] connection accepted 10.240.0.6:46055 #3670 (83 connections open) 2016-02-10t15:41:43.621+0000 [conn3670] command db.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:43.621+0000 [conn3670] authenticate db: db { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } 2016-02-10t15:41:43.621+0000 [conn3670] command db.$cmd command: authenticate { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } keyupdates:0 numyields:0 reslen:82 0ms 2016-02-10t15:41:43.655+0000 [initandlisten] connection accepted 10.240.0.6:46058 #3671 (84 connections open) 2016-02-10t15:41:43.656+0000 [conn3671] command db.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:43.656+0000 [conn3671] authenticate db: db { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } 2016-02-10t15:41:43.656+0000 [conn3671] command db.$cmd command: authenticate { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } keyupdates:0 numyields:0 reslen:82 0ms 2016-02-10t15:41:44.045+0000 [initandlisten] connection accepted 10.240.0.6:46071 #3672 (85 connections open) 2016-02-10t15:41:44.045+0000 [conn3672] command db.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:44.046+0000 [conn3672] authenticate db: db { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } 2016-02-10t15:41:44.046+0000 [conn3672] command db.$cmd command: authenticate { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } keyupdates:0 numyields:0 reslen:82 0ms 2016-02-10t15:41:44.083+0000 [initandlisten] connection accepted 10.240.0.6:46073 #3673 (86 connections open) 2016-02-10t15:41:44.084+0000 [conn3673] command db.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms 2016-02-10t15:41:44.084+0000 [conn3673] authenticate db: db { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } 2016-02-10t15:41:44.084+0000 [conn3673] command db.$cmd command: authenticate { authenticate: 1, user: "xxx", nonce: "xxx", key: "xxx" } keyupdates:0 numyields:0 reslen:82 0ms 2016-02-10t15:41:44.182+0000 [initandlisten] connection accepted 10.240.0.6:46076 #3674 (87 connections open) 2016-02-10t15:41:44.182+0000 [conn3674] command db.$cmd command: getnonce { getnonce: 1 } keyupdates:0 numyields:0 reslen:65 0ms
collection information
currently our database contains 163 collections. important ones messages
, column
, cases
, ones heavy inserts, updates , queries on. rest if analytics , many collections of 100 records each:
{ "ns" : "db.message", "count" : 2.96615e+06, "size" : 3906258304.0000000000000000, "avgobjsize" : 1316, "storagesize" : 9305935856.0000000000000000, "numextents" : 25, "nindexes" : 21, "lastextentsize" : 2.14643e+09, "paddingfactor" : 1.0530000000000086, "systemflags" : 0, "userflags" : 1, "totalindexsize" : 7952525392.0000000000000000, "indexsizes" : { "_id_" : 1.63953e+08, "client_id_1_sn_1_mid_1" : 3.16975e+08, "client_id_1_created_1" : 1.89086e+08, "client_id_1_recipients_1_created_1" : 4.3861e+08, "client_id_1_author_1_created_1" : 2.29713e+08, "client_id_1_kind_1_created_1" : 2.37088e+08, "client_id_1_answered_1_created_1" : 1.90934e+08, "client_id_1_is_mention_1_created_1" : 1.8674e+08, "client_id_1_has_custom_data_1_created_1" : 1.9566e+08, "client_id_1_assigned_1_created_1" : 1.86838e+08, "client_id_1_published_1_created_1" : 1.94352e+08, "client_id_1_sn_1_created_1" : 2.3681e+08, "client_id_1_thread_root_1" : 1.88089e+08, "client_id_1_case_id_1" : 1.89266e+08, "client_id_1_sender_id_1" : 1.5182e+08, "client_id_1_recipient_id_1" : 1.49711e+08, "client_id_1_mid_1_sn_1" : 3.17662e+08, "text_text_created_1" : 3320641520.0000000000000000, "client_id_1_sn_1_kind_1_recipient_id_1_created_1" : 3.15226e+08, "client_id_1_sn_1_thread_root_1_created_1" : 3.06526e+08, "client_id_1_case_id_1_created_1" : 2.46825e+08 }, "ok" : 1.0000000000000000 } { "ns" : "db.case", "count" : 497661, "size" : 5.33111e+08, "avgobjsize" : 1071, "storagesize" : 6.29637e+08, "numextents" : 16, "nindexes" : 34, "lastextentsize" : 1.68743e+08, "paddingfactor" : 1.0000000000000000, "systemflags" : 0, "userflags" : 1, "totalindexsize" : 8.46012e+08, "indexsizes" : { "_id_" : 2.30073e+07, "client_id_1" : 1.99985e+07, "is_closed, deleted_1" : 1.31061e+07, "is_closed_1" : 1.36948e+07, "sn_1" : 2.1274e+07, "deleted_1" : 1.39728e+07, "created_1" : 1.97777e+07, "current_assignment_1" : 4.20819e+07, "assigned_1" : 1.33678e+07, "commented_1" : 1.36049e+07, "has_custom_data_1" : 1.42426e+07, "sentiment_start_1" : 1.36049e+07, "sentiment_finish_1" : 1.37275e+07, "updated_time_1" : 2.02192e+07, "identifier_1" : 1.73822e+07, "important_1" : 1.38256e+07, "answered_1" : 1.41772e+07, "client_id_1_is_closed_1_deleted_1_updated_time_1" : 2.90248e+07, "client_id_1_is_closed_1_updated_time_1" : 2.86569e+07, "client_id_1_sn_1_updated_time_1" : 3.58436e+07, "client_id_1_deleted_1_updated_time_1" : 2.8477e+07, "client_id_1_updated_time_1" : 2.79619e+07, "client_id_1_current_assignment_1_updated_time_1" : 5.6071e+07, "client_id_1_assigned_1_updated_time_1" : 2.87713e+07, "client_id_1_commented_1_updated_time_1" : 2.86896e+07, "client_id_1_has_custom_data_1_updated_time_1" : 2.88286e+07, "client_id_1_sentiment_start_1_updated_time_1" : 2.87223e+07, "client_id_1_sentiment_finish_1_updated_time_1" : 2.88776e+07, "client_id_1_identifier_1_updated_time_1" : 3.48216e+07, "client_id_1_important_1_updated_time_1" : 2.88776e+07, "client_id_1_answered_1_updated_time_1" : 2.85669e+07, "client_id_1_establishment_users_1_updated_time_1" : 3.93838e+07, "client_id_1_identifier_1" : 1.86413e+07, "client_id_1_sn_1_users_1_updated_time_1" : 4.47309e+07 }, "ok" : 1.0000000000000000 } { "ns" : "db.column", "count" : 438, "size" : 218672, "avgobjsize" : 499, "storagesize" : 696320, "numextents" : 4, "nindexes" : 2, "lastextentsize" : 524288, "paddingfactor" : 1.0000000000000000, "systemflags" : 0, "userflags" : 1, "totalindexsize" : 65408, "indexsizes" : { "_id_" : 32704, "client_id_1_owner_1" : 32704 }, "ok" : 1.0000000000000000 }
mongostat
here of lines have running mongostat during normal operation:
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netin netout conn time *0 34 2 *0 0 10|0 0 32.6g 65.5g 1.18g 0 db:0.1% 0 0|0 0|0 4k 39k 87 20:44:44 2 31 13 *0 0 7|0 0 32.6g 65.5g 1.17g 3 db:0.8% 0 0|0 0|0 9k 36k 87 20:44:45 1 18 2 *0 0 5|0 0 32.6g 65.5g 1.12g 0 db:0.4% 0 0|0 0|0 3k 18k 87 20:44:46 5 200 57 *0 0 43|0 0 32.6g 65.5g 1.13g 12 db:2.3% 0 0|0 0|0 46k 225k 86 20:44:47 1 78 23 *0 0 5|0 0 32.6g 65.5g 1.01g 1 db:1.6% 0 0|0 0|0 18k 313k 86 20:44:48 *0 10 1 *0 0 5|0 0 32.6g 65.5g 1004m 0 db:0.2% 0 0|0 1|0 1k 8k 86 20:44:49 3 48 23 *0 0 11|0 0 32.6g 65.5g 1.05g 4 db:1.1% 0 0|0 0|0 16k 48k 86 20:44:50 2 38 13 *0 0 8|0 0 32.6g 65.5g 1.01g 8 db:0.9% 0 0|0 0|0 10k 76k 86 20:44:51 3 28 16 *0 0 9|0 0 32.6g 65.5g 1.01g 7 db:1.1% 0 0|0 1|0 11k 62k 86 20:44:52 *0 9 4 *0 0 8|0 0 32.6g 65.5g 1022m 1 db:0.4% 0 0|0 0|0 3k 6k 87 20:44:53 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netin netout conn time 3 107 34 *0 0 6|0 0 32.6g 65.5g 1.02g 1 db:1.1% 0 0|0 0|0 23k 107k 87 20:44:54 4 65 37 *0 0 8|0 0 32.6g 65.5g 2.69g 57 db:6.2% 0 0|0 0|0 24k 126k 87 20:44:55 9 84 45 *0 0 8|0 0 32.6g 65.5g 2.63g 17 db:5.3% 0 0|0 1|0 32k 109k 87 20:44:56 4 84 47 *0 0 44|0 0 32.6g 65.5g 1.89g 10 db:5.9% 0 0|0 1|0 30k 146k 86 20:44:57 3 73 32 *0 0 9|0 0 32.6g 65.5g 2.58g 12 db:4.7% 0 0|0 0|0 20k 112k 86 20:44:58 2 165 48 *0 0 7|0 0 32.6g 65.5g 2.62g 7 db:1.3% 0 0|0 0|0 34k 147k 86 20:44:59 3 61 26 *0 0 12|0 0 32.6g 65.5g 2.2g 6 db:4.7% 0 0|0 1|0 19k 73k 86 20:45:00 3 252 64 *0 0 12|0 0 32.6g 65.5g 1.87g 85 db:3.2% 0 0|0 0|0 52k 328k 86 20:45:01 *0 189 40 *0 0 6|0 0 32.6g 65.5g 1.65g 0 db:1.6% 0 0|0 0|0 33k 145k 87 20:45:02 1 18 10 *0 0 5|0 0 32.6g 65.5g 1.55g 3 db:0.9% 0 0|0 0|0 6k 15k 87 20:45:03 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netin netout conn time 1 50 11 *0 0 6|0 0 32.6g 65.5g 1.57g 6 db:0.8% 0 0|0 0|0 9k 63k 87 20:45:04 2 49 16 *0 0 6|0 0 32.6g 65.5g 1.56g 1 db:1.1% 0 0|0 0|0 12k 50k 87 20:45:05 1 35 11 *0 0 7|0 0 32.6g 65.5g 1.58g 1 db:0.9% 0 0|0 0|0 8k 41k 87 20:45:06 *0 18 2 *0 0 42|0 0 32.6g 65.5g 1.55g 0 db:0.4% 0 0|0 0|0 5k 19k 86 20:45:07 6 75 40 *0 0 11|0 0 32.6g 65.5g 1.56g 10 db:1.9% 0 0|0 0|0 27k 89k 86 20:45:08 6 60 35 *0 0 7|0 0 32.6g 65.5g 1.89g 5 db:1.5% 0 0|0 1|0 23k 101k 86 20:45:09 2 17 14 *0 0 7|0 0 32.6g 65.5g 1.9g 0 db:1.3% 0 0|0 1|0 8k 29k 86 20:45:10 2 35 7 *0 0 4|0 0 32.6g 65.5g 1.77g 1 db:1.3% 0 0|0 0|0 7k 60k 86 20:45:12 4 50 28 *0 0 10|0 0 32.6g 65.5g 1.75g 10 db:2.0% 0 0|0 0|0 19k 79k 87 20:45:13 *0 3 1 *0 0 5|0 0 32.6g 65.5g 1.63g 0 .:0.7% 0 0|0 0|0 1k 4k 87 20:45:14 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netin netout conn time 5 77 35 *0 0 8|0 0 32.6g 65.5g 1.7g 13 db:3.0% 0 0|0 0|0 23k 124k 88 20:45:15 3 35 18 *0 0 7|0 0 32.6g 65.5g 1.7g 5 db:0.8% 0 0|0 0|0 12k 43k 87 20:45:16 1 18 5 *0 0 11|0 0 32.6g 65.5g 1.63g 2 db:0.9% 0 0|0 0|0 5k 35k 87 20:45:17 3 33 21 *0 0 5|0 0 32.6g 65.5g 1.64g 3 db:0.8% 0 0|0 0|0 13k 32k 87 20:45:18 *0 25 4 *0 0 42|0 0 32.6g 65.5g 1.64g 0 db:0.3% 0 0|0 0|0 5k 34k 86 20:45:19 1 25 5 *0 0 5|0 0 32.6g 65.5g 1.65g 3 db:0.2% 0 0|0 0|0 5k 24k 86 20:45:20 12 88 65 *0 0 7|0 0 32.6g 65.5g 1.7g 25 db:4.2% 0 0|0 0|0 42k 121k 86 20:45:21 2 53 17 *0 0 4|0 0 32.6g 65.5g 1.65g 2 db:1.5% 0 0|0 0|0 12k 82k 86 20:45:22 1 9 6 *0 0 7|0 0 32.6g 65.5g 1.64g 1 db:1.0% 0 0|0 0|0 4k 13k 86 20:45:23 *0 6 2 *0 0 7|0 0 32.6g 65.5g 1.63g 0 db:0.1% 0 0|0 0|0 1k 5k 87 20:45:24
replica set: updated on may 15th 2016
we migrated our stand alone instance replica set. 2 secondaries serving reads , 1 primary doing writes. machines on replica set area snapshots of original machine. happened new configuration issue changed , it's harder detect.
it happens less instead of sky rocketing connections , queues, whole replica set stops reading/writing, no high connections, no queues no expensive operations @ all. request db time out. fix issue sigkill mongodb process must sent 3 machines.
Comments
Post a Comment