Wednesday, June 25, 2014

#MongoDBworld Hardware provisioning of @MongoDb – What I need to know!

Session details:

Hardware Provisioning for MongoDB

Some of the most common questions we hear from users relate to capacity planning and hardware choices. How many replicas do I need? Should I consider sharding right away? How much RAM will I need for my working set? SSD or HDD? No one likes spending a lot of cash on hardware and cloud bills can just be as painful. MongoDB is different from traditional RDBMSs in its resource management, so you need to be mindful when deciding on the cluster layout and hardware. In this talk we will review the factors that drive the capacity requirements: volume of queries, access patterns, indexing, working set size, among others. Attendees will gain additional insight as we go through a few real-world scenarios, as experienced with MongoDB Inc customers, and come up with their ideal cluster layout and hardware.

Chad Tindel

Senior Solution Architect at MongoDB

Chad Tindel is a Senior Solution Architect at MongoDB where he specializes in helping customers understand and use the nosql product to solve complex business problems. Previously, Chad was a Solution Architect at Cloudera focusing on the Hadoop space and was also a Solution Architect at Red Hat, helping customers build out their enterprise Linux infrastructures. He holds a BS in Computer Science from California Polytechnic in San Luis Obispo as well as an MS in Finance from the University of Denver.

Hardware Provisioning

There is not a lot of information out there on sizing MongoDB. This session, even though the last session was well attended.

How do you size?

Often customers over or under engineer.

Think the scenario where your app gets listed and suddenly lots of sign ups. The server gets beaten and needs to be re-sized.

Requirements – Step 1

What are the business requirements

  • Uptime (do you need more than one Data Center
  • Availability
  • Throughput
  • Responsiveness
  • Acceptable latency – especially during peak times.


Resources available

Continuing Requirements

  • Requirements can change over time
  • More users, more data, new indexes
  • More writes


  • Collect metrics!
  • Adjust configuration incrementally
  • Plan ahead

Try to avoid a crisis.

Do a Proof of Concept

  • Start small on a single node
  • Design your schema (Read and write applications are different)
  • Understand query patterns
  • Get a handle on working set (the active data)

Then add replication to see impact

Review Requirements as result of POC

  • Data sizes (Number of documents, Average document size, size of data on disk, size of indexes, expected growth, document model)
  • Ingestion – Throughput / Updates / Deletes per second peak and average
  • Bulk inserts? How large and How often?

Do you have SLAs on this performance?

  • Performance expectations
  • Life of data
  • Security requirements (SSL, Encryption at rest)
  • Number of data centers in use (Active/Active , Active/Passive Cross Data Center latency)

Resource Usage:

IOPS (4K in size)
Data and loading patterns.

CPU tends to be less important

Fast storage and as much RAM as you can.

Network latency affects replication lag


7200 RPM SATA = 75-100 IOPS
15000 SAS = 175-210 IOPS
Amazon SSD EBS = 4,000 PIOPS / Volume
48,000 PIOPS / Instance

Intel X-25-E SLC = 5,000 IOPS

Use IOSTAT to monitor disk performance (or MongoPerf).

Release 2.4 added a feature to estimate the size of a working set.

Network Performance

Latency impacts WriteConcern time and ReadPreference

Throughput impacts Update and Write Patterns and Read/Queries

Use Netperf to measure network performance.

CPU Usage

Only really comes in to play when using queries without indexes which mean performing a table scan.
or for Sorting within a Shard and MergeSorted when aggregated.

Aggregation Framework or MapReduce require CPU Performance.

Case Study – Spanish Bank:

  • 6 months of logs held for 6 months
  • 18TB at 3TB/Month
  • Prod environment:
    3 Nodes / shard * 36 Shards = 108 Physical Machines
    128GB/RAM * 36 = 4.6TB RAM

2 Mongos
3 Config servers (virtual machines)

Online Retailer

  • moving Product catalog from SQL Server to MongoDB as an overhaul to Open Source
  • 2 Main Data Centers active/active
  • Cyber Monday peaks at 214 Requests/Sec. Budget for 400 Requests/Sec for headroom.
  • Heavy Read process orientation.


  • 4 M product SKU’s with JSON document size of 30KB
  • Requests for specific product (by _id)
  • Products by Category (Return 72 documents – or 200 if a google bot)


  • Partition (Shard) by Category.
  • Products in multiple categories are duplicated means on average doc is in 2 categories so store 4M SKUs x 2 = 8M

8M docs * 30K want everything in memory. 384GB RAM/Server

Sharding adds a layer of complexity (eg. Add config server) so don’t shard unless you need to.

Determined a 4 Node Replica set 2 in each Data Center. Plus an Arbiter.

Recommended a Single Replica Set
- 4 Node Replica

But customer found they could only deploy on 64G RAM. So they deployed 3 shards 4 nodes each + Arbiter.

Arbiters are small. They just exist for voting. Can be a small 1VCPU with 4GB RAM.

[tag health cloud BigData MongoDB MongoDBWorld NoSQL]

Mark Scrimshire
Health & Cloud Technology Consultant

Mark is available for challenging assignments at the intersection of Health and Technology using Big Data, Mobile and Cloud Technologies. If you need help to move, or create, your health applications in the cloud let’s talk.
Stay up-to-date: Twitter @ekivemark
Disclosure: I began as a Patient Engagement Advisor and am now CTO to Personiform, Inc. and their platform. Medyear is a powerful free tool that helps you collect, organize and securely share health information, however you want. Manage your own health records today. Medyear: The Power Grid for your Health.

via WordPress