What are Page Faults in MongoDB?
Page faults occur when a database cannot read the data it needs from RAM and is forced to read it from the physical disk. In MongoDB specifically, a page fault happens when the database fails to find data in virtual memory and must fetch it from disk. Most databases cache as much data in RAM as possible, because reading from physical disk is slow and costs you valuable time. We would all love for every piece of our data to live in RAM, but that is extremely expensive and usually infeasible, so the database will inevitably need to read from disk.
Depending on the version of MongoDB you are running, your storage engine is either MMAPv1 or WiredTiger. MMAPv1 was MongoDB's original storage engine; WiredTiger replaced it as the default in version 3.2, and MMAPv1 was deprecated in 4.0 and removed in 4.2. This blog focuses mainly on MMAPv1, because page faults are not tracked as a relevant statistic in WiredTiger.
Older versions of MongoDB manage documents and indexes in memory, and MMAPv1 maps the underlying data files into virtual memory using memory-mapped files created with the mmap() syscall. Because MMAPv1 relies so heavily on virtual memory, it is prone to page faults: there is only a limited amount of space available to cache data.
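As a rough illustration of how memory-mapped I/O behaves (a generic Python sketch, not MongoDB's internal code), mapping a file into a process's address space means the first read of a non-resident page triggers a page fault, which the OS services by loading that page from disk:

```python
import mmap
import os
import tempfile

# Create a sample data file (standing in for a MongoDB data file).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"document-data" * 1024)

# Map the file into virtual memory, as MMAPv1 does via mmap().
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Reading a page that is not yet resident in RAM causes a
    # page fault; the OS transparently loads it from disk.
    first_bytes = mm[:13]
    mm.close()

os.remove(path)
print(first_bytes)  # b'document-data'
```

The process never issues an explicit read() here; the fault-and-load happens entirely inside the OS, which is exactly why MMAPv1's performance degrades when the mapped data no longer fits in RAM.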
Find The Root Cause
As mentioned in our previous MongoDB troubleshooting blogs, page faults, like database locking and replication lag, are a common occurrence in a MongoDB database running MMAPv1. But if they begin to happen consistently, or at a higher than normal volume, you may need to take action, because frequent page faults usually indicate a deeper problem in your system. Since MMAPv1 is an older technology, page faults occur more often under it. Its constraints include a lack of data compression options, a cache that consumes all free memory, and an inability to scale, all of which come back to not having enough available RAM to read from. Page faults may also be caused by unindexed queries; suspect this root cause if you notice a high ratio of page faults to operations, typically 1:1 or higher.
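To make the 1:1 heuristic concrete, here is a small Python sketch. The counter names and sample numbers are made up for illustration; in practice you would take them from mongostat or db.serverStatus() samples over an interval:

```python
def fault_ratio(page_faults, operations):
    """Return page faults per operation over a sampling interval."""
    if operations == 0:
        return 0.0
    return page_faults / operations

# Hypothetical counters sampled over one interval.
faults, ops = 1200, 1000
ratio = fault_ratio(faults, ops)

# A ratio of 1.0 or higher often points at unindexed queries
# or a working set that no longer fits in RAM.
needs_attention = ratio >= 1.0
print(ratio, needs_attention)  # 1.2 True
```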
Minimize and Prevent Future Page Faults
Now that you know what causes page faults and what the underlying problem could be, it's time to stop them from happening too often!
Ideally, in a perfect world, you could stop MongoDB page faults from ever happening again. Sadly, we don't live in a perfect world, so your best hope is to minimize how often they occur. Since page faults have several causes, there are several methods for preventing them. Because MMAPv1's performance depends heavily on the amount of RAM available and on caching data in virtual memory, the first preventative measure to take is monitoring how much RAM your systems have available. You can use mongostat, which returns statistics from MongoDB, including page faults. mongostat is a fairly basic monitoring tool, though, and won't give you much insight into why problems are arising. You may want to consider setting up a more comprehensive monitoring system for your entire environment with services like Google Cloud Monitoring and Logging, or New Relic.
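For a sense of what to look for, here is a sketch that pulls the faults column out of a mongostat-style line. The sample output below is simplified and made up (real mongostat prints many more columns), but the faults field it parses is the one that reports page faults under MMAPv1:

```python
# A hypothetical, truncated line of mongostat output; the "faults"
# column reports page faults during the sampling interval.
sample = "insert query update delete faults\n   *0    12     3      0     42"

header, values = sample.splitlines()
# Pair each column name with its value for easy lookup.
cols = dict(zip(header.split(), values.split()))
faults = int(cols["faults"])
print(faults)  # 42
```

Parsing like this is only useful for a quick script; a real monitoring pipeline should consume structured output (e.g. mongostat's JSON mode or serverStatus) instead of screen-scraping columns.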
A more comprehensive monitoring solution lets you stay on top of the problem and be proactive instead of reactive. BindPlane integrates with these monitoring services and lets you monitor and set up alerts for metrics relevant to page faults, including file size, index size, number of indexes, memory usage (mapped, resident, virtual), and more; you can find the rest in our MongoDB for Stackdriver docs.
Along with monitoring the metrics relevant to MongoDB page faults, you should make sure your data is organized into working sets that fit into memory and won't use more RAM than required. You should also make sure your data is indexed correctly in MongoDB. Indexing is essential to executing queries efficiently: when data isn't indexed, accessing it takes more RAM, which can trigger page faults if your environment doesn't have enough available. Visit the MongoDB docs on indexing to learn more about properly indexing your data.
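A quick back-of-the-envelope check for working-set fit can be sketched as follows. The numbers here are hypothetical; in practice the data and index sizes would come from db.stats(), which reports them in bytes:

```python
def working_set_fits(index_size_bytes, hot_data_bytes, available_ram_bytes):
    """True if the working set (indexes plus frequently accessed
    data) fits in the RAM available to MongoDB."""
    return index_size_bytes + hot_data_bytes <= available_ram_bytes

GiB = 1024 ** 3

# Hypothetical sizes, e.g. read from db.stats() output.
fits = working_set_fits(2 * GiB, 10 * GiB, 16 * GiB)
print(fits)  # True: a 12 GiB working set fits in 16 GiB of RAM
```

If this check fails, the remedies are the ones discussed above: add RAM, trim or restructure indexes, or narrow the hot data set, because under MMAPv1 an oversized working set translates directly into page faults.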
Finally, unless there are extenuating circumstances, if you are using MMAPv1 you should consider upgrading to the newest version of MongoDB and moving to WiredTiger. Migrating all of your data to a new engine can be difficult, but in the long run it is worth the effort now that MMAPv1 has been deprecated and is no longer supported by MongoDB.
WiredTiger Storage Engine:
MongoDB’s current storage engine, released in version 3.0 and made the default as of version 3.2.
MMAPv1 Storage Engine:
MongoDB’s original storage engine, deprecated in MongoDB 4.0 and removed in 4.2.
WiredTiger vs MMAPv1 Data compression
WiredTiger Data compression: With its own write cache and a filesystem cache, as well as support for Snappy and zlib compression, WiredTiger takes up much less space than MMAPv1.
MMAPv1 Data compression: MMAPv1 is based on memory-mapped files and does not support data compression. It does, however, perform well for high-volume, in-place updates.
WiredTiger vs MMAPv1 Journaling compression
WiredTiger Journaling: Checkpoints are at the core, and journal writes record data changes made between checkpoints. To recover from a crash, MongoDB replays the journal entries written since the last checkpoint.
MMAPv1 Journaling: In the event of a crash, MongoDB with MMAPv1 replays the journal files to restore a consistent state.
WiredTiger vs MMAPV1 Locks and Concurrency
WiredTiger Locks and Concurrency: Employs document-level locking; intent locks are used only at the global, database, and collection levels.
MMAPv1 3.0 Locks and Concurrency: Uses collection-level locking.
MMAPv1 2.6: Allows concurrent reads to a database, but a single write operation gets exclusive access.
WiredTiger vs MMAPV1 Memory
WiredTiger Memory: MongoDB with WiredTiger uses both a filesystem cache and an internal cache, and together they will use all free memory. By default, the internal cache is sized at the larger of 50% of (RAM − 1 GB) or 256 MB.
MMAPv1 Memory: MongoDB with MMAPv1 uses as much of the available memory as it can. However, MongoDB will yield cached memory when another process requires at least half of the server’s RAM.
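MongoDB's documented default for the WiredTiger internal cache is the larger of 50% of (RAM − 1 GB) or 256 MB. That sizing rule can be sketched in Python:

```python
def wiredtiger_default_cache_mb(total_ram_mb):
    """Default WiredTiger internal cache size in MB: the larger of
    50% of (total RAM - 1 GB) or 256 MB."""
    return max(0.5 * (total_ram_mb - 1024), 256)

print(wiredtiger_default_cache_mb(16384))  # 7680.0 MB on a 16 GB host
print(wiredtiger_default_cache_mb(1024))   # 256 MB on a 1 GB host
```

On small hosts the 256 MB floor dominates, which is why WiredTiger remains usable even on machines with very little RAM; the filesystem cache then does most of the caching work.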
WiredTiger vs MMAPv1 Comparison Table
| Feature | WiredTiger | MMAPv1 |
|---|---|---|
| Updates | Documents are rewritten; in-place updates not supported | High-volume & in-place updates |
| CPU performance | Scales with multi-core systems | More CPU cores != better performance |
| Transactions | Multi-document transactions | Atomic operations on a single document |
| Encryption | Encryption at rest | Not possible |
| Memory | Internal & filesystem cache | Uses all free memory as cache |
| Tuning | More variables available for tuning | Fewer opportunities to tune |
| Locks & concurrency | Document-level locking | Collection-level locking |
| Journaling | Checkpoints, with journals used between checkpoints | Uses journal files for a consistent state |
| Data compression | Snappy & zlib compression | Not supported |
This article originally appeared on Intellyx
Today, Blue Medora offers two products with different go-to-market strategies: True Visibility for on-premises VMware deployments, and BindPlane, a SaaS offering that supports cloud and hybrid IT environments.
Blue Medora has also recently added log monitoring to its capabilities, rounding out its cloud-native observability story, which extends its hybrid IT capabilities.
BindPlane is also able to query the environments it has access to, identifying available sources of data automatically, thus dramatically simplifying installation and configuration.
Update: Google Stackdriver is now Google Cloud Logging and Google Cloud Monitoring. BindPlane will continue to integrate and support both of these products.
A few weeks ago I sat down with Justin Brodley and Jonathan Baker, hosts of the Cloud Pod podcast, to talk about BindPlane and to discuss logging and monitoring. The podcast was posted this past week, and I want to give you a quick overview of the topics we covered and how BindPlane from Blue Medora fits into the mix. You can listen to the podcast here.
To get the most out of the podcast and the concepts we discussed, I want to quickly break down who Blue Medora is and share some background on our latest SaaS product, BindPlane. Blue Medora lives in the IT performance monitoring space, but we don't think of ourselves as the user's primary platform. With BindPlane, we help customers expand their logging and monitoring to their preferred destinations, like Google Cloud Platform (GCP) and New Relic, to widen the aperture of what they are able to observe. Our goal is to reduce the number of monitoring tools required so that an organization can understand its full environment with BindPlane. We do this by making it easy to deploy agents, monitor the different components an organization needs, and put it all into a single view within a customer's preferred centralized location.
Our latest release expands BindPlane to monitor logs within Google Stackdriver and New Relic. With BindPlane Logs, we deploy a fully managed log agent that comes with pre-configured log bundles, letting customers get everything up and running while monitoring only what they need and want. Helping customers control what they monitor, and how frequently, helps them avoid the surprise bill at the end of the month from over-monitoring. By making it easy for users to get started and to stay updated, BindPlane separates itself from other open source solutions by putting the customer in control.
Now that you have a basic understanding of BindPlane, here are a few of the additional topics that we dive into on the podcast:
- Serverless Technology (Kubernetes v Serverless)
- Machine Learning and AI
- Configuration and Compliance Management
- On-premise to Cloud Migration
- The BindPlane Roadmap
I don’t want to give too much away, but as we discuss the above topics, I share our relevant customer use cases for how BindPlane is helping DevOps save time and cut costs with monitoring both logs and metrics. I would also like to thank Justin and Jonathan once more for being great hosts and I hope that you will get a lot out of our conversation!
Company demonstrates strong revenue growth, product innovation, and strengthening of the executive team
GRAND RAPIDS, MI. – JANUARY 27, 2020 – Blue Medora, the leading provider of enterprise IT monitoring integration solutions, reported today that it closed a record-breaking year in 2019 with sector-leading performances across revenue, customer acquisition and product innovation as well as adding key hires to its executive staff.
The leading indicator of the company’s growth is its thriving customer acquisition. Blue Medora’s total customer base grew 60% in 2019, resulting in over 550 total customers worldwide in industries spanning government, healthcare, telecommunication, finance and retail. Monthly recurring revenue increased 52% as a result of this strong customer growth.
To drive this momentous growth, the company expanded its executive team by welcoming Bekim Protopapa as chief executive officer, Carol Volk as chief marketing officer and Greg Pattison as general manager.
- Bekim Protopapa joined Blue Medora in October, bringing over 20 years of leadership experience from companies within the systems management, networking and security sectors, including roles at NetIQ, BladeLogic, Blue Coat Systems, Cymtec and Mimecast. During his tenure at NetIQ, the company grew from a small startup to a public company with a $250+ million run rate and made several strategic acquisitions. While he was at Blue Coat Systems, the company experienced a period of unprecedented growth, expanding from roughly $100 million in annual revenue to over $400 million in under four years. Lastly, as the GM/COO of Mimecast in North America, he led a high-velocity go-to-market team, culminating in one of NASDAQ's most successful IPOs of 2015.
- Carol Volk joined Blue Medora in July from cybersecurity leader STEALTHbits where she ran marketing, growing marketing-sourced revenue from near 0% to over 30%. Prior to STEALTHbits, Carol held marketing and product management positions at Oracle and RightNow Technologies. Carol also co-founded and ran a startup that provided premiere sales and marketing tools for high tech sales organizations, focusing on Fortune 500 organizations that included Microsoft, IBM, Hewlett Packard, Cisco, Siemens, D&B, and Johnson Controls. Prior to this, Carol held product development and other technical positions at Voxware, McAfee and Prudential Investments.
- Greg Pattison returned to Blue Medora in July to lead the True Visibility Group business unit. Greg leverages his past experience in product development for the automotive, finance, computer and paper industries to produce world-class IT monitoring solutions. Greg served in the US Navy as a Nuclear Power Plant Supervisor for six years prior to getting his computer engineering degree. He has worked for Mead, Burke Porter, and Atomic Object before joining Blue Medora.
In addition to these executive hires, Blue Medora also expanded its partner ecosystem. The company established a growing relationship with Google that encompasses Google's monitoring service providing metrics and log data from non-Google Cloud Platform technologies, launching a metrics offering in March and expanding to logs in November 2019. Blue Medora also deepened its relationship with New Relic, offering New Relic One customers the ability to collect log data from more than 50 log sources, with availability announced in October. This builds on the momentum established with New Relic over the last two years with the BindPlane solution for metrics.
“Blue Medora continues to deliver strong operating results, and product innovation that solves real customer problems. To that end, 2019 was no different, and we accelerated in several key areas,” said Bekim Protopapa, CEO of Blue Medora. “The pace of product innovation throughout the year was a significant component in driving the momentum we experienced. On average, we brought to market one new metrics or logs source per week. It’s clear our rapid product development is resonating with our users given the impressive year over year growth in our customer base.”
In addition to the standard operating benchmarks, Blue Medora has always had a deep tradition of giving back to the western Michigan community it calls home. A sampling of the organizations Blue Medorians are proud to work with includes:
- SoftwareGR.org – board member and donors
- Bitcamp – Blue Medora hosts a coding camp for 7th and 8th-grade girls
- FIRST Robotics Team – Blue Medora hosts the practice space and has donated laptops for a local club
- Django Girls GR, The Right Place, Degage and Friends of GR Parks – volunteers and/or donors.
About Blue Medora
Blue Medora’s pioneering IT monitoring integration as a service addresses today’s IT challenges by easily connecting system health and performance data–no matter its source–with the world’s leading monitoring and analytics platforms. Blue Medora helps customers unlock dimensional data across their IT stack, otherwise hidden by traditional approaches to metrics collection.
P: +1 650 996 0778