For any given task, there are many tools you can use, but few of them are the right ones. If you’re in IT, you’ve probably heard the saying “Use the right tool for the job.” In architects’ meetings, this sentence echoes from one corner to the other. In my career, I’ve heard it more times than I can count, and the wisdom behind it has stood the test of time. That’s because no single storage tool fits every type of data. In this post, you’re going to learn about MongoDB on Google Cloud, including how to set it up and keep it healthy.
There’s a right database for every task. Today, data grows rapidly and comes in all shapes and sizes. This gives rise to questions like:
- How flexible is your data storage? Can it adapt to ever-changing business demands?
- How easy is it to scale and respond to an increasing workload?
Sometimes these questions lead you to the door of a NoSQL database. In fact, they might lead you to MongoDB.
What Is MongoDB?
MongoDB is a NoSQL database. It’s completely document-oriented.
For some time, traditional relational databases have been the modus operandi for most organizations.
In a relational database like PostgreSQL, there are tables that have relations to other tables. A single row in the table constitutes a record. The schema is predetermined and well planned before creation.
MongoDB uses a different approach. In MongoDB, everything that constitutes a record is stored as a single JSON-like document, and documents in the same collection don’t have to conform to a fixed schema. This allows for flexibility: two records can carry entirely different sets of attributes.
So now you know the difference between a relational database and a NoSQL database. But why would you want to run a NoSQL database like MongoDB anyway?
They say data is the heart of every application. When something’s wrong with your data, it will propagate and infect the rest of your application.
Modern applications change frequently during the course of development. Most of the time, those changes affect the data schema.
A traditional relational database management system lacks the flexibility to move easily with changes to the data schema.
Relational databases were built for structured data, where every row has the same columns. But some data is just too ambiguous to be confined to a specific schema.
With MongoDB, you’re not limited to a specific data schema.
Gone are the days when applications were designed to serve a small, finite set of people in one office location. Now, users troop in from every geographical region to use modern applications. You need a database that scales easily as you grow.
MongoDB is a database that will help you scale. A large data set can be challenging for a single server to handle. MongoDB implements sharding, a technique that allows you to scale out your database horizontally by distributing your data and load over multiple servers.
There are even more reasons why MongoDB might just be the right database for your project, but let’s move on to talking about why it works especially well on GCP.
Why MongoDB on Google Cloud Platform?
These days, we see a lot of cloud evangelists preaching that running an on-premises data center is archaic and that you shouldn’t do it.
And yes, you probably know that. But how do you know which cloud provider is suitable for you? Why choose MongoDB on Google Cloud and not other cloud providers?
Here are a few reasons:
- GCP runs on Google’s own global network. Running MongoDB on Google’s cloud platform means you benefit from the low-latency infrastructure Google provides.
- GCP supports live migration. That is, Compute Engine can move your running VMs from one host to another during maintenance without downtime.
- Google has security built into its culture. You’ll benefit from all the security practices Google has learned and built in over the years.
With that covered, let’s jump into implementation details!
When it comes to using MongoDB on Google Cloud’s infrastructure, you have two options:
- Set up MongoDB on Google Cloud yourself.
- Use one of the integration options provided by GCP.
We’re going to cover the self-managed route. The next few paragraphs will show you how to set up MongoDB on Google Cloud yourself.
Choosing the Right Instance Type
First, you need a compute instance to run MongoDB on. But how do you know which instance is right for you?
The answer largely depends on your workload. You should first consider how much data you have, your queries, and the kind of service-level agreements you have with your customers.
Ideally, you want to ensure your choice of instance has enough RAM to support your workload and that it has fast enough persistent storage. Generally, databases do well with fast persistent storage.
For this example, I’m using instance type n1-standard-1 (1 vCPU, 3.75 GB memory) running Ubuntu 18.04 LTS.
Let’s go through the process of creating a new instance.
Creating the Instance
1. Go to Compute Engine in GCP and click on Create Instance.
2. Fill in the details of your instance type and click Create.
This could take a few seconds—don’t worry if it does.
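If you prefer the command line to the console, the same kind of instance can be described with the gcloud CLI. The sketch below is one possible equivalent, not the exact steps of the console walkthrough. The instance name, zone, network tag, and disk settings are placeholders, and the function only prints the command so you can review it before running it.

```shell
# Placeholder values (assumptions, not from the walkthrough); change them to fit your project.
INSTANCE_NAME="mongodb-1"
ZONE="us-central1-a"
NETWORK_TAG="mongodb"   # lets a firewall rule target just this VM

# Print the gcloud command instead of executing it, so it can be reviewed first.
instance_create_cmd() {
  printf '%s\n' "gcloud compute instances create $INSTANCE_NAME --zone=$ZONE --machine-type=n1-standard-1 --image-family=ubuntu-1804-lts --image-project=ubuntu-os-cloud --boot-disk-type=pd-ssd --boot-disk-size=50GB --tags=$NETWORK_TAG"
}

instance_create_cmd
```

To create the VM for real, pipe the printed command to a shell with instance_create_cmd | sh. The SSD boot disk is a design choice that follows the earlier advice about fast persistent storage.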
3. By default, MongoDB listens on port 27017. You want to make sure this port is allowed in your firewall rules only for the specific IP addresses you want to grant access to. Mind you, don’t open this port to the entire world. One way to limit access is to create a new firewall rule in your VPC.
In the rule, the source IP ranges are the addresses of the servers that are allowed to connect, and the protocols and ports field names the traffic to allow (here, TCP port 27017).
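The firewall rule described above can also be sketched with the gcloud CLI. The rule name, network, source range, and target tag below are assumptions for illustration only; substitute the range your application servers actually use. As a precaution, the function prints the command rather than running it.

```shell
# Hypothetical values; adjust to your own VPC.
RULE_NAME="allow-mongodb-from-app"
NETWORK="default"
SOURCE_RANGES="10.128.0.0/20"   # assumed range of the application servers, never 0.0.0.0/0
TARGET_TAG="mongodb"            # assumed network tag on the MongoDB VM

# Print an ingress rule that allows only TCP 27017 from the given ranges.
firewall_rule_cmd() {
  printf '%s\n' "gcloud compute firewall-rules create $RULE_NAME --network=$NETWORK --direction=INGRESS --action=ALLOW --rules=tcp:27017 --source-ranges=$SOURCE_RANGES --target-tags=$TARGET_TAG"
}

firewall_rule_cmd
```

Scoping the rule with a target tag means it applies only to the database VM instead of every instance in the network.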
At this point, you’ve provisioned your instance. The next step is to install MongoDB.
How to Install MongoDB
First, you need to SSH into the Ubuntu instance you created. There are many ways you could do this. One way is to use the SSH from the browser option that GCP provides.
From the command line, run the command below to import the MongoDB public GPG key:
$ wget -qO - https://www.mongodb.org/static/pgp/server-4.0.asc | sudo apt-key add -
Create the file /etc/apt/sources.list.d/mongodb-org-4.0.list using the command below:
$ echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list
Finally, you can install MongoDB by running this command:
$ sudo apt-get update && sudo apt-get install -y mongodb-org
The command above will take a couple of seconds to complete. And afterward, you should have MongoDB installed.
Now you’ll want to take a moment to verify that MongoDB was installed correctly. To do that, run this command:
$ sudo systemctl status mongod
If everything went fine, the output lists the mongod service as loaded but inactive.
At this point, MongoDB is installed but it’s not running. So start MongoDB with this command:
$ sudo systemctl start mongod
If you check the status again with sudo systemctl status mongod, the status changes to active (running).
Congratulations! You’ve successfully installed a MongoDB instance.
There’s still more to configure, though. In the next section, I’ll show you how to create an administrator account.
Creating an Administrator User
By default, MongoDB doesn’t require authorization for connections it accepts on the bind IP address (bindIp). This means anyone who can reach that address can manage MongoDB databases without authenticating. In recent versions of MongoDB, the bindIp value defaults to 127.0.0.1 (localhost).
This becomes a huge security risk if you change the bind IP address to an IP address accessible from the internet without enabling authorization. Don’t do it!
So, how do you create an administrator account? It’s simple.
First, open a MongoDB shell on the instance by running the mongo command.
You should now have a MongoDB shell connection and a prompt waiting for input.
Now, you can execute the command below to create an admin account. Don’t forget to replace AdminUser with your preferred username and strongPassword with a strong password.
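The original command is not reproduced in this copy of the post, so here is a minimal sketch of what creating the administrator could look like. The userAdminAnyDatabase role is an assumption (it is a common choice for a user administrator); AdminUser and strongPassword are the placeholders named in the text. The function only prints the createUser call so you can paste it into the MongoDB shell.

```shell
ADMIN_USER="AdminUser"            # placeholder username from the text
ADMIN_PASSWORD="strongPassword"   # placeholder password from the text

# Build the createUser call; the role is an assumption, not taken from the post.
create_admin_js() {
  printf 'db.createUser({user: "%s", pwd: "%s", roles: [{role: "userAdminAnyDatabase", db: "admin"}]})\n' "$ADMIN_USER" "$ADMIN_PASSWORD"
}

create_admin_js
```

Inside the shell, switch to the admin database with use admin before pasting the call. Alternatively, run it non-interactively with mongo admin --eval "$(create_admin_js)".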
When the user is created, the shell prints a confirmation that begins with “Successfully added user” followed by the new user’s name and roles.
Type exit and hit ENTER or use CTRL+C to leave the MongoDB shell.
At this point, users are still able to connect to the MongoDB instance without authentication. To enforce authentication, you need to enable authorization in MongoDB’s configuration file. The mongodb-org packages install it as /etc/mongod.conf.
Open mongod.conf with Vim, or any editor of your choice, like this:
$ sudo vim /etc/mongod.conf
Under the security section of the file, set the authorization option to enabled.
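The snippet itself is missing from this copy of the post. Based on MongoDB’s YAML configuration format, the relevant section would look like the output of the sketch below. The helper name is hypothetical, and it prints the fragment rather than editing the file for you.

```shell
# Print the YAML fragment that turns on access control in the MongoDB config file.
# security_section is a hypothetical helper name used only for this example.
security_section() {
  cat <<'EOF'
security:
  authorization: enabled
EOF
}

security_section
```

The file is YAML, so the two-space indentation matters. If a commented-out #security: line already exists in the file, uncomment it instead of adding a second security key.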
There are more options possible. To see those, check out MongoDB’s official documentation on security configuration.
Now, you can save the changes and restart MongoDB with this command:
$ sudo systemctl restart mongod
To confirm that unauthorized connections cannot manage databases, open the shell again without credentials by running the mongo command.
In the shell, run the command show users, as shown here:
> show users
You should see a message that says the command usersInfo requires authentication, like this:
2019-08-10T09:41:20.016+0000 E QUERY [js] Error: command usersInfo requires authentication :
Okay. But it’s not the end of the road yet. We’ll need to change the bind IP next.
Changing the Bind IP
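Bind IP is a configuration option in MongoDB that controls which of the server’s own network interfaces it listens on. MongoDB accepts connections only on the addresses listed in bindIp. By default, it’s set to 127.0.0.1, so only clients on the same machine can connect.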
To let other servers—for example, the application server in the VPC—connect to the MongoDB instance, you need to change the bind IP address in the MongoDB configuration file. Let’s discuss how to do that now.
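Open the file /etc/mongod.conf with Vim:
$ sudo vim /etc/mongod.conf
In the net section, add the MongoDB instance’s internal IP address to bindIp, keeping 127.0.0.1 so that local connections still work. Which clients may reach that address is controlled by the firewall rule you created earlier.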
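The edited section is not reproduced in this copy of the post, so the following is a sketch of what the net section could look like. Note that bindIp takes addresses that belong to the MongoDB host itself. The internal IP 10.128.0.2 is a hypothetical value; use the internal address of your own VM.

```shell
INTERNAL_IP="10.128.0.2"   # hypothetical internal IP of the MongoDB VM

# Print a net section that listens on localhost and on the VM's internal address.
net_section() {
  cat <<EOF
net:
  port: 27017
  bindIp: 127.0.0.1,$INTERNAL_IP
EOF
}

net_section
```

Listing the internal address rather than 0.0.0.0 keeps MongoDB off any other interface the VM might have.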
Save the changes and restart MongoDB with the command sudo systemctl restart mongod.
At this point, you should test that you can actually connect to MongoDB from the remote hosts you allowed above.
There are a few ways you can do this. In the next section, I’ll walk you through one of them.
Connecting to a Remote MongoDB Instance
We’ll now connect to a remote MongoDB instance.
You can do this using MongoDB clients and drivers. MongoDB has libraries/drivers for major programming languages. You can use these libraries in your code to connect to remote instances. However, for this example, I’ll use a MongoDB client on an Ubuntu machine. You can install it with this command:
$ sudo apt install mongodb-clients
To connect from this machine (application server) to the remote MongoDB server (database server) you set up earlier, run this command:
$ mongo -u "YourAdminUserName" -p --authenticationDatabase admin --host 188.8.131.52
Replace YourAdminUserName with your actual username and 188.8.131.52 with your actual MongoDB server IP address. Then, enter your password when prompted.
So far, the MongoDB instance is running. But that doesn’t mean it’s going to keep running forever. How do you make sure the instance is running, healthy, and always available?
Evolving the Architecture
Now, I’ll show you how to keep your instance running and healthy.
At the moment, the architecture consists of the application server talking to a single MongoDB instance.
This architecture is not the best. If the hardware fails and the database goes offline, your application will fail. Having a single MongoDB instance introduces a single point of failure.
To make this architecture more resilient, you should configure a replica set. A replica set is a set of MongoDB instances that hold the same data. It contains a primary node and secondary nodes.
A replica set’s primary node is where write operations happen. The secondary nodes replicate data changes from the primary node.
A replica set in MongoDB is self-healing. The failover and recovery process is automated. When the primary node fails, a new primary node is elected using a protocol based on the Raft consensus algorithm. This means your database continues to be available, even in times of failure.
If you introduce replication, the new architecture becomes one primary node with secondary nodes replicating from it.
You can improve on this architecture by distributing your replicas across multiple GCP regions such that when a region suffers an outage, you’re not affected.
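To make the replica set idea concrete, here is a hedged sketch of how a three-member set might be initiated. The set name rs0 and the host names mongo-1, mongo-2, and mongo-3 are hypothetical. Each member is assumed to already run mongod with replication.replSetName set to the same name in its configuration file. The function only prints the rs.initiate call for you to run in the MongoDB shell on one member.

```shell
REPLICA_SET="rs0"                                        # hypothetical replica set name
MEMBERS="mongo-1:27017 mongo-2:27017 mongo-3:27017"      # hypothetical member hosts

# Build an rs.initiate() call with one numbered entry per member host.
rs_initiate_js() {
  id=0
  members=""
  for host in $MEMBERS; do
    members="${members}${members:+, }{_id: $id, host: \"$host\"}"
    id=$((id + 1))
  done
  printf 'rs.initiate({_id: "%s", members: [%s]})\n' "$REPLICA_SET" "$members"
}

rs_initiate_js
```

An odd number of members is used so that a majority can still elect a new primary when one node is lost. With authorization enabled, the members also need a shared key file (security.keyFile) to authenticate to each other.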
Now, the question is, “How do you gain visibility into your clusters and keep track of performance and failures?” The answer is simple: monitoring.
Monitoring MongoDB Replicas
MongoDB 4.0 and later offer free cloud monitoring for standalone instances and replica sets. This monitoring includes operation execution times, memory usage, CPU usage, and operation counts.
The free monitoring is not only limited in metrics but also has a retention time of 24 hours. For more information on how to enable free monitoring, check MongoDB’s documentation on monitoring.
If you’re running MongoDB in production, the free monitoring option might not cut it for you. You might want to opt into a more advanced, paid monitoring service.
In this post, we’ve learned how to deploy MongoDB on Google Cloud Platform using the self-managed route. We also talked about architecting a highly available MongoDB cluster. We touched briefly on monitoring and paid services available for your use. We’ve covered a lot of ground!
To build on this knowledge, I recommend you check out MongoDB’s security checklist, MongoDB’s documentation on administration, and Google Cloud’s article on designing robust systems. And remember to always choose the right tool for the job!
Monitoring MongoDB Made Easy
Maintaining the health and performance of your IT infrastructure can be one of the biggest headaches you face as an IT professional. With so many moving parts, it can be nearly impossible to keep track of them all and how they interact, which makes it difficult to monitor MongoDB efficiently. One day you may log into your MongoDB environment and find that it’s not returning your queries, or that you’re not getting the results you expected. Now you have to figure out why. Is it a network issue? Have you run out of disk space or other resources? Have your MongoDB files somehow been corrupted? Is there a malicious DDoS attack being carried out against your organization? Without good monitoring, you have to troubleshoot your way to the underlying cause. BindPlane is designed to cut down those hours of troubleshooting. Here we will show you how BindPlane with Stackdriver makes it easier to monitor MongoDB.
Centralized Monitoring with BindPlane
BindPlane is a monitoring solution that brings all of your monitoring needs into a single, centralized location. It lets you monitor everything from the fan speed of the hardware your database is hosted on to each individual user who accesses your database and the documents they query, and it sends all of that to a single destination. BindPlane integrates MongoDB into a monitoring destination of your choice, such as Google Stackdriver, New Relic, Azure, Wavefront, and others.
BindPlane’s UI gives you the ability to see how many metrics are being sent from MongoDB to your destination, the metrics per minute, and helps you easily see if data stops flowing from your source to your destination.
Within BindPlane, you can also set up resource types with different keys, separating your metrics into different categories, allowing you to track different metrics for different resources. In this section of BindPlane, you can enable the different KPI and non-KPI metrics you would like to collect and monitor.
Metrics can be sent to any destination you choose, but today we are going to focus more on using BindPlane’s integration with Google Stackdriver to monitor MongoDB.
Monitor MongoDB with Detailed Dashboards and Metrics
When you monitor MongoDB with Stackdriver through BindPlane, you can send detailed metrics to Google Stackdriver and build custom dashboards that visualize your data in real time. That lets you closely monitor important key performance indicators (KPIs) and compare and analyze the data to gain deeper insight into your system’s performance. You can create dashboards for MongoDB metrics such as disk space, CPU usage, connection count, available connection count, number of queries, and pretty much any other part of your environment that you want to monitor. Keeping track of these metrics helps you stay proactive about keeping your environment running efficiently. Instead of sifting through all of the symptoms of a problem, you get to its source in a fraction of the time, saving time, money, and resources.
Google Stackdriver can also create alerts for your environment. These alerts fire when MongoDB exceeds any thresholds you have set, such as storage, temperature, or downtime, furthering your ability to be proactive rather than reactive when dealing with problems.
Monitor MongoDB with BindPlane Logs
A feature exclusive to BindPlane for Stackdriver is the ability to integrate your MongoDB source into Google Stackdriver Logging, helping you better understand the performance and health of your system through logs, log-based metrics, and alerts. BindPlane logs can track numerous events within the MongoDB environment and let you know when something may need your attention. Google Stackdriver Logging can monitor the MongoDB events that you consider integral to the performance and health of your system. A few events that you may want to track with logs are:
- Queries that users run, with a notification when a query fails to return
- File replication, including replications that fail
- Every index creation
- Successful and failed authentications
- Failures such as a primary replication failure, and the designation of a secondary as the new primary
- Capped collections reaching their limit
Monitor MongoDB with Log-Based Metrics and Alerts
Another great feature you get when you monitor MongoDB with BindPlane and Stackdriver is the ability to create log-based metrics. Just like regular metrics, log-based metrics let you build graphs that visualize the logs being sent from MongoDB to Stackdriver, giving you a clean, easy-to-read dashboard for your log data. Log-based metrics can help you understand patterns in the volume and timing of certain log events.
Alerts can also be configured for each of these log-based metrics, letting you know when a threshold has been hit, when downtime occurs, or when a specific incident you are watching for happens.
Get Started Today
BindPlane is capable of monitoring more than 150 sources and can provide valuable insights into your IT environment’s daily activities, helping you find the root of your problems and stay proactive to avoid issues in the future. If you are a Google Stackdriver user, you can activate BindPlane at no extra cost. To start your free trial today, visit our website for more information on how to get started.
Relational databases have commonly been an essential component of every application. Cloud vendors like GCP have a service for relational databases. GCP’s relational database offering is GCP Cloud SQL, which is a managed service for relational databases similar to RDS from AWS.
Traditionally, you had to manage and operate a lot of things when working with databases on-premises. This includes things like installing and configuring the database engine, taking backups periodically, monitoring servers, patching servers, upgrading servers, etc. With a managed service like Cloud SQL or RDS, almost everything is the responsibility of the cloud vendor.
Maybe Cloud SQL is new to you. But if you already have experience with AWS RDS, this post is for you. Today’s post is not necessarily about comparing Cloud SQL features with RDS ones as in a comparison table. You’ll learn about Cloud SQL through the lens of RDS. I’m going to use the console wizard to create a database to learn about GCP’s offering.
Let’s get started!
To begin with, let me talk about the basics of Cloud SQL and how that translates into RDS terms.
As of today, Cloud SQL only supports the following engines: MySQL, PostgreSQL, and SQL Server (in alpha). RDS supports the same engines and more. Still, Cloud SQL features are as good as the ones from RDS. Another big difference is that in AWS you have a lot of machine types to choose from. In GCP, you have fewer options: shared-core machines for development purposes, plus standard and high-memory machine types. The bigger the VM is, the better network performance you’ll get.
Here’s the first screen when creating a database in GCP:
Notice that you’re able to create a database without a password, allowing anyone to connect with the root user. You have the option to set a password later. In this first section, you get to choose the region and zone for the server. In AWS, you don’t choose the region because it’s taken from the one the user is navigating in the console.
When you click on “Show configuration options,” you’ll see a section where you can add database flags. A database flag is similar to a parameter group in RDS. You can change the default values of the database server—for example, the default time zone.
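For readers who want to try this outside the console wizard, the sketch below shows how an instance with a database flag might be created with the gcloud CLI. The instance name, region, tier, and engine version are assumptions, and default_time_zone is used only because the paragraph above mentions the default time zone. The function prints the command for review rather than running it.

```shell
SQL_INSTANCE="demo-mysql"   # hypothetical instance name
REGION="us-central1"        # hypothetical region

# Print a create command that sets one database flag, similar in spirit to an RDS parameter group entry.
sql_create_cmd() {
  printf '%s\n' "gcloud sql instances create $SQL_INSTANCE --database-version=MYSQL_5_7 --tier=db-n1-standard-1 --region=$REGION --database-flags=default_time_zone=+00:00"
}

sql_create_cmd
```

Unlike an RDS parameter group, the flags here are attached directly to the instance instead of living in a separate reusable object.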
Networking and Connectivity
As you might already know, networking in GCP is quite different from AWS. In RDS, you can create subnet groups that define whether the database servers have public or private access, although you still have the option to specify whether the database itself is public or private. In GCP, you can also configure public and private access. When you set private access, you have to select the network from which GCP will assign a static IP address. You don’t have that type of control in RDS.
Another big difference is securing access to the database. In RDS, you can choose which security groups will be attached to the database. A security group is a mechanism to restrict or allow access at the networking level. In GCP, you can only restrict public access by listing the static IP addresses that are allowed to connect. For private access, the configuration is limited to one VPC at a time. If you connect from a VM in GCP, it has to be in the same region.
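As an illustration of restricting public access by IP address, here is a hedged sketch using the gcloud CLI. The instance name and the address 203.0.113.10/32 (a documentation-only range) are placeholders, and the function prints the command instead of executing it.

```shell
SQL_INSTANCE="demo-mysql"          # hypothetical instance name
ALLOWED_CIDR="203.0.113.10/32"     # placeholder address of a client allowed to connect

# Print a patch command that sets the instance's authorized networks.
authorized_networks_cmd() {
  printf '%s\n' "gcloud sql instances patch $SQL_INSTANCE --authorized-networks=$ALLOWED_CIDR"
}

authorized_networks_cmd
```

Keep in mind that this flag sets the whole list of authorized networks, so include every address that should keep its access, separated by commas.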
Private connectivity has its limitations, so to speak. For example, you won’t have access from on-premises or another VPC even if there’s VPC peering. You don’t have too much control to change any access restriction at this level, but you also have the SQL Proxy option. By using a SQL Proxy, you’ll be able to connect from on-premises or other networks. You can go more in-depth with this topic by reading the official docs on how to set up connections, among other things.
Networking and security are quite a bit more complicated in Cloud SQL than in RDS—security groups make this easier.
High availability (HA) is another important topic for Cloud SQL and RDS.
In RDS, you can configure Multi-AZ to increase HA in the same region. You can also set read replicas in a different region either for performance needs or to increase HA as well. When the primary server goes down, AWS manages the switch to a secondary server automatically at the DNS level. Your applications don’t have to change the connection string.
In GCP, you also have the option to create a read replica in a different zone, but not in a different region. If you want replication to a different region, you have to set up an external replica. In both clouds, you pay for the replica the same as you pay for the primary server.
A significant difference in Cloud SQL is how GCP manages the failover to a replica. When a failover happens, the switch is done by re-assigning the IP address to the replica. You don’t have to wait for any DNS cache, which is a better solution from the networking standpoint.
Backup and Storage Management
Both cloud vendors have the support for backups. You schedule a backup window, and the cloud vendor takes a differential backup every day. You can also make manual backups when you see the need. But the most significant difference is that in RDS you can retain automated backups after you delete an RDS instance, whereas in GCP, when you remove the database server, all automated backups are removed as well.
When you need to restore a backup, there’s also a difference. In RDS, you need to create a new instance. You have to take care of the switch, if you replace the server, or copy any missing data. Disruption is minimal, and you can control it. In GCP, you have to delete all of the instance’s replicas before restoring a backup, and then recreate them. But if you prefer, you can create a new server, or even restore a backup to another project.
When configuring storage, you’ll see in the Cloud SQL console that the more storage capacity you add, the better the disk throughput you’ll get, the same as in AWS with RDS. You’ll also notice that in Cloud SQL you can configure storage to increase automatically when the server runs out of space. RDS has the same feature under the name storage autoscaling.
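The backup and storage settings discussed in this section can also be sketched with the gcloud CLI. The instance name and the 03:00 backup window are assumptions, and the function prints the commands for review rather than running them.

```shell
SQL_INSTANCE="demo-mysql"   # hypothetical instance name
BACKUP_START="03:00"        # hypothetical start of the daily backup window (UTC, HH:MM)

# Print two commands: one schedules automated backups and turns on automatic
# storage increases, the other takes a manual (on-demand) backup.
backup_and_storage_cmds() {
  printf '%s\n' "gcloud sql instances patch $SQL_INSTANCE --backup-start-time=$BACKUP_START --storage-auto-increase"
  printf '%s\n' "gcloud sql backups create --instance=$SQL_INSTANCE"
}

backup_and_storage_cmds
```

Because automated backups disappear with the instance, one cautious habit is to export important data elsewhere before deleting a Cloud SQL server.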
Last, but not least, pricing.
There’s no significant difference between how each vendor charges you for the service. And I’m not going to go deep and compare prices because many variables can affect your monthly bill. For example, in both clouds the price for a MySQL database is different from the price for a PostgreSQL database (GCP could charge you per core). You also have to consider the transfer of data, storage, backups, etc. In both clouds, you’re charged for any replica server with the same pricing structure as the primary server.
An interesting aspect of GCP is that you have to pay for the instance IP address per idle hour. It’s not expensive, but still a cost you’d need to consider when creating a replica, for example.
Managed Relational Database Services
Much of the knowledge and experience you have with AWS RDS can still be used with Cloud SQL. The managed relational database offerings from both vendors are very similar. Still, there are small differences, like how each vendor handles HA or whether automated database backups can persist after deleting an instance. Many of the differences I included in this post will help you manage and operate your database workloads in a better way. Security is an important example: you won’t be able to access a private database instance in GCP in a conventional way from outside the VPC.
I’d recommend you give it a try by creating a Cloud SQL instance. It’s the best way to understand how different, or similar, these two services are for your needs.
This post was written by Christian Meléndez. Christian is a technologist who started as a software developer and has more recently become a cloud architect focused on implementing continuous delivery pipelines with applications in several flavors, including .NET, Node.js, and Java, often using Docker containers.
Google Cloud Summit Chicago 2019 has come and gone, and with it Google has shown how Google Cloud has helped businesses across the nation and how Google Kubernetes Engine (GKE) is instrumental to the future of computing. It was great to see and talk to all of the different attendees at the Google Cloud Summit 2019, whether they were GCP veterans familiar with BindPlane or had never even heard of Google Stackdriver and were there to learn about all of the great things that we can do. We did a lot of learning as well, by speaking with other Google partners and getting to know the broader landscape of the Google Cloud community. But we learned the most about what Google is up to from the keynote and the breakouts put on by some very talented individuals.
GKE at the Google Cloud Summit
During the Google Cloud Summit 2019 keynote, the GCP team was very excited to talk about Kubernetes, the future of GKE, and how their new service, Google Cloud Run, fits into it all. For those who are not too familiar with Kubernetes, a high-level overview is that it lets you deploy containerized applications onto a cluster of “worker nodes” managed by a “control plane.” This is extremely helpful for the rapid development and deployment of many different applications. GKE will automatically provision and manage your resources, including “compute, memory and storage resources” [1], freeing up time for you to focus on more pressing matters. It can roll out updates in increments to ensure that there are no issues. So, if there is a problem with an update, only a small portion of the user base will encounter it, and the update can then be rolled back so changes can be made. Kubernetes aims to make all of this seamless, including availability and scaling, which feeds into the next point of why Google Cloud Run with GKE is so great. Visit our blog for more information on Kubernetes and how to monitor it.
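The incremental rollout and rollback described above can be sketched with kubectl. The deployment name web, its container name, and the image path are hypothetical, and the function only prints the commands so nothing touches a cluster until you choose to run them.

```shell
DEPLOYMENT="web"                          # hypothetical deployment and container name
NEW_IMAGE="gcr.io/my-project/web:v2"      # hypothetical image for the new version

# Print the commands for a rolling update, a progress check, and a rollback.
rollout_cmds() {
  printf '%s\n' "kubectl set image deployment/$DEPLOYMENT $DEPLOYMENT=$NEW_IMAGE"
  printf '%s\n' "kubectl rollout status deployment/$DEPLOYMENT"
  printf '%s\n' "kubectl rollout undo deployment/$DEPLOYMENT"
}

rollout_cmds
```

The first command starts a rolling update, the second waits for it to finish, and the third returns the deployment to its previous revision if the new version misbehaves.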
Cloud Run at the Google Cloud Summit
Now Google Cloud Run brings GKE to the next level through the use of serverless containers. Serverless containers rid you of the headaches of managing infrastructure, further alleviating pain points that would otherwise distract you from perfecting your applications. You can run your containers either on Google Cloud Run by itself or, if you are a GKE fan, on a GKE cluster that you manage through Cloud Run on GKE. Since Cloud Run is built on the open source Knative API, it allows for consistent management across GKE clusters and helps you move your containers between the platforms and environments that support Knative. Being built on Knative also gives you the flexibility to use any language you prefer, making it much easier to develop Kubernetes applications.
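As a rough illustration of what deploying a serverless container looks like, here is a sketch using the gcloud CLI. The service name, image path, and region are hypothetical, and the function prints the command instead of deploying anything.

```shell
SERVICE="hello"                         # hypothetical service name
IMAGE="gcr.io/my-project/hello"         # hypothetical container image
REGION="us-central1"                    # hypothetical region

# Print a command that deploys the container to fully managed Cloud Run.
cloud_run_deploy_cmd() {
  printf '%s\n' "gcloud run deploy $SERVICE --image=$IMAGE --platform=managed --region=$REGION --allow-unauthenticated"
}

cloud_run_deploy_cmd
```

The --allow-unauthenticated flag makes the service publicly reachable, so drop it for internal services.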
Google Anthos – Hybrid/Multi-Cloud
You may have heard of Google Anthos (Google’s new cloud/on-prem hybridization model) by now. If you haven’t, and you like to work partially on-prem and partially in the cloud, or want to migrate from one to the other, then you’re in for a treat. Rob Enslin, the president of Google Cloud global customer operations, describes Anthos as allowing for “Policy driven decisions that can be made without changing backend code”. Anthos allows for consistency between on-prem and Google Cloud environments without any custom code, letting you easily work on, store, and manage a project on-prem and then transition it to the cloud, and vice versa. Anthos also works with GKE, as you can migrate your workloads from your on-prem infrastructure directly into containers on GCP. For more information, read our blog post on using Anthos to migrate your architecture from on-prem to the cloud.
Bringing it together with SRE monitoring
All of these great things Google is doing in the name of bringing computing into the future can be grouped under the umbrella of Site Reliability Engineering (SRE). The term SRE is pretty self-explanatory, as it is essentially the process of ensuring your site is as reliable as possible. A site reliability engineer takes the application development process usually used by DevOps and applies it to web applications, ensuring websites are available, scalable, efficient, and able to handle change management. SRE is often compared to DevOps, and for a good reason. SREs rely heavily on automation to keep up with all of the requirements for keeping their sites reliable and running. Anthos, Cloud Run, and GKE all lend a hand in making this easier, allowing SREs to automate many of the core tenets of SRE without spending too much of their time creating custom code and processes to get the job done. Another responsibility of SRE is monitoring your site to ensure everything is working as it should, and this is where BindPlane fits in.
Where BindPlane Fits In
BindPlane’s Metrics and Logs features are invaluable tools for monitoring the performance and health of your physical and cloud-based infrastructure and sites. BindPlane can help you integrate systems that Stackdriver does not support out of the box, providing logs and metrics that give you valuable insights in support of SRE. BindPlane can monitor GKE to make sure it is working as intended, keeping an eye on containers and alerting you when certain parameters are not met. If you are using GKE to help manage the scalability and the rollout of updates to your site, using BindPlane for Stackdriver logging and monitoring will be a huge benefit.
BindPlane for Stackdriver will also allow you to create metrics graphs for KPIs and log-based metrics to help you visualize how your architecture is performing. You could potentially monitor your Anthos migration from on-prem to the Cloud, giving better visibility on how data and applications are being transferred and shared between the two. You may also monitor the health of your physical and cloud servers, and alerts can also be set to notify you if any issues are occurring during transfers.
Special Thanks to the Blue Medora Team
I just wanted to add a big thanks to the Blue Medora BindPlane Product team for bringing their A game to the booth at the Google Cloud Summit 2019. It was great getting to see them all in their element as people were lined up to learn about BindPlane and Stackdriver. None of this would be possible without their talent, and I was glad to be able to tag along for the event!
RabbitMQ is a great tool to have for resilient message passing in your system instead of HTTP calls that can get lost in the void. It also sets you up nicely to drive your systems with events for looser coupling among workflow steps. In this post, we’re going to look at how we can quickly get RabbitMQ up and running when using Google Cloud Platform, also known as GCP.
First, we’ll look at what options we have to install RabbitMQ. Then we’ll look at how to run and connect to it. After that, we’ll talk about how to monitor our queues for efficiency with Stackdriver. Since I’m new to GCP, we’ll be going through this with the eyes of a GCP beginner.
First, we want to quickly run through how to install RabbitMQ onto GCP. There are a few options to do this: as a Docker container, as a series of VMs, or as a Kubernetes cluster on Google Kubernetes Engine. We’ll be taking the third option: Google Kubernetes Engine, also known as GKE.
Installing RabbitMQ on GKE
We’re choosing to install RabbitMQ onto GKE because GKE will handle most of the low-level details. This will allow us to maintain the best level of flexibility for most business applications. However, we do need to trade off low-level knowledge with Kubernetes knowledge. If you’re new to Kubernetes, like I am, this little book can get you started.
With knowledge of Kubernetes in tow, we’ll be taking the path of many great IT administrators: installing via the command line. If you’re already using GCP, you may know that you can access Cloud Shell from your browser within the GCP web UI. This allows us to do manual installs without needing to set everything up on our local workstation. However, for business systems, I do recommend scripting the installation out, version controlling it, and automating it as part of environment provisioning. While we’ll be installing via the command line in Google’s Cloud Shell, I’ll also show how the instructions look from the web UI.
For the vast majority of our installation, we’ll use Google’s RabbitMQ Cluster User Guide. This gives you guidance whether you prefer to install via the UI or command line. Follow the installation instructions up to getting the cluster status. The status should tell us that everything is up and running. Ensure that you enable Stackdriver Exporting. This is key for later when we talk about how to run RabbitMQ efficiently. Your status should look something like this:
There is one caveat to the process: I did run into an issue with the stateful set the first time I installed RabbitMQ via the UI. It didn’t properly provision a persistent volume, aka disk storage space, for the pod. Installing via the command line did take longer, but I didn’t run into any snags.
Here’s what it looks like when installing via the UI:
Connecting Your Apps to RabbitMQ
Once the status is good, we can figure out how to connect our applications to RabbitMQ. This depends on where your app is located.
URL and Credentials
The first step to connecting our applications is to get appropriate credentials. GKE stores these in a configuration secret called $INSTANCE_NAME-rabbitmq-secret. Follow the instructions for authentication and copy the password somewhere secure. The username is what we already specified as part of the installation process.
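Kubernetes stores secret values base64-encoded, so the password needs decoding once you have the secret in hand. Here is a minimal sketch of that step in Python. It assumes you have the secret as JSON (for example, from kubectl with JSON output), and the key name rabbitmq-pass and the sample value are assumptions for illustration, so check the key names in your own secret.

```python
import base64
import json


def decode_secret_data(secret_json: str) -> dict[str, str]:
    """Decode the base64-encoded values in a Kubernetes Secret's `data` map."""
    secret = json.loads(secret_json)
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


# Hypothetical sample shaped like a Secret exported as JSON.
# The key name "rabbitmq-pass" is an assumption, not taken from this post.
sample = json.dumps({
    "kind": "Secret",
    "data": {"rabbitmq-pass": base64.b64encode(b"s3cr3t").decode("ascii")},
})

print(decode_secret_data(sample)["rabbitmq-pass"])  # prints: s3cr3t
```

In a real setup you would not print the password. Hand it straight to your application's secret store or environment instead.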
The next step to connecting our applications is to get the right URL wired up. If your application is also on GKE, you can cohesively manage and monitor everything together. In this case, we’ll use port-forwarding. Follow access option 2 to make RabbitMQ accessible to your application’s pod. You can check access from within the Cloud Shell by hitting http://127.0.0.1:15672 and logging in to the admin site with the credentials.
If your app is hosted elsewhere on GCP or even outside of GCP, we’ll follow the instructions on exposing the service externally. You can log in from your browser at http://[EXTERNAL IP]:15672 to ensure the service is working.
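Once you have a host and credentials, most client libraries accept a single AMQP connection URL. Below is a small sketch of how one could be assembled. The helper name amqp_url, the username, the password, and the IP address are all made up for illustration; 5672 is RabbitMQ's default AMQP port (15672 is only the management UI), and the default virtual host "/" has to be percent-encoded as %2F.

```python
from urllib.parse import quote


def amqp_url(user: str, password: str, host: str,
             port: int = 5672, vhost: str = "/") -> str:
    """Build an AMQP URL, percent-encoding credentials and the virtual host."""
    # safe="" makes quote() encode "/" too, so the default vhost becomes %2F.
    return (
        f"amqp://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(vhost, safe='')}"
    )


# Hypothetical values for illustration only.
print(amqp_url("rabbit", "p@ss/word", "203.0.113.10"))
# prints: amqp://rabbit:p%40ss%2Fword@203.0.113.10:5672/%2F
```

Encoding the password matters more than it looks: generated passwords often contain characters like "@" or "/" that would otherwise break URL parsing.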
Building Queues for RabbitMQ
Now that we’re up and running, there are many different ways to add and maintain RabbitMQ exchanges and queues. The best ways are highly sensitive to the type of application you have and frameworks you use. Most languages have an admin SDK you can use to configure RabbitMQ. You can also use the HTTP API no matter your language or framework.
Whatever you use to configure, I recommend you version control the changes and automate them. This is often done as part of your continuous delivery pipeline.
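As one possible shape for such a script, here is a sketch that builds a queue declaration for the management HTTP API using only Python's standard library. The management API declares a queue with a PUT to /api/queues/{vhost}/{name}. The function name, queue name, and credentials below are hypothetical, and the sketch only constructs the request so you can inspect it before sending anything.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def declare_queue_request(base_url: str, user: str, password: str,
                          queue: str, vhost: str = "/",
                          durable: bool = True) -> urllib.request.Request:
    """Build (but do not send) a PUT request that declares a queue."""
    url = f"{base_url}/api/queues/{quote(vhost, safe='')}/{quote(queue, safe='')}"
    body = json.dumps(
        {"durable": durable, "auto_delete": False, "arguments": {}}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical queue name and credentials, pointed at the port-forwarded admin site.
request = declare_queue_request("http://127.0.0.1:15672", "rabbit", "s3cr3t", "orders")
print(request.get_method(), request.full_url)
# prints: PUT http://127.0.0.1:15672/api/queues/%2F/orders

# To apply it against a running broker: urllib.request.urlopen(request)
```

Keeping the queue list in a file next to this script gives you the version history and repeatability that a pipeline needs.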
Making It Efficient
We’ve made good progress: RabbitMQ is installed and our application can connect to it. Now we want to ensure we run RabbitMQ efficiently. To us, “efficiently” means that we’re only paying for what we need in order to meet our service-level objectives. There are many aspects of performance we can look at to monitor efficiency, but the main two we care about are message lead times and utilization. We’ll look at both through Stackdriver, since we enabled Stackdriver exporting during installation.
Message Lead Times
Message lead time is like web request latency. If it’s too slow, people may experience a laggy application and look for alternatives. Unfortunately, the metrics that come out of RabbitMQ to Stackdriver don’t directly tell us what these times are. Instead, we can derive them from knowing how many messages are in the queue per time period. We call this queue depth. Ideally, our queue depth should be low for most parts of the day, with occasional spikes during busy periods. We should see these spikes quickly go back down, assuring us that we have low lead time per message.
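One rough way to turn queue depth into a lead time is Little's Law: in a steady state, the average time a message waits equals the average queue depth divided by the rate at which messages are consumed. This is a general queueing result rather than something RabbitMQ or Stackdriver computes for you, and the sample numbers below are invented.

```python
def estimated_lead_time_seconds(avg_queue_depth: float,
                                consume_rate_per_second: float) -> float:
    """Estimate average message lead time using Little's Law (W = L / lambda)."""
    if consume_rate_per_second <= 0:
        # Nothing is being consumed, so messages wait indefinitely.
        return float("inf")
    return avg_queue_depth / consume_rate_per_second


# Hypothetical queue depth samples taken at regular intervals.
depth_samples = [12, 30, 18, 0, 0]
avg_depth = sum(depth_samples) / len(depth_samples)  # 12.0

# Assume consumers acknowledge about 4 messages per second.
print(estimated_lead_time_seconds(avg_depth, 4.0))  # prints: 3.0
```

The estimate is only as good as the steady-state assumption, so treat it as a trend indicator and not as a per-message measurement.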
Here is what queue depth can look like in Stackdriver:
In this chart, I break down depth by queue. I also track total messages, ready messages, and unacknowledged messages. This lets me detect errors in consumers. For example, a lot of unacknowledged messages could mean a bad consumer. In this chart, I only have one published message, so it’s not that exciting.
Using this data, I can make a few changes to my queuing to make it efficient. The main change is often to add more consumers to a specific queue where the depth builds up too consistently. But by monitoring real usage, I only add more consumers to the queues that need it and when they need it.
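To make "builds up too consistently" a little more concrete, here is one possible rule of thumb as a sketch: flag a queue when its depth sits above a threshold in most of the recent samples, so a single spike does not trigger a change. The threshold, the 80% fraction, and the queue names are assumptions chosen for the example.

```python
def queues_needing_consumers(depth_by_queue: dict[str, list[int]],
                             threshold: int,
                             min_fraction: float = 0.8) -> list[str]:
    """Return queues whose depth exceeds `threshold` in most samples."""
    flagged = []
    for name, samples in depth_by_queue.items():
        if not samples:
            continue  # no data, so no decision
        over = sum(1 for depth in samples if depth > threshold)
        if over / len(samples) >= min_fraction:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical depth samples per queue.
samples = {
    "orders": [120, 150, 140, 160, 155],  # consistently deep
    "emails": [0, 300, 5, 0, 2],          # one spike that drains quickly
}
print(queues_needing_consumers(samples, threshold=100))  # prints: ['orders']
```

Here "emails" is left alone even though it peaked higher than "orders", because its spike drained right away.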
Utilization
The next aspect of efficiency is utilization. This is how much of a resource, mainly CPU and memory, RabbitMQ uses up. Since RabbitMQ is stateful, meaning it persists the state of its message queues, we want to vertically scale pod CPU and memory capacity up or down. Kubernetes can actually autoscale these using its vertical pod autoscaler. While Kubernetes can simplify our scaling process, we still want to keep an eye on our utilization. We can do this directly on the pods through Stackdriver:
Ideally, both our CPU and memory usage are below 80%; otherwise, we can expect exponential slowdowns. If these charts go above the 80% threshold, we may need to tweak our autoscaling settings. Besides this, most scaling in RabbitMQ will be done at the consumer/queue level.
If these pod-level metrics are too coarse to pinpoint a problem, we can add more of our exported RabbitMQ metrics that show utilization by exchange or queue.
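The 80% guideline is easy to check by hand or in a script. Here is a minimal sketch that compares usage against each pod's limit; the pod name and the numbers are invented, and in practice the values would come from your monitoring export.

```python
def over_threshold(usage: float, limit: float, threshold: float = 0.8) -> bool:
    """Return True when usage exceeds the given fraction of the limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return usage / limit > threshold


# Hypothetical readings as (usage, limit): CPU in cores, memory in bytes.
pods = {
    "rabbitmq-0": {"cpu": (0.45, 1.0), "memory": (1.8e9, 2.0e9)},
}

hot = [
    (pod, resource)
    for pod, readings in pods.items()
    for resource, (usage, limit) in readings.items()
    if over_threshold(usage, limit)
]
print(hot)  # prints: [('rabbitmq-0', 'memory')]
```

A result like this would be a prompt to revisit the pod's memory settings before looking at consumers or queues.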
Efficient Messaging Out of the Box
Even though I’m new to both GCP and Kubernetes, I was able to understand and deploy a working RabbitMQ service in under two hours. I never cease to be amazed at how far we’ve come in the provisioning of new infrastructure. In my early career, something like this would have taken days or weeks just to procure the software. Then it would take hours or days to install and start up the service. Finally, it would take months of painful operational errors for us to get the scale and settings “just right.” Although GKE has a learning curve, the flexibility and power we have to keep our RabbitMQ healthy are amazing.
This post was written by Mark Henke. Mark has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function. Every developer is a leader of something on their team, and he wants to help them see that.