An essential aspect of every migration to the cloud is storage. In some instances, migrating storage to the cloud is as simple as copying files from one place to another. But problems can arise when the amount of data you need to move is too big, the data is sensitive, the transfer needs to be secure, or you need to maintain business continuity while migrating. In many enterprises, all these use cases apply. But there are solutions to the problems enterprises face when migrating to Google Cloud Platform.
In today’s post, I’ll share the list of services that will help you to migrate your on-premises storage layer to GCP.
In GCP, there are many services that you can use for storage, depending on the storage type your applications need, and GCP has a service for each common need. For unstructured data like log files, database backups, images, or any other file, you have Cloud Storage. There are also fully managed services for MySQL and PostgreSQL with Cloud SQL, where Google takes care of patching, high availability, and read replicas. And if you want a database fully managed by Google with extra features like horizontal scaling, strong consistency, and global availability, you have Cloud Spanner. In case you can't decide which storage service to use—I'm with you—there's a decision tree in the official docs, including detailed information on when to choose each storage service.
So let me introduce you to a few services that you might use when migrating to GCP.
The first option you have to store data is Cloud Storage. You can upload data via the drag-and-drop tool in the console, with the command-line tool, or through the JSON API. Consider this approach when the data you need to upload isn't too big. You can learn more about the limits in Cloud Storage—for example, individual files (or objects) can't be larger than 5 TB. Also, try to avoid uploading too many objects at the same time, because GCP will throttle the requests you send to Cloud Storage. I'd advise you to take a look at Google's best practices and recommendations when working with Cloud Storage.
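As a minimal sketch of the simplest path, here's what an upload might look like with the command-line tool (the bucket name `my-migration-bucket` and file paths are placeholders—bucket names are globally unique, so you'd pick your own):

```shell
# Create a bucket to hold the migrated data (placeholder name)
gsutil mb gs://my-migration-bucket

# Upload a single file, e.g. a database backup
gsutil cp ./backups/db-backup.sql gs://my-migration-bucket/backups/

# List the objects to verify the upload, with sizes
gsutil ls -l gs://my-migration-bucket/backups/
```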
Besides being able to use the GCP console to upload data, you can also make use of Google's CLI, called gsutil. Because you'll be running the commands from on-premises, you need to ensure that you have good internet connectivity—or else you'll need more time to upload data. You could also use Direct Peering to access Google's network through its edge points of presence. Or, for a dedicated link, configure a direct connection using Cloud Interconnect through GCP's service providers. Lastly, what's good about gsutil is that it can upload data in a multi-threaded way, which is useful when you need to upload several files at the same time.
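For instance, the multi-threaded behavior is enabled with the `-m` flag. A sketch of how that might look for a directory tree (again, bucket and paths are placeholders):

```shell
# Copy a whole directory tree in parallel with the -m (multi-threaded) flag
gsutil -m cp -r ./logs gs://my-migration-bucket/logs

# rsync-style sync is handy for incremental re-runs during a migration
gsutil -m rsync -r ./data gs://my-migration-bucket/data
```

The `-m` flag parallelizes across files, so it helps most with many small-to-medium files rather than one huge file.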
Another option is the Transfer Appliance service. The Transfer Appliance is a physical storage server that you request from Google—think of it as a very large external drive used to store terabytes of data. This is useful if you need to upload more than 100 TB and up to 480 TB of data and want to avoid network connectivity problems; if you need to upload more data, you order more appliances. You connect the appliance to your data center, upload the data to it, and ship it back to Google. All the data you upload to the appliance is encrypted. Google then uploads the data to Cloud Storage on your behalf.
Once the data is in GCP, you can access it, decrypt it, and use it. You don't need to reserve bandwidth or configure a direct connection, and the data will be migrated faster than it would be over a typical internet connection.
Komprise is a partner solution where you deploy agents in your data center to upload data to GCP. This solution is applicable in many cases. But regarding migration, you might find Komprise useful if you don't urgently need to upload data and the transfer can happen transparently. As Komprise puts it, it's like extending your on-premises network-attached storage (NAS) to GCP.
Once you’ve installed and configured the virtual appliances on-premises, Komprise analyzes the data you have in your data center. After reviewing the insights the tool provides, you can decide how to manage data and plan the capacity you’ll need in GCP. This will help you determine how much it’ll cost you too. Komprise can automatically upload—or replicate—the data in the background to GCP. Or you can also configure it to retrieve only the data your users need. Keep the hot data on-premises and everything else in the cloud, for example.
One important thing to stress, as you can read in their blog post: “Komprise does not store data, and simply moves data through SSL to Cloud Storage, which is HIPAA-compliant.”
To migrate databases from on-premises to GCP, Google has acquired a company called Alooma. Alooma is an enterprise data pipeline tool that can integrate different data sources and transform the data before it's stored in a data warehouse. With this acquisition, you'll be able to migrate data in an ETL fashion. Alooma will enable you to maintain business continuity when migrating your database workloads.
Moreover, Google has created a series of blog posts on using existing tools to migrate data from different database engines to GCP. You may also request assistance with assessing the migration from partners by contacting Google's sales team.
From the list of migration assessment guides, you have the following documents:
As you can see, Google has so far published a set of guides on how to migrate databases with existing tools. But this might change in the future: Google might decide to create a dedicated database migration service, like the one AWS currently has.
I talked about Velostrata in a previous post, where I discussed the migration options to GCP in general. Velostrata is a new offering from GCP that you can install and configure to migrate on-premises VMs to GCP. Velostrata is a good solution because it'll help you migrate workloads by streaming the data. Additionally, once the migration finishes, you can configure Velostrata to replicate the data from the cloud back to on-premises. This feature is useful in case you need to roll back without losing data.
With Velostrata, you can also migrate the VM storage ahead of the VMs themselves. As a consequence, you might be able to create multiple VMs on top of that storage to increase redundancy or performance. Having the storage in GCP before the VMs will help you know precisely how much it'll cost to migrate specific workloads to GCP.
Although this post focused specifically on migrating on-premises data to GCP, bear in mind that there are services that you can use to migrate data from AWS to GCP or from a SaaS solution like Google Ads to BigQuery. In this post, I included a lot of tools from GCP and partners. Which tool and storage service you choose will depend on your needs and how much data you need to migrate.
Hopefully, you've found this post useful, but remember that GCP updates its services very frequently. So I'd advise you to confirm the current options in the official docs for data transfer.
This post was written by Christian Meléndez. Christian is a technologist who started as a software developer and has more recently become a cloud architect focused on implementing continuous delivery pipelines with applications in several flavors, including .NET, Node.js, and Java, often using Docker containers.