How to deploy CockroachDB on IBM Cloud Kubernetes: a hands-on guide
In his 1915 masterpiece “The Metamorphosis”, Franz Kafka described a surreal world in which a salesman named Gregor Samsa wakes one morning to find himself inexplicably transformed into a huge cockroach. Poor Gregor struggles throughout the book to adjust to this horrible new condition.
More than 100 years later, many companies pour their greatest efforts into building innovative and powerful technologies that aim to transform our lives. These technologies can sometimes overwhelm us with their complexity (which, oddly enough, is meant to simplify things). In some cases, even top IT professionals might struggle in a way that feels almost as painful as poor Gregor’s misery…
This is especially true when these complex platforms need to be deployed and run, which can turn out to be a real frustration. Could it be that, ahead of his time, Franz Kafka actually wrote a masterpiece about a poor IT guy?…
This is why step-by-step guides are so important: they keep things simple and clear and save hours of frustration spent trying to figure things out. This is the reason I took the time to write this hands-on guide on how to deploy CockroachDB on IBM Cloud Kubernetes, keeping every step clear while demonstrating the process from start to end.
(On a side note, I wonder if there is a hidden subtext to the fact that CockroachDB chose Kafka to be their first supported platform...).
If Kubernetes is the Yin, CockroachDB is the Yang
It is no secret that cloud-based databases are on the rise, and with them the practice of building individual clusters that serve a specific region or span multiple regions. This can be powerful when you need to design a system that serves, for example, a globally distributed user base, as it brings the benefit of reduced dependencies and latency.
This is why I wanted to first get familiar with K8s. K8s helps orchestrate computing within a region, so you can make sure your app runs constantly while dynamically scaling up or down the volume of services you need at any given moment. The issues with K8s start when it comes to multi-region clusters. In other words, when your clusters are spread across different regions, operating on a global scale might become tricky. This is where CockroachDB comes into the picture.
CockroachDB is a distributed SQL database. It was designed with a dedicated distribution architecture in which the whole cluster acts as a single logical unit: a geo-distributed platform that can support global DB clusters across different clouds, or even outside the cloud. The nodes, which are virtual or physical machines, communicate with each other, and each can serve as an endpoint or as an access point to the DB.
Yep. If Kubernetes is the Yin, CockroachDB seems to be the Yang.
When a C++ guy looks at the clouds
I have been a C++ professional for the past 25+ years, so I admit Docker, containers, cloud computing and containerized DBs are not my ordinary playground. When I was looking for online information on how to deploy and run CockroachDB on IBM Cloud K8s, I could hardly find any good, straightforward guide. I had to begin from the very start.
What I did find is a lot of theoretical information backed up with a lot of very nice slides and presentations, or information which was scattered all over the place. So, I had to roll up my sleeves and figure it out myself, writing this guide in the process for the benefit of the entire community. A lot of hair was pulled in the process, but I can assure you no animals were harmed during the making of this guide.
In this hands-on article, I explain step by step how to set up CockroachDB on IBM Cloud K8s. I then demonstrate how to write a simple app to work with the DB. It’s not as hard as you might expect once you figure it out, and I am sure you will be able to follow this guide and successfully deploy it yourself.
Step 1: Setting up K8s on your IBM Cloud
After opening your IBM Cloud account, you need to set up K8s in a configuration which can support CockroachDB. The first step is to create a cluster with minimum resources:
1. Go to the main menu/Kubernetes/Cluster and select “Create Cluster”.
2. You now have two options: a free cluster (which expires after 30 days) or a ‘Standard’ cluster. If you choose ‘Standard’, watch the price on the right-hand side, as it will change as you scroll down and continue the configuration. Use the latest/recommended version; in this case I used version 1.18.9.
3. Classic infrastructure should be fine, and it is what I chose in this case. I also chose a single zone (North America), which costs less and is enough for this guide.
4. In this case I used a single node with 2 CPUs/4 GB, the smallest option available, which is enough for the sake of this guide. Note that a single node, rather than three nodes, is not considered a safe deployment for production (there is no redundancy), but for the purpose of this guide it will work.
5. Make sure to choose both private and public endpoints so the cluster can be accessed over the internet. You should also choose “Encrypt local disk” for security.
6. Now we can give our cluster a name. In this case I called it ‘cluster-cockroachdb’.
7. Now we can create our cluster. This process can take 10 to 20 minutes. This might be a good time to get yourself some coffee…
In the creation screen you already have some useful scripts for setting up CLI and accessing your cluster. We will get back to them once the setup and download are completed.
What the Helm??
Now that your cluster is in the oven, let’s start working on our Helm. But first, let’s explain what Helm actually is.
To put it simply, Helm is a sort of package manager for K8s. When we want to deploy an application on K8s, we must use certain K8s resources, for example StatefulSets, Services, Ingress, etc. A Helm chart bundles all of these and more, and Helm also helps maintain it all. You can think of Helm as if it was an “editable” DLL. I will show you how Helm works later on in this guide.
The good news is that when you work with IBM Cloud you can skip the part where you need to download Helm manually, as IBM Cloud K8s installs it for you automatically.
Step 2: Set up IBM Cloud binaries
While Helm is being installed, we can start to work on some other components we need. We need to set up our CLI tools. Copy the script from your screen and paste it into PowerShell (Important! Don’t forget to run PowerShell as admin).
Now let’s copy and execute:
Once you run the script, it will take a while until the setup is finished:
While we wait, let’s go to K8s dashboard:
This is the control panel for your K8s activities, and it will serve you for various operations, settings and overall management. Go to “Pods”. A pod is a group of one or more Docker containers. One Docker container will run a single CockroachDB node. We have a single node in this case, as you may remember.
At this point there is nothing yet to see, as the system is still processing. This might be a long wait.
After around 15–20 minutes, setup is ready! Let’s go to PowerShell and check:
Now that the CLI is downloaded, let’s access the cluster from our own terminal. Copy the following script:
You will need to enter your login credentials.
Let’s take a look at the cluster we created. You need to copy this script:
And here is what it looks like:
Step 3: Set up CockroachDB
So far everything was straightforward and not so complicated. Deploying CockroachDB should not be any more complicated than that. The first step is to download the Helm chart from the following repository: https://github.com/cockroachdb/helm-charts and navigate to the ‘dir’.
To open the files I used Visual Studio, but any other suitable IDE will do. Once the chart is open in VS, the file of interest is values.yaml, as we need to change its values to match our requirements. These values determine how CockroachDB will be deployed in our K8s cluster. The file is heavily commented, so you really can’t go wrong here. If you have any doubts, you can find a detailed description in the repo itself.
Now let’s edit the file to customize the deployment. I will point out a few important configuration elements:
1. Set single-node to true. This makes the deployment standalone rather than a cluster.
2. Set statefulset.replicas to 1, as it will only use a single container. Of course, you need to set it to the actual number of containers you are using, but in my case it’s 1.
3. The resources section describes how much memory/CPU will be used.
The limits value is a hard limit: if a container starts consuming more, it will be killed. I did not change anything here, leaving it to use the maximum values.
4. It’s important to point out that ingress should remain “false”, otherwise CockroachDB will be exposed publicly. It’s good practice to expose the DB only to the backend service.
5. Storage is the most important section for the DB deployment.
These settings are used to create a volume on IBM Cloud, so our data will remain persistent. The minimum volume size IBM offers is 20G, and you will not be able to provision less than that. Even if you set your storage to 10G, you will get 20G.
IBM Cloud provides multiple types of storage classes to choose from. My initial plan was to use ibmc-block-bronze, which is the most basic. We can list the storage classes via the command line:
However, at this point only file-type storage is available, which is why I chose to use ibmc-file-bronze instead of ibmc-block-bronze. In case this does not work for you, you can always add the block storage plugin.
These are the basic settings you should know about. With editing done at this stage, we can now use this configuration to deploy CockroachDB.
Step 4: Deploy CockroachDB
K8s uses a concept of namespaces for segmentation, and in order to deploy CockroachDB we can create a separate namespace for that purpose.
First, let’s have a look and check the default namespaces/ns used:
kubectl.exe get ns
The second stage is to create a new ns:
kubectl.exe create ns <name> (we will use the name “cockroach-db”)
Here is our new ns:
Now let’s deploy CockroachDB into the namespace cockroach-db via Helm.
In the directory where values.yaml is located, we run the following command:
helm.exe install <release name> --namespace <namespace> .
Once we deployed CockroachDB successfully, we are able to view and monitor our pods inside the K8s cluster.
Step 5: Add some data to our DB
Since CockroachDB is basically an SQL based DB, we can use SQL commands in order to work with the DB and build it. The following command will create a sample client application from which we can access the DB:
kubectl run -it --rm cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- ./cockroach sql --insecure --host=cockroach-cockroachdb-public.cockroachdb
Now let’s have a peek at our default DB:
Here I created a sample table named “Products”:
Now let’s exit this sample container/sample app so we can recreate the client and verify that the data/table are persistent. Simply use the command “exit”, and the client should be deleted. To check this, go back to your K8s control panel and check your pods. It should be empty:
Let’s redeploy using the same command as before (kubectl run -it --rm cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- ./cockroach sql --insecure --host=cockroach-cockroachdb-public.cockroachdb). And... Ta-Dah! We are back!
Let’s check the table:
And of course we can continue working and building/displaying our DB:
Step 6: Working with CockroachDB and accessing the data programmatically
Executing code with containers/Docker can be very different than doing so for desktop or web apps, as it is based on several moving parts which must work together. Let’s look at the steps we will need to follow:
- Write the application we wish to use.
- Containerize the app with Docker (Dockerfile): “docker build”.
- Push the image to a registry: “docker push”.
- Use the registry URL/address in sample-app.yaml.
- Deploy the app on K8s: “kubectl apply”.
- View the logs with the output in the K8s dashboard.
Let’s break this down and explain how to make this work.
1. The application
The full source code can be downloaded from GitHub.
The application is written in JavaScript and it’s really simple and basic. The actual app code is in “server.js”, which takes HOST, DATABASE, PORT and USER from the environment variables and does the following:
1. Lists the databases.
2. Creates a new table named ‘videos’ in defaultdb (which exists by default).
3. Inserts two entries into the table.
4. Queries the entries and displays them.
5. Drops the table ‘videos’.
We will deploy this app after the next stages.
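The five steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual server.js from the repository: the function name `runDemo`, the columns of `videos` and the sample rows are hypothetical, and the client is assumed to be any object with a promise-returning `query(sql, params)` method (for example a connected client from the node-postgres `pg` package, which works because CockroachDB speaks the PostgreSQL wire protocol).

```javascript
// Hypothetical sketch of the flow described above; not the original server.js.
// `client` is assumed to expose query(sql, params) returning a promise that
// resolves to { rows: [...] }, as the node-postgres ("pg") client does.
async function runDemo(client, log = console.log) {
  // 1. List databases.
  const dbs = await client.query("SHOW DATABASES");
  log("databases:", dbs.rows);

  // 2. Create the table in the current database (defaultdb in this guide).
  //    The column names are assumptions made for illustration.
  await client.query(
    "CREATE TABLE IF NOT EXISTS videos (id INT PRIMARY KEY, title STRING)"
  );

  // 3. Insert two entries, using placeholders instead of string concatenation.
  await client.query(
    "INSERT INTO videos (id, title) VALUES ($1, $2), ($3, $4)",
    [1, "first video", 2, "second video"]
  );

  // 4. Query the entries and display them.
  const result = await client.query("SELECT id, title FROM videos ORDER BY id");
  log("videos:", result.rows);

  // 5. Drop the table so the demo can be re-run from a clean state.
  await client.query("DROP TABLE videos");

  return result.rows;
}
```

With the `pg` package installed, a caller would typically create a `Client`, `await client.connect()`, call `runDemo(client)`, and finish with `await client.end()`. Passing the client in as a parameter also makes the flow easy to exercise with a stub, without a running cluster.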
2. Containerize the app
Containerizing an app isolates it, along with its dependencies, from the underlying infrastructure. In other words, it’s a form of virtualization that allows users to deploy and run distributed apps within containers, sparing the need to launch an entire VM. The containers hold the essential components (i.e. configuration files, binaries, libraries, etc.). This is why, in order to run the app on K8s, we need to “containerize” it.
We first need a Docker image, which is basically a read-only template that contains a set of instructions for creating a container that can run on the Docker platform. A Docker image is made up of a collection of files that bundle together all the essentials, such as installations, application code and dependencies, required to configure a fully operational container environment. For this purpose I used the app’s Dockerfile, which uses the base Docker image node (12-slim). I used a public image in this case, but you can of course write and use your own.
The image below shows the dependencies which are part of the base image:
Now we need to build the image. For this purpose we run the following command in the sample-app dir:
docker build -t <docker_registry_url>/<docker_registry_username>/<image_name>:<image_tag> .
This script will build the image of the app locally.
3. Push to registry
Once our app has been built into a Docker image, the next step is to push it to a container registry for safekeeping, so it will be ready for deployment. We use the following command to do so:
docker push <docker_registry_url>/<docker_registry_username>/<image_name>:<image_tag>
Once the image is pushed we can move to the next step and deploy our app on K8s.
4. Deploy the app on K8s
In order to deploy and execute our app we use the sample-app.yaml file, which creates a container/Pod from the image we just pushed to the registry, with the connection parameters passed as environment variables.
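On the app side, this hand-off might look like the sketch below: a small helper that reads the connection parameters from the environment and fails early when one is missing. The variable names HOST, DATABASE, PORT and USER are the ones mentioned earlier in this guide. The function name `configFromEnv` and the fallback to port 26257 (CockroachDB's default SQL port) are assumptions of this sketch and are not taken from the repository.

```javascript
// Hypothetical helper: builds a connection config from environment variables.
// HOST, DATABASE, PORT and USER are the names used in this guide; the shape of
// the returned object matches what the node-postgres ("pg") Client accepts.
function configFromEnv(env = process.env) {
  const required = ["HOST", "DATABASE", "USER"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }

  // Assumption: fall back to CockroachDB's default SQL port when PORT is unset.
  const port = Number.parseInt(env.PORT || "26257", 10);
  if (Number.isNaN(port)) {
    throw new Error("PORT must be a number, got: " + env.PORT);
  }

  return {
    host: env.HOST,
    database: env.DATABASE,
    user: env.USER,
    port,
  };
}
```

Failing fast on a missing variable tends to make a misconfigured Pod manifest show up as a clear error in the Pod logs rather than as a confusing connection timeout.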
We deploy the Pod manifest via the following command:
kubectl.exe apply -f sample-app.yaml
Our app has been executed!
Now we can go back to the dashboard and view the logs in order to check the output of our program:
Summary
These are all the basics you need to know in order to start working with CockroachDB on IBM Cloud K8s. I hope this guide can serve as a solid foundation which will push you forward in your future experience with these powerful platforms. At the end of the day, deploying CockroachDB on IBM Cloud K8s is not so complicated, as long as you follow the necessary steps. It is also important, in my opinion, to understand not only the “how” but also the “why”, so if there is any term you do not understand, just look it up and learn more about it.
Additional useful resources:
A video I created