
Mastering Scalable Storage: A Step-by-Step Guide to Deploying a Ceph Cluster on AlmaLinux
In today’s data-driven world, the need for a robust, scalable, and resilient storage solution is more critical than ever. Traditional storage systems often hit limitations in performance, cost, and flexibility. This is where Ceph comes in—a powerful, open-source, software-defined storage platform that unifies object, block, and file storage into a single, distributed cluster.
Pairing Ceph with AlmaLinux, a stable and enterprise-grade operating system, creates a formidable foundation for your data infrastructure. This guide will walk you through the essential steps to deploy a modern Ceph storage cluster on AlmaLinux using the cephadm deployment tool, empowering you to build a highly available and scalable storage system.
Understanding the Core Components of Ceph
Before diving into the deployment, it’s important to understand the key daemons that make Ceph work (a quick way to inspect them on a running cluster follows the list):
- Monitors (MONs): These maintain the master copy of the cluster map, which tracks the state of the cluster, its nodes, and data placement. A minimum of three monitors is required for a production-ready, high-availability cluster.
- Managers (MGRs): The manager daemon is responsible for providing additional monitoring and management services, including the Ceph Dashboard and REST API.
- Object Storage Daemons (OSDs): These are the workhorses of the cluster. Each OSD is responsible for storing data on a physical disk, handling data replication, recovery, and rebalancing across the cluster.
- Metadata Servers (MDSs): Required only for the Ceph File System (CephFS), these servers store metadata for files and directories.
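Once the cluster from this guide is up and running (Step 4 onward), you can see these daemon types directly. As a quick sketch:
# Cluster health plus a summary of mon, mgr, and osd counts
sudo ceph -s
# Every containerized daemon (mon, mgr, osd, mds, ...) and the host it runs on
sudo ceph orch ps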
Prerequisites for Your Ceph Cluster
Proper preparation is the key to a smooth deployment. Before you begin, ensure your environment meets the following requirements:
- Multiple AlmaLinux Nodes: A stable cluster requires at least three nodes to ensure quorum and data redundancy. For this guide, we’ll assume a three-node setup.
- Dedicated Storage Devices: Each node should have at least one dedicated storage device (SSD or HDD) that is not used for the operating system. These will become your OSDs.
- Network Configuration: All nodes must be on the same network subnet and able to communicate with each other. It is crucial to configure static IP addresses and ensure proper hostname resolution via DNS or the /etc/hosts file on each node.
- Time Synchronization: All nodes in the cluster must have their time synchronized. Using a Network Time Protocol (NTP) client is mandatory, as time drift can cause severe issues with cluster stability.
- User Account: You will need a user account with passwordless sudo privileges on all nodes for cephadm to perform administrative tasks (a quick verification sketch follows this list).
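Once the Step 1 setup below is in place, a quick way to confirm these prerequisites on each node might look like this (the hostnames are the examples used throughout this guide):
# Name resolution for every cluster node
getent hosts ceph-node1 ceph-node2 ceph-node3
# Time synchronization status (chrony is installed in Step 1)
chronyc tracking
# Passwordless sudo for the deployment user
sudo -n true && echo "passwordless sudo OK"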
Step 1: Initial Node Preparation
On every node that will be part of your Ceph cluster, perform the following initial setup steps.
Update Your System: Ensure all packages are up to date.
sudo dnf update -y
Configure Hostnames and Hosts File: Assign a unique hostname to each node and ensure all nodes can resolve each other by name. Add entries for all nodes in the /etc/hosts file on each server. For example:
192.168.1.101 ceph-node1.yourdomain.com ceph-node1
192.168.1.102 ceph-node2.yourdomain.com ceph-node2
192.168.1.103 ceph-node3.yourdomain.com ceph-node3
Install and Configure NTP: Use chrony to keep your nodes synchronized.
sudo dnf install -y chrony
sudo systemctl enable --now chronyd
Create a Ceph User: cephadm requires a dedicated user with passwordless sudo to manage the cluster.
sudo useradd -m -s /bin/bash cephuser
echo "cephuser ALL=(root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephuser
sudo chmod 0440 /etc/sudoers.d/cephuser
You will also need to set up SSH key-based authentication for this user between all nodes.
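One minimal way to do this, assuming the example node names used in this guide, is to generate a key pair as cephuser on the first node and copy it to the others:
# Run as cephuser on the first (bootstrap) node
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
ssh-copy-id cephuser@ceph-node2
ssh-copy-id cephuser@ceph-node3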
Step 2: Firewall and Security Configuration
Ceph requires several ports to be open for communication between its daemons. Configure the firewall on all nodes to allow this traffic.
A crucial security tip is to only allow traffic from trusted subnets.
# Add your trusted subnet, e.g., 192.168.1.0/24
sudo firewall-cmd --add-source=192.168.1.0/24 --permanent
# Ceph Monitor Ports (3300, 6789)
sudo firewall-cmd --add-service=ceph-mon --permanent
# Ceph OSD, Manager, and MDS Ports
sudo firewall-cmd --add-port=6800-7300/tcp --permanent
# Reload the firewall to apply changes
sudo firewall-cmd --reload
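The Ceph Dashboard used in Step 6 listens on TCP 8443 by default. If you plan to access it from another machine, you may also want to open that port:
sudo firewall-cmd --add-port=8443/tcp --permanent
sudo firewall-cmd --reload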
During installation, it can be helpful to set SELinux to permissive
mode to troubleshoot any access issues. Remember to re-enable enforcing
mode after a successful deployment.
sudo setenforce 0
sudo sed -i 's/SELINUX=enforcing/SELINUX=permissive/g' /etc/selinux/config
Step 3: Installing cephadm
cephadm is the modern tool that simplifies the deployment and management of a Ceph cluster. It uses containers to run Ceph daemons, ensuring consistency and ease of upgrades.
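Because the daemons run in containers, each node needs a container engine before you bootstrap; on AlmaLinux this is typically podman, and OSD hosts also need lvm2. A minimal sketch, run on every node:
# Install the container engine and LVM tooling that cephadm relies on
sudo dnf install -y podman lvm2
podman --version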
Add the Ceph Repository: On your first node (which will act as the bootstrap node), add the official Ceph repository.
sudo dnf install -y https://download.ceph.com/rpm-reef/el9/noarch/ceph-release-1-1.el9.noarch.rpm
Install the Deployment Tool:
sudo dnf install -y cephadm
Step 4: Bootstrapping the First Cluster Node
Now it’s time to bring your cluster to life. The bootstrap process initializes the first monitor and manager daemons.
Run this command only on your first node. Replace <mon-ip>
with the static IP address of this node.
sudo cephadm bootstrap --mon-ip <mon-ip>
This command will take several minutes to complete. It will pull the necessary container images and start the initial daemons. Upon completion, it will provide you with the command to access the Ceph command-line interface and the URL and credentials for the Ceph Dashboard.
Save this output, as you will need it. You can check the status of your newly created cluster at any time with:
sudo ceph -s
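If the ceph command is not yet available on the host itself, two common options are to run commands inside the cephadm shell container or to install the client packages. A brief sketch of both:
# Option 1: run a one-off command inside a container that has the admin keyring
sudo cephadm shell -- ceph -s
# Option 2: install the native CLI packages on the bootstrap node
sudo cephadm add-repo --release reef
sudo cephadm install ceph-common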
Step 5: Expanding the Cluster by Adding Nodes and OSDs
A single-node cluster isn’t very useful. Let’s add the other prepared nodes and their storage devices.
Add New Hosts: First, copy the cluster’s public SSH key to the cephuser on each additional node. Then, from your bootstrap node, add the other hosts to the cluster’s control.
# Run this on the bootstrap node for each new node
sudo ceph orch host add <hostname-of-new-node>
You can verify that the hosts have been added with sudo ceph orch host ls.
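As a concrete sketch using the example hostnames and addresses from Step 1 (adjust for your environment), the key copy and host registration might look like this. Note that cephadm connects as root by default, so pointing the orchestrator at cephuser is an extra, optional step:
# Copy the cluster SSH key generated at bootstrap to the new nodes
sudo ssh-copy-id -f -i /etc/ceph/ceph.pub cephuser@ceph-node2
sudo ssh-copy-id -f -i /etc/ceph/ceph.pub cephuser@ceph-node3
# Optional: have the orchestrator connect as cephuser instead of root
sudo ceph cephadm set-user cephuser
# Register the hosts, passing their IP addresses explicitly
sudo ceph orch host add ceph-node2 192.168.1.102
sudo ceph orch host add ceph-node3 192.168.1.103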
Discover and Add Storage Devices (OSDs): Once the hosts are part of the cluster, cephadm can automatically discover and deploy OSDs on any available, unused disks. This is the simplest method for adding storage.
# This command will turn all available, unused disks on all hosts into OSDs
sudo ceph orch apply osd --all-available-devices
After running this, use sudo ceph osd tree to see your OSDs being created and added to the cluster. The cluster will begin to rebalance data across them, and its health status will eventually return to HEALTH_OK.
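If you would rather see which disks the orchestrator considers available before (or after) applying that spec, you can list them:
# Show the storage devices cephadm has discovered on each host and whether they are usable for OSDs
sudo ceph orch device ls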
Step 6: Accessing the Ceph Dashboard
The Ceph Dashboard is a powerful web-based tool for monitoring and managing your cluster. The URL and initial password were provided during the bootstrap step.
To get the dashboard URL at any time, run:
sudo ceph mgr services
The initial admin password is generated and displayed only during bootstrap; it is stored hashed and cannot be read back afterwards. If you lose it, reset it from a file (the file path and password here are just examples):
echo 'a-strong-new-password' | sudo tee /tmp/dashboard-pass
sudo ceph dashboard ac-user-set-password admin -i /tmp/dashboard-pass
Log in to the dashboard to get a comprehensive, real-time overview of your cluster’s health, performance, and capacity.
Final Thoughts and Next Steps
You have now successfully deployed a foundational Ceph storage cluster on AlmaLinux. This setup provides a powerful platform for scalable object storage, block devices for virtual machines, or a distributed file system.
From here, your next steps could include (a brief command sketch follows the list):
- Creating Storage Pools: Define logical partitions for different types of data.
- Setting up Rados Block Devices (RBD): Provide reliable block storage for VMs and applications.
- Deploying CephFS: Create a POSIX-compliant file system that can be mounted by multiple clients.
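As a brief sketch of what those next steps can look like (the pool, image, and file system names here are just examples):
# Create a replicated pool and initialize it for RBD use
sudo ceph osd pool create rbd-pool
sudo rbd pool init rbd-pool
# Create a 10 GiB block device image in that pool
sudo rbd create rbd-pool/vm-disk1 --size 10G
# Create a CephFS file system (the orchestrator deploys the required MDS daemons)
sudo ceph fs volume create cephfs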
Always remember to monitor your cluster’s health regularly using ceph -s
and the dashboard. By following these steps and best practices, you have built a storage solution that can grow and adapt with your needs.
Source: https://kifarunix.com/how-to-deploy-ceph-storage-cluster-on-almalinux/