Glusterfs 3.2 Updates

At our office, we’ve been using Glusterfs in an unconventional way. Instead of merely using it in a distributed or replicated cluster, we’re using it as central storage for our user home directories.

As part of our office-wide upgrade to Ubuntu 12.04 LTS, we had to upgrade Glusterfs from 3.0 to 3.2, and it was not immediately evident how to reproduce the setup we had before. It’s now been sorted out, and we’d like to share our setup.

Let’s assume that there are four workstations and one file server. The four workstations have the IP addresses 192.168.1.101-104 and the file server has the IP 192.168.1.100.

On the file server, do the following:

# gluster peer probe 192.168.1.101
# gluster peer probe 192.168.1.102
# gluster peer probe 192.168.1.103
# gluster peer probe 192.168.1.104
# gluster peer status
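The probes differ only in the last octet, so they can be scripted as a loop. The following is a minimal sketch, not part of our actual setup: the RUN variable and the probe_workstations function are hypothetical names introduced here. By default the script only prints the commands; set RUN to an empty string and run it as root to execute them.

```shell
#!/bin/sh
# Sketch: probe every workstation from the file server.
# RUN is a hypothetical dry-run switch: it defaults to "echo", so the
# commands are printed rather than executed. Use RUN="" to run them.
RUN="${RUN-echo}"

probe_workstations() {
    # Last octets of the four workstation addresses assumed in this post.
    for octet in 101 102 103 104; do
        $RUN gluster peer probe "192.168.1.$octet"
    done
    # List the peers so the result can be checked by eye.
    $RUN gluster peer status
}

probe_workstations
```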

You should see that there are 5 machines in the cluster. On each workstation and server, we will use the /data/export directory for storage. However, we need to make a few adjustments to the server to make our scheme work.

# ln -s /data /data101
# ln -s /data /data102
# ln -s /data /data103
# ln -s /data /data104
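The aliases follow the same pattern, one per workstation, so a loop works here too. This is only a sketch under the addressing assumed above; RUN and link_aliases are hypothetical names, and the default is a dry run that prints the commands. Note that every alias points at the same /data directory, which is the point of the trick and also the reason it deserves caution.

```shell
#!/bin/sh
# Sketch: create one /dataNNN alias per workstation on the file server.
# RUN is a hypothetical dry-run switch: "echo" prints, "" executes.
RUN="${RUN-echo}"

link_aliases() {
    for octet in 101 102 103 104; do
        # Each alias is a symlink to the same underlying /data directory.
        $RUN ln -s /data "/data$octet"
    done
}

link_aliases
```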

From here on, things are going to get a little weird but these instructions can be replicated on each of the workstations without any problems.

# gluster volume create vol101 replica 2 192.168.1.101:/data/export 192.168.1.100:/data101/export
# gluster volume start vol101
# mount -t glusterfs 192.168.1.101:/vol101 /home

That’s it. That will mount the gluster file-system to the home directory.
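Since only the last octet changes from one workstation to the next, the three commands can be wrapped in a small function. This is a hedged sketch rather than the script we actually run: RUN, SERVER, and setup_workstation are hypothetical names, the addresses are the ones assumed earlier in this post, and the default is a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: create, start, and mount the replicated volume for one workstation.
# RUN is a hypothetical dry-run switch: "echo" prints, "" executes.
RUN="${RUN-echo}"
SERVER="192.168.1.100"   # file server address assumed in this post

setup_workstation() {
    octet="$1"                  # last octet of the workstation, e.g. 101
    host="192.168.1.$octet"

    # Two bricks: the workstation's own export and the server's aliased path.
    $RUN gluster volume create "vol$octet" replica 2 \
        "$host:/data/export" "$SERVER:/data$octet/export"
    $RUN gluster volume start "vol$octet"

    # Mount the new volume over /home on the workstation.
    $RUN mount -t glusterfs "$host:/vol$octet" /home
}

# Defaults to workstation 101 when no argument is given.
setup_workstation "${1:-101}"
```

Calling it with a different octet, for example `setup_workstation 102`, prints the equivalent commands for vol102.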

Doing things this way allows the server to act as a mirror for the data in the home directory. By replicating the setup across different workstations, the server also acts as a shared storage for the cluster.

This gives us automatic replication and recovery through the Gluster AFR mechanism and its self-healing features. It also gives us shared storage, so a user’s data can be accessed from any workstation, and the performance boost of reading from and writing to the local machine instead of going over the network.

Any change made on one workstation will be distributed to the other workstations via the central server.

This scheme may be heresy for some but it’s served us well for more than a year.

6 Comments

Jeff Darcy (@Obdurodon) · 2012-05-15 at 00:35

I think you’re setting yourself up for data loss. If I’m interpreting this correctly, each workstation is replicating between a local directory and /data on the server, using symlinks to fool the code that’s there to guard against this very kind of reuse. Writes done at 101 won’t propagate properly to any of the other workstation-local copies, so they’ll be reading stale data. Worse, if one of them later writes to the same file then there will be missing writes in both directions. That can be nearly impossible to reconcile even manually. Please email me (jdarcy@redhat.com), or send email to the user list (gluster-users@gluster.org), or stop by #gluster on Freenode IRC, so we can discuss valid configurations that might meet your needs.

Shawn Tan · 2012-05-15 at 01:02

Thanks for the quick reply. I’m writing you an email as I write this comment.

Yorkim Parmentier (@trIx0r) · 2012-07-02 at 19:01

Hi,

Thank you for your blogpost. I’m experimenting with GlusterFS as well at the moment and I have a question about it.

My current demo setup is as follows: 2 servers, 10.13.38.2 and 10.13.38.6. As you can see, both servers are in the same LAN segment. However, I want to move 1 of the 2 servers to another physical site, where it will get the IP 10.13.39.6.

Do you know an easy way to change the address? I’ve used IP addresses when probing the master servers, so DNS isn’t an option.

Do I have to remove the second server and re-add it, or is there an easier way?

Thnx

Shawn Tan · 2012-07-02 at 19:21

I’m no glusterfs expert, but if you have added things using an IP instead of a hostname, then I guess you’ll need to add the new machine to the list of peers and replace the old peer with the new one using the ‘gluster volume replace-brick’ command.

That said, if your servers are geographically distributed, you may want to consider using a different mechanism such as geo-replication.

Yorkim Parmentier (@trIx0r) · 2012-07-02 at 22:37

Yes, this was my plan eventually. Right now both servers are at the same location while I confirm that GlusterFS is really what I need.

However, by the end of August, both servers should be distributed geographically.

Could you tell me more about geo-replication? As far as I understand, this is a feature that you can enable once you have created your volume, allowing you to replicate over the internet. Do you create your volumes the same way, or do you need to create them specifically to allow geo-replication?

Shawn Tan · 2012-07-04 at 17:43

Sorry, but you really should refer to GlusterFS for this info. Our blog is not a suitable channel to discuss this.