13 May 2013

A cluster is a group of two or more servers that divides work among its members. You need a cluster when you want to run your application in a highly available manner with automatic failover. In this post I show how to build an Alfresco cluster that gives you both high availability and automatic failover. The post assumes the environment described below.

Prerequisites

Software:
Alfresco Version: Enterprise v3.4.3
Operating System: CentOS 5.4 64bit
Web Server: Apache 2.2
Hardware Configuration:

  • Alfresco Cluster nodes:

    CPU: 2 Core CPU
    Memory: at least 4 GB
    Disk space: at least 40 GB

  • SAN Storage
  • Web Server:

    CPU: 2 Core CPU
    Memory: at least 4 GB
    Disk space: at least 40 GB

I have split the configuration into two parts. The first part covers the Alfresco cluster configuration, and the second covers the load balancer configuration, which handles automatic failover.

 Alfresco Cluster Configuration:

Two approaches are available for Alfresco cluster configuration.

In the first architecture, two Alfresco nodes use a single shared storage for the content store. Each node keeps its own copy of the Lucene indexes, and the L2 cache is replicated between the two nodes. Both nodes point to a single database server. Please refer to the diagram below.

Alfresco Cluster Method 1

 

In the second approach, each Alfresco node has its own content store, plus one additional shared content store. Each node keeps its own Lucene indexes, and the L2 cache is replicated between the two nodes. This method can also use a single database server. If you are planning to build a database cluster, the database connection configuration will need some changes. Please refer to the diagram below.

 Alfresco Cluster Method2

 

Here I show how to set up Alfresco clustering using the second method. In this method, each Alfresco node has its own content store plus one additional shared content store. The shared content store is used to replicate data between the two local content stores.

Use the following paths for the content stores on both Alfresco nodes.

Alfresco Local Content Store: /opt/alfresco/alf_data

Alfresco Lucene Index Path: /opt/alfresco/alf_data/lucene-indexes

Alfresco Backup Lucene Indexes: /opt/alfresco/alf_data/backup-lucene-indexes

Alfresco Shared Content Store mounted at: /opt/alfresco/alf_data_shared

We will use the following IP addresses for the Alfresco nodes.

Alfresco server 1 IP: 192.168.14.136

Alfresco server 2 IP: 192.168.14.138

Now apply the following configuration on both nodes.

(1) First, define a few cluster properties on both Alfresco nodes.

On node1:

alfresco.jgroups.bind_address=192.168.14.136
alfresco.jgroups.bind_interface=eth0
alfresco.cluster.name=mycluster
alfresco.jgroups.defaultProtocol=TCP
alfresco.tcp.start_port=7800
alfresco.tcp.initial_hosts=192.168.14.136[7800],192.168.14.138[7800]
index.recovery.mode=AUTO

alfresco.ehcache.rmi.hostname=192.168.14.136
alfresco.ehcache.rmi.port=40001
alfresco.ehcache.rmi.remoteObjectPort=45001

On node2:

alfresco.jgroups.bind_address=192.168.14.138
alfresco.jgroups.bind_interface=eth0
alfresco.cluster.name=mycluster
alfresco.jgroups.defaultProtocol=TCP
alfresco.tcp.start_port=7800
alfresco.tcp.initial_hosts=192.168.14.136[7800],192.168.14.138[7800]
index.recovery.mode=AUTO

alfresco.ehcache.rmi.hostname=192.168.14.138
alfresco.ehcache.rmi.port=40001
alfresco.ehcache.rmi.remoteObjectPort=45001
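
The two property sets above differ only in the node's own IP address (the JGroups bind address and the Ehcache RMI hostname). As a minimal sketch, assuming you want to generate the lines rather than type them per node, a small script can render them from the node list. The helper name `cluster_properties` is hypothetical, and the post does not say which properties file the lines go into, so the script only prints them.

```python
# Hypothetical helper: renders the cluster properties from step (1) for one node.
NODES = ["192.168.14.136", "192.168.14.138"]  # node IPs used in this post


def cluster_properties(node_ip, all_nodes, cluster_name="mycluster",
                       jgroups_port=7800, rmi_port=40001, rmi_object_port=45001):
    """Return the cluster property lines for one node as a single string."""
    if node_ip not in all_nodes:
        raise ValueError(f"{node_ip} is not in the node list")
    # Every node lists all cluster members, itself included.
    initial_hosts = ",".join(f"{ip}[{jgroups_port}]" for ip in all_nodes)
    lines = [
        f"alfresco.jgroups.bind_address={node_ip}",
        "alfresco.jgroups.bind_interface=eth0",
        f"alfresco.cluster.name={cluster_name}",
        "alfresco.jgroups.defaultProtocol=TCP",
        f"alfresco.tcp.start_port={jgroups_port}",
        f"alfresco.tcp.initial_hosts={initial_hosts}",
        "index.recovery.mode=AUTO",
        "",
        f"alfresco.ehcache.rmi.hostname={node_ip}",
        f"alfresco.ehcache.rmi.port={rmi_port}",
        f"alfresco.ehcache.rmi.remoteObjectPort={rmi_object_port}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for ip in NODES:
        print(f"# node {ip}")
        print(cluster_properties(ip, NODES))
        print()
```

Generating both sets from one list makes it harder for `alfresco.tcp.initial_hosts` to drift between nodes when a node is added later.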

(2) Now edit the /opt/alfresco/tomcat/conf/web.xml file and add <distributable/> at the end, just before </web-app>.

(3) Now set the following content store root path on both nodes.
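
If you prefer to script the edit in step (2), here is a minimal sketch, assuming a plain text insertion is acceptable for your web.xml. The function name `add_distributable` is hypothetical. It treats the file as text rather than parsing the XML, so it would also be fooled by a `<distributable` string inside a comment; check the result by hand.

```python
def add_distributable(web_xml_text):
    """Insert <distributable/> just before the closing </web-app> tag, once."""
    if "<distributable" in web_xml_text:
        return web_xml_text  # already present, leave the text untouched
    closing = "</web-app>"
    idx = web_xml_text.rfind(closing)
    if idx == -1:
        raise ValueError("no </web-app> closing tag found")
    return web_xml_text[:idx] + "  <distributable/>\n" + web_xml_text[idx:]


if __name__ == "__main__":
    sample = "<web-app>\n  <display-name>demo</display-name>\n</web-app>\n"
    print(add_distributable(sample))
```

To apply it, read /opt/alfresco/tomcat/conf/web.xml into a string, pass it through the function, and write the result back after taking a backup of the original file.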

dir.root=/opt/alfresco/alf_data

(4) Now copy /tomcat/shared/classes/alfresco/extension/ehcache-custom.xml.sample.cluster to /tomcat/shared/classes/alfresco/extension/ehcache-custom.xml.

Now open ehcache-custom.xml file and remove the following default definition of "cacheManagerPeerListenerFactory"

<cacheManagerPeerListenerFactory 
class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory" 
properties="socketTimeoutMillis=10000" 
/>

And uncomment the extended definition by removing the comment lines <!-- and --> before and after the following section.

<cacheManagerPeerProviderFactory
class="org.alfresco.repo.cache.AlfrescoCacheManagerPeerProviderFactory"
properties="heartbeatInterval=5000,
peerDiscovery=automatic,
multicastGroupAddress=230.0.0.1,
multicastGroupPort=4446"
/>

(5) Now we need to enable content store replication. Copy /tomcat/shared/classes/alfresco/extension/replicating-content-services-context.xml.sample to /tomcat/shared/classes/alfresco/extension/replicating-content-services-context.xml. 

Now change the local content store path and the shared content store path.

<bean id="localDriveContentStore" class="org.alfresco.repo.content.filestore.FileContentStore">
<constructor-arg>
<value>/opt/alfresco/alf_data/contentstore</value>
</constructor-arg>
</bean>

<bean id="networkContentStore" class="org.alfresco.repo.content.filestore.FileContentStore">
<constructor-arg>
<value>/opt/alfresco/alf_data_shared/contentstore</value>
</constructor-arg>
</bean>

Once you are done with changes, save the file.
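
Before restarting, it can help to confirm that both directories named in the beans above are usable on each node. The following is a minimal sketch, assuming it is run as the same operating system user that runs Alfresco. The helper name `check_store` is hypothetical, and on a fresh installation the contentstore directories may not exist until Alfresco has started once.

```python
import os

# Paths taken from the two bean definitions above.
CONTENT_STORES = {
    "localDriveContentStore": "/opt/alfresco/alf_data/contentstore",
    "networkContentStore": "/opt/alfresco/alf_data_shared/contentstore",
}


def check_store(path):
    """Return a list of problems found for one content store directory."""
    problems = []
    if not os.path.isdir(path):
        problems.append("missing or not a directory")
    elif not os.access(path, os.R_OK | os.W_OK | os.X_OK):
        problems.append("not readable/writable by the current user")
    return problems


if __name__ == "__main__":
    for bean_id, path in CONTENT_STORES.items():
        problems = check_store(path)
        status = "OK" if not problems else "; ".join(problems)
        print(f"{bean_id}: {path}: {status}")
```

A failure on the shared path usually points at the mount under /opt/alfresco/alf_data_shared rather than at the Alfresco configuration.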

(6) Now open the tomcat/conf/server.xml file and change jvmRoute on both nodes as shown below.

On node 1:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1">   

On node 2:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm2">    

(7) Once you are done with these changes, restart Alfresco and follow the steps below on the Apache web server.

Apache Web Server Configuration for Load Balancer and Auto Failover

Use the configuration below for Apache. You may need to enable the mod_proxy, mod_proxy_balancer, mod_proxy_ajp and mod_headers modules for it.

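# Set a ROUTEID cookie whenever the balancer picks or changes the worker.
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
<Proxy balancer://mycluster>
# 8009 is Tomcat's default AJP connector port.
BalancerMember ajp://192.168.14.136:8009 route=1
BalancerMember ajp://192.168.14.138:8009 route=2
ProxySet stickysession=ROUTEID
</Proxy>
# Trailing slashes must match on both ProxyPass arguments.
ProxyPass / balancer://mycluster/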

Once you are done with the above changes, restart Apache and check the Alfresco cluster configuration.
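
To see which node the balancer has pinned a session to, you can read the route out of the ROUTEID cookie that the Header line sets. Here is a minimal sketch, assuming you have captured the Set-Cookie header value from a response that came through the web server. The function name `route_from_set_cookie` is hypothetical.

```python
from http.cookies import SimpleCookie


def route_from_set_cookie(header_value):
    """Extract the balancer route from a Set-Cookie header value, or None."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    morsel = cookie.get("ROUTEID")
    if morsel is None:
        return None
    # The cookie value looks like ".1"; the route is the text after the dot.
    _, dot, route = morsel.value.partition(".")
    return route if dot and route else None


if __name__ == "__main__":
    print(route_from_set_cookie("ROUTEID=.1; path=/"))
```

With the configuration above, route 1 corresponds to 192.168.14.136 and route 2 to 192.168.14.138. Stopping the node a session is pinned to and repeating the request should produce a new cookie with the other route, which is a quick way to watch failover happen.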

Validating Alfresco Cluster Setup:

Testing the cluster

Before you start testing Alfresco Cluster Setup, please make sure you have gone through Alfresco Cluster Environment Validation.
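
One part of that environment check is that the nodes can reach each other on the cluster ports configured in step (1). The following is a minimal sketch of such a check, assuming plain TCP reachability is what you want to confirm and that Alfresco is already running on the peer. The helper names `port_open` and `check_peer` are hypothetical.

```python
import socket

# Ports from step (1): JGroups TCP, Ehcache RMI and the RMI remote object port.
CLUSTER_PORTS = {
    7800: "JGroups TCP",
    40001: "Ehcache RMI",
    45001: "Ehcache RMI remote object",
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_peer(host, ports=CLUSTER_PORTS, timeout=3.0):
    """Return {port: bool} telling which cluster ports on host are reachable."""
    return {port: port_open(host, port, timeout) for port in ports}

# Example, run on node 1 against node 2:
#   for port, ok in check_peer("192.168.14.138").items():
#       print(port, CLUSTER_PORTS[port], "reachable" if ok else "NOT reachable")
```

Run it from each node against the other. A port that is not reachable usually means a firewall rule or a wrong bind address rather than a problem in Alfresco itself.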

A set of steps can be used to verify that clustering works for the various components involved. You will need direct web client access to each machine in the cluster. Each operation is performed on one machine (Server 1) and verified on the other (Server 2). The roles can be swapped, with any machine chosen as Server 1.

Cache clustering

Server 1: Login as admin.

Server 1: Browse to the Guest Home space, locate the tutorial PDF document and view the document properties.

Server 2: Login as admin.

Server 2: Browse to the Guest Home space, locate the tutorial PDF document and view the document properties.

Server 1: Modify the tutorial PDF's description field, adding 'abcdef'.

Server 2: Refresh the document properties view of the tutorial PDF document.

Server 2: The description field must have changed to include 'abcdef'.

Index clustering

Repeat the cache clustering steps above.

Server 1: Perform an advanced search of the description field for 'abcdef' and verify that the tutorial document is returned.

Server 2: Search the description field for 'abcdef'. As long as enough time was left for the index tracking (10s or so), the document must show up in the search results.

Content replication/sharing

Server 1: Add a text file to Guest Home containing 'abcdef'.

Server 2: Refresh the view of Guest Home and verify that the document is visible.

Server 2: Open the document and ensure that the correct text is visible.

Server 2: Perform a simple search for 'abcdef' and ensure that the new document is retrieved. This relies on index tracking, so it may take a few seconds for the document to be visible.
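
Because the search checks depend on index tracking, a fixed wait of a few seconds can give false failures. If you automate these checks, a polling loop is more forgiving. This is a minimal sketch: `wait_until` is a hypothetical helper, and the `check` callable stands in for whatever search request you issue against Server 2, which is not shown here.

```python
import time


def wait_until(check, timeout=30.0, interval=2.0):
    """Call check() until it returns a truthy value or timeout seconds pass.

    Returns the truthy result, or None if the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in check that succeeds on the third attempt.
    attempts = []

    def fake_search():
        attempts.append(1)
        return "document found" if len(attempts) >= 3 else None

    print(wait_until(fake_search, timeout=5.0, interval=0.1))
```

A timeout well above the 10 seconds mentioned for index tracking leaves some margin on a loaded system.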