Question about replication and scaling down

Aug 20, 2010
Tony Novak
Hi there,

I'm new to Riak and starting to evaluate it for an application we're
building.  So far, I'm quite impressed.  But I'm having some trouble
wrapping my head around what exactly happens in the event of a node
failure.  In this section of the wiki:

https://wiki.basho.com/display/RIAK/R...oesN=3reallymean?

it states that there's no guarantee that the replicas will reside on
different machines.  Does this mean that it's possible for all N copies of
a given datum to end up on the same machine?  Is there any way to avoid
this, so that a single node failure won't result in data loss?

Thanks!
Tony
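
To make the wiki's caveat concrete: Riak writes the N replicas to N
consecutive partitions of the ring, and each partition is owned by some
physical node.  The following is a minimal sketch in Python, assuming a
made-up ring of 8 partitions and hypothetical node names "a", "b" and "c";
it is not Riak's actual claim algorithm, it only shows how an uneven
ownership layout can put several replicas on one machine.

```python
def preference_list(ring, start, n):
    """Owners of the n consecutive partitions beginning at index `start`."""
    return [ring[(start + i) % len(ring)] for i in range(n)]


def partitions_at_risk(ring, n):
    """Starting partitions whose n replicas land on fewer than n distinct nodes."""
    return [
        p for p in range(len(ring))
        if len(set(preference_list(ring, p, n))) < n
    ]


# Hypothetical ownership map: index = partition, value = owning node.
uneven_ring = ["a", "a", "b", "c", "a", "b", "c", "a"]
even_ring = ["a", "b", "c", "a", "b", "c"]

print(partitions_at_risk(uneven_ring, 3))  # partitions sharing a node
print(partitions_at_risk(even_ring, 3))    # none when owners alternate
```

In the uneven layout, the replicas that start at the last partition all
belong to node "a", so losing that single node would lose every copy.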

Similar Threads
RE: question for sso session replication in tomcat 6.0.26
Hi Pid

My colleague found a bunch of undocumented attributes for DeltaManager
and started using them; he then told me that SSO failover appeared to
start working.  So I went through the Tomcat source code (v6.0.26) and
checked the options he used.

It turned out that the sendAllSessionsWaitTime attribute needs to be set
to -1.  But this trick did not always work: SSO failover still fails in
my test cases.  Could you tell me the implications of setting this flag
to -1?  It looks like DeltaManager will then send replication messages
continuously.  I am not sure we want to set this flag in production,
since SSO failover still fails and it may cause network issues.

Have you found anything yet?

yasushi

-----Original Message-----
From: Pid [mailto:p### @pidster.com] 
Sent: Wednesday, June 23, 2010 1:06 AM
To: Tomcat Users List
Subject: Re: question for sso session replication in tomcat 6.0.26

I'll have to look at the code, but maybe you're being affected by a
recent bug whereby the session id changes after login but isn't then
replicated.

You might search bugzilla to see if this applies to 6.0.26.


p

On 22 Jun 2010, at 22:41, "Okubo, Yasushi (TSD)"
<Yasush### @takedasd.com> wrote:

 
 Hi
 
 There were two cookies created by Tomcat 6.0.26: one for SSO, and the
 other for the regular session between client and Tomcat.  JSESSIONID is
 working fine (session replication and failover both work), but
 JSESSIONIDSSO is not: it is updated with a new value upon re-login.
 
 yasushi
 
 
 JSESSIONIDSSO
 65110434847FE0AA1F1EBF0EF0871D25
 
 
 JSESSIONID
 5CFE92814875C4DEFC554526147698A3.jvm2
 
 -----Original Message-----
 From: Jon Brisbin [mailto:jon.br### @npcinternational.com] 
 Sent: Tuesday, June 22, 2010 2:17 PM
 To: Tomcat Users List
 Cc: Okubo, Yasushi (TSD)
 Subject: Re: question for sso session replication in tomcat 6.0.26
 
 Are you using a "jvmRoute" setting on your BalancerMember definition in
 mod_proxy config and on the <Engine/> element in server.xml?  Your cookie
 would have the jvmRoute property added to the end of it (e.g.
 ALONGMD5HASH.server1) if so.
 
 From the Almighty Google:
 http://community.jboss.org/wiki/usingmodproxywithjboss
 
 Jon Brisbin
 Portal Webmaster
 NPC International, Inc.
 
 
 
 On Jun 22, 2010, at 3:48 PM, Okubo, Yasushi (TSD) wrote:
 
> Hi
> 
> I downloaded Apache httpd v2.2.15, compiled and installed it, but the
> result was the same.
> 
> SSO session replication appears to have failed.  Upon shutting down the
> node, it kicked me out of the password-protected area and I needed to
> re-login on the second node.
> 
> On Apache, I installed/enabled all modules, including basic
> authentication etc.  Is there any requirement on the Apache side, or on
> how the virtual host should be set up in httpd.conf, to make SSO
> failover work?
> 
> Thanks,
> yasushi
> 
> -----Original Message-----
> From: Pid [mailto:p### @pidster.com] 
> Sent: Tuesday, June 22, 2010 8:04 AM
> To: Tomcat Users List
> Subject: Re: question for sso session replication in tomcat
6.0.26
> 
> On 22/06/2010 15:56, Okubo, Yasushi (TSD) wrote:
>> Hi Andrew
>> 
>> In the no-failover case, SSO works for all web applications on the
>> same host.  Upon failover [shutting down one node], the user is routed
>> to the other node, and TC asks the user to re-login when he/she tries
>> to access a password-protected area.
>> 
>> I have checked server.xml many times and session replication is
>> working fine upon failover, so I cannot think of any misconfiguration
>> in server.xml.
>> The issue is that SSO failover is not working.  I think it might be
>> related to my Apache virtual host setup, but I could not figure it out.
>> 
>> Thanks for your help,
>> yasushi
>> 
>> I am using mod_proxy_ajp, mod_proxy_balancer [v2.2.3]
> 
> mod_proxy_ajp appeared in 2.2.3 for the first time; it was functional
> but not perfect, and there have been many bugfixes and improvements
> since then.  You should upgrade HTTPD.
> 
> 
> p
> 
>> OS : Redhat Linux 64bit  RHEL v5.5
>> JDK : 1.6.0.20 
>> 
>> === I created virtual host on port 9050 ==
>> Httpd.conf
>> 
>> <VirtualHost 10.250.200.57:9050>
>> ServerAdmin xyz
>> ServerName webclust1.xyz.com
>> ServerAlias webclust1
>> ErrorLog logs/webclust_cluster_error.log
>> CustomLog logs/webclust-cluster-access_log common
>> 
>> <Location /balancer-manager>
>> SetHandler balancer-manager
>> 
>> Order Deny,Allow
>> Deny from all
>> Allow from all
>> </Location>
>> 
>> ProxyRequests off
>> <Proxy balancer://webclust>
>> BalancerMember ajp://10.250.200.57:9001 loadfactor=10 max=150 smax=145 route=jvm1
>> BalancerMember ajp://10.250.200.57:9002 loadfactor=10 max=150 smax=145 route=jvm2
>> BalancerMember ajp://10.250.200.57:9003 loadfactor=10 max=150 smax=145 route=jvm3
>> Order Deny,Allow
>> Allow from all
>> </Proxy>
>> 
>> #Do not proxy balancer-manager
>> ProxyPass /balancer-manager !
>> 
>> <Location /examples>
>> ProxyPass balancer://webclust/examples stickysession=JSESSIONID|jsessionid
>> ProxyPassReverse balancer://webclust/examples
>> Order Deny,Allow
>> Allow from all
>> </Location>
>> 
>> <Location / >
>> ProxyPass balancer://webclust/ stickysession=JSESSIONID|jsessionid
>> ProxyPassReverse balancer://webclust/
>> Order Deny,Allow
>> Allow from all
>> </Location>
>> 
>> 
>> === server.xml ===
>>   <!-- Define an AJP 1.3 Connector on port 8009 -->
>>   <Connector port="9002" protocol="AJP/1.3" redirectPort="8443" />
>> 
>> <Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1">
>> 
>>     <Host name="localhost"  appBase="webapps"
>>           unpackWARs="true" autoDeploy="true"
>>           xmlValidation="false" xmlNamespaceAware="false">
>> 
>>       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>>                channelSendOptions="4">
>> 
>>         <Manager className="org.apache.catalina.ha.session.DeltaManager"
>>                  name="node2"
>>                  expireSessionsOnShutdown="false"
>>                  notifyListenersOnReplication="true"/>
>> 
>>         <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>>           <Membership className="org.apache.catalina.tribes.membership.McastService"
>>                       address="228.0.0.5"
>>                       port="45564"
>>                       frequency="500"
>>                       dropTime="3000"/>
>>           <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>>                     address="auto"
>>                     port="4020"
>>                     autoBind="100"
>>                     selectorTimeout="5000"
>>                     maxThreads="12"/>
>>           <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>>             <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
>>           </Sender>
>>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
>>         </Channel>
>> 
>>         <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
>>                filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;.*\.xls;.*\.sdf;.*\.xml;"/>
>>         <!-- only with jk_mod failover-->
>>         <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"
>>                enabled="true" sessionIdAttribute="takeoverSessionid"/>
>> <!--
>>         <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
>>                   tempDir="/tmp/war-temp/"
>>                   deployDir="/usr/local/apache/node2-tomcat-6.0.26/webapps"
>>                   watchDir="/tmp/war-listen/"
>>                   watchEnabled="true"/>
>> -->
>>         <!-- only with jk_mod and jvmroutebindervalve-->
>>         <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
>>         <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>>       </Cluster>
>> 
>>       <Valve className="org.apache.catalina.ha.authenticator.ClusterSingleSignOn" />
>> 
>>       <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
>>              prefix="webappqa_node2_access_log." suffix=".log"
>>              pattern="common" resolveHosts="false"/>
>> 
>>     </Host>
>> </Engine>
>> 
>> 
>> -----Original Message-----
>> From: Andrew Bruno [mailto:andrew.### @gmail.com] 
>> Sent: Monday, June 21, 2010 10:09 PM
>> To: Tomcat Users List
>> Subject: Re: question for sso session replication in tomcat
6.0.26
>> 
>> Oh sorry, I re-read your answer.  Not sure why SSO is not working; I'd
>> be interested to find out, though.
>> 
>> AB
>> 
>> On Tue, Jun 22, 2010 at 3:04 PM, Andrew Bruno
 <andrew### @gmail.com>
> wrote:
>>> Hi Yasushi
>>> 
>>> In your server.xml have you added the jvmRoute to the Engine?
>>> 
>>> i.e.
>>> 
>>> <Engine name="Catalina" defaultHost="localhost" jvmRoute="1">
>>> 
>>> Andrew
>>> 
>>> On Tue, Jun 22, 2010 at 2:50 PM, Okubo, Yasushi (TSD)
> <Yasush### @takedasd.com> wrote:
>>>> Hi Andrew
>>>> 
>>>> Thanks for your post.  When I checked the session IDs from Firefox,
>>>> the SSO session ID [JSESSIONIDSSO] does not have jvmRoute info; only
>>>> JSESSIONID has the jvmRoute.  So session replication upon failover is
>>>> working fine, but single sign-on upon failover is not working on
>>>> Tomcat 6.0.x (including 6.0.26).
>>>> 
>>>> yasushi
>>>> 
>>>> -----Original Message-----
>>>> From: Andrew Bruno [mailto:andrew.### @gmail.com]
>>>> Sent: Monday, June 21, 2010 9:18 PM
>>>> To: Tomcat Users List
>>>> Subject: Re: question for sso session replication in
tomcat 6.0.26
>>>> 
>>>> Looking at the code I think this is wrong
>>>> 
>>>> if (!_ssoSessionId.contains("." + jvmRoute)) {
>>>>     _ssoSessionId += "." + jvmRoute;
>>>>     response.addCookie(new Cookie(_SSO_SESSION_COOKIE_NAME, _ssoSessionId));
>>>> }
>>>> 
>>>> The original sessionId will already have the "."+_any_other_jvmRoute
>>>> included, so you need to substring it, and append the new jvmRoute.
>>>> 
>>>> _ssoSessionId = _ssoSessionId.substring(0, _ssoSessionId.indexOf("."));
>>>> 
>>>> and then add
>>>> 
>>>> _ssoSessionId += "." + jvmRoute;
>>>> 
>>>> AB
>>>> 
>>>> On Tue, Jun 22, 2010 at 1:03 PM, Okubo, Yasushi (TSD)
>>>> <Yasushi### @takedasd.com> wrote:
>>>>> Hi experts
>>>>> 
>>>>> 
>>>>> 
>>>>> I found this old email, about TC 5.5.23, in the archive.
>>>>> 
>>>>> Does this problem still exist in Tomcat 6.0.x, specifically 6.0.26?
>>>>> 
>>>>> When failover occurs, the SSO session ID is not replicated and
>>>>> rewritten correctly, so the user is forced to re-login and the SSO
>>>>> session ID is updated with a new value.  Could someone explain what
>>>>> is expected in a current Tomcat 6.0.x cluster upon failover?  Should
>>>>> the SSO session ID be replicated correctly in Tomcat 6.0.x?
>>>>> 
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> yasushi
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ROOKIE wrote:
>>>>> Hi,
>>>>> I have a problem with a Tomcat cluster + mod_proxy load balancer:
>>>>> 
>>>>> We have a main app which authenticates itself to a webapp, and from
>>>>> this app one can launch embedded apps which use the SSO cookie to
>>>>> access other webapps on the server (Single-Sign-On for the user).
>>>>> 
>>>>> Things are working perfectly for the normal cookie but not for the
>>>>> SSO cookie.
>>>>> 
>>>>> The problem I have is that Tomcat does not replicate SSO sessions,
>>>>> so when these embedded apps route through the load balancer we get
>>>>> 401s on all the other cluster members except the one which actually
>>>>> generated the SSO cookie.
>>>>> 
>>>>> I wanted to know if we can edit the SSO cookie generated by Tomcat
>>>>> to also contain the jvmRoute parameter, so that the load balancer
>>>>> goes directly to the correct cluster member.
>>>>> 
>>>>> I tried doing this in my code by fetching the SSO cookie and
>>>>> appending the jvmRoute to it as follows:
>>>>> 
>>>>> HttpServletRequest request =
>>>>>     (HttpServletRequest) Security.getContext(HttpServletRequest.class);
>>>>> HttpServletResponse response =
>>>>>     (HttpServletResponse) Security.getContext(HttpServletResponse.class);
>>>>> if (request != null) {
>>>>>     String jvmRoute = "Vinod_Cluster_1";    // as mentioned in server.xml
>>>>>     Cookie[] cookies = request.getCookies();
>>>>>     for (int nc = 0; cookies != null && nc < cookies.length; nc++) {
>>>>>         if (_SESSION_COOKIE_NAME.equals(cookies[nc].getName())) {
>>>>>             _sessionId = cookies[nc].getValue();
>>>>>         }
>>>>>         else if (_SSO_SESSION_COOKIE_NAME.equals(cookies[nc].getName())) {
>>>>>             _ssoSessionId = cookies[nc].getValue();
>>>>>             if (!_ssoSessionId.contains("." + jvmRoute)) {
>>>>>                 _ssoSessionId += "." + jvmRoute;
>>>>>                 response.addCookie(new Cookie(_SSO_SESSION_COOKIE_NAME, _ssoSessionId));
>>>>>             }
>>>>>         }
>>>>>     }
>>>>> }
>>>>> 
>>>>> But after this I started getting 401s from even the correct cluster
>>>>> member.  My guess is that addCookie doesn't update the cookie in
>>>>> Tomcat's cache, which is reasonable.
>>>>> 
>>>>> Another thought was to edit Tomcat's SSO cookie generation code to
>>>>> append the jvmRoute to the SSO cookie.
>>>>> 
>>>>> Is there a better way to achieve this in my code base?
>>>>> 
>>>>> Thanks In Advance,
>>>>> Vinod
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
> 
question for sso session replication in tomcat 6.0.26
Hi experts

 

I found this old email, about TC 5.5.23, in the archive.

Does this problem still exist in Tomcat 6.0.x, specifically 6.0.26?

When failover occurs, the SSO session ID is not replicated and rewritten
correctly, so the user is forced to re-login and the SSO session ID is
updated with a new value.  Could someone explain what is expected in a
current Tomcat 6.0.x cluster upon failover?  Should the SSO session ID be
replicated correctly in Tomcat 6.0.x?

 

Thanks,

yasushi

 

 

 

ROOKIE wrote:
Hi,
I have a problem with a Tomcat cluster + mod_proxy load balancer:

We have a main app which authenticates itself to a webapp, and from this
app one can launch embedded apps which use the SSO cookie to access other
webapps on the server (Single-Sign-On for the user).

Things are working perfectly for the normal cookie but not for the SSO
cookie.

The problem I have is that Tomcat does not replicate SSO sessions, so
when these embedded apps route through the load balancer we get 401s on
all the other cluster members except the one which actually generated
the SSO cookie.

I wanted to know if we can edit the SSO cookie generated by Tomcat to
also contain the jvmRoute parameter, so that the load balancer goes
directly to the correct cluster member.

I tried doing this in my code by fetching the SSO cookie and appending
the jvmRoute to it as follows:

        HttpServletRequest request =
            (HttpServletRequest) Security.getContext(HttpServletRequest.class);
        HttpServletResponse response =
            (HttpServletResponse) Security.getContext(HttpServletResponse.class);
        if (request != null) {
            String jvmRoute = "Vinod_Cluster_1";    // as mentioned in server.xml
            Cookie[] cookies = request.getCookies();
            for (int nc = 0; cookies != null && nc < cookies.length; nc++) {
                if (_SESSION_COOKIE_NAME.equals(cookies[nc].getName())) {
                    _sessionId = cookies[nc].getValue();
                }
                else if (_SSO_SESSION_COOKIE_NAME.equals(cookies[nc].getName())) {
                    _ssoSessionId = cookies[nc].getValue();
                    if (!_ssoSessionId.contains("." + jvmRoute)) {
                        _ssoSessionId += "." + jvmRoute;
                        response.addCookie(new Cookie(_SSO_SESSION_COOKIE_NAME, _ssoSessionId));
                    }
                }
            }
        }

But after this I started getting 401s from even the correct cluster
member.  My guess is that addCookie doesn't update the cookie in Tomcat's
cache, which is reasonable.

Another thought was to edit Tomcat's SSO cookie generation code to append
the jvmRoute to the SSO cookie.

Is there a better way to achieve this in my code base?

Thanks In Advance,
Vinod
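
The suffix handling suggested in the reply quoted earlier (strip any
existing route, then append the current one) is plain string work.  Here it
is as a small Python sketch, where the function name and the sample IDs are
made up for illustration.  Splitting on the first "." also covers an ID with
no route at all, a case in which Java's substring(0, indexOf(".")) would
throw because indexOf returns -1.  This only shows the string handling; it
does not address whether Tomcat accepts a rewritten SSO cookie.

```python
def with_jvm_route(sso_id, jvm_route):
    """Return sso_id with any existing '.route' suffix replaced by jvm_route."""
    # partition() returns the whole string as `base` when there is no "."
    base, _, _ = sso_id.partition(".")
    return f"{base}.{jvm_route}"


print(with_jvm_route("65110434847FE0AA1F1EBF0EF0871D25.jvm1", "jvm2"))
print(with_jvm_route("65110434847FE0AA1F1EBF0EF0871D25", "jvm2"))
```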

 



backlogging and scaling
I'm curious if there are any efforts ongoing to amortize the
background tasks in Cassandra over time?
Specifically, the cost of compaction and AE, rebalancing, etc seems to
be a problem for some users when they are expecting more steady-state
performance. While this may sometimes be the result of a cluster which
is at its marginal capacity, users are still surprised with the
performance hit or downtime required for common operations. Making the
cluster able to make finer-grained and measurable progress towards the
ideal state may help other users, too.

Is there a feasible design or enhancement which may allow these types
of background tasks to be broken apart into smaller pieces without
compromising overall consistency?
It would be excellent if the user could see the over-all state of the
storage cluster, and to choose the proportion of resources allocated
to recovering backlog vs servicing clients, etc.
Even better, if there were some basic heuristics which worked well for
the general case, and users would only have to see the scheduling plan
in special situations.

How would you go about doing that? Does the current architecture lend
itself to this type of optimization, or otherwise?


Re: Cassandra Scaling Questions
 
> 1.) What have you found to be the best ratio of Cassandra row cache to
> memory free on the system for filesystem cache?  Are you tuning it like
> an RDBMS so Cassandra has the vast majority of the RAM in the system or
> are you letting the filesystem cache do some of the work?

This depends on your exact case: how many rows are in the hot set.  Giving
too much memory to the JVM cache results in slower garbage collection with
no benefit to performance.  There are cases (for example, large rows that
are mostly read partially using get_slice) in which the row cache makes
things worse.  I took a try-and-watch approach, changing the size of the
row cache and watching the row cache hit ratio and op/s.  A hit ratio of
0.9 was enough for my case.

> 2.) Is the Cassandra cache write-through (i.e. are new records held in
> the row cache as they're written to disk)?

Not exactly.  Cassandra keeps recent writes (not rows) in memory, but after
flushing the memtable it will reread the whole row from disk (and
reconstruct it) into the row cache on the first read of the data.

> 3.) When using the random partitioner how much difference should be
> expected (or has been observed) between nodes?  2%? 10%?

This depends on the data.  It will distribute keys almost equally between
nodes, but the sizes of row data can differ between keys.  In my case the
difference was about 0.2%.
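
To get a feel for the key-count side of this, here is a minimal sketch in
Python.  It assumes MD5-hashed keys (as the random partitioner uses) and
nodes that own equal, evenly spaced slices of the hash space; the key names
and the simplified 128-bit token range are illustrative assumptions, not
Cassandra's exact token arithmetic.  Row sizes are ignored, so this shows
only how evenly the keys themselves spread.

```python
import hashlib
from collections import Counter


def owner(key, num_nodes):
    """Map a key to a node index by slicing the MD5 hash space evenly."""
    token = int.from_bytes(hashlib.md5(key.encode()).digest(), "big")
    return token * num_nodes // 2**128


def spread(num_keys, num_nodes):
    """Largest relative deviation of any node's key count from the mean."""
    counts = Counter(owner(f"key{i}", num_nodes) for i in range(num_keys))
    mean = num_keys / num_nodes
    return max(abs(counts[n] - mean) / mean for n in range(num_nodes))


print(f"max deviation: {spread(100_000, 4):.2%}")
```

The deviation shrinks as the number of keys grows; differences in data
volume then come mostly from row sizes, as noted above.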








Cassandra Scaling Questions
Hi All,
I've got a couple of questions that have come up about how Cassandra works
and what others are seeing in their environments.  Here goes:

1.) What have you found to be the best ratio of Cassandra row cache to
memory free on the system for filesystem cache?  Are you tuning it like an
RDBMS so Cassandra has the vast majority of the RAM in the system or are
you letting the filesystem cache do some of the work?

2.) Is the Cassandra cache write-through (i.e. are new records held in the
row cache as they're written to disk)?

3.) When using the random partitioner how much difference should be
expected (or has been observed) between nodes?  2%? 10%?

3.5) Can a load balance be expected to bring the data distribution pretty
close to even among all nodes in the ring?  Is the correct process for a
loadbalance to run the loadbalance operation on each node in the ring?


Thanks!  I'm curious to hear what others have observed.
-Aaron


Replication
  Hi all.

Is there any documentation describing replication setup?  I have tried
twice and it doesn't work.  I have added this:
<property>
<name>hbase.replication.enabled</name>
<value>true</value>
</property>

to hbase-site.xml and changed REPLICATION_SCOPE to '1' in all columns,
but it still doesn't work.

Thanks in advance.
S.Bauer


rebalancing replication help
Looks like there is not much activity on the hdfs-user list, so I am
reposting this on the general list.

Hi guys.
  I have a few related questions. I am going to lay out the steps I have
taken. Please comment on what I can do better.

  I was trying to add 5 nodes to my existing 10-node cluster and also
increase the replication factor from 2 to 3.
I thought I wouldn't have to run the balancer because it would most likely
put the new replicas onto the new nodes.

There are about 500k blocks.
I wanted to get it all stabilized (replication and balancing) within 24
hours. It's more than 24 hours now and fsck reports 30% under-replication.
Is there a way to force HDFS to balance/replicate more aggressively?

It would be great if someone explained what/when things happen to blocks
in the context of

1)      Rebalancing

2)      -setrep

3)      Restarting cluster with a higher/lower replication factor.

A few questions and a few issues here.

1)      When you restart the cluster with a higher replication value than
before, does it also apply to existing blocks or only to new blocks being
created?

2)      Does the balancer take under-replication of blocks into account, or
does it blindly start moving existing blocks to reach the threshold?


A very specific problem: I am having this strange problem where -setrep
hangs on one particular block for hours. Is this because it's corrupt?
But fsck said it's healthy.


Thanks
Arun



Replication startup
Using the latest snapshot (1.5.7), I'm trying to start up two
replication servers thusly on the same machine:
mongod --shardsvr --port 27018 --replSet mongo-dev1/10.1.1.233:27021 --bind_ip 10.1.1.233 --dbpath /mnt/mongo-data-27018
mongod --shardsvr --port 27021 --replSet mongo-dev1/10.1.1.233:27018 --bind_ip 10.1.1.233 --dbpath /mnt/mongo-data-27021

this results in copious log messages of the following, which I
expected:
Fri Jul 30 11:38:34 [startReplSets] replSet can't get
local.system.replset config from self or any seed.
Fri Jul 30 11:38:34 [startReplSets] replSet   sleeping 20sec and will
try again.

But when I connect to the 27018 server, and run these commands:
var config = {
  _id : "mongo-dev1",

  members: [
    {
      _id : 0,
      host : "10.1.1.233:27018"
    }
    ,
    {
      _id : 1,
      host : "10.1.1.233:27021"
    }
  ]
};
rs.initiate(config);

...I get this error:
{
	"startupStatus" : 4,
	"info" : "mongo-dev1/10.1.1.233:27018",
	"errmsg" : "all members and seeds must be reachable to initiate set",
	"ok" : 0
}


Is there something I need to do differently for the first member of
the set?





Comulative Replication
Is there a way to do something like "commutative" replication, that is,
replicating while appending to and merging the target document with the new
fields from the source doc, thus avoiding the simple override of the target
document with the current version of the source? I call this "commutative"
just to emphasize the act of appending the new information and overriding
the existing fields, but not the whole document itself.


Thank you!
Nikolai
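
The behaviour described above, as opposed to a whole-document overwrite,
amounts to a field-level merge.  A minimal sketch in Python, with plain
dictionaries standing in for documents and made-up field names; it
illustrates the intended semantics only and is not any database's
replication API.

```python
def merge_doc(target, source):
    """Keep target-only fields, add new source fields, let source win on clashes."""
    merged = dict(target)   # copy so the stored target is not mutated
    merged.update(source)   # source values override fields present in both
    return merged


target = {"name": "widget", "price": 10, "stock": 4}
source = {"price": 12, "color": "red"}
print(merge_doc(target, source))
```

A plain overwrite would instead leave the target equal to the source and
drop the "name" and "stock" fields.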






Monitoring Replication
Hi List

my last mail was marked to have been rejected so i resend it again. hope
that it will get through this time. if the first one has been received.
sorry for the inconvenience.

Greets
Moritz


---------- Forwarded message 
How to force the replication ?
Hi all
I changed the replication of a file using the command: bin/hadoop setrep
But when I use fsck to check the status of the file, it always shows
that one replica of this file is missing. I know that the setrep
command only changes the metadata on the NameNode, so when will it
eventually affect the data nodes? And can I force the procedure?



Jeff Zhang


Why is replication so slow?
Hi,

I have three equal machines with Pentium(R) D CPU 3.20GHz, 2GiB RAM, 
FreeBSD 8, Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:2:2] [rq:2] 
[async-threads:0] [hipe] [kernel-poll:false], and CouchDB 1.0.0.

I would like to replicate documents between the three (even more 
machines later) in a fully meshed replica agreement (every node 
replicates from/to every other to ensure that there is no SPoF and every 
document gets to others ASAP). The nodes would store small, but quickly 
changing documents (application no. 1) and larger (from several kBs to 
several GBs) binary attachments (application no. 2). The applications 
are not mixed on the same CouchDB instance (even the machines).

I've experimented with the first and noticed that no matter how fast I
insert documents (BTW, I could achieve about 230 inserts per second with
parallel connections and no bulk inserts), the traffic between the machines
doesn't go beyond about 500 kBps and the replicas lag behind the written
node (a lot!).

Based on this, I've started another test, now with smaller binary 
attachments. The first run did this:
for i in `jot 128`
do
  curl -X PUT http://localhost:5984/testdb/$i/file \
    -H "Content-Type: application/octet-stream" --data-binary @bin1
done

That is, it uploads 128 MB of data (bin1 is 1MB of size).

Without replication, it runs in 8.64 seconds (14.81 MBps, not that fast 
either, but hey, it's erlang :). If I run it with background curl 
processes (maximum 128 parallel uploads), the script runs in 6.74s 
(18.99 MBps).

Now if I make a one way replica to another node (connected with gigabit 
ethernet), the run time slightly increases to 7.04s on the master node, 
but it takes 42 seconds (3.04 MBps) for all the 128 documents to reach 
the slave node.
Things get worse when I make a two way replication between the two 
nodes, this time the upload on the "master" node takes 7.4 seconds, but 
75 seconds are needed for the two nodes to become consistent. The erlang 
processes on both sides eat more resources, so this slowdown is 
completely visible, not network bound (of course).

If I make two one-way replications (A->B, A->C), the times look like
this: time needed to upload on the master (A) node: 6.52s; time needed
for the slave (B, C) nodes to get consistent with A: 44s (A->B), 39s
(A->C).
BTW, I calculate this from the start of the script (I'm not writing the
data on A and then setting up replication).

With the following replications defined: A<->B, A<->C, I get these:
uploading to A: 7.34s, A->B consistency: 72s, A->C consistency: 72s.
During the process I saw this on node A:
  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
15427 couchdb     11 109    0   217M   149M CPU0    0  14:44 135.94% beam.smp
and this was after the upload had finished, so this is what CouchDB
eats when doing bilateral replication towards two nodes.

And now the full mesh (A<->B, A<->C, B<->C):
CouchDB resource usage tops:
15427 couchdb     11 110    0   270M   202M CPU1    1  18:44 140.14% beam.smp

and the consistency times also: A->B: 125s, A->C: 107s.
BTW, the upload lasted for 7.59s.

Summary: it seems unilateral replication is consistent in its resource
usage, and it's pretty slow (7s for the localhost write vs. 42s of
replication to the remote node). If I define a bilateral replication it
slows down further, to nearly half the speed. Every bilateral agreement
introduces this slowdown, so one unilateral: 42s, one bilateral: 72s,
two bilaterals: 125s.

I'm sure it's not about waiting for the network or disk; it seems to be
a pure resource usage problem. Is this known? Will it be fixed?

Thanks,
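
For reference, the throughput figures quoted above follow directly from the
128 MB payload divided by the elapsed time.  A small Python sketch of that
arithmetic, using the timings from this message (the dictionary labels are
mine):

```python
SIZE_MB = 128  # 128 uploads of the 1 MB file bin1


def mbps(megabytes, seconds):
    """Average throughput in megabytes per second."""
    return megabytes / seconds


# Elapsed times in seconds, as reported in the message.
timings = {
    "sequential upload, no replication": 8.64,
    "parallel upload, no replication": 6.74,
    "one-way replication to second node": 42,
    "two-way replication between two nodes": 75,
    "full mesh, A->B consistency": 125,
}

for label, seconds in timings.items():
    print(f"{label}: {mbps(SIZE_MB, seconds):.2f} MBps")
```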


Securing replication
I have a webapp acting as a security gateway/reverse proxy for other
webapps.  Its data therefore includes authorization data which itself
needs to be protected from unauthorized access.  Currently that data
lives in a localhost-only CouchDB, but I now want to distribute the
application across more than one server, so I'm going to have to open
up a hole to allow replication.  My current thoughts are:

1. Keep the Couch instances listening on localhost only, but open SSH
tunnels between them to use for the replication.
    Based on past experience with SSH tunneling, this seems somewhat
fragile, and probably more complex than is warranted.

2. Use iptables to lock down access on each server so that only the
other server can connect to the Couch instance.
   This still moderately increases the complexity (there's an
external configuration to keep in sync with the Couch configs), but
it seems simpler and less likely to break than the SSH solution, while
still offering moderate security (in combination with configuring
Couch to require admin credentials).

Any other thoughts?  Advice welcome.





replication users
 Are there any special considerations when replicating the _users
database as opposed to normal databases?  Is this a good way to share
users between servers that should share users and trust one another?


Replication failure
Couchers,

I'm trying to replicate a database down to my local workstation, but
the replication seems to fail after a certain point:

$ curl -X POST http://*******:*******@localhost:5984/_replicate \
    -d '{"source":"http://*******:*******@couch.mydomain.com:5984/backlogger","target":"backlogger","create_target":true}'

{"error":"json_encode","reason":"{bad_term,{couch_rep_reader,'-open_doc_revs/3-fun-1-',\n
                          
[{[{<<\"error\">>,<<\"unauthorized\">>},\n
                             {<<\"reason\">>,\n
        <<\"You are not authorized to access this
db.\">>}]},\n

{http_db,\"http://*******:*******@couch.mydomain.com:5984/backlogger/\",\n
                                     [],[],\n
            [{\"User-Agent\",\"CouchDB/1.0.0\"},\n
                  {\"Accept\",\"application/json\"},\n
                      {\"Accept-Encoding\",\"gzip\"}],\n
                       [],get,nil,\n
   [{response_format,binary},\n
{inactivity_timeout,30000},\n
{max_sessions,10},\n
{max_pipeline_size,10}],\n
10,500,nil}]}}"}

Note that it did create the database locally, and that 130 out of the
137 docs seem to have made it over:

$ curl localhost:5984/backlogger

{"db_name":"backlogger","doc_count":130,"doc_del_count":32,"update_seq":162,"purge_seq":0,"compact_running":false,"disk_size":348249,"instance_start_time":"1279204723544778","disk_format_version":5}

$ curl http://*******:*******@couch.mydomain.com:5984/backlogger

{"db_name":"backlogger","doc_count":137,"doc_del_count":39,"update_seq":402,"purge_seq":0,"compact_running":false,"disk_size":290909,"instance_start_time":"1279204391630006","disk_format_version":5}

I re-ran the replication, and the number of docs locally went up to
135, but it still failed with the same error message. After that,
re-running the replication seemed to cause no further changes to my
local database.

Any ideas how I can complete the replication?

FYI, I'm running 1.0.0 locally and 0.11.0 on the remote server.


Cheers,

Zach


repo replication
Dear All,
I have a question regarding Jackrabbit replication. The situation is
the following:
We are using JBoss Drools, which has Jackrabbit 1.3 as its repository.
For business continuity reasons we have 2 servers, though only one of
them is running (I mean the application server (JBoss)), while the DB of
the running one is replicated to the DB of the standby one.
I have to ensure that in case of emergency, when we need to start the
standby one, it has the same content (the rules) as the one that usually
runs. Do you have any ideas how to replicate the data, given that the
standby application is not running (and neither is its Jackrabbit)?
I thought of:
1. using the DB (that is replicated) as the datastore of Jackrabbit
2. replicating the repository folder somehow...
I'm not sure either of the two would work.
If you have any experience with such problems and could share it,
I'd be glad.
Thanks,
Daniel




Duplicate a node (replication).
Hi.

  I have a cluster with only 1 node holding a lot of data (500 GB).
  I want to add a new node with the same data (with a ReplicationFactor of 2).

The normal method is:
stop the node.
add a node.
change the replication factor to 2.
start the nodes.
run nodetool repair.

  But I don't know whether this other method is valid, and whether it can
be faster:
stop the nodes.
copy all SSTables.
change the replication factor.
start the nodes.
run nodetool repair.

  Do you have an idea which valid method is faster?

Thx.