Rebalance / fsck

Jan 16, 2012
Arkadi Colson
Hi,

Our MogileFS setup has 2 strange problems.

We are running MogileFS at 2 locations: GS and DCG (as you can see in 
the hostnames of the storage hosts). There are as many hosts at GS as at 
DCG, and the disk space is equal too. However, 'mogadm check' shows 
that more space is used at GS than at DCG.

To solve this problem, I started a rebalance. After that I saw that the 
file counts were too high for some domains, so I ran an fsck again, which 
left the cluster unbalanced again. Maybe there's a problem with 
the "HostsPerNetwork" plug-in?

Has anyone seen similar behavior?

Thx!

Arkadi

mogfs2-dcg:~# mogadm check
Checking trackers...
   10.0.1.97:7001 ... OK
   10.0.1.81:7001 ... OK
   10.0.1.82:7001 ... OK
   10.0.1.98:7001 ... OK

Checking hosts...
   [ 3] mogstore1-gs ... OK
   [ 4] mogstore1-dcg ... OK
   [ 5] mogstore2-gs ... OK
   [ 6] mogstore2-dcg ... OK
   [ 7] mogstore3-dcg ... OK
   [ 8] mogstore4-dcg ... OK
   [ 9] mogstore3-gs ... OK
   [10] mogstore4-gs ... OK
   [11] mogstore5-gs ... OK
   [12] mogstore6-gs ... OK
   [13] mogstore5-dcg ... OK
   [14] mogstore6-dcg ... OK
   [15] mogstore7-dcg ... skipping; status = dead
   [16] mogstore7-gs ... skipping; status = dead
   [17] mogstore8-dcg ... OK
   [18] mogstore8-gs ... OK

Checking devices...
   host device         size(G)    used(G)    free(G)   use%   ob state   I/O%
   ---- ------------ ---------- ---------- ---------- ------ ---------- -----
   [ 3] dev211         693.814    673.524     20.290  97.08%  writeable   2.0
   [ 3] dev212         693.814    673.243     20.571  97.04%  writeable   2.0
   [ 3] dev213         693.814    672.518     21.296  96.93%  writeable   2.0
   [ 3] dev214         693.814    676.423     17.391  97.49%  writeable   2.0
   [ 4] dev111         693.814    658.032     35.782  94.84%  writeable  14.4
   [ 4] dev112         693.814    658.052     35.762  94.85%  writeable  14.4
   [ 4] dev113         693.814    658.030     35.784  94.84%  writeable  14.4
   [ 4] dev114         693.814    658.031     35.783  94.84%  writeable  14.4
   [ 5] dev221         693.814    672.216     21.598  96.89%  writeable  99.2
   [ 5] dev222         693.814    671.336     22.478  96.76%  writeable  99.2
   [ 5] dev223         693.814    676.444     17.370  97.50%  writeable  99.2
   [ 5] dev224         693.814    672.799     21.015  96.97%  writeable  99.2
   [ 6] dev121         693.814    658.054     35.760  94.85%  writeable  12.8
   [ 6] dev122         693.814    658.030     35.784  94.84%  writeable  12.8
   [ 6] dev123         693.814    657.987     35.827  94.84%  writeable  12.8
   [ 6] dev124         693.814    658.043     35.771  94.84%  writeable  12.8
   [ 7] dev131        1396.941   1324.878     72.063  94.84%  writeable  23.2
   [ 7] dev132        1396.941   1324.888     72.053  94.84%  writeable  26.8
   [ 7] dev133        1396.941   1324.913     72.027  94.84%  writeable  18.4
   [ 7] dev134        1396.941   1324.868     72.072  94.84%  writeable  26.0
   [ 8] dev141        1396.941   1324.896     72.045  94.84%  writeable  22.4
   [ 8] dev142        1396.941   1324.892     72.049  94.84%  writeable   1.6
   [ 8] dev143        1396.941   1324.935     72.006  94.85%  writeable   6.0
   [ 8] dev144        1396.941   1324.898     72.043  94.84%  writeable  13.2
   [ 9] dev231        1396.941   1361.915     35.025  97.49%  writeable   0.0
   [ 9] dev232        1396.941   1361.917     35.023  97.49%  writeable   0.0
   [ 9] dev233        1396.941   1361.916     35.025  97.49%  writeable   0.0
   [ 9] dev234        1396.941   1361.916     35.025  97.49%  writeable   4.8
   [10] dev241        1396.941   1361.920     35.021  97.49%  writeable   0.0
   [10] dev242        1396.941   1361.920     35.021  97.49%  writeable   0.4
   [10] dev243        1396.941   1361.921     35.020  97.49%  writeable   0.0
   [10] dev244        1396.941   1361.917     35.024  97.49%  writeable   4.4
   [11] dev251        1731.932   1688.430     43.502  97.49%  writeable   0.0
   [11] dev252        1740.630   1696.996     43.634  97.49%  writeable   0.0
   [11] dev253        1740.630   1696.902     43.728  97.49%  writeable  10.4
   [11] dev254        1740.630   1696.986     43.644  97.49%  writeable   0.0
   [12] dev261        1731.932   1688.511     43.421  97.49%  writeable   1.2
   [12] dev262        1740.630   1696.956     43.674  97.49%  writeable   0.0
   [12] dev263        1740.630   1696.982     43.648  97.49%  writeable   2.4
   [12] dev264        1740.630   1696.992     43.637  97.49%  writeable   0.0
   [13] dev151        1731.932   1642.595     89.337  94.84%  writeable   4.4
   [13] dev152        1740.630   1650.866     89.764  94.84%  writeable   0.0
   [13] dev153        1740.630   1650.869     89.760  94.84%  writeable   7.2
   [13] dev154        1740.630   1650.271     90.359  94.81%  writeable   8.4
   [14] dev161        1731.932   1642.605     89.327  94.84%  writeable   6.4
   [14] dev162        1740.630   1650.850     89.780  94.84%  writeable  13.6
   [14] dev163        1740.630   1650.863     89.767  94.84%  writeable   2.4
   [14] dev164        1740.630   1650.847     89.783  94.84%  writeable   2.0
   [17] dev181        1740.285   1259.220    481.065  72.36%  writeable   4.5
   [17] dev182        1740.285   1256.951    483.333  72.23%  writeable   4.2
   [17] dev183        1740.285   1260.386    479.898  72.42%  writeable   4.6
   [17] dev184        1740.285   1259.145    481.139  72.35%  writeable   4.2
   [17] dev185        1740.285   1261.260    479.024  72.47%  writeable   4.3
   [17] dev186        1740.285   1256.720    483.565  72.21%  writeable   4.6
   [17] dev187        1740.285   1256.491    483.794  72.20%  writeable   4.4
   [17] dev188        1740.285   1259.831    480.454  72.39%  writeable   4.3
   [18] dev281        1740.285   1302.570    437.714  74.85%  writeable  35.6
   [18] dev282        1740.285   1305.946    434.338  75.04%  writeable  77.6
   [18] dev283        1740.285   1305.258    435.027  75.00%  writeable   7.2
   [18] dev284        1740.285   1306.109    434.175  75.05%  writeable   8.0
   [18] dev285        1740.285   1303.305    436.980  74.89%  writeable  80.0
   [18] dev286        1740.285   1304.394    435.891  74.95%  writeable  94.8
   [18] dev287        1740.285   1306.500    433.785  75.07%  writeable  53.2
   [18] dev288        1740.285   1303.014    437.270  74.87%  writeable  12.8
   ---- ------------ ---------- ---------- ---------- ------
              total: 89111.917  79402.897   9709.020  89.10%
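
To put a number on the GS/DCG difference, the device rows in the 'mogadm check' output can be summed per location. The following is only a rough sketch under a few assumptions: the location is taken to be the suffix after the last hyphen in each hostname (as in mogstore1-gs), the columns are assumed to be laid out as in the listing above, and usage_by_site, HOST_RE, DEV_RE and SAMPLE are hypothetical names invented for this example, not part of mogadm.

```python
import re
from collections import defaultdict

# Host lines look like "[ 3] mogstore1-gs ... OK".
HOST_RE = re.compile(r"\[\s*(\d+)\]\s+(\S+)\s+\.\.\.")
# Device lines look like "[ 3] dev211  693.814  673.524  20.290  97.08% ...".
DEV_RE = re.compile(r"\[\s*(\d+)\]\s+(dev\d+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)")


def usage_by_site(check_output):
    """Sum size/used/free (in GB) per site from 'mogadm check' text.

    Assumption: the site is the part of the hostname after the last '-'.
    """
    hosts = {int(hid): name for hid, name in HOST_RE.findall(check_output)}
    totals = defaultdict(lambda: {"size": 0.0, "used": 0.0, "free": 0.0})
    for hid, _dev, size, used, free in DEV_RE.findall(check_output):
        name = hosts.get(int(hid))
        # Devices whose host is not listed are grouped under "unknown".
        site = name.rsplit("-", 1)[-1] if name else "unknown"
        totals[site]["size"] += float(size)
        totals[site]["used"] += float(used)
        totals[site]["free"] += float(free)
    return dict(totals)


# Two-device excerpt from the output above, used only as demo input.
SAMPLE = """\
Checking hosts...
   [ 3] mogstore1-gs ... OK
   [ 4] mogstore1-dcg ... OK

Checking devices...
   [ 3] dev211         693.814    673.524     20.290  97.08%  writeable   2.0
   [ 4] dev111         693.814    658.032     35.782  94.84%  writeable  14.4
"""

if __name__ == "__main__":
    for site, t in sorted(usage_by_site(SAMPLE).items()):
        pct = 100.0 * t["used"] / t["size"]
        print(f"{site}: used {t['used']:.3f}G of {t['size']:.3f}G ({pct:.2f}%)")
```

Feeding the complete 'mogadm check' output into usage_by_site instead of SAMPLE would give one total per location. Because the patterns match on whitespace rather than on line boundaries, output whose rows were wrapped by a mail client should still parse.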

mogfs2-dcg:~# mogadm class list
  domain               class                mindevcount   replpolicy
-------------------- -------------------- ------------- 