Rebalance Stuck

May 23, 2011
Kelp
We have a MogileFS cluster with a stuck rebalance. It won't stop, I
can't change its settings, and because of this I can't run a fsck.

kel### @mogtracker1prod:~$ mogadm rebalance settings
               rebal_host = mogtracker2prod
             rebal_policy = from_percent_used=90 to_percent_free=50 limit=200g limit_by=size limit_type=device
             rebal_signal = stop
kel### @mogtracker1prod:~$ mogadm rebalance status
Rebalance is running
Rebalance status:
             bytes_queued = 5060425864515
           completed_devs = ,
90,118,102,125,95,109,89,93,106,114,101,129,2,110,112,104,124,121,96,126,98,117,119,99
              fids_queued = 10480940
                    limit = 0
             sdev_current = 108
             sdev_lastfid = 3168546
               sdev_limit = 114415671912
              source_devs = 115,92,103,113,91,107,123,97,116,100,128,120,130,122,105,3,94,111
            time_finished = 0
             time_started = 1298078353
             time_stopped = 0
kel### @mogtracker1prod:~$ mogadm rebalance stop
ke### @mogtracker1prod:~$ mogadm rebalance status
Rebalance is running
Rebalance status:
             bytes_queued = 5060425864515
           completed_devs = ,
90,118,102,125,95,109,89,93,106,114,101,129,2,110,112,104,124,121,96,126,98,117,119,99
              fids_queued = 10480940
                    limit = 0
             sdev_current = 108
             sdev_lastfid = 3168546
               sdev_limit = 114415671912
              source_devs = 115,92,103,113,91,107,123,97,116,100,128,120,130,122,105,3,94,111
            time_finished = 0
             time_started = 1298078353
             time_stopped = 0
kel### @mogtracker1prod:~$ mogadm fsck start
rebal_running rebalance running; cannot run fsck at same time

The rebalance stays on the same device (sdev_current = 108) and fid
(sdev_lastfid = 3168546). It has been stuck this way for weeks, but
otherwise the MogileFS cluster seems to work fine and is serving requests.
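
To make "stuck" checkable, here is a minimal sketch that compares two saved copies of the `mogadm rebalance status` output. The helper names (`parse_status`, `is_stalled`, `started_at`) are hypothetical, and treating unchanged `sdev_current`, `sdev_lastfid`, `fids_queued` and `bytes_queued` as "no progress" is an assumption, not something the MogileFS docs state. It only parses text and does not call mogadm itself.

```python
from datetime import datetime, timezone

# Status text copied from the output above (wrapped line kept as posted).
SNAPSHOT = """\
Rebalance is running
Rebalance status:
             bytes_queued = 5060425864515
              fids_queued = 10480940
             sdev_current = 108
             sdev_lastfid = 3168546
              source_devs =
115,92,103,113,91,107,123,97,116,100,128,120,130,122,105,3,94,111
             time_started = 1298078353
"""

# Assumption: these fields change whenever the rebalance makes progress.
PROGRESS_KEYS = ("sdev_current", "sdev_lastfid", "fids_queued", "bytes_queued")


def parse_status(text):
    """Parse 'key = value' lines into a dict of strings.

    Pass only the status output, not shell prompt lines. A line without
    '=' is treated as the wrapped continuation of the previous value.
    """
    status = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, _, value = line.partition("=")
            last_key = key.strip()
            status[last_key] = value.strip()
        elif last_key is not None:
            status[last_key] += line
    return status


def is_stalled(before, after):
    """Return True if none of the progress fields changed between snapshots."""
    return all(before.get(k) == after.get(k) for k in PROGRESS_KEYS)


def started_at(status):
    """Convert the time_started epoch value to a UTC datetime."""
    return datetime.fromtimestamp(int(status["time_started"]), tz=timezone.utc)


if __name__ == "__main__":
    first = parse_status(SNAPSHOT)
    second = parse_status(SNAPSHOT)  # in practice, a later capture
    print("stalled:", is_stalled(first, second))
    print("started:", started_at(first).isoformat())
```

In practice the second snapshot would be captured some minutes after the first. The `time_started` value above decodes to 2011-02-19 UTC, so this rebalance has nominally been running for about three months.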

What could cause this? How can we make it stop so I can run a fsck?

Thanks!

