
Region server blocking updates for a region for 90s

Sep 24, 2010
Dan Harvey
Hey,

We seem to have come across a bug in HBase in how it flushes the
memstore when it is full. I think it's related to this issue,
https://issues.apache.org/jira/browse...ction_12672717, but
I'm not sure.

We are currently writing updates to a lot of the rows in a table and,
unfortunately, are writing to the keys in order, so a single region gets
a lot of writes in a short time. I know this isn't good practice, but we
didn't realise it would happen in this task!
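
For readers who hit the same hot-spotting, one common mitigation is to prefix ("salt") each row key with a small hash-derived bucket, so that keys written in order land in different parts of the keyspace. The following is only a minimal sketch of that idea, not something taken from this thread: the function name, the bucket count, and the key layout are all hypothetical.

```python
import hashlib

# Hypothetical bucket count; in practice it would be chosen to match
# the number of regions or region servers that should share the load.
NUM_BUCKETS = 16


def salted_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Return row_key prefixed with a stable, hash-derived bucket id."""
    # Hash the original key so the bucket is deterministic for a given key.
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    # Zero-pad the bucket so the prefixes sort consistently as strings.
    return f"{bucket:02d}-{row_key}"


# Sequential keys end up under several different prefixes.
sequential_keys = [f"doc-{i:06d}" for i in range(1000)]
buckets_used = {salted_key(k).split("-", 1)[0] for k in sequential_keys}
```

The trade-off is that a plain range scan over the original key order no longer works directly: a reader has to issue one scan per bucket and merge the results.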

So after quite a few writes we get:

2010-09-24 16:02:16,619 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 13 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:16,652 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 99 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:16,701 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 25 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:17,197 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 89 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:17,269 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 73 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:17,318 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 92 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:17,357 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 95 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:17,544 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 18 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:17,574 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 56 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size
2010-09-24 16:02:17,722 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 74 on 60020' on region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477:
memstore size 130.2m is >= than blocking 128.0m size

Looking at the code, the first put that gets blocked requests a flush of
the memstore, but nothing seems to happen until 90 seconds later, which
appears to be a hard-coded timeout:

2010-09-24 16:03:42,984 WARN
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Tried to hold up
flushing for compactions of region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 but
have waited longer than 90000ms, continuing

Then the memstore for that region gets flushed in 2s and all the puts are
unblocked.

2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 74 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 56 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 92 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 18 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 25 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 89 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 73 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 95 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 13 on 60020'
2010-09-24 16:03:44,972 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Unblocking updates for region
canonical_documents,aaebeb30-b624-11df-a52e-0024e8453de6,1284984906477 'IPC
Server handler 99 on 60020'
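
To put numbers on the excerpts above, the gap between the log lines can be computed directly from their timestamps. This is just a rough sketch using three timestamps copied from the logs quoted in this message; the variable names are illustrative and it makes no assumption about HBase internals.

```python
from datetime import datetime

# Timestamp layout used by the region server log lines quoted above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS,mmm' log timestamp."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# First "Blocking updates" line, the MemStoreFlusher WARN line,
# and the "Unblocking updates" lines, all taken from the excerpts.
first_blocked = parse_log_time("2010-09-24 16:02:16,619")
flusher_gave_up_waiting = parse_log_time("2010-09-24 16:03:42,984")
unblocked = parse_log_time("2010-09-24 16:03:44,972")

# How long the first handler stayed blocked, in seconds.
blocked_for = (unblocked - first_blocked).total_seconds()
# How long the flush took once the flusher stopped waiting, in seconds.
flush_took = (unblocked - flusher_gave_up_waiting).total_seconds()

print(f"blocked for {blocked_for:.3f}s, flush took {flush_took:.3f}s")
```

By this measure the first handler was blocked for roughly 88 seconds, while the flush itself finished about 2 seconds after the flusher stopped waiting.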

Is there a reason for HBase to block for this long before flushing, or
does it seem to be a bug?

If no one else is seeing this, is there a way to reduce the chance of it
happening to a region?

Thanks,



Mendeley Limited | London, UK | www.mendeley.com
Registered in England and Wales | Company Number 6419015

