How to specify HBase cluster end-points from HBase client code in HBase 0.20.0

Jul 7, 2010
Jun Li
Hello,

In my current application environment, I need to run two HBase
clusters in two different racks, forming a fault-tolerant group that
can survive a power failure. I also have an HBase client, sitting
outside both clusters, that makes calls to each of them.

In my previous work, I simply used the “HTable” class and passed in an
instance of HBaseConfiguration. To construct the HBaseConfiguration
instance, I only needed to pass in the path to “hbase-site.xml”, and
in hbase-site.xml there was only one parameter, “hbase.rootdir”, that
needed to be configured.

Before HBase 0.20.0, there was a parameter called “hbase.master” that
I could specify. In HBase 0.20.0 I found that it no longer works,
likely because the HBase master is now managed by ZooKeeper and the
master node has become dynamic.

Could you show me which APIs I need to use to specify the end-point
address of an HBase cluster for HBase client invocations?

Regards,

Jun
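
For readers with the same question: in HBase 0.20 the client is generally understood to locate the master through ZooKeeper, so a cluster's end-point is given as its ZooKeeper quorum rather than as a master address. The sketch below uses only the JDK and assumes the property names hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort apply to your release. The class name, method name, and host names are hypothetical. With the HBase jar on the classpath, each map would be copied into its own HBaseConfiguration via conf.set(key, value) before constructing an HTable, one configuration per cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects the client settings that identify one HBase cluster. */
class ClusterEndpoints {

    /** Builds the ZooKeeper-related settings for a single cluster. */
    static Map<String, String> settingsFor(String zkQuorum, int zkClientPort) {
        Map<String, String> settings = new LinkedHashMap<>();
        // Comma-separated ZooKeeper hosts of the target cluster (assumed property name).
        settings.put("hbase.zookeeper.quorum", zkQuorum);
        // Port the ZooKeeper ensemble listens on (assumed property name).
        settings.put("hbase.zookeeper.property.clientPort", Integer.toString(zkClientPort));
        return settings;
    }

    public static void main(String[] args) {
        // Placeholder host names, one quorum per rack.
        Map<String, String> rackA = settingsFor("zk1.rack-a.example.com,zk2.rack-a.example.com", 2181);
        Map<String, String> rackB = settingsFor("zk1.rack-b.example.com,zk2.rack-b.example.com", 2181);
        System.out.println("cluster A: " + rackA);
        System.out.println("cluster B: " + rackB);
    }
}
```

Keeping the two sets of settings separate means the client can fail over by switching which configuration it builds its tables from.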


Similar Threads
Smallest production HBase cluster
Is anyone able to share their experience or thoughts on the 'smallest'
production HBase cluster in operation? I am thinking there may be some point
on the node-count scale where one transitions from "that's silly" to
"that's actually more like it".

Is anyone out there with a small HBase cluster in operation, with < 10
nodes, able to share any information?

I notice on http://wiki.apache.org/hadoop/Hbase/PoweredBy that some
users have just a 3-node cluster. Perhaps that's out of date, but I am
curious to hear from the community where people think 'the line' needs
to be drawn on usage of HBase.

To take things to an extreme, is there anyone actually running a _single_
HBase node (one would hope that machine is designed to be a
bit more HA than normal), just to take advantage of a column-oriented store?

thanks,

Paul

Looking for a sample mapreduce code that does bulk imports to HBase
Hi Guys,

I am new to HBase. It's great to join this community and I
hope we can all learn more about HBase from each other. :)

I have been trying to do bulk imports into HBase as described in
http://hbase.apache.org/docs/current/...summary.html#bulk

I wonder if anyone has sample code to offer so that I can get an idea
of how this works. Any advice would be really appreciated.

Thanks,

Han

Re: Handling downtime from hbase.client
That's probably temporary, else there's always google ;)

J-D

On Tue, Jul 13, 2010 at 10:58 AM, Justin Cohen <justin### @teamaol.com> wrote:
 Where's the best place to search the mailing list?  The archives here don't
 seem to work: http://mail-archives.apache.org/mod_mbox/hbase-user/

 Thanks!

 -justin

 On 7/13/10 10:47 AM, Jean-Daniel Cryans wrote:
>
> This kind of issue was discussed a couple of times on this mailing
> list. Basically, you can play with hbase.client.pause and
> hbase.client.retries.number but you won't find this satisfactory,
> which is why we opened
> https://issues.apache.org/jira/browse/HBASE-2445
>
> J-D
>
> On Tue, Jul 13, 2010 at 10:38 AM, Justin Cohen <justin### @teamaol.com> wrote:
>
>>
>> Is there a way to configure hbase.client call timeouts?  If we have a
>> network outage or if hbase/zk goes down for some reason, we want our
>> table.puts to timeout reasonably (5-15 seconds) so we can queue them up for
>> later.  We also want scans and gets to timeout so we can fail gracefully.
>>  I've played with some of the retries and pause configs but I can't seem to
>> get a consistent timeout. It's often over 30 seconds, and occasionally it
>> never times out.
>>
>> Also, when hbase does come back, there seems to be a good 30-60 second delay
>> before we can contact region servers.
>>
>> Right now I'm testing this using hbase in standalone mode, bring up and down
>> stop-hbase.sh and start-hbase.sh
>>
>> Any tips?
>> Justin
>>
>>



Handling downtime from hbase.client
Is there a way to configure hbase.client call timeouts?  If we have a 
network outage or if hbase/zk goes down for some reason, we want our 
table.puts to timeout reasonably (5-15 seconds) so we can queue them up 
for later.  We also want scans and gets to timeout so we can fail 
gracefully.  I've played with some of the retries and pause configs but 
I can't seem to get a consistent timeout. It's often over 30 seconds, 
and occasionally it never times out.

Also, when hbase does come back, there seems to be a good 30-60 second 
delay before we can contact region servers.

Right now I'm testing this using hbase in standalone mode, bring up and 
down stop-hbase.sh and start-hbase.sh

Any tips?
Justin
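
The hbase.client.pause and hbase.client.retries.number settings mentioned elsewhere on this page govern retry behaviour rather than an overall deadline. One client-side workaround, sketched here with java.util.concurrent only, is to run the blocking call on a worker thread and stop waiting after a fixed time. The class and method names are hypothetical, the slow task in main is a stand-in for a table put, and cancelling the future is only a request: the underlying call may keep running in the background.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bounds how long the caller waits for a blocking call. */
class BoundedCall {
    // Daemon worker threads, so a stuck call cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the task on a worker thread and waits at most the given time.
     * Returns the fallback if the task times out, fails, or the wait is interrupted.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit, T fallback) {
        Future<T> future = POOL.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Stop waiting; the interrupt is only a request and the call may continue.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            // The task itself threw an exception.
            return fallback;
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a put that hangs during an outage.
        Callable<Boolean> slowPut = () -> { Thread.sleep(5000); return true; };
        boolean stored = callWithTimeout(slowPut, 200, TimeUnit.MILLISECONDS, false);
        System.out.println(stored ? "stored" : "timed out, queue for later");
    }
}
```

A caller could treat the fallback value as the signal to queue the write for a later retry, which matches the 5-15 second budget described above.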


Using Pig with HBase
Greetings.

I'm trying to query HBase using Pig, but I am doing something wrong and cannot
figure out what exactly.

1. First, I create a table in HBase:

hbase(main):001:0> create 'test_table', 'test_family'

and add values to it:

hbase(main):002:0> put 'test_table', '1', 'test_family:body', 'body1'
hbase(main):003:0> put 'test_table', '1', 'test_family:value', 'value1'
hbase(main):009:0> scan 'test_table'

ROW                          COLUMN+CELL
 1                           column=test_family:body, timestamp=1279710032517, value=body1
 1                           column=test_family:value, timestamp=1279710094584, value=value1

So, now I have something in HBase.


2. After that, I try to get data from HBase using Pig:

grunt> A = load 'test_table' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('test_family:body test_family:value');
grunt> DUMP A;

Then I get an error message:

2010-07-21 06:01:58,387 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2118: Unable to create input splits for: test_table



Could you please help me to find where I keep screwing up?

Thank you.




Flume -> HBase
Hi,

Flume [1] looks nice.
Can it write to HBase out of the box?
I'm asking because I see HBase mentioned in its User Guide, but kind of in
passing.

[1] http://archive.cloudera.com/cdh/3/flume/UserGuide.html



Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



Fwd: hbase fsck
If you need hbase fsck for pre-0.89 release, see below.

---------- Forwarded message 
how hands off is hbase?
If I am a lone developer managing both the development of an online
application and server administration, is hbase for me?

Is this a hands-off type of deployment, or does it require a fair bit of
server monitoring etc.?

Thanks.


transactional hbase
Hi,

I have a write API call that does 3 puts (2 puts  to one table and a
3rd put to a second table). How do I go about making sure that these
all happen or none happen at all?  In short, an atomic transaction.

I read a little about the TransactionManager and that I need to
modify hbase-site.xml to run transactional region servers. Can
someone point me to more information about this? Which versions, what
performance impact, etc.? Are there any good URLs that anyone can
share?

I am using HBase 0.20.3 at this time. I don't believe transactions are
supported in this version. If I were to go live and then decide to use
transactions later, how should I plan the upgrade?

Thanks.


Hbase search
Hi,

I am experimenting with HBase for a search application. I have done a bulk
import of input data into an HBase table using MapReduce.

Input format:  <key> <value>
I used SampleUploader.java (from the examples) to populate the table.
The key is stored as the row key and the value as a column.

I would like to know how to search this table for a specific column
value.
I cannot find any MapReduce examples that search a table. Please
share any pointers or let me know how this is done.

Should I use the HBase client APIs, as in DemoClient.java, to search the
table?

Regards,
IlayaRaja




+91 97691 67921

HBase on Hadoop 0.21
Hi,

I've checked the Release Notes of HDFS 0.21 and saw that two fixes from
hadoop-append are included and two others are not, but there are still some
more that have to do with sync stuff.
Is hadoop-append for HBase made obsolete by HDFS 0.21?

Thank you,

Thomas Koch, http://www.koch.ro


hbase 0.20.5 and maven2
Hi,

Does anyone know where I can find hbase-0.20.5 in a maven repository?

Thanks in advance!
Fabiano


zookeeper & HBase
I'm trying to work out our deployment layout. I read in one of the articles/FAQs
(probably JG's) that it's better to
have ZooKeeper on a separate cluster/separate set of machines. I'm assuming
that is the right approach.

All our transactions are HBase (inserts, MapReduce with one table as input and another
table as output, other queries, ...).
Based on the other thread on locality, I'll put the RegionServer and DataNode on the
same hosts.

If these boxes have enough capacity, do we need to put ZooKeeper on a
separate cluster?
If it is on a separate cluster, my understanding is that ZooKeeper has a much
smaller memory footprint than
HRegionServer/DataNodes, and it shouldn't need that much CPU
either. Correct?

Is there any suggested guidance on the number of ZooKeeper nodes vs. the number of
region servers? I'm looking for some ratio: say, for a 10-node cluster,
how many ZooKeeper nodes?

Please ignore this if it is outside the etiquette.
Thanks,
venkatesh 
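
On the sizing question: a ZooKeeper ensemble stays available only while a strict majority of its servers is running, so the ensemble size is commonly picked from the number of failures to be tolerated rather than from a ratio to region servers. The arithmetic can be sketched as follows; the class and method names are hypothetical and nothing here calls ZooKeeper itself.

```java
/** Arithmetic sketch for sizing a ZooKeeper ensemble; names are hypothetical. */
class ZkEnsembleMath {

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can be lost while a majority is still running. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int size : new int[] {1, 3, 4, 5, 7}) {
            System.out.println(size + " servers: majority " + majority(size)
                    + ", tolerates " + toleratedFailures(size) + " failure(s)");
        }
    }
}
```

By this arithmetic, 3 servers tolerate one failure and 5 tolerate two, while 4 tolerate no more than 3 do, which is why odd ensemble sizes are usually suggested.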




HBase 0.89 and JDK version
Hi,

We recently upgraded our QA cluster to Cloudera's distribution version 3 (CDH3), which has
HBase 0.89. Our cluster is running JDK 1.6.0_18. On trying to
start HBase, it gives the error “you're running jdk 1.6.0_18
which has known bugs”, even though Pig and Hive seem to work fine with
this version of the JDK.

Any thoughts on why I am seeing this error?

If there is a bug in this JDK version, what is recommended: upgrading the
JDK to update 19, 20, or 21 (21 was released this month), or downgrading the JDK
version?



Thanks for the support.



Regards

-SW



#java -version
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)

stumbleupon and hbase
I realize that stumbleupon uses hbase for su.pr, and is currently using
hbase for new functionality but isn't necessarily going back and re-coding
everything to fit into the hbase model.

Having said that, do you guys think hbase could very well be used for things
like:

1. when a user logs in, keep the user session in hbase?
2. for pages like:

http://www.stumbleupon.com/url/blogs....sselback-potatoes



So this involves all elements on the page, would this be possible and more
importantly make sense with hbase?

3. What sort of things/functionality have you found NOT to be suitable, in
your experience?

Thanks for your insights!


Using hbase 2473
Hi,
we're considering using hbase 2473.
It would be nice if someone could share their experience: which hbase version
they used and their strategy for determining key ranges.

Thanks


Re: HBase 0.20.5 issues
I reset the whole hadoop configuration to almost default. PE now completes
writing 1000000 rows, but regions still end up assigned to multiple RSs.

hbase(main):001:0> status 'detailed'
version 0.20.5
0 regionsInTransition
6 live servers
    uasstse005.ua.sistyma.com:60020 1277985198620
        requests=0, regions=2, usedHeap=25, maxHeap=1196
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
        -ROOT-,,0
            stores=1, storefiles=3, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    stas-node.ua.sistyma.com:60020 1277985198573
        requests=0, regions=1, usedHeap=22, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    uasstse004.ua.sistyma.com:60020 1277985198572
        requests=0, regions=1, usedHeap=23, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    uasstse006.ua.sistyma.com:60020 1277985198554
        requests=0, regions=0, usedHeap=33, maxHeap=1196
    uasstse002.ua.sistyma.com:60020 1277985198667
        requests=0, regions=1, usedHeap=34, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    uasstse003.ua.sistyma.com:60020 1277985198550
        requests=0, regions=1, usedHeap=22, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
0 dead servers


On Wed, Jun 30, 2010 at 2:49 PM, Stanislaw Kogut <skog### @sistyma.net> wrote:

 See clean logs from scratch for hadoop and hbase after start with clean
 hbase rootdir.

 http://sp.sistyma.com/hbase_logs.tar.gz


 On Tue, Jun 29, 2010 at 8:46 PM, Stack <sain### @gmail.com> wrote:

> Something is seriously wrong with your setup.  Please put your master logs
> somewhere we can pull from.   Enable debug too.  Thanks
>
> On Jun 29, 2010, at 10:29 AM, Stanislaw Kogut <skog### @sistyma.net> wrote:
>
> > 1. Stopping hbase
> > 2. Removing hbase.root.dir from hdfs
> > 3. Starting hbase
> > 4. Doing major_compact on .META.
> > 5. Starting PE
> >
> > 10/06/29 20:17:30 INFO hbase.PerformanceEvaluation: Table {NAME => 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]} created
> > 10/06/29 20:17:30 INFO hbase.PerformanceEvaluation: Start class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at offset 0 for 1048576 rows
> > 10/06/29 20:17:42 INFO hbase.PerformanceEvaluation: 0/104857/1048576
> > 10/06/29 20:17:55 INFO hbase.PerformanceEvaluation: 0/209714/1048576
> > 10/06/29 20:18:13 INFO hbase.PerformanceEvaluation: 0/314571/1048576
> > 10/06/29 20:18:29 INFO hbase.PerformanceEvaluation: 0/419428/1048576
> > 10/06/29 20:22:37 ERROR hbase.PerformanceEvaluation: Failed
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server  -- nothing found, no 'location' returned, tableName=TestTable, reload=true -- for region , row '0000511450', but failed after 11 attempts.
> > Exceptions:
> > java.io.IOException: HRegionInfo was null or empty in .META.
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> >
> >    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocationForRowWithRetries(HConnectionManager.java:1087)
> >    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.access$200(HConnectionManager.java:240)
> >    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.getRegionName(HConnectionManager.java:1183)
> >    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1160)
> >    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230)
> >    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
> >    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:621)
> >    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:637)
> >    at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:889)
> >    at org.apache.hadoop.hbase.PerformanceEvaluation.runNIsOne(PerformanceEvaluation.java:907)
> >    at org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:939)
> >    at org.apache.hadoop.hbase.PerformanceEvaluation.doCommandLine(PerformanceEvaluation.java:1036)
> >    at org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:1061)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >
> >
> > On Tue, Jun 29, 2010 at 8:03 PM, Stack <sta### @duboce.net> wrote:
> >
> >> For sure you are removing the hbase dir in hdfs?
> >>
> >> Try major compaction of your .META. table?
> >>
> >> hbase> major_compact ".META."
> >>
> >> You seem to be suffering  HBASE-1880 but if you are removing the hbase
> >> dir, you shouldn't be running into this.
> >>
> >> St.Ack
> >>
> >>
> > --
> > Regards,
> > Stanislaw Kogut
> > Sistyma LLC
>

 --
 Regards,
 Stanislaw Kogut
 Sistyma LLC







About HDFS-630 and hbase 0.20.5
http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#overview_description

Here it states that it is recommended to use the HDFS-630 patch for Hadoop.
So why does hbase 0.20.5 contain a stock hadoop 0.20.2 jar?
(Hadoop 0.20.2 does not have HDFS-630 fixed.)

Secondly, which patch is the right one for Hadoop 0.20.2? There are
several on this page: https://issues.apache.org/jira/browse/HDFS-630

Finally, perhaps there are other recommended patches as well?

Thanks.

Ferdy
<https://issues.apache.org/jira/browse/HDFS-630>


NPE in IHBase with HBase 0.20.5
Hi,

I found a bug today in IHBase when used with HBase 0.20.5 that triggers NPEs
during an index scan.  I logged an issue on github and posted a patch in the
comments (what's the preferred method of posting patches there?).

http://github.com/ykulbak/ihbase/issues/#issue/7

Are IHBase issues tracked in the HBase JIRA?  Should I have posted there
instead?  Also, the patch is trivial, but code reviews are always welcome.

Thanks,
James


how is facebook using hbase?
In what context/feature is facebook using hbase?