Best unofficial Apache Server developers community
Jun 2, 2010
Dhruba Borthakur
Similar Threads
HBase on Hadoop 0.21
Hi, I've checked the Release Notes of HDFS 0.21 and saw that two fixes from hadoop-append are included and two others are not, along with some more that have to do with sync. Is hadoop-append for HBase made obsolete by HDFS 0.21? Thank you, Thomas Koch, http://www.koch.ro
HBASE/HADOOP Examples
I've found examples using the older mapred interface but not the newer mapreduce interface. I want to write a mapper that is configured to pull out only specific rows (which are the mapper's keys) and a specific column's value (which is the mapper's value). Are there any examples of something like this available? James Kilbride
Re: Rolling out Hadoop/HBase updates
Hey,

We're using stock CDH2 without any patches, so I'm not sure if we have HDFS-630 or not. For HBase we're currently on 0.20.3 and will be testing and moving to 0.20.5 soon.

What I did with this rollout of just config changes was take one region server down at a time and restart the datanode on the same server. So what I gather I should have done was shut down all the region servers before restarting any of the datanodes?

I guess if I split it into different parts it would be:

- HBase
  - Rolling update for point/config releases is supported
  - Update masters first
  - Then update region servers in turn
- HDFS
  - Datanodes don't support rolling updates? (Maybe better in the hdfs list I guess)
  - Take down HBase
  - Take down datanodes
  - Update all the datanodes' code/configs
  - Start datanodes
  - Start HBase

Would you be able to let me know which of these I've got right/wrong?

Thanks,

On 29 June 2010 15:50, Michael Segel <michae### @hotmail.com> wrote:

Dan,

I don't think you can do that because your 'new/updated' node will clash with the rest of the cloud. (We're talking code and not just cloud tuning parameters.) [Read different jars...]

If you're going to push an update out, then it has to be an 'all or nothing' push.

Since we're using Cloudera's release, moving from CDH2 to CDH3 represents a full backup, down the cloud, remove the software completely, and then install new CDH3.

Outside of that major switch, if we were going from one sub release to another, it would be just a
$> yum update hadoop-0.20
call on each node. Again, you have to take the cloud down to do that.

So the bottom line... if you're going to do upgrades, you'll need to plan for some down time.

HTH

-Mike

> From: dan.ha### @mendeley.com
> Date: Tue, 29 Jun 2010 14:43:26 +0100
> Subject: Rolling out Hadoop/HBase updates
> To: us### @hbase.apache.org
>
> Hey,
>
> I've been thinking about how we do our configuration and code updates for
> Hadoop and HBase and was wondering what others do and what is the best
> practice to avoid errors with HBase.
>
> Currently we do a rolling update where we restart the services on one node
> at a time, so shutting down the region server then restarting the datanode
> and task trackers depending on what we are updating and what has changed. But
> with this I have occasionally found errors with the HBase cluster afterwards
> due to a corrupt META table, which I think could have been caused by restarting
> the datanode, or maybe by not waiting long enough for the cluster to sort out
> losing a region server before moving on to the next.
>
> The most recent error upon restarting a node was:
>
> 2010-06-29 10:46:44,970 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error closing files,3822b1ea8ae015f3ec932cafaa282dd211d768ad,1275145898366
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:230)
>
> 2010-06-29 10:46:44,970 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Shutting down HRegionServer: file system not available
> java.io.IOException: File system is not available
> at org.apache.hadoop.hbase.util.FSUtils.checkFileSystemAvailable(FSUtils.java:129)
>
> Followed by this for every region being served:
>
> 2010-06-29 10:46:44,996 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error closing documents,082595c0-6d01-11df-936c-0026b95e484c,1275676410202
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:230)
>
> After updating all the nodes, all the region servers shut down after a
> few minutes reporting the following:
>
> 2010-06-29 11:21:59,508 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-1437671530216085093_2565663 bad datanode[0] 10.0.11.4:50010
>
> 2010-06-29 11:22:09,481 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of hlog
> java.io.IOException: All datanodes 10.0.11.4:50010 are bad. Aborting...
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2542)
>
> 2010-06-29 11:22:09,482 FATAL org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed with ioe:
> java.io.IOException: All datanodes 10.0.11.4:50010 are bad. Aborting...
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2542)
>
> 2010-06-29 11:22:10,344 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to close log in abort
> java.io.IOException: All datanodes 10.0.11.4:50010 are bad. Aborting...
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2542)
>
> This was fixed by restarting the master and starting the region servers
> again, but it would be nice to know how to roll out changes more cleanly.
>
> How do other people here roll out updates to HBase / Hadoop? What order do
> you restart services in and how long do you wait before moving to the next
> node?
>
> Just so you know we currently have 5 nodes and are getting another 10 to add
> soon.
>
> Thanks,
>
> --
> Dan Harvey | Datamining Engineer
> www.mendeley.com/profiles/dan-harvey
>
> Mendeley Limited | London, UK | www.mendeley.com
> Registered in England and Wales | Company Number 6419015
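The full-stop sequence proposed in the reply (take down HBase, take down the datanodes, update them, start the datanodes, start HBase) can be written out as an ordered plan. This is only a sketch of that proposed ordering, which the thread itself leaves unconfirmed; the function name, the step labels, and the host names are hypothetical, and no real service commands are run.

```python
def full_stop_update_plan(datanodes):
    """Return ordered (action, target) steps for an update with planned downtime.

    Assumption: step labels are placeholders, not real commands.
    """
    plan = [("stop hbase", "cluster")]
    # Every datanode is stopped before any of them is updated.
    plan += [("stop datanode", host) for host in datanodes]
    plan += [("update code/configs", host) for host in datanodes]
    plan += [("start datanode", host) for host in datanodes]
    # HBase comes back only after HDFS is fully up again.
    plan.append(("start hbase", "cluster"))
    return plan


# Hypothetical host names, for illustration only.
plan = full_stop_update_plan(["node1", "node2"])
for action, target in plan:
    print(f"{action}: {target}")
```

The point of returning a list instead of executing anything is that the order can be reviewed (or checked in a test) before any service is touched.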
NoClassDefFoundError: org/apache/hadoop/hbase/rest/Main
I am trying to start and stop stargate rest server. I get
ClassNotFoundException intermittently.
I did perform these steps:
- Place the Stargate jar in either the HBase installation root directory or lib/ directories.
- Copy the jars from contrib/stargate/lib/ into the lib/ directory of the HBase installation.
:/usr/local/hbase-0.20.3 hadoop$ ./bin/hbase org.apache.hadoop.hbase.stargate.Main -p 8080
2010-07-03 04:32:39.593::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
2010-07-03 04:32:39.633::INFO: jetty-6.1.14
2010-07-03 04:32:39.908::INFO: Started SocketC### @0.0.0.0:8080
^Z
[1]+ Stopped ./bin/hbase org.apache.hadoop.hbase.stargate.Main -p 8080
:/usr/local/hbase-0.20.3 hadoop$ bg
[1]+ ./bin/hbase org.apache.hadoop.hbase.stargate.Main -p 8080 &
:/usr/local/hbase-0.20.3 hadoop$ ./bin/hbase-daemon.sh start org.apache.hadoop.hbase.rest.Main -p 8080
starting org.apache.hadoop.hbase.rest.Main, logging to /var/hbase/logs/hbase--org.apache.hadoop.hbase.rest.Main-phxradar03.out
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/rest/Main
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.rest.Main
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: org.apache.hadoop.hbase.rest.Main. Program will exit.
ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/mapreduce/TableInputFormat
Hi All,

This is my first mail in the apache mailing list... please bear with me as I am absolutely new to Hadoop and its family. This is my question...

I have some data on my hdfs in the following form.

(number:int,word:chararray, word2:chararray,somethingelse:int)

I want to get this data into a neatly formed HBase Table. I chose the simpler way instead of writing my own udf. I wanted to do this....

register ../hbase/hbase-0.20.4.jar;
register ../hbase/hbase-0.20.4-test.jar;
A = Load '/some_data';
B = STORE A into 'hbase://something' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage;
dump B;

but this is the error I get when I do that

2010-07-22 16:38:35,041 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://MyMachine01:9000
2010-07-22 16:38:35,550 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: MyMachine01:9001
2010-07-22 16:38:35,868 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/mapreduce/TableInputFormat

I have checked my hbase-0.20.4.jar file and it does have a TableInputFormat class. I added the right path to hadoop-env.sh in the CLASSPATH field. I added the conf folder to the classpath and also the test jar. I don't know why it wouldn't work. My HBase installation went really smooth. I am able to check the status of the HBase in the hbase shell and still I get this error. I am totally lost at this point. I would really appreciate any help in this regard.

Thanks a bunch.
V.
How to specify HBase cluster end-points from HBase client code in HBase 0.20.0
Hello, In my current application environment, I need to have two HBase clusters running in two different racks, forming a fault-tolerant group that can tolerate power failure. I also have an HBase client, sitting outside of these two clusters, that makes invocations to both of them. In my previous work, I simply used the “HTable” class and passed in an instance of HBaseConfiguration. To construct the HBaseConfiguration instance, I just passed in the path of the “hbase-site.xml”, and in the hbase-site.xml there was only one parameter, “hbase.rootdir”, that needed to be configured. Before HBase 0.20.0, there used to be a parameter called “hbase.master” that I could specify. But in HBase 0.20.0, I found that it does not work any more, likely because the HBase master is now managed by ZooKeeper and the master node has become dynamic. Could you show me which APIs I need to use in order to specify the end-point address of the HBase cluster for the HBase client invocation? Regards, Jun
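For context, an HBase 0.20 client locates its cluster through ZooKeeper, and the ensemble is named by the `hbase.zookeeper.quorum` property (with `hbase.zookeeper.property.clientPort` for the port). A minimal sketch of generating one client-side hbase-site.xml per cluster follows; the helper name and the host names are hypothetical, and this only builds the config text, it does not talk to HBase.

```python
import xml.etree.ElementTree as ET


def client_site_xml(quorum_hosts, client_port=2181):
    """Build minimal hbase-site.xml text pointing a client at one cluster.

    Assumption: the cluster is identified only by its ZooKeeper quorum.
    """
    configuration = ET.Element("configuration")
    settings = (
        ("hbase.zookeeper.quorum", ",".join(quorum_hosts)),
        ("hbase.zookeeper.property.clientPort", str(client_port)),
    )
    for name, value in settings:
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(configuration, encoding="unicode")


# Hypothetical ZooKeeper hosts for the two racks described in the post.
rack_a_xml = client_site_xml(["zk1.rack-a.example", "zk2.rack-a.example"])
rack_b_xml = client_site_xml(["zk1.rack-b.example"])
print(rack_a_xml)
```

Each generated file would then back its own configuration instance on the client side, one per cluster.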
how I can do to configure/start a hadoop cluster(pseudo distributed) with the last hadoop trunk cod
All, I have followed the instructions on http://wiki.apache.org/hadoop/EclipseEnvironment to download the latest trunk source code and build the .jar files for common, hdfs and mapred. But how should I proceed to configure and start a hadoop cluster (pseudo-distributed) with these latest .jar files? I know how to configure/start a hadoop cluster with the formal hadoop package (hadoop-*.tar.gz with all the stuff of common, hdfs and mapred in it). I googled but didn't find the related information; most of what I found about the steps after compiling covers running unit tests. Can anyone help? Thanks for the help. Best Regards, Fred
Commented: (AVRO-493) hadoop mapreduce support for avro data
[
https://issues.apache.org/jira/browse...6#action_12881606
]
Iván de Prado commented on AVRO-493:
Commented: (AVRO-493) hadoop mapreduce support for avro data
[
https://issues.apache.org/jira/browse...5#action_12881235
]
Doug Cutting commented on AVRO-493:
Commented: (AVRO-493) hadoop mapreduce support for avro data
[
https://issues.apache.org/jira/browse...6#action_12880786
]
Iván de Prado commented on AVRO-493:
Commented: (AVRO-493) hadoop mapreduce support for avro data
[
https://issues.apache.org/jira/browse...3#action_12880743
]
Iván de Prado commented on AVRO-493:
Commented: (AVRO-493) hadoop mapreduce support for avro data
[
https://issues.apache.org/jira/browse...0#action_12880780
]
Harsh J Chouraria commented on AVRO-493:
Using Pig with HBase
Greetings.
I'm trying to query HBase using Pig but do something wrong and cannot
figure out what exactly.
1. First, I create a table in HBase:
hbase(main):001:0> create 'test_table', 'test_family'
and add values to it:
hbase(main):002:0> put 'test_table', '1', 'test_family:body', 'body1'
hbase(main):003:0> put 'test_table', '1', 'test_family:value', 'value1'
hbase(main):009:0> scan 'test_table'
ROW    COLUMN+CELL
1      column=test_family:body, timestamp=1279710032517, value=body1
1      column=test_family:value, timestamp=1279710094584, value=value1
So, now I have something in base.
2. After that, I try to get data from HBase using Pig:
grunt> A = load 'test_table' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('test_family:body test_family:value');
grunt> DUMP A;
Then I get an error message:
2010-07-21 06:01:58,387 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2118: Unable to create input splits for: test_table
Could you please help me to find where I keep screwing up?
Thank you.
transactional hbase
Hi, I have a write API call that does 3 puts (2 puts to one table and a 3rd put to a second table). How do I go about making sure that either all of these happen or none of them happen? In short, an atomic transaction. I read up a little bit about the TransactionManager and that I need to modify hbase-site.xml to make these TransactionalRegion Servers. Can someone point me to some more info about this? Which versions, what performance impacts, etc.? Are there some good URLs that anyone can share? I am using HBase 0.20.3 at this time. I don't believe transactions are supported in this version. If I were to go live and then decide to use transactions later, how should I plan an upgrade? Thanks.
HBase 0.89 and JDK version
Hi,

We recently upgraded our QA cluster to Cloudera Version 3 (CDH3), which has HBase 0.89. Our cluster is running on JDK 1.6.0_18. On trying to start up HBase it gives an error, “you're running jdk 1.6.0_18 which has known bugs”, even though Pig and Hive seem to work fine with this version of the JDK. Any thoughts on why I am seeing this error? If there is a bug in this JDK version, then what is recommended: upgrading the JDK to 19, 20 or 21 (21 was released this month), or downgrading the JDK version?

Thanks for the support.

Regards
-SW

#java -version
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
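A check like the one behind that startup error can be reproduced ahead of a rollout by parsing the `java -version` banner on each node. The sketch below is an illustration only: the function names are hypothetical, and the flagged set contains just the single build named in the post, not a list of known-bad JDKs.

```python
import re

# Assumption: only the build mentioned in the post is flagged here.
FLAGGED_BUILDS = {(1, 6, 0, 18)}


def parse_java_version(version_output):
    """Extract (major, minor, micro, update) from a legacy 'java -version' banner."""
    match = re.search(r'version "(\d+)\.(\d+)\.(\d+)_(\d+)"', version_output)
    if match is None:
        raise ValueError("no legacy-style Java version string found")
    return tuple(int(part) for part in match.groups())


def is_flagged(version_output):
    """Return True when the banner reports one of the flagged builds."""
    return parse_java_version(version_output) in FLAGGED_BUILDS


banner = 'java version "1.6.0_18"\nJava(TM) SE Runtime Environment (build 1.6.0_18-b07)'
print(parse_java_version(banner), is_flagged(banner))
```

Comparing parsed tuples avoids the pitfalls of comparing version strings directly.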
stumbleupon and hbase
I realize that stumbleupon uses hbase for su.pr, and is currently using hbase for new functionality but isn't necessarily going back and re-coding everything to fit into the hbase model. Having said that, do you guys think hbase could very well be used for things like: 1. when a user logs in, keep the user session in hbase? 2. for pages like: http://www.stumbleupon.com/url/blogs....sselback-potatoes So this involves all elements on the page, would this be possible and more importantly make sense with hbase? 3. What sort of things/functionality do you see NOT being suitable in your experiences? Thanks for your insights!
Using hbase 2473
Hi, we're considering using HBASE-2473. It would be nice if someone could share their experience with it: which HBase version is used, and the key range determination strategy. Thanks
Re: HBase 0.20.5 issues
Completely changed all hadoop configuration to almost default, PE completes
writing for 1000000 rows, but regions still come assigned to multiple RS's
hbase(main):001:0> status 'detailed'
version 0.20.5
0 regionsInTransition
6 live servers
    uasstse005.ua.sistyma.com:60020 1277985198620
        requests=0, regions=2, usedHeap=25, maxHeap=1196
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
        -ROOT-,,0
            stores=1, storefiles=3, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    stas-node.ua.sistyma.com:60020 1277985198573
        requests=0, regions=1, usedHeap=22, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    uasstse004.ua.sistyma.com:60020 1277985198572
        requests=0, regions=1, usedHeap=23, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    uasstse006.ua.sistyma.com:60020 1277985198554
        requests=0, regions=0, usedHeap=33, maxHeap=1196
    uasstse002.ua.sistyma.com:60020 1277985198667
        requests=0, regions=1, usedHeap=34, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
    uasstse003.ua.sistyma.com:60020 1277985198550
        requests=0, regions=1, usedHeap=22, maxHeap=1996
        .META.,,1
            stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
0 dead servers
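The symptom in the output above (.META.,,1 reported by several region servers at once) can be spotted mechanically. A minimal sketch, assuming the text layout shown in this post: a server line is `host:port startcode`, and a region line contains at least two commas and no `=`. The function names are hypothetical and the sample is abbreviated from the listing.

```python
import re
from collections import defaultdict

# A server line looks like "host:port startcode".
SERVER_LINE = re.compile(r"^(\S+:\d+)\s+\d+$")


def regions_by_server(status_text):
    """Map each region name to the list of servers that report hosting it."""
    hosts = defaultdict(list)
    current_server = None
    for raw in status_text.splitlines():
        line = raw.strip()
        match = SERVER_LINE.match(line)
        if match:
            current_server = match.group(1)
        elif current_server and "=" not in line and line.count(",") >= 2:
            # Region names have the form table,startkey,id (for example ".META.,,1").
            hosts[line].append(current_server)
    return dict(hosts)


def multiply_assigned(status_text):
    """Return only the regions reported by more than one server."""
    return {r: s for r, s in regions_by_server(status_text).items() if len(s) > 1}


sample = """
6 live servers
uasstse005.ua.sistyma.com:60020 1277985198620
requests=0, regions=2, usedHeap=25, maxHeap=1196
.META.,,1
stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0,
storefileIndexSizeMB=0
-ROOT-,,0
stores=1, storefiles=3, storefileSizeMB=0, memstoreSizeMB=0,
storefileIndexSizeMB=0
stas-node.ua.sistyma.com:60020 1277985198573
requests=0, regions=1, usedHeap=22, maxHeap=1996
.META.,,1
0 dead servers
"""
print(multiply_assigned(sample))
```

Lines are stripped before matching, so the check works whether or not the shell output keeps its indentation.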
On Wed, Jun 30, 2010 at 2:49 PM, Stanislaw Kogut <skog### @sistyma.net> wrote:
See clean logs from scratch for hadoop and hbase after start with clean hbase rootdir.
http://sp.sistyma.com/hbase_logs.tar.gz
On Tue, Jun 29, 2010 at 8:46 PM, Stack <sain### @gmail.com> wrote:
> Something is seriously wrong with your setup. Please put your master logs
> somewhere we can pull from. Enable debug too. Thanks
>
>
>
> On Jun 29, 2010, at 10:29 AM, Stanislaw Kogut <skog### @sistyma.net> wrote:
>
> > 1. Stopping hbase
> > 2. Removing hbase.root.dir from hdfs
> > 3. Starting hbase
> > 4. Doing major_compact on .META.
> > 5. Starting PE
> >
> > 10/06/29 20:17:30 INFO hbase.PerformanceEvaluation: Table {NAME => 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]} created
> > 10/06/29 20:17:30 INFO hbase.PerformanceEvaluation: Start class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at offset 0 for 1048576 rows
> > 10/06/29 20:17:42 INFO hbase.PerformanceEvaluation: 0/104857/1048576
> > 10/06/29 20:17:55 INFO hbase.PerformanceEvaluation: 0/209714/1048576
> > 10/06/29 20:18:13 INFO hbase.PerformanceEvaluation: 0/314571/1048576
> > 10/06/29 20:18:29 INFO hbase.PerformanceEvaluation: 0/419428/1048576
> > 10/06/29 20:22:37 ERROR hbase.PerformanceEvaluation: Failed
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server -- nothing found, no 'location' returned, tableName=TestTable, reload=true -- for region , row '0000511450', but failed after 11 attempts.
> > Exceptions:
> > java.io.IOException: HRegionInfo was null or empty in .META.
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> > org.apache.hadoop.hbase.TableNotFoundException: TestTable
> >
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocationForRowWithRetries(HConnectionManager.java:1087)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.access$200(HConnectionManager.java:240)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.getRegionName(HConnectionManager.java:1183)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1160)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230)
> > at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
> > at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:621)
> > at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:637)
> > at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:889)
> > at org.apache.hadoop.hbase.PerformanceEvaluation.runNIsOne(PerformanceEvaluation.java:907)
> > at org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:939)
> > at org.apache.hadoop.hbase.PerformanceEvaluation.doCommandLine(PerformanceEvaluation.java:1036)
> > at org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:1061)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >
> >
> > On Tue, Jun 29, 2010 at 8:03 PM, Stack <sta### @duboce.net> wrote:
> >
> >> For sure you are removing the hbase dir in hdfs?
> >>
> >> Try major compaction of your .META. table?
> >>
> >> hbase> major_compact ".META."
> >>
> >> You seem to be suffering HBASE-1880 but if you are removing the hbase
> >> dir, you shouldn't be running into this.
> >>
> >> St.Ack
> >>
> >>
> > --
> > Regards,
> > Stanislaw Kogut
> > Sistyma LLC
>
--
Regards,
Stanislaw Kogut
Sistyma LLC
About HDFS-630 and hbase 0.20.5
http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#overview_description Here it states that it is recommended to use the HDFS-630 patch for Hadoop. So why does hbase 0.20.5 contain a stock hadoop 0.20.2 jar? (Hadoop 0.20.2 does not have HDFS-630 fixed.) Secondly, which patch is the right one for Hadoop 0.20.2? There are several on this page: https://issues.apache.org/jira/browse/HDFS-630 Finally, are there perhaps other recommended patches as well? Thanks. Ferdy