Best unofficial Apache Server developers community
Aug 19, 2010
Ted Yu
Similar Threads
region servers crashing
Hi all,
We've been having issues for a few days with HBase region servers crashing when under load from mapreduce jobs. There are a few different errors in the region server logs - I've attached a sample log of 4 different region servers crashing within an hour of each other.
Some details:
- This happens when a full table scan from a mapreduce is in progress.
- We are running HBase 0.20.3, with a 16-slave cluster, on EC2.
- Some of the region server errors are NPEs which look a lot like https://issues.apache.org/jira/browse/HBASE-2077. I'm not sure if that is the exact problem or if this issue is fixed in 0.20.5. Is it worth upgrading to 0.20.5 to fix this?
- Some of the region server errors are scanner lease expired errors:
2010-07-12 15:10:03,299 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 86246ms, ten times longer than scheduled: 1000
2010-07-12 15:10:03,299 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x229c72b89360001 to sun.nio.ch.Se### @7f712b3a
java.io.IOException: TIMED OUT
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
2010-07-12 15:10:03,299 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 1779060682963568676 lease expired
2010-07-12 15:10:03,406 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.UnknownScannerException: Name: 1779060682963568676
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
We tried increasing hbase.regionserver.lease.period to 2 minutes but that didn't seem to make a difference here.
- Our configuration and table size haven't changed significantly in those days.
- We're running a 3-node Zookeeper cluster collocated on the same machines as the HBase/Hadoop cluster.
- Based on Ganglia output, it doesn't look like the regionservers (or any of the machines) are swapping.
- At the time of the crash, it doesn't appear that the network was overloaded (i.e. we've seen higher network traffic without crashes). So it doesn't seem that this is a problem communicating with Zookeeper.
- We have "-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode" enabled, so it doesn't seem like we should be pausing due to GC too much.
Any thoughts?
Thanks,
- Dmitry
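The Sleeper warning quoted in this post reports how long the region server process failed to wake up, so it can be compared directly with the scanner lease period. The following is a minimal sketch, assuming the log format shown above; the function name `find_long_sleeps` and the 60,000 ms threshold are illustrative assumptions, not values taken from the post. It only shows whether a stall outlasted a given lease; it does not establish what caused the stall (GC, swap, or a starved VM are all possible).

```python
import re

# Matches the HBase Sleeper warning format quoted in the post above.
SLEEPER_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} WARN "
    r"org\.apache\.hadoop\.hbase\.util\.Sleeper: "
    r"We slept (?P<slept>\d+)ms, ten times longer than scheduled: (?P<sched>\d+)"
)


def find_long_sleeps(lines, lease_period_ms):
    """Return (timestamp, slept_ms) for every stall longer than the lease period.

    lease_period_ms is a caller-supplied threshold (hypothetical helper,
    not part of HBase).
    """
    hits = []
    for line in lines:
        match = SLEEPER_RE.match(line)
        if match and int(match.group("slept")) > lease_period_ms:
            hits.append((match.group("ts"), int(match.group("slept"))))
    return hits


# Two lines copied from the post; only the first is a Sleeper warning.
sample = [
    "2010-07-12 15:10:03,299 WARN org.apache.hadoop.hbase.util.Sleeper: "
    "We slept 86246ms, ten times longer than scheduled: 1000",
    "2010-07-12 15:10:03,299 INFO org.apache.hadoop.hbase.regionserver."
    "HRegionServer: Scanner 1779060682963568676 lease expired",
]

# 60,000 ms is an illustrative threshold, an assumption for this sketch.
long_sleeps = find_long_sleeps(sample, lease_period_ms=60_000)
print(long_sleeps)
```

Running the same scan with a 120,000 ms threshold (the 2-minute lease the post mentions) returns no hits for this particular sample, since 86,246 ms is shorter than two minutes.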
online automatic region merge
Hi guys, It seems that there is no support for automatic region merge in the current implementation of HBase (0.20.5). After searching in jira, I only found a command line utility called onlinemerge in HBASE-1621. If so, any plan on automatic region merge? Thanks.
Region server forced to shut down
Hi all: We applied hadoop/hbase for NLP analysis and intermediate data storage, and encountered a string of problems as the data scale went up. Attached is a region server log that shows how it crashed after running a mapreduce task with intensive reads/writes. The FATAL error seems to occur at the following line: "2010-07-13 18:02:56,723 FATAL org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay of hlog required. Forcing server shutdown" But before this, there are many warnings or errors on reading/creating new blocks. We have been struggling with this for a while, but could not identify how to fix the problem. I wonder if anybody can help us take a look at the log and check out the problem? Thanks in advance. Best, Arber
GC took 299 secs causing region server to die
I kept running into the stop-the-world GC during batch import of data into
hbase. The configuration of a node in the 8-node cluster is as follows.
* 4-core
* 64-bit JVM
* 8 GB of memory
* CDH2 for hadoop and 0.20.5 for hbase
* TT: 128 MB
* DN: 128 MB
* 2 Mappers at 512 MB each
* 2 Reducers at 512 MB each
* 1 regionserver at 4096 MB
The import job was a mapper only job so that only TT, DN, 2 mappers and
regionserver were running. Below is the JMX output for the dead
regionserver.
Time: 2010-07-29 12:25:47
Used: 224,949 kbytes
Committed: 670,728 kbytes
Max: 4,185,792 kbytes
GC time:
5 minutes on ParNew (2,126 collections)
0.000 seconds on ConcurrentMarkSweep (0 collections)
Clearly the regionserver spent all of its GC time on ParNew, which was not
surprising as I was importing tons of data. But I could not figure out why
the same GC that usually takes way less than a second took 299 secs at
line 3. Any enlightenment is greatly appreciated.
I will change ParNew to 6M as documented in the Performance Tuning page and
give it another shot.
2010-07-28T12:06:57.249-0700: 2406.986: [GC 2406.986: [ParNew: 17786K->755K(19136K), 0.0015410 secs] 348288K->331394K(620416K) icms_dc=27 , 0.0016330 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2010-07-28T12:06:57.268-0700: 2407.004: [GC 2407.004: [ParNew: 17580K->761K(19136K), 0.0016710 secs] 348154K->331343K(620416K) icms_dc=27 , 0.0017610 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2010-07-28T12:06:57.288-0700: 2407.024: [GC 2407.088: [ParNew: 17564K->757K(19136K), 299.1513910 secs] 348081K->331283K(620416K) icms_dc=27 , 299.1515120 secs] [Times: user=0.17 sys=0.04, real=299.23 secs]
2010-07-28T12:11:56.558-0700: 2706.294: [GC 2706.294: [ParNew: 17735K->925K(19136K), 0.0094600 secs] 348197K->331458K(620416K) icms_dc=27 , 0.0095670 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2010-07-28T12:11:56.606-0700: 2706.343: [GC 2706.343: [ParNew: 17940K->932K(19136K), 0.0085750 secs] 348473K->331474K(620416K) icms_dc=27 , 0.0086710 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
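In the third GC entry above, the wall-clock time (real=299.23) is far larger than the CPU time the collector actually used (user=0.17 plus sys=0.04), which means the GC threads were mostly not running during the pause. The sketch below flags entries of that shape. It assumes each GC entry has been joined onto a single line; the function name `stalled_collections` and its thresholds are hypothetical choices for illustration, and the check says nothing about why the process was stalled.

```python
import re

# Pulls the ParNew pause and the [Times: ...] figures out of one GC entry.
GC_RE = re.compile(
    r"\[ParNew: .*?, (?P<pause>[\d.]+) secs\].*"
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), "
    r"real=(?P<real>[\d.]+) secs\]"
)


def stalled_collections(entries, min_real_s=1.0, ratio=10.0):
    """Return (entry_number, real_s, cpu_s) where wall-clock time dwarfs CPU time.

    min_real_s and ratio are illustrative thresholds, not JVM settings.
    """
    flagged = []
    for number, entry in enumerate(entries, start=1):
        match = GC_RE.search(entry)
        if match is None:
            continue
        real = float(match.group("real"))
        cpu = float(match.group("user")) + float(match.group("sys"))
        if real >= min_real_s and real > ratio * cpu:
            flagged.append((number, real, round(cpu, 2)))
    return flagged


# Entries 2 and 3 from the log above, each joined onto one line.
entries = [
    "2010-07-28T12:06:57.268-0700: 2407.004: [GC 2407.004: [ParNew: "
    "17580K->761K(19136K), 0.0016710 secs] 348154K->331343K(620416K) "
    "icms_dc=27 , 0.0017610 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]",
    "2010-07-28T12:06:57.288-0700: 2407.024: [GC 2407.088: [ParNew: "
    "17564K->757K(19136K), 299.1513910 secs] 348081K->331283K(620416K) "
    "icms_dc=27 , 299.1515120 secs] [Times: user=0.17 sys=0.04, real=299.23 secs]",
]

stalls = stalled_collections(entries)
print(stalls)
```

A pause with this signature is usually worth checking against OS-level data (swap activity, CPU steal on virtualized hosts) rather than heap sizing alone, though the post itself does not say which applies here.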
Zookeeper exceptions while starting up region
Hi, First of all I'm fairly new to HBase and have set up a small deployment of Hadoop and HBase (0.20.4) on two servers for the beginning in a fully distributed mode. HBase works fine on one server (client operations work perfectly), however starting the second RegionServer throws exceptions which I couldn't resolve. Servers run in a virtualized environment and I want to extend the deployment size as soon as possible to do some benchmarking for my particular purposes. I anonymized my server names a little bit and therefore you will encounter namings like xyz or xxx. My first suspect was /etc/hosts as I don't use a DNS server and Ubuntu adds the IP 127.0.1.1 to localhost by default (which I removed). I replicated the name resolution across the two servers: ***** /etc/hosts on the master and region1: ***** 127.0.0.1 localhost 9.2.18.168 master.x.y.z master 9.2.18.163 region1.x.y.z region1 ***** I get following exceptions on my region1 server: ***** 2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.32-22-generic 2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=xxx 2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/xxx 2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/xxx/hbase-0.20.4 2010-07-06 04:44:17,897 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.r### @555c07d8 2010-07-06 04:44:17,899 INFO org.apache.zookeeper.ClientCnxn: zookeeper.disableAutoWatchReset is false 2010-07-06 04:44:20,650 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server localhost/127.0.0.1:2181 2010-07-06 04:44:20,656 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.Se### @24148662 java.net.ConnectException: Connection refused at 
sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933) 2010-07-06 04:44:20,658 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:656) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:378) at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970) 2010-07-06 04:44:20,658 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:667) at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:386) at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970) 2010-07-06 04:44:20,777 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to set watcher on ZNode /hbase/master org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.watchMasterAddress(ZooKeeperWrapper.java:366) at org.apache.hadoop.hbase.regionserver.HRegionServer.watchMasterAddress(HRegionServer.java:389) at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitializeZooKeeper(HRegionServer.java:319) at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitialize(HRegionServer.java:310) at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:280) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:532) at org.apache.hadoop.hbase.regionserver.HRegionServer.doMain(HRegionServer.java:2443) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2511) ***** My hbase-site.xml looks as follows ***** <configuration> <property> <name>fs.default.name</name> <value>hdfs://master:54310</value> </property> <property> <name>mapred.job.tracker</name> <value>master:54311</value> </property> <property> <name>dfs.replication</name> <value>1</value> </property> <property> <name>dfs.datanode.max.xcievers</name> <value>2047</value> </property> <property> <name>hbase.rootdir</name> <value>hdfs://master:54310/hbase</value> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> </configuration> ***** regionservers ***** master region1 ***** Zookeeper Dump looks as follows ***** hbase(main):003:0> zk_dump HBase tree in ZooKeeper is rooted at /hbase Cluster up? true In safe mode? false Master address: 9.2.18.168:60000 Region server holding ROOT: 9.2.18.168:60020 Region servers: - 9.2.18.168:60020 Quorum Server Statistics: - localhost:2181 Zookeeper version: 3.2.2-888565, built on 12/08/2009 21:51 GMT Clients: /127.0.0.1:38162[1](queued=0,recved=10,sent=0) /127.0.0.1:55368[1](queued=0,recved=29,sent=0) /127.0.0.1:54149[1](queued=0,recved=255,sent=0) /127.0.0.1:38164[1](queued=0,recved=0,sent=0) Latency min/avg/max: 0/5/1192 Received: 294 Sent: 0 Outstanding: 0 Zxid: 0xb Mode: standalone Node count: 11 Thanks in advance!! /Samuru
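One possible reading of the log above: the region server on region1 tries to reach ZooKeeper at localhost:2181 and is refused, and the posted hbase-site.xml does not set `hbase.zookeeper.quorum`, so the client falls back to localhost. The excerpt does not confirm that this was the fix. The sketch below is a small check along those lines; the helper names and the replacement value `master` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse a Hadoop-style <configuration> document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.findall("property")
    }


def quorum_warning(props):
    """Return a warning string if a distributed cluster points ZooKeeper at localhost."""
    distributed = (props.get("hbase.cluster.distributed") or "").strip().lower() == "true"
    quorum = (props.get("hbase.zookeeper.quorum") or "").strip()
    if distributed and quorum in ("", "localhost", "127.0.0.1"):
        return (
            "hbase.zookeeper.quorum is unset or local; a region server on "
            "another host would look for ZooKeeper on its own localhost"
        )
    return None


# Subset of the configuration quoted in the post.
posted = """
<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://master:54310/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
</configuration>
"""

# Hypothetical variant: "master" as the quorum host is an assumption, not from the thread.
with_quorum = posted.replace(
    "</configuration>",
    "  <property><name>hbase.zookeeper.quorum</name><value>master</value></property>\n"
    "</configuration>",
)

warning = quorum_warning(load_properties(posted))
print(warning)
print(quorum_warning(load_properties(with_quorum)))
```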
region server hosting .META. table
Hi, We are using HBase 0.20.5. We recently experienced the loss of the region server hosting the .META. table. If someone can point me to the code which handles this scenario, that would be great.
restarting region server which shutdown due to GC pause
Thanks for the answer.
GC pause seems to be a major cause for region server to come down:
2010-07-21 09:07:14,138 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 291505ms, ten times longer than scheduled: 10000
Is it possible for HBase Master to restart dead region server in this case ?
On Wed, Jul 21, 2010 at 10:02 AM, Jean-Daniel Cryans <jdcry### @apache.org> wrote:
HBaseAdmin.getClusterStatus().getServers()
J-D
On Wed, Jul 21, 2010 at 9:56 AM, Ted Yu <yuzh### @gmail.com> wrote:
> Hi,
> Is there API to query the number of live region servers ?
>
> Thanks
>
querying number of live region servers
Hi, Is there API to query the number of live region servers ? Thanks
write-ahead log and failure recovery for region servers
Hi,
I am in the middle of performance testing my service, which uses HBase as the backend, and I have difficulty explaining the difference between write and read performance, which brings up the following questions.
(1) Write-ahead log
In the default mode of HTable “put”, does the write-ahead log get turned on? If not, what is the API that allows me to turn on the write-ahead log? Without write-ahead logging turned on, is it true that when the region server dies, all the data that has been written by the client but not flushed to persistent HDFS will be lost?
So far I have not seen any documentation on the HBase configuration parameters that specify where the write-ahead log is stored. Does it get stored in a reliable shared file system that the system administrator needs to provide, for example, a SAN? Or does HBase use HDFS as the reliable shared file system? Previously I read some articles saying that HBase is not using (or did not use) HDFS for write-ahead logging, as the “append” operation provided by HDFS is not robust enough.
(2) Failure recovery
Related to the write-ahead log problem that I brought up, could you explain the current mechanism adopted in HBase when a region server fails and the client is making a request to fetch a row from this particular region server? My understanding is that the master node will detect that the region server has failed, then start to migrate the data (including the write-ahead log) that is served by that region to the other region, and update the metadata of the master node after the migration finishes.
The concern that I have is that such a migration takes quite some time, depending on the data size of the failed region server. For a large size, it will take, say, a couple of minutes. During this time window, will all the clients that try to access data rows on that failed region server fail? As a result, the HBase client will need to have a retry window that matches the region data’s migration time.
Thank you very much for your information in advance!
Jun
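The "retry window" concern in this post can be made concrete with a little arithmetic: a client that sleeps between attempts covers an outage only if the sum of its sleeps is at least as long as the outage. The sketch below works that out for a generic pause-times-multiplier schedule. The pause value, the multiplier table, and both function names are illustrative assumptions; they are not presented as HBase defaults, and the actual settings should be read from the client configuration of the version in use.

```python
def retry_window_seconds(pause_s, multipliers):
    """Total time a client spends sleeping across all retry attempts."""
    return sum(pause_s * m for m in multipliers)


def retries_needed(pause_s, multipliers, outage_s):
    """Return the attempt number at which cumulative sleep covers the outage.

    Returns None when the whole schedule is shorter than the outage.
    """
    waited = 0.0
    for attempt, multiplier in enumerate(multipliers, start=1):
        waited += pause_s * multiplier
        if waited >= outage_s:
            return attempt
    return None


# Hypothetical schedule for illustration only.
BACKOFF = [1, 1, 1, 2, 2, 4, 4, 8, 16, 32]

window = retry_window_seconds(2.0, BACKOFF)   # 2 s base pause (assumed)
attempt = retries_needed(2.0, BACKOFF, 120.0)  # "couple of minutes" outage
print(window, attempt)
```

With a 2-second base pause this schedule sleeps for 142 seconds in total and first covers a 120-second outage on the tenth attempt; with a 1-second base pause the same schedule totals 71 seconds and never covers it.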
Avoiding OutOfMemory Java heap space in region servers
Hello,
I'm seeing errors like so:
2010-08-10 12:58:38,938 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$ClientZKWatcher: Got ZooKeeper event, state: Disconnected, type: None, path: null
2010-08-10 12:58:38,939 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: Disconnected, type: None, path: null
2010-08-10 12:58:38,941 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:942)
Then I see:
2010-08-10 12:58:39,408 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 79 on 60020, call close(-2793534857581898004) from 192.168.195.88:41233: error: java.io.IOException: Server not running, aborting
java.io.IOException: Server not running, aborting
And finally:
2010-08-10 12:58:39,514 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stop requested, clearing toDo despite exception
2010-08-10 12:58:39,515 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
2010-08-10 12:58:39,515 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020: exiting
And the server begins to shut down.
Now, it's very likely these are due to retrieving unusually large cells;
in fact, that's my current assumption. I'm seeing M/R tasks fail
intermittently with the same issue on the read of cell data.
My question is why does this bring the whole regionserver down? I would
think the regionserver would just fail the Get(), and move on...
Am I misdiagnosing the error? Or is it the case that if I want different
behavior, I should pony up with some code? :)
Take care,
-stu
RE: Exception "Timed out trying to locate root region" when doing INSERT from Hive to HBa
That did the trick, thanks! Gal From: Ted Yu [mailto:yuzhi### @gmail.com] Sent: Tuesday, July 13, 2010 2:39 AM To: hive-use### @hadoop.apache.org Subject: Re: Exception "Timed out trying to locate root region" when doing INSERT from Hive to HBase Do you have the hbase conf directory on the HADOOP_CLASSPATH ? On Mon, Jul 12, 2010 at 7:57 AM, Gal Barnea <g### @eyeviewdigital.com<mailto:ga### @eyeviewdigital.com>> wrote: Hi everyone, I'm trying to setup HBase/Hive integration using HBase 0.20.3 and the latest HIVE source from svn. I have one machine acting as HBase Master+ZooKeeper with HIVE on it and two RegionServers. I am able to CREATE/DROP the external table in HIVE pointing to HBase, and SELECT from it after inserting data via hbase shell. However when trying to run the following I get the exception below: hive --auxpath $HIVE_SRC/build/hbase-handler/hive_hbase-handler.jar,$HIVE_SRC/hbase-handler/lib/hbase-0.20.3.jar,$HIVE_SRC/hbase-handler/lib/zookeeper-3.2.2.jar -hiveconf hbase.master=Master1:60000 CREATE TABLE pokes (foo STRING, bar STRING); LOAD DATA LOCAL INPATH '/root/hive/data/files/kv1.txt' OVERWRITE INTO TABLE pokes; CREATE TABLE hbase_hive(key string, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val"); insert overwrite table hbase_hive select * from pokes where foo="97"; This fails miserably, and in Hadoop JobTracker I can see java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:248) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at org.apache.hadoop.mapred.Child.main(Child.java:170) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:231) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:487) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:632) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:540) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:549) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:549) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:549) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:549) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:549) at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:225) ... 4 more Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:976) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:625) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:607) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:738) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:634) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:607) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:738) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:638) at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601) at org.apache.hadoop.hbase.client.HTable.(HTable.java:128) at org.apache.hadoop.hbase.client.HTable.(HTable.java:106) at 
org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat.getHiveRecordWriter(HiveHBaseTableOutputFormat.java:75) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:240) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:228) ... 13 more Can someone please shed some light on this? Thanks in advance Gal
Server takes a long time to answer
Hi all, We have a cassandra installation with two nodes in a ring, replication factor = 2. Sometimes cassandra becomes non-responsive: it takes about three minutes before answering a get. Do you have any idea what we should check when it happens, or what could cause the problem? We are using cassandra release 6.3. Thanks & regards. Jean-Yves
puppet dashboard takes a long time to display
I've been noticing my puppet dashboard is taking longer and longer to load in my web browser the longer I've used it. I have roughly 1300 nodes being managed by puppet. It's almost as if there is too much data for it to process. I have about 1 month's worth of data in my mysql database now. I've recently upgraded to version 1.0.3 but it hasn't improved the performance. It typically takes around 60 - 90 seconds for the page to load...the browser sits waiting for a response from the server.
FTP connection timeout after 30 minutes even though set to 2 minutes
Hello, I am using camel 2.4 trying to deliver to an FTP endpoint with the following parameters:
eagerDeleteTargetFile=false
fileName=srml-5-2009-results.xml
ftpClient.connectTimeout=120000
ftpClient.dataTimeout=15000
ftpClient.defaultTimeout=15000
password=aaaa
soTimeout=60000
tempFileName=%24%7Bfile%3Aname.noext%7D.tmp
The camel route does not process anything for 30 MINUTES and then returns the following error:
[2010-08-05 09:37:33,324][pool-1-thread-1][WARN ][org.apache.camel.component.file.remote.RemoteFileProducer][] Writing file failed with: File operation failed: null connect timed out. Code: 0
connectTimeout is set to 120000 so I am expecting it to time out after 2 minutes... this endpoint is specified in a recipientList (if this could be of any help)
How to specify HBase cluster end-points from HBase client code in HBase 0.20.0
Hello, In my current application environment, I need to have two HBase clusters running in two different racks, forming a fault-tolerant group that can tolerate a power failure. I also have an HBase client, sitting outside of these two clusters, that makes invocations to the two HBase clusters. In my previous work, I simply used the “HTable” class and passed in an instance of HBaseConfiguration. To construct the HBaseConfiguration instance, I just needed to pass in the path of the “hbase-site.xml”, and in the hbase-site.xml there is only one parameter, “hbase.rootdir”, that needs to be configured. Before HBase 0.20.0, there used to be a parameter called “hbase.master” that I could specify. But in HBase 0.20.0, I found that it does not work any more, likely because the HBase master is managed by Zookeeper, and the master node has now become dynamic. Could you show me which APIs I need to use in order to specify the end-point address of the HBase cluster for the HBase client invocation? Regards, Jun
Minutes: JDO TCK Conference Call Friday, Aug 13, 9 am PDT
Attendees: Michael Bouschen, Craig Russell
Agenda:
1. Maven2 upgrade https://issues.apache.org/jira/browse/JDO-647 the new pom.xml for api and tck projects have been posted for review.
2. Other issues
Action Items from weeks past:
[6 Aug 10] AI Craig follow up on missing DTD/XSD in process
[6 Aug 10] AI Craig update the download page with JDO 3 artifacts done
[6 Aug 10] AI Craig talk to Apache PR re publicity for 3.0 release nothing yet
[6 Aug 10] AI Michael reply to a poster that we have a new release with a class missing from JDO 2.3 EA
-- Michelle
Craig L Russell
Architect, Oracle
http://db.apache.org/jdo
408 276-5638 mailto:Craig.### @oracle.com
P.S. A good JDO? O, Gasp!
Contributor Meeting Minutes 05/28/2010
This month, the MapReduce + HDFS contributor meeting was held at Cloudera Headquarters. Announcements for contributor meetings are here: http://www.meetup.com/Hadoop-Contributors/ Minutes follow. No decisions were made at this meeting, but the following issues were discussed and may presage future discussion and decisions on these lists. Eli, I think you have all the slides. Would you mind sending them out? -C == 0.21 release update == * Continuing to close blockers, ping people for updates and suggestions * About 20 open blockers. Many are MapReduce documentation that may be pushed. Speak up if 0.21 is missing anything substantive. * Common/HDFS visibility and annotations are close to consensus; MapReduce annotations are committed to trunk and the 0.21 branch == HEP proposal == (what follows is the sketch presented at the meeting. A full proposal with concrete details will be circulated on the list) * Based on- and very similar to- the PEP (Python Enhancement Proposal) Process * Audience is HDFS and MapReduce; not necessarily adopted by other subprojects - Addresses the perception that there is friction between innovation/experimentation and stability * Not for small enhancements, features, and bug fixes. This should not slow down typical development or impede casual contribution to Hadoop * Primary mechanism for new features, collecting input, documenting design decisions * JIRA is good for details, but not for deciding on wide shifts in direction * Purpose is for author to build consensus and gather dissenting opinions. - All may comment, but Editors will review incoming HEP material - Editors determine only whether the HEP is complete, not whether they believe it is a sound idea - Editors are appointed by the PMC - Mechanism for appointing Editors and term of service TBD - Apache Board appoints Shepherds for projects somewhat randomly, to projects. A similar mechanism could work for incoming HEPs - Proposal *may* come with code, but not necessarily. 
Drafting/baking of the HEP occurs in public on a list dedicated to that particular proposal. Once Editors certify the HEP as complete, it is sent to gene### @ for wider discussion. - The discussion phase begins on gen### @. The mailing list exists to ensure the HEP is complete enough to present to the community. - Some discussion on the difference between posting to gen### @ and posting to the HEP list. Completeness is, of course, subjective. If the Editor and Author disagree whether the proposal affects an aspect of the framework enough to merit special consideration, it is not entirely clear how to resolve the disagreement. - In general, the role of the Editor in the community-driven process of Hadoop is not entirely clear. It may be possible to optimize it out. - Once discussion ends, the HEP is passed (or fails to pass) by a vote of the PMC (mechanics undefined). In Python, the result is committed to the repository. A similar practice would make sense in Hadoop. * Which issues require HEPs? - Discussion ranged. Append, backup namenode, edit log rewrite, et al. were examples of features substantial enough to merit a HEP. Pure Java CRC is an example of an enhancement that would not. Whether an explicit process must be in place to determine whether an issue requires a HEP is not clear. - Viewing HEPs as a way of soliciting consensus for an approach might be more accurate. Going through the HEP process should always improve the chances of a successful proposal * Evaluation - The proposal may be rejected if it is redundant with existing functionality, technically unsound, insufficiently motivated, no backwards compatibility story, etc. - Implementation is not necessary, and is lightly discouraged. Feedback is less welcome once code is in hand. - Purpose is to be clear about the acceptance criteria for that issue, e.g. concerns that the proposal may not scale or may harm performance - Dissenting opinions must be recorded accurately. 
Quoting would be a safe practice for the Author to encourage HEP reviewers not to block the product of the proposal. * The testing burden and completion strategy may be ambiguous - Whether the proposal affects scalability may not be testable by the implementer. Completing the proposal to address all use cases may require considerably more work than the Author is willing or motivated to invest. - The HEP discussion on gene### @ should explore whether such objections are merited and reasonable. For example, a particularly obscure/esoteric use case could be included as a condition for acceptance if the dissenter is willing to invest the resources to test/validate it. The process is flexible in this regard. - But it is not infinitely flexible. Backwards compatibility, performance regression, availability, and other considerations need not be called out in every HEP. - Traditional concerns need to be documented. Acceptance criteria should ideally be automated and reproducible in different organizations == Branching == * A patch and a branch are isomorphic from a policy perspective. Of course, they are functionally distinct: branches are easier to collaborate on and are, generally, longer-lived than are patches. But special policies need not be derived to account for these differences, which concern the production of the code, not its review and acceptance. * Some developers find branches to be easier to review than very large patches and easier to merge, given a toolchain that supports this. - Subversion currently is difficult to adapt to this model - Could be done on a HEP-by-HEP basis, as a condition for acceptance * Eclipse Labs - Branded version of Google Code (same functionality, w/ Eclipse brand) - Not official Eclipse projects, but associated with Eclipse - Apache/Hadoop may consider a similar strategy - Distinct from Apache Labs, as one need not be a committer, follow its rules for releases, etc. 
== Contrib == * Modules (such as fuse-dfs) are not actively maintained in the main repository and would benefit from a release schedule decoupled from the rest of Hadoop * With few exceptions, the contrib modules have smaller, often discrete groups of maintainers. It may be worth exploring whether these projects could live elsewhere
Review Request: HIVE-1512 : Fix Hive's HBase storage handler to work with both HBase 0.89.0 SNAPSHOT
DO NOT REPLY Stale LDAP connections take 15+ minutes to finish queries
https://issues.apache.org/bugzilla/show_bug.cgi?id=45834
Stefan Fritsch <s### @sfritsch.de> changed:
What |Removed |Added
Transactional JMS Route Constantly Opening/Closing Connections
Transactional JMS Route Opening/Closing Connections I would like to setup a route that utilizes transactional delivery of JMS messages to avoid dropping messages in the event of server failures. To do this I have utilized the Camel Transaction documentation to set up transactional routes (Camel version 2.2). This Camel Engine is running within ActiveMQ (5.3.1) to do some lightweight processing/routing/mediation of messages. Here is the camel configuration: <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd"> <camelContext id="camel" xmlns="http://camel.apache.org/schema/spring"> <route> <from uri="jmstx:example.A"/> <to uri="jmstx:example.B"/> </route> </camelContext> <!-- JmsComponent Transactional Delivery of JMS Messages --> <bean id="jmstx" class="org.apache.camel.component.jms.JmsComponent"> <property name="configuration" ref="jmsConfig" /> </bean> <bean id="jmsConfig" class="org.apache.camel.component.jms.JmsConfiguration"> <property name="connectionFactory" ref="jmsConnectionFactory"/> <property name="transactionManager" ref="jmsTransactionManager"/> <property name="transacted" value="true"/> </bean> <bean id="jmsTransactionManager" class="org.springframework.jms.connection.JmsTransactionManager"> <property name="connectionFactory" ref="jmsConnectionFactory" /> </bean> <bean id="jmsConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory"> <property name="brokerURL" value="vm://localhost?create=false&waitForStart=10000" /> <property name="userName" value="${activemq.username}" /> <property name="password" value="${activemq.password}" /> </bean> </beans> Before I even send any messages through this route I receive the following opening/closing of connetions (logging at 
Debug level):

2010-04-21 13:38:18,387 | DEBUG | Setting up new connection id: ID:tbarneswin7-60077-1271882288129-2:8, address: vm://localhost#14 | org.apache.activemq.broker.TransportConnection | VMTransport: vm://localhost#15
2010-04-21 13:38:18,387 | DEBUG | localhost adding consumer: ID:tbarneswin7-60077-1271882288129-2:8:-1:1 for destination: topic://ActiveMQ.Advisory.TempQueue,topic://ActiveMQ.Advisory.TempTopic | org.apache.activemq.broker.region.AbstractRegion | VMTransport: vm://localhost#15
2010-04-21 13:38:19,392 | DEBUG | ID:tbarneswin7-60077-1271882288129-2:8:1 Transaction Commit :null | org.apache.activemq.ActiveMQSession | DefaultMessageListenerContainer-1
2010-04-21 13:38:19,393 | DEBUG | localhost removing consumer: ID:tbarneswin7-60077-1271882288129-2:8:-1:1 for destination: topic://ActiveMQ.Advisory.TempQueue,topic://ActiveMQ.Advisory.TempTopic | org.apache.activemq.broker.region.AbstractRegion | VMTransport: vm://localhost#15
2010-04-21 13:38:19,393 | DEBUG | remove connection id: ID:tbarneswin7-60077-1271882288129-2:8 | org.apache.activemq.broker.TransportConnection | VMTransport: vm://localhost#15
2010-04-21 13:38:19,395 | DEBUG | Stopping connection: vm://localhost#14 | org.apache.activemq.broker.TransportConnection | ActiveMQ Task
2010-04-21 13:38:19,395 | DEBUG | Stopped transport: vm://localhost#14 | org.apache.activemq.broker.TransportConnection | ActiveMQ Task
2010-04-21 13:38:19,408 | DEBUG | Setting up new connection id: ID:tbarneswin7-60077-1271882288129-2:9, address: vm://localhost#16 | org.apache.activemq.broker.TransportConnection | VMTransport: vm://localhost#17
2010-04-21 13:38:19,409 | DEBUG | Connection Stopped: vm://localhost#14 | org.apache.activemq.broker.TransportConnection | ActiveMQ Task
2010-04-21 13:38:19,409 | DEBUG | localhost adding consumer: ID:tbarneswin7-60077-1271882288129-2:9:-1:1 for destination: topic://ActiveMQ.Advisory.TempQueue,topic://ActiveMQ.Advisory.TempTopic | org.apache.activemq.broker.region.AbstractRegion | VMTransport: vm://localhost#17
2010-04-21 13:38:20,411 | DEBUG | ID:tbarneswin7-60077-1271882288129-2:9:1 Transaction Commit :null | org.apache.activemq.ActiveMQSession | DefaultMessageListenerContainer-1
2010-04-21 13:38:20,411 | DEBUG | localhost removing consumer: ID:tbarneswin7-60077-1271882288129-2:9:-1:1 for destination: topic://ActiveMQ.Advisory.TempQueue,topic://ActiveMQ.Advisory.TempTopic | org.apache.activemq.broker.region.AbstractRegion | VMTransport: vm://localhost#17
2010-04-21 13:38:20,412 | DEBUG | remove connection id: ID:tbarneswin7-60077-1271882288129-2:9 | org.apache.activemq.broker.TransportConnection | VMTransport: vm://localhost#17
2010-04-21 13:38:20,413 | DEBUG | Stopping connection: vm://localhost#16 | org.apache.activemq.broker.TransportConnection | ActiveMQ Task
2010-04-21 13:38:20,414 | DEBUG | Stopped transport: vm://localhost#16 | org.apache.activemq.broker.TransportConnection | ActiveMQ Task
2010-04-21 13:38:20,415 | DEBUG | Connection Stopped: vm://localhost#16 | org.apache.activemq.broker.TransportConnection | ActiveMQ Task
2010-04-21 13:38:20,416 | DEBUG | Setting up new connection id: ID:tbarneswin7-60077-1271882288129-2:10, address: vm://localhost#18 | org.apache.activemq.broker.TransportConnection | VMTransport: vm://localhost#19

Is there a misconfiguration on my side that causes these JMS connections to be opened and closed constantly? Should I use another method to create these Camel routes? Any suggestions or recommendations are appreciated.

Here is the actual Camel file used to isolate the problem: http://old.nabble.com/file/p28836324/camel.xml
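One thing that may be worth trying for the symptom described above, offered as a sketch and not as a confirmed diagnosis of this configuration: when an external transaction manager is configured, the Spring DefaultMessageListenerContainer behind the Camel JMS consumer defaults to not caching JMS resources between receive attempts, so each polling cycle can ask a plain ActiveMQConnectionFactory for a brand-new connection. Wrapping the factory in a pool is one way to test whether that explains the roughly one-second open/close cycle in the log. The bean id pooledConnectionFactory and the maxConnections value are illustrative assumptions and are not taken from the original configuration; the remaining bean ids are reused from the post.

```xml
<!-- Sketch only: the bean id "pooledConnectionFactory" and the pool size are assumptions. -->
<!-- PooledConnectionFactory comes from the activemq-pool module and wraps the existing factory. -->
<bean id="pooledConnectionFactory" class="org.apache.activemq.pool.PooledConnectionFactory" destroy-method="stop">
  <property name="connectionFactory" ref="jmsConnectionFactory"/>
  <property name="maxConnections" value="8"/>
</bean>

<!-- Point both the transaction manager and the Camel JMS configuration at the pool -->
<!-- so they share pooled connections instead of creating one per receive attempt. -->
<bean id="jmsTransactionManager" class="org.springframework.jms.connection.JmsTransactionManager">
  <property name="connectionFactory" ref="pooledConnectionFactory"/>
</bean>

<bean id="jmsConfig" class="org.apache.camel.component.jms.JmsConfiguration">
  <property name="connectionFactory" ref="pooledConnectionFactory"/>
  <property name="transactionManager" ref="jmsTransactionManager"/>
  <property name="transacted" value="true"/>
</bean>
```

An alternative to pooling would be to set the cacheLevelName option on the JMS configuration (for example CACHE_CONSUMER); whether that suits this transactional setup should be checked against the Camel JMS component documentation for the version in use.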