
Is Apache Hive used more for the programming language or for the data warehouse aspects?


I used to think that Hive was just a SQL-like programming language used to make writing MapReduce-type jobs easier (i.e., a SQL-like version of Pig/Pig Latin). I'm reading more about it now, though, and apparently it's actually a full data warehouse infrastructure.

Is one of these use cases more common? That is, is it primarily used for the data warehouse infrastructure it provides, or more for the SQL-like interface? Or are both aspects of equal utility and importance?

(I'm asking because I'm trying to figure out what parts of Hive I should focus on learning about.)

asked June 19, 2011 12:27 am CDT
posted via StackOverflow

2 Answers


Hive doesn't support updates. In our implementation we used straight MapReduce jobs to populate the data warehouse, and Hive to produce exports for further processing or for import into relational data warehouses. We also used it as an intermediary for a BI reporting tool.
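
As a rough illustration of the export step, here is a minimal HiveQL sketch. The table `daily_action_counts`, its columns, the date value, and the output path are all hypothetical and stand in for whatever the warehouse actually holds; this is not the setup described above, only one way such an export could look.

```sql
-- Hypothetical export: write one day's rows from an existing Hive table
-- to a directory in HDFS, where a downstream loader or BI tool can pick
-- the files up. Table name, columns, date, and path are assumptions.
INSERT OVERWRITE DIRECTORY '/exports/daily_action_counts'
SELECT event_date, action, event_count
FROM daily_action_counts
WHERE event_date = '2011-06-23';
```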

answered June 24, 2011 3:09 am CDT

That's exactly what I used to think too. After about a month of experience with Hive, I find that it's a great ETL tool... for a data warehouse later down the line.

Hive doesn't compare with MDX. Hive is very row-based and doesn't allow a lot of the messier operations that SQL or MDX (Multidimensional Expressions, common in BI tools) excel at.

We're using Hive as an ETL tool to integrate our different flat file data sources and reduce the amount of data we have to upload to a SQL-based data warehouse.

If that data only has a half-life of a couple of weeks, we can keep the size of our database relatively manageable, and we can always reproduce the reports later from Hive.
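
To make the ETL pattern above concrete, here is a minimal HiveQL sketch under stated assumptions: the table names, columns, file layout, and HDFS path are all hypothetical, not our actual schema. It maps a table onto raw flat files, then aggregates them into a much smaller summary table, which is the part that would be uploaded to the SQL-based warehouse.

```sql
-- Hypothetical external table over tab-delimited flat files already in HDFS.
-- Hive reads the files in place; dropping the table leaves the files intact.
CREATE EXTERNAL TABLE raw_events (
  event_time STRING,   -- assumed format: 'YYYY-MM-DD HH:MM:SS'
  user_id    STRING,
  action     STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/raw/events';

-- Hypothetical summary table holding the reduced data set.
CREATE TABLE daily_action_counts (
  event_date  STRING,
  action      STRING,
  event_count BIGINT
);

-- Aggregate the raw rows down to one row per day and action.
-- Rerunning this statement rebuilds the summary from the raw files.
INSERT OVERWRITE TABLE daily_action_counts
SELECT substr(event_time, 1, 10) AS event_date,
       action,
       COUNT(*) AS event_count
FROM raw_events
GROUP BY substr(event_time, 1, 10), action;
```

Because the raw files stay in HDFS, the summary can be regenerated at any time, which is what lets the downstream database keep only recent data.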

answered June 24, 2011 3:09 am CDT
