
What's the best way to support array column types with external tables in hive?


I have external tables of tab-delimited data. A simple table looks like this:

create external table if not exists categories
(id string, tag string, legid string, image string, parent string, created_date string, time_stamp int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3n://somewhere/';

Now I'm adding another field to the end of each row. It will hold a comma-separated list of values.

Can I specify the list delimiter the same way I specify the field terminator, or do I have to rely on one of the SerDes?

For example:

...list_of_names ARRAY<String>)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ARRAY ELEMENTS SEPARATED BY ','
...

(I'm assuming I'll need to use a SerDe for this, but I figured there wasn't any harm in asking.)

asked June 7, 2011 1:45 pm CDT
posted via StackOverflow

1 Answer

Best answer
 

I don't know how to update an existing table to do that, but for creating a table, what you are looking for is documented in depth at http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL. Here is a snippet from that page:

row_format
  : DELIMITED [FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char]
        [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]

Here is an example from one of our own table definitions:

CREATE TABLE IF NOT EXISTS visits
(
    ... Columns Removed...
)
    PARTITIONED BY (userdate STRING)
    ROW FORMAT DELIMITED
        FIELDS TERMINATED BY '\001'
        COLLECTION ITEMS TERMINATED BY '\002'
        MAP KEYS TERMINATED BY '\003'
    STORED AS TEXTFILE
;

The clause you are looking for is COLLECTION ITEMS TERMINATED BY char, which sets the delimiter between the elements of an array column.
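Applied to the table in the question, that clause might look like the sketch below. It assumes the new column is named list_of_names (taken from the asker's example) and that the values in that last field are separated by plain commas with no escaping. Because the table is external, dropping it removes only the metadata and leaves the files at the LOCATION in place, so one possible way to pick up the new definition is to drop and recreate the table.

```sql
-- Sketch only: the column name list_of_names and the ',' delimiter are assumptions
-- based on the question, not a tested definition.
-- Dropping an EXTERNAL table removes the metadata, not the underlying data files.
DROP TABLE IF EXISTS categories;

CREATE EXTERNAL TABLE IF NOT EXISTS categories
(id STRING, tag STRING, legid STRING, image STRING, parent STRING,
 created_date STRING, time_stamp INT,
 list_of_names ARRAY<STRING>)          -- new trailing column
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t'          -- tab between columns, as before
    COLLECTION ITEMS TERMINATED BY ',' -- comma between array elements
LOCATION 's3n://somewhere/';

-- Array elements are addressed by zero-based index; size() returns the element count.
SELECT id, list_of_names[0], size(list_of_names) FROM categories;
```

Rows written before the new field existed have no value in that position, so list_of_names should read as NULL for them.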

hth

answered June 8, 2011 12:44 pm CDT
