So, I have an existing HDFS directory containing a bunch of files. These files are all tab-delimited. I have a Hive statement....
This works pretty well, except for all of the extra fields. Each row also contains between 0 and x extra data elements after the ssn field. They are still tab-delimited, and the records are still '\n'-delimited. I could add a bunch of 'valuex string' columns (where x is the index of the extra element), but I don't know how many there might eventually be, and that seems messy anyway. Is there a way to tell Hive to just put all the remaining fields of a row into ONE field, like 'others string'? Even if that field comes back tab-delimited in the Hive result, I am OK with that. Thanks in advance.
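To make the desired result concrete, here is a minimal Python sketch (not Hive) of the behavior being asked for: split each record on tabs for the named columns only, and keep whatever follows as one tab-delimited string. The column names `name` and `ssn` and the helper `split_row` are hypothetical, since the actual table definition is not shown above.

```python
def split_row(line, named_columns=("name", "ssn")):
    """Split one tab-delimited record into the named columns plus one 'others' field.

    The column names are assumptions for illustration only.
    """
    n = len(named_columns)
    # maxsplit=n stops splitting after the named columns, so any remaining
    # tab-delimited values stay together in the final element.
    parts = line.rstrip("\n").split("\t", n)
    # Pad short rows so every named column gets a value (None if missing).
    parts += [None] * (n - len(parts))
    record = dict(zip(named_columns, parts))
    # Rows with no extra elements get None for 'others'.
    record["others"] = parts[n] if len(parts) > n else None
    return record
```

A row with two extra elements would keep them in `others` as a single string with the tab still inside it, while a row with no extras would get `None` there.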
posted via StackOverflow
Creating a table in Hive essentially just creates the metadata telling Hive how to interpret the files. Hive doesn't 'know' about the rest of the data. If you add another column as an array and specify