hadoop - How to manage modified data in Apache Hive


We are working on Cloudera CDH and trying to run reporting on data stored in Apache Hadoop. We send daily reports to a client, so we need to import data from the operational store into Hadoop every day.

Hadoop works in append-only mode, so we cannot run Hive UPDATE/DELETE queries. We can perform INSERT OVERWRITE on the dimension tables and add delta values to the fact tables, but introducing thousands of delta rows every day does not seem like a very good solution.

Are there other, better standard ways to update modified data in Hadoop?

Thanks

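HDFS might be append-only, but Hive has supported updates since version 0.14. Note that UPDATE and DELETE only work on tables that support ACID transactions.

See here: https://cwiki.apache.org/confluence/display/hive/languagemanual+dml#languagemanualdml-update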
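As a rough illustration of what this looks like in practice, here is a minimal sketch that keeps the HiveQL statements as plain Python strings and renders them into a script you could pass to a Hive client. The table `customer_dim`, its columns, and the bucket count are hypothetical. The session settings may already be configured on your cluster, the bucketing clause is only required on older Hive versions, and whether your CDH release supports transactional tables at all is something to verify first.

```python
# HiveQL held as plain strings. Table and column names are hypothetical.

# Settings commonly needed for Hive ACID; they may already be set cluster-wide.
SESSION_SETTINGS = [
    "SET hive.support.concurrency=true",
    "SET hive.txn.manager=org.apache.hadoop.hive.ql.lock.DbTxnManager",
]

# UPDATE/DELETE need a transactional table stored as ORC.
CREATE_TABLE = (
    "CREATE TABLE customer_dim (id INT, name STRING, city STRING) "
    "CLUSTERED BY (id) INTO 4 BUCKETS "
    "STORED AS ORC "
    "TBLPROPERTIES ('transactional'='true')"
)

# Row-level changes, available from Hive 0.14 on.
UPDATE_ROW = "UPDATE customer_dim SET city = 'Pune' WHERE id = 42"
DELETE_ROW = "DELETE FROM customer_dim WHERE id = 7"


def render_script(statements):
    """Join statements into one script, one terminated statement per line."""
    return "\n".join(s.rstrip(";") + ";" for s in statements) + "\n"


if __name__ == "__main__":
    print(render_script(SESSION_SETTINGS + [CREATE_TABLE, UPDATE_ROW, DELETE_ROW]))
```

The rendered text can be saved to a file and run with whatever Hive client you already use.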

A common design pattern is to take the previous data and the current data and insert the combined result into a new table every time.
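The pattern above can be sketched with a small runnable example. It uses Python's built-in `sqlite3` as a stand-in for Hive so it runs anywhere. In Hive the same SELECT would typically feed an INSERT OVERWRITE or a CREATE TABLE ... AS SELECT. The table names `customer_prev`, `customer_delta`, and `customer_new` are hypothetical, and the sketch assumes the delta holds at most one row per key.

```python
import sqlite3

# Keep every previous row that has no newer version in the delta,
# then append all delta rows (changed and brand-new records).
RECONCILE_SQL = """
DROP TABLE IF EXISTS customer_new;
CREATE TABLE customer_new AS
SELECT p.id, p.name, p.city
FROM customer_prev p
LEFT JOIN customer_delta d ON p.id = d.id
WHERE d.id IS NULL
UNION ALL
SELECT id, name, city FROM customer_delta;
"""


def reconcile(conn):
    """Build customer_new from the previous snapshot plus today's delta."""
    conn.executescript(RECONCILE_SQL)
    return conn.execute(
        "SELECT id, name, city FROM customer_new ORDER BY id"
    ).fetchall()


def demo():
    """Load a tiny previous snapshot and delta, then reconcile them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer_prev (id INTEGER, name TEXT, city TEXT);
        CREATE TABLE customer_delta (id INTEGER, name TEXT, city TEXT);
        INSERT INTO customer_prev VALUES (1, 'Ann', 'Oslo'), (2, 'Bob', 'Rome');
        INSERT INTO customer_delta VALUES (2, 'Bob', 'Paris'), (3, 'Cy', 'Lima');
        """
    )
    return reconcile(conn)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Because the result is written to a fresh table, the previous snapshot stays untouched until you swap the new table in, which fits append-only storage.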

Depending on your use case, have a look at Apache Impala, HBase, ... or Drill.

