FusionInsight HD C30L: Hive script fails with an error

Published: 2016-11-02
Problem Description

1. A HiveSQL job is run on the C30L version.

2. The job throws an exception and cannot complete.

Alarm Information

FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
[Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 54
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:194)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:138)
        at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:195)
        at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
        at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:353)
        at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:197)
        at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:183)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:545)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:452)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:173)
 ]
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:452)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:173)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 54
        at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:327)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540

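Handling Process

1. Analyze the error log and locate the ArrayIndexOutOfBoundsException. Exceptions of this kind are thrown because of how Hive interprets the SQL statement.

2. Analyze the HiveSQL script. The original script contains multiple window functions in a subquery: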

  select zoneno from (select zoneno,
                       brno,
                       tellerno,
                       count(*) as cnt_teller,
                       sum(count(*)) over(partition by zoneno) as cnt_zoneno,
                       sum(count(*)) over(partition by  zoneno,brno) as cnt_brno
                  from bdsp_roma.c1_
                 where workdate = 'XXX'
                 group by zoneno, brno, tellerno)a   -- throws the exception when executed

3. Modify the HiveSQL so that the subquery contains only one window function:

        select zoneno from (select zoneno,
                       brno,
                       tellerno,
                       count(*) as cnt_teller,
                       sum(count(*)) over(partition by zoneno) as cnt_zoneno
                  from bdsp_roma.c1_
                 where workdate ='XXX'
                 group by zoneno, brno, tellerno)a 

4. The modified script passes the test.
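
The modified script in step 3 drops cnt_brno altogether. If both totals are still needed, one possible rewrite, given here only as a sketch and not tested in the original case, is to compute each total in its own aggregate subquery and join the results, so that no window function appears in the subquery at all. The aliases t, z and b are illustrative; the table name and the 'XXX' placeholder are copied as-is from the original script.

  -- Sketch only: replaces both window functions with plain aggregates plus joins.
  -- Assumes zoneno and brno are never NULL; rows with NULL keys would be
  -- dropped by the inner joins, unlike in the window-function version.
  select t.zoneno,
         t.brno,
         t.tellerno,
         t.cnt_teller,
         z.cnt_zoneno,
         b.cnt_brno
    from (select zoneno, brno, tellerno, count(*) as cnt_teller
            from bdsp_roma.c1_
           where workdate = 'XXX'
           group by zoneno, brno, tellerno) t
    join (select zoneno, count(*) as cnt_zoneno
            from bdsp_roma.c1_
           where workdate = 'XXX'
           group by zoneno) z
      on t.zoneno = z.zoneno
    join (select zoneno, brno, count(*) as cnt_brno
            from bdsp_roma.c1_
           where workdate = 'XXX'
           group by zoneno, brno) b
      on t.zoneno = b.zoneno and t.brno = b.brno

This form scans the table three times instead of once, so it trades performance for avoiding the multi-window-function pattern.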


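Root Cause

A bug in the community (Apache Hive) kernel: after column pruning is applied to a query with window functions, the columns end up in the wrong order.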
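
One way to look into this further, as a suggestion that is not part of the original case: Hive's EXPLAIN statement prints the operator plan without launching the job, so the plan of the failing query can be compared with that of the modified query to see which columns each operator passes on. The plan output is not reproduced here because the original case does not include it.

  -- Sketch: print the plan of the failing query instead of running it.
  explain
  select zoneno from (select zoneno,
                       brno,
                       tellerno,
                       count(*) as cnt_teller,
                       sum(count(*)) over(partition by zoneno) as cnt_zoneno,
                       sum(count(*)) over(partition by zoneno, brno) as cnt_brno
                  from bdsp_roma.c1_
                 where workdate = 'XXX'
                 group by zoneno, brno, tellerno) a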

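Solution

1. This issue is a product bug. See https://issues.apache.org/jira/browse/HIVE-9228

2. Modify the HiveSQL to avoid using multiple window functions in a subquery.

3. Alternatively, upgrade to the C60 version (Hive 1.0.2 or later, in which the bug has been fixed).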

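Suggestions and Summary

In the C30 version, because of a bug in Hive itself, a subquery that contains multiple window functions ends up with its columns in the wrong order after column pruning.

When writing HiveSQL on this version, take care when using multiple window functions in a subquery.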

