This is Joe on Data.

Category Archives: Hive

Memory Management in Hadoop MapReduce

If you ever have to write MapReduce jobs, or custom UDFs or SerDe classes for Hive, in Java, you will want to re-use memory as much as possible. That means making as few object and array allocations as you can, while taking care not to inadvertently use or re-use data that is invalid or corrupted. This is an important practice […]
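To make the trade-off concrete, here is a minimal sketch in plain Java with no Hadoop dependency. `ReusableBuffer` and `ReuseSketch` are hypothetical names invented for this illustration. The buffer only loosely imitates how a reusable container such as Hadoop's `Text` behaves, where the backing array may be longer than the valid data. It shows one buffer being re-used across records, and how reading past the valid length picks up stale bytes from an earlier record.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical reusable byte buffer; not a Hadoop class. */
final class ReusableBuffer {
    private byte[] bytes = new byte[0]; // backing array, grows but never shrinks
    private int length = 0;             // number of valid bytes in the backing array

    /** Overwrites the contents, allocating only when the record does not fit. */
    void set(byte[] src) {
        if (src.length > bytes.length) {
            bytes = new byte[src.length];
        }
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;
    }

    /** Returns the backing array; only the first getLength() bytes are valid. */
    byte[] getBytes() {
        return bytes;
    }

    int getLength() {
        return length;
    }
}

final class ReuseSketch {
    /** Loads two records into the same buffer, one after the other. */
    private static ReusableBuffer loadTwoRecords() {
        ReusableBuffer buf = new ReusableBuffer(); // allocated once, re-used per record
        buf.set("hello world".getBytes(StandardCharsets.UTF_8));
        buf.set("hi".getBytes(StandardCharsets.UTF_8));
        return buf;
    }

    /** Correct: respects the valid length of the re-used buffer. */
    static String validRead() {
        ReusableBuffer buf = loadTwoRecords();
        return new String(buf.getBytes(), 0, buf.getLength(), StandardCharsets.UTF_8);
    }

    /** Wrong: reads the whole backing array, including leftovers from the first record. */
    static String staleRead() {
        ReusableBuffer buf = loadTwoRecords();
        return new String(buf.getBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(validRead()); // prints "hi"
        System.out.println(staleRead()); // prints "hillo world"
    }
}
```

The second `set` call allocates nothing, which is the benefit of re-use. The cost is that every reader has to honor `getLength()`, and any value that must outlive the next `set` has to be copied out first.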