job_201108111058_2464
Paper No. 3.
source link : http://yaseminavcular.blogspot.com/2011/03/hadoop-java-heap-space-error.html
"Error: Java Heap space" means I'm trying to allocate more memory then available in the system.
How to work around it? (1) better configuration, (2) look for unnecessarily allocated objects.
Configuration
mapred.map.child.java.opts: heap size for map tasks
mapred.reduce.child.java.opts: heap size for reduce tasks
mapred.tasktracker.map.tasks.maximum: maximum number of map tasks that can run simultaneously per node
mapred.tasktracker.reduce.tasks.maximum: maximum number of reduce tasks that can run simultaneously per node
Make sure ((num_of_maps * map_heap_size) + (num_of_reducers * reduce_heap_size)) is not larger than the memory available in the system. The maximum number of mappers and reducers can also be tuned based on the available system resources; a configuration sketch follows the list below.
io.sort.factor: maximum number of streams to merge at once while sorting. Used on both the map and reduce sides.
io.sort.mb: map-side memory buffer size used while sorting
mapred.job.shuffle.input.buffer.percent: reduce-side buffer setting - the percentage of the maximum heap size to be allocated for storing map outputs during the shuffle
NOTE: Using fs.inmemory.size.mb is a very bad idea!
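For illustration only (this sketch is mine, not from the original post), here is roughly how these properties could be set in Java on a Hadoop 0.20/1.x-style Configuration. The slot counts and heap sizes are made-up numbers chosen just to make the memory-budget check above concrete; in practice the TaskTracker slot maximums live in mapred-site.xml on each node rather than in per-job code.
import org.apache.hadoop.conf.Configuration;

public class HeapConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Budget check: 4 map slots * 512 MB + 2 reduce slots * 1024 MB = 4 GB per node,
        // which must stay below the physical memory actually available on that node.
        conf.set("mapred.tasktracker.map.tasks.maximum", "4");
        conf.set("mapred.tasktracker.reduce.tasks.maximum", "2");
        conf.set("mapred.map.child.java.opts", "-Xmx512m");
        conf.set("mapred.reduce.child.java.opts", "-Xmx1024m");

        // Sort/shuffle buffers discussed above (illustrative values, not recommendations).
        conf.setInt("io.sort.factor", 10);
        conf.setInt("io.sort.mb", 100);
        conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);
    }
}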
Unnecessary memory allocation
Simply look for the new keyword and make sure there is no unnecessary allocation. A very common tip is to use the set() method of Writable objects rather than allocating a new object on every map or reduce call.
Here is a simple count example to show the trick:
public static class UrlReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    IntWritable sumw = new IntWritable(); // allocated once, reused for every key
    int sum;

    public void reduce(Text key, Iterable<IntWritable> vals, Context context)
            throws IOException, InterruptedException {
        sum = 0;
        for (IntWritable val : vals) {
            sum += val.get();
        }
        sumw.set(sum);            // reuse the existing IntWritable instead of allocating a new one
        context.write(key, sumw);
    }
}
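This reuse is safe because context.write() serializes the key and value immediately, so a single IntWritable per task is enough and the garbage collector has far fewer short-lived objects to deal with.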
Note: there are a couple more tips here for resolving common errors in Hadoop.
createBlockOutputStream
http://getsatisfaction.com/cloudera/topics/exception_in_createblockoutputstream_java_io_ioexception_bad_connect_ack_with_firstbadlink
When I am trying to copy a file from local to HDFS it shows this error, but the file still gets copied. What is the problem? How do I solve it?
hadoopadmin@DcpMaster:/$ hadoop dfs -put /home/hadoopadmin/Desktop/hadoop-0.20/ /cinema
10/06/17 13:18:12 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.0.0.2:50010
10/06/17 13:18:12 INFO hdfs.DFSClient: Abandoning block blk_8926776881229359620_1133
10/06/17 13:18:12 INFO hdfs.DFSClient: Waiting to find target node: 10.0.0.1:50010
10/06/17 13:19:18 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.0.0.2:50010
10/06/17 13:19:18 INFO hdfs.DFSClient: Abandoning block blk_-7122249881716391441_1134
10/06/17 13:19:18 INFO hdfs.DFSClient: Waiting to find target node: 10.0.0.1:50010
10/06/17 13:20:24 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.0.0.2:50010
10/06/17 13:20:24 INFO hdfs.DFSClient: Abandoning block blk_4279698506667722666_1135
10/06/17 13:20:24 INFO hdfs.DFSClient: Waiting to find target node: 10.0.0.1:50010
Patrick Angeles (Solutions Architect) 1 year ago
Hey there.
These warnings could be normal and caused by a DN timing out (due to network hiccups or load) or going down. Hadoop can normally recover from these errors.
Is the copy failing?
If the problem occurs regularly on the same target datanode (looks like 10.0.0.2 is always failing here), then I would look into that node.
Regards,
- Patrick
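My note (not part of the original thread): in this trace the client repeatedly abandons blocks because the pipeline reports 10.0.0.2 as the first bad link, so that DataNode is the one to inspect. Typical causes are the DataNode process on 10.0.0.2 being down or overloaded, or a firewall blocking its port 50010; running hadoop dfsadmin -report and checking the DataNode log on that machine are quick ways to confirm whether it is actually healthy.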