real indexing job
2011-08-12 13:46:59

job_201108111058_2464

Ontology-based_Controlled_Natural_Language_Editor_Using_CFG_with_Lexical_Dependency
2011-08-12 11:32:43

Paper No. 3.

Hadoop - Java Heap Space Error
2011-08-12 09:07:48

source link: http://yaseminavcular.blogspot.com/2011/03/hadoop-java-heap-space-error.html

"Error: Java Heap space" means I'm trying to allocate more memory then available in the system.

How to work around it? (1) better configuration, (2) look for unnecessarily allocated objects.

Configuration

mapred.map.child.java.opts : heap size for map tasks

mapred.reduce.child.java.opts: heap size for reduce tasks

mapred.tasktracker.map.tasks.maximum: max number of map tasks that can run simultaneously per node

mapred.tasktracker.reduce.tasks.maximum: max number of reduce tasks that can run simultaneously per node

Make sure ((num_of_maps * map_heap_size) + (num_of_reducers * reduce_heap_size)) is not larger than the memory available in the system. The maximum number of mappers and reducers can also be tuned by looking at available system resources (see the sketch after this list).

io.sort.factor: max number of streams to merge at once while sorting. Used in both map and reduce.

io.sort.mb: map-side memory buffer size used while sorting

mapred.job.shuffle.input.buffer.percent: reduce-side buffer - the percentage of the maximum heap size to allocate for storing map outputs during the shuffle

NOTE: Using fs.inmemory.size.mb is a very bad idea!
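As a quick illustration of the list above, here is a minimal sketch of setting these properties programmatically on a job Configuration. The property names are the ones listed above; the class name HeapTuningExample and all numeric values are made-up examples to make the arithmetic explicit, not recommendations (and the tasktracker maximums normally live in mapred-site.xml on each node rather than in job code).

import org.apache.hadoop.conf.Configuration;

public class HeapTuningExample {
    public static void main(String[] args) {
        // Hypothetical sizing: 4 maps * 512 MB + 2 reduces * 1024 MB = 4 GB per node,
        // which must not exceed the memory actually available on that node.
        Configuration conf = new Configuration();
        conf.set("mapred.map.child.java.opts", "-Xmx512m");        // heap size for map tasks
        conf.set("mapred.reduce.child.java.opts", "-Xmx1024m");    // heap size for reduce tasks
        conf.setInt("mapred.tasktracker.map.tasks.maximum", 4);    // concurrent map slots per node
        conf.setInt("mapred.tasktracker.reduce.tasks.maximum", 2); // concurrent reduce slots per node
        conf.setInt("io.sort.factor", 10);                         // streams merged at once while sorting
        conf.setInt("io.sort.mb", 100);                            // map-side sort buffer, in MB
        conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.70f); // share of reduce heap for map outputs
    }
}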

Unnecessary memory allocation

Simply look for the new keyword and make sure there is no unnecessary allocation. A very common tip is to use the set() method of Writable objects rather than allocating a new object on every map or reduce call.

Here is a simple count example to show the trick:

// Requires: import java.io.IOException; import org.apache.hadoop.io.*; import org.apache.hadoop.mapreduce.Reducer;
public static class UrlReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    // Reused across all reduce() calls instead of allocating a new IntWritable per key.
    IntWritable sumw = new IntWritable();
    int sum;

    public void reduce(Text key, Iterable<IntWritable> vals, Context context)
            throws IOException, InterruptedException {
        sum = 0;
        for (IntWritable val : vals) {
            sum += val.get();
        }
        sumw.set(sum);            // reuse the existing object
        context.write(key, sumw);
    }
}
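The same reuse trick applies on the map side. As a minimal sketch (not part of the original post, and assuming each input line is a URL to be counted), a hypothetical UrlMapper could reuse its output objects like this:

// Requires: import org.apache.hadoop.mapreduce.Mapper; plus the same io imports as above.
public static class UrlMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Output objects reused for every record instead of being re-allocated.
    private final Text url = new Text();
    private final IntWritable one = new IntWritable(1);

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        url.set(value.toString().trim()); // reuse the Text rather than creating a new Text(...)
        context.write(url, one);
    }
}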

Note: there are a couple more tips here for resolving common errors in Hadoop.

hadoopexception1
2011-08-12 09:02:46

createBlockOutputStream

http://getsatisfaction.com/cloudera/topics/exception_in_createblockoutputstream_java_io_ioexception_bad_connect_ack_with_firstbadlink

When I am trying to copy a file from local to HDFS it shows this error, but the file is copied anyway. What is the problem? How do I solve it?

hadoopadmin@DcpMaster:/$ hadoop dfs -put /home/hadoopadmin/Desktop/hadoop-0.20/ /cinema

10/06/17 13:18:12 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.0.0.2:50010

10/06/17 13:18:12 INFO hdfs.DFSClient: Abandoning block blk_8926776881229359620_1133

10/06/17 13:18:12 INFO hdfs.DFSClient: Waiting to find target node: 10.0.0.1:50010

10/06/17 13:19:18 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.0.0.2:50010

10/06/17 13:19:18 INFO hdfs.DFSClient: Abandoning block blk_-7122249881716391441_1134

10/06/17 13:19:18 INFO hdfs.DFSClient: Waiting to find target node: 10.0.0.1:50010

10/06/17 13:20:24 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.0.0.2:50010

10/06/17 13:20:24 INFO hdfs.DFSClient: Abandoning block blk_4279698506667722666_1135

10/06/17 13:20:24 INFO hdfs.DFSClient: Waiting to find target node: 10.0.0.1:50010

Patrick Angeles (Solutions Architect) 1 year ago

Hey there.

These warnings could be normal and caused by a DN timing out (due to network hiccups or load) or going down. Hadoop can normally recover from these errors.

Is the copy failing?

If the problem occurs regularly on the same target datanode (looks like 10.0.0.2 is always failing here), then I would look into that node.

Regards,

- Patrick
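To answer the "is the copy failing?" question programmatically, here is a minimal sketch (not from the original thread) that performs the same copy through the FileSystem API and reports whether the destination exists afterwards. The paths come from the quoted command; the class name PutCheck is hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);

        Path src = new Path("/home/hadoopadmin/Desktop/hadoop-0.20"); // local source from the command above
        Path dst = new Path("/cinema");                               // HDFS destination

        // Equivalent to `hadoop dfs -put`; as the answer above notes, the client can often
        // recover from a bad connect ack by trying another datanode, so this either
        // completes or throws an exception.
        fs.copyFromLocalFile(src, dst);
        System.out.println("copy finished, destination exists: " + fs.exists(dst));
    }
}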
