Hadoop – GC overhead limit exceeded error

In our Hadoop setup, we ended up with more than 1 million files in a single folder. The folder held so many files that any hdfs dfs command run against it, such as -ls or -copyToLocal, failed with the following error:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOf(Arrays.java:2367)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
        at java.lang.StringBuffer.append(StringBuffer.java:237)
        at java.net.URI.appendAuthority(URI.java:1852)
        at java.net.URI.appendSchemeSpecificPart(URI.java:1890)
        at java.net.URI.toString(URI.java:1922)
        at java.net.URI.<init>(URI.java:749)
        at org.apache.hadoop.fs.Path.initialize(Path.java:203)
        at org.apache.hadoop.fs.Path.<init>(Path.java:116)
        at org.apache.hadoop.fs.Path.<init>(Path.java:94)
        at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:230)
        at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.makeQualified(HdfsFileStatus.java:263)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:732)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
        at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
        at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
        at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:268)
        at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
        at org.apache.hadoop.fs.shell.CommandWithDestination.recursePath(CommandWithDestination.java:291)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
        at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
        at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)

After some research, we set the following environment variable to change the Hadoop runtime options:

export HADOOP_OPTS="-XX:-UseGCOverheadLimit"

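This option made the GC error go away, but it only disables the JVM's GC overhead check and does not give the process any more memory. The same command then failed with the following error, this time because it ran out of Java heap space: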

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1351)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1413)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1524)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1533)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:557)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy15.getListing(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1969)
        at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1952)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:724)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
        at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
        at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
        at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:268)
        at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
        at org.apache.hadoop.fs.shell.CommandWithDestination.recursePath(CommandWithDestination.java:291)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
        at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
        at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

We then changed the export to the one below. Note that we had to set HADOOP_CLIENT_OPTS instead of HADOOP_OPTS to fix this error, because hdfs dfs commands run as a client. HADOOP_OPTS is for changing the runtime options of Hadoop itself, while HADOOP_CLIENT_OPTS is for changing the runtime options of the Hadoop command-line client. The added -Xmx4096m raises the client JVM's maximum heap size to 4 GB.

export HADOOP_CLIENT_OPTS="-XX:-UseGCOverheadLimit -Xmx4096m"
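
If you would rather not export the variable for the whole session, the same options can be passed to a single command. The snippet below is only a sketch of that idea: with_client_opts is a hypothetical helper name (it is not part of Hadoop), and /data/big_folder in the usage comment is a placeholder path. It appends the extra flags to whatever HADOOP_CLIENT_OPTS already contains instead of overwriting it.

```shell
# Hypothetical helper (not part of Hadoop): run one command with extra
# client JVM options, leaving the calling shell's environment untouched.
with_client_opts() {
    local extra="$1"
    shift
    # Keep any options that are already set and append the new flags.
    HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:+$HADOOP_CLIENT_OPTS }$extra" "$@"
}

# Usage (placeholder path):
# with_client_opts "-XX:-UseGCOverheadLimit -Xmx4096m" hdfs dfs -ls /data/big_folder
```

The larger heap then applies only to that one invocation, so other hdfs dfs commands in the same shell keep their default settings.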

 
