Starting Zabbix Server within docker replaces strings with nothing in config file → - docker

→ or totally ignored strings like name of new DB for testing purposes.
Firstly tries to add something about ~250 to 250 already added hosts and Z-server shutted down. I've restarted it and inside docker logs I saw this:
6:20191014:091840.201 using configuration file: /etc/zabbix/zabbix_server.conf
6:20191014:091840.223 current database version (mandatory/optional): 04020000/04020001
6:20191014:091840.223 required mandatory version: 04020000
6:20191014:091840.484 __mem_malloc: skipped 7 asked 108424 skip_min 304 skip_max 12192
6:20191014:091840.484 [file:dbconfig.c,line:94] __zbx_mem_realloc(): out of memory (requested 108424 bytes)
6:20191014:091840.484 [file:dbconfig.c,line:94] __zbx_mem_realloc(): please increase CacheSize configuration parameter
6:20191014:091840.484 === memory statistics for configuration cache ===
Solution for those problem was to increase CacheSize in zabbix_server.conf . Okay, that's not a problem and after this Im push a new config to Z-server and restart it... → and z-server stops already after start and logs says the same problem. After reading config in container I saw what string what I corrected to matching my wishes are missing O_o. Strings are deleted.
My config:
LogType=console
DBHost=postgres-server
DBName=zabbix_pwd
DBSchema=public
DBUser=zabbix
DBPassword=zabbix
DBPort=5432
StartPollers=5
StartIPMIPollers=5
StartPollersUnreachable=5
SNMPTrapperFile=/var/lib/zabbix/snmptraps/snmptraps.log
StartSNMPTrapper=1
CacheSize=512M
HistoryCacheSize=512M
HistoryIndexCacheSize=512M
TrendCacheSize=512m
ValueCacheSize=256M
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
FpingLocation=/usr/sbin/fping
Fping6Location=/usr/sbin/fping6
SSHKeyLocation=/var/lib/zabbix/ssh_keys
SSLCertLocation=/var/lib/zabbix/ssl/certs/
SSLKeyLocation=/var/lib/zabbix/ssl/keys/
SSLCALocation=/var/lib/zabbix/ssl/ssl_ca/
LoadModulePath=/var/lib/zabbix/modules/
And what I've getting after starting z-server:
LogType=console
DBHost=postgres-server
DBName=zabbix_pwd
DBSchema=public
DBUser=zabbix
DBPassword=zabbix
DBPort=5432
SNMPTrapperFile=/var/lib/zabbix/snmptraps/snmptraps.log
StartSNMPTrapper=1
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
FpingLocation=/usr/sbin/fping
Fping6Location=/usr/sbin/fping6
SSHKeyLocation=/var/lib/zabbix/ssh_keys
SSLCertLocation=/var/lib/zabbix/ssl/certs/
SSLKeyLocation=/var/lib/zabbix/ssl/keys/
SSLCALocation=/var/lib/zabbix/ssl/ssl_ca/
LoadModulePath=/var/lib/zabbix/modules/
Any suggestions to how-to rule the world and don't be captured by doctors ?

With docker you need to send conf parameters in the docker-compose.yml file, or in your docker run command using the -e :
For example from my docker yml file:
zabbix-server:
image: zabbix/zabbix-server-pgsql:ubuntu-4.2.6
environment:
ZBX_MAXHOUSEKEEPERDELETE: 5000
ZBX_STARTPOLLERS: 15
ZBX_CACHESIZE: 8M
ZBX_STARTDBSYNCERS: 4
ZBX_HISTORYCACHESIZE: 16M
ZBX_TRENDCACHESIZE: 4M
ZBX_VALUECACHESIZE: 8M
ZBX_LOGSLOWQUERIES: 3000
Another way to work with zabbix:
https://hub.docker.com/r/monitoringartist/zabbix-3.0-xxl/

Related

Upgrading docker artifactory pro 6.X to 7.X

compose artifactory pro v6.9.0 running
In my compose I have two services :
image: docker.bintray.io/jfrog/artifactory-pro:6.9.0
image: docker.io/library/postgres:9.6.11
I was able to upgrade to 6.23.13 without any problem just by changing the version of the image.
When I try the same thing with any 7.X version (after upgrading to at least 6.10 as the doc says), I have errors.
For example, trying 7.21.3, I have these warnings
2022-08-01T08:41:30.343L [tomct] [WARNING] [ ] [org.apache.catalina.startup.HostConfig] [org.apache.catalina.startup.HostConfig deployDescriptor] - A docBase [/opt/jfrog/artifactory/app/artifactory/tomcat/webapps/access.war] inside the host appBase has been specified, and will be ignored
2022-08-01T08:41:30.343L [tomct] [WARNING] [ ] [org.apache.catalina.startup.HostConfig] [org.apache.catalina.startup.HostConfig deployDescriptor] - A docBase [/opt/jfrog/artifactory/app/artifactory/tomcat/webapps/artifactory.war] inside the host appBase has been specified, and will be ignored
...
2022-08-01T08:41:37.344Z [jfrt ] [WARN ] [ce1b2553475da56b] [c.z.h.p.ProxyConnection:182 ] [ocalhost-startStop-2] - HikariCP Main - Connection org.apache.derby.impl.jdbc.EmbedConnection#1597179442 (XID = 24), (SESSIONID = 3), (DATABASE = {db.home}), (DRDAID = null) marked as broken because of SQLSTATE(0A000), ErrorCode(20000)
java.sql.SQLFeatureNotSupportedException: Feature not implemented: No details.
and these errors
08:41:34,803 |-ERROR in ch.qos.logback.core.joran.action.AppenderAction - Could not create an Appender of type [org.artifactory.usage.appender.UsageTrafficTimeBasedRollingFileAppender]. ch.qos.logback.core.util.DynamicClassLoadingException: Failed to instantiate type org.artifactory.usage.appender.UsageTrafficTimeBasedRollingFileAppender
...
2022-08-01T08:41:37.113Z [jfrt ] [ERROR] [ce1b2553475da56b] [d.d.l.DbDistributeLocksDao:506] [ocalhost-startStop-2] - Unable to detect database version Unable to get connection from unique lock data source
2022-08-01T08:41:37.353Z [jfrt ] [ERROR] [ce1b2553475da56b] [tifactoryHomeConfigListener:55] [ocalhost-startStop-2] - Failed initializing Home. Caught exception:
java.lang.IllegalStateException: Could not find database table: db_properties
After reading the docs I do not clearly understand if I have to download the new docker-compose package from jfrog. I've tried but once there, the config.sh ask for external database and no question about reusing existing image directory.
Thx for help
Not sure if this is still relevant for you, but I got the same problem. As far as I see there are new DB connection environment variables, that must be set. This error message is just a symptom, because without the new DB connection env, Artifactory will try to use an in-memory database (Apache Derby). Something like this is needed to fix it (as Docker Compose config of the Artifactory container):
environment:
- JF_SHARED_DATABASE_TYPE=postgresql
- JF_SHARED_DATABASE_USERNAME=${POSTGRESQL_USERNAME}
- JF_SHARED_DATABASE_PASSWORD=${POSTGRESQL_PASSWORD}
- JF_SHARED_DATABASE_URL=jdbc:postgresql://postgresql:5432/artifactory
- JF_SHARED_DATABASE_DRIVER=org.postgresql.Driver

Confluent Docker log4j logger level configurations

I am running locally Kafka using the confluentinc/cp-kafka Docker image and I am setting the following logging container environment variables:
KAFKA_LOG4J_ROOT_LOGLEVEL: ERROR
KAFKA_LOG4J_LOGGERS: >-
org.apache.zookeeper=ERROR,
org.apache.kafka=ERROR,
kafka=ERROR,
kafka.cluster=ERROR,
kafka.controller=ERROR,
kafka.coordinator=ERROR,
kafka.log=ERROR,
kafka.server=ERROR,
kafka.zookeeper=ERROR,
state.change.logger=ERROR
and I see in the Kafka logs that Kafka is starting with the following configuration:
===> ENV Variables ...
ALLOW_UNSIGNED=false
COMPONENT=kafka
CONFLUENT_DEB_VERSION=1
CONFLUENT_PLATFORM_LABEL=
CONFLUENT_VERSION=5.4.1
...
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
KAFKA_LOG4J_ROOT_LOGLEVEL=ERROR
...
Still I see further down in the logs the INFO and TRACE log levels. For example:
[2020-03-26 16:22:12,838] INFO [Controller id=1001] Ready to serve as the new controller with epoch 1 (kafka.controller.KafkaController)
[2020-03-26 16:22:12,848] INFO [Controller id=1001] Partitions undergoing preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,849] INFO [Controller id=1001] Partitions that completed preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,855] INFO [Controller id=1001] Skipping preferred replica election for partitions due to topic deletion: (kafka.controller.KafkaController)
How can I really deactivate the logs below a certain level? In the example above, I really want only ERROR logs.
The approach above is the way described in the Confluent documentation.
And the Apache Kafka source code lists all sorts of loggers that I could not influence using the KAFKA_LOG4J_LOGGERS Docker environment variable.
I went and troubleshot the Dockerfile's and inspected the Kafka container. The cause of this behaviour was the YAML multiline string folding.
Hence the provided environment variable (using a YAML multiline value) is at runtime:
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
instead of (no spaces in between):
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR,org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR,kafka.controller=ERROR, kafka.coordinator=ERROR,kafka.log=ERROR,kafka.server=ERROR,kafka.zookeeper=ERROR,state.change.logger=ERROR
And this was visible inside the container in the generated /etc/kafka/log4j.properties file:
log4j.rootLogger=ERROR, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.logger.kafka.authorizer.logger=WARN
log4j.logger.kafka.cluster=ERROR
log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG
log4j.logger.kafka.zookeeper=ERROR
log4j.logger.org.apache.kafka=ERROR
log4j.logger.kafka.coordinator=ERROR
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.kafka.log.LogCleaner=INFO
log4j.logger.kafka.controller=ERROR
log4j.logger.kafka=INFO
log4j.logger.kafka.log=ERROR
log4j.logger.state.change.logger=ERROR
log4j.logger.kafka=ERROR
log4j.logger.kafka.server=ERROR
log4j.logger.kafka.controller=TRACE
log4j.logger.kafka.network.RequestChannel$=WARN
log4j.logger.kafka.request.logger=WARN
log4j.logger.state.change.logger=TRACE
If you really need to split the long line in a YAML multiline value, you would have to use this YAML syntax.
More hints from the code:
here is where the log4j.properties file is generated when a confluent container is run.
these are the default log levels that Kafka will start with.
these should be all the loggers supported by Kafka

Spark executor sends result to a random port though all the ports are explicitly set up

I am trying to run a spark job with PySpark through Jupyter notebook running in Docker. Workers are located on separate machines in the same network. I am performing a take operation on RDD:
data.take(number_of_elements)
When the number_of_elements is 2000 everything works fine. When it is 20000 an exception occurs. From my point of view it breaks when the size of the result exceeds 2GB (or it seems for me so). The idea about 2GB comes from that spark can send results smaller than 2GB in one block and when the result is bigger than 2GB another mechanism starts to work and something breaks there (see here). Here is the exception from executor log:
19/11/05 10:27:14 INFO CodeGenerator: Code generated in 205.7623 ms
19/11/05 10:27:40 INFO PythonRunner: Times: total = 25421, boot = 3, init = 1751, finish = 23667
19/11/05 10:27:42 INFO MemoryStore: Block taskresult_4 stored as bytes in memory (estimated size 927.7 MB, free 6.4 GB)
19/11/05 10:27:42 INFO Executor: Finished task 0.0 in stage 3.0 (TID 4). 972788748 bytes result sent via BlockManager)
19/11/05 10:27:49 ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1585998572000, chunkIndex=0}, buffer=org.apache.spark.storage.BlockManagerManagedBuffer#4399ad49} to /10.0.0.9:56222; closing connection
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.apache.spark.util.io.ChunkedByteBufferFileRegion.transferTo(ChunkedByteBufferFileRegion.scala:64)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:121)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:355)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:224)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:382)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:362)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:901)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1321)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.DefaultChannelPipeline.flush(DefaultChannelPipeline.java:983)
at io.netty.channel.AbstractChannel.flush(AbstractChannel.java:248)
at io.netty.channel.nio.AbstractNioByteChannel$1.run(AbstractNioByteChannel.java:284)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
As we can see from the log executor tries to send result to 10.0.0.9:56222. It fails because the port is not opened in docker compose. 10.0.0.9 is an IP address of a master node but port 56222 is random though I explicitly set up all ports I can find in documentation to disable random port selection:
spark = SparkSession.builder\
.master('spark://spark.cyber.com:7077')\
.appName('My App')\
.config('spark.task.maxFailures', '16')\
.config('spark.driver.port', '20002')\
.config('spark.driver.host', 'spark.cyber.com')\
.config('spark.driver.bindAddress', '0.0.0.0')\
.config('spark.blockManager.port', '6060')\
.config('spark.driver.blockManager.port', '6060')\
.config('spark.shuffle.service.port', '7070')\
.config('spark.driver.maxResultSize', '14g')\
.getOrCreate()
I mapped these ports with docker compose:
version: "3"
services:
jupyter:
image: jupyter/pyspark-notebook:latest
ports:
- "4040-4050:4040-4050"
- "6060:6060"
- "7070:7070"
- "8888:8888"
- "20000-20010:20000-20010"
You should probably configure you spark driver memory to follow your docker container memory settings
I added
.config('spark.driver.memory', '14g')
as #ML_TN proposed and everything works now.
From my point of view it is strange that the memory setting affects the ports that spark uses.

Homebrew: Can't start elastic search

I'm in a big trouble, I can't start Elasticsearch and I need it for run my rails locally, please tell me what's going on. I installed Elasticsearch in the normal fashion then I did the following:
elasticsearch --config=/usr/local/opt/elasticsearch/config/elasticsearch.yml
But it shows the following error: [2015-11-01 20:36:50,574][INFO ][bootstrap] es.config is no longer supported. elasticsearch.yml must be placed in the config directory and cannot be renamed.
I tried several alternative ways of run it, like:
elasticsearch -f -D
But then I get the following error, and I can't find any useful for solve it, it seems to be related with file perms but not sure:
java.io.IOException: Resource not found: "org/joda/time/tz/data/ZoneInfoMap" ClassLoader: sun.misc.Launcher$AppClassLoader#33909752
at org.joda.time.tz.ZoneInfoProvider.openResource(ZoneInfoProvider.java:210)
at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:127)
at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:86)
at org.joda.time.DateTimeZone.getDefaultProvider(DateTimeZone.java:514)
at org.joda.time.DateTimeZone.getProvider(DateTimeZone.java:413)
at org.joda.time.DateTimeZone.forID(DateTimeZone.java:216)
at org.joda.time.DateTimeZone.getDefault(DateTimeZone.java:151)
at org.joda.time.chrono.ISOChronology.getInstance(ISOChronology.java:79)
at org.joda.time.DateTimeUtils.getChronology(DateTimeUtils.java:266)
at org.joda.time.format.DateTimeFormatter.selectChronology(DateTimeFormatter.java:968)
at org.joda.time.format.DateTimeFormatter.printTo(DateTimeFormatter.java:672)
at org.joda.time.format.DateTimeFormatter.printTo(DateTimeFormatter.java:560)
at org.joda.time.format.DateTimeFormatter.print(DateTimeFormatter.java:644)
at org.elasticsearch.Build.<clinit>(Build.java:51)
at org.elasticsearch.node.Node.<init>(Node.java:135)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:145)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:170)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
[2015-11-01 20:40:57,602][INFO ][node ] [Centurius] version[2.0.0], pid[22063], build[de54438/2015-10-22T08:09:48Z]
[2015-11-01 20:40:57,605][INFO ][node ] [Centurius] initializing ...
Exception in thread "main" java.lang.IllegalStateException: failed to load bundle [] due to jar hell
Likely root cause: java.security.AccessControlException: access denied ("java.io.FilePermission" "/usr/local/Cellar/elasticsearch/2.0.0/libexec/antlr-runtime-3.5.jar" "read")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
at java.security.AccessController.checkPermission(AccessController.java:884)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
at java.util.zip.ZipFile.<init>(ZipFile.java:210)
at java.util.zip.ZipFile.<init>(ZipFile.java:149)
at java.util.jar.JarFile.<init>(JarFile.java:166)
at java.util.jar.JarFile.<init>(JarFile.java:103)
at org.elasticsearch.bootstrap.JarHell.checkJarHell(JarHell.java:173)
at org.elasticsearch.plugins.PluginsService.loadBundles(PluginsService.java:340)
at org.elasticsearch.plugins.PluginsService.<init>(PluginsService.java:113)
at org.elasticsearch.node.Node.<init>(Node.java:144)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:145)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:170)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
Refer to the log for complete error details.
Thanks for your help.
There are some changes with libexec with Elasticsearch/homebrew installation and that is why it is failing to start. There is a PR #45644 currently being worked on. Till the PR gets accepted, you can use the same formula to fix the installation of Elasticsearch.
First uninstall the earlier/older version. Then edit the formula of Elasticsearch:
$ brew edit elasticsearch
And use the formula from the PR.
Then do brew install elasticsearch, it should work fine.
To start Elasticsearch, just do:
$ elasticsearch
config option is no longer valid. For custom config, use path.config:
$ elasticsearch --path.conf=/usr/local/opt/elasticsearch/config

hadoop only launch local job by default why?

I have written my own hadoop program and I can run using pseudo distribute mode in my own laptop, however, when I put the program in the cluster which can run example jar of hadoop, it by default launches the local job though I indicate the hdfs file path, below is the output, give suggestions?
./hadoop -jar MyRandomForest_oob_distance.jar hdfs://montana-01:8020/user/randomforest/input/genotype1.txt hdfs://montana-01:8020/user/randomforest/input/phenotype1.txt hdfs://montana-01:8020/user/randomforest/output1_distance/ hdfs://montana-01:8020/user/randomforest/input/genotype101.txt hdfs://montana-01:8020/user/randomforest/input/phenotype101.txt 33 500 1
12/03/16 16:21:25 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/03/16 16:21:25 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/16 16:21:25 INFO mapred.JobClient: Running job: job_local_0001
12/03/16 16:21:25 INFO mapred.MapTask: io.sort.mb = 100
12/03/16 16:21:25 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/16 16:21:25 INFO mapred.MapTask: record buffer = 262144/327680
12/03/16 16:21:25 WARN mapred.LocalJobRunner: job_local_0001
java.io.FileNotFoundException: File /user/randomforest/input/genotype1.txt does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
at Data.Data.loadData(Data.java:103)
at MapReduce.DearMapper.loadData(DearMapper.java:261)
at MapReduce.DearMapper.setup(DearMapper.java:332)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
12/03/16 16:21:26 INFO mapred.JobClient: map 0% reduce 0%
12/03/16 16:21:26 INFO mapred.JobClient: Job complete: job_local_0001
12/03/16 16:21:26 INFO mapred.JobClient: Counters: 0
Total Running time is: 1 secs
LocalJobRunner has been chosen as your configuration most probably has the mapred.job.tracker property set to local or has not been set at all (in which case the default is local). To check, go to "wherever you extracted/installed hadoop"/etc/hadoop/ and see if the file mapred-site.xml exists (for me it did not, a file called mapped-site.xml.template was there). In that file (or create it if it doesn't exist) make sure it has the following property:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
See the source for org.apache.hadoop.mapred.JobClient.init(JobConf)
What is the value of this configuration property in the hadoop configuration on the machine you are submitting this from? Also confirm that the hadoop executable you are running references this configuration (and that you don't have 2+ installations configured differently) - type which hadoop and trace any symlinks you come across.
Alternatively you can override this when you submit your job, if you know the JobTracker host and port number using the -jt option:
hadoop jar MyRandomForest_oob_distance.jar -jt hostname:port hdfs://montana-01:8020/user/randomforest/input/genotype1.txt hdfs://montana-01:8020/user/randomforest/input/phenotype1.txt hdfs://montana-01:8020/user/randomforest/output1_distance/ hdfs://montana-01:8020/user/randomforest/input/genotype101.txt hdfs://montana-01:8020/user/randomforest/input/phenotype101.txt 33 500 1
If you're using Hadoop 2 and your job is running locally instead of on the cluster, ensure that you have setup mapred-site.xml to contain the mapreduce.framework.name property with a value of yarn. You also need to set up an aux-service in yarn-site.xml
Checkout the Cloudera Hadoop 2 operator migration blog for more information.
I had the same problem that every mapreduce v2 (mrv2) or yarn task only ran with the mapred.LocalJobRunner
INFO mapred.LocalJobRunner: Starting task: attempt_local284299729_0001_m_000000_0
The Resourcemanager and Nodemanagers were accessible and the mapreduce.framework.name was set to yarn.
Setting the HADOOP_MAPRED_HOME before executing the job fixed the problem for me.
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
cheers
dan

Resources