FLUME EXCEPTION - twitter

I am trying to configure Flume and am following this link. The command I run is:
flume-ng agent -n TwitterAgent -c conf -f /usr/lib/apache-flume-1.7.0-bin/conf/flume.conf
The output I get, including the error, is:
17/01/31 12:04:08 INFO source.DefaultSourceFactory: Creating instance of source Twitter, type com.cloudera.flume.source.TwitterSource
17/01/31 12:04:08 ERROR node.PollingPropertiesFileConfigurationProvider: Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load source type: com.cloudera.flume.source.TwitterSource, class: com.cloudera.flume.source.TwitterSource
(This is only the error portion of the output.)
Can anyone help me solve this error, please? I need to fix it to move on to step 24, which is the last step.

Please find below a CDH 5.12 Flume Twitter setup:
1. Here is the file /usr/lib/flume-ng/conf/flume.conf:
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = xxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.consumerSecret = xxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.accessToken = xxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.accessTokenSecret = xxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.keywords = Hadoop,BigData
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://quickstart.cloudera:8020/user/cloudera/flume/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
2. Copy the flume-env.sh.template file to flume-env.sh:
~]$ sudo cp /usr/lib/flume-ng/conf/flume-env.sh.template /usr/lib/flume-ng/conf/flume-env.sh
3. Set JAVA_HOME and FLUME_CLASSPATH in the flume-env.sh file as follows:
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
FLUME_CLASSPATH="/usr/lib/flume-ng/lib/flume-sources-1.0-SNAPSHOT.jar"
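You can sanity-check this step before going further: the FlumeException above means the class named in flume.conf is not on the classpath, so verify that the jar exists and actually contains the TwitterSource class (a quick check, assuming the paths above and that the JDK's jar tool is on your PATH):
~]$ ls -l /usr/lib/flume-ng/lib/flume-sources-1.0-SNAPSHOT.jar
~]$ jar tf /usr/lib/flume-ng/lib/flume-sources-1.0-SNAPSHOT.jar | grep TwitterSource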
4. If you don't find "/usr/lib/flume-ng/lib/flume-sources-1.0-SNAPSHOT.jar" on your system, download apache-flume-1.6.0-bin and use its lib folder in place of the current one.
Link: https://www.apache.org/dist/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
4.1. Rename the old lib folder.
4.2. Download the archive from the link above to your Cloudera desktop, extract it, and run:
~]$ sudo mv /usr/lib/flume-ng/lib /usr/lib/flume-ng/lib_cloudera
~]$ sudo mv /home/cloudera/Desktop/apache-flume-1.6.0-bin/lib /usr/lib/flume-ng/lib
5. Now run the Flume agent command:
~]$ flume-ng agent --conf-file /usr/lib/flume-ng/conf/flume.conf --name TwitterAgent -Dflume.root.logger=INFO,console
This should run successfully.
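Once it is running, you can confirm tweets are arriving by listing the sink directory from the config above (assuming the same HDFS path):
~]$ hdfs dfs -ls /user/cloudera/flume/tweets/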
All the Best.

Related

Error: getDatabaseDefaults() failed. Do dumpStack() to see details

While following this tutorial to create an OAM container:
https://docs.oracle.com/en/middleware/idm/access-manager/12.2.1.4/tutorial-oam-docker/
I am faced with the following error:
------------ log start -------------
CONNECTION_STRING=oamDB:1521
RCUPREFIX=OAM04
DOMAIN_HOME=/u01/oracle/user_projects/domains/access_domain
INFO: Admin Server not configured. Will run RCU and Domain Configuration Phase...
Configuring Domain for first time
Start the Admin and Managed Servers
Loading RCU Phase
CONNECTION_STRING=oamDB:1521
RCUPREFIX=OAM04
jdbc_url=jdbc:oracle:thin:@oamDB:1521
Creating Domain 1st execution
RCU has already been loaded.. skipping
Domain Configuration Phase
/u01/oracle/oracle_common/common/bin/wlst.sh -skipWLSModuleScanning /u01/oracle/dockertools/create_domain.py -oh /u01/oracle -jh /u01/jdk -parent /u01/oracle/user_projects/domains -name access_domain -user weblogic -password weblogic1 -rcuDb oamDB:1521 -rcuPrefix OAM04 -rcuSchemaPwd oamdb1234 -isSSLEnabled true
Cmd is /u01/oracle/oracle_common/common/bin/wlst.sh -skipWLSModuleScanning /u01/oracle/dockertools/create_domain.py -oh /u01/oracle -jh /u01/jdk -parent /u01/oracle/user_projects/domains -name access_domain -user weblogic -password weblogic1 -rcuDb oamDB:1521 -rcuPrefix OAM04 -rcuSchemaPwd oamdb1234 -isSSLEnabled true
Initializing WebLogic Scripting Tool (WLST) ...
Welcome to WebLogic Server Administration Scripting Shell
Type help() for help on available commands
create_domain.py called with the following inputs:
INFO: sys.argv[0] = /u01/oracle/dockertools/create_domain.py
INFO: sys.argv[1] = -oh
INFO: sys.argv[2] = /u01/oracle
INFO: sys.argv[3] = -jh
INFO: sys.argv[4] = /u01/jdk
INFO: sys.argv[5] = -parent
INFO: sys.argv[6] = /u01/oracle/user_projects/domains
INFO: sys.argv[7] = -name
INFO: sys.argv[8] = access_domain
INFO: sys.argv[9] = -user
INFO: sys.argv[10] = weblogic
INFO: sys.argv[11] = -password
INFO: sys.argv[12] = weblogic1
INFO: sys.argv[13] = -rcuDb
INFO: sys.argv[14] = oamDB:1521
INFO: sys.argv[15] = -rcuPrefix
INFO: sys.argv[16] = OAM04
INFO: sys.argv[17] = -rcuSchemaPwd
INFO: sys.argv[18] = oamdb1234
INFO: sys.argv[19] = -isSSLEnabled
INFO: sys.argv[20] = true
INFO: Creating Admin server...
INFO: Enabling SSL PORT for AdminServer...
Creating Node Managers...
Will create Base domain at /u01/oracle/user_projects/domains/access_domain
Writing base domain...
Base domain created at /u01/oracle/user_projects/domains/access_domain
Extending domain at /u01/oracle/user_projects/domains/access_domain
Database oamDB:1521
Apply Extension templates
Extension Templates added
Extension Templates added
Deleting oam_server1
The default oam_server1 coming from the oam extension template deleted
Deleting oam_policy_mgr1
The default oam_server1 coming from the oam extension template deleted
Configuring JDBC Templates ...
Configuring the Service Table DataSource...
fmwDatabase jdbc:oracle:thin:@oamDB:1521
Getting Database Defaults...
Error: getDatabaseDefaults() failed. Do dumpStack() to see details.
Error: runCmd() failed. Do dumpStack() to see details.
Problem invoking WLST - Traceback (innermost last):
File "/u01/oracle/dockertools/create_domain.py", line 513, in ?
File "/u01/oracle/dockertools/create_domain.py", line 124, in createOAMDomain
File "/u01/oracle/dockertools/create_domain.py", line 328, in extendOamDomain
File "/u01/oracle/dockertools/create_domain.py", line 259, in configureJDBCTemplates
File "/tmp/WLSTOfflineIni6456738277719198193.py", line 267, in getDatabaseDefaults
File "/tmp/WLSTOfflineIni6456738277719198193.py", line 19, in command
Failed to build JDBC Connection object:
at com.oracle.cie.domain.script.jython.CommandExceptionHandler.handleException(CommandExceptionHandler.java:69)
at com.oracle.cie.domain.script.jython.WLScriptContext.handleException(WLScriptContext.java:3085)
at com.oracle.cie.domain.script.jython.WLScriptContext.runCmd(WLScriptContext.java:738)
at sun.reflect.GeneratedMethodAccessor141.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
com.oracle.cie.domain.script.jython.WLSTException: com.oracle.cie.domain.script.jython.WLSTException: Got exception when auto configuring the schema component(s) with data obtained from shadow table:
Failed to build JDBC Connection object:
Domain Creation Failed.. Please check the Domain Logs
------------ log end -------------
I am using the following docker run command for the admin server:
docker run -d -p 7001:7001 --name oamadmin --network=OamNET --env-file /home/oam/oracle/oam-admin.env --shm-size="8g" --volume /home/oam/oracle/user_projects:/u01/oracle/user_projects oam:12.2.1.4
oam-admin.env content:
DOMAIN_NAME=access_domain
ADMIN_USER=weblogic
ADMIN_PASSWORD=weblogic1
ADMIN_LISTEN_HOST=oamadmin
ADMIN_LISTEN_PORT=7001
CONNECTION_STRING=oamDB:1521
RCUPREFIX=OAM04
DB_USER=sys
DB_PASSWORD=oamdb1234
DB_SCHEMA_PASSWORD=oamdb1234
The Oracle database was created using:
docker run -d --name oamDB --network=oamNET -p 1521:1521 -p 5500:5500 -e ORACLE_PWD=db1 -v /home/oam/user/host/dbtemp:/opt/oracle/oradata --env-file /home/oam/oracle/env.txt -it --shm-size="8g" -e ORACLE_EDITION=enterprise -e ORACLE_ALLOW_REMOTE=true oamdb:19.3.0
I am able to connect to the DB using docker.
I have also successfully executed:
alter user sys identified by oamdb1234 container=all;
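For reference, a connectivity check from the host along these lines succeeds (a sketch; the ORCLCDB service name is an assumption here, as the 19.3 image's default CDB):
docker exec -it oamDB sqlplus "sys/oamdb1234@//localhost:1521/ORCLCDB as sysdba"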
Containers running in Docker:
oam#botrosubuntu:~/oracle$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d620fef9ddfc oamdb:19.3.0 "/bin/sh -c 'exec $O…" 9 days ago Up 3 hours (healthy) 0.0.0.0:1521->1521/tcp, 0.0.0.0:5500->5500/tcp oamDB

How to monitor Apache Flume agents status?

I know the Enterprise way (Cloudera, for example): via Cloudera Manager in a browser, or via the Cloudera REST API, one can access monitoring and configuration facilities.
But how can I schedule (run and rerun) the Flume agents' lifecycle and monitor their running/failure status without CM? Are there such facilities in the Flume distribution?
Flume's JSON Reporting API can be used to monitor health and performance.
I tried adding flume.monitoring.type/port to flume-ng on start, and it completely fits my needs.
Let's create a simple agent a1 as an example, which listens on localhost:44444 and logs to the console as a sink:
# flume.conf
a1.sources = s1
a1.channels = c1
a1.sinks = d1
a1.sources.s1.channels = c1
a1.sources.s1.type = netcat
a1.sources.s1.bind = localhost
a1.sources.s1.port = 44444
a1.sinks.d1.channel = c1
a1.sinks.d1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100
a1.channels.c1.transactionCapacity = 10
Run it with the additional flume.monitoring.type and flume.monitoring.port parameters:
flume-ng agent -n a1 -c conf -f flume.conf -Dflume.root.logger=INFO,console -Dflume.monitoring.type=http -Dflume.monitoring.port=44123
Then monitor the output in a browser at localhost:44123/metrics:
{"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","EventTakeSuccessCount":"570448","ChannelSize":"0","EventTakeAttemptCount":"570573","StartTime":"1567002601836","EventPutAttemptCount":"570449","EventPutSuccessCount":"570448","StopTime":"0"}}
Just try some load:
dd if=/dev/urandom count=1024 bs=1024 | base64 | nc localhost 44444
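The same endpoint can also be polled from a script or a cron job for basic health checks without CM (a minimal sketch, assuming the agent and port above):
curl -s http://localhost:44123/metrics | python -m json.tool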

The run result of flume and test flume

(screenshots of the Flume run and test output omitted)
My flume configuration file is as follows:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /home/hadoop/flume-1.5.0-bin/log_exec_tail
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
And I start my flume agent with the following script:
bin/flume-ng agent -n a1 -c conf -f conf/flume_log.conf -Dflume.root.logger=INFO,console
Question 1: The run result is shown in the screenshots above; however, I don't know whether it ran successfully or not.
Question 2: The tutorial also contains the following sentences, and I don't understand what is meant by this Flume test:
NOTE: To test that the Flume agent is running properly, open a new terminal window and change directories to /home/horton/solutions/:
horton@ip:~$ cd /home/horton/solutions/
Run the following script, which writes log entries to nodemanager.log:
$ ./test_flume_log.sh
If successful, you should see new files in the /user/horton/flume_sink directory in HDFS
Stop the logagent Flume agent
As per your flume configuration, whenever the file /home/hadoop/flume-1.5.0-bin/log_exec_tail changes, Flume tails it and appends the new lines to the console.
So to test that it is working correctly:
1. Run the command bin/flume-ng agent -n a1 -c conf -f conf/flume_log.conf -Dflume.root.logger=INFO,console
2. Open a terminal and add a few lines to the file /home/hadoop/flume-1.5.0-bin/log_exec_tail (see the example below)
3. Save it
4. Now check the terminal where you started the flume command
5. You should see the newly added lines displayed
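For step 2, appending from the shell works (a sketch; any text will do):
echo "test line $(date)" >> /home/hadoop/flume-1.5.0-bin/log_exec_tail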

loading file into hdfs using flume

I want to load a text file from my system into HDFS.
This is my conf file:
agent.sources = seqGenSrc
agent.sinks = loggerSink
agent.channels = memoryChannel
agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F my.system.IP/D:/salespeople.txt
agent.sinks.loggerSink.type = hdfs
agent.sinks.loggerSink.hdfs.path = hdfs://IP.address:port:user/flume
agent.sinks.loggerSink.hdfs.filePrefix = events-
agent.sinks.loggerSink.hdfs.round = true
agent.sinks.loggerSink.hdfs.roundValue = 10
agent.sinks.loggerSink.hdfs.roundUnit = minute
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 1000
agent.channels.memoryChannel.transactionCapacity = 100
agent.sources.seqGenSrc.channels = memoryChannel
agent.sinks.loggerSink.channel = memoryChannel
When I run it, I get the following output, and then it gets stuck.
13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Starting Channel memoryChannel
13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Waiting for channel: memoryChannel to start. Sleeping for 500 ms
13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Starting Sink loggerSink
13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Starting Source seqGenSrc
13/07/23 16:30:44 INFO source.ExecSource: Exec source starting with command:tail -F 10.48.226.27/D:/salespeople.txt
Where am I wrong, or what could be the error?
I assume you want to write your file to /user/flume, so your path should be :
agent.sinks.loggerSink.hdfs.path = hdfs://IP.address:port/user/flume
As your agent uses tail -F, there is no message that tells you it is finished (because it never is ^^). If you want to know whether your file was created, you have to look in the /user/flume folder.
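A quick way to look is to list the sink directory (assuming the corrected path above; on older Hadoop releases, hadoop fs -ls works the same way):
hdfs dfs -ls /user/flume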
I'm using a configuration like yours and it works perfectly. You could also try adding -Dflume.root.logger=INFO,console to get more information.

Not able to get output in hdfs directory using hdfs as sink in flume

I am trying to give a normal text file to Flume as the source, with HDFS as the sink. The source, channel, and sink all show as registered and started, but nothing arrives in the HDFS output directory. I'm new to Flume; can anyone help me with this?
The contents of my flume .conf file are:
agent12.sources = source1
agent12.channels = channel1
agent12.sinks = HDFS
agent12.sources.source1.type = exec
agent12.sources.source1.command = tail -F /usr/sap/sample.txt
agent12.sources.source1.channels = channel1
agent12.sinks.HDFS.channels = channel1
agent12.sinks.HDFS.type = hdfs
agent12.sinks.HDFS.hdfs.path= hdfs://172.18.36.248:50070:user/root/xz
agent12.channels.channel1.type =memory
agent12.channels.channel1.capacity = 1000
The agent is started using:
/usr/bin/flume-ng agent -n agent12 -c usr/lib//flume-ng/conf/sample.conf -f /usr/lib/flume-ng/conf/flume-conf.properties.template
