Adding Local Files in Beeline (Hive) - apache-hive

I'm trying to add local files via the Beeline client, however I keep running into an issue where it tells me the file does not exist.
[test@test-001 tmp]$ touch /tmp/m.py
[test@test-001 tmp]$ stat /tmp/m.py
File: ‘/tmp/m.py’
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 801h/2049d Inode: 34091464 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 1036/ test) Gid: ( 1037/ test)
Context: unconfined_u:object_r:user_tmp_t:s0
Access: 2017-02-27 22:04:06.527970709 +0000
Modify: 2017-02-27 22:04:06.527970709 +0000
Change: 2017-02-27 22:04:06.527970709 +0000
Birth: -
[test@test-001 tmp]$ beeline -u jdbc:hive2://hs2-test:10000/default -n r-zubis
Connecting to jdbc:hive2://hs2-test:10000/default
Connected to: Apache Hive (version 1.2.1.2.3.0.0-2557)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1 by Apache Hive
0: jdbc:hive2://hs2-test:10000/def> ADD FILE '/tmp/m.py';
Error: Error while processing statement: '/tmp/m.py' does not exist (state=,code=1)
0: jdbc:hive2://hs2-test:10000/def>
What's the issue?

It turns out you can only add files that are on the box HiveServer2 is running on (and I needed to remove the quotes around the path). I found this via a blog comment on Cloudera. I'm not sure why it isn't in the Beeline docs.

If, like me, you are stuck with HiveServer2 running remotely, Beeline will let you load the files from HDFS instead:
hdfs dfs -put /tmp/m.py
then
beeline> add file hdfs:/user/homedir/m.py;
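Putting it together, the full flow against a remote HiveServer2 looks roughly like this (the HDFS home directory and file names are illustrative, and note the unquoted path in ADD FILE):
hdfs dfs -put /tmp/m.py /user/homedir/m.py
beeline -u jdbc:hive2://hs2-test:10000/default -n r-zubis
0: jdbc:hive2://hs2-test:10000/default> ADD FILE hdfs:/user/homedir/m.py;
0: jdbc:hive2://hs2-test:10000/default> LIST FILES;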

Related

Custom Runtime Won't Use Dockerfile

I have an App Engine service that I deploy as a custom runtime in the flexible environment. Deployments functioned normally on 11/20. On 11/21, gcloud app deploy stopped using the Dockerfile and began treating the service as a non-custom runtime. Neither the app.yaml nor the Dockerfile has changed.
Below are sample logs from 11/20 and 11/21, respectively. You will note that the Using Dockerfile found in... line of the first log is not present in the second log.
First log, 11/20:
2020-11-20 11:12:02,202 DEBUG root Loaded Command Group: ['gcloud', 'app']
2020-11-20 11:12:02,547 DEBUG root Loaded Command Group: ['gcloud', 'app', 'deploy']
2020-11-20 11:12:02,551 DEBUG root Running [gcloud.app.deploy] with arguments: [--project: "distributed-computing-qa", --version: "9-2-0rc9"]
2020-11-20 11:12:02,621 INFO oauth2client.client Refreshing access_token
2020-11-20 11:12:03,043 DEBUG root Loading runtimes experiment config from [gs://runtime-builders/experiments.yaml]
2020-11-20 11:12:03,076 INFO root Reading [<googlecloudsdk.api_lib.storage.storage_util.ObjectReference object at 0x0000021920ECA548>]
2020-11-20 11:12:03,526 DEBUG root API endpoint: [https://appengine.googleapis.com/], API version: [v1]
2020-11-20 11:12:04,419 INFO ___FILE_ONLY___ Services to deploy:
2020-11-20 11:12:04,420 INFO ___FILE_ONLY___ descriptor: [C:\Users\Benjamin Filkins\Documents\Projects\Deployment\QA\dci\app.yaml]
source: [C:\Users\Benjamin Filkins\Documents\Projects\Deployment\QA\dci]
target project: [distributed-computing-qa]
target service: [default]
target version: [9-2-0rc9]
target url: [https://distributed-computing-qa.uc.r.appspot.com]
2020-11-20 11:12:05,272 DEBUG root No bucket specified, retrieving default bucket.
2020-11-20 11:12:05,274 DEBUG root Using bucket [gs://staging.distributed-computing-qa.appspot.com].
2020-11-20 11:12:05,941 DEBUG root Service [appengineflex.googleapis.com] is already enabled for project [distributed-computing-qa]
2020-11-20 11:12:06,109 INFO ___FILE_ONLY___ Beginning deployment of service [default]...
2020-11-20 11:12:06,123 INFO root Ignoring directory [node_modules]: Directory matches ignore regex.
2020-11-20 11:12:09,085 INFO root Ignoring directory [server\node_modules]: Directory matches ignore regex.
2020-11-20 11:12:09,679 INFO root Using Dockerfile found in C:\Users\Benjamin Filkins\Documents\Projects\Deployment\QA\dci
2020-11-20 11:12:09,679 INFO ___FILE_ONLY___ Building and pushing image for service [default]
2020-11-20 11:12:10,305 DEBUG root Could not call git with args ('config', '--get-regexp', 'remote\\.(.*)\\.url'): Command '['git', 'config', '--get-regexp', 'remote\\.(.*)\\.url']' returned non-zero exit status 1.
2020-11-20 11:12:10,305 INFO root Could not generate [source-context.json]: Could not list remote URLs from source directory: C:\Users\Benjamin Filkins\Documents\Projects\Deployment\QA\dci
2020-11-20 11:12:37,592 INFO root Uploading [C:\Users\BENJAM~1\AppData\Local\Temp\tmpwbdhi28f\src.tgz] to [staging.distributed-computing-qa.appspot.com/us.gcr.io/distributed-computing-qa/appengine/default.9-2-0rc9:latest]
2020-11-20 11:13:03,413 DEBUG root Using builder image: [gcr.io/cloud-builders/docker]
Second log, 11/21:
2020-11-21 05:10:39,041 DEBUG root Loaded Command Group: ['gcloud', 'app']
2020-11-21 05:10:39,177 DEBUG root Loaded Command Group: ['gcloud', 'app', 'deploy']
2020-11-21 05:10:39,181 DEBUG root Running [gcloud.app.deploy] with arguments: [--project: "distributed-computing-qa", --version: "9-2-0rc10"]
2020-11-21 05:10:39,203 DEBUG root Loading runtimes experiment config from [gs://runtime-builders/experiments.yaml]
2020-11-21 05:10:39,231 INFO root Reading [<googlecloudsdk.api_lib.storage.storage_util.ObjectReference object at 0x000001E60B3ED208>]
2020-11-21 05:10:39,522 DEBUG root API endpoint: [https://appengine.googleapis.com/], API version: [v1]
2020-11-21 05:10:40,196 INFO ___FILE_ONLY___ Services to deploy:
2020-11-21 05:10:40,198 INFO ___FILE_ONLY___ descriptor: [C:\Users\Benjamin Filkins\Documents\Projects\Deployment\QA\dci\app.yaml]
source: [C:\Users\Benjamin Filkins\Documents\Projects\Deployment\QA\dci]
target project: [distributed-computing-qa]
target service: [default]
target version: [9-2-0rc10]
target url: [https://distributed-computing-qa.uc.r.appspot.com]
2020-11-21 05:10:44,749 DEBUG root No bucket specified, retrieving default bucket.
2020-11-21 05:10:44,758 DEBUG root Using bucket [gs://staging.distributed-computing-qa.appspot.com].
2020-11-21 05:10:45,460 DEBUG root Service [appengineflex.googleapis.com] is already enabled for project [distributed-computing-qa]
2020-11-21 05:10:45,645 INFO ___FILE_ONLY___ Beginning deployment of service [default]...
2020-11-21 05:10:45,658 INFO root Ignoring directory [node_modules]: Directory matches ignore regex.
2020-11-21 05:10:48,255 INFO root Ignoring directory [server\node_modules]: Directory matches ignore regex.
2020-11-21 05:10:57,261 DEBUG root Could not call git with args ('config', '--get-regexp', 'remote\\.(.*)\\.url'): Command '['git', 'config', '--get-regexp', 'remote\\.(.*)\\.url']' returned non-zero exit status 1.
2020-11-21 05:10:57,261 INFO root Could not find any remote repositories associated with [C:\Users\Benjamin Filkins\Documents\Projects\Deployment\QA\dci]. Cloud diagnostic tools may not be able to display the correct source code for this deployment.
2020-11-21 05:11:19,099 DEBUG root Skipping upload of [.env]
2020-11-21 05:11:19,099 INFO root Incremental upload skipped 100.0% of data
This is now occurring on four separate projects. A co-worker can also confirm the same behavior. What I have tried and can confirm:
Updated Google Cloud SDK to latest version (319.0.0)
Confirmed Cloud Build API is active
Confirmed the Cloud Build service account has the App Engine Admin, Cloud Build Service Account and Service Account User roles
app.yaml and Dockerfile present in the project root and unchanged between attempts
app.yaml contains runtime: custom and env: flex (a minimal sketch follows this list)
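For reference, here is what a minimal app.yaml for a custom flexible-environment runtime looks like (a sketch based only on the two settings confirmed above; anything else would be project-specific):
runtime: custom
env: flex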
What I cannot confirm with certainty or prove did not have an impact:
Changes in OS (Windows 10), though no update had occurred during this time period
Changes to my GCP service account roles/permissions, though given that the issue spans four distinct projects and affects multiple users, this seems incredibly unlikely
Any additional insight into this issue or additional items I may have missed would be greatly appreciated.
I have solved the issue by downgrading to SDK version 271.0.0. My machine has both Python 2.7 and 3, and I noted that 274 and above began supporting Python 3.
Upgrading to 274 or above reproduces the reported issue; 273 and below (I only went as far back as 267) does not. While I am currently unable to provide concrete evidence, my suspicion is that it comes down to the SDK's ability to determine which version of Python to prefer. As noted here, support for Python 2 was deprecated on 09/30/2020.
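For anyone applying the same workaround: gcloud components update accepts a --version flag, so the in-place downgrade is a one-liner (271.0.0 is simply the version that worked for me; pick whichever works for you):
gcloud components update --version 271.0.0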

Brew postinstall mysql@5.7 complaining about data directory not empty when it is empty

I'm having a lot of trouble installing MySQL 5.7 on Mac Mojave (I ran brew install mysql@5.7).
On the initial install, I got a message saying postinstall was not completed successfully (please see the message below).
Even after I delete everything in the directory /usr/local/var/mysql (which MySQL says is not empty), I STILL get the same message when re-running the postinstall command. Which is quite annoying: it seems MySQL populates the data dir and then complains that it is not empty.
[08:02:48][~/tmp]# brew postinstall mysql@5.7
==> Postinstalling mysql@5.7
==> /usr/local/Cellar/mysql@5.7/5.7.28/bin/mysqld --initialize-insecure --user=gert --basedir=/usr/local/Cellar/mysql@5.7/5.7.28 --datadir=/usr/local/var/mysql
Last 15 lines from /Users/gert/Library/Logs/Homebrew/mysql@5.7/post_install.01.mysqld:
2019-12-09 08:03:39 +0200
/usr/local/Cellar/mysql@5.7/5.7.28/bin/mysqld
--initialize-insecure
--user=gert
--basedir=/usr/local/Cellar/mysql@5.7/5.7.28
--datadir=/usr/local/var/mysql
--tmpdir=/tmp
2019-12-09T06:03:39.151987Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-12-09T06:03:39.154025Z 0 [ERROR] --initialize specified but the data directory has files in it. Aborting.
2019-12-09T06:03:39.154074Z 0 [ERROR] Aborting
Trying to start mysql as root gives error:
[08:04:41][~/tmp]# sudo /usr/local/opt/mysql@5.7/bin/mysql.server start
Password:
Starting MySQL ..... ERROR! The server quit without updating PID file (/var/run/mysqld/mysqld.pid).
I've been banging my head against the wall for days now trying to follow the StackOverflow posts on MySql server startup error 'The server quit without updating PID file', none of which is working...
My my.cnf:
[mysqld]
# Only allow connections from localhost
#bind-address = 127.0.0.1
#SO posts said to comment out the above ...
pid-file = /var/run/mysqld/mysqld.pid #Checked, this folder + file exists, with write permissions
Try using a data dir away from the MySQL directory, i.e. if MySQL is in /usr/local/mysql, use /var/data as the data dir:
root@photon [ /var ]# /usr/local/mysql/bin/mysqld --initialize-insecure --user=mysql --datadir=/var/data
2020-02-22T21:42:27.121230Z 0 [System] [MY-013169] [Server] /usr/local/mysql/bin/mysqld (mysqld 8.0.19) initializing of server in progress as process 820
2020-02-22T21:42:35.018238Z 5 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
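Adapting that to the Homebrew install above, a rough sketch of the follow-up steps (the Cellar path matches the version in the question; /var/data is an arbitrary choice):
sudo mkdir -p /var/data && sudo chown "$(whoami)" /var/data
/usr/local/Cellar/mysql@5.7/5.7.28/bin/mysqld --initialize-insecure --user="$(whoami)" --basedir=/usr/local/Cellar/mysql@5.7/5.7.28 --datadir=/var/data
Then point the server at the new location by adding datadir = /var/data under [mysqld] in my.cnf before starting it.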

Starting Zabbix Server within Docker replaces strings in the config file with nothing

...or totally ignores strings, like the name of the new DB for testing purposes.
First I tried to add roughly 250 more hosts on top of the 250 already added, and the Zabbix server shut down. I restarted it, and inside the docker logs I saw this:
6:20191014:091840.201 using configuration file: /etc/zabbix/zabbix_server.conf
6:20191014:091840.223 current database version (mandatory/optional): 04020000/04020001
6:20191014:091840.223 required mandatory version: 04020000
6:20191014:091840.484 __mem_malloc: skipped 7 asked 108424 skip_min 304 skip_max 12192
6:20191014:091840.484 [file:dbconfig.c,line:94] __zbx_mem_realloc(): out of memory (requested 108424 bytes)
6:20191014:091840.484 [file:dbconfig.c,line:94] __zbx_mem_realloc(): please increase CacheSize configuration parameter
6:20191014:091840.484 === memory statistics for configuration cache ===
The solution for this problem was to increase CacheSize in zabbix_server.conf. Okay, that's not a problem, so I pushed a new config to the Zabbix server and restarted it... and the server stopped right after starting, with the logs reporting the same problem. After reading the config inside the container, I saw that the strings I had corrected to match my wishes were missing. The strings had been deleted.
My config:
LogType=console
DBHost=postgres-server
DBName=zabbix_pwd
DBSchema=public
DBUser=zabbix
DBPassword=zabbix
DBPort=5432
StartPollers=5
StartIPMIPollers=5
StartPollersUnreachable=5
SNMPTrapperFile=/var/lib/zabbix/snmptraps/snmptraps.log
StartSNMPTrapper=1
CacheSize=512M
HistoryCacheSize=512M
HistoryIndexCacheSize=512M
TrendCacheSize=512m
ValueCacheSize=256M
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
FpingLocation=/usr/sbin/fping
Fping6Location=/usr/sbin/fping6
SSHKeyLocation=/var/lib/zabbix/ssh_keys
SSLCertLocation=/var/lib/zabbix/ssl/certs/
SSLKeyLocation=/var/lib/zabbix/ssl/keys/
SSLCALocation=/var/lib/zabbix/ssl/ssl_ca/
LoadModulePath=/var/lib/zabbix/modules/
And what I've getting after starting z-server:
LogType=console
DBHost=postgres-server
DBName=zabbix_pwd
DBSchema=public
DBUser=zabbix
DBPassword=zabbix
DBPort=5432
SNMPTrapperFile=/var/lib/zabbix/snmptraps/snmptraps.log
StartSNMPTrapper=1
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
FpingLocation=/usr/sbin/fping
Fping6Location=/usr/sbin/fping6
SSHKeyLocation=/var/lib/zabbix/ssh_keys
SSLCertLocation=/var/lib/zabbix/ssl/certs/
SSLKeyLocation=/var/lib/zabbix/ssl/keys/
SSLCALocation=/var/lib/zabbix/ssl/ssl_ca/
LoadModulePath=/var/lib/zabbix/modules/
Any suggestions on how to rule the world without being captured by the doctors?
With Docker you need to send config parameters in the docker-compose.yml file, or in your docker run command using the -e flag.
For example, from my docker-compose file:
zabbix-server:
  image: zabbix/zabbix-server-pgsql:ubuntu-4.2.6
  environment:
    ZBX_MAXHOUSEKEEPERDELETE: 5000
    ZBX_STARTPOLLERS: 15
    ZBX_CACHESIZE: 8M
    ZBX_STARTDBSYNCERS: 4
    ZBX_HISTORYCACHESIZE: 16M
    ZBX_TRENDCACHESIZE: 4M
    ZBX_VALUECACHESIZE: 8M
    ZBX_LOGSLOWQUERIES: 3000
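The equivalent with plain docker run passes each parameter with -e. The ZBX_* variables are how the official Zabbix images generate zabbix_server.conf on startup, which is also why hand-edited values inside the container disappear on restart. A sketch using the cache sizes from the config above:
docker run -d --name zabbix-server \
  -e ZBX_CACHESIZE=512M \
  -e ZBX_HISTORYCACHESIZE=512M \
  -e ZBX_HISTORYINDEXCACHESIZE=512M \
  -e ZBX_VALUECACHESIZE=256M \
  zabbix/zabbix-server-pgsql:ubuntu-4.2.6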
Another way to work with zabbix:
https://hub.docker.com/r/monitoringartist/zabbix-3.0-xxl/

Starting neo4j from docker container shows "Neo4j is not running"

I have been trying to use Neo4j Community in a container and am getting errors. I think this might be more of a Docker usage issue than a Neo4j usage issue.
I have built container images from https://github.com/neo4j/docker-neo4j-publish for 2.3.9, 3.3.3, 3.3.4 and 3.3.5 (the only differences being some new ports in later versions). I have even pulled a native 3.3.3 from dockerhub.com:
mkdir /tmp/data
chmod 777 /tmp/data
docker run --detach=true --name=neo4j --publish=7474:7474 --publish=7687:7687 --publish=7473:7473 --volume=/tmp/data:/data neo4j:3.3.3
docker exec -it neo4j find / -name '*.log'
and although it seems to be working with
neo4j> CREATE (n);
0 rows available after 50 ms, consumed after another 0 ms
Added 1 nodes
neo4j> CREATE (m),(o);
0 rows available after 15 ms, consumed after another 0 ms
Added 2 nodes
neo4j> MATCH (n) RETURN n;
+----+
| n |
+----+
| () |
| () |
| () |
+----+
3 rows available after 21 ms, consumed after another 8 ms
I actually get errors like this:
docker exec -it neo4j neo4j status
Neo4j is not running
Now this one looks like I am mistakenly trying to start another instance of Neo4j over a running instance:
docker exec -it neo4j neo4j console
Active database: graph.db
Directories in use:
home: /var/lib/neo4j
config: /var/lib/neo4j/conf
logs: /var/lib/neo4j/logs
plugins: /var/lib/neo4j/plugins
import: /var/lib/neo4j/import
data: /var/lib/neo4j/data
certificates: /var/lib/neo4j/certificates
run: /var/lib/neo4j/run
Starting Neo4j.
2018-04-15 06:30:13.119+0000 WARN Unknown config option: causal_clustering.discovery_listen_address
2018-04-15 06:30:13.123+0000 WARN Unknown config option: causal_clustering.raft_advertised_address
2018-04-15 06:30:13.123+0000 WARN Unknown config option: causal_clustering.raft_listen_address
2018-04-15 06:30:13.123+0000 WARN Unknown config option: ha.host.coordination
2018-04-15 06:30:13.124+0000 WARN Unknown config option: causal_clustering.transaction_advertised_address
2018-04-15 06:30:13.124+0000 WARN Unknown config option: causal_clustering.discovery_advertised_address
2018-04-15 06:30:13.124+0000 WARN Unknown config option: ha.host.data
2018-04-15 06:30:13.124+0000 WARN Unknown config option: causal_clustering.transaction_listen_address
2018-04-15 06:30:13.146+0000 INFO ======== Neo4j 3.3.3 ========
2018-04-15 06:30:13.186+0000 INFO Starting...
2018-04-15 06:30:13.997+0000 INFO Bolt enabled on 0.0.0.0:7687.
2018-04-15 06:30:14.094+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@44a59da3' was successfully initialized, but failed to start. Please see the attached cause exception "Store and its lock file has been locked by another process: /var/lib/neo4j/data/databases/graph.db/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@44a59da3' was successfully initialized, but failed to start. Please see the attached cause exception "Store and its lock file has been locked by another process: /var/lib/neo4j/data/databases/graph.db/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)".
org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@44a59da3' was successfully initialized, but failed to start. Please see the attached cause exception "Store and its lock file has been locked by another process: /var/lib/neo4j/data/databases/graph.db/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)".
Does anybody have experience with Neo4j's Docker implementation? Is it a single-threaded issue, meaning I need to call the CLI tools differently from the container?
The neo4j status command only works if you've started Neo4j with neo4j start. start creates a neo4j.pid file that status uses to see whether Neo4j is running. Starting under Docker uses the console option instead of the start option, which does not create the PID file, so status doesn't work. But that hardly matters, because neo4j is just about the only process running in the container; if neo4j dies, the container will exit. If docker ps -a says that the container is up, then neo4j is up.
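In practice that means checking liveness from the Docker side rather than with neo4j status, for example:
docker ps --filter name=neo4j    # container running implies neo4j is running
docker logs --tail 50 neo4j      # read the server output instead of a status command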

Graylog 2.2.0-beta.1 in Docker with UDP input: Unable to load default stream

I'm trying to use graylog2 to collect logs from Docker containers. The docs say that only the UDP GELF input is supported for this purpose.
I'm using docker-compose to run the graylog server. See gist for all files used: https://gist.github.com/olegabr/7f5190c453bb63c71dabf151d2373c2f.
And I'm using this command to test it:
sendip -p ipv4 -is 127.0.0.1 -p udp -us 5070 -ud 12201 -d '{"version": "1.1","host":"example.org","short_message":"Short message","full_message":"Backtrace here\n\nmore stuff","level":1,"_user_id":9001,"_some_info":"foo","_some_env_var":"bar"}' -v 127.0.0.1
Server receives this message, but it can not process it. I see following in the graylog2 logs:
2016-12-09 11:53:20,125 WARN : org.graylog2.bindings.providers.DefaultStreamProvider - Unable to load default stream, tried 1 times, retrying every 500ms. Processing is blocked until this succeeds.
2016-12-09 11:53:25,129 WARN : org.graylog2.bindings.providers.DefaultStreamProvider - Unable to load default stream, tried 11 times, retrying every 500ms. Processing is blocked until this succeeds.
e.t.c. many many similar lines.
The API call curl http://admin:123456@127.0.0.1:9000/api/count/total returns
{"events":0}
In the server logs I see that the default stream was initialized:
mongo_1 | 2016-12-09T11:51:12.522+0000 I INDEX [conn3] build index on: graylog.pipeline_processor_pipelines_streams properties: { v: 2, unique: true, key: { stream_id: 1 }, name: "stream_id_1", ns: "graylog.pipeline_processor_pipelines_streams" }
graylog_1 | 2016-12-09 11:51:13,408 INFO : org.graylog2.periodical.Periodicals - Starting [org.graylog.plugins.pipelineprocessor.periodical.LegacyDefaultStreamMigration] periodical, running forever.
graylog_1 | 2016-12-09 11:51:13,424 INFO : org.graylog.plugins.pipelineprocessor.periodical.LegacyDefaultStreamMigration - Legacy default stream has no connections, no migration needed.
graylog_1 | 2016-12-09 11:51:13,487 INFO : org.graylog2.migrations.V20160929120500_CreateDefaultStreamMigration - Successfully created default stream: All messages
graylog_1 | 2016-12-09 11:51:13,653 INFO : org.graylog2.migrations.V20161125142400_EmailAlarmCallbackMigration - No streams needed to be migrated.
graylog_1 | 2016-12-09 11:51:13,662 INFO : org.graylog2.migrations.V20161125161400_AlertReceiversMigration - No streams needed to be migrated.
graylog_1 | 2016-12-09 11:51:13,672 INFO : org.graylog2.migrations.V20161130141500_DefaultStreamRecalcIndexRanges - Cluster not connected yet, delaying migration until it is reachable.
So why can't the default stream be loaded when the message arrives? And why is it needed in the first place?
I've tried to find similar reports on the web, but with no success.
This has nothing to do with the UDP input per se.
Graylog 2.2.0-beta.1 is broken and shouldn't be used. Please downgrade to Graylog 2.1.2 (the latest stable version) or wait for Graylog 2.2.0-beta.2.
See https://groups.google.com/forum/#!searchin/graylog2/docker|sort:date/graylog2/gCycC3_K3vU/EL-Lz_uNDQAJ for a related post on the Graylog mailing list.
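If you are on a docker-compose setup like the one in the gist above, the downgrade amounts to pinning the stable image tag; the repository and tag below are an assumption, so check Docker Hub for the actual latest 2.1.x tag:
graylog:
  image: graylog2/server:2.1.2-1   # hypothetical tag; pick the latest stable 2.1.x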
Same trouble here.
I just set up Graylog and configured a GELF UDP input on port 12209,
then tested it twice with:
docker run --log-driver=gelf --log-opt gelf-address=udp://127.0.0.1:12209 busybox echo Hello Graylog
In the UI I saw:
2 messages in process buffer
2 unprocessed messages are currently in the journal, in 1 segments.
0 messages have been appended in the last second, 0 messages have been read in the last second.
and I'm still getting:
2016-12-09 12:41:23,715 INFO : org.graylog2.inputs.InputStateListener - Input [GELF UDP/584aa67308813b00010d009e] is now RUNNING
2016-12-09 12:41:43,666 WARN : org.graylog2.bindings.providers.DefaultStreamProvider - Unable to load default stream, tried 1 times, retrying every 500ms. Processing is blocked until this succeeds.
Has anyone found a solution?
