Use MongoDB with docker-compose: create database and user

Is it possible to create a database using docker-compose? I'm trying to run MongoDB on Docker, but I'm not able to create the user and initial database :(
version: '3.1'
services:
  mongo:
    image: mongo
    restart: always
    environment:
      - MONGO_INITDB_DATABASE=test
      - MONGO_INITDB_ROOT_USERNAME=admin
      - MONGO_INITDB_ROOT_PASSWORD=admin
    volumes:
      - ~/docker/volumes/mongodb:/data/db
    ports:
      - 27017:27017
What is missing in my script?
Test code:
from pymongo import MongoClient

mongo_client = MongoClient('mongodb://%s:%s@127.0.0.1' % ('admin', 'admin'))
cursor = mongo_client.list_databases()
for db in cursor:
    print(db)

The above docker-compose file looks fine. It will create a user named admin, and the test DB will be created if your *.js init file contains a script that inserts data. Otherwise no database is created, because MongoDB only creates a database when data is first written to it.
You can log in with the credentials specified in the environment:
- MONGO_INITDB_DATABASE=test
- MONGO_INITDB_ROOT_USERNAME=test
- MONGO_INITDB_ROOT_PASSWORD=admin
To initialize the DB, you need to mount your DB init script:
volumes:
  - ~/docker/volumes/mongodb:/data/db
  - ./yourdbscript/init.js:/docker-entrypoint-initdb.d/init.js
MONGO_INITDB_DATABASE
This variable allows you to specify the name of a database to be used
for creation scripts in /docker-entrypoint-initdb.d/*.js (see
Initializing a fresh instance below). MongoDB is fundamentally
designed for "create on first use", so if you do not insert data with
your JavaScript files, then no database is created.
Initializing a fresh instance
When a container is started for the first time it will execute files
with extensions .sh and .js that are found in
/docker-entrypoint-initdb.d. Files will be executed in alphabetical
order. .js files will be executed by mongo using the database
specified by the MONGO_INITDB_DATABASE variable, if it is present, or
test otherwise. You may also switch databases within the .js script.
(Source: mongo Docker image documentation.)
Update:
For your information, the code you pasted is not a JS file; it is a Python script. Also, consider changing the username from admin, since it might conflict with the built-in admin database.
from pymongo import MongoClient

mongo_client = MongoClient('mongodb://%s:%s@127.0.0.1' % ('admin', 'admin'))
cursor = mongo_client.list_databases()
for db in cursor:
    print(db)
Your init script, yourdbscript/init.js, will look something like the following. You can try this:
db = db.getSiblingDB("test");
db.article.drop();
db.article.save({
    title: "this is my title",
    author: "bob",
    posted: new Date(1079895594000),
    pageViews: 5,
    tags: ["fun", "good", "fun"],
    comments: [
        { author: "joe", text: "this is cool" },
        { author: "sam", text: "this is bad" }
    ],
    other: { foo: 5 }
});

Just like @Adiii's answer says: if you just declare a DB name and do not insert any data, the DB will not actually be created.
As for your question: you can add a simple script (a .sh or .js file) to the path /docker-entrypoint-initdb.d, as described in the "Initializing a fresh instance" section of the mongo docs.
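If you prefer to seed the database from Python instead of a .js init script, here is a minimal pymongo sketch. It assumes the admin/admin root credentials from the compose file above; the root user lives in the admin database, hence authSource=admin.
from pymongo import MongoClient

# The root user created by MONGO_INITDB_ROOT_USERNAME/MONGO_INITDB_ROOT_PASSWORD
# is stored in the "admin" database, so authenticate against it explicitly.
client = MongoClient('mongodb://admin:admin@127.0.0.1:27017/?authSource=admin')

db = client['test']
# MongoDB is "create on first use": the database only shows up in
# list_databases() once a document has been inserted.
db.article.insert_one({'title': 'this is my title', 'author': 'bob'})

print([d['name'] for d in client.list_databases()])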

Related

IoT-Agent OPC-UA Docker-compose setting for NGSI ld or NGSI v2

In the docker-compose files of the OPC-UA IoT-Agent there are some comments that are unclear to me; in particular, one line is marked to be commented if you want to use NGSI-LD, and another to be commented if you want to use NGSI-v2.
Reading the lines that should be commented out, however, it would seem necessary to uncomment both lines to use NGSI-LD, and to comment both of them out to use NGSI-v2.
Is my interpretation correct? Thanks for clearing this up.
PS: the same issue is present in the file docker-compose-external-server.yml.
Setting up NGSI-v2 vs NGSI-LD is common to all IoT Agents. The Installation Guide describes the required configuration - default operation is NGSI-v2.
If you want to operate NGSI-LD, the ngsiVersion and jsonLdContext must be defined.
{
    host: '192.168.56.101',
    port: '1026',
    ngsiVersion: 'ld',
    jsonLdContext: 'http://context.json-ld'
}
ngsiVersion can be v2, ld or mixed.
Both settings can also be set up using environment variables, which is more convenient when using Docker.
Therefore, for NGSI-LD the following minimal set-up is required:
iotage:
  hostname: iotage
  image: iotagent4fiware/iotagent-opcua:latest
  environment:
    - IOTA_CB_NGSI_VERSION=ld
    - IOTA_JSON_LD_CONTEXT=https://path-to-context-file
    - IOTA_FALLBACK_TENANT=opcua_car
    - IOTA_RELAX_TEMPLATE_VALIDATION=true
For NGSI-v2 the following is required:
iotage:
  hostname: iotage
  image: iotagent4fiware/iotagent-opcua:latest
  environment:
    - IOTA_CB_NGSI_VERSION=v2
    - IOTA_RELAX_TEMPLATE_VALIDATION=true
IOTA_RELAX_TEMPLATE_VALIDATION is required for OPC-UA to allow the provisioning of OPC-UA topics with = within them, which would normally be disallowed.
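Once the stack is up, a quick way to confirm the agent is responding is to query its administration API from Python. This is just a sanity-check sketch; the /iot/about endpoint and port 4041 are the iotagent-node-lib defaults, so adjust them if your compose file maps things differently.
import requests

# /iot/about is served by iotagent-node-lib and reports version information;
# 4041 is the conventional north (provisioning) port for IoT Agents.
resp = requests.get('http://localhost:4041/iot/about')
print(resp.status_code, resp.json())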

How can you conditionally set services in the docker-compose.yml file?

I'm new to using Docker and Cake. At the moment we have a simple Cake task that runs the DockerComposeUp() method, which takes a DockerComposeUpSettings object. The docker-compose.yaml file holds some info on a service that I want to run conditionally (serviceA):
version: "1.0"
services:
serviceA:
image: someImage
ports:
-"000000"
-"000001"
serviceB:
image: someOtherImage
anotherProperty: somethingElse
ports:
-"111111"
I've tried splitting out serviceA into a separate docker-compose file called 'docker-compose.serviceA.yaml' and calling it by adding to the DockerComposeUpSettings.ArgumentCustomization the following:
if (some setting)
{
    dockerComposeUpSettings.ArgumentCustomization = builder => builder.Append("-f docker-compose.yaml -f docker-compose.serviceA.yaml");
}
However, Cake throws the following error:
"unknown shorthand flag: 'f' in -f"
How can I merge two docker-compose files as part of the DockerComposeUp method using Cake?
Update
I've found there is a 'Files' property on the DockerComposeUpSettings object (inherited from DockerComposeSettings object), where you can declare the configuration files. So I've added:
if (some flag)
{
    dockerComposeSettings.Files = new[] { "docker-compose.yaml", "docker-compose.serviceA.yaml" };
}
I don't know much about Docker, but looking at the docs here and here, it seems important that the -f options come before the command on the command line. Your customization (builder.Append()) puts them at the end of the command line, after the command, which would explain the "unknown shorthand flag" error.
Have you tried setting the Files property of the DockerComposeUpSettings? That looks like what you are looking for.

Error creating a keyspace with dataframes in spark cassandra

I'm trying to connect Spark to Cassandra, and then I query the keyspace and table from Flask.
The problem is that when I run the web application I get an error saying that the keyspace is not created.
cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] message="Keyspace MyKeyspace does not exist"
In spark I run the following commands:
val flightRecommendations = finalPredictions.writeStream.foreachBatch {
  (batchDF: DataFrame, batchId: Long) =>
    batchDF
      .write
      .cassandraFormat("MytableName", "MyKeyspace")
      .option("cluster", "cassandra_cluster")
      .mode("append")
      .save
}.start()
My question is whether the above code automatically generates the keyspace and the table.
I think it could also be a connection problem, because I'm working in Docker, and the setting I use is this:
spark.setCassandraConf("cassandra_cluster", CassandraConnectorConf.ConnectionHostParam.option("cassandra"))
Also in the spark-submit command I put the following two configurations:
--conf spark.cassandra.connection.host=cassandra \
--conf spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions \
It's weird because the spark-submit doesn't give errors but the keyspace is not created.
Yes, that's possible since Spark Cassandra Connector 2.5.0. There is a new function createCassandraTableEx that allows creating a new table based on the DataFrame schema, and it has an option to handle the case where the table already exists (in addition to other things, like controlling the sort order of the clustering columns, table options, etc.). Before 2.5.0 there was the createCassandraTable function, but it threw an exception if the table already existed.
Here is an example from the blog post announcing the 2.5.0 release. For a DataFrame with the following structure:
root
|-- id: integer (nullable = false)
|-- c: integer (nullable = false)
|-- t: string (nullable = true)
it's possible to create a new table using the following code:
import com.datastax.spark.connector.cql.ClusteringColumn
import org.apache.spark.sql.cassandra._
import com.datastax.spark.connector._
data.createCassandraTableEx("test", "test_new", Seq("id"),
  Seq(("c", ClusteringColumn.Descending)),
  ifNotExists = true, tableOptions = Map("gc_grace_seconds" -> "1000"))
Also, you don't need to use foreachBatch with that version; it was required only before 2.5.0. In the new version you can just write:
val query = streamingCountsDF.writeStream
  .outputMode(OutputMode.Update)
  .format("org.apache.spark.sql.cassandra")
  .option("checkpointLocation", ".../checkpoint")
  .option("table", "tablename")
  .option("keyspace", "ksname")
  .start()
And with Spark 3.x & SCC 3.x you can create keyspaces & tables in Cassandra using Spark SQL; see the documentation for more details.
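For completeness, here is a rough PySpark equivalent of the batch write above. It is a sketch under the question's own assumptions (connector passed via --packages, spark.cassandra.connection.host pointing at the cassandra container), and the keyspace and table must already exist:
from pyspark.sql import SparkSession

# Assumes spark-submit was launched with the Cassandra connector, e.g.:
#   --packages com.datastax.spark:spark-cassandra-connector_2.12:3.x.x
#   --conf spark.cassandra.connection.host=cassandra
#   --conf spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions
spark = SparkSession.builder.appName("cassandra-write-demo").getOrCreate()

# Same shape as the example schema above: id, c, t.
df = spark.createDataFrame([(1, 10, "a"), (2, 20, "b")], ["id", "c", "t"])

# The DataFrame write path mirrors the Scala cassandraFormat call.
(df.write
    .format("org.apache.spark.sql.cassandra")
    .options(table="test_new", keyspace="test")
    .mode("append")
    .save())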

pyhdfs.HdfsIOException: Failed to find datanode, suggest to check cluster health. excludeDatanodes=null

I am trying to run Hadoop using the Docker setup provided here:
https://github.com/big-data-europe/docker-hadoop
I use the following command:
docker-compose up -d
to bring up the services, and I am able to access and browse the file system at localhost:9870. The problem arises whenever I try to use pyhdfs to put a file on HDFS. Here is my sample code:
from pyhdfs import HdfsClient

hdfs_client = HdfsClient(hosts = 'localhost:9870')
# Determine the output_hdfs_path
output_hdfs_path = 'path/to/test/dir'
# Does the output path exist? If not then create it
if not hdfs_client.exists(output_hdfs_path):
    hdfs_client.mkdirs(output_hdfs_path)
hdfs_client.create(output_hdfs_path + 'data.json', data = 'This is test.', overwrite = True)
If the test directory does not exist on HDFS, the code successfully creates it, but when it gets to the .create part it throws the following exception:
pyhdfs.HdfsIOException: Failed to find datanode, suggest to check cluster health. excludeDatanodes=null
What surprises me is that my code is able to create the empty directory but fails to put the file on HDFS. My docker-compose.yml file is exactly the same as the one provided in the GitHub repo. The only change I've made is in the hadoop.env file, where I changed:
CORE_CONF_fs_defaultFS=hdfs://namenode:9000
to
CORE_CONF_fs_defaultFS=hdfs://localhost:9000
I have seen this other post on Stack Overflow and tried the following command:
hdfs dfs -mkdir hdfs:///demofolder
which works fine in my case. Any help is much appreciated.
I would keep the default CORE_CONF_fs_defaultFS=hdfs://namenode:9000 setting.
It works fine for me after adding a forward slash to the paths:
import pyhdfs

fs = pyhdfs.HdfsClient(hosts="namenode")
output_hdfs_path = '/path/to/test/dir'
if not fs.exists(output_hdfs_path):
    fs.mkdirs(output_hdfs_path)
fs.create(output_hdfs_path + '/data.json', data = 'This is test.')
# check that it's present
list(fs.walk(output_hdfs_path))
# output: [('/path/to/test/dir', [], ['data.json'])]
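If you need to upload an existing local file rather than raw bytes, pyhdfs also provides copy_from_local; a small sketch under the same assumptions (namenode resolvable from where the client runs, file names hypothetical):
import pyhdfs

fs = pyhdfs.HdfsClient(hosts="namenode")
# copy_from_local streams a local file to HDFS via WebHDFS.
fs.copy_from_local('./data.json', '/path/to/test/dir/data.json', overwrite=True)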

symfony/yaml backed symfony/config not parsing environment variables

I have recreated a simple example in this tiny GitHub repo. I am attempting to use symfony/dependency-injection to configure monolog/monolog to write logs to php://stderr. I am using a YAML file called services.yml to configure dependency injection.
This all works fine if my yml file looks like this:
parameters:
  log.file: 'php://stderr'
  log.level: 'DEBUG'

services:
  stream_handler:
    class: \Monolog\Handler\StreamHandler
    arguments:
      - '%log.file%'
      - '%log.level%'
  log:
    class: \Monolog\Logger
    arguments: [ 'default', ['@stream_handler'] ]
However, my goal is to read the path of the log file and the log level from environment variables, $APP_LOG and $LOG_LEVEL respectively. According to the Symfony documentation on external parameters, the correct way to do that in the services.yml file is like this:
parameters:
  log.file: '%env(APP_LOG)%'
  log.level: '%env(LOG_LEVEL)%'
In my sample app I verified PHP can read these environment variables with the following:
echo "Hello World!\n\n";
echo 'APP_LOG=' . (getenv('APP_LOG') ?? '__NULL__') . "\n";
echo 'LOG_LEVEL=' . (getenv('LOG_LEVEL') ?? '__NULL__') . "\n";
which writes the following to the browser when I use my original services.yml with hard-coded values:
Hello World!
APP_LOG=php://stderr
LOG_LEVEL=debug
However, if I use the %env(VAR_NAME)% syntax in services.yml, I get the following error:
Fatal error: Uncaught UnexpectedValueException: The stream or file "env_PATH_a61e1e48db268605210ee2286597d6fb" could not be opened: failed to open stream: Permission denied in /var/www/vendor/monolog/monolog/src/Monolog/Handler/StreamHandler.php:107 Stack trace: #0 /var/www/vendor/monolog/monolog/src/Monolog/Handler/AbstractProcessingHandler.php(37): Monolog\Handler\StreamHandler->write(Array) #1 /var/www/vendor/monolog/monolog/src/Monolog/Logger.php(337): Monolog\Handler\AbstractProcessingHandler->handle(Array) #2 /var/www/vendor/monolog/monolog/src/Monolog/Logger.php(532): Monolog\Logger->addRecord(100, 'Initialized dep...', Array) #3 /var/www/html/index.php(17): Monolog\Logger->debug('Initialized dep...') #4 {main} thrown in /var/www/vendor/monolog/monolog/src/Monolog/Handler/StreamHandler.php on line 107
What am I doing wrong?
OK, you need a few things here. First of all, you need version 3.3 of Symfony, which was still in beta when I encountered this (3.2 was the released version). Second, you need to "compile" the environment variables.
Edit your composer.json with the following values and run composer update. You might need to update other dependencies. You can substitute ^3.3 with dev-master.
"symfony/config": "^3.3",
"symfony/console": "^3.3",
"symfony/dependency-injection": "^3.3",
"symfony/yaml": "^3.3",
You will likely have to do this for symfony/__WHATEVER__ if you have other symfony components.
Now, in your code, after you load your YAML configuration into your dependency container, you compile it.
So after these lines (perhaps in bin/console):
$container = new ContainerBuilder();
$loader = new YamlFileLoader($container, new FileLocator(__DIR__ . DIRECTORY_SEPARATOR . '..'));
$loader->load('services.yml');
Do this:
$container->compile(true);
Your IDE's IntelliSense might tell you compile() takes no parameters. That's OK: compile() grabs its argument indirectly via func_get_arg().
public function compile(/*$resolveEnvPlaceholders = false*/)
{
    if (1 <= func_num_args()) {
        $resolveEnvPlaceholders = func_get_arg(0);
    } else {
        . . .
    }
References
Github issue where this was discussed
Pull request to add compile(true)
Calling this method after loading your services.yml file should help:
$containerBuilder->compile(true);
Your file also gets validated by the configuration checks this method performs. The parameter is $resolveEnvPlaceholders, which makes environment variables accessible to the YAML services configuration.
