mongoimport very slow for a large JSON file

I have a large JSON file (350 GB) and I am trying to import it into a MongoDB collection using mongoimport. The import is very slow and I am not sure how many days it will take.
Can anyone please suggest the best way to load this JSON file into a MongoDB collection? I have enough disk space to load it.

I came across a similar situation. I used mongorestore instead of mongoimport, but the idea is the same. iotop showed that the restore process had an I/O rate of about 1 MB/s, which is pretty low. As another post here suggests, the low performance is probably due to the JSON-to-BSON serialization. So I ended up splitting the exported JSON file into chunks with the following commands:
mongodump --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> --query '{DayOfWeek: "Monday"}' --out "SomeDir-Monday" &
mongodump --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> --query '{DayOfWeek: "Tuesday"}' --out "SomeDir-Tuesday" &
...
This left me with 7 chunks.
Finally, I imported these chunks in parallel using mongorestore with the following commands:
mongorestore --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> PATH_TO_MONDAY.json &
mongorestore --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> PATH_TO_TUESDAY.json &
...

If you are using MongoDB > 3.0.0 you can use the --numInsertionWorkers option on the mongoimport command.
Set this to the number of CPUs you have in order to speed up the import.
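A minimal sketch on an 8-CPU machine (host, database, collection, and file names are placeholders):
# one insertion worker per CPU core
mongoimport --host <host> --port <port> --db <db> --collection <coll> --numInsertionWorkers 8 --file large.json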

Use a GUI client such as Studio 3T (formerly MongoChef), where importing JSON, dumps, etc. is simple and faster.

Related

Need to verify/check an IPv6 address using ping in a Lua script

I am not very familiar with Lua scripting, but I need some help.
Basically, the current Lua script will receive a structure. That structure has an address parameter holding two indexed values: an IPv6 and an IPv4 address.
The Lua script needs to implement the following case:
ping the IPv6 address and store the result in a local variable;
if the local variable indicates success, connect by calling uv.tcp_connect with the passed IPv6 address;
otherwise, check the same for the IPv4 address and try to connect with uv.tcp_connect.
I am using an online Lua editor, and there it returns nil.
local results = load('ping -q -c1 -6 localhost 2>&1 >/dev/null && printf "IPv6: true" || (ping -q -c1 www.google.com 2>&1 >/dev/null && printf "IPv4 true" || printf "false")')
print(results)
The output is: nil
And if I use the following in the online Lua editor:
local handler = io.popen("ping -c 3 -i 0.5 www.google.com") -- wrong here
local response = handler:read("*a")
print(response)
the output is an error:
lua: main.lua:3: expected near '"ping -c 3 -i 0.5 www.google.com"'
Kindly suggest what I am missing here.
To store the output of system commands, I suggest io.popen(). (Note that load() compiles a string as Lua source code, not as a shell command, which is why your first attempt returns nil.)
An example of a conditional ping that tries IPv6 first and, if that fails, IPv4...
> code = {}
> code.cmd = [[
-- cmd(shell): run a shell command and return its full output
return function(shell)
  return io.popen(shell, 'r'):read('a')
end
]]
> results = {}
> results.ping = load(code.cmd)()('ping -q -c1 -6 localhost 2>&1 >/dev/null && printf "IPv6: true" || (ping -q -c1 localhost 2>&1 >/dev/null && printf "IPv4: true" || printf "false")')
> print(results.ping)
IPv6: true
...typed in a Lua console.
EDIT
Online Lua environments don't support the above code!
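As an aside, here is a standalone shell sketch of the same fallback logic (localhost stands in for the real addresses). Note that the redirection 2>&1 >/dev/null sends stderr to the terminal and only stdout to /dev/null; to silence both, write >/dev/null 2>&1 instead:
# try IPv6 first, then fall back to IPv4
if ping -q -c1 -6 localhost >/dev/null 2>&1; then
  echo "IPv6: true"
elif ping -q -c1 localhost >/dev/null 2>&1; then
  echo "IPv4: true"
else
  echo "false"
fi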

Error following the tutorial on a REST persistent data store on Hyperledger Composer

Screenshot of the error: https://i.imgur.com/nGh5orv.png
I am setting this up in an AWS EC2 environment. Everything works fine until I try multi-user mode.
I am facing this issue where I set up the MongoDB persistent data store following the tutorials.
Here is my setup in envvars.txt:
COMPOSER_CARD=admin@property-network
COMPOSER_NAMESPACES=never
COMPOSER_AUTHENTICATION=true
COMPOSER_MULTIUSER=true
COMPOSER_PROVIDERS='{
  "github": {
    "provider": "github",
    "module": "passport-github",
    "clientID": "xxxx",
    "clientSecret": "xxxx",
    "authPath": "/auth/github",
    "callbackURL": "/auth/github/callback",
    "successRedirect": "/",
    "failureRedirect": "/"
  }
}'
COMPOSER_DATASOURCES='{
  "db": {
    "name": "db",
    "connector": "mongodb",
    "host": "mongo"
  }
}'
And I had changed the connection profiles of both hlfv1 and admin@xxx-network to 0.0.0.0, as seen here:
https://github.com/hyperledger/composer/issues/1784
I tried the solution there and it doesn't work.
Thank you!
Currently there's an issue with the admin re-enrolling (strictly an issue with the REST server) even though the admin card has a certificate; the server ignores it (fixed in 0.18.x).
Further, there's a hostname resolution issue which you'll need to address, because Docker needs to be able to resolve the container names from within the persistent REST server container. We will need to change the hostnames from their current localhost values to Docker-resolvable hostnames. The example below shows a newly issued 'restadmin' card, created for the purpose of starting the REST server, using the standard 'Developer setup' Composer environment.
Create a REST administrator identity restadmin and an associated business network card (used to launch the REST server later):
composer participant add -c admin@property-network -d '{"$class":"org.hyperledger.composer.system.NetworkAdmin", "participantId":"restadmin"}'
Issue a 'restadmin' identity, mapped to the above participant:
composer identity issue -c admin@property-network -f restadmin.card -u restadmin -a "resource:org.hyperledger.composer.system.NetworkAdmin#restadmin"
Import and test the card:
composer card import -f restadmin.card
composer network ping -c restadmin@property-network
Run this one-liner to carry out the hostname changes easily. The four substitutions are keyed on the standard development Fabric ports (7050 for the orderer, 7051 and 7053 for the peer, 7054 for the CA):
sed -e 's/localhost:7050/orderer.example.com:7050/' -e 's/localhost:7051/peer0.org1.example.com:7051/' -e 's/localhost:7053/peer0.org1.example.com:7053/' -e 's/localhost:7054/ca.org1.example.com:7054/' < $HOME/.composer/cards/restadmin@property-network/connection.json > /tmp/connection.json && cp -p /tmp/connection.json $HOME/.composer/cards/restadmin@property-network
Try running the REST server with the card -c restadmin@property-network. If you're running this tutorial https://hyperledger.github.io/composer/latest/integrating/deploying-the-rest-server then you will need to put this card name at the top of your envvars.txt, and then ensure you run source envvars.txt to set it in your current shell environment, as sketched below.
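A minimal sketch of that change (the rest of envvars.txt stays as shown in the question):
# envvars.txt: launch the REST server as the REST administrator
COMPOSER_CARD=restadmin@property-network
# re-read the file into the current shell
source envvars.txt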
If you wish to issue further identities, say kcoe below, from the REST client (given you're currently 'restadmin'), you simply do the following (the first two steps can also be done in Playground, FYI):
composer participant add -c admin@trade-network -d '{"$class":"org.acme.trading.Trader","tradeId":"trader2", "firstName":"Ken","lastName":"Coe"}'
composer identity issue -c admin@trade-network -f kcoe.card -u kcoe -a "resource:org.acme.trading.Trader#trader2"
composer card import -f kcoe.card # imported to the card store
Next, a one-liner to get Docker hostname resolution right, from inside the persistent dockerized REST server:
sed -e 's/localhost:7050/orderer.example.com:7050/' -e 's/localhost:7051/peer0.org1.example.com:7051/' -e 's/localhost:7053/peer0.org1.example.com:7053/' -e 's/localhost:7054/ca.org1.example.com:7054/' < $HOME/.composer/cards/kcoe@trade-network/connection.json > /tmp/connection.json && cp -p /tmp/connection.json $HOME/.composer/cards/kcoe@trade-network
Start your REST server as per the Deploy REST server doc:
docker run \
-d \
-e COMPOSER_CARD=${COMPOSER_CARD} \
-e COMPOSER_NAMESPACES=${COMPOSER_NAMESPACES} \
-e COMPOSER_AUTHENTICATION=${COMPOSER_AUTHENTICATION} \
-e COMPOSER_MULTIUSER=${COMPOSER_MULTIUSER} \
-e COMPOSER_PROVIDERS="${COMPOSER_PROVIDERS}" \
-e COMPOSER_DATASOURCES="${COMPOSER_DATASOURCES}" \
-v ~/.composer:/home/composer/.composer \
--name rest \
--network composer_default \
-p 3000:3000 \
myorg/my-composer-rest-server
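Note that the "mongo" host named in COMPOSER_DATASOURCES only resolves if a MongoDB container with that name is running on the same Docker network as the REST server; a minimal sketch, assuming the container and network names from the tutorial:
docker run -d --name mongo --network composer_default -p 27017:27017 mongo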
From the System REST API at http://localhost:3000/explorer, go to the POST /wallet/import operation and import the card file kcoe.card with (in this case) the card name set to kcoe@trade-network, then click on 'Try it Out'. It should return a successful (204) response.
The imported card is set as the default ID in the Wallet via this System REST API endpoint.
(If you need to set any further imported card as the default card name in your REST client Wallet, go to the POST /wallet/{name}/setDefault method, choose the card name, and click on 'Try it Out'. That card is now the default.)
Test it out: try getting a list of Traders (trade-network example).
Return to the Trader methods in the REST API client, expand the GET /Trader endpoint, and click 'Try it Out'. It should confirm that we are now using a card in the business network, and we should be able to interact with the REST server and get a list of Traders (that were added to your business network).

Why is `mongorestore` not restoring one of my collections

I have a Rails app which requires a mongo dump of a test database, which I restore using something like
mongorestore -d test_database dump/test_databse
When I run this command from the terminal, everything works fine:
$ mongo test_database
MongoDB shell version: 2.4.12
connecting to: test_database
> db.users_user.count()
50
> db.users_posts.count()
100
but when I run the same command using Ruby
system "mongorestore -d test_database dump/test_databse"
one of the collections, users_posts, is not inserted:
$ mongo test_database
MongoDB shell version: 2.4.12
connecting to: test_database
> db.users_user.count()
50
> db.users_posts.count()
0
What's going on here? Is it a permissions issue? I am stumped.
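A diagnostic sketch (not from the original thread): run the restore with maximum verbosity from both the terminal and the Ruby system call, and compare the logs. The verbosity flag is a standard mongorestore option; the paths mirror the question's command:
# capture a verbose log of the restore for comparison
mongorestore -vvv -d test_database dump/test_databse 2>&1 | tee restore.log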

mongoexport with sub-documents and field names containing spaces

I need to export data from a MongoDB collection where I have sub-documents and field names with spaces. I have tried several permutations, such as:
mongoexport --db main --collection prices --fields _id,subdoc1.sum,subdoc1["field name 1"],subdoc1["field name 2"] --csv > out.dat
or
mongoexport --db main --collection prices --fields _id,subdoc1.sum,subdoc1."field name 1",subdoc1."field name 2" --csv > out.dat
There is no documentation on how to do this. Is this possible?
You can escape spaces without quotation marks on the command line on Unix-like operating systems using a backslash:
mongoexport --db main --collection prices --fields _id,subdoc1.sum,subdoc1.field\ name\ 1,subdoc1.field\ name\ 2 --csv >out.dat
And on Windows (I think) you use the caret:
mongoexport --db main --collection prices --fields _id,subdoc1.sum,subdoc1.field^ name^ 1,subdoc1.field^ name^ 2 --csv >out.dat
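Alternatively, quoting the entire --fields value should also stop the shell from splitting on the spaces (a sketch, on the assumption that mongoexport itself splits the field list only on commas):
mongoexport --db main --collection prices --fields '_id,subdoc1.sum,subdoc1.field name 1,subdoc1.field name 2' --csv > out.dat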

Transfer database using Rails

The production server that hosts my Rails app is being wiped and rebuilt, and as a result I will need to transfer my Rails app onto the new system. The source isn't a problem; I can just pull it down from git again. But the database is another matter. I could install phpMyAdmin or something similar to access the database, but I was wondering if there is something in Rails (possibly a rake task) that would let me dump the current database and then import it onto the new server.
You don't need Rails or phpMyAdmin for this. Assuming you're using MySQL, simply ssh to your server and run:
mysqldump -u root -p databasename > database.sql
Then on the other system:
mysql -u root -p newdatabasename < database.sql
Easy, huh?
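To move the dump between the two machines, scp works fine (hostname and path are placeholders):
# copy the dump to the new server before importing
scp database.sql user@new-server:/tmp/database.sql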
If it is a recurring task, you could also put that into a rake task under lib/tasks:
namespace :db do
  desc "Dump database"
  task :dump => :environment do
    exec "mysqldump -u root -p databasename > database.sql"
  end

  desc "Restore database"
  task :restore => :environment do
    exec "mysql -u root -p newdatabasename < database.sql"
  end
end
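Invoke them in the usual way. Note that exec replaces the rake process, so each invocation runs exactly one command:
# run from the application root
rake db:dump
rake db:restore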
