Influx v1.8 CLI query gets Killed - influxdb

I'm looking at options to export data from InfluxDB to MySQL. I'm exploring the option to export the data to flat files for the import (so we don't have to hit our production InfluxDB instance).
When I execute the command influx -database 'mydb' -execute 'SELECT * FROM "1D"' -format csv > my-influx-all.csv it runs for about a minute and then outputs Killed.
My test DB is about 2.1GB in size atm so not large. The production DB is 51GB. Is there a flag I can pass so Influx CLI doesn't die? Or is there an alternate way to export data into a flat file?

The query probably hit an out-of-memory (OOM) kill; further details should be in the logs.
If you want to export the data in a readable format, you could try influx_inspect:
sudo influx_inspect export -database yourDatabase -out "influx_backup.db"
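If the full export is still too heavy, influx_inspect export also accepts -start and -end, so you can dump the data in time-bounded chunks; note that its output is InfluxDB line protocol rather than CSV, so it would still need conversion before a MySQL import. A minimal sketch, assuming default data/wal paths and an arbitrary one-month window:
# Export one month at a time to keep each run (and output file) small
sudo influx_inspect export \
  -database mydb \
  -datadir /var/lib/influxdb/data \
  -waldir /var/lib/influxdb/wal \
  -start 2022-01-01T00:00:00Z \
  -end 2022-02-01T00:00:00Z \
  -out influx_backup_2022-01.txt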

Related

influx: merging large database with select into fails. Alternatives?

influxdb 1.8.10
I have 2 databases which were originally one, but due to hardware limitations I had to move to a new system and just started a new database there.
Now I've upgraded to a new system and want to merge these two again.
I've restored the backups of both into a separate Docker instance as two DBs: energypre2021 and energycombined (which has the data beyond 2021).
If I use
influx -database=energycombined -execute 'SELECT * INTO energypre2021..:MEASUREMENT FROM /.*/ GROUP BY *'
in the Docker container, I just get kicked out of the Docker instance without any message and no merged DB.
The log just says this:
ts=2022-09-08T22:30:10.118858Z lvl=info msg="Open store (end)" log_id=0cooRaaG000 service=store trace_id=0cooRa~0000 op_name=tsdb_open op_event=end op_elapsed=4042.491ms
Any tips on how to effectively merge both DBs? I am willing to merge one table at a time if needed.
InfluxDB 1.8.10
64 GB RAM + 1 TB SSD should be enough power for this stuff, methinks.
I actually did a portable backup of both instances:
influxd backup -portable -database energy /mnt/backup/pre2020
influxd backup -portable -database energy /mnt/backup/newestdata
Then I restored the instance with the least data (newestdata) into one empty InfluxDB instance:
influxd restore -portable -database energy /mnt/backup/newestdata
and exported a copy of it:
influx_inspect export -datadir /var/lib/influxdb/data/ -waldir /var/lib/influxdb/wal/ -out /mnt/backup/newestdata.gz
Then I dropped that instance and restored the backup with the most data:
influxd restore -portable -database energy /mnt/backup/pre2020
and then imported the export:
influx -import -compressed=true -path=/mnt/backup/newestdata.gz
So in the end I got both instances' backups into one InfluxDB instance, one after the other.
You could probably export the database from the two instances separately and then import them one by one.
Step 1: export the database from the two instances
influx_inspect export -compress -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal -out /root/1.gz
influx_inspect export -compress -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal -out /root/2.gz
Step 2: and then you can import these two files one by one:
influx -import -compressed=true -path=/root/1.gz
influx -import -compressed=true -path=/root/2.gz
See the documentation on influx_inspect export and on influx -import for more details.
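Once both imports have finished, it's worth sanity-checking the merged database before dropping anything. A minimal check, assuming the merged database is called energy (as in the backup commands above) and using a placeholder measurement name (power):
# List all measurements that made it into the merged database
influx -database energy -execute 'SHOW MEASUREMENTS'
# Count points before and after the split date to confirm both sources are present
influx -database energy -execute "SELECT COUNT(*) FROM power WHERE time < '2021-01-01T00:00:00Z'"
influx -database energy -execute "SELECT COUNT(*) FROM power WHERE time >= '2021-01-01T00:00:00Z'"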

InfluxDB: Move only one database of many from one server instance to another

I have an InfluxDB server instance containing several databases, like sensors, network, telegraf and so on.
Together these databases consume several dozens of GB, and I want to offload only the sensors database to another more powerful machine.
The simplest case would be that I create a new InfluxDB server instance on that other machine, and just move (rsync) the influxdb/data/sensors folder to the other machine, and delete it from the original one.
While I haven't tested it, I assume that this does not work that easily; there is a data/_internal directory, then there's the meta/meta.db file as well as the wal/* directory, which will probably require everything to be left "as-is" in order for the server instance to boot without error.
Since I'm talking about dozens of GBs per database, I'd ideally just like to mount a new SSD, copy the files/directories, then mount that SSD on the other machine and use it directly as the new data source without further copying.
Basically I wish I could do this as easily as moving rrdtool's rrd files from one machine to another.
Is this possible? If not, what are my options?
Edit 2022: This is a solution which works for InfluxDB 1.x, the commands shown here may not be directly applicable to 2.x. Here is a link to the 2.x backup/restore documentation: https://docs.influxdata.com/influxdb/v2.2/backup-restore/
The InfluxDB 2.2 influx backup command is not compatible with versions of InfluxDB prior to 2.0.0.
I resorted to using influxd backup / influxd restore as Yuri Lachin pointed out.
While it does have the drawback of first needing to save the data on disk and then read it in from there, it seems to be the most flexible approach.
Rsyncing 50GB takes a certain amount of time, and the databases would need to be offline during that whole period; this is not required with backup / restore, so no data is lost. It also allows migrating data which used to be on one single InfluxDB instance to different InfluxDB servers, without having to think about the issue with the metadata database.
The backup / restore can be done in steps: first back up all the data of the database and restore it into the new server instance, then export the newest data again (the data which didn't make it into the first backup) and restore it into the new database as well.
Step 1:
On the machine containing the new, empty InfluxDB server instance, backup the data from the remote, old InfluxDB instance:
influxd backup \
-portable \
-host 192.168.11.10:8088 \
-database sensors \
/var/lib/influxdb/export-sensors-01
Afterwards import this data into the new server instance:
influxd restore \
-portable \
/var/lib/influxdb/export-sensors-01
Step 2:
Now take the time to adjust the IP-address or domain name to which the InfluxDB clients are currently connected, and make them point to the new InfluxDB server; restart the clients if necessary.
Step 3:
Between the time the backup finished and the time you restarted the clients with the new IP address, new data was still being written to the old database, so we will need to sync that data over.
Again, on the new server, pull a backup from the old one, but specify the time range of the missing data and a different target directory:
influxd backup \
-portable \
-host 192.168.11.10:8088 \
-database sensors \
-start 2019-06-22T19:30:00Z \
-end 2019-06-24T00:00:00Z \
/var/lib/influxdb/export-sensors-02
Apparently it is important to specify -end as well: one test I did without an -end argument started to back up the entire database again. I just Ctrl-D'd out of it, deleted /var/lib/influxdb/export-sensors-02 and started it again with the -end argument set.
The -start argument can overlap a couple of minutes with data which already got restored, since while restoring this second backup those duplicate entries will either be ignored or overwrite the already existing identical values.
For example, if you start the main backup at 4pm and it finishes at 6pm, the second backup can use a -start argument of 5:55pm and an -end argument a couple of days in the future. That is no problem, because as soon as you switch the clients' IP addresses, no more data will be written to the old database. Probably the -since argument would have been better, but I was experimenting a bit with time ranges, so I left it at using -start + -end.
In order to insert the missing data which you just backed up into /var/lib/influxdb/export-sensors-02, you need to do a bit more work, since you can't restore into an already existing database. If you try, nothing is damaged; only a warning message is shown and the restore is aborted.
So we will need to restore the data into a new, temporary database:
influxd restore \
-portable \
-database sensors \
-newdb sensors_tmp_backup \
/var/lib/influxdb/export-sensors-02
Then copy the data into the sensors database:
influx \
-database=sensors_tmp_backup \
-execute 'SELECT * INTO sensors..:MEASUREMENT FROM /.*/ GROUP BY *'
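If that SELECT INTO runs out of memory on a large database (the problem described in the related question above), the copy can also be done one measurement at a time; the measurement name temperature below is just a placeholder:
# Copy a single measurement into the sensors database, preserving tags via GROUP BY *
influx \
-database=sensors_tmp_backup \
-execute 'SELECT * INTO sensors..temperature FROM temperature GROUP BY *'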
And delete the temporary database:
influx \
-database=sensors_tmp_backup \
-execute 'DROP DATABASE sensors_tmp_backup'
If all is OK, delete the backup directories:
rm -rf /var/lib/influxdb/export-sensors-01
rm -rf /var/lib/influxdb/export-sensors-02
Before changing the addresses in Step 2, you can test Step 3 a couple of times by letting the new DB catch up with the old, current one via a couple of backups. It's a good way to get acquainted with the procedure in Step 3.
If you're running InfluxDB in Docker, like I am doing, you can execute all the commands from the host. Step 3 would then look like this:
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influxd backup -portable -host 192.168.11.10:8088 -database sensors -start 2019-06-22T19:40:00Z -end 2019-06-24T00:00:00Z /var/lib/influxdb/export-sensors-02
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influxd restore -portable -database sensors -newdb sensors_tmp_back /var/lib/influxdb/export-sensors-02
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influx -database=sensors_tmp_back -execute 'SELECT * INTO sensors..:MEASUREMENT FROM /.*/ GROUP BY *'
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influx -database=sensors_tmp_back -execute 'DROP DATABASE sensors_tmp_back'
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 rm -rf /var/lib/influxdb/export-sensors-01
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 rm -rf /var/lib/influxdb/export-sensors-02
If you are having problems accessing the remote InfluxDB server, keep in mind that the RPC port 8088 is usually bound to localhost for security reasons, so you may need to bind it to 0.0.0.0 first, probably by setting the environment variable INFLUXDB_BIND_ADDRESS on the remote instance to 0.0.0.0:8088, as specified in the documentation, and then restarting the server.
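For reference, a minimal sketch of that change, assuming a stock 1.x install with the default config location (adjust for your setup): either set the top-level bind-address in the config file and restart influxd, or pass the environment variable if the old server runs in Docker.
# Option 1: /etc/influxdb/influxdb.conf on the old server (top-level setting), then restart influxd
bind-address = "0.0.0.0:8088"
# Option 2: environment variable, e.g. for a Dockerized instance
docker run -d -p 8086:8086 -p 8088:8088 -e INFLUXDB_BIND_ADDRESS=0.0.0.0:8088 influxdb:1.8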
I'm not sure it is safe to rsync the influxdb/data/sensors directory files from a running InfluxDB instance. At the very least you should first copy the files with rsync while influxd is running, then stop the influxd service and repeat the rsync to fetch the recently updated files.
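Something like the following two-pass copy would match that idea (hostnames and paths are placeholders):
# First pass while influxd is still running, to move the bulk of the data
rsync -av /var/lib/influxdb/data/sensors/ newhost:/var/lib/influxdb/data/sensors/
# Then stop the service and do a second, much faster pass for files changed in the meantime
sudo systemctl stop influxdb
rsync -av /var/lib/influxdb/data/sensors/ newhost:/var/lib/influxdb/data/sensors/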
Without copying influxdb/meta/meta.db to the new server, your new instance won't know about the existing old databases and measurements.
AFAIK, the procedure of manual file copying is not officially documented or recommended by InfluxData.
Probably using official influxd backup / influxd restore commands is a safer approach. They were buggy 1-2 years ago when I tried them, but are likely to work now. You can run backup on a new server from remote old instance and restore backup locally.
You may try, as you mentioned in your question, copying the influxdb/data/sensors directory to the new machine.
The _internal database maintains runtime statistics, so you can ignore it if you are not looking into that database.
I am not sure where it keeps its metadata, so be cautious.
The wal/ directory is nothing but the write-ahead log, which exists to avoid data loss. I assume you have some downtime for this activity. If you can see the most recent data within the sensors DB before you do the copying, there is no chance of data loss from the WAL.
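A quick way to check that the most recent data is visible before copying could be the query below, which should return the newest point of each measurement in the sensors DB:
# Compare these timestamps with what the clients last wrote
influx -database sensors -execute 'SELECT * FROM /.*/ ORDER BY time DESC LIMIT 1'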

Where to find dumped data (using dump command in Neo4j Shell) in Neo4j

Wondering where to find the dumped file from the Neo4j dump command. I was running Neo4j v2.1.3 on the Windows operating system. Please help, thanks.
I believe the dump command just outputs to the console, so you need to redirect the output. Something like:
Neo4jShell.bat -c "dump match (n:People)-[:Shared]-(m) return n,m;" > social.connection.db\test2.cql
Edited with the Windows version of the command
For UNIX systems a similar command works:
neo4j-shell -c "dump match (n:People)-[:Shared]-(m) return n,m;" > social.connection.db/test2.cql

How to automatically export Instruments data to CSV

I'm looking at a way to automate the gathering of iOS memory usage. So far, I've been using the Instruments cli to do this:
instruments -w <ID> -t "/Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/Resources/templates/Activity Monitor.tracetemplate" -l 30000
My problem now is exporting the data to be parsed. I noticed in the Instruments GUI there is an option to export to CSV, however there doesn't appear to be anything like this for the CLI.
I unzipped the .trace package that the Instruments CLI outputs and found a lot of binary data, which isn't too useful.
Is it possible to export this data or convert it to a parsable format?
Thanks

How to change the memory config for the Neo4j-shell command tool of Cypher batch import

I am trying to use neo4j-shell command tool to do Cypher batch import. I followed the instructions described in Import data into your neo4j database from the neo4j-shell command. Here was the command that I ran:
import-cypher -d "," -i c://temp//neo//import.csv -o c://temp//neo//out.csv start n=node:employee_idx(EmpID={emp_id}), m=node:permit_idx(PmtID={pmtid}) create n<-[:Assign{AssID:{assid}}]-m
If there were only 100000 records in the import.csv file, it ran perfectly. But if there were 200000 records in the import.csv file, I got the error: Error occurred in server thread; nested exception is: java.lang.OutOfMemoryError: Java heap space.
How to change the default memory config of this tool?
You need to set the environment variable JAVA_OPTS to appropriate values, e.g. on Linux it can be done using
JAVA_OPTS="-Xmx4G" bin/neo4j-shell
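Since the question uses Windows paths, the equivalent on Windows should be something along these lines (assuming the Neo4jShell.bat launcher also honors JAVA_OPTS):
rem Give the shell a 4 GB heap before starting it
set JAVA_OPTS=-Xmx4G
bin\Neo4jShell.bat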
