How to maintain the character set of default values when uploading a MySQL dump

I have a MySQL 4 database db4 (charset: latin1) which I want to copy to a MySQL 5 database db5 (default charset: utf-8) using the following command:
mysqldump -u dbo4 --password="..." --default-character-set="latin1" db4 | mysql -S /tmp/mysql5.sock -u dbo5 --password="..." --default-character-set="latin1" db5
The values of the entries are copied correctly. But the German umlauts (äöü...) contained in the default values of some fields are shown afterwards as "?".
What is wrong with my copy command?
I simply want to keep everything as it was before (all data in the database stored as "latin1").
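The "?" replacement usually means latin1 bytes were decoded under the wrong character set somewhere along the pipeline. As a byte-level sanity check, the difference between the two encodings of an umlaut can be shown with iconv (GNU iconv and od assumed; this is only an illustration, not the fix itself):

```shell
# 'ä' (U+00E4) is the single byte 0xE4 in latin1:
printf '\xc3\xa4' | iconv -f UTF-8 -t LATIN1 | od -An -tx1 | tr -d ' \n'; echo   # e4
# ...but the two bytes 0xC3 0xA4 in UTF-8:
printf '\xe4' | iconv -f LATIN1 -t UTF-8 | od -An -tx1 | tr -d ' \n'; echo       # c3a4
```

Dumping to a file first (instead of piping) makes it easier to inspect whether the CREATE TABLE statements carry an explicit DEFAULT CHARSET=latin1, and at which step the conversion goes wrong.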


Using R to query a SQLite file: DBI::dbGetQuery() never completes, very slow

I have a SQLite file that contains a table with 9 million or so rows and 30+ columns. Up until a few days ago, the following code worked fine:
path <- file.path(basepath, "Rmark_web", "SQL_Data", "2020Q3_Enterprise_Exposure_Wind.sqlite")
cn <- DBI::dbConnect(RSQLite::SQLite(), path)
df <- DBI::dbGetQuery(cn, "select Longitude, Latitude, City, State, Total_Value, GridID, new_LOB from [2020Q3_Enterprise_Exposure_Wind] where State in ('GA')")
DBI::dbDisconnect(cn)
When I run the code that contains this query on my local machine, it takes some time but it does finish. I am currently trying to run it in a Docker image started with the following settings:
docker run --memory=10g --rm -d -p 8787:8787 -v /efs:/home/rstudio/efs -e ROOT=TRUE -e DISABLE_AUTH=true myrstudio
Is there a way to debug the RSQLite package? Is there another way to perform this query without using this package? The rest of the code runs fine, but it gets held up on this specific step and usually does not finish (especially if it is the 2nd or 3rd time that this piece of code runs in the docker image).
The number of states to include in the query changes from run to run.
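Before digging into RSQLite itself, it may be worth checking whether the State column is indexed; without an index, the IN ('GA') filter scans all 9 million rows on every run. A minimal sketch with the sqlite3 CLI (hypothetical table and column names, tiny demo data):

```shell
rm -f demo.sqlite
sqlite3 demo.sqlite <<'SQL'
CREATE TABLE exposure (State TEXT, Total_Value REAL);
INSERT INTO exposure VALUES ('GA', 1.0), ('FL', 2.0);
CREATE INDEX idx_state ON exposure(State);
SQL
# The query plan should now report a SEARCH ... USING INDEX idx_state
sqlite3 demo.sqlite "EXPLAIN QUERY PLAN SELECT * FROM exposure WHERE State IN ('GA');"
```

On the real file, the same CREATE INDEX statement could be issued once through DBI::dbExecute() before running the query.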
If you have this issue, be sure to remove any columns you are not using from the SQLite file. In the end, I loaded it as a Postgres database online, and that seems to fix the issue I was experiencing. Here is the new query, for anyone's benefit.
library(RPostgres)
library(RPostgreSQL)
library(DBI)
db <- 'airflow_db' #provide the name of your db
host_db <- [omitted for privacy]
db_port <- [omitted for privacy] # or any other port specified by the DBA
db_user <- [omitted for privacy]
db_password <- Sys.getenv("DB_PASS")
con <- dbConnect(RPostgres::Postgres(), dbname = db, host=host_db, port=db_port, user=db_user, password=db_password)
query <- paste('select * from weather_report.ent_expo_data where "State" in (', in_states_clause, ')', sep='')  # in_states_clause is built elsewhere from the run's list of states, e.g. "'GA','FL'"
print(query)
df <- dbGetQuery(con, query)

Grepping list of phpass hashes against a file

I'm trying to grep multiple strings which look like the ones below (there are a few hundred) against a file whose lines have the form data:string.
Example strings (no sensitive data is provided; they have been modified):
$H$9a...DcuCqC/rMVmfiFNm2rqhK5vFW1
$H$9n...AHZAV.sTefg8ap8qI8U4A5fY91
$H$9o...Bi6Z3E04x6ev1ZCz0hItSh2JJ/
$H$9w...CFva1ddp8IRBkgwww3COVLf/K1
I've been researching how to grep a file of patterns against another file, and came across the following commands:
grep -f strings.txt datastring.txt > output.txt
grep -Ff strings.txt datastring.txt > output.txt
But unfortunately, these commands do NOT work successfully: they only print a handful of results to my output file. I think it may be something to do with the symbols contained in strings.txt, but I'm not sure. Any help/advice would be great.
To further mention, I'm using Cygwin on Windows (if this is relevant).
Here's an updated example:
strings.txt contains the following:
$H$9a...DcuCqC/rMVmfiFNm2rqhK5vFW1
$H$9n...AHZAV.sTefg8ap8qI8U4A5fY91
$H$9o...Bi6Z3E04x6ev1ZCz0hItSh2JJ/
$H$9w...CFva1ddp8IRBkgwww3COVLf/K1
datastring.txt contains the following:
$H$9a...DcuCqC/rMVmfiFNm2rqhK5vFW1:53491
$H$9n...AHZAV.sTefg8ap8qI8U4A5fY91:03221
$H$9o...Bi6Z3E04x6ev1ZCz0hItSh2JJ/:20521
$H$9w...CFva1ddp8IRBkgwww3COVLf/K1:30142
So technically, all lines should be included in the OUTPUT file, but only this line is outputted:
$H$9w...CFva1ddp8IRBkgwww3COVLf/K1:30142
I just don't understand.
You have shown the output of cat -A strings.txt elsewhere, which includes ^M, representing a CR (carriage return) character at the end of each line.
This indicates your file has Windows line endings (CR LF) instead of the Unix line endings (LF only) that grep expects.
You can convert files with dos2unix strings.txt and back with unix2dos strings.txt.
Alternatively, if you don't have dos2unix installed in your Cygwin environment, you can also do that with sed.
sed -i 's/\r$//' strings.txt # dos2unix
sed -i 's/$/\r/' strings.txt # unix2dos
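The whole failure mode can be reproduced in a few lines (shortened, made-up hashes standing in for the real ones; GNU sed assumed for the in-place edit):

```shell
# Patterns written with Windows (CRLF) line endings
printf '$H$9a...hash1\r\n$H$9w...hash2\r\n' > strings.txt
printf '$H$9a...hash1:53491\n$H$9w...hash2:30142\n' > datastring.txt

grep -cFf strings.txt datastring.txt || true   # prints 0: the trailing CR is part of each pattern

sed -i 's/\r$//' strings.txt                   # strip the CRs (dos2unix equivalent)
grep -cFf strings.txt datastring.txt           # prints 2: every pattern now matches
```

With the CRs in place, each pattern effectively ends in an invisible carriage return, so it can never match a line that continues with `:digits`.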

Cannot restore data because of dot in database name

InfluxDB-version: 1.6.3
I've created a backup of a database called 'test.mydb' using the legacy backup format:
influxd backup -database <mydatabase> <path-to-backup>
The backup went fine but when I tried to restore:
sudo influxd restore -db "test.mydb" -newdb "test.mydb" -datadir /var/lib/influxdb/data /home/ubuntu/influxdb/test.mydb/
I got the error: backup tarfile name incorrect format.
After searching I think it is because of this code in influxdb/cmd/influxd/restore/restore.go:
// should get us ["db","rp", "00001", "00"]
pathParts := strings.Split(filepath.Base(tarFile), ".")
if len(pathParts) != 4 {
return fmt.Errorf("backup tarfile name incorrect format")
}
It splits the backup file name on dots and requires exactly four parts (three dots). Because my database name itself contains a dot, the backup file names split into five parts.
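The failing check is easy to reproduce in the shell ("autogen" below is an assumed retention-policy name, used only for illustration):

```shell
# Split a tarfile name for a database containing a dot, the way restore.go does
name="test.mydb.autogen.00001.00"
IFS=. read -ra parts <<< "$name"
echo "${#parts[@]}"   # prints 5, but restore.go requires exactly 4 parts
```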
Are there any workarounds?
I did not find an optimal solution to this problem so I manually copied and pasted the data to InfluxDB.

character 0xc286 of encoding "UTF-8" has no equivalent in "WIN1252": Postgres restore crashes after conversion with iconv

I'm working on a piece of software that uses Delphi and Postgres 9.0. The original developer had chosen 'SQL_ASCII' as the database encoding, so we changed the encoding of our database to UTF-8. We then started getting this error on clicking one of the check boxes (the form is populated from the database). The query where the error occurs is
select * from diary where survey in ('2005407');
but the error only comes up for a few of the check boxes, not all of them.
The change itself is straightforward, but we have a large amount of historical data that we will have to restore into the newly created UTF-8 database, so I followed the steps I found on the net and on Stack Overflow:
1. Dump the SQL_ASCII database as SQL_Ascii_backup.backup
2. Use iconv to convert SQL_ASCII to UTF-8:
"C:\Program Files\GnuWin32\bin\iconv.exe" -f ISO8859-1 -t UTF-8 C:\SQL_Ascii_backup.backup>UTF_Backup.backup
3. Create a new database with encoding UTF-8 and restore the backup UTF_Backup.backup
But when I try to restore it, I get this error. Then I tried dumping the original SQL_ASCII database as a plain SQL_Ascii_.sql file, again using iconv to change the encoding, and then restoring:
"C:\Program Files\PostgreSQL\9.0\bin\psql.exe" -h localhost -p 5434 -d myDB -U myDB_admin -f C:\converted_utf8.sql
This restores properly, but I'm still getting the error:
character 0xc286 of encoding "UTF-8" has no equivalent in "WIN1252"
C2 86 is the UTF-8 encoding of the character U+0086, an obscure C1 control character. This character exists in ISO-8859-1, but not in Windows' default code page 1252, which has printable characters in the range where ISO-8859-1 has the C1 controls.
Your iconv command to convert to UTF-8 uses -f ISO8859-1, but you probably meant -f windows-1252 instead. That maps the byte 86 to the † character.
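The mapping difference can be checked with iconv itself (GNU iconv and od assumed; od just shows the output bytes in hex):

```shell
# Byte 0x86 decoded as windows-1252 is U+2020 DAGGER (†),
# which encodes to three UTF-8 bytes:
printf '\x86' | iconv -f WINDOWS-1252 -t UTF-8 | od -An -tx1 | tr -d ' \n'; echo   # e280a0

# Decoded as ISO-8859-1 it is the C1 control U+0086, which becomes
# the problematic two-byte sequence C2 86:
printf '\x86' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'; echo     # c286
```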
I got rid of the error
character 0xc286 of encoding "UTF-8" has no equivalent in "WIN1252"
by following dan04's answer, but to find where iconv failed to convert the dump, I proceeded as follows:
1. Dump the database (do a plain SQL dump, so you can locate the point of failure)
2. Use iconv to convert from windows-1252 to UTF-8:
"C:\Program Files\GnuWin32\bin\iconv.exe" -f windows-1252 -t UTF-8 C:\MqPlainDump.sql>convertedDump.sql
3. Replace the '[]' character (in my case a square character was causing the trouble)
4. Restore the database
And the application is good to go (in my case).

Informix: How to get the table contents and column names using dbaccess?

Supposing I have:
an Informix database named "my_database"
a table named "my_table" with the columns "col_1", "col_2" and "col_3":
I can extract the contents of the table by creating a my_table.sql script like:
unload to "my_table.txt"
select * from my_table;
and invoking dbaccess from the command line:
dbaccess my_database my_table.sql
This will produce the my_table.txt file with contents like:
value_a1|value_a2|value_a3
value_b1|value_b2|value_b3
Now, what do I have to do if I want to obtain the column names in the my_table.txt? Like:
col_1|col_2|col_3
value_a1|value_a2|value_a3
value_b1|value_b2|value_b3
Why don't you use dbschema?
To get the schema of one table (without the -t parameter it shows the whole database):
dbschema -d [DBName] -t [DBTable] > file.sql
To get the schema of one stored procedure:
dbschema -d [DBName] -f [SPName] > file.sql
None of the standard Informix tools put the column names at the top of the output as you want.
The program SQLCMD (not the Microsoft newcomer - the original one, available from the IIUG Software Archive) has the ability to do that; use the -H option for the column headings (and -T to get the column types).
sqlcmd -U -d my_database -t my_table -HT -o my_table.txt
sqlunload -d my_database -t my_table -HT -o my_table.txt
SQLCMD also can do CSV output if that's what you need (but — bug — it doesn't format the column names or column types lines correctly).
Found an easier solution. Place the headers in one file, say header.txt (it will contain the single line "col_1|col_2|col_3"); then, to combine the header file and your output file, run:
cat header.txt my_table.txt > my_table_wth_head.txt
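Using the example data from the question, the combination step can be sketched end to end:

```shell
# Build the one-line header and the unloaded data, then prepend the header
printf 'col_1|col_2|col_3\n' > header.txt
printf 'value_a1|value_a2|value_a3\nvalue_b1|value_b2|value_b3\n' > my_table.txt
cat header.txt my_table.txt > my_table_wth_head.txt

head -n 1 my_table_wth_head.txt   # col_1|col_2|col_3
```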
