Sybase ASE: show the character encoding used by the database

I am working on a Sybase ASE database and would like to know the character encoding (UTF8, ASCII, or whatever) used by the database.
What's the command to show which character encoding the database uses?

The command you're looking for is actually a system stored procedure:
1> sp_helpsort
2> go
... snip ...
Sort Order Description
------------------------------------------------------------------
Character Set = 190, utf8
Unicode 3.1 UTF-8 Character Set
Class 2 Character Set
Sort Order = 50, bin_utf8
Binary sort order for the ISO 10646-1, UTF-8 multibyte encoding character set (utf8).
... snip ...
From this output we see this particular ASE dataserver has been configured with a default character set of utf8 and default sort order of binary (bin_utf8). This means all data is stored as utf8 and all indexing/sort operations are performed using a binary sort order.
Keep in mind that ASE can perform character set conversions (for reads and writes) based on the client's character set configuration. The success of such conversions depends on the character sets in question (e.g., a client connecting with utf8 may find that many characters cannot be converted for storage in a dataserver defined with a default character set of iso_1).
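To see what a given client session has actually negotiated, and whether conversion is currently in effect, the ASE global variables below can be queried (a minimal sketch; I'm assuming @@client_csname, @@client_csid and @@char_convert are available on your ASE version):
1> select @@client_csname as client_charset,
2>        @@client_csid as client_charset_id,
3>        @@char_convert as conversion_in_effect
4> go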

With a query:
select
cs.name as server_character_set,
cs.description as character_set_description
from
master..syscharsets cs left outer join
master..sysconfigures cfg on
cs.id = cfg.value
where
cfg.config = 131
Example output:
server_character_set character_set_description
utf8 Unicode 3.1 UTF-8 Character Set
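Alternatively, sp_configure reports the same id that the query above joins against (a sketch, assuming the parameter is named 'default character set id' on your ASE version; its Run Value should match cs.id in syscharsets):
1> sp_configure 'default character set id'
2> go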

Related

Importing bitcoin blockchain into neo4j - error: Missing required option '--nodes=[<label>[:<label>]...=]<files>'

I am trying to import the bitcoin blockchain into neo4j.
There are four files
1- tx_header.csv - tx_hash:ID, timestamp
2- outputs_headers.csv - tx_hash:ID, wallet_address:END_ID, amount
3- inputs_headers.csv - wallet_address:START_ID, tx_hash:END_ID, amount
4- add_headers.csv - wallet_address:ID
which contain the header information; the actual content is in other csv files. When I try:
neo4j-admin import
--nodes:Transaction $DATA/transactions/transaction/tx_headers.csv,$DATA/transactions/transaction/transaction_unique.csv
--nodes:Address $DATA/add/address/add_headers.csv,$DATA/add/address/unique_address.csv
--relationships:Output $DATA/outputs/outputs_headers.csv,$DATA/outputs/outputs.csv
--relationships:Input $DATA/inputs/inputs/inputs_headers.csv,$DATA/inputs/inputs/inputs1.csv
--ignore-missing-nodes=true
I get this message (I am using version 4.0.1):
WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
Missing required option '--nodes=[<label>[:<label>]...=]<files>'
[picocli WARN] Could not format 'Maximum memory that neo4j-admin can use for various data structures and caching to improve performance. Values can be plain numbers, like 10000000 or e.g. 20G for 20 gigabyte, or even e.g. 70%.' (Underlying error: Conversion = '.'). Using raw String: '%n' format strings have not been replaced with newlines. Please ensure to escape '%' characters with another '%'.
USAGE
neo4j-admin import [--verbose] [--cache-on-heap[=<true/false>]] [--high-io
[=<true/false>]] [--ignore-empty-strings[=<true/false>]]
[--ignore-extra-columns[=<true/false>]]
[--legacy-style-quoting[=<true/false>]] [--multiline-fields
[=<true/false>]] [--normalize-types[=<true/false>]]
[--skip-bad-entries-logging[=<true/false>]]
[--skip-bad-relationships[=<true/false>]]
[--skip-duplicate-nodes[=<true/false>]] [--trim-strings
[=<true/false>]] [--additional-config=<path>]
[--array-delimiter=<char>] [--bad-tolerance=<num>]
[--database=<database>] [--delimiter=<char>]
[--id-type=<STRING|INTEGER|ACTUAL>]
[--input-encoding=<character-set>] [--max-memory=<size>]
[--processors=<num>] [--quote=<char>]
[--read-buffer-size=<size>] [--report-file=<path>] --nodes=
[<label>[:<label>]...=]<files>... [--nodes=[<label>[:
<label>]...=]<files>...]... [--relationships=[<type>=]
<files>...]...
DESCRIPTION
Import a collection of CSV files.
OPTIONS
--verbose Enable verbose output.
--database=<database> Name of the database to import.
Default: neo4j
--additional-config=<path>
Configuration file to supply additional
configuration in.
--report-file=<path> File in which to store the report of the
csv-import.
Default: import.report
--id-type=<STRING|INTEGER|ACTUAL>
Each node must provide a unique id. This is used
to find the correct nodes when creating
relationships. Possible values are:
STRING: arbitrary strings for identifying nodes,
INTEGER: arbitrary integer values for
identifying nodes,
ACTUAL: (advanced) actual node ids.
For more information on id handling, please see
the Neo4j Manual: https://neo4j.
com/docs/operations-manual/current/tools/import/
Default: STRING
--input-encoding=<character-set>
Character set that input data is encoded in.
Default: UTF-8
--ignore-extra-columns[=<true/false>]
If un-specified columns should be ignored during
the import.
Default: false
--multiline-fields[=<true/false>]
Whether or not fields from input source can span
multiple lines, i.e. contain newline characters.
Default: false
--ignore-empty-strings[=<true/false>]
Whether or not empty string fields, i.e. "" from
input source are ignored, i.e. treated as null.
Default: false
--trim-strings[=<true/false>]
Whether or not strings should be trimmed for
whitespaces.
Default: false
--legacy-style-quoting[=<true/false>]
Whether or not backslash-escaped quote e.g. \" is
interpreted as inner quote.
Default: false
--delimiter=<char> Delimiter character between values in CSV data.
Default: ,
--array-delimiter=<char>
Delimiter character between array elements within
a value in CSV data.
Default: ;
--quote=<char> Character to treat as quotation character for
values in CSV data. Quotes can be escaped as per
RFC 4180 by doubling them, for example "" would
be interpreted as a literal ". You cannot escape
using \.
Default: "
--read-buffer-size=<size>
Size of each buffer for reading input data. It has
to at least be large enough to hold the biggest
single value in the input data.
Default: 4194304
--max-memory=<size> Maximum memory that neo4j-admin can use for
various data structures and caching to improve
performance. Values can be plain numbers, like
10000000 or e.g. 20G for 20 gigabyte, or even e.
g. 70%.
Default: 90%
--high-io[=<true/false>]
Ignore environment-based heuristics, and assume
that the target storage subsystem can support
parallel IO with high throughput.
Default: false
--cache-on-heap[=<true/false>]
(advanced) Whether or not to allow allocating
memory for the cache on heap. If 'false' then
caches will still be allocated off-heap, but the
additional free memory inside the JVM will not
be allocated for the caches. This to be able to
have better control over the heap memory
Default: false
--processors=<num> (advanced) Max number of processors used by the
importer. Defaults to the number of available
processors reported by the JVM. There is a
certain amount of minimum threads needed so for
that reason there is no lower bound for this
value. For optimal performance this value
shouldn't be greater than the number of
available processors.
Default: 8
--bad-tolerance=<num> Number of bad entries before the import is
considered failed. This tolerance threshold is
about relationships referring to missing nodes.
Format errors in input data are still treated as
errors
Default: 1000
--skip-bad-entries-logging[=<true/false>]
Whether or not to skip logging bad entries
detected during import.
Default: false
--skip-bad-relationships[=<true/false>]
Whether or not to skip importing relationships
that refers to missing node ids, i.e. either
start or end node id/group referring to node
that wasn't specified by the node input data.
Skipped nodes will be logged, containing at most
number of entities specified by bad-tolerance,
unless otherwise specified by
skip-bad-entries-logging option.
Default: false
--skip-duplicate-nodes[=<true/false>]
Whether or not to skip importing nodes that have
the same id/group. In the event of multiple
nodes within the same group having the same id,
the first encountered will be imported whereas
consecutive such nodes will be skipped. Skipped
nodes will be logged, containing at most number
of entities specified by bad-tolerance, unless
otherwise specified by skip-bad-entries-logging
option.
Default: false
--normalize-types[=<true/false>]
Whether or not to normalize property types to
Cypher types, e.g. 'int' becomes 'long' and
'float' becomes 'double'
Default: true
--nodes=[<label>[:<label>]...=]<files>...
Node CSV header and data. Multiple files will be
logically seen as one big file from the
perspective of the importer. The first line must
contain the header. Multiple data sources like
these can be specified in one import, where each
data source has its own header.
--relationships=[<type>=]<files>...
Relationship CSV header and data. Multiple files
will be logically seen as one big file from the
perspective of the importer. The first line must
contain the header. Multiple data sources like
these can be specified in one import, where each
data source has its own header.
I have already specified --nodes=......... . How do I resolve this? The command is entered as a single line without breaks.
[UPDATED]
Try putting the entire command on one line, and changing the --nodes and --relationships options to use the equals sign ("=") where necessary.
This may work better for you:
neo4j-admin import --nodes=Transactions="$DATA/transactions/transaction/tx_headers.csv,$DATA/transactions/transaction/transaction_unique.csv" --nodes=Address="$DATA/add/address/add_headers.csv,$DATA/add/address/unique_address.csv" --relationships=Output="$DATA/outputs/outputs_headers.csv,$DATA/outputs/outputs.csv" --relationships=Input="$DATA/inputs/inputs/inputs_headers.csv,$DATA/inputs/inputs/inputs1.csv" --ignore-missing-nodes=true
Or you can use the appropriate line-continuation syntax for your operating system. For example, in Linux or OSX, you can use the backslash (\) before a newline character to break up a command line:
neo4j-admin import \
--nodes=Transactions="$DATA/transactions/transaction/tx_headers.csv,$DATA/transactions/transaction/transaction_unique.csv" \
--nodes=Address="$DATA/add/address/add_headers.csv,$DATA/add/address/unique_address.csv" \
--relationships=Output="$DATA/outputs/outputs_headers.csv,$DATA/outputs/outputs.csv" \
--relationships=Input="$DATA/inputs/inputs/inputs_headers.csv,$DATA/inputs/inputs/inputs1.csv" \
--ignore-missing-nodes=true
In Windows, the caret (^) can be used instead of the backslash.

MemSQL load data infile does not support hexadecimal delimiter

MySQL's load data infile command works well with a hexadecimal delimiter like X'01' (or X'1e' in my case), but the same load data infile command cannot be run on MemSQL.
I tried specifying various forms of the same delimiter \x1e, like:
'0x1e' or 0x1e
X'1e'
'\x1e' or 'x1e'
None of the above work; they throw either a syntax error or another error, like the ones below.
This looks like the delimiter isn't being resolved correctly:
mysql> load data local infile '/container/data/sf10/region.tbl.hex' into table REGION CHARACTER SET utf8 fields terminated by '\x1e' lines terminated by '\n';
ERROR 1261 (01000): Row 1 doesn't contain data for all columns
This is syntax error:
mysql> load data local infile '/container/data/sf10/region.tbl.hex' into table REGION CHARACTER SET utf8 fields terminated by 0x1e lines terminated by '\n';
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '0x1e lines terminated by '\n'' at line 1
mysql>
The data is actually delimited by the non-printable hexadecimal character \x1e, with lines terminated by a regular \n. Using cat -A, the delimiter characters show up as ^^, so the delimiter itself should be correct.
$ cat -A region.tbl.hex
0^^AFRICA^^lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to $
1^^AMERICA^^hs use ironic, even requests. s$
Is there a correct way to use hex values as a delimiter? I can't find such information in the documentation.
For comparison, the hex delimiter (0x1e) works well on MySQL:
mysql> load data local infile '/tmp/region.tbl.hex' into table region CHARACTER SET utf8 fields terminated by 0x1e lines terminated by '\n';
Query OK, 5 rows affected (0.01 sec)
Records: 5 Deleted: 0 Skipped: 0 Warnings: 0
MemSQL has supported hex delimiters since version 6.7, in the form shown in the last code block of your question. Prior to that, you would need the literal quoted 0x1e character in your SQL string, which is annoying to do from a CLI. If you're on an older version, you may need to upgrade.
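In other words, on MemSQL 6.7 or later the unquoted hex-literal form from your last code block should be accepted as-is (a sketch, reusing the table and file from the question):
load data local infile '/container/data/sf10/region.tbl.hex' into table REGION CHARACTER SET utf8 fields terminated by 0x1e lines terminated by '\n';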

How to store Japanese kanji character in Oracle database

This Japanese character 𠮷, which takes four bytes in UTF-8, is saved as ???? in the Oracle database, whereas other Japanese characters are saved properly.
The configuration in boot.rb of my rails application contains:
ENV['NLS_LANG'] = 'AMERICAN_AMERICA.UTF8'
and SQL Developer shows the following for the Oracle database:
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CHARACTERSET JA16SJISTILDE
The datatype of the column is NVARCHAR2.
Try NLS_LANG=AMERICAN_AMERICA.AL32UTF8
The Oracle character set UTF8 is actually CESU-8, whereas AL32UTF8 is the commonly known UTF-8.
If you stay within the Basic Multilingual Plane (BMP), UTF8 and AL32UTF8 are equivalent; however, for characters above U+FFFF they differ.
𠮷 is U+20BB7, which is in the Supplementary Ideographic Plane.
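Since the column is NVARCHAR2, the value is stored in the national character set (NLS_NCHAR_CHARACTERSET) rather than NLS_CHARACTERSET. To confirm what the server is actually using, a query against the standard dictionary view can help (a minimal sketch):
select parameter, value
from nls_database_parameters
where parameter in ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');
If NLS_NCHAR_CHARACTERSET is AL16UTF16 (the usual default), the NVARCHAR2 column itself can hold the supplementary character, and the ???? is most likely caused by the client-side NLS_LANG conversion described above.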

Firedac select working with Firebird returns no records

Hello, I'm working with FireDAC (Delphi Seattle) using Firebird (2.5) as the database. When I run this query using a TFDQuery, no records are returned:
SELECT ID FROM USERS WHERE PWD = 'êHÆ–!+'
The same query run in a database tool such as IBExpert returns one record. Is there some parameter of the FireDAC components I can configure to solve this issue? Thanks.
The problem is in the query string, and it's the ! character. By default, query strings are preprocessed, and you must escape constant characters like !, &, :, ?, { or }, otherwise they are treated as special characters.
Your best option is to use parameters. Among other benefits, that removes the ! character from the preprocessed command:
FDQuery.SQL.Text := 'SELECT ID FROM USERS WHERE PWD = :Password';
FDQuery.ParamByName('Password').AsString := 'êHÆ–!+';
FDQuery.Open;
Another option is escaping that constant character or disabling the macro preprocessor. For more information, see the Special Character Processing topic.

Encoding error PostgreSQL 8.4

I am importing data from a CSV file. One of the fields has an accent (Telefónica O2 UK Limited). The application throws an error while inserting the data into the table.
PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xf36e6963
HINT: This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by "client_encoding".
: INSERT INTO "companies" ("name", "validated")
VALUES(E'Telef?nica O2 UK Limited', 't')
Data entry through the forms works when I enter names with accents and umlauts.
How do I work around this issue?
Edit
I addressed the issue by converting the file's encoding: I uploaded the CSV file to Google Docs and exported it back to CSV.
The error message is pretty clear: your client_encoding setting is UTF8, and you are trying to insert a character that isn't encoded in UTF8 (if it's a CSV from MS Excel, your file is probably encoded in Windows-1252 instead).
You could either convert it in your application or alter your PostgreSQL connection to match the encoding you want to insert (thus letting PostgreSQL do the conversion for you). You can do so by executing SET CLIENT_ENCODING TO 'WIN1252'; on your PostgreSQL connection before trying to insert that data. After the import, you should reset it to its original value with RESET CLIENT_ENCODING;
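Put together, the sequence looks something like this (a sketch, assuming the file really is Windows-1252 encoded; run it on the same connection that performs the inserts):
SET CLIENT_ENCODING TO 'WIN1252';
-- the INSERTs generated from the CSV go here, e.g.
INSERT INTO "companies" ("name", "validated") VALUES (E'Telefónica O2 UK Limited', 't');
RESET CLIENT_ENCODING;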
HTH!
You could also try the Ruby gem rchardet, which may be a better solution. Example code:
require 'rchardet'
require 'iconv' # Iconv is needed for the conversion below (stdlib in Ruby < 2.0)
cd = CharDet.detect(str_of_unknown_encoding)
encoding = cd['encoding']
converted_string = Iconv.conv('UTF-8', encoding, str_of_unknown_encoding)
Here are some related links:
https://github.com/jmhodges/rchardet
http://www.meeho.net/blog/2010/03/ruby-how-to-detect-the-encoding-of-a-string/
