Inserting binary data into Varchar2 with OTL (OCCI, OCI)

How do I insert data that might be binary into a Varchar2 with OTL?
(OCI/OCCI would be OK of course)
Background: We have a lot of Varchar2 columns which are generally not binary, but binary data might end up in them somewhere someday (I am especially concerned about \0 and UTF-8).
Update (Tuesday): I posted this related question:
How can I store bytes in Oracle Varchar2, and have ASCII treated as text

If you must use VARCHAR2, you'll need to convert the binary data first, for example using Base64 encoding.
So if you're calling an insert statement from C++, you first encode the bytes you wish to insert in your C++ code, then call the statement to insert the resulting string.
If you wish to insert binary values from another table, it gets trickier, but you can encode them in a PL/SQL function.
But if you can alter the data type, it is probably better to use the RAW datatype instead.
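As a minimal sketch of the encoding route on the SQL side (this assumes Oracle's standard UTL_ENCODE and UTL_RAW packages are available; the table and column names are hypothetical):

-- Hypothetical table: a VARCHAR2 column that must occasionally hold binary data.
CREATE TABLE messages (id NUMBER PRIMARY KEY, payload VARCHAR2(4000));

-- Encode: RAW bytes -> Base64 text, which is safe to store in a VARCHAR2.
INSERT INTO messages (id, payload)
VALUES (1, UTL_RAW.CAST_TO_VARCHAR2(
             UTL_ENCODE.BASE64_ENCODE(HEXTORAW('DEADBEEF00'))));

-- Decode: Base64 text back to the original RAW bytes.
SELECT UTL_ENCODE.BASE64_DECODE(UTL_RAW.CAST_TO_RAW(payload))
  FROM messages
 WHERE id = 1;

Keep in mind that Base64 inflates the data by roughly a third, so a VARCHAR2(4000) column holds only about 3000 bytes of the original binary.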

Related

Avoiding ASCII conversion in serial communication using luars232

I am using the luars232 library for serial communication from Lua. I need to send data bytes without converting them to ASCII, but the write function of luars232 converts the data into ASCII text before transmission even if I pass it to the function as a number. Please suggest possible solutions.
I have worked around the issue by using an escape sequence in a string, e.g. '\2' passes 0x02 on to the serial port via luars232's write function. But this prevents performing mathematical operations on the data before transmission. Further suggestions are welcome.
The library takes the data argument and coerces it to a string via luaL_checklstring using standard Lua rules, so a number such as 2 becomes the one-character ASCII string "2", not the byte 0x02. If you want complete control over the data, you should pass a string. A Lua string is a counted sequence of bytes.
Certainly, as you have found, a literal escaped character sequence will work.
You can also use the string.char(...) function, which takes a list of zero or more values in the range 0-255 and creates a string with those byte values.
If you have a table sequence of bytes, you can unpack them into a list:
local bytes = { 27, 76, 117, 97 }  -- ESC, 'L', 'u', 'a'
-- table.unpack is Lua 5.2+; on Lua 5.1 use the global unpack() instead
port:write(string.char(table.unpack(bytes)))
So, yes, you do have to convert to a string. But, you can defer that until just before the write call.

Which NLS_LENGTH_SEMANTICS for WE8MSWIN1252 Character Set

We have a database where the character set is set to WE8MSWIN1252, which I understand is a single-byte character set.
We created a schema and its tables by running a script with the following:
ALTER SYSTEM SET NLS_LENGTH_SEMANTICS=CHAR
Could we possibly lose data since we are using VARCHAR2 columns with character semantics while the underlying character set is single-byte?
If you are using a single-byte character set like Windows-1252, it is irrelevant whether you are using character or byte semantics. Each character occupies exactly one byte so it doesn't matter whether a column is declared VARCHAR2(10 CHAR) or VARCHAR2(10 BYTE). In either case, up to 10 bytes of storage for up to 10 characters will be allocated.
Since you gain no benefit from changing the NLS_LENGTH_SEMANTICS setting, you ought to keep the setting at the default (BYTE) since that is less likely to cause issues with other scripts that you might need to run (such as those from Oracle).
Excellent question. Multibyte characters take up the number of bytes they require, which could use more storage than you expect. If you store a 4-byte character in a varchar2(4) column, you have used all 4 bytes; if you store the same character in a varchar2(4 char) column, you have only used 1 of the 4 characters. Many languages and special characters require multibyte encodings, so it's best to know your data and define your database columns accordingly. Oracle does NOT recommend changing NLS_LENGTH_SEMANTICS to CHAR because it will affect every new column defined as CHAR or VARCHAR2, possibly including your catalog tables when you do an in-place upgrade. You can see why this is probably not a good idea. Other Oracle toolsets and interfaces may present issues as well.
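To make the difference concrete, here is a minimal SQL sketch (the table name is hypothetical; the multibyte behavior applies under a multibyte character set such as AL32UTF8):

-- Explicit length semantics, independent of the NLS_LENGTH_SEMANTICS setting:
CREATE TABLE semantics_demo (
  by_bytes VARCHAR2(10 BYTE),  -- holds at most 10 bytes
  by_chars VARCHAR2(10 CHAR)   -- holds at most 10 characters, however many bytes each needs
);

-- Under a single-byte character set such as WE8MSWIN1252 the two columns
-- behave identically: every character occupies exactly one byte.
-- Under AL32UTF8, a 3-byte character like the euro sign fits only 3 times
-- in by_bytes (3 x 3 = 9 bytes) but 10 times in by_chars.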

Problem reading a TStream in Delphi XE

In the previous versions of Delphi, the following code:
var InBuf: array[1..45] of Byte;
Count := InStream.Read(InBuf, SizeOf(InBuf));
filled the variable InBuf with the correct values (every byte had a value). Now in Delphi XE, every second byte of the array is 0; I suppose because the Byte data type is twice as big, due to its Unicode nature in Delphi XE. But my streams are already generated and need to pass through this procedure, so I need another type (maybe?) that is half the size, or another solution if someone has faced this problem. Thanks
What has happened here, with >99% probability, is that you have written the stream from a string variable. Unicode strings use the UTF-16 encoding, with two bytes per character, whereas older versions of Delphi used ANSI encodings with one byte per character.
English text, when encoded with UTF-16, has exactly the pattern you observe of every second byte being zero.
In order to solve this you will need to investigate the section of code that writes to the stream.

reading and sorting a variable length CSV file

We are using an OpenVMS system, and I believe it uses HP COBOL.
We have a data file with a lot of records (500 MB or more) of variable length. The records are comma-delimited. I would like to parse each record and extract the corresponding fields for processing. After that, I might want to sort by some particular fields. Is that possible with COBOL?
I've seen sorting with fixed-length records only.
Variable length is no problem. I'm not sure exactly how this is done in VMS COBOL, but the IBM dialect looks like this:
FILE SECTION.
FD  THE-FILE RECORD IS VARYING DEPENDING ON REC-LENGTH.
01  THE-RECORD PICTURE X(5000).
WORKING-STORAGE SECTION.
01  REC-LENGTH PICTURE 9(5) COMPUTATIONAL.
When you read the file, REC-LENGTH will contain the record length; when you write a record, a record of length REC-LENGTH is written.
To handle the delimited record files you will probably need to use the "UNSTRING" verb to convert into a fixed format. This is pretty verbose (but then this is COBOL).
UNSTRING record DELIMITED BY ","
    INTO field1, field2, field3, field4, field5 ...
END-UNSTRING
Once the record is in fixed format you can use the SORT as normal.
The Cobol SORT verb will do what you need.
If the SD file contains variable-length records, all of the KEY data-items must be contained within the first n character positions of the record, where n equals the minimum record size specified for the file. In other words, they have to be in the fixed part.
However, you can get around this easily by using an input procedure. This will let you create a virtual file that has its keys in the right place. In your input procedure, you will reformat your variable, comma delimited, record, into one that has its keys at the front, then "Release" it to the sort.
If my memory is correct, VMS has a SORT/MERGE utility that you could use after you have processed the file into a fixed format (variable may also be possible). Typically a standalone SORT utility performs better than an in-line COBOL SORT, and it can be a better design if the sort criteria change in the future.
No need to write a solution in COBOL, at least not to sort the file. The UNIX sort utility should do it just fine, just call sort -t ',' -n with maybe a couple of other options.

What is binary character set?

I'm wondering what a binary character set is and how it differs from, let's say, the ISO/IEC 8859-1 (aka Latin-1) character set?
There's a page in the MySQL documentation about The _bin and binary Collations.
Nonbinary strings (as stored in the CHAR, VARCHAR, and TEXT data types) have a character set and collation. A given character set can have several collations, each of which defines a particular sorting and comparison order for the characters in the set. One of these is the binary collation for the character set, indicated by a _bin suffix in the collation name. For example, latin1 and utf8 have binary collations named latin1_bin and utf8_bin.
Binary strings (as stored in the BINARY, VARBINARY, and BLOB data types) have no character set or collation in the sense that nonbinary strings do. (Applied to a binary string, the CHARSET() and COLLATION() functions both return a value of binary.) Binary strings are sequences of bytes and the numeric values of those bytes determine sort order.
And so on. Maybe that makes more sense? If not, I'd recommend looking further in the documentation for descriptions of these things. If it's a concept, it should be explained. It usually is. :)
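To see the difference in practice, here is a small MySQL sketch (the table and column names are just for illustration):

-- A nonbinary column with a _bin collation vs. a truly binary column:
CREATE TABLE charset_demo (
  txt VARCHAR(10) CHARACTER SET latin1 COLLATE latin1_bin,  -- characters, byte-wise collation
  bin VARBINARY(10)                                         -- raw bytes, no character set
);

-- Applied to a binary string, CHARSET() reports 'binary':
SELECT CHARSET(_latin1'abc'), CHARSET(BINARY 'abc');

-- Nonbinary comparison with a PAD SPACE collation ignores trailing spaces:
SELECT _latin1'abc' = _latin1'abc ';   -- returns 1
-- Binary strings are compared byte for byte, so the same test fails:
SELECT BINARY 'abc' = BINARY 'abc ';   -- returns 0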
