How to store a byte array field in Elasticsearch using Spring Data Elasticsearch without indexing it? - spring-data-elasticsearch

I need to store a byte array field in Elasticsearch without indexing it. I'm using the Spring Data Elasticsearch module. What is the right way? Thanks

The correct field type in Elasticsearch for this would be the binary field type. Alas, this is currently not available in Spring Data Elasticsearch; I just created a Jira issue for this.
Even if the binary field type were implemented, you still would need to base64 encode the binary data, so that in Elasticsearch it would be stored in a text representation.
Until this binary field type is implemented, you might try to use a field definition like:
@Field(type = FieldType.Keyword, index = false)
private String base64Data;
As with the binary field type, you have to encode your data as a base64 String and decode it when it comes back from the search. It would be even better if you could add the doc_values = false argument to the @Field annotation; there is currently a pull request open to add support for this, but it is not yet ready to be merged, and I'm not sure whether it will make it into the 3.2.0 release.
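For illustration, the encode/decode step around such a field might look like this minimal sketch using java.util.Base64 (the entity accessor names here are made up):
// storing: convert the raw bytes to a base64 String before saving the entity
entity.setBase64Data(Base64.getEncoder().encodeToString(rawBytes));
// reading: convert the base64 String back to bytes after a search returns the entity
byte[] original = Base64.getDecoder().decode(entity.getBase64Data());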
Edit March 2020:
In version 4.0.0, you can use the FieldType.Binary type together with a byte[], and this will be converted to a base64-encoded string (and back again).
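A minimal entity sketch using this, assuming version 4.0.0 or later (the class and field names are only examples):
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = "attachments")
public class Attachment {

    @Id
    private String id;

    // Stored in Elasticsearch as a base64-encoded string; binary fields are
    // neither indexed nor searchable, so this is storage only.
    @Field(type = FieldType.Binary)
    private byte[] data;

    // getters and setters omitted
}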

Related

How to define a protobuf message so that an ActiveRecord protobuf message can be passed directly as an input parameter to Ruby gRPC code compiled from the proto definition?

We are creating a gRPC server using proto3 and compiling it into Ruby code. We have converted the ActiveRecord record into a protobuf message (obtained by calling the record's .to_proto method) using the ActiveRecord-Protobuf gem. However, when defining the service for the Ruby server, the type of the input and output values has to be declared as a message in proto3, and the generated Ruby class only accepts a hash. So we have to convert our .to_proto object again with .to_proto.to_hash. This is wasteful and reduces the efficiency of gRPC, which is ironic, since we are moving to gRPC for its efficiency. Could you suggest how to define the protobuf message (using proto3) so that the .to_proto message is compatible with the proto3 input value definition?
This is the ActiveRecord protobuf message class:
class AppPreferenceMessage < ::Protobuf::Message
  optional ::Protobuf::Field::Int64Field, :id, 1
  optional ::Protobuf::Field::StringField, :preference_key, 2
  optional ::Protobuf::Field::StringField, :value, 3
  optional ::Protobuf::Field::StringField, :value_type, 4
  optional ::Protobuf::Field::Int64Field, :vaccount_id, 5
end
AppPreference.last.to_proto returns an instance of this class, which is a protobuf message.
My protobuf definition of the service and its input/output parameters is as follows:
syntax="proto3";
service App_Preferences{
rpc Index(Empty) returns (Index_Output){}
}
message Index_Output{
int64 id=1;
string preference_key=2;
string value=3;
string value_type=4;
int64 vaccount_id = 5;
}
This parameter "Index_Output" only accepts AppPreference.last.to_proto.to_hash but however I want it to accept AppPreference.last.to_proto as an input. How do i change my protobuf code.
I think you would like to convert this AppPreferenceMessage object to a protobuf message and then pass it to an RPC method that has been generated from a separate proto file? Also, I'm unfamiliar with what AppPreferenceMessage.to_proto returns; is it serialized bytes?
I'm not sure that I'm completely clear, but it sounds like you might not want to use the grpc-protobuf stub and service code generators, as these are designed with passing of ruby protobuf objects in mind.
There are some examples in https://github.com/grpc/grpc/tree/master/examples/ruby/without_protobuf that might be useful if skipping the grpc-protobuf code gen is what you need.

Decrypt AES-256-CBC String (need IV, string/data format?)

I've been going around in circles from Apple's CCCrypto docs, frameworks and other SO answers and am not making any headway.
I think I need to figure out how to get an IV from an encrypted string that I receive.
I receive a JSON payload which contains a String. That string is encrypted in AES-256-CBC. (From a Laravel PHP instance that I think uses OpenSSL). The string itself, decrypted, is another JSON object.
I have a pre-defined key.
The string I receive looks something like:
eJahdkawWKajashwlkwAkajsne8ehAhdhsiwkdkdhwNIEhHEheLlwhwlLLLLhshnNWhwhabwiIWHWHwh=
(but is a lot longer).
I'm trying to use this answer here: Issue using CCCrypt (CommonCrypt) in Swift
But am a) unsure if I'm properly converting the string to data and b) how to get the IV (initialization vector) from the string I receive.
Using that answer I do get "success"; however, when I try to pass the result to NSJSONSerialization it never gives a good result (it always fails). I do get data out, but I think it's garbage.
Edit:
I really misunderstood my original problem: I was receiving a base64-encoded string that I needed to decode into JSON (which went fine). Then, using the linked answer and importing CommonCrypto, I thought I'd be able to get usable data, but I am not. @Rob Napier's answer is extremely helpful. I think my problem is that the instance of Laravel in question is using OpenSSL.
There is no really commonly used standard format for AES-encrypted data (there are several "standard formats", but they're not commonly used). The only way to know how the data you have is encrypted is to look at the documentation for the data format or, failing that, the encrypting code itself.
In good encryption formats, the IV is sent along with the data. But in many common (insecure) formats, there is a hard-coded IV (sometimes 16 bytes of 0x00). If there's a password, you also need to find out how they've converted the password to a key (there are several ways to do this, some good, some horrible). In a good format, the key derivation may include some random "salt" that you need to extract from the data. You'll also need to know if there is an HMAC or similar authentication (which might be stored at the beginning or the end of the data, and may include its own salt).
There just isn't any good way to know without documentation from the sender. Any decently encrypted format is going to look like random noise, so figuring it out just by looking at the final message is pretty hard.
If this comes out of Laravel's encrypt function, then that seems to be ultimately this code:
public function encrypt($value)
{
    $iv = mcrypt_create_iv($this->getIvSize(), $this->getRandomizer());

    $value = base64_encode($this->padAndMcrypt($value, $iv));

    // Once we have the encrypted value we will go ahead base64_encode the input
    // vector and create the MAC for the encrypted value so we can verify its
    // authenticity. Then, we'll JSON encode the data in a "payload" array.
    $mac = $this->hash($iv = base64_encode($iv), $value);

    return base64_encode(json_encode(compact('iv', 'value', 'mac')));
}
If this is correct, then you should have been passed base64-encoded JSON with three fields: the IV (iv), the ciphertext (value), and what looks like an HMAC encrypted using the same key as the plaintext (mac). The data you've given above doesn't look like JSON at all (even after base-64 decoding).
This assumes that the caller used this encrypt function, though. There are many, many ways to encrypt, though, so you need to know how the actual server you're talking to did it.
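If the payload really does follow that format and the cipher really is AES-256-CBC (as the question title says; older Laravel releases used mcrypt's Rijndael-256, which the sketch below would not handle), unpacking it on the client would look roughly like this Java sketch. The jsonField helper is only a stand-in for a real JSON parser, and a real implementation should also verify the mac field before trusting the data:
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class LaravelPayloadSketch {

    // Decrypt a Laravel-style payload: base64(JSON{iv, value, mac}) -> plaintext bytes.
    public static byte[] decrypt(String payload, byte[] key) throws Exception {
        // 1. The outer payload is base64-encoded JSON with "iv", "value" and "mac" fields.
        String json = new String(Base64.getDecoder().decode(payload), StandardCharsets.UTF_8);
        byte[] iv = Base64.getDecoder().decode(jsonField(json, "iv"));
        byte[] ciphertext = Base64.getDecoder().decode(jsonField(json, "value"));

        // 2. Decrypt "value" with AES-CBC, using the shared key and the IV taken from the payload.
        //    (The "mac" field should also be checked against an HMAC of the payload before this.)
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(ciphertext);
    }

    // Stand-in JSON field extractor; use a real JSON library in practice.
    private static String jsonField(String json, String name) {
        java.util.regex.Matcher m = java.util.regex.Pattern
                .compile("\"" + name + "\"\\s*:\\s*\"([^\"]+)\"").matcher(json);
        if (!m.find()) throw new IllegalArgumentException("missing field: " + name);
        return m.group(1);
    }
}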

XML Schema - Allow Invalid Dates

Hi, I am using BizTalk's flat file parser (using an XML schema) to parse a CSV file. The CSV file sometimes contains an invalid date, 1/1/1900. Currently the schema validation for the flat file fails because of the invalid date. Is there any setting that I can use to allow the date to be used?
I don't want to read the date as a string. I might be forced to if there is no other way.
You could change it to a valid XML dateTime (e.g., 1900-01-01T00:00:00Z) using a custom pipeline component (see examples here). Or you can just treat it as a string in your schema and deal with converting it later in a map, in an orchestration, or in a downstream system.
Here is a C# snippet that you could put into a scripting functoid inside a BizTalk map to convert the string to an xs:dateTime, though you'll need to do some more work if you want to handle the potential for bad input data:
public string ConvertStringDateToDateTime(string inputDate)
{
    // "s" yields the sortable ISO 8601 format, e.g. 1900-01-01T00:00:00
    return DateTime.Parse(inputDate).ToString("s", System.Globalization.DateTimeFormatInfo.InvariantInfo);
}
Also see this blog post if you're looking to do that in multiple places in a single map.

SHA256 implementation using Base64 for input and output

I've been asked to develop the company's backoffice for the iPad and, while developing the login screen, I've run into an issue with the authentication process.
The passwords are concatenated with a salt, hashed using SHA-256 and stored in the database.
The backoffice is Flash-based and uses the as3crypto library to hash the password+salt, and my problem is that the current implementation uses Base64 for both input and output.
This site demonstrates how this can be done: just select Hash and select Base64 for both input and output format and fire away. So far, all my attempts have yielded different results from the ones this site (and the backoffice code) give me.
While I think that in theory it should be relatively simple:
Base64 encode the pass+salt
Hash it using SHA-256
Base64 encode the result again
so far I haven't been able to do this and I'm getting quite the headache to be honest.
My code is becoming a living maze; I'll have to redo it tomorrow, I reckon.
Any ideas?
Cheers and thanks in advance
PS: Here's the Backoffice's Flash code for generating hashed passwords by the way:
var currentResult:ByteArray;
var hash:IHash = Crypto.getHash('sha256');
var data:ByteArray = Base64.decodeToByteArray(str + vatel);
currentResult = hash.hash(data);
return Base64.encodeByteArray(currentResult).toString();
The backoffice code does not do
Base64 encode the pass+salt
Hash it using SHA-256
Base64 encode the result again
(as you wrote above)
Instead, what it does is
Base64 decode the pass+salt string into a byte array
Hash the byte array using SHA-256
Base64 encode the resulting hash bytes, returning a string
As per step 1 above, it's unclear what kind of character encoding the input strings use. You need to make sure that both systems use the same encoding for the input strings! UTF8, UTF16-LE or UTF16-BE makes a world of difference in this case!
Start by finding out the correct character encoding to use on the iOS side.
Oh, and Matt Gallagher has written an easy to use wrapper class for hashes to use on iOS, HashValue.m, I've used it with good results.
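To make the pipeline concrete, here is the same sequence of steps as a small Java sketch (purely illustrative, for cross-checking the two sides; it assumes the pass+salt string is already base64, as in the Flash code):
import java.security.MessageDigest;
import java.util.Base64;

public class PasswordHashSketch {

    // Mirror of the Flash code: base64-decode the input, SHA-256 the bytes,
    // then base64-encode the digest and return it as a string.
    public static String hash(String base64PassAndSalt) throws Exception {
        byte[] data = Base64.getDecoder().decode(base64PassAndSalt);
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
        return Base64.getEncoder().encodeToString(digest);
    }
}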

Rails - Saving Mail Attachment in a Postgres DB, results in PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xa0

Has anyone seen this error before?
PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xa0
I'm trying to save incoming mail attachments, of any file type, to the database for processing.
Any ideas?
What type of column are you saving your data to? If the attachment could be of any type, you need a bytea column to ensure that the data is simply passed through as a blob (binary "large" object). As mentioned in other answers, that error indicates that some data sent to PostgreSQL that was tagged as being text in UTF-8 encoding was invalid.
I'd recommend you store email attachments as binary along with their MIME content-type header. The Content-Type header should include the character encoding needed to convert the binary content to text for attachments where that makes sense: e.g. "text/plain; charset=iso-8859-1".
If you want the decoded text available in the database, you can have the application decode it and store the textual content, maybe having an extra column for the decoded version. That's useful if you want to use PostgreSQL's full-text indexing on email attachments, for example. However, if you just want to store them in the database for later retrieval as-is, just store them as binary and leave worrying about text encoding to the application.
The 0xa0 is a non-breaking space, possibly latin1 encoding. In Python I'd use str.decode() and str.encode() to change it from its current encoding to the target encoding, here 'utf8'. But I don't know how you'd go about it in Rails.
I do not know about Rails, but when PG gives this error message it means that:
the connection between Postgres and your Rails client is correctly configured to use UTF-8 encoding, meaning that all text data going between the client and Postgres must be encoded in UTF-8,
and your Rails client erroneously sent some data encoded in another encoding (most probably Latin-1 or ISO-8859); therefore Postgres rejects it.
You must look into your client code where the data is inserted into the database; probably you are trying to insert a non-Unicode string, or there is some improper transcoding taking place.
