botocore.exceptions.ClientError: An error occurred (InvalidTextEncoding) when calling the SelectObjectContent operation - amazon-s3-select

While executing the code below in Python:
response = S3.select_object_content(
    Bucket=S3_bucket_name,
    Key=S3_file_Key,
    ExpressionType='SQL',
    Expression="select count(*) from s3object",
    InputSerialization={
        'CSV': {"FileHeaderInfo": header_usage},
        'CompressionType': compressformat,
    },
    OutputSerialization={'CSV': {}},
)
I get the following error:
Traceback (most recent call last):
File OutputSerialization={'CSV': {}},)
File "/usr/local/lib/python2.7/site-packages/botocore/client.py", line 320, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python2.7/site-packages/botocore/client.py", line 623, in _make_api_call
raise error_class(parsed_response, operation_name)
**ClientError: An error occurred (InvalidTextEncoding) when calling the SelectObjectContent operation: UTF-8 encoding is required. The text encoding error was found near byte 49,152.**
I searched for InvalidTextEncoding in the boto3 documentation but couldn't find anything.
Can you please help me figure this out?
Thanks in advance.

The data you wish to receive has the wrong output serialization. The output serialization describes the format of the data that you want Amazon S3 to return in the response, and you are asking it to return a format that has the wrong encoding. I cannot test your code myself because I only have small pieces of it, but you need to encode your serialized output as UTF-8; otherwise the Amazon S3 storage service can't serialize your response. You probably need to expand OutputSerialization={'CSV': {}} to make sure that your response is encoded in UTF-8.
Maybe these resources can help you:
Select object content parameter guide
select Object Content description

Unfortunately, "UTF-8 encoding is required." means that the object itself does not match the required format.
Reference: https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html
UTF-8 - UTF-8 is the only encoding type Amazon S3 Select supports.
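To check whether an object really is valid UTF-8, one option is to download its bytes (for example with get_object) and try to decode them locally. The sketch below is only an illustration under that assumption: the helper names first_invalid_utf8_offset, diagnose, and reencode_to_utf8 are made up for this example and are not part of boto3. It also checks for the gzip magic bytes, on the assumption that a mismatch between the CompressionType setting and the actual file could produce the same symptom.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def diagnose(raw: bytes):
    """Hypothetical helper: return (is_gzipped, first_bad_offset) for object bytes."""
    # For a gzip-compressed object, check the decompressed text instead of
    # the raw bytes; such an object needs CompressionType='GZIP'.
    compressed = raw.startswith(GZIP_MAGIC)
    text_bytes = gzip.decompress(raw) if compressed else raw
    return compressed, first_invalid_utf8_offset(text_bytes)


def reencode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Convert bytes from a known source encoding (e.g. 'latin-1') to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")
```

If diagnose reports an offset, the file would have to be re-encoded (reencode_to_utf8 shows the idea, provided you know the real source encoding) and uploaded again before S3 Select can query it.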

Related

Which csv format is appropriate for influxdb2?

I'm going to put the csv file into the bucket using influxdb v2.1.
Attempting to insert a simple example file results in the following error:
error in csv.from(): failed to read metadata: failed to read annotations: expected annotation datatype
The csv file that I was going to write is as follows.
#datatype measurement,tag,double,dateTime:RFC3339
m,host,used_percent,time
mem,host1,64.23,2020-01-01T00:00:00Z
mem,host2,72.01,2020-01-01T00:00:00Z
mem,host1,62.61,2020-01-01T00:00:10Z
mem,host2,72.98,2020-01-01T00:00:10Z
mem,host1,63.40,2020-01-01T00:00:20Z
mem,host2,73.77,2020-01-01T00:00:20Z
This is the example data in the official document of influxdata.
If you look at the first line of the example, you can see that datatype is annotated, but why does the error occur?
How should I modify it?
This looks like invalid annotated CSV.
In the csv.from function documentation, you can find examples (as string literals) of both the annotated and the raw CSV that csv.from supports.

Is it possible to read a URL compressed with gzip (tsv.gz) with CsvProvider in F#?

Is it possible to read a URL compressed with gzip (tsv.gz) with CsvProvider in F#? I'm trying with this code:
type name = CsvProvider<"https://datasets.imdbws.com/name.basics.tsv.gz", "\t">
But, I'm getting this error:
The type provider 'ProviderImplementation.CsvProvider' reported an error: Cannot read sample CSV from 'https://datasets.imdbws.com/name.basics.tsv.gz': Couldn't parse row 1 according to schema: Expected 2 columns, got 1
So, is it possible to use a type provider in F# to do easy analysis over compressed CSVs?
You'll need to decompress the file with something like GZipStream, before being able to read it with CsvProvider.
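The decompress-then-parse order is the same in any language. As a rough illustration only, here it is sketched in Python rather than F# (this is not CsvProvider code; read_gzipped_tsv and the sample rows are invented for the example):

```python
import csv
import gzip
import io


def read_gzipped_tsv(raw: bytes):
    """Decompress gzip bytes first, then parse the text as tab-separated rows."""
    text = gzip.decompress(raw).decode("utf-8")
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


# Made-up sample data standing in for the downloaded .tsv.gz bytes.
sample = gzip.compress("id\tname\n1\tAlice\n".encode("utf-8"))
rows = read_gzipped_tsv(sample)
```

Handing the compressed bytes straight to the parser is what produces the "Expected 2 columns, got 1" kind of error, because the parser sees binary data instead of tab-separated text.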

Ruby: Is there a way to specify your encoding in File.write?

TL;DR
How would I specify the mode of encoding on File.write, or how would one save image binary to a file in a similar fashion?
More Details
I'm trying to download an image from a Trello card and then upload that image to S3 so it has an accessible URL. I have been able to download the image from Trello as binary (I believe it is some form of binary), but I have been having issues saving this as a .jpeg using File.write. Every time I attempt that, it gives me this error in my Rails console:
Encoding::UndefinedConversionError: "\xFF" from ASCII-8BIT to UTF-8
from /app/app/services/customer_order_status_notifier/card.rb:181:in `write'
And here is the code that triggers that:
def trello_pics
@trello_pics ||=
card.attachments.last(config_pics_number)&.map(&:url).map do |url|
binary = Faraday.get(url, nil, {'Authorization' => "OAuth oauth_consumer_key=\"#{ENV['TRELLO_PUBLIC_KEY']}\", oauth_token=\"#{ENV['TRELLO_TOKEN']}\""}).body
File.write(FILE_LOCATION, binary) # doesn't work
run_me
end
end
So I figure this must be an issue with the way that File.write converts the input into a file. Is there a way to specify encoding?
AFAIK you can't do it at the time of performing the write, but you can do it at the time of creating the File object; here is an example with UTF-8 encoding:
File.open(FILE_LOCATION, "w:UTF-8") do |f|
  f.write(....)
end
Another possibility would be to use the external_encoding option:
File.open(FILE_LOCATION, "w", external_encoding: Encoding::UTF_8)
Of course, this assumes that the data being written is a String. If you have (packed) binary data, you would use "wb" for opening the file, and syswrite instead of write to write the data to the file.
UPDATE As engineersmnky points out in a comment, the arguments for the encoding can also be passed as parameter to the write method itself, for instance
IO::write(FILE_LOCATION, data_to_write, external_encoding: Encoding::UTF_8)

File object from URL

I'd like to create a file object from an image located at a specific url. I'm downloading the file with Net Http:
img = Net::HTTP.get_response(URI.parse('https://prium-solutions.com/wp-content/uploads/2016/11/rails-1.png'))
file = File.read(img.body)
However, I get ArgumentError: string contains null byte when trying to read the file and store it in the file variable.
How can I do this without having to store it locally?
Since File deals with reading from storage, it's really not applicable here. The read method is expecting you to hand it a location to read from, and you're passing in binary data.
If you have a situation where you need to interface with a library that expects an object that is streaming, you can wrap the string body in a StringIO object:
file = StringIO.new(img.body)
# you can now call file.read, file.seek, file.rewind, etc.

Writing valid Excel file

I have an endpoint that receives emails with .xlsx files as attachments. I want to save these files in my app so I can access the data later.
After receiving the mail and its attachment, which has a mime_type of application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, I call
path = "data/emails/#{attachment.filename}"
File.write(path, attachment.body.decoded)
but I get this error:
Encoding::UndefinedConversionError: "\x85" from ASCII-8BIT to UTF-8
When I add .force_encoding('utf-8') to the decoded body, the write succeeds, but the resulting file is invalid. I cannot open it normally or access its data.
How do I write a normal Excel file?
Does this work?
File.open( path, "w+b", 0644 ) { |f| f.write attachment.body.decoded }
Taken from here:
https://cbpowell.wordpress.com/2011/01/17/saving-attachments-with-ruby-1-9-2-rails-3-and-the-mail-gem/
