Not able to manipulate the image after uploading it on Telegram - image-processing

I'm trying to build a Telegram bot for resizing images. I'm using Python with pyTelegramBotAPI as the wrapper. The issue I'm facing is that I'm not able to manipulate the image after I upload it to Telegram. I'm trying to use the Pillow module for image manipulation. As far as I understand, the Bot API provides the "file ID" of the image instead of the image itself, so Pillow cannot do anything with it. Here's the function that I've written so far:
# Handles all sent image files
@bot.message_handler(content_types=['photo'])
def image_resize(message):
    sent_photo = message.photo[-1].file_id
    bot.send_message(message.chat.id, "Enter the desired dimensions (WIDTHxHEIGHT), for example 300x150.")

    @bot.message_handler(content_types=['text'])
    def resize(message):
        width, height = message.text.split("x")
        im = Image.open(sent_photo)
        resized_img = im.resize((width, height))
        bot.send_photo(message.chat.id, resized_img)
The error that I'm getting:
im = Image.open(sent_photo)
File "/home/runner/imageResizertb/venv/lib/python3.8/site-packages/PIL/Image.py", line 3068, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'AgACAgUAAxkBAANyYnk0JGdGDY1_NbbYlfiYamTyrYMAAtmwMRvHg8lXDixv48nY6SYBAAMCAAN5AAMkBA'
Can anybody help me understand how the API handles the image file and how I can manipulate it?

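You are passing the Telegram file ID to Pillow, which treats that string as a local file path. You need to download the file first, either saving it to disk or loading it into memory like this:
from io import BytesIO
from PIL import Image

@bot.message_handler(content_types=['text'])
def resize(message):
    # The reply arrives as text such as "300x150"; Pillow expects integers.
    width, height = (int(n) for n in message.text.lower().split("x"))
    # Resolve the file ID to a path on Telegram's servers, then fetch the bytes.
    file = bot.get_file(sent_photo)
    downloaded_file = bot.download_file(file.file_path)
    # Wrap the bytes in a file-like object so Pillow can read them.
    im = Image.open(BytesIO(downloaded_file))
    resized_img = im.resize((width, height))
    bot.send_photo(message.chat.id, resized_img)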
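The `WIDTHxHEIGHT` reply arrives as text, while Pillow's `Image.resize` expects a tuple of integers, so the text has to be parsed before resizing. A minimal sketch of such a parser follows; the helper name `parse_dimensions` and its validation rules are assumptions for illustration, not part of pyTelegramBotAPI or Pillow.

```python
def parse_dimensions(text):
    """Parse a 'WIDTHxHEIGHT' string such as '300x150' into a tuple of ints.

    Hypothetical helper: raises ValueError when the text is malformed or a
    dimension is not a positive integer.
    """
    parts = text.strip().lower().split("x")
    if len(parts) != 2:
        raise ValueError(f"expected WIDTHxHEIGHT, got {text!r}")
    width, height = (int(p) for p in parts)  # int() raises ValueError on junk
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    return width, height


print(parse_dimensions("300x150"))
```

In a handler, the result could be passed straight to `im.resize(parse_dimensions(message.text))`, with a `ValueError` turned into a reply asking the user to try again.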

Related

Ruby on Rails - How to convert to images some elements from a word document

Context
In our platform we allow users to upload Word documents. Those documents are stored in Google Drive and then downloaded again to our platform in HTML format to create a section where users can interact with that content.
Rails 5.0.7
Ruby 2.5.7p206
selenium-webdriver 3.142.7 (latest stable version compatible with our ruby and rails versions)
Problem
Some of the documents contain charts or graphics that are not processed correctly, so the final result is wrong.
We have been trying to fix this at the moment we receive the Word document, before sending it to Google Drive.
I'm looking for a simple way to export an entire chart and/or table as an image. If anyone knows of a way to do this, the advice would be much appreciated.
Edit 1: Adding some screenshots:
This screenshot is from the original word doc:
And this is how it looks in our systems:
Here are the approaches I have tried that haven't worked for me so far.
Approach 1
Use Nokogiri to read the document and find the nodes that contain the charts (we've found that they are called drawing), then use Selenium to navigate through the file and take a screenshot of that particular section.
The problem we found with this approach is that the versions of our gems are not compatible with the latest versions of Selenium and its web drivers (Chrome or Firefox), so it is not possible to perform this action.
Another problem, which seems to be security-related, is that Selenium is not able to browse to local files and open them.
options = Selenium::WebDriver::Firefox::Options.new(binary: '/usr/bin/firefox', headless: true)
driver = Selenium::WebDriver.for :firefox, options: options
path = "#{Rails.root}/doc_file.docx"
driver.navigate.to("file://#{path}")
# Here occurs the first issue, it is not able to navigate to the file
puts "Title: #{driver.title}"
puts "URL: #{driver.current_url}"
# Below is the code that I am trying to use to replace the images with the modified images
drawing_elements = driver.find_elements(:css, 'w|drawing')
modified_paragraphs = []
drawing_elements.each do |drawing_element|
  paragraph_element = drawing_element.find_element(:xpath, '..')
  paragraph_element.screenshot.save('paragraph.png')
  modified_paragraph = File.read('paragraph.png')
  modified_paragraphs << modified_paragraph
end
driver.quit
file = File.open(File.join(Rails.root, 'doc_file.docx'))
doc = Nokogiri::XML(file)
drawing_elements = doc.css('w|drawing')
drawing_elements.each_with_index do |drawing_element, i|
  paragraph_element = drawing_element.parent
  paragraph_element.replace(modified_paragraphs[i])
end
new_doc_file = File.write('modified_doc.docx', doc.to_xml)
s3_client.put_object(bucket: bucket, key: @document_path, body: new_doc_file)
File.delete('doc_file.docx')
Approach 2
Use Nokogiri to get the drawing elements and then try to convert them directly to images using rmagick or mini_magick.
This only works when the drawing element actually contains an image. The problem arises when the drawing element contains not an image but other elements such as graphicData, pic, blipFill, and blip. The code then has to loop through the element and rebuild it, but at that point the element seems to be malformed and cannot be rebuilt.
Another issue with this approach arises when it finds elements that seem to make up an SVG file. It also needs to loop through all the elements and try to rebuild them, but as with the issue above, the element seems to be malformed.
response = s3_client.get_object(bucket: bucket, key: @document_path)
docx = response.body.read
Zip::File.open_buffer(docx) do |zip|
  doc = zip.find_entry("word/document.xml")
  doc_xml = doc.get_input_stream.read
  doc = Nokogiri::XML(doc_xml)
  drawing_elements = doc.xpath("//w:drawing")
  drawing_elements.each do |drawing_element|
    node = get_chil_by_name(drawing_element, "graphic")
    if node.xpath("//a:graphicData/a:pic/a:blipFill/a:blip").any?
      img_data = node.xpath("//a:graphicData/a:pic/a:blipFill/a:blip").first.attributes["r:embed"].value
      img = Magick::Image.from_blob(img_data).first
      img.write("node.jpeg")
      node.replace("<img src='#{img.to_blob}'/>")
    elsif node.xpath("//a:graphicData/a:svg").any?
      svg_data = node.xpath("//a:graphicData/a:svg").to_s
      Prawn::Document.generate("node.pdf") do |pdf|
        pdf.svg svg_data, at: [0, pdf.cursor], width: pdf.bounds.width
      end
    else
      puts "unsupported format"
    end
  end
  # update the file in S3
  s3.put_object(bucket: bucket, key: @document_path, body: doc)
end
Approach 3
Convert the elements, starting from their parents, to a PDF file and then to an image.
This has basically the same issue as approach 2: it needs to loop through all the elements and try to rebuild them, and we haven't found a way to do that.
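For the embedded-picture case in Approach 2, it may help to note that a .docx file is a ZIP archive and that the `r:embed` attribute on `a:blip` holds a relationship ID (for example `rId5`), not image bytes. The ID is resolved through `word/_rels/document.xml.rels` to a file such as `word/media/image1.png`. The following is a minimal Python sketch of that lookup using only the standard library. The function name `extract_embedded_images` is hypothetical, and the sketch does not handle charts, which are stored as separate chart parts rather than as pictures.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Standard OOXML namespace URIs.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def extract_embedded_images(docx_file):
    """Return {relationship_id: image_bytes} for every a:blip in the document.

    docx_file may be a path or a binary file-like object.
    """
    images = {}
    with zipfile.ZipFile(docx_file) as archive:
        # Map relationship IDs (e.g. "rId5") to targets (e.g. "media/image1.png").
        rels_root = ET.fromstring(archive.read("word/_rels/document.xml.rels"))
        targets = {
            rel.get("Id"): rel.get("Target")
            for rel in rels_root.iter(f"{{{PKG_REL_NS}}}Relationship")
        }
        doc_root = ET.fromstring(archive.read("word/document.xml"))
        for blip in doc_root.iter(f"{{{A_NS}}}blip"):
            rel_id = blip.get(f"{{{R_NS}}}embed")
            target = targets.get(rel_id)
            if target is None:
                continue  # linked (not embedded) picture or dangling reference
            if target.startswith("/"):
                member = target.lstrip("/")  # absolute within the package
            else:
                # Relative targets are resolved against the word/ folder.
                member = posixpath.normpath(posixpath.join("word", target))
            images[rel_id] = archive.read(member)
    return images
```

Each value is the raw content of the image file, which is the kind of blob an image library expects as input.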

unknown url type: '//drive.google.com/drive/folders/11XfAPOgFv7qJbdUdPpHKy8pt6aItGvyg'

I am trying to use a Haar cascade classifier for object detection. I copied some code for the Haar cascade algorithm, but it's not working. It gives this error:
unknown url type: '//drive.google.com/drive/folders/11XfAPOgFv7qJbdUdPpHKy8pt6aItGvyg'
even though this link is working.
import urllib.request, urllib.error, urllib.parse
import cv2
import os

def store_raw_images():
    neg_images_link = '//drive.google.com/drive/folders/11XfAPOgFv7qJbdUdPpHKy8pt6aItGvyg'
    neg_image_urls = urllib.request.urlopen(neg_images_link).read().decode()
    pic_num = 1
    if not os.path.exists('neg'):
        os.makedirs('neg')
    for i in neg_image_urls.split('\n'):
        try:
            print(i)
            urllib.request.urlretrieve(i, "neg/"+str(pic_num)+".jpg")
            img = cv2.imread("neg/"+str(pic_num)+".jpg",cv2.IMREAD_GRAYSCALE)
            # should be larger than samples / pos pic (so we can place our image on it)
            resized_image = cv2.resize(img, (100, 100))
            cv2.imwrite("neg/"+str(pic_num)+".jpg",resized_image)
            pic_num += 1
        except Exception as e:
            print(str(e))

store_raw_images()
I am expecting the output to be a set of negative images for building a dataset for object detection.
I think the missing "https:" at the start of the URL is causing this specific error.
Furthermore, you cannot just load a Drive folder when it is not shared (you should use the Drive link), and even then it is not optimal: you have to parse the HTML response, and it may not even work.
I strongly suggest using a normal HTTP server or the Google Drive Python API.
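`urllib.request.urlopen` raises `ValueError: unknown url type` whenever the URL string has no scheme, which is the case for a scheme-relative URL starting with `//`. Below is a minimal sketch of a guard that adds the scheme before the request is made; `ensure_scheme` is a hypothetical helper name, and defaulting to `https` is an assumption.

```python
from urllib.parse import urlsplit


def ensure_scheme(url, default="https"):
    """Return url unchanged if it has a scheme, else prepend the default one.

    Hypothetical helper for scheme-relative URLs such as '//host/path'.
    """
    if urlsplit(url).scheme:
        return url
    if url.startswith("//"):
        return f"{default}:{url}"
    return f"{default}://{url}"


print(ensure_scheme("//drive.google.com/drive/folders/11XfAPOgFv7qJbdUdPpHKy8pt6aItGvyg"))
```

This only removes the `unknown url type` error. Fetching a Drive folder page still returns HTML rather than a newline-separated list of image URLs, so the rest of the advice above still applies.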

How to get a bitmap image in ruby?

The Google Vision API requires a bitmap as an argument. I am trying to convert a PNG from a URL to a bitmap to pass to the Google API:
require "google/cloud/vision"
PROJECT_ID = Rails.application.secrets["project_id"]
KEY_FILE = "#{Rails.root}/#{Rails.application.secrets["key_file"]}"
google_vision = Google::Cloud::Vision.new project: PROJECT_ID, keyfile: KEY_FILE
img = open("https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png").read
image = google_vision.image img
ArgumentError: string contains null byte
This is the gem's source code that processes the argument:
def self.from_source source, vision = nil
  if source.respond_to?(:read) && source.respond_to?(:rewind)
    return from_io(source, vision)
  end
  # Convert Storage::File objects to the URL
  source = source.to_gs_url if source.respond_to? :to_gs_url
  # Everything should be a string from now on
  source = String source
  # Create an Image from a HTTP/HTTPS URL or Google Storage URL.
  return from_url(source, vision) if url? source
  # Create an image from a file on the filesystem
  if File.file? source
    unless File.readable? source
      fail ArgumentError, "Cannot read #{source}"
    end
    return from_io(File.open(source, "rb"), vision)
  end
  fail ArgumentError, "Unable to convert #{source} to an Image"
end
https://github.com/GoogleCloudPlatform/google-cloud-ruby
Why is it telling me string contains null byte? How can I get a bitmap in ruby?
According to the documentation (which, to be fair, is not exactly easy to find without digging into the source code), Google::Cloud::Vision#image doesn't want the raw image bytes, it wants a path or URL of some sort:
Use Vision::Project#image to create images for the Cloud Vision service.
You can provide a file path:
[...]
Or any publicly-accessible image HTTP/HTTPS URL:
[...]
Or, you can initialize the image with a Google Cloud Storage URI:
So you'd want to say something like:
image = google_vision.image "https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png"
instead of reading the image data yourself.
Instead of using write you want to use IO.copy_stream as it streams the download straight to the file system instead of reading the whole file into memory and then writing it:
require 'open-uri'
require 'tempfile'
uri = URI("https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png")
tmp_img = Tempfile.new(uri.path.split('/').last)
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
IO.copy_stream(URI.open(uri), tmp_img)
Note that you don't need to set the 'r:BINARY' flag as the bytes are just streamed without actually reading the file.
You can then use the file by:
require "google/cloud/vision"
# Use fetch as it raises an error if the key is not present
PROJECT_ID = Rails.application.secrets.fetch("project_id")
# Rails.root is a Pathname object so use `.join` to construct paths
KEY_FILE = Rails.root.join(Rails.application.secrets.fetch("key_file"))
google_vision = Google::Cloud::Vision.new(
  project: PROJECT_ID,
  keyfile: KEY_FILE
)
image = google_vision.image(File.absolute_path(tmp_img))
When you are done you clean up by calling tmp_img.unlink.
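The same streaming idea can be sketched in Python for comparison (this is an analogue, not the Ruby answer's code). `shutil.copyfileobj` copies from a readable binary stream to a file in fixed-size chunks, so the payload is never held in memory as one string. The helper name `stream_to_tempfile` and the in-memory stand-in for the download are assumptions made so the sketch runs without network access.

```python
import io
import os
import shutil
import tempfile


def stream_to_tempfile(source, suffix=""):
    """Copy a binary file-like object to a temporary file in chunks.

    Returns the path of the temporary file; the caller deletes it when done.
    """
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(source, tmp)  # chunked copy, not one big read
        return tmp.name


# Stand-in for a download stream; with a real URL this could be the
# response object returned by urllib.request.urlopen(...).
fake_download = io.BytesIO(b"\x89PNG\r\n\x1a\n\x00\x00binary payload")
path = stream_to_tempfile(fake_download, suffix=".png")
print(os.path.getsize(path))
os.unlink(path)  # clean up, like tmp_img.unlink in the Ruby version
```

Because the bytes are copied without being decoded, null bytes in the image data cause no trouble, and only the resulting file path needs to be handed to the API client.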
Remember to read things in binary format:
open("https://www.google.com/..._272x92dp.png",'r:BINARY').read
If you forget this it might try and open it as UTF-8 textual data which would cause lots of problems.

Converting a pdf to jpeg using Rmagick

I am trying to convert a pdf to a jpeg image using Rmagick. I am running into some trouble with the following code:
pdf_link = "https://staging.shurpa.com/deliveries/BtrPsIxl/label.pdf"
file = Tempfile.new(['order', '.jpeg'])
p pdf_link
p file.path
im = ImageList.new(pdf_link)
puts "SUPP"
im.write(file.path.to_s)
I receive this error:
"https://staging.shurpa.com/deliveries/BtrPsIxl/label.pdf"
"/var/folders/qm/yk_w5d9545j_6wqk6100dhjm0000gq/T/order20170706-43294-15myct1.png"
Magick::ImageMagickError: unable to open file `/var/folders/qm/yk_w5d9545j_6wqk6100dhjm0000gq/T/magick-43294MCNyzIu4Oenn': No such file or directory @ error/constitute.c/ReadImage/544
from /Users/timnaughton/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/rmagick-2.16.0/lib/rmagick_internal.rb:1616:in `read'
However the code works perfectly fine when I feed it this pdf_string:
"https://shippo-delivery-east.s3.amazonaws.com/b2a3e1cd070748cd80b492aa421832a3.pdf?Signature=nf6woycGiOydPI8eSnLcq3r0tEc%3D&Expires=1530816480&AWSAccessKeyId=AKIAJGLCC5MYLLWIG42A"
There appears to have been an issue with the service that was providing the PDF to me. The PDF had recently been changed to a secure state, so access keys were required. As a result, rmagick could not access the file and returned the error above.

Downloaded image file is corrupt

I'm making a simple Lua script to download images. I get the URL of the image, and then this is my code to download it:
content = http.request(imageurl)
file = io.open("E:\\Users\\Me\\Documents\\Lua\\IMGDownload\\output.jpg", "w")
file:write(content)
print("Wrote content")
I get a 4KB file, however it isn't what I want it to be.
For reference, here is the image that I want to download:
RealImage http://cdn.akamai.steamstatic.com/steamcommunity/public/images/avatars/bd/bd05e23129b5d03ecb3f933589ff1477fbff4e92_full.jpg
This is what I actually get:
Can anyone pinpoint the cause?
You probably just need to open the file with "wb" mode to get Windows to open the file in binary mode and not do line-ending conversion on you.
Try io.open("E:\\Users\\Me\\Documents\\Lua\\IMGDownload\\output.jpg", "wb").
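To illustrate why text mode corrupts binary data on Windows, here is a small Python sketch. It simulates the line-ending translation with a byte replacement (an assumption made so the effect is visible on any platform) and compares it with a true binary-mode write. The sample bytes are made up, not taken from the real image.

```python
import os
import tempfile

# Bytes that happen to contain 0x0A (newline) values, as most JPEG data does.
payload = bytes([0xFF, 0xD8, 0x0A, 0x00, 0x0A, 0xFF, 0xD9])

# What a Windows text-mode write would effectively store: every LF becomes CR LF.
text_mode_result = payload.replace(b"\n", b"\r\n")

# Binary mode writes the bytes untouched on every platform.
path = os.path.join(tempfile.mkdtemp(), "output.jpg")
with open(path, "wb") as f:
    f.write(payload)
with open(path, "rb") as f:
    round_trip = f.read()

print(len(payload), len(text_mode_result), len(round_trip))  # prints: 7 9 7
```

The translated copy is two bytes longer and no longer matches the original, which is the kind of damage that makes an image unreadable, while the binary-mode round trip is byte-for-byte identical.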
