Replace special characters in the alternate text of an image - str-replace

I am trying to generate alternate text for the images in my blog to make my work easier, and I found a mistake in the execution of this script.
If the image name is "image_good_looking.jpg", the output will be "image_good_looking".
That is good up to a point. If the image name is "image good looking.jpg", it changes to "image+good+looking.jpg" on upload. I tried filename.replace("+"," ") and title.replace("+"," "), but the title and alternate text of the image do not change in the output.
this script must be placed after section
var filename = $img.attr('src')
$img.attr('title', filename.substring((filename.lastIndexOf('/'))+1, filename.lastIndexOf('.')));

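$img.attr('title', filename.replace(/\W/g, " "));
With the g flag, this replaces every non-word character (anything other than letters, digits, and underscores) with a space. Without the flag, only the first match is replaced.
Note that replace returns a new string rather than modifying filename, so its result has to be used, as it is here.
Is this what you are looking for?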

Related

regex to extract URLs from text - Ruby

I am trying to detect the URLs in a text and wrap them in quotes, like below:
original text: Hey, it is a url here www.example.com
required text: Hey, it is a url here "www.example.com"
The original text shows my input value and the required text shows the output I need. I searched the web a lot but could not find a solution. I have already tried URI.extract, but it doesn't seem to detect URLs without http or https. Below are examples of some of the URLs I want to deal with. Kindly let me know if you know the solution.
ANQUETIL-DUPERRON Abraham-Hyacinthe, KIEFFER Jean-Luc, www.hominides.net/html/actualites/outils-preuve-presence-hominides-asie-0422.php,Les Belles lettres, 2001.
https://www.ancient-code.com/indian-archeologists-stumbleacross-ruins-great-forgotten-civilization-mizoram/
www.jstor.org/stable/24084454
www.biorespire.com/2016/03/22/une-nouvelle-villeantique-d%C3%A9couverte-en-inde/
insu.cnrs.fr/terre-solide/terre-et-vie/de-nouvellesdatations-repoussent-l-age-de-l-apparition-d-outils-surle-so
www.cerege.fr/spip.php?page=pageperso&id_user=94
Find words that look like URLs:
str = "ANQUETIL-DUPERRON Abraham-Hyacinthe, KIEFFER Jean-Luc, www.hominides.net/html/actualites/outils-preuve-presence-hominides-asie-0422.php,Les Belles lettres, 2001.\n\nhttps://www.ancient-code.com/indian-archeologists-stumbleacross-ruins-great-forgotten-civilization-mizoram/\n\nwww.jstor.org/stable/24084454\n\nwww.biorespire.com/2016/03/22/une-nouvelle-villeantique-d%C3%A9couverte-en-inde/\n\ninsu.cnrs.fr/terre-solide/terre-et-vie/de-nouvellesdatations-repoussent-l-age-de-l-apparition-d-outils-surle-so\n\nwww.cerege.fr/spip.php?page=pageperso&id_user=94"
str.split.select{|w| w[/(\b+\.\w+)/]}
This will give you an array of words that contain no spaces and include one or more . characters, which MIGHT work for your use case.
puts str.split.select{|w| w[/(\b+\.\w+)/]}
www.hominides.net/html/actualites/outils-preuve-presence-hominides-asie-0422.php,
https://www.ancient-code.com/indian-archeologists-stumbleacross-ruins-great-forgotten-civilization-mizoram/
www.jstor.org/stable/24084454
www.biorespire.com/2016/03/22/une-nouvelle-villeantique-d%C3%A9couverte-en-inde/
insu.cnrs.fr/terre-solide/terre-et-vie/de-nouvellesdatations-repoussent-l-age-de-l-apparition-d-outils-surle-so
www.cerege.fr/spip.php?page=pageperso&id_user=94
Updated
Complete solution to modify your string:
str_with_quote = str.clone # make a clone for the `gsub!`
str.split.select{|w| w[/(\b+\.\w+)/]}
.each{|url| str_with_quote.gsub!(url, '"' + url + '"')}
Now your cloned object wraps URLs inside double quotes:
puts str_with_quote
This will give you this output:
ANQUETIL-DUPERRON Abraham-Hyacinthe, KIEFFER Jean-Luc, "www.hominides.net/html/actualites/outils-preuve-presence-hominides-asie-0422.php,Les" Belles lettres, 2001.
"https://www.ancient-code.com/indian-archeologists-stumbleacross-ruins-great-forgotten-civilization-mizoram/"
"www.jstor.org/stable/24084454"
"www.biorespire.com/2016/03/22/une-nouvelle-villeantique-d%C3%A9couverte-en-inde/"
"insu.cnrs.fr/terre-solide/terre-et-vie/de-nouvellesdatations-repoussent-l-age-de-l-apparition-d-outils-surle-so"
"www.cerege.fr/spip.php?page=pageperso&id_user=94"

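The output above pulls ",Les" into the first quoted URL because the text is split on whitespace only. Here is a minimal sketch of an alternative, assuming a heuristic pattern (URL_PATTERN and quote_urls are hypothetical names, not part of the answer above) that matches an optional scheme, a dotted host name, and a path that stops at whitespace, commas, or quotes:

```ruby
# Heuristic, hypothetical pattern: optional http(s) scheme, a dotted host
# name ending in a 2+ letter label, and an optional path that stops at
# whitespace, a comma, or a double quote.
URL_PATTERN = %r{(?:https?://)?(?:[\w-]+\.)+[a-z]{2,}(?:/[^\s,"]*)?}i

# Returns a copy of text with every URL-like match wrapped in double quotes.
def quote_urls(text)
  text.gsub(URL_PATTERN) { |url| %("#{url}") }
end

puts quote_urls("Hey, it is a url here www.example.com")
# prints: Hey, it is a url here "www.example.com"
```

This is only a heuristic: it can still match non-URLs such as "end.Next", and a path keeps any trailing period, so it may need tightening for other inputs.
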
Saving image in plot window after placing points in plot

Using Octave, I am able to show an image and then plot some red circles over it, as follows:
tux = imread('tux.png');
imshow(tux);
hold on;
plot(100,100,'r','markersize', 10);
plot(150,200,'r','markersize', 10);
The above code displays a window with the image and the plotted points.
My question is: how can I save this image as it is shown inside the window?
Thank you very much!
Pretty simple. Use:
print -djpg image.jpg
print is a command in Octave that captures what is currently shown in the current figure window. -d specifies the output device you want to write to. There are multiple "devices" you can use to save to a file: EPS, PS, TEX, etc. A device can also be an image writer, so here I chose JPEG. You can choose any other image format that Octave supports. Take a look at the documentation for print for more details.
After that, you just specify the file name you want to save the plot to. In this case, I chose image.jpg.
You can also take a look at saveas. Make sure you get a handle to the current figure first before doing so:
h = gcf;
saveas(h, "image.jpg");
Also, a more point-and-click approach would be to go to File -> Save As in the figure window that your image is displayed in :)
You can use print to save your plot to a file:
print (FILENAME, OPTIONS) % for the current figure
print (H, FILENAME, OPTIONS) % for the figure handle H
Also take a look at saveas:
saveas (H, FILENAME)

Rails/Ruby save image as base64 and access it in the views

I would like to know whether we can convert an image to base64, save it in a database, and access it in the views.
I have searched Google and Stack Overflow, but all the answers start in the middle, with encoding or displaying the image.
I need to know how to convert an image from its URL/path (let's say I store the image inside my app and its URL is stored in a column).
How do I encode it as base64 before saving (should we convert to base64 first and then save it in the DB)?
How do I display it in the views?
You can split this task into three or four steps:
getting the image
encoding to base64
storing it in database (optionally)
displaying it in views
Getting the image
From the asset pipeline
If you are using the Rails asset pipeline, you can use the Rails.application.assets hash to get the image: Rails.application.assets['image_name.png'].to_s will give you the content of the image_name.png image.
From a file (local or by URL)
Here is the question about that on StackOverflow.
Encode
The Base64 Ruby module docs explain how to use Base64 encoding in Ruby:
Base64.strict_encode64(your_content_here)
NOTE: in this case strict_encode64 is preferable to plain encode64 because it doesn't add any newlines. (Credit goes to Sergey Mell for pointing that out.)
From docs:
encode64 - ... Line feeds are added to every 60 encoded characters.
strict_encode64 - ... No line feeds are added.
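To see the difference described in the docs, here is a small sketch that encodes the same bytes both ways. The sample bytes are an assumption standing in for real image data:

```ruby
require "base64"

# 100 arbitrary bytes stand in for real image content (illustrative only).
sample = "\xFF\xD8\xFF\xE0".b * 25

wrapped = Base64.encode64(sample)        # inserts line feeds
strict  = Base64.strict_encode64(sample) # single line, no line feeds

puts wrapped.include?("\n")              # line feeds are present
puts strict.include?("\n")               # none here
puts wrapped.delete("\n") == strict      # same payload once line feeds are removed
```

Apart from the line feeds, the two encodings carry the same characters, so either one decodes back to the original bytes.
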
Store it in database (optionally)
I suggest creating a separate ActiveRecord model for that, with a field of type text to hold the base64 representation of the image.
Display it in views
You can provide a data URL to the src attribute of an img tag, so the browser will decode the image from base64 and display it just like a regular image:
<img src="data:image/png;base64,YOUR_BASE64_HERE"/>
Don't forget to change the image format in the data:image/png part to whatever format you are using.
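As a rough sketch of how the src value could be built in Ruby (image_data_uri is a hypothetical helper name, not a Rails API, and the default MIME type is an assumption):

```ruby
require "base64"

# Hypothetical helper: builds a data URL for an <img> src attribute from
# raw image bytes. Pass the MIME type that matches the actual image format.
def image_data_uri(bytes, mime_type = "image/png")
  "data:#{mime_type};base64,#{Base64.strict_encode64(bytes)}"
end

puts image_data_uri("hi")
# prints: data:image/png;base64,aGk=
```

With a real file, the bytes could come from File.binread on the image path, with "image/jpeg" (or whichever type applies) passed as the second argument.
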
UPDATE (2018-08-22): I tried urlsafe_encode64, as suggested by Xornand, and for me it produces output that the browser does not recognize as an image.
Tried in both Firefox 61.0.2 and Chromium 68.0.3440.106.
For reference, and to enable experimentation, here are the results themselves.
Image used as "original" (I resized it to make it even smaller, to reduce the size of the base64 output):
encode64:
/9j/4AAQSkZJRgABAQEAYABhAAD/4QBARXhpZgAASUkqAAgAAAABAGmHBAAB
AAAAGgAAAAAAAAACAAKgCQABAAAAZAAAAAOgCQABAAAAfwAAAAAAAAD/2wBD
AAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwc
KDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIy
MjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/
/gA0T3B0aW1pemVkIGJ5IEpQRUdtaW5pIDMuMTQuMTQuNzI2NzA4NjAgMHhm
ZjIzZjM3OQD/wAARCAB/AGQDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAA
AAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIh
MUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3
ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKT
lJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi
4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQF
BgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMi
MoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZH
SElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJma
oqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq
8vP09fb3+Pn6/9oADAMBAAIRAxEAPwDyNj85+tBbFEvE7j0Y/wA6YxrA2HIx
Jqwp4rS0LwvfayyspWKFj99z1+grvrf4TCSEMLqVzjqu3+VTKSRUYtnmQrS0
WTy9Tib0DfyNdNq/wz1WwUvbAyp6MMH/AArl7eKaw1JUuImjdcgq4x2NS2mt
CopqSuZkLf6Vdn/ZevR/ADgaHAv/AE9E/qteaQn9/de6NXoHw/c/YLVSetye
PxFKrsXT/wAzoPFigaovH/LIfzNYHet/xW2dVHtGP61gfxVqcKLMNXojjFUI
zVyI0AaSN8tFRKw2iigDxi9+S8mX0c1b0mxa6nDNGXUnCr/eNPNgbu5aduIz
g4HU1oaZdvp16JBErIo2hCe1M3SO20uKGwgVZdW0+G9Yjy4XfcR7EjgfSuv8
M6pqc91LZzWitLD1aORQCOxGOoI9K8t1W90vU9k0cM1vMgySuOvpVrwvrM1l
qguXnYkptLE9QOgqJQW5tGfQ92Sa4+YXMIEZ/vNkiud8Q+GdM1C4U3EAKn7r
rwy/jV3Tdfi1G1yrBiv3sH2qw7xOgjBJ5PfNc8tNUax8zx7xb8OJtAt5tU09
3uLIofMB+9HnufUe9R+BDstrTP8Az8/4V7KGla3aEhJIWUgqwzkHqD7Vxh0z
StHuGitrRYvn3hdxIBP16UOfMrMpQszO8TtnVT7ItYWfmrvP7PttQgDSJGc8
cZyKzbvwaWQvZTZcfwMev41pGtF7nJLDSitNTnY2xVqFqptFLbzNDMjJIhwV
YYIqaNua2Odmkr8UVCrfLRQI8fkuZAkTq7KSuCQfSrumTmdzE7Ev1BJ5PtVc
2h8pR6E01LaRHDoSGByCO1O6Ntbm6I8KSfujnJNLp5Ny2YSMZPzH/CobK9md
TBOvDcbgOPxrS0q3FrMzMg2hSMdATSRdzodD1M6ULhgcuy8Z9feuz8K6m7Rs
8nzDOcV5vZ2sjYeT+Lk5712emSER4XACjtUTjc0gzupbw+SZVXI6kKOgrh9V
kj1LV5o7dlMipn3Fbdtq4is5uN+xGZlxyR3rza21Rv7QkugzbgeOe1Z8hpzn
daTPIlqI5wWKjhgBW7pV1NNOqgDHGD2+hFcPp+tQFCJxt3Hsa6fTr2OJCImK
c8NwSKiUS1I1fFehW+sWMl1arjULZMsg6yIO2PUdq82j4Nes6JdFplaQDzF4
LDoR/hXA+KdM/szxDcIi4hlPmx49D2/A5Fa05aWZx14Wd0ZyniimA8UVqYHI
f2YxUAKT9BT10iU/8smrtFt+igKDj0xUgszj+EAjpnvWWp1XRx8eiTHny8fU
1ZXSJY0JbBXqQCa61bMocMB+dP8AsIOFwSPXrRewHPQSIqtnAA/StbSbyJWd
JABjpmsnVITp95sK5VxxVCZ7hIfNAIReorQSN241rZp14DgE5xjqPpXG2MFz
OGkWKVgxyQqkk0GZ76QDkAnkDvWnYtJB8ylyxGAFOD/Opehe4W1o8TgzFkXI
3BgRj2Nd3oFpaRyrKLndkgDL8EHsa4GUzzXn7wSIrDOWFamlXX2Z1UuXj6Z6
YOf8DWctTRHpVsklndgRO5CneoI6gfeXjrxn8q534mzXE8lk9kkxePdv8sEg
A4xnH0NaFnrCSTRqXDsjI69iCDj/AB/OrmmXMRu78yuMRuEXP4nH6ipS1CTS
VzyIavfLwWGR6rRXsk2q6akhDFWPrRV6kc0exzYt125bJ5x171ZjRMBgBn37
f54pflI+RgWxywPQn61Irp8rE9T9AB/XtTuZFTUL2PT41eU5JbCr/WqTa7HK
h+zwOHI6ucD9OtXdW0watbgCURyq2UbHHTvzWNB4ZvlYia6iSNc5cAnFVFxt
qGpl3yfanNxPK3m5zurIvbtlQxlgQRxiu1j0S0Q4Y/aD3yeled6skkGry26c
DzMADsKaak9B7Ghp8IRN38Zq4kRwcdR2FV7dvL24BNaMO1U8xjggfnQ0NMdF
H5u1pZgoTkD+lSGzDMyxyDAP3QOTVAk4AUkY5GBUwvXRspuLduO/+FQ0aJmi
pW3uLeRWAYdQO/HWqUurEu7hyGdiSB64qPVLosgdk2yDAB/z7YrC8w4OR9DT
SIm+hpSamxcnOfq1FZhznjB+uKKqxlc7mbW4wxUEB3PY9PWs648RyqC0UmRj
GCDz9M1z7Ssw6E+h/wDr0EyDCuyKTwBn/wCtT5A5i/J4mvy52yMoI5Kt/wDr
qnJ4jvSmDO6jOcfhVSQBztMnA7VnzWzgMVOfenyhcsy6nqFyCyzOV6fexVBp
5klRnVt3c9ah8+WDCyLj+tWYbxfNUkgnPemlYNzctb1EjVpGwBST6ukjbUbC
9u1Vbm5glRfkAJ61HDFaMwWWIbCeqnBFLQpXNGOZnXhjnHQ1t2WlyahbkrKq
Mv8AfbFQWekaWsSMNQuMH+BcEj8xXQ28XhuBQLiW6fI5Es23H4Lis5M1Rzl/
avCfIlGHBznOQfpWFMJY3MYHOevtXWavf6NLtis4eQNibXJOe3Wsz7EJYVJ+
WTJ4Y8/lTiRM5vzH7LkfQ0VsGylPKqCP9oYNFWZ2LqaRMedwQcA5GBnPT+dT
x+G55QjuoUYJyT64/wDr126m1FuXWDdEyB2O3qehz9MdqmSKBAxODGg4DKRk
k5+o/Gq5hWOCl8MOoARCcn7w7+1V5fDtwgYlDkLnaRyRz/8AW/OvRzCm5I5F
xtIXDAnaep5/EYqZbeP7SGLFcIV5XIwecE9e4HXHFJTHynlf/CO3kilhbrIh
74yaiTwZNMob7MQu4/cznOcYwP5V69BaLIXO5d5GMrk4JPcdccgjJ4H51KsE
ciufOWNVj+Vwo477h+J7envRzhynkcHw91K4d/LMsSocev6fiK04fhlqu8Aa
hApyAQ69M9Mnp+HuK9KmDKy+RC+TglyflHJwQfX6/wD16mgikmMZSaJdoJYk
gsWHGMgf/XqHJlKKPOovhnqzXBR9aiUA4yik8YznH4VswfCiwkPmXWq3kxUE
lUCqT6DGPr+Vd5GrWrKIwXUqxycFwccf1/Sk3MUiXa6MxCsfKJUYIHfvz19v
yV2By8XgvTLBFitLWJSyk72OW47kfhWTc6TdW7yqU/dISCSpPy8dwTnHNegv
GoAizH5oAJBXIkx6jt25z/KmSxGUqhQLGwJYseoBHyj06nii4Hmi2CwqEMg+
uDz+VFehrpNm67sbc9uf6Yopgc2LlYrUSRmQxbifkB6AjBGMce3epLaJkYzf
vGMq/IrEkD2OO+fTPaq9vP5ixIYkMkyEyehVeP6irULLGsZLO4UGRUXAbAA6
9jyfXvVEk6P587xzLtTftUq3PbB9c85okCLbkvviQkBGwxb1HqfQ/nxUT2zz
xqY3O+QCRMSbQo/75PNKpULDtDP5rELliAcj3z2J/OkMekUtvK0cYRkEahGV
dmck9cn0AOetSLd7IxNK25WRdqqBg5ByTnjoCTUkJkkgl+0IglRsAZ47ZGR6
f0qAXm+KGJCFIUAEOxYDjuVPP40ASIjGTz4RDJLLj5mUAEj7v69xU1tciV18
9CSSzHBGBjscY9/z7GkDTm5t5ZZFbBMYC9/XnjuKhUyyWsaxIsJRmXaVVgMk
jHoec+nA96Vh3L3kjc0gmQdAXTGRjv3HrxxxmnXELXksZlOYo1Mu5xtKk579
CMDseuKy7Uu1vb/ZFZImZm25AJHX6etWiyNdC3mQqzyDYoY4Kjk9OnBzz60h
lkyxzweXMpmGMBk3bDkAgkg9MVdF5Daw5TAUL/q26qOg69ecDnFUCUhLkRvJ
wzZUj5SBgcHHBANKpeUCcp5kY3AggHIHPTgYoAtOZWCGWxt7klQVkZl5Htx0
opvn2CRxLDL5ShB8ihgB+GKKBH//2Q==
strict_encode64:
/9j/4AAQSkZJRgABAQEAYABhAAD/4QBARXhpZgAASUkqAAgAAAABAGmHBAABAAAAGgAAAAAAAAACAAKgCQABAAAAZAAAAAOgCQABAAAAfwAAAAAAAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL//gA0T3B0aW1pemVkIGJ5IEpQRUdtaW5pIDMuMTQuMTQuNzI2NzA4NjAgMHhmZjIzZjM3OQD/wAARCAB/AGQDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwDyNj85+tBbFEvE7j0Y/wA6YxrA2HIxJqwp4rS0LwvfayyspWKFj99z1+grvrf4TCSEMLqVzjqu3+VTKSRUYtnmQrS0WTy9Tib0DfyNdNq/wz1WwUvbAyp6MMH/AArl7eKaw1JUuImjdcgq4x2NS2mtCopqSuZkLf6Vdn/ZevR/ADgaHAv/AE9E/qteaQn9/de6NXoHw/c/YLVSetyePxFKrsXT/wAzoPFigaovH/LIfzNYHet/xW2dVHtGP61gfxVqcKLMNXojjFUIzVyI0AaSN8tFRKw2iigDxi9+S8mX0c1b0mxa6nDNGXUnCr/eNPNgbu5aduIzg4HU1oaZdvp16JBErIo2hCe1M3SO20uKGwgVZdW0+G9Yjy4XfcR7EjgfSuv8M6pqc91LZzWitLD1aORQCOxGOoI9K8t1W90vU9k0cM1vMgySuOvpVrwvrM1lqguXnYkptLE9QOgqJQW5tGfQ92Sa4+YXMIEZ/vNkiud8Q+GdM1C4U3EAKn7rrwy/jV3Tdfi1G1yrBiv3sH2qw7xOgjBJ5PfNc8tNUax8zx7xb8OJtAt5tU093uLIofMB+9HnufUe9R+BDstrTP8Az8/4V7KGla3aEhJIWUgqwzkHqD7Vxh0zStHuGitrRYvn3hdxIBP16UOfMrMpQszO8TtnVT7ItYWfmrvP7PttQgDSJGc8cZyKzbvwaWQvZTZcfwMev41pGtF7nJLDSitNTnY2xVqFqptFLbzNDMjJIhwVYYIqaNua2Odmkr8UVCrfLRQI8fkuZAkTq7KSuCQfSrumTmdzE7Ev1BJ5PtVc2h8pR6E01LaRHDoSGByCO1O6Ntbm6I8KSfujnJNLp5Ny2YSMZPzH/CobK9mdTBOvDcbgOPxrS0q3FrMzMg2hSMdATSRdzodD1M6ULhgcuy8Z9feuz8K6m7Rs8nzDOcV5vZ2sjYeT+Lk5712emSER4XACjtUTjc0gzupbw+SZVXI6kKOgrh9Vkj1LV5o7dlMipn3Fbdtq
4is5uN+xGZlxyR3rza21Rv7QkugzbgeOe1Z8hpzndaTPIlqI5wWKjhgBW7pV1NNOqgDHGD2+hFcPp+tQFCJxt3Hsa6fTr2OJCImKc8NwSKiUS1I1fFehW+sWMl1arjULZMsg6yIO2PUdq82j4Nes6JdFplaQDzF4LDoR/hXA+KdM/szxDcIi4hlPmx49D2/A5Fa05aWZx14Wd0ZyniimA8UVqYHIf2YxUAKT9BT10iU/8smrtFt+igKDj0xUgszj+EAjpnvWWp1XRx8eiTHny8fU1ZXSJY0JbBXqQCa61bMocMB+dP8AsIOFwSPXrRewHPQSIqtnAA/StbSbyJWdJABjpmsnVITp95sK5VxxVCZ7hIfNAIReorQSN241rZp14DgE5xjqPpXG2MFzOGkWKVgxyQqkk0GZ76QDkAnkDvWnYtJB8ylyxGAFOD/Opehe4W1o8TgzFkXI3BgRj2Nd3oFpaRyrKLndkgDL8EHsa4GUzzXn7wSIrDOWFamlXX2Z1UuXj6Z6YOf8DWctTRHpVsklndgRO5CneoI6gfeXjrxn8q534mzXE8lk9kkxePdv8sEgA4xnH0NaFnrCSTRqXDsjI69iCDj/AB/OrmmXMRu78yuMRuEXP4nH6ipS1CTSVzyIavfLwWGR6rRXsk2q6akhDFWPrRV6kc0exzYt125bJ5x171ZjRMBgBn37f54pflI+RgWxywPQn61Irp8rE9T9AB/XtTuZFTUL2PT41eU5JbCr/WqTa7HKh+zwOHI6ucD9OtXdW0watbgCURyq2UbHHTvzWNB4ZvlYia6iSNc5cAnFVFxtqGpl3yfanNxPK3m5zurIvbtlQxlgQRxiu1j0S0Q4Y/aD3yeled6skkGry26cDzMADsKaak9B7Ghp8IRN38Zq4kRwcdR2FV7dvL24BNaMO1U8xjggfnQ0NMdFH5u1pZgoTkD+lSGzDMyxyDAP3QOTVAk4AUkY5GBUwvXRspuLduO/+FQ0aJmipW3uLeRWAYdQO/HWqUurEu7hyGdiSB64qPVLosgdk2yDAB/z7YrC8w4OR9DTSIm+hpSamxcnOfq1FZhznjB+uKKqxlc7mbW4wxUEB3PY9PWs648RyqC0UmRjGCDz9M1z7Ssw6E+h/wDr0EyDCuyKTwBn/wCtT5A5i/J4mvy52yMoI5Kt/wDrqnJ4jvSmDO6jOcfhVSQBztMnA7VnzWzgMVOfenyhcsy6nqFyCyzOV6fexVBp5klRnVt3c9ah8+WDCyLj+tWYbxfNUkgnPemlYNzctb1EjVpGwBST6ukjbUbC9u1Vbm5glRfkAJ61HDFaMwWWIbCeqnBFLQpXNGOZnXhjnHQ1t2WlyahbkrKqMv8AfbFQWekaWsSMNQuMH+BcEj8xXQ28XhuBQLiW6fI5Es23H4Lis5M1Rzl/avCfIlGHBznOQfpWFMJY3MYHOevtXWavf6NLtis4eQNibXJOe3Wsz7EJYVJ+WTJ4Y8/lTiRM5vzH7LkfQ0VsGylPKqCP9oYNFWZ2LqaRMedwQcA5GBnPT+dTx+G55QjuoUYJyT64/wDr126m1FuXWDdEyB2O3qehz9MdqmSKBAxODGg4DKRkk5+o/Gq5hWOCl8MOoARCcn7w7+1V5fDtwgYlDkLnaRyRz/8AW/OvRzCm5I5FxtIXDAnaep5/EYqZbeP7SGLFcIV5XIwecE9e4HXHFJTHynlf/CO3kilhbrIh74yaiTwZNMob7MQu4/cznOcYwP5V69BaLIXO5d5GMrk4JPcdccgjJ4H51KsEciufOWNVj+Vwo477h+J7envRzhynkcHw91K4d/LMsSocev6fiK04fhlqu8AahApyAQ69M9Mnp+HuK9KmDKy+RC+TglyflHJwQfX6/wD16mgikmMZSaJdoJYkgsWHGMgf/XqHJlKKPOovhnqzXBR9aiUA4yik8YznH4VswfCiwkPmXWq3kxUElUCqT6DGPr+Vd5GrWrKIwXUqxycFwccf1/Sk3MUi
Xa6MxCsfKJUYIHfvz19vyV2By8XgvTLBFitLWJSyk72OW47kfhWTc6TdW7yqU/dISCSpPy8dwTnHNegvGoAizH5oAJBXIkx6jt25z/KmSxGUqhQLGwJYseoBHyj06nii4Hmi2CwqEMg+uDz+VFehrpNm67sbc9uf6Yopgc2LlYrUSRmQxbifkB6AjBGMce3epLaJkYzfvGMq/IrEkD2OO+fTPaq9vP5ixIYkMkyEyehVeP6irULLGsZLO4UGRUXAbAA69jyfXvVEk6P587xzLtTftUq3PbB9c85okCLbkvviQkBGwxb1HqfQ/nxUT2zzxqY3O+QCRMSbQo/75PNKpULDtDP5rELliAcj3z2J/OkMekUtvK0cYRkEahGVdmck9cn0AOetSLd7IxNK25WRdqqBg5ByTnjoCTUkJkkgl+0IglRsAZ47ZGR6f0qAXm+KGJCFIUAEOxYDjuVPP40ASIjGTz4RDJLLj5mUAEj7v69xU1tciV189CSSzHBGBjscY9/z7GkDTm5t5ZZFbBMYC9/XnjuKhUyyWsaxIsJRmXaVVgMkjHoec+nA96Vh3L3kjc0gmQdAXTGRjv3HrxxxmnXELXksZlOYo1Mu5xtKk579CMDseuKy7Uu1vb/ZFZImZm25AJHX6etWiyNdC3mQqzyDYoY4Kjk9OnBzz60hlkyxzweXMpmGMBk3bDkAgkg9MVdF5Daw5TAUL/q26qOg69ecDnFUCUhLkRvJwzZUj5SBgcHHBANKpeUCcp5kY3AggHIHPTgYoAtOZWCGWxt7klQVkZl5Htx0opvn2CRxLDL5ShB8ihgB+GKKBH//2Q==
urlsafe_encode64:
_9j_4AAQSkZJRgABAQEAYABhAAD_4QBARXhpZgAASUkqAAgAAAABAGmHBAABAAAAGgAAAAAAAAACAAKgCQABAAAAZAAAAAOgCQABAAAAfwAAAAAAAAD_2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL_2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL__gA0T3B0aW1pemVkIGJ5IEpQRUdtaW5pIDMuMTQuMTQuNzI2NzA4NjAgMHhmZjIzZjM3OQD_wAARCAB_AGQDASIAAhEBAxEB_8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL_8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4-Tl5ufo6erx8vP09fb3-Pn6_8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL_8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3-Pn6_9oADAMBAAIRAxEAPwDyNj85-tBbFEvE7j0Y_wA6YxrA2HIxJqwp4rS0LwvfayyspWKFj99z1-grvrf4TCSEMLqVzjqu3-VTKSRUYtnmQrS0WTy9Tib0DfyNdNq_wz1WwUvbAyp6MMH_AArl7eKaw1JUuImjdcgq4x2NS2mtCopqSuZkLf6Vdn_ZevR_ADgaHAv_AE9E_qteaQn9_de6NXoHw_c_YLVSetyePxFKrsXT_wAzoPFigaovH_LIfzNYHet_xW2dVHtGP61gfxVqcKLMNXojjFUIzVyI0AaSN8tFRKw2iigDxi9-S8mX0c1b0mxa6nDNGXUnCr_eNPNgbu5aduIzg4HU1oaZdvp16JBErIo2hCe1M3SO20uKGwgVZdW0-G9Yjy4XfcR7EjgfSuv8M6pqc91LZzWitLD1aORQCOxGOoI9K8t1W90vU9k0cM1vMgySuOvpVrwvrM1lqguXnYkptLE9QOgqJQW5tGfQ92Sa4-YXMIEZ_vNkiud8Q-GdM1C4U3EAKn7rrwy_jV3Tdfi1G1yrBiv3sH2qw7xOgjBJ5PfNc8tNUax8zx7xb8OJtAt5tU093uLIofMB-9HnufUe9R-BDstrTP8Az8_4V7KGla3aEhJIWUgqwzkHqD7Vxh0zStHuGitrRYvn3hdxIBP16UOfMrMpQszO8TtnVT7ItYWfmrvP7PttQgDSJGc8cZyKzbvwaWQvZTZcfwMev41pGtF7nJLDSitNTnY2xVqFqptFLbzNDMjJIhwVYYIqaNua2Odmkr8UVCrfLRQI8fkuZAkTq7KSuCQfSrumTmdzE7Ev1BJ5PtVc2h8pR6E01LaRHDoSGByCO1O6Ntbm6I8KSfujnJNLp5Ny2YSMZPzH_CobK9mdTBOvDcbgOPxrS0q3FrMzMg2hSMdATSRdzodD1M6ULhgcuy8Z9feuz8K6m7Rs8nzDOcV5vZ2sjYeT-Lk5712emSER4XACjtUTjc0gzupbw-SZVXI6kKOgrh9Vkj1LV5o7dlMipn3Fbdtq
4is5uN-xGZlxyR3rza21Rv7QkugzbgeOe1Z8hpzndaTPIlqI5wWKjhgBW7pV1NNOqgDHGD2-hFcPp-tQFCJxt3Hsa6fTr2OJCImKc8NwSKiUS1I1fFehW-sWMl1arjULZMsg6yIO2PUdq82j4Nes6JdFplaQDzF4LDoR_hXA-KdM_szxDcIi4hlPmx49D2_A5Fa05aWZx14Wd0ZyniimA8UVqYHIf2YxUAKT9BT10iU_8smrtFt-igKDj0xUgszj-EAjpnvWWp1XRx8eiTHny8fU1ZXSJY0JbBXqQCa61bMocMB-dP8AsIOFwSPXrRewHPQSIqtnAA_StbSbyJWdJABjpmsnVITp95sK5VxxVCZ7hIfNAIReorQSN241rZp14DgE5xjqPpXG2MFzOGkWKVgxyQqkk0GZ76QDkAnkDvWnYtJB8ylyxGAFOD_Opehe4W1o8TgzFkXI3BgRj2Nd3oFpaRyrKLndkgDL8EHsa4GUzzXn7wSIrDOWFamlXX2Z1UuXj6Z6YOf8DWctTRHpVsklndgRO5CneoI6gfeXjrxn8q534mzXE8lk9kkxePdv8sEgA4xnH0NaFnrCSTRqXDsjI69iCDj_AB_OrmmXMRu78yuMRuEXP4nH6ipS1CTSVzyIavfLwWGR6rRXsk2q6akhDFWPrRV6kc0exzYt125bJ5x171ZjRMBgBn37f54pflI-RgWxywPQn61Irp8rE9T9AB_XtTuZFTUL2PT41eU5JbCr_WqTa7HKh-zwOHI6ucD9OtXdW0watbgCURyq2UbHHTvzWNB4ZvlYia6iSNc5cAnFVFxtqGpl3yfanNxPK3m5zurIvbtlQxlgQRxiu1j0S0Q4Y_aD3yeled6skkGry26cDzMADsKaak9B7Ghp8IRN38Zq4kRwcdR2FV7dvL24BNaMO1U8xjggfnQ0NMdFH5u1pZgoTkD-lSGzDMyxyDAP3QOTVAk4AUkY5GBUwvXRspuLduO_-FQ0aJmipW3uLeRWAYdQO_HWqUurEu7hyGdiSB64qPVLosgdk2yDAB_z7YrC8w4OR9DTSIm-hpSamxcnOfq1FZhznjB-uKKqxlc7mbW4wxUEB3PY9PWs648RyqC0UmRjGCDz9M1z7Ssw6E-h_wDr0EyDCuyKTwBn_wCtT5A5i_J4mvy52yMoI5Kt_wDrqnJ4jvSmDO6jOcfhVSQBztMnA7VnzWzgMVOfenyhcsy6nqFyCyzOV6fexVBp5klRnVt3c9ah8-WDCyLj-tWYbxfNUkgnPemlYNzctb1EjVpGwBST6ukjbUbC9u1Vbm5glRfkAJ61HDFaMwWWIbCeqnBFLQpXNGOZnXhjnHQ1t2WlyahbkrKqMv8AfbFQWekaWsSMNQuMH-BcEj8xXQ28XhuBQLiW6fI5Es23H4Lis5M1Rzl_avCfIlGHBznOQfpWFMJY3MYHOevtXWavf6NLtis4eQNibXJOe3Wsz7EJYVJ-WTJ4Y8_lTiRM5vzH7LkfQ0VsGylPKqCP9oYNFWZ2LqaRMedwQcA5GBnPT-dTx-G55QjuoUYJyT64_wDr126m1FuXWDdEyB2O3qehz9MdqmSKBAxODGg4DKRkk5-o_Gq5hWOCl8MOoARCcn7w7-1V5fDtwgYlDkLnaRyRz_8AW_OvRzCm5I5FxtIXDAnaep5_EYqZbeP7SGLFcIV5XIwecE9e4HXHFJTHynlf_CO3kilhbrIh74yaiTwZNMob7MQu4_cznOcYwP5V69BaLIXO5d5GMrk4JPcdccgjJ4H51KsEciufOWNVj-Vwo477h-J7envRzhynkcHw91K4d_LMsSocev6fiK04fhlqu8AahApyAQ69M9Mnp-HuK9KmDKy-RC-TglyflHJwQfX6_wD16mgikmMZSaJdoJYkgsWHGMgf_XqHJlKKPOovhnqzXBR9aiUA4yik8YznH4VswfCiwkPmXWq3kxUElUCqT6DGPr-Vd5GrWrKIwXUqxycFwccf1_Sk3MUi
Xa6MxCsfKJUYIHfvz19vyV2By8XgvTLBFitLWJSyk72OW47kfhWTc6TdW7yqU_dISCSpPy8dwTnHNegvGoAizH5oAJBXIkx6jt25z_KmSxGUqhQLGwJYseoBHyj06nii4Hmi2CwqEMg-uDz-VFehrpNm67sbc9uf6Yopgc2LlYrUSRmQxbifkB6AjBGMce3epLaJkYzfvGMq_IrEkD2OO-fTPaq9vP5ixIYkMkyEyehVeP6irULLGsZLO4UGRUXAbAA69jyfXvVEk6P587xzLtTftUq3PbB9c85okCLbkvviQkBGwxb1HqfQ_nxUT2zzxqY3O-QCRMSbQo_75PNKpULDtDP5rELliAcj3z2J_OkMekUtvK0cYRkEahGVdmck9cn0AOetSLd7IxNK25WRdqqBg5ByTnjoCTUkJkkgl-0IglRsAZ47ZGR6f0qAXm-KGJCFIUAEOxYDjuVPP40ASIjGTz4RDJLLj5mUAEj7v69xU1tciV189CSSzHBGBjscY9_z7GkDTm5t5ZZFbBMYC9_XnjuKhUyyWsaxIsJRmXaVVgMkjHoec-nA96Vh3L3kjc0gmQdAXTGRjv3HrxxxmnXELXksZlOYo1Mu5xtKk579CMDseuKy7Uu1vb_ZFZImZm25AJHX6etWiyNdC3mQqzyDYoY4Kjk9OnBzz60hlkyxzweXMpmGMBk3bDkAgkg9MVdF5Daw5TAUL_q26qOg69ecDnFUCUhLkRvJwzZUj5SBgcHHBANKpeUCcp5kY3AggHIHPTgYoAtOZWCGWxt7klQVkZl5Htx0opvn2CRxLDL5ShB8ihgB-GKKBH__2Q==
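Comparing the strict_encode64 and urlsafe_encode64 outputs above, the only visible difference is that + becomes - and / becomes _. Here is a small sketch that checks this on a few hand-picked bytes (the byte string is an assumption, chosen so that both characters appear). That may explain why the browser does not decode it, since a data URL expects the standard alphabet:

```ruby
require "base64"

# Six bytes chosen so the standard encoding contains both "/" and "+".
bytes = "\xFF\xD8\xFF\xFB\xEF\xBE".b

strict  = Base64.strict_encode64(bytes)
urlsafe = Base64.urlsafe_encode64(bytes)

puts strict                              # prints: /9j/++++
puts urlsafe                             # prints: _9j_----
puts urlsafe.tr("-_", "+/") == strict    # mapping the alphabet back recovers the standard form
```

If only URL-safe output is available, translating the two characters back with tr, as in the last line, should yield the form that works in a data URL.
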

tika returning incorrect line of text for pdf with lots of tables

I am using Tika to extract text from a PDF file that has a lot of tables.
java -jar tika-app-0.9.jar -t https://s3.amazonaws.com/centraldoc/alg1.pdf
It returns some invalid text, and sometimes it trims the white space between two words; for example, it returns
"qu inakli fmyathematical ideas to the real world" instead of "Link mathematical ideas to the real world".
Is there a way to minimize this kind of error? Or is there another library that I can use? Does it make sense to use OCR to process this kind of PDF?
Try to control the order when using the PDFBox parser: PDFTextStripper has a flag that controls the order of lines in the document. By default (in PDFBox) it's set to false for performance reasons (no order preserved), but Tika changed its behavior between releases, switching this flag on and off.
More details on exactly this problem are in my blog post Extracting text from PDF files with Apache Tika 0.9 (and PDFBox under the hood).
To get text from the PDF to display in the right order, I had to set the SortByPosition flag to true (tika-app-1.19.jar):
BodyContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
ParseContext context = new ParseContext();
PDFParser pdfParser = new PDFParser();
PDFParserConfig config = pdfParser.getPDFParserConfig();
config.setSortByPosition(true); // needed for text in correct order
pdfParser.setPDFParserConfig(config);
pdfParser.parse(is, handler, metadata, context); // is: an InputStream opened on the PDF

How to write a simple .txt content processor in XNA?

I don't really understand how the content importer/processor works in XNA.
I need to read a text file (Content/levels/level1.txt) of the form:
x x
x x
x x
where the x's are just integers, into an int[,] array.
Any tips on writing a SIMPLE .txt importer? By searching Google/MSDN I only found .x/.fbx file importer examples, and they seem too complicated.
Do you actually need to process the text file? If not, then you can probably skip most of the content pipeline.
Something like:
string filename = "Content/TextFiles/sometext.txt";
string path = Path.Combine(StorageContainer.TitleLocation, filename);
string lineOfText;
// Dispose the reader when done so the file handle is released.
using (StreamReader sr = new StreamReader(path))
{
    while ((lineOfText = sr.ReadLine()) != null)
    {
        // do something
    }
}
Also, be sure to set the "Build Action" to "None" and the "Copy to Output Directory" to "Copy if newer" on the text files you've added. This tells the content pipeline not to compile the text file but rather copy it to the output directory for use as is.
I got this (more or less) from the RacingGame sample provided by Microsoft. It forgoes much of the content pipeline and simply loads and processes text files (XML) for much of its level data.
XNA 4.0 uses
System.IO.Stream stream = TitleContainer.OpenStream("tilename.txt");
See http://msdn.microsoft.com/en-us/library/bb199094.aspx and also http://blogs.msdn.com/b/shawnhar/archive/2010/12/09/reading-files-in-xna-game-studio-4-0.aspx
There doesn't seem to be a lot of info out there, but this blog post does indicate how you can load .txt files through code using XNA.
Hopefully this can help you get the file into memory; from there it should be straightforward to parse it in any way you like.
XNA 3.0 - Reading Text Files on the Xbox
http://www.ziggyware.com/readarticle.php?article_id=69 is probably a good place to start. It covers creating a basic content processor.
