Automate Photoshop to insert text from file - localization

I have a multilanguage website and need to automate the process of updating text layers in PSD files from a CSV source.
I know that there might be glitches in the PSD because of changed text widths, but it would still help a lot to have the text inside the documents.
What are my options?
EDIT:
Murmelschlurmel has a working solution. Here is the link to the Adobe documentation.
http://livedocs.adobe.com/en_US/Photoshop/10.0/help.html?content=WSfd1234e1c4b69f30ea53e41001031ab64-740d.html
The format of the CSV file is not so nice: you need a column for each variable, whereas I would expect a row for each variable.
It works with umlauts (ä, ö, etc.).
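If the translations are kept with one row per variable, they can be transposed into the column-per-variable layout before import. The following is a minimal sketch using only the Python standard library; the sample layout (a `variable` column followed by one column per language) and the function name `pivot_to_columns` are assumptions made for illustration, not something Photoshop prescribes.

```python
import csv
import io


def pivot_to_columns(rows_csv):
    """Transpose CSV text so that each input row becomes an output column."""
    rows = list(csv.reader(io.StringIO(rows_csv)))
    # zip(*rows) swaps rows and columns; it stops at the shortest row,
    # so ragged input should be padded before calling this.
    transposed = zip(*rows)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(transposed)
    return out.getvalue()


# Hypothetical source: one row per variable, one column per language.
source = "variable,de,fr\nheadline,Willkommen,Bienvenue\nbutton,Öffnen,Ouvrir\n"
result = pivot_to_columns(source)
print(result)
```

After the transpose, the first line holds the variable names and every following line holds the values for one language.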
EDIT 1:
Another solution is to use COM to automate Photoshop. That's nice if you have a couple of templates (buttons) that need changed text. Here is my script in Python that might get you started.
You need to have an Excel file with columns:
TemplateFileName, TargetFileName, TargetFormat, Text
(e.g. template.psd, button1, gif, NiceButton).
The first row of the sheet is not used.
The PSD template should have only one text layer and cannot have layer groups.
import win32com.client
import xlrd
# Read the translations from the first sheet of the workbook.
spreadsheet = xlrd.open_workbook("text_buttons.xls")
sheet = spreadsheet.sheet_by_index(0)
psApp = win32com.client.Dispatch("Photoshop.Application")
jpgSaveOptions = win32com.client.Dispatch("Photoshop.JPEGSaveOptions")
jpgSaveOptions.EmbedColorProfile = True
jpgSaveOptions.FormatOptions = 1
jpgSaveOptions.Matte = 1
jpgSaveOptions.Quality = 1
gifSaveOptions = win32com.client.Dispatch("Photoshop.GIFSaveOptions")
for rowIndex in range(sheet.nrows):
    if(rowIndex > 0):  # the first row of the sheet is not used
        template = sheet.row(rowIndex)[0].value
        targetFile = sheet.row(rowIndex)[1].value
        targetFileFormat = sheet.row(rowIndex)[2].value
        textTranslated = sheet.row(rowIndex)[3].value
        psApp.Open(r"D:\Design\Produktion\%s" % template )
        doc = psApp.Application.ActiveDocument
        # Replace the contents of the template's text layer.
        for layer in doc.Layers:
            if (layer.Kind == 2):  # 2 = text layer
                layer.TextItem.Contents = textTranslated
        # Save a copy in the requested format.
        if(targetFileFormat == "gif"):
            doc.SaveAs(r"D:\Design\Produktion\de\%s" % targetFile, gifSaveOptions, True, 2)
        if(targetFileFormat == "jpg"):
            doc.SaveAs(r"D:\Design\Produktion\de\%s" % targetFile, jpgSaveOptions, True, 2)

You can use "Data Driven Design" to do this. There is also a concept of data-driven design in computer science, but as far as I can see it is not related to how Photoshop uses the term.
Here is how to proceed:
Load your image in Photoshop and define your variables with Image > Variables > Define.
Then convert your CSV to a format Photoshop can read. I had the best experience with tab-delimited text.
Finally, load the text file in Photoshop with Image > Variables > Data Sets and let Photoshop save all iterations.
When I first tried this, I found that the Photoshop help file didn't provide enough detail. I searched the Internet for photoshop "data set" and found some good tutorials, e.g. this one from digitaltutors.
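The conversion step can be scripted. Below is a minimal sketch, using only the Python standard library, that rewrites comma-separated text as tab-delimited text. The function name `csv_to_tab_delimited` and the sample data are assumptions for illustration; which text encoding your Photoshop version expects for the file is not covered here and should be checked separately.

```python
import csv
import io


def csv_to_tab_delimited(csv_text):
    """Rewrite comma-separated text as tab-delimited text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # Tabs or line breaks inside a value would break the column layout,
        # so every run of whitespace is collapsed to a single space.
        writer.writerow([" ".join(cell.split()) for cell in row])
    return out.getvalue()


# Hypothetical data: the first line names the variables, the second is one data set.
sample = 'headline,button\n"Willkommen, Gast",Öffnen\n'
converted = csv_to_tab_delimited(sample)
print(converted)
```

Parsing with the `csv` module instead of replacing commas directly keeps quoted values that contain commas intact.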

This might be a bit off-topic, but I have used Adobe AlterCast/Graphics Server to handle exactly the same issue.
Also, if it's just a text GIF/JPG image, you can use Python + PIL (Python Imaging Library).
Here is some sample code (works on Windows with the Arial and Osaka fonts installed):
#!/usr/bin/python
# -*- coding: utf-8 -*-
from PIL import Image, ImageDraw, ImageFont
#font = ImageFont.truetype("/usr/share/fonts/bitstream-vera/Vera.ttf", 24)
#font = ImageFont.truetype("futuratm.ttf", 18)
font = ImageFont.truetype("arial.ttf", 18)
im = Image.new("RGB", (365,20), "#fff")
draw = ImageDraw.Draw(im)
draw.text((0, 0), "Test Images", font=font, fill="#000")
im.save("TestImg_EN.gif", "GIF")
font = ImageFont.truetype("osaka.ttf", 18)
im = Image.new("RGB", (365,20), "#fff")
draw = ImageDraw.Draw(im)
draw.text((0, 0), u"テストイメージ", font=font, fill="#000")
im.save("TestImg_JP.gif", "GIF")

Related

Extracting text as CSV from a scanned PDF file using Tesseract

I need help extracting text from a scanned PDF. I have tried PyMuPDF, Pillow and pytesseract, but I am not getting correct results: some of the text is returned incorrectly.
I tried to increase sharpness and brightness but still did not get a good result.
I have already checked many answers using OpenCV, but I am fairly new to OpenCV. Please help.
def pdf_to_text(pdf_file,text_file_name,rotate_pdf=False,adj_sharpness=False,adj_contract=False,adj_brightness=False):
    try:
        doc = fitz.open(pdf_file)
        zoom_x=2.5
        zoom_y=2.5
        mat = fitz.Matrix(zoom_x,zoom_y)
        files = []
        for n in range(doc.page_count):
            #print(f'Extracting {n} image')
            page = doc.load_page(n)
            if rotate_pdf:
                page.set_rotation(-90)
            #pix = page.get_pixmap(dpi=600)
            pix = page.get_pixmap(alpha=False,matrix=mat,dpi=300)
            folder=os.path.join(os.getcwd(),"images")
            if not os.path.exists(folder):
                os.makedirs(folder)
            fname = os.path.join(folder,"page-%i.png"%n)
            pix.save(fname)
            im = Image.open(fname)
            im = adjust_sharpness(im,2.5)
            im = adjust_brightness(im,1.1)
            im = adjust_contrast(im,2.8)
            #im = im.filter(ImageFilter.SMOOTH)
            im.save(fname)
            #remove_lines(fname)
            files.append(fname)
            #if n>1:
            #    break
        print("Extracting Images Completed")
        print("Now Extracting data from image file")
        for file in files:
            #file = "./images/page-0.png"
            text = image_to_string(file, lang_code="eng")
            #text = image_to_string(file, lang_code="fra+eng")
            make_textfile(text, text_file_name)
        print("Extracting and saving text files completed")
    except FileNotFoundError:
        print(f"File not available {pdf_file}")
        return None
pytesseract.image_to_string(image=Image.open(image_name))
The image:
To process tables with Tesseract you will likely need to remove the table lines to help the OCR engine segment the image. However, you may want to try this first to see how Tesseract performs:
text = image_to_data(file, lang="eng", config="--psm 6")
The --psm 6 option makes Tesseract treat the image as a single uniform block of text, so that as little text as possible is missed, but removing the lines and binarizing the image will lead to better results. This link would help you with the removal of lines.
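By default, pytesseract's image_to_data returns a tab-separated string with one row per detected element rather than plain text, so the result needs a little post-processing before it can be written out. The sketch below groups the recognised words back into text lines using only the standard library. The column names follow the TSV header as I understand it, and the sample rows, the function name `tsv_to_lines` and the confidence threshold are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def tsv_to_lines(tsv_text, min_conf=0.0):
    """Join recognised words into lines, keyed by page, block, paragraph and line."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    lines = defaultdict(list)
    for row in reader:
        word = (row["text"] or "").strip()
        # Structural rows (page, block, line) have empty text and conf -1.
        if not word or float(row["conf"]) < min_conf:
            continue
        key = (int(row["page_num"]), int(row["block_num"]),
               int(row["par_num"]), int(row["line_num"]))
        lines[key].append(word)
    return [" ".join(words) for _, words in sorted(lines.items())]


# Hypothetical output fragment in the same column layout.
header = "level page_num block_num par_num line_num word_num left top width height conf text".split()
rows = [
    ["5", "1", "1", "1", "1", "1", "10", "10", "40", "12", "96", "Name"],
    ["5", "1", "1", "1", "1", "2", "60", "10", "40", "12", "91", "Qty"],
    ["4", "1", "1", "1", "2", "0", "10", "30", "90", "12", "-1", ""],
    ["5", "1", "1", "1", "2", "1", "10", "30", "40", "12", "95", "Bolt"],
    ["5", "1", "1", "1", "2", "2", "60", "30", "40", "12", "30", "1O"],
]
sample_tsv = "\n".join("\t".join(r) for r in [header] + rows) + "\n"
lines = tsv_to_lines(sample_tsv, min_conf=50)
print(lines)
```

Filtering on the confidence column is one way to drop words that Tesseract itself considers doubtful; the threshold of 50 is arbitrary.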

Apps Script - Exporting Sheets Hyperlinks to Docs

I have a google sheet - and when a new row appears I am writing the output into a Google Document using a predefined template via a merge.
All is working but as I could only work out how to use the .replaceText() function to achieve the merge, the hyperlinks in some of the sheet columns get exported as plain text.
After much fiddling and cribbing of code (thanks all) I managed to cobble together the following function:
function makeLinksClickable(document) {
  const URL_PATTERN = "https://[-A-Za-z0-9+&##/%?=~_|!:,.;]+[-A-Za-z0-9+&##/%=~_|]"
  const URL_PATTERN_LENGTH_CORECTION = "".length
  const body = document.getBody()
  var foundElement = body.findText(URL_PATTERN);
  while (foundElement != null) {
    var foundText = foundElement.getElement().asText();
    const start = foundElement.getStartOffset();
    const end = foundElement.getEndOffsetInclusive() - URL_PATTERN_LENGTH_CORECTION;
    const url = foundText.getText().substring(start,end+1)
    foundText.setLinkUrl(url)
    foundElement = body.findText(URL_PATTERN, foundElement);
  }
}
After writing out all the columns to the document I call this function on the created document to look for a hyperlink and make it hyper :)
As long as each cell only contains one hyperlink my function works.
It also works where there are multiple hyperlinks in the document.
However, some cells can contain multiple hyperlinks, which are written out to the document with a new line for each one.
Although the function finds the multiple URLs correctly and makes them clickable in the document, there is a problem.
For example, if there are 2 hyperlinks in the cell they get exported to 2 lines in the document, but after running them through the function both hyperlinks link to the same image (the first), even though the text of each hyperlink is the unique link from the original cell.
2 converted hyperlinks that link to the same image
(Note: if I don't run my function and leave the exported hyperlinks as text, then go into the created document and manually add a space to the end of each exported hyperlink, they turn blue, become clickable and link to the correct image. I did try to add a space programmatically before this but couldn't work that out either.)
I have exhausted my limited coding ability and can't see why my function which "seems" to work its way through each hyperlink correctly doesn't make it then link to the right image in the document.
Any help would be most appreciated.
Thanks
// ----------------------------------------------------------------------
Thank you for taking the time to look at this; I will try to explain the issue further. It is hard to show here, as the links work properly when copied here and only misbehave in the Google document.
A cell in the exported row has multiple hyperlinks separated by commas.
They get exported from the cell to the document as text strings like this:
Links in single Sheets Cell for exporting:
"hyperlink-1-as-a-string", (links to image 1)
"hyperlink-2-as-a-string", (links to image 2)
"hyperlink-3-as-a-string", (links to image 3)
"hyperlink-4-as-a-string", (links to image 4)
"hyperlink-5-as-a-string" (links to image 5)
I then run my function to make them clickable again.
If there are two or more hyperlinks in the same cell when exported, I get the following issue after running the function.
Exported text links converted to clickable hyperlinks:
"hyperlink-1-as-a-string", (links to image 5)
"hyperlink-2-as-a-string", (links to image 5)
"hyperlink-3-as-a-string", (links to image 5)
"hyperlink-4-as-a-string", (links to image 5)
"hyperlink-5-as-a-string" (links to image 5)
I "think" what happens is that my function makes all 5 hyperlinks one big hyperlink that happens to use the last hyperlink's image.
If I copy and paste the URLs into a separate document like an email then they appear as one large hyperlink, not 5 separate ones.
// ---------------------------------------------------------------
The function searches for text patterns that are in fact google hyperlinks.
(starting https:// etc)
When it finds one it works out the length to the end of the text string and then uses setLinkUrl() to make the hyperlink - into a clickable hyperlink.
If there is only one text hyperlink then it works.
If there is more than one text hyperlink, separated by commas then it does not.
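The matching step described above can be sketched outside Apps Script to show what a per-link approach needs: a separate start and end offset for every URL. This is a minimal Python illustration under stated assumptions: the pattern is a simplified stand-in for the one in the function and, unlike it, stops at commas, and `find_link_ranges` is a hypothetical helper. One possible reading of the symptom, offered as an assumption rather than a confirmed diagnosis, is that setLinkUrl(url) called without offsets applies the link to the whole text element, while the range form setLinkUrl(startOffset, endOffsetInclusive, url) limits it to one span.

```python
import re

# Simplified URL pattern for illustration: it stops at whitespace and commas.
URL_PATTERN = re.compile(r"https://[^\s,]+")


def find_link_ranges(text):
    """Return (start, end_inclusive, url) for every URL found in text."""
    return [(m.start(), m.end() - 1, m.group()) for m in URL_PATTERN.finditer(text)]


# Two comma-separated links exported on separate lines of one text element.
exported = "https://example.com/a,\nhttps://example.com/b"
ranges = find_link_ranges(exported)
for start, end, url in ranges:
    print(start, end, url)
```

Each tuple carries exactly the three values a range-based link call takes, so every URL can be linked to its own target.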
I worked something out. This is what I ended up with, it is basically put together from a few other questions & answers - It's not very clever but it works.
Thanks to the various posters who enabled me to figure this out.
function sortLinks(colId, mapPoint, myBody) {
  var urls = [];
  if (colId.includes(",")) { // i.e. there's more than one URL
    var tmp = colId.split(",");
    urls = urls.concat(tmp);
  }
  else {
    urls[0] = colId; // 1 URL, no ",": add to array[0]
  }
  if (urls.length > 0) {
    var tag = mapPoint;
    var newLine = "\n";
    var element = myBody.findText(tag);
    if (element) {
      var start = element.getStartOffset();
      var text = element.getElement().asText();
      text.deleteText(start, start + tag.length - 1);
      urls.forEach((url, index) => {
        url = url.trim();
        var name = "Image-Video" + (index + 1);
        text.appendText(name).setLinkUrl(start, start + name.length - 1, url);
        text.appendText(newLine);
        start = start + name.length + newLine.length;
      });
    }
  }
}

Error in batch merging images (x.tif is not a valid choice for "C2 (green):")

I want to merge two sets of fluorescence microscope images into a green & blue image, but I'm having trouble with the macro (haven't used ImageJ before). I have a folder of FITC-images to be coloured green and a folder of DAPI-images to be coloured blue. I have been using this modified version of a macro I found online:
macro "batch_merge_channel"{
    count = 1;
    setBatchMode(true);
    file1= getDirectory("Choose a Directory");
    list1= getFileList(file1);
    n1=lengthOf(list1);
    file2= getDirectory("Choose a Directory");
    list2= getFileList(file2);
    n2=lengthOf(list2);
    open(file1+list1[1]);
    open(file2+list2[1]);
    small = n1;
    if(small<n2)
        small = n2;
    for(i=0;i<small;i++)
    {
        run("Merge Channels...", "c2="+list1[1]+ " c3="+list2[1]+ " keep");
        name = substring(list1, 0, 13)+")_merge";
        saveAs("tiff", "C:\\Merge\\"+name);
        first += 2;
        close();
        setBatchMode(false);
    }
This, however returns an error
x.tif is not a valid choice for "C2 (green):"
with x being the name of the first file in the first folder.
If I merge the images manually, two by two, there is no error. So I'm presuming the problem is in the macro code.
I found several cases of this error online, but none of the solutions that seemed to work for those people work for me.
Any help would be appreciated!
In case you didn't solve this already, a great place to get help on ImageJ questions is the forum.
I can suggest a couple of ideas:
Is your image successfully opened by the macro? You could set the batch mode to false to check this.
It looks to me like the for loop does not employ the variable i. It works on the first pair of images (list1[1], list2[1]), then closes the merged image, but then tries to process image 1 again. To actually loop through all the images in the folder, you have to put something like this inside the loop (you don't need 'keep'; better to leave it out so the source images will automatically be closed):
    open(file1+list1[i]);
    open(file2+list2[i]);
    run("Merge Channels...", "c2="+list1[i]+ " c3="+list2[i]);
Turning off batch mode should be done after the loop, not within the loop.
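The indexing idea can be shown independently of ImageJ. The sketch below is a plain Python illustration, not macro code: it pairs the i-th file of one list with the i-th file of the other and stops at the shorter list, so neither list is indexed past its end. The function name `merge_jobs` and the file names are hypothetical; the returned strings only mimic the option text passed to "Merge Channels...".

```python
def merge_jobs(green_files, blue_files):
    """Build one merge option string per pair of files that share an index."""
    # Stop at the shorter list so no index runs past the end of either one.
    count = min(len(green_files), len(blue_files))
    jobs = []
    for i in range(count):
        # The loop variable i selects a different pair on every pass.
        jobs.append(f"c2={green_files[i]} c3={blue_files[i]}")
    return jobs


jobs = merge_jobs(["g1.tif", "g2.tif", "g3.tif"], ["b1.tif", "b2.tif"])
print(jobs)
```

Pairing by index only gives sensible results when both folders list their files in the same order, which is worth checking before a batch run.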
Here's a version that works for me.
// @File(label = "Green images", style = "directory") file1
// @File(label = "Blue images", style = "directory") file2
// @File(label = "Output directory", style = "directory") output
// Do not delete or move the top 3 lines! They contain essential parameters
setBatchMode(true);
list1= getFileList(file1);
n1=lengthOf(list1);
print("n1 = ",n1);
list2= getFileList(file2);
n2=lengthOf(list2);
// Use the smaller file count so neither list is indexed past its end
small = n1;
if(small>n2)
    small = n2;
for(i=0;i<small;i++)
{
    image1=list1[i];
    image2=list2[i];
    open(file1+File.separator+list1[i]);
    open(file2+File.separator+list2[i]);
    print("processing image",i);
    run("Merge Channels...", "c2=&image1 c3=&image2");
    name = substring(image1, 0, 13)+"_merge";
    saveAs("tiff", output+File.separator+name);
    close();
}
setBatchMode(false);
Hope this helps.

Possible to make a composite symbol?

When editing a vertex I would like to substitute the vertex symbol with SimpleMarkerSymbol and a TextSymbol but that appears to be impossible. Any suggestions on how I could do this? I want the appearance of dragging something like this (text + circle):
After taking some time to look at the API I've come to the conclusion it is impossible. Here is my workaround:
editor.on("vertex-move", args => {
    let map = this.options.map;
    let g = <Graphic>args.vertexinfo.graphic;
    let startPoint = <Point>g.geometry;
    let tx = args.transform;
    let endPoint = map.toMap(map.toScreen(startPoint).offset(tx.dx, tx.dy));
    // draw a 'cursor' as a hack to render text over the active vertex
    if (!cursor) {
        cursor = new Graphic(endPoint, new TextSymbol({text: "foo"}));
        this.layer.add(cursor);
    } else {
        cursor.setGeometry(endPoint);
        cursor.draw();
    }
})
You could use a TextSymbol to create a point with a font whose glyphs are numbers inside a circle. Here is one place where you can find such a font: http://www.fontspace.com/the-fontsite/combinumerals
It won't be exactly as shown in the image, but close enough. There is also a limitation: it won't work with IE9 or lower (this is per the Esri documentation, as I am using a halo to get the white border).
Here is the working JSBin: http://jsbin.com/hayirebiga/edit?html,output use point of multipoint
PS: I have converted the ttf to otf and then added the font as base64, which is optional. I did it as I could not add the ttf or otf to jsbin.
Well, achieving this seems impossible so far. However, the ArcGIS JS API provides a new application/platform where you can generate a single symbol online for your applications.
You can create all kinds of symbols (provided by Esri) online, and it generates code on the fly that you just need to paste into your application.
This helps you try out different types of symbols that might suit your application.
Application URL: https://developers.arcgis.com/javascript/3/samples/playground/index.html
Hoping this will help you :)

How to force BE users to paste as plain text in TYPO3 6.x?

CMS users tend to paste anything into CMS text editors. To prevent website destruction, and as long as there's no non-WYSIWYG editor (like markitup) for TYPO3, I would like to at least have some good old "force plain text paste" in place.
TYPO3's RTE has a button "pastetoggle, pastebehaviour, pasteastext". But I haven't managed to configure it so it's always active.
Also, there's an extension ad_rtepasteplain, but it produced no result in TYPO3 6.1.
Is there a usable way to implement paste-as-plain-text for TYPO3 6.x?
[EDIT]
I found (for user TSConfig)
setup.default.rteCleanPasteBehaviour
setup.override.rteCleanPasteBehaviour
as well as (for page TSConfig)
buttons.pastetoggle.setActiveOnRteOpen
buttons.pastetoggle.hidden
... none of which I got running yet. If that's the way to go: is there a working tutorial?
Got it. This is my current setup
RTE.default {
  enableWordClean = 1
  removeTrailingBR = 1
  removeComments = 1
  removeTags = center, font, o:p, sdfield, u
  removeTagsAndContents = link, meta, script, style, title
  hidePStyleItems = h5,h6,pre,address,div
  // buttons
  showButtons = chMode, formatblock, insertcharacter, removeformat, unorderedlist, orderedlist, outdent, indent, link, copy, cut, paste, showhelp, about,line, bold,pastetoggle, pastebehaviour, pasteastext
  buttons.pastetoggle.setActiveOnRteOpen = 1
  buttons.pastetoggle.hidden = 1
}
as well as setup.override.rteCleanPasteBehaviour = pasteStructure (or plainText) in user TSConfig.

Resources