I am using MySQL Workbench to generate an EER diagram, and to the best of my knowledge, one cannot control the dimensions of the canvas, only its size in number of pages. The result is a huge amount of white space around the diagram, making it nearly unusable. Why anyone would design it this way is beyond me. There are a lot of questions that ask how to crop a PDF, but they are either more complicated (i.e. crop to a certain dimension, or crop and output to a different format and ratio), or they do not preserve the image quality, or they just plain do not work. My question therefore is this:
How does one create or convert an EER diagram using MySQL Workbench such that there are no white borders AND the image quality is preserved?
Note I asked the question here as it pertains to databasing, but apologies if it is in the wrong place.
Looks like what you are after is a way to limit the output of an image export to a relatively small area, so that it fits nicely in another document. Several options are possible:
1) Export as png and simply cut off the unwanted parts. Depending on the further usage this might be good enough.
2) Export as SVG and use any of the SVG editors to limit the image size to the wanted area only. Then convert it to the format you need in your target document.
3) Set a paper size in the model that encompasses the content as closely as possible. E.g. the statement paper type is quite small. Then rearrange your objects. Resize them if you need larger ones. By setting a larger font (via Preferences) you should be able to make the entire appearance larger. Then export as PDF.
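For option 1, the "cut off the unwanted parts" step can be automated by finding the bounding box of everything that is not background. The following is only a minimal sketch under stated assumptions: the exported PNG has already been decoded into a grid of grayscale values, the background is pure white (255), and the names content_bbox and crop are hypothetical helpers, not part of MySQL Workbench or any library.

```python
def content_bbox(pixels, background=255, margin=0):
    """Return (left, top, right, bottom) around all non-background pixels.

    `pixels` is a list of rows of grayscale values (assumed layout).
    right/bottom are exclusive. Returns None for a blank page.
    """
    height, width = len(pixels), len(pixels[0])
    # Rows and columns that contain at least one non-background pixel.
    rows = [y for y, row in enumerate(pixels)
            if any(p != background for p in row)]
    if not rows:
        return None
    cols = [x for x in range(width)
            if any(row[x] != background for row in pixels)]
    # Optionally keep a small margin, clamped to the page.
    left = max(cols[0] - margin, 0)
    top = max(rows[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, width)
    bottom = min(rows[-1] + 1 + margin, height)
    return left, top, right, bottom


def crop(pixels, box):
    """Cut the grid down to the given (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


# A 6x5 "page" that is white except for two dark pixels.
page = [[255] * 6 for _ in range(5)]
page[1][2] = 0
page[2][3] = 0
box = content_bbox(page)
trimmed = crop(page, box)
```

Because this crops pixels without resampling, the remaining area keeps its original quality; it only helps for raster exports, not for the PDF itself.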
I'm currently an MS student in Medical Physics and I have a great need to be able to overlay an isodose distribution from an RTDOSE file onto a CT image from a .dcm file set.
I've managed to extract the image and the dose pixel arrays myself using pydicom and dicom_numpy, but the two arrays are not the same size! So, if I overlay the two together, the dose will not be in the correct position based on what the Elekta Gamma Plan software exported it as.
I've played around with dicompyler and 3DSlicer, and they are obviously able to do this even though the arrays are not the same size. However, I don't think I can export the numerical data from these programs; I can only scroll through and view it as an image. How can I overlay the RTDOSE onto a CT image?
Thank you
For what you want, it sounds like you should use SimpleITK (or an equivalent; my experience is with SITK) to do the DICOM handling, not pydicom.
DICOM has a complete built-in system for specifying the 3D position of all the pixel data in patient coordinates. It uses a set of attributes in the DICOM files, the tags of the Image Plane Module. See here for a good overview.
The SimpleITK library fully understands the 3D Image Plane tags and uses them by default to locate any image in patient coordinates, irrespective of such things as the specific pixel spacing, slice thickness, etc.
So, in your case, if you use SITK to open your studies, you should be able to overlay them correctly "out of the box", because SITK will do all the work of parsing the Image Plane Module tags and locating the data in patient coordinates, just like you get with 3DSlicer.
Pydicom, in contrast, doesn't itself try to use any of that information at all. It only gives you the raw pixel arrays (for images).
Note I use both pydicom and SITK. This isn't something bad about pydicom, but more a question of right tool for the job. In fact, for many (most?) things I use pydicom, but for any true 3D type work, SITK is the easier toolkit to use.
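To make the role of the Image Plane Module tags concrete, here is a minimal sketch of the mapping they define between a pixel index and patient coordinates, written in plain Python rather than with SimpleITK. It assumes a single slice, and the function names and all numeric values are made up for illustration; the three inputs correspond to the ImagePositionPatient, ImageOrientationPatient and PixelSpacing attributes.

```python
def pixel_to_patient(row, col, position, orientation, spacing):
    """Map a pixel (row, col) of one slice to patient coordinates in mm.

    position:    ImagePositionPatient, centre of the first pixel (x, y, z)
    orientation: ImageOrientationPatient, six direction cosines
    spacing:     PixelSpacing, [spacing between rows, spacing between columns]
    """
    row_dir = orientation[:3]  # direction in which the column index increases
    col_dir = orientation[3:]  # direction in which the row index increases
    row_spacing, col_spacing = spacing
    return tuple(
        position[k]
        + col * col_spacing * row_dir[k]
        + row * row_spacing * col_dir[k]
        for k in range(3)
    )


def patient_to_pixel(point, position, orientation, spacing):
    """Inverse mapping: project a patient-space point onto the slice axes.

    Returns fractional (row, col); assumes the point lies in the slice plane.
    """
    row_dir, col_dir = orientation[:3], orientation[3:]
    row_spacing, col_spacing = spacing
    delta = [point[k] - position[k] for k in range(3)]
    col = sum(d * r for d, r in zip(delta, row_dir)) / col_spacing
    row = sum(d * c for d, c in zip(delta, col_dir)) / row_spacing
    return row, col


# Hypothetical geometry: a 1 mm CT grid and a coarser 2.5 mm dose grid.
axial = [1, 0, 0, 0, 1, 0]
ct_pos, ct_spacing = (-250.0, -250.0, 0.0), [1.0, 1.0]
dose_pos, dose_spacing = (-100.0, -50.0, 0.0), [2.5, 2.5]

# Where is CT pixel (row 205, col 160) in the patient, and in the dose grid?
point = pixel_to_patient(205, 160, ct_pos, axial, ct_spacing)
dose_index = patient_to_pixel(point, dose_pos, axial, dose_spacing)
```

Going through patient coordinates like this is why two arrays of different size and spacing can still be overlaid: each CT pixel is looked up (and, in practice, interpolated) in the dose grid. A full solution also has to handle the slice direction, which for RTDOSE is typically described by GridFrameOffsetVector.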
I'm looking for a way to locate known text within an image.
Specifically, I'm trying to create a tool to convert a set of scanned pages into PDFs that support searching and copy+paste. I understand how this is usually done: OCR the page, retaining the position of the text, and then add the text as an invisible layer to the PDF. Acrobat has this functionality built in, and tesseract can output hOCR files (containing the recognized text along with its location), which can be used by hocr2pdf to generate a text layer.
Unfortunately, my source images are rather low quality (at most 150 DPI, with plenty of JPEG artifacts, and non-solid backgrounds behind some of the text), leading to pretty poor OCR results. However, I do have a copy of the text (sans pictures and layout) that appears on each page.
Matching already known text to its location on a scanned page seems like it would be much easier to do accurately, but I failed to discover any software with this capability built in. How can I leverage existing software to do this?
Edit: The text varies in size and font, though passages of it are consistent.
The thought that springs to mind for me would be a cross-correlation. So, I would take the list of words that you know occur on the page and render them one at a time onto a canvas to create a picture of that word. You would need to use a similar font and size as the words in the document - which is what I asked in my comment. Then I would run a normalised cross-correlation of the picture of the word against the scanned image to see where it occurs. I would do all that with ImageMagick which is available for Windows and OSX (use homebrew on OS X) and included in most Linux distros.
So, let's take a screengrab of the second paragraph of your question and look for the word pretty - where you mention pretty poor OCR.
First, you need to render the word pretty onto a white background. The command will be something like this:
convert -background white -fill black -font Times -pointsize 14 label:pretty word.png
Result:
Then perform a normalised cross-correlation using Fred Weinhaus's script from here like this:
normcrosscorr -p word.png scan.png correlation-result.png
Match Coords: (504,30) And Score In Range 0 to 1: (0.999803)
and you can see the coordinates of the match are 504,30.
Result:
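To show what the correlation step computes, here is a minimal pure-Python sketch of a normalised cross-correlation search. It is not Fred Weinhaus's script: it assumes both images are small grids of grayscale values, the name ncc_match is hypothetical, and it uses the zero-mean form of the score, which ranges from -1 to 1 rather than 0 to 1.

```python
import math


def ncc_match(image, template):
    """Slide `template` over `image`; return ((x, y), score) of the best match.

    Both arguments are lists of rows of grayscale values (assumed layout).
    The score is the zero-mean normalised cross-correlation.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])

    # Subtract the mean once for the template.
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    best = (None, -1.0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            w_vals = [image[y + dy][x + dx]
                      for dy in range(th) for dx in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if t_norm == 0 or w_norm == 0:
                continue  # flat patch: the correlation is undefined
            score = sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm)
            if score > best[1]:
                best = ((x, y), score)
    return best


# A tiny white "scan" with a 2x2 pattern placed at x=3, y=1.
scan = [[255] * 6 for _ in range(5)]
scan[1][3] = 0
scan[2][4] = 0
word = [[0, 255], [255, 0]]
coords, score = ncc_match(scan, word)
```

This brute-force loop is far too slow for real page scans; it is only meant to illustrate the score that a dedicated tool computes for each candidate position.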
Another Idea
Another idea might be to take Google's Tesseract-OCR and replace the standard dictionary with the text file containing the words on the page you are processing...
I am using the most excellent PHP library ePub to create digital books on the fly from HTML stored in my database.
As these are part of a collection, I am including a cover image for every book. Everything works fine in the code but depending upon the device/software interpreting the ePub, the image may get cut off. I have seen 600x800 pixels as a recommended size, but it still cuts it off (for example in Aldiko in Android). Is there a standard size that is recommended in the documentation?
Honestly, I would love a good and readable recommendation for documentation of the ePub format.
So, it seems that Aldiko has the problem, and not the other e-Readers I have tested (Calibre, Overdrive).
After trying various ratios, I found that Aldiko only respects the height:100% style I have called out in the height direction. It doesn't scale the image, only sets the height at 100% of the screen width. I am going to have to go with this being a bug in Aldiko, and keep the recommended 600x800 ratio for maximum resolution.
Another interesting thing I discovered as well; the Aldiko reader didn't recover as well from non-standard HTML. On one of the database entries, a <style> tag inside the <body> disappeared, but the style text did not. This is not the same for the other e-Readers.
The best general advice I found on the internet is Preparing Images for Ebooks Project (PIFEP).
I'm making a GUI for selecting regions to crop from images. I have been using Seesaw and can select rectangular regions, but cannot find a way to set an image as the background of seesaw.canvas. This suggests using icons on labels. Can I make a label paintable and then use it as a canvas? Is there a way to overlap a label and a canvas or somehow use a panel that gives a background to its contents?
I think Quil has this functionality, but I'm not sure how to build a GUI around its draw, setup, sketch form if I want to add widgets.
Existing solutions would be appreciated as well, as long as I can decompose them. Using GIMP or Photoshop isn't an option for the workflow I want: multiple crops per photo, of different kinds on each page and different metadata added depending on the type of image outlined. Any suggestions for libraries for working with metadata for photos? I was planning on using a shell interface to exiftool, but a more portable option may be better.
You can draw a java.awt.Image (or sub-class) to a canvas with seesaw.graphics/image-shape:
(require '[seesaw.graphics :as g])
(defn paint-canvas [c g2d]
  (g/draw g2d (g/image-shape my-image 0 0) (g/style)))
It seems like that should do it.
Also note that labels (and all Seesaw widgets) are paintable. Just set the :paint option like on a canvas and paint away.
I was wondering how would you print an image that's scaled three times its original size without making it look like crap? If you change the dpi to 300 and print it'll look like crap. Is there a way to convert it gracefully?
You may have the problem of trying to add detail that isn't there. Hopefully you're aware of this.
The best way to enlarge an image that I know of is to use bicubic interpolation. If it's any help, Photoshop recommends using 'bicubic smoother' for enlargement.
Also, be careful with DPI vs PPI.
This is called upsampling or interpolation. There's no 'perfect' algorithm, since that would imply generating new information where there was none ('between' the pixels); but some methods are better than others at fooling the eye/brain into filling the voids, or at least at not making big square boxes.
Start with the Wikipedia articles on nearest-neighbor, bilinear and bicubic interpolation (the three offered by Photoshop). A few more, such as tricubic interpolation and Lanczos resampling, could be of interest; also check the theory and comparison links.
In short, this isn't a clear-cut issue but an active field of investigation, full of subjectivity and practical trade-offs.
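As an illustration of what "filling in between the pixels" means, here is a minimal sketch of bilinear interpolation, the simpler relative of bicubic. It assumes a grayscale image stored as a list of rows and a corner-aligned mapping between output and source pixels (one of several conventions; real libraries differ), and the name resize_bilinear is hypothetical.

```python
def resize_bilinear(pixels, new_w, new_h):
    """Resize a grayscale grid using bilinear interpolation.

    The corners of the output are mapped onto the corners of the input
    (an assumed convention), and every other output pixel is a weighted
    average of its four nearest source pixels.
    """
    old_h, old_w = len(pixels), len(pixels[0])
    out = []
    for j in range(new_h):
        # Fractional source row for this output row.
        sy = j * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for i in range(new_w):
            # Fractional source column for this output column.
            sx = i * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            # Blend horizontally on both rows, then vertically.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


small = [[0, 100],
         [100, 200]]
bigger = resize_bilinear(small, 3, 3)
```

The new centre pixel is simply the average of its neighbours, which is why enlargements look smooth but soft: no detail is created, only blended.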
You should vectorize your image, scale it, and if you wish you may convert it back to the original format (jpg, gif, png...). However this works best for simple images.
Do you know how to vectorize? There are some sites that do it online, just do some Google research and you'll find some.
Changing the DPI won't matter if you don't have enough pixels in your image for the size you are printing. In the biz it's called GIGO (Garbage In, Garbage Out).
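The arithmetic behind that statement is short enough to sketch. The helper names and the example numbers below are hypothetical; the point is only that the pixel count is fixed, so printed size and pixel density trade off against each other.

```python
def print_size_inches(width_px, height_px, ppi):
    """Physical size at which an image prints for a given pixel density."""
    return width_px / ppi, height_px / ppi


def effective_ppi(width_px, print_width_in):
    """Pixel density you actually get when forcing a given print width."""
    return width_px / print_width_in


# A 600x900 pixel image printed at 300 ppi is only 2 x 3 inches.
size_at_300 = print_size_inches(600, 900, 300)

# Printing the same image three times wider (6 inches) leaves 100 ppi.
density_at_3x = effective_ppi(600, 6)
```

Changing the DPI value stored in the file alters neither of these numbers' inputs; only adding real pixels (a higher-resolution source) or accepting interpolated ones changes the result.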
If your image is in HTML then create a media="print" stylesheet and feed a high-res image that way.