What is a suitable algorithm that can be used to generate random, but likely humanly distinguishable, graphic square icons?
Icons, from 57x57 up to 1024 square, such as used for mobile apps, preferably using something like Core Graphics commands/operations? (or an equivalent)
I tried filling square bitmaps with rand(), but they all look like mud, very hard to distinguish between by sight.
Identicon
The random icon you are describing is called an Identicon.
Identicons are icons that are generated from some form of user information.
An Identicon is a visual representation of a hash value, usually of an
IP address, that serves to identify a user of a computer system as a
form of avatar while protecting the users' privacy. The original
Identicon was a 9-block graphic, and the representation has been
extended to other graphic forms by third parties. – Wikipedia
Sample Implementation
You can have a look at:
NIdenticon - a C# library that helps create simple Identicons. Examine the IdenticonGenerator class, which has only one method, Create(). You should be able to extract the algorithm/general idea from it.
Contact-Identicons source - Android app source code. The app generates Identicons. This blog post includes a sample of the Java code used to generate a 5*5 pixel, horizontally symmetrical identicon much like the ones GitHub uses.
IGIdenticon source - Objective-C identicon generator. A port of an identicon library written in Java.
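To make the general idea concrete, here is a minimal sketch in Python of a horizontally symmetrical 5x5 identicon derived from a hash. It is not taken from any of the libraries above; the function names identicon and to_text, the choice of MD5, and the way bits are mapped to cells are all assumptions made for illustration.

```python
import hashlib


def identicon(user_info: str, size: int = 5):
    """Return (rgb_color, grid) for a horizontally symmetric identicon.

    grid is a list of rows; each cell is True (filled) or False (empty).
    The bit-to-cell mapping is an illustrative assumption.
    """
    half = (size + 1) // 2  # freely chosen columns, including the middle one
    if size * half > 104:
        raise ValueError("not enough hash bits for this grid size")

    digest = hashlib.md5(user_info.encode("utf-8")).digest()  # 16 bytes
    color = (digest[0], digest[1], digest[2])   # first 3 bytes pick the colour
    bits = int.from_bytes(digest[3:], "big")    # remaining 104 bits pick cells

    grid = []
    for row in range(size):
        left = [bool((bits >> (row * half + col)) & 1) for col in range(half)]
        # Mirror the left part onto the right; skip the middle column if odd.
        mirrored = left[-2::-1] if size % 2 else left[::-1]
        grid.append(left + mirrored)
    return color, grid


def to_text(grid):
    """Render the grid as text, '#' for filled cells and '.' for empty ones."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)
```

Calling identicon("some user id") always returns the same colour and pattern for the same input. To get a 57x57 or 1024x1024 icon you would draw each cell as a filled rectangle of the chosen colour at the appropriate scale with whatever drawing API you use.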
Good luck!
One way to approach this is similar to a random sentence generator: Rather than a random sequence of letters or words, you can use simple grammar templates like "The (adjective) (noun) (transitive verbed) a (adjective) (noun)." Then pick random nouns etc. to fill it in.
So here, you could compose an icon by randomly selecting some small image pieces like a document icon, a person icon, a right arrow, a question mark, etc. Randomly colorize the pieces, using a randomly chosen color scheme. Randomly arrange the pieces together. Add a shadow. Stuff like that.
For avatars, this could work similar to Mr. Potato Head.
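A rough sketch of that template idea in Python, assuming you already have a set of small image pieces to draw. The piece names, colour schemes and the compose_icon function are made up for illustration; the function only produces a description of the icon, which you would then render with your drawing API.

```python
import random

# Hypothetical piece names and colour schemes, for illustration only.
PIECES = ["document", "person", "arrow", "question-mark", "gear", "star"]
SCHEMES = {
    "warm": ["#d9480f", "#f08c00", "#ffd43b"],
    "cool": ["#1864ab", "#0b7285", "#66d9e8"],
    "mono": ["#212529", "#868e96", "#dee2e6"],
}


def compose_icon(seed, n_pieces=3, canvas=57):
    """Describe a random icon as a colour scheme plus positioned pieces."""
    rng = random.Random(seed)  # same seed -> same icon
    scheme_name = rng.choice(sorted(SCHEMES))
    colors = SCHEMES[scheme_name]
    layers = []
    for piece in rng.sample(PIECES, n_pieces):  # distinct pieces
        size = rng.randint(canvas // 3, canvas // 2)
        layers.append({
            "piece": piece,
            "color": rng.choice(colors),
            "x": rng.randint(0, canvas - size),  # keep the piece on the canvas
            "y": rng.randint(0, canvas - size),
            "size": size,
        })
    return {"scheme": scheme_name, "shadow": rng.random() < 0.5, "layers": layers}
```

Seeding the generator means the same seed (for example a user id) always gives the same icon, while different seeds vary the pieces, colours and layout together.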
Local travel cards in Saint-Petersburg, Russia have got huge id numbers that aren't easy to read and type into a web page when topping up the card online. So I want to build a small app that would take a photo of a travel card and parse the number out.
The task is a bit easier than a free form recognition:
card is of the very well known size
id numbers are of known size, are located in the very well known location on a card and they are number only, no letters (okay, there are two variations I think and maybe they will add 1-2 more in the future)
even the font is known in advance
even the first several digits are the same for most of the cards (so far there are only two prefixes used)
How would you do it? Are there any libraries tuned not for the general OCR, but for a "hinted" OCR like I need?
Best regards,
Artem.
P.S.
Actually a free/cheap web service for this task would also be good enough
Yes. Tesseract is an open-source OCR engine (its development was sponsored by Google), and there is an iOS SDK for it on GitHub that you can import into your application. The SDK comes with documentation that explains how to set it up in your app. Its methods return a string containing the text recognized in the image, BUT that will be ALL of the text on the card. So the best thing to do would be to:
1 "clip" the original image to extract a sub image that displays only the portion of the card you wish to get the numbers from.
2 Process this sub image through Tesseract to retrieve the string you are looking for.
3 Then parse through the string and pick out the data that you need.
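Steps 1 and 3 can be sketched independently of the OCR engine. The Python below is only an illustration under assumptions: the relative box, the prefix "9643" and the length of 19 digits are invented values, and crop_box and parse_card_number are hypothetical helper names, not part of any Tesseract SDK.

```python
import re


def crop_box(img_w, img_h, rel_box):
    """Turn a (left, top, right, bottom) box given as fractions of the card
    into integer pixel coordinates for an image of img_w x img_h pixels."""
    left, top, right, bottom = rel_box
    return (round(left * img_w), round(top * img_h),
            round(right * img_w), round(bottom * img_h))


def parse_card_number(ocr_text, prefixes, length):
    """Pick a card number out of raw OCR text.

    All non-digit characters are dropped first, so spaces or line breaks
    inserted by the OCR engine do not matter. Returns the first run of
    `length` digits that starts with one of the known prefixes, or None.
    """
    digits = re.sub(r"\D", "", ocr_text)
    for prefix in prefixes:
        start = digits.find(prefix)
        while start != -1:
            candidate = digits[start:start + length]
            if len(candidate) == length:
                return candidate
            start = digits.find(prefix, start + 1)
    return None
```

Because the card size and the position of the number are known, the crop box can be fixed once as fractions of the card, and the known prefixes give a cheap sanity check on whatever the OCR step returns.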
But be warned, it can be a bit quirky. This SDK tends to recognize text best in scanned images rather than in photos taken with a camera. Although it is an advanced piece of technology, it isn't perfect, so to get the best results, try to work from scanned copies of the originals.
Best of luck.
The ideal solution for you would have three components:
1) Detection of the card. This is useful because, with detection in place, end users have a much easier time actually using the scanner: they can hold the phone above the card in an arbitrary orientation.
2) Accurate OCR component. Ideally, customizable for this exact font you have on the card, for the exact position on the card.
3) Parsing mechanism. This would enable you to obtain the exact string written on the card without writing huge amount of OCR parsing code.
BlinkID SDK has all this. It has a preset for detecting cards in the ID-1 format. It has an integrated OCR engine. And it provides RegexParser, where you can define the exact format of the text which you're trying to extract from the document.
BlinkID was initially built for scanning ID documents, which have very similar properties to the problem you're trying to solve.
Note. I'm one of the developers working on BlinkID.
I have an application in mind that I want to produce. We have wall-mounted schedule boards that are divided into small rectangles using black lines on a white background. Magnetic name tags are placed into a particular partition to indicate that this person is to work in that cell. This system works very well for communication among people, but I would like a way of saving this schedule information into a database automatically.
I am envisioning a system where a camera is set in a fixed position focusing on the schedule board. Periodically the camera will take a picture of the board. I want to write some code to decipher which name tags are in which area. This would require some OCR or symbol recognition. There are big numbers on each name tag that I will use to identify the person whose name tag it is.
I naturally go to Python when tackling a new programming problem. I found this post -> python image recognition which looks like a good place to start (with PIL and numpy).
Do you know a good way to do this?
Update: I have tried SimpleCV and it seems good for now.
This is actually a pretty hard problem, even though it looks quite simple. But you can make it a lot easier by doing some stuff to your image to make this manageable. I have the following suggestions:
Try to make it so that your camera is looking straight at the board with a reasonable lens so that there is minimal distortion of the image on the edges, and no perspective distortion.
Given that you'll be shooting the occasional image for analysis I think performance is in no way an issue, so shoot high-resolution images, with a flash or with a long exposure time (because everything you're shooting is stationary) to get the best possible picture quality.
If the number of different tags you expect is not too large you might find it easier to just try to match reference images of these tags in your image through template matching rather than going for full OCR of numbers. This is a lot easier to get working if your image is good enough. The python opencv interface is very complete.
High Performance Mark has a good comment to your question about including barcodes on the tags. I would add the option of QR codes, but that is just the same thing. Both are easy to detect and there are good libraries to help you read them.
If you decide you do need OCR, you should look into available OCR packages and not try to roll your own. Try pytesser for the tesseract engine or the OCRopus python interface.
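To illustrate the template matching suggestion, here is a minimal pure-Python sketch that slides a small reference image over a larger one and scores each position by the sum of squared differences. The match_template function is a hypothetical helper written for this answer; in practice you would more likely call OpenCV's cv2.matchTemplate, which does the same kind of search much faster.

```python
def match_template(image, template):
    """Find where `template` best matches inside `image`.

    Both arguments are 2D lists of grayscale values. Returns
    ((row, col), score) for the top-left corner of the best match,
    where score is the sum of squared differences (lower is better,
    0 means an exact match).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```

With one reference image per name tag, you would run the match for each tag, keep the matches whose score is below some threshold you tune on real photos, and map the matched position to the board cell that contains it.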
Since you mentioned that you would like to use Python for this problem, perhaps you could take a look at SimpleCV. It provides an easy way to grab the image from the camera and do basic image processing.
I strongly agree with jilles de witt that OCR would be an extremely hard image analysis task to develop from scratch. Code reading would be a better option, but that also will be difficult to program and will require sophisticated or somewhat challenging imaging as others have noted. However, for this app you really do not need to implement OCR or formal bar codes, QR or other 2d codes.
Since your application is constrained to a limited number of targets, perhaps you could make your own simple code. For example, you could place 0 to 4 big dots in a 2x2 array after each person's name. This simple example code distinguishes 16 tags (each of the 4 positions either has a dot or not), and the features will be much easier to image, extract and decode than formal codes. Add a locator line if the code position is not consistent.
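A minimal sketch of such a 2x2 dot code in Python, assuming dark dots on a light background and that the code area has already been cropped out of the photo. The bit order, the darkness threshold and the function names encode_tag and decode_tag are assumptions for illustration.

```python
def encode_tag(tag_id):
    """Return a 2x2 grid of booleans (True = dot present) for an id in 0..15."""
    if not 0 <= tag_id <= 15:
        raise ValueError("tag_id must be between 0 and 15")
    bits = [bool((tag_id >> i) & 1) for i in range(4)]
    return [bits[0:2], bits[2:4]]  # top row, bottom row


def decode_tag(patch, threshold=128, min_dark_fraction=0.25):
    """Decode a cropped grayscale patch (2D list) of the code area.

    The patch is split into four quadrants; a quadrant counts as a dot
    when more than min_dark_fraction of its pixels are darker than
    threshold. Quadrant order matches encode_tag.
    """
    h, w = len(patch), len(patch[0])
    hh, hw = h // 2, w // 2
    tag_id = 0
    for i, (r0, c0) in enumerate([(0, 0), (0, hw), (hh, 0), (hh, hw)]):
        cell = [patch[r][c] for r in range(r0, r0 + hh) for c in range(c0, c0 + hw)]
        dark = sum(1 for value in cell if value < threshold)
        if dark / len(cell) > min_dark_fraction:
            tag_id |= 1 << i
    return tag_id
```

Note that id 0 has no dots at all, so it cannot be told apart from an empty cell; you may prefer to use only ids 1 to 15.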
I downloaded the Evernote API Xcode Project but I have a question regarding the OCR feature. With their OCR service, can I take a picture and show the extracted text in a UILabel or does it not work like that?
Or is the extracted text not available to me and only used for the photo search function?
Has anyone ever had any experience with this or any ideas?
Thanks!
Yes, but it looks like it's going to be a bit of work.
When you get an EDAMResource that corresponds to an image, it has a property called recognition that returns an EDAMData object that contains the XML that defines the recognition info. For example, I attached this image to a note:
I inspected the recognition info that was attached to the corresponding EDAMResource object, and found this:
the xml i found on pastie.org, because it's too big to fit in an answer
As you can see, there's a LOT of information here. The XML is defined in the API documentation, so this would be where you parse the XML and extract the relevant information yourself. Fortunately, the structure of the XML is quite simple (you could write a parser in a few minutes). The hard part will be to figure out what parts you want to use.
It doesn't really work like that. Evernote doesn't really do "OCR" in the pure sense of turning document images into coherent paragraphs of text.
Evernote's recognition XML (which you can retrieve via the technique that @DaveDeLong shows above) is most useful as an index to search against; the service will provide you sets of rectangles and sets of possible words/text fragments with probability scores attached. This makes a great basis for matching search terms, but a terrible one for constructing a single string that represents the document.
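To show what "rectangles with weighted candidates" means in practice, here is a small Python sketch that parses recognition XML and searches it. The element and attribute names (recoIndex, item with x/y/w/h, t with a weight w) follow my understanding of Evernote's recognition index format, so check them against the API documentation; the sample data is invented, and the candidates and search functions are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Invented sample data, shaped like a recognition index.
SAMPLE = """<recoIndex objType="image" objWidth="800" objHeight="600">
  <item x="40" y="30" w="220" h="48">
    <t w="87">EVER NOTE</t>
    <t w="83">EVERNOTE</t>
    <t w="41">EVER MOTE</t>
  </item>
  <item x="60" y="120" w="150" h="40">
    <t w="52">hello</t>
    <t w="47">hallo</t>
  </item>
</recoIndex>"""


def candidates(xml_text):
    """Return [(box, [(weight, text), ...]), ...], best candidate first."""
    root = ET.fromstring(xml_text)
    result = []
    for item in root.iter("item"):
        box = tuple(int(item.get(key)) for key in ("x", "y", "w", "h"))
        words = sorted(
            ((int(t.get("w")), t.text or "") for t in item.findall("t")),
            reverse=True,
        )
        result.append((box, words))
    return result


def search(xml_text, term):
    """Return (box, weight, text) for every candidate containing term."""
    term = term.lower()
    return [
        (box, weight, text)
        for box, words in candidates(xml_text)
        for weight, text in words
        if term in text.lower()
    ]
```

Searching the sample for "evernote" finds the second candidate of the first rectangle even though it is not the top-weighted guess, which is exactly why this data works well for search and poorly for producing one clean string.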
(I know this answer is like 4 years late, but Dave's excellent description doesn't really address this philosophical distinction that you'll run up against if you try to actually do what you were suggesting in the question.)
As the question states: how is it possible to process a dynamic video stream? By dynamic, I mean that I would like to process whatever is on my screen, so the image array should be some sort of "continuous screenshot".
I'd like to process the video / images based on certain patterns. How would I go about this?
It would be perfect if there already were (and there probably are) existing components. I need to be able to use the location of the matches (or partial matches). A .NET component for the different requirements could also be useful I guess...
You will probably need to read up on Computer Vision before you attempt this. There is nothing really special about video that separates it from still images. The process you might want to look at is:
Acquire the data
Split the data into individual frames
Remove noise (Use a Gaussian filter)
Segment the image into the sections you want
Extract the connected components of the image
Find a way to quantize the image for comparison
Store/match the components to a database of previously found components
With this database/datastore you can look up matches for components you find later. Do what you like with it.
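As an example of the segmentation and connected-component steps, here is a minimal pure-Python sketch that thresholds a grayscale frame and labels 4-connected bright regions, returning a bounding box and area for each so you know where each component is. The connected_components function and the fixed threshold are assumptions for illustration; OpenCV and AForge.NET provide faster equivalents.

```python
from collections import deque


def connected_components(frame, threshold=128):
    """Find 4-connected regions of pixels with value >= threshold.

    frame is a 2D list of grayscale values. Returns a list of dicts with
    "box" as (top, left, bottom, right), inclusive, and "area" in pixels,
    in the order the regions are met when scanning row by row.
    """
    h, w = len(frame), len(frame[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or frame[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and frame[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append({"box": (top, left, bottom, right), "area": area})
    return components
```

Each bounding box gives the location of a candidate region in the frame; the pixels inside it are what you would quantize and compare against your stored components.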
As far as software goes:
Most of these algorithms are not too difficult. You can write them yourself. They do take a bit of work though.
OpenCV does a lot of the basic stuff, but it won't do everything for you
Java: JAI, JHLabs [for filters], Various other 3rd party libraries
C#: AForge.net
I need a good diagram / image editor for a Delphi application. I need the ability to place an image in the editor, and use freely positioned balloons / tips to describe parts of the image. The result must be exported as an image.
So far, I have evaluated KSDev Block Engine and TMS Diagram Studio but am not completely satisfied with either of them. The former seems to have lots of little quirks and bugs, and neither of them seems to be able to export its content as an image (PNG with alpha channels is required).
Are there any other editors you know of that I might evaluate?
There are two free components that I know of and have evaluated for a very limited period.
1) DrawObjects by Angus Johnson at http://angusj.com/delphi/
2) SimpleGraph from the DelphiArea site.
If I remember correctly both have the ability to export to an image format but I do not recall if they support PNG with alpha.
Regards
jo
PS sorry the anti spam does not allow me to post the link for the second pack and since I hate any kind of sign in just to answer a question my email and name are fake. This is the last time I am going to visit this site. I do understand the need to keep the spammers out but I can't accept any one to assume that I am a spammer. BB everyone.
Also have a look at:
- TeeTree from steema.com
- TCad from codeidea.com