Edge detection to identify artist style? - image-processing

I just wondered if anyone has ever tried to use edge detection as a feature on its own for artist identification?
I know edge detection methods are most often used as a first step towards object detection against a database of objects, but I cannot find out whether edge detection can act as a feature in isolation.
Any ideas?
Thanks

It takes a little bit more than basic edge detection, and having something better than just a photograph can make a big difference. Check out http://infolab.stanford.edu/~wangz/project/imsearch/ART/SP08/sp_vangogh.pdf

Yes, there's been some work on that (at L2CRMF that I know of, but others also).
Edge detection per se is not enough, though; you use it as a preliminary stage to identify features such as brush strokes, stroke vicinity, arcing, direction, average stroke length.
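For the curious, here is a minimal sketch of what "edge detection as a preliminary stage" can look like in practice. It assumes OpenCV and NumPy and boils the edges down to an orientation histogram plus an edge-density figure, a crude stand-in for the stroke features mentioned above; it is not the method used in the research referenced here.

```python
# Minimal sketch (assumes OpenCV + NumPy): run edge detection, then summarise
# the edges into an orientation histogram + edge density as a crude
# "stroke style" feature vector for a painting image.
import cv2
import numpy as np

def stroke_style_features(path, bins=16):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, 100, 200)

    # Gradient direction at edge pixels approximates local stroke direction.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    angles = np.arctan2(gy, gx)[edges > 0]

    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    hist = hist / max(hist.sum(), 1)        # normalise to a distribution
    edge_density = (edges > 0).mean()       # how "busy" the strokes are
    return np.append(hist, edge_density)    # feed this to any classifier
```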
While talking shop, I was led to believe that this sort of analysis was a thankless task and much prone to error; one of the many fields where canny human perception still far outweighed automated processing. On the other hand this was back in 2003-2004, and things may have improved.
I have been able to find this reference, as well as this other which looks more promising.
Of historical interest only: Pliny (Naturalis Historia, XXXV.88, 81) reports how Protogenes recognized a visitor to be the great painter Apelles from a line he had drawn, as it was so thin and perfect that such an absolute masterpiece could belong to no one else:
Ferunt Protogenem protinus, cum contemplatus esset subtilitatem,
dixisse Apellem venisse; adfirmavit enim tam absolutum opus non cadere
in alium

Related

Cluster Analysis for crowds of people

I have location data from a large number of users (hundreds of thousands). I store the current position and a few historical data points (minute data going back one hour).
How would I go about detecting crowds that gather around natural events like birthday parties etc.? Even smaller crowds (let's say starting from 5 people) should be detected.
The algorithm needs to work in almost real time (or at least once a minute) to detect crowds as they happen.
I have looked into many cluster analysis algorithms, but most of them seem like a bad choice. They either take too long (I have seen O(n^3) and O(2^n)) or need to know how many clusters there are beforehand.
Can someone help me? Thank you!
Let each user be its own cluster. When she gets within distance R of another user, form a new cluster; separate again when the person leaves. You have your event when:
The number of people is greater than N
They have been in the same place for a time greater than T
The cluster is not moving (movement might indicate public transport)
It is not located in a public service building (hospital, school etc.)
(and a good number of other conditions)
One minute is plenty of time to get this done even for hundreds of thousands of people. A naive implementation would be O(n^2), but bear in mind there is no point in comparing the location of every individual against every other, only against those in the close neighbourhood. As a first approximation you can divide the "world" into sectors, which also makes it easy to parallelise the task and in turn to scale easily. More users? Just add a few more nodes and spread the load.
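A rough sketch of that sector idea in plain Python (coordinates are assumed to already be projected to metres on a local plane; all names and thresholds are illustrative):

```python
# Sector-based neighbour search: bucket users into grid cells the size of the
# search radius, then compare only against users in the 3x3 surrounding cells.
from collections import defaultdict
from math import hypot

def find_groups(positions, radius=50.0, min_size=5):
    """positions: dict of user_id -> (x, y) in metres."""
    cell = lambda x, y: (int(x // radius), int(y // radius))
    grid = defaultdict(list)
    for uid, (x, y) in positions.items():
        grid[cell(x, y)].append(uid)

    groups = []
    for uid, (x, y) in positions.items():
        cx, cy = cell(x, y)
        neighbours = [
            other
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            for other in grid[(cx + dx, cy + dy)]
            if other != uid
            and hypot(positions[other][0] - x, positions[other][1] - y) <= radius
        ]
        if len(neighbours) + 1 >= min_size:
            groups.append([uid] + neighbours)
    return groups
```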
How to avoid the "single-linkage problem" mentioned by the author in the comments? One idea would be to think in terms of 'mass' and centre of gravity. First of all, do not mark something as an event until its mass is greater than e.g. 15 units. Sure, locations are imprecise, but for real events they should average out around the centre of the event. If your cluster grows in some direction without adding substantial mass, then most likely it isn't right. Look at methods like DBSCAN (density-based clustering); good inspiration can also be taken from physical systems, even the Ising model (here you think in terms of temperature and "flipping" someone to join the crowd). It is not a novel problem and I am sure there are papers that cover it at least partially, e.g. "Is There a Crowd? Experiences in Using Density-Based Clustering and Outlier Detection".
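Since DBSCAN is mentioned, here is a hedged sketch using scikit-learn's implementation with the haversine metric (eps must be given in radians, so the 100 m radius below is divided by the Earth radius; all numbers are illustrative):

```python
# Hedged sketch: DBSCAN over lat/lon positions using the haversine metric.
import numpy as np
from sklearn.cluster import DBSCAN

def detect_crowds(latlon_deg, min_people=5, radius_m=100.0):
    coords = np.radians(latlon_deg)              # shape (n_users, 2), degrees -> radians
    eps = radius_m / 6_371_000.0                 # metres -> radians on the Earth sphere
    labels = DBSCAN(eps=eps, min_samples=min_people,
                    metric="haversine", algorithm="ball_tree").fit_predict(coords)
    return labels                                # label -1 means "not in any crowd"
```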
There is little use in doing a full clustering.
Just use a good database index.
Keep a database of the current positions.
Whenever you get a new coordinate, query the database with the desired radius, say 50 meters. A good index will do this in O(log n) for a small radius. If you get enough results, this may be an event, or someone joining an ongoing event.
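As a stand-in for that database index, here is a sketch with scikit-learn's BallTree (haversine metric); a spatial index in your database of choice plays the same role, and the names and thresholds are illustrative:

```python
# Sketch of the "index + radius query" approach; BallTree stands in for a
# spatial database index.
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_M = 6_371_000.0

def build_index(latlon_deg):
    return BallTree(np.radians(latlon_deg), metric="haversine")

def looks_like_event(tree, new_latlon_deg, radius_m=50.0, min_neighbours=5):
    point = np.radians([new_latlon_deg])
    idx = tree.query_radius(point, r=radius_m / EARTH_RADIUS_M)[0]
    return len(idx) >= min_neighbours, idx       # neighbours within the radius
```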

tesseract OCR not 100% accurate

I need to scan a handwritten font. I am using tesseract OCR v3.02 for that. I have trained the OCR using box files and added dictionary words as well, but I am still not able to get 100% accuracy.
I am trying to scan the following image
So far the text file I am obtaining looks like this:
We each took baths before bed. Bigfoot was so large that he had to use the bathtub more like a sink trying to clean
up his best. he left a pool of water around the usa It was a mess and I mopped it and removed all of the fur
left behind before taking a bath of my Own. I have to admit it was cguite disgusting. “It”s cguite simple.” Me
assured her. “Mere, I”ll show you.”Iimmy‘s was the favorite burger joint among us kids. Sreat burgers. and
there were video games and pinball machines. Jimmy”s has been around since the fift"es. all our parents used to
go there when they were young. “‘Mot exactly.""Another blanket. Ben. please."“I really hope so. because I feel so
Any help to improve the OCR?

How to make texts in images sharper using PIL?

I was working with PIL, OpenCV and OCR readers to read text from images. The biggest problem I face is the image processing needed to make the text sharp enough for easier/more accurate extraction by the OCR reader.
For that, I thought of increasing the contrast/brightness and doing histogram equalization using PIL, but that didn't help either.
So, what would you suggest to make the text appear sharper for better extraction?
PIL has sharpen and edge enhancing filters. Is this what you want? An example image showing what you are dealing with would be helpful.
Your image has an uneven background color which may be causing problems. Try looking at this solution to create a nice leveled b&w image.
But the black collar is also going to cause problems and you should look at ways of cropping it out.
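The linked solution isn't reproduced here, but one common way to level an uneven background (an assumption on my part, not necessarily what that answer does) is to divide the image by a heavily blurred copy of itself and then threshold:

```python
# Background levelling sketch with Pillow + NumPy (blur radius and threshold
# are illustrative and will need tuning for your image).
import numpy as np
from PIL import Image, ImageFilter

def level_background(path, blur_radius=25, threshold=180):
    gray = Image.open(path).convert("L")
    background = gray.filter(ImageFilter.GaussianBlur(blur_radius))
    g = np.asarray(gray, dtype=np.float32)
    b = np.asarray(background, dtype=np.float32) + 1.0   # avoid divide-by-zero
    levelled = np.clip(g / b * 255.0, 0, 255).astype(np.uint8)
    return Image.fromarray(np.where(levelled > threshold, 255, 0).astype(np.uint8))
```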
That said, I get reasonable improvements with a simple PIL SHARPEN filter:
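The exact code used isn't shown here, but the step amounts to something like this minimal sketch (assumes Pillow and pytesseract; "page.png" is a placeholder file name):

```python
# Minimal SHARPEN-then-OCR sketch (Pillow + pytesseract; file name is a placeholder).
from PIL import Image, ImageFilter
import pytesseract

img = Image.open("page.png").convert("L")        # greyscale copy of the source image
sharpened = img.filter(ImageFilter.SHARPEN)      # PIL's built-in sharpen filter
sharpened.save("page_sharpened.png")
print(pytesseract.image_to_string(sharpened))    # feed the filtered image to tesseract
```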
tesseract results after SHARPEN filter:
From what I've learned looking inside people, ^ I've decided human
beings are somewhere ` between a hurricane and an ice cube} in some
respects, permanently mysterious, but in others- with enough science
and careful probingeentirely ' scrutabler It would be as foolish to
think we have reached the limits of human knowledge as it is to 3
think we could ever know everything. There is still room enough to
get better, to ask questions of i even the dead, to learn from
knowing when our i simple certainties are wrong.
And results without filter:
From what I've learned lnnkmg wade maple} Fve deculed lunnuan wlng;.
el'. .y.w.r-a' isbetween a luurrlctuvr null llva laAll.' a. I ll
respects, permanently unyst:-rwntMl ln ms. re with enough scaena)
and turutul pmlulng l~m.rely scrutable. It would he as loallsla to
thank we have reached the llmlts of human knowledge as lt ls to think
we could ever know everything. There ls still room enough to get
better, to ask quesuons of ` even the dead, to learn from knowmg when
our simple certeindes are wrong.

upper bound - display

This is an idea that came to my mind.
All display devices (screens that have pixels etc.) have an upper bound on the number of distinct images they can generate.
As an example, a 1024*768, 32-bit-per-pixel display can only show (2^32)^(1024*768) distinct frames without duplicating any scene (view).
The funny thing is, it's as if we could pre-generate all the films and all the windows we will ever see in our lives through screens.
The question here is: can anybody use this abstract idea to create something useful? :D
You're talking about a number of roughly
(2^32)^(1024*768) ≈ ((2^4)^8)^(10^6) ≈ (10^8)^(10^6) = 10^8000000.
The number of atoms in the universe is about
10^80 // http://en.wikipedia.org/wiki/Observable_universe#Matter_content
I think that there is no way we could pre-generate all the screens in our life.
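For what it's worth, the order of magnitude is easy to check with logarithms (the number itself is far too large to materialise), and it lands in the same ballpark as the estimate above:

```python
# Sanity check of the exponent via logarithms.
import math
digits = 1024 * 768 * 32 * math.log10(2)
print(f"about 10^{digits:,.0f}")   # prints: about 10^7,575,668
```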
Let me formulate another question. From a number this big, what can we do to reduce it? How do we aggregate similar pictures in order to reduce the complexity?
Another nice question is: what kind of data structure do we need to store all this information? Suppose we reduce the number of similar images to 10^10. What kind of structure can handle so many different pictures in an efficient way?
So, given some extra information about the scenes you could generate, you might be able to pull out the scenes that no one has ever seen.
If you could take all the pictures out on the internet, together with statistics about what was popular or viewed a lot, and then compute all your possible screens, you could pull out those that were not viewed much.
With some basic rules about the complexity of the image you might be able to come up with images that have not been seen before. Think: 80% flesh tones coupled with enough variance might render pictures of naked people. :-)
Of course the computation of such an idea is vastly outside our potential. (2^32)^(1024*768) is in the superexponential range, which is outside the bounds of reality. I tried to compute it in Ruby, and it just died. It would have been fun if it had actually worked. :-)

Game Terrain Database Model

I am developing a game for the web. The map of this game will be a minimum of 2000km by 2000km. I want to be able to encode elevation and terrain type at some level of granularity - 100m X 100m for example.
For a 2000km by 2000km map, storing this information in 100m x 100m buckets would mean 20000 by 20000 elements, or a total of 400,000,000 records in a database.
Is there some other way of storing this type of information?
MORE INFORMATION
The map itself will not ever be displayed in its entirety. Units will be moved on the map in a turn based fashion and the players will get feedback on where they are located and what the local area looks like. Terrain will dictate speed and prohibition of movement.
I guess I am trying to say that the map will be used for the game and not necessarily for graphical or display purposes.
It depends on how you want to generate your terrain.
For example, you could procedurally generate it all (using interpolation of a low resolution terrain/height map - stored as two "bitmaps" - with random interpolation seeded from the xy coords to ensure that terrain didn't morph), and use minimal storage.
If you wanted areas of terrain that were completely defined, you could store these separately and use them where appropriate, randomly generating the rest.
If you want completely defined terrain, then you're going to need to look into some kind of compression/streaming technique to only pull terrain you are currently interested in.
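Here is a sketch of the deterministic procedural idea from the first paragraph. It assumes a low-resolution height grid `coarse` (a 2-D array of elevations, one sample per coarse cell); the "random" detail is seeded from the coordinates so the same spot always produces the same height. Bounds checking is omitted and all names are illustrative.

```python
# Procedural elevation sketch: bilinear interpolation of a coarse height grid
# plus coordinate-seeded noise so the terrain never morphs between visits.
import random

COARSE_CELL_M = 1000.0   # one coarse sample per kilometre (illustrative)

def height_at(coarse, x_m, y_m, jitter=5.0):
    gx, gy = x_m / COARSE_CELL_M, y_m / COARSE_CELL_M
    x0, y0 = int(gx), int(gy)
    fx, fy = gx - x0, gy - y0

    # Bilinear interpolation between the four surrounding coarse samples.
    h = (coarse[y0][x0]           * (1 - fx) * (1 - fy)
         + coarse[y0][x0 + 1]     * fx       * (1 - fy)
         + coarse[y0 + 1][x0]     * (1 - fx) * fy
         + coarse[y0 + 1][x0 + 1] * fx       * fy)

    # Deterministic "random" detail derived from the integer coordinates.
    rng = random.Random(int(x_m) * 1_000_003 + int(y_m))
    return h + rng.uniform(-jitter, jitter)
```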
I would treat it differently, by separating terrain type and elevation.
Terrain type, I assume, does not change as rapidly as elevation - there are probably sectors of the same type of terrain that stretch over much longer than the lowest level of granularity. I would map those sectors into database records or some kind of hash table, depending on performance, memory and other requirements.
Elevation, I assume, is semi-continuous, as it changes gradually for the most part. I would try to map the values into a set of continuous functions (different sets between parts that are not continuous, as in a sudden change in elevation). For any range of coordinates over which the terrain has the same elevation or can be described by a simple function, you just need to define the range that function covers. This should greatly reduce the amount of information you need to record to describe the elevation at each point in the terrain.
So basically I would break the map down into different sectors consisting of (x,y) ranges, once for terrain type and once for elevation, and build a hash table for each which can return the appropriate value as needed.
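An illustrative (made-up) toy version of that terrain-type lookup, with one entry per coarse sector instead of one per 100 m cell:

```python
# Sector hash sketch: terrain type stored once per coarse sector.
SECTOR_KM = 100                      # one entry per 100 km x 100 km sector

terrain_by_sector = {
    (0, 0): "plains",
    (1, 0): "forest",
    (0, 1): "mountains",
    # ... only sectors that differ from the default need an entry
}

def terrain_at(x_km, y_km, default="water"):
    sector = (int(x_km // SECTOR_KM), int(y_km // SECTOR_KM))
    return terrain_by_sector.get(sector, default)
```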
If you want the kind of granularity that you are looking for, then there is no obvious way of doing it.
You could try a 2-dimensional wavelet transform, but that's pretty complex. Something like a Fourier transform would do quite nicely. Plus, you probably wouldn't want to store the terrain in a one-record-per-piece-of-land way; it makes more sense to have some sort of database field which can store an encoded matrix.
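As a sketch of that "encoded matrix" idea, here is a 2-D DCT version (a close cousin of the Fourier transform, chosen here because keeping its top-left block is a simple low-pass; assumes SciPy and NumPy, and the amount of data kept is illustrative):

```python
# Transform-compression sketch: keep only low-frequency DCT coefficients of a
# height grid and reconstruct (approximately) on demand.
import numpy as np
from scipy.fft import dctn, idctn

def compress_heights(heights, keep=64):
    """heights: 2-D float array of elevations. Returns a keep x keep block."""
    return dctn(heights, norm="ortho")[:keep, :keep]

def decompress_heights(coeffs, shape):
    full = np.zeros(shape)
    full[:coeffs.shape[0], :coeffs.shape[1]] = coeffs
    return idctn(full, norm="ortho")
```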
I think the usual solution is to break your domain up into "tiles" of manageable sizes. You'll have to add a little bit of logic to load the appropriate tiles at any given time, but not too bad.
You shouldn't need to access all that info at once--even if each 100m x 100m bucket occupied a single pixel on the screen, no screen I know of could show 20k x 20k pixels at once.
Also, I wouldn't use a database--look into height mapping--effectively using a black & white image whose pixel values represent heights.
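A quick sketch of that height-map idea with PIL (one pixel per 100 m cell; the file name and the elevation scaling are placeholders, and in practice you would tile the image rather than load 20k x 20k pixels at once):

```python
# Height-map sketch: a greyscale image where each pixel is one 100 m cell and
# its 0-255 value is scaled to an elevation in metres.
from PIL import Image

heightmap = Image.open("heights.png").convert("L")   # placeholder file name
MIN_ELEV_M, MAX_ELEV_M = 0.0, 2550.0                 # illustrative scaling

def elevation_at(x_m, y_m, cell_m=100):
    px, py = int(x_m // cell_m), int(y_m // cell_m)
    value = heightmap.getpixel((px, py))              # 0..255
    return MIN_ELEV_M + value / 255.0 * (MAX_ELEV_M - MIN_ELEV_M)
```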
Good luck!
That will be an awful lot of information no matter which way you look at it. 400,000,000 grid cells will take their toll.
I see two ways of going around this. Firstly, since it is a web-based game, you might be able to get a server with a decently sized HDD and store the 400M records in it just as you would normally, or, more likely, create your own storage mechanism for efficiency. Then you would only have to devise a way to access the data efficiently, which could be done by taking into account the fact that you will hardly ever need to use it all at once. ;)
The other way would be some kind of compression. You have to be careful with this though. Most out-of-the-box compression algorithms won't allow you to decompress an arbitrary location in the stream. Perhaps your terrain data has some patterns in it you can use? I doubt it will be completely random. More likely I predict large areas with the same data. Perhaps those can be encoded as such?
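A sketch of that last point: run-length encode the grid row by row, so that any single row can be decoded on its own and random access stays cheap at the row level (names and layout are illustrative):

```python
# Row-wise run-length encoding sketch for terrain codes with random access.
from bisect import bisect_right
from itertools import groupby

def rle_encode_row(row):
    """One row of terrain codes -> [(run_length, value), ...]."""
    return [(len(list(g)), v) for v, g in groupby(row)]

def rle_value_at(encoded_row, x):
    """Terrain code at column x of an encoded row."""
    starts, pos = [], 0
    for length, _ in encoded_row:
        starts.append(pos)
        pos += length
    return encoded_row[bisect_right(starts, x) - 1][1]
```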
