I am a little curious about the cute little kaleidoscopic images associated with each user on this site.
How are those generated? Possibilities are:
A list of images is already there in some folder and it is chosen randomly.
The image is generated whenever a user registers.
In any case, I am more interested in what kind of algorithm is used to generate such images.
It's called an Identicon. If you entered an e-mail address, it's based on a hash of that address. If you didn't, it's based on your IP address.
Jeff posted some .NET code to generate IP based Identicons.
It's usually generated from a hash of a user name, email address, or IP address.
Stack Overflow uses Gravatar to do the image generation.
As far as I know the idea came from Don Park, who writes about the technique he uses.
IIRC, it's generated from an IP address.
"IP Hashing" I believe it's called.
I remember reading about it on a blog; he made the code available for download. I have no idea where it was from, however. :(
The images are produced by Gravatar, and details of them are outlined here. However, they do not reveal how they are doing it.
I bet each tiny tile image is given a set of other tile images it looks good with. Think of a graph with the tiles as nodes. You pick a random node for the corner and fill its adjacent spots with partners, then rotate it and apply the same pattern four times. Then pick a color.
Instead of a graph, it could also be a square matrix in which each row represents an image, each column represents an image, and cell values are weights.
I believe the images are a 4×4 grid with the upper 2×2 grid repeated 4 times clockwise, just each time rotated 90 degrees, again clockwise. Seems the two colours are chosen randomly, and each 1×1 block is chosen from a predefined set.
EDIT: obviously my answer was ad hoc. Nice to know about identicons.
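The layout guessed at above (a quadrant derived from a hash, repeated four times with a 90-degree clockwise rotation each time) can be sketched in a few lines of Python. This is only an illustration of the idea, not Don Park's or Gravatar's actual algorithm: the tile set `TILES`, the choice of SHA-256, and the function names are all assumptions made for the example.

```python
import hashlib

TILES = ".+o#"  # hypothetical tile set; each glyph looks the same when rotated


def rotate_cw(block):
    """Rotate a square list-of-lists 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]


def identicon_grid(identity, n=2):
    """Build a 2n x 2n grid of tile indices plus a colour from a hash of `identity`."""
    digest = hashlib.sha256(identity.encode("utf-8")).digest()
    # Fill the n x n top-left quadrant from successive hash bytes.
    top_left = [[digest[r * n + c] % len(TILES) for c in range(n)] for r in range(n)]
    # Each following quadrant (going clockwise) is the previous one rotated.
    top_right = rotate_cw(top_left)
    bottom_right = rotate_cw(top_right)
    bottom_left = rotate_cw(bottom_right)
    top = [a + b for a, b in zip(top_left, top_right)]
    bottom = [a + b for a, b in zip(bottom_left, bottom_right)]
    # The last three hash bytes are used as an RGB foreground colour.
    colour = tuple(digest[-3:])
    return top + bottom, colour


def render(grid):
    """Turn a grid of tile indices into printable text."""
    return "\n".join("".join(TILES[i] for i in row) for row in grid)


if __name__ == "__main__":
    grid, colour = identicon_grid("user@example.com")
    print(render(grid))
    print("colour:", colour)
```

Because the same input always hashes to the same bytes, the same e-mail or IP address always yields the same picture, which is the property that makes these images usable as avatars.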
Try this: http://www.docuverse.com/blog/9block?code=(32-bit integer)8&size=(16|32|64)
substituting appropriate numbers for the parenthesized items.
I want to detect digits on a display. To do that I am using a custom dataset with 19 classes. The chosen model is YOLOv5x, at a resolution of 640x640. Some of the objects are:
0-9 digits
Some text as objects
Total --> 17 classes
I am having problems detecting all the digits in numbers such as 23, 28, or 22. If the digits are very close to each other, the model struggles.
I am using Roboflow to create different folders, to which I add some preprocessing so that I have full control over what goes into the model. Everything is checked and placed in a new folder called TRAIN_BASE. In total I have 3500 images with digits, and most of the variance is in hue and brightness.
Any advice on making the model catch all the digits even when they are too close to each other?
Here are the steps I followed:
First of all, using mosaic augmentation was not a good choice for detecting digits on a display, because in a real scenario I was never going to find pieces of digits. It made the model fail to recognize some digits when it was not sure.
example of the digits problem concept
Another big improvement was changing the anchor boxes of the YOLO model to suit small objects. To find out which anchor boxes I needed, adding this argument to train.py in the script provided by Ultralytics is enough to print custom anchors, which you can then add to your custom architecture.
To check which augmentations help and which do not, the next article explains it quite visually.
P.S.: Thanks to the community for the fast response.
I was given a task to process image files, and analyze data on them.
Imagine an exam paper with A, B, C, D answers to fill in (Picture1).
A vision sensor inspects this paper, and saves an image file of it on the computer. I would like to have this image file analyzed (check for the correct filled in circles) and create a document with the results.
With close to no programming skills, I am kind of clueless on how to even start this project. I basically need something to detect if the red circles are filled in or at least have some % of the area filled (Picture2), and the others in the row are not, and give scores accordingly.
I don't know whether this will help, or even whether it is the right answer, but I cannot comment yet.
So basically you should write an application that processes every picture and checks the pixels in the areas that your template marks as correct answers. For each area, store a true/false inspection result, then sum these up to get the score.
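As a rough illustration of that approach, the sketch below treats the image as a 2D list of grayscale values (0 = black, 255 = white) and measures what fraction of each answer circle's bounding box is dark. Everything here is an assumption made for the example: the box coordinates, the `dark_threshold` and `min_fill` values, and the function names. Loading a real image file into such a list would need an imaging library, which is left out.

```python
def fill_ratio(image, box, dark_threshold=128):
    """Fraction of pixels inside `box` that are darker than `dark_threshold`.

    image: 2D list of grayscale values (0-255).
    box: (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    dark = sum(1 for p in pixels if p < dark_threshold)
    return dark / len(pixels)


def score_sheet(image, rows, answer_key, min_fill=0.5):
    """Count the rows where only the correct circle is filled in.

    rows: one dict per question, mapping a choice letter to its box.
    answer_key: the correct letter for each question, in the same order.
    """
    score = 0
    for boxes, correct in zip(rows, answer_key):
        marked = [letter for letter, box in boxes.items()
                  if fill_ratio(image, box) >= min_fill]
        # A row scores only if exactly one circle is marked and it is the right one.
        if marked == [correct]:
            score += 1
    return score


if __name__ == "__main__":
    # Tiny 4x8 white "scan" with a dark 2x2 blob where answer A would be.
    image = [[255] * 8 for _ in range(4)]
    for y in range(2):
        for x in range(2):
            image[y][x] = 0
    rows = [{"A": (0, 0, 2, 2), "B": (0, 2, 2, 4)}]
    print(score_sheet(image, rows, ["A"]))
```

Requiring that the other circles in the row stay below the threshold matches the condition in the question that the remaining answers must not be filled in.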
Maybe this will be helpful: Images and Pixels by Daniel Shiffman
But I also think that, with no programming skills, this task could be very hard to accomplish.
I have a small question which has been baffling me for a while. I have a dataset with interesting features, but some of them are dimensionless quantities. I've tried using z-scores on them, but that made things worse. These are:
Timestamps (Like YYYYMMDDHHMMSSMis) I am getting the last 9 chars from this.
User IDs (Like in a Hash form) How do I extract meaning from them?
IP Addresses (You know what those are). I only extract the first 3 chars.
City (Has an ID like 1,15,72) How do I extract meaning from this?
Region (Same as city) Should I extract meaning from this or just leave it?
The rest of the features are prices, widths and heights, which I understand. Any help or insight would be much appreciated. Thank you.
Timestamps can be transformed into Unix timestamps, which are reasonable natural numbers.
User IDs, cities and regions are nominal values, which have to be encoded somehow. The most common approach is to create as many "dummy" dimensions as there are possible values. So if you have 100 cities, then you create 100 dimensions and put a "1" only in the one representing a particular city (and 0 in the others).
IPs should rather be removed, or transformed into some small group of them (based on the DNS-network identification and nominal to dummy transformation as above).
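A minimal Python sketch of the first two suggestions follows. It assumes the timestamp string is exactly `YYYYMMDDHHMMSS` followed by three millisecond digits and that it is in UTC; neither is confirmed by the question. The function names are made up for the example.

```python
from datetime import datetime, timezone


def to_unix_seconds(stamp):
    """Convert 'YYYYMMDDHHMMSSmmm' (assumed UTC, milliseconds last) to Unix seconds."""
    moment = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return moment.timestamp() + int(stamp[14:17]) / 1000.0


def one_hot(values):
    """Return (categories, rows) with one 0/1 "dummy" column per distinct value."""
    categories = sorted(set(values))
    index = {value: i for i, value in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


if __name__ == "__main__":
    print(to_unix_seconds("20130612103045123"))
    categories, rows = one_hot([1, 15, 72, 15])
    print(categories)
    for row in rows:
        print(row)
```

With many distinct user IDs the dummy encoding produces a very wide, sparse table, so in practice it suits low-cardinality features such as city or region better than raw user hashes.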
This is an idea that came to my mind.
All display devices (screens made of pixels, etc.) have an upper bound on the number of different images they can show.
For example, a 1024*768 display with 32-bit pixels can show only (2^32)^(1024*768) distinct frames before it has to repeat one.
The funny thing is that we could, in principle, pre-generate every film and every window we have ever seen on a screen.
The question here is: can anybody use this abstract idea to create something useful? :D
You're talking about a number of roughly
(2^32)^(1024*768) ~~ ((2^4)^8)^(10^6) ~~ (10^8)^(10^6) = 10^8000000 (more precisely, 2^25165824, which is about 10^7575668).
The number of atoms in the observable universe is about
10^80 // http://en.wikipedia.org/wiki/Observable_universe#Matter_content
I think that there is no way we could pre-generate all the screens in our life.
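To make the size concrete, the number of decimal digits in the frame count can be computed with logarithms, without ever building the number itself. The sketch below is plain arithmetic in Python; the function name is made up for the example.

```python
import math


def frame_count_digits(width, height, bits_per_pixel):
    """Return (total_bits, decimal_digits) for 2 ** (width * height * bits_per_pixel)."""
    total_bits = width * height * bits_per_pixel
    # A power of two 2**n has floor(n * log10(2)) + 1 decimal digits.
    decimal_digits = math.floor(total_bits * math.log10(2)) + 1
    return total_bits, decimal_digits


if __name__ == "__main__":
    total_bits, digits = frame_count_digits(1024, 768, 32)
    print("distinct frames = 2 **", total_bits)
    print("decimal digits  =", digits)
    print("atoms in the observable universe: about 10 ** 80 (81 digits)")
```

The frame count has over seven and a half million digits, against 81 digits for the atom estimate, so even one frame per atom would not come close.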
Let me formulate another question. From a number this big, what can we do to reduce it? How to aggregate similar pictures in order to reduce the complexity?
Another nice question is: what kind of data structure do we need to store all this information? Suppose we reduce the number of similar images to 10^10. What kind of structure can handle so many different kinds of pictures in an efficient way?
So, given some extra information about the scenes you could generate, you might be able to pick out the scenes that no one has ever seen.
If you could take all the pictures on the internet, plus statistics about which were popular or viewed a lot, and then compute all possible screens, you could pick out the ones that were not viewed much.
With some basic rules about image complexity you might be able to come up with images that have not been seen before. For example, 80% flesh tones, coupled with enough variance to show range, might render naked people. :-)
Of course, the computation behind such an idea is vastly beyond our means. 2^32^(1024*768) is in the superexponential range, which is outside the bounds of reality. I tried to compute it in Ruby, and it just died. It would have been fun if it had actually worked. :-)
Say I want to build a check-in aggregator that counts visits across platforms, so that I can know for a given place how many people have checked in there on Foursquare, Gowalla, BrightKite, etc. Is there a good library or set of tools I can use out of the box to associate the venue entries in each service with a unique place identifier of my own?
I basically want a function that can map from a pair of (placename, address, lat/long) tuples to [0,1) confidence that they refer to the same real-world location.
Someone must have done this already, but my google-fu is weak.
Yes, you can submit the two addresses using geocoder.net (assuming you're a .NET developer; you didn't say). It provides a common interface for address verification and geocoding, so you can be reasonably sure that one address equals another.
If you can't get them to standardize and match, you can compute the distance between them and assume they are the same place if it is below a certain threshold.
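The distance fallback can be sketched with the haversine formula, which gives the great-circle distance between two latitude/longitude pairs on a sphere. This is an illustration in Python rather than .NET, and the 50 metre threshold and the function names are assumptions chosen for the example, not recommended values.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def probably_same_place(a, b, max_distance_m=50.0):
    """True if two (lat, lon) pairs lie within `max_distance_m` of each other."""
    return haversine_m(a[0], a[1], b[0], b[1]) <= max_distance_m


if __name__ == "__main__":
    print(probably_same_place((40.0, -74.0), (40.0001, -74.0)))  # about 11 m apart
    print(probably_same_place((40.0, -74.0), (41.0, -74.0)))     # about 111 km apart
```

A sensible threshold depends on how dense the venues are: a value that works for rural landmarks will merge neighbouring shops on a city street.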
I'm pessimistic that such a tool is already available.
A good solution for matching pairs, based on the entity resolution literature, would be to
get the place names, then define and use a good distance function on them (e.g. edit distance),
get the addresses, standardize them (e.g. with the geocoder.net tools mentioned above), and also define a distance between them,
get the coordinates and compute a distance (this is easy: there are lots of libraries and tools for geographic distance calculations, and that seems to be a good metric),
turn the distances into probabilities ("what is the probability of such a distance, if we suppose these are the same place?") (not straightforward),
and combine the probabilities (also not straightforward).
Then a closure-like algorithm (closing the set by merging pairs above a given probability threshold) may also help to find all the matches (for example when different names accumulate for a given venue).
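The closure step can be sketched with a union-find structure: every pair scoring above the threshold is merged, so venues end up in the same group even when they are only linked through an intermediate entry. The sketch below is a simplification that scores pairs on name similarity alone, using the standard library's `difflib.SequenceMatcher` as a stand-in for a proper edit distance. Turning distances into probabilities and combining them, the hard part noted above, is not attempted, and the 0.8 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher
from itertools import combinations


def name_similarity(a, b):
    """Case-insensitive similarity in [0, 1]; a stand-in for a real match probability."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_venues(names, threshold=0.8, similarity=name_similarity):
    """Group names by the transitive closure of 'similarity >= threshold'."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(names)), 2):
        if similarity(names[i], names[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(groups.values())


if __name__ == "__main__":
    print(cluster_venues(["Joe's Pizza", "Joes Pizza", "City Museum"]))
```

The `similarity` argument accepts any pairwise scoring function, so a combined name, address and coordinate score could be plugged in without touching the clustering code.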
It wouldn't be a bad tool or service however.