Neural Networks and Image Processing to Shoot Caterpillars w/ Lasers - image-processing

I am somewhat of an amateur farmer and I have a precious cherry tomato plant growing in a pot. Lately, to my chagrin, I have discovered that my precious plant has been the victim of a scheme perpetrated by the evil Manduca Quinquemaculata - also known as the Tomato Hornworm (http://insects.tamu.edu/images/insects/common/images/cd-43-c-txt/cimg308.html).
While smashing the last worm I saw, I thought to myself, if I were to use a webcam connected to my computer with a program running, would it be possible to use some kind of an application to monitor my precious plant? These pests are supremely camouflaged and very difficult for my naive eyes to detect.
I've seen research using artificial neural networks (ANNs) for all sorts of things such as recognizing people's faces, etc., and so maybe it would be possible to locate the pest with an ANN.
I have several questions though that I would like some suggestions though.
1) Is there a ranking of the different ANNs in terms of how good they are at classifying? Are multilayer perceptrons known to be better than Hopfields? Or is this a question to which the answer is unknown?
2) Why do there exist several different activation functions that can be used in ANNs? Sigmoids, hyperbolic tangents, step functions, etc. How would one know which function to choose?
3) If I had an image of a plant w/ a worm on one of the branches, I think that I could train a neural network to look for branches that are thin, get fat for a short period, and then get thin again. I have a problem though with branches crossing all over the place. Is there a preprocessing step that could be applied on an image to distinguish between foreground and background elements? I would want to isolate individual branches to run through the network one at a time. Is there some kind of nice transformation algorithm?
Any good pointers on pattern recognition and image processing such as books or articles would be much appreciated too.
Sincerely,
mj
Tomato Hornworms were harmed during the writing of this email.

A good rule of thumb for machine learning is: better features beat better algorithms. I.e if you feed the raw image pixels directly into your classifier, the results will be poor, no matter what learning algorithm you use. If you preprocess the image and extract features that are highly correlated with "caterpillar presence", then most algorithms will do a decent job.
So don't focus on the network topology, start with the computer vision task.

Do these little suckers move around regularly? If so, and if the plant is quite static (meaning no wind or other forces that make it move), then a simple filter to find movement could be sufficient. That would bypass the need of any learning algorithm, which are often quite difficult to train and implement.

Related

Image Recognition Techniques for Identifying Logos (Brands)

I started learning Image Recognition a few days back and I would like to do a project in which it need to identify different brand logos in Android.
For Ex: If I take a picture of Nike logo in an Android device then it needs to display "Nike".
Low computational time would be the main criteria for me.
For this, I have done some work and started learning OpenCV sample examples.
What would be the best Image Recognition that would be used for me.
1) I came to know from Template Matching that their applicability is limited mostly by the available computational power, as identification of big and complex templates can be time consuming. (and so I don't want to use it)
2) Feature Based detectors like SIFT/SURF/STAR (As per my knowledge this would be a better option for me)
3) How about Deep Learning and Pattern recognition concepts? (I was digging on this and don't know whether it would be an option for me). Can any of you let me know whether I can use this and why it would be an better choice for me when compared with 1 and 2.
4) Haar caascade classifiers (From one of the posts in SO, I came to know that by using Haar it doesn't work in Rotation and Scale invariant and so I haven't concentrated much on this). Does this been a better Option for me If I focus up on.
I’m now running one of my pet projects and it's required face recognition – detecting the area with face on the photo, if it exists with Raspberry pi, so I’ve done some analysis about that task
And I found this approach. The key idea is in avoiding scanning entire picture to help by scanning windows of different sizes like it was in OpenCV, but by dividing an entire photo into 49 (7x7) squares and train the model not only for detecting of presenting one of classes inside each square, but also for determining the location and size of detecting object
It’s only 49 runs of trained model, so I think it's possible to execute this in less than in a second even on non state-of-the-art smartphones. Anyway, it will be a trade-off between accuracy and performance
About the model
I will use vgg –like model, probably a bit simpler than even vgg11A.
In my case ready dataset already exists. So I can train convolutional network with it
Why deep learning approach is better than 1-3 you mentioned? Because of its higher accuracy for such kind of tasks. It’s practically proven. You could check it in kaggle. Majority of the top models for classification competitions are based on convolutional networks
The only disadvantage for you – probably it would be necessary create your own dataset to train the model
Here is a post that I think can be useful for you: Image Processing: Algorithm Improvement for 'Coca-Cola Can' Recognition. Another one: Logo recognition in images.
2) Feature Based detectors like SIFT/SURF/STAR (As per my knowledge
this would be a better option for me)
Just remember that SIFT and SURF are both patented so you will need a license for any commercial use (free for non-commercial use).
4) Haar caascade classifiers (From one of the posts in SO, I came to know that by using Haar it doesn't work in Rotation and Scale invariant and so I haven't concentrated much on this). Does this been a better Option for me If I focus up on.
It works (if I understand your question right), much of this depends of how you trained your classifier. You could train it to detect all kind of rotations and scales. Anyways, I would discourage you to go for this option as I think the other possible solutions are better meant for the case.

Online machine learning for obstacle crossing or bypassing

I want to program a robot which will sense obstacles and learn whether to cross over them or bypass around them.
Since my project, must be realized in week and a half period, I must use an online learning algorithm (GA or such would take a lot time to test because robot needs to try to cross over the obstacle in order to determine is it possible to cross).
I'm really new to online learning so I don't really know which online learning algorithm to use.
It would be a great help if someone could recommend me a few algorithms that would be the best for my problem and some link with examples wouldn't hurt.
Thanks!
I think you could start with A* (A-Star)
It's simple and robust, and widely used.
There are some nice tutorials on the web like this http://www.raywenderlich.com/4946/introduction-to-a-pathfinding
Online algorithm is just the one that can collect new data and update a model incrementally without re-training with full dataset (i.e. it may be used in online service that works all the time). What you are probably looking for is reinforcement learning.
RL itself is not a method, but rather general approach to the problem. Many concrete methods may be used with it. Neural networks have been proved to do well in this field (useful course). See, for example, this paper.
However, to create real robot being able to bypass obstacles you will need much then just knowing about neural networks. You will need to set up sensors carefully, preprocess data from them, work out your model and collect a dataset. Not sure it's possible to even learn it all in a week and a half.

Training the algorithm for better image recognition

This is a research question not a direct programming question.
I am working on a symbol recognition algorithm, What the software currently does, it takes an image, divide it into contours (blobs) and start matching each contour with a list of predefined templates. Then for each contour it takes the one that has the highest match rate.
The algorithm is doing fairely however I need to train it better. What I mean is this:
I want to use a machine learning algorithm that will train the algorithm to have better matching. So lets take an example:
I run the recognition on a symbol, the algorithm will run and find that this symbol is a car, then I have to confirm that result (maybe by clicking on "Yes" or "No") the algorithm should learn from that. So if I click on NO the algorithm should learn that this is not a car and will have better result next time (maybe try to match something else). while if i click on YES he will know that he was correct and next time he will perform better when searching for a car.
This is the concept I am trying to research. I need documents or algorithm that can achieve this sort of things. I am not looking for implementations or programming, just concept or researches.
I have done many researches and read a lot about machine learning, neural networks, decision trees.... but i was not able to know how can I use any in my scenarion.
I hope I was clear and this type of question is allowed on stack overflow. if not I'm sorry
Thanks a lot for any help or tip
Image recognition is still a challenge in the community. What you described in your process of manually clicking yes/no is just creating labeled data. Since this is a very broad area, I will just point you to a few links that might be useful.
To get start, you might want to use some existing image databases instead of creating your own, which saves you a lot of effort. e.g., this car dataset in UCIC image db.
Since you already have the background of machine learning, you can take a look at some survey paper that exactly match your project interests, e.g., search object recognition survey paper or feature extraction car in google.
Then you can dive into some good papers and see whether they are suitable for your project. For example, you can check the two papers below that linked with the UCIC image db.
Shivani Agarwal, Aatif Awan, and Dan Roth,
Learning to detect objects in images via a sparse, part-based representation.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(11):1475-1490, 2004.
Shivani Agarwal and Dan Roth,
Learning a sparse representation for object detection.
In Proceedings of the Seventh European Conference on Computer Vision, Part IV, pages 113-130, Copenhagen, Denmark, 2002.
Also check for some implemented softwares instead of starting from scratch, in your case, opencv should be good one to start with.
For image recognition, feature extraction is one of the most important step. You might want to check some stat-of-the-art algorithms in the community. (SIFT, mean-shift, harr features etc).
Boosting algorithm might also be useful when you reach the classification step. I see a lot of scholars mention this in image recognition community.
As #nickbar suggest, discuss more at https://stats.stackexchange.com/

What's the best approach to recognize patterns in data, and what's the best way to learn more on the topic?

A developer I am working with is developing a program that analyzes images of pavement to find cracks in the pavement. For every crack his program finds, it produces an entry in a file that tells me which pixels make up that particular crack. There are two problems with his software though:
1) It produces several false positives
2) If he finds a crack, he only finds small sections of it and denotes those sections as being separate cracks.
My job is to write software that will read this data, analyze it, and tell the difference between false-positives and actual cracks. I also need to determine how to group together all the small sections of a crack as one.
I have tried various ways of filtering the data to eliminate false-positives, and have been using neural networks to a limited degree of success to group cracks together. I understand there will be error, but as of now, there is just too much error. Does anyone have any insight for a non-AI expert as to the best way to accomplish my task or learn more about it? What kinds of books should I read, or what kind of classes should I take?
EDIT My question is more about how to notice patterns in my coworker's data and identify those patterns as actual cracks. It's the higher-level logic that I'm concerned with, not so much the low-level logic.
EDIT In all actuality, it would take AT LEAST 20 sample images to give an accurate representation of the data I'm working with. It varies a lot. But I do have a sample here, here, and here. These images have already been processed by my coworker's process. The red, blue, and green data is what I have to classify (red stands for dark crack, blue stands for light crack, and green stands for a wide/sealed crack).
In addition to the useful comments about image processing, it also sounds like you're dealing with a clustering problem.
Clustering algorithms come from the machine learning literature, specifically unsupervised learning. As the name implies, the basic idea is to try to identify natural clusters of data points within some large set of data.
For example, the picture below shows how a clustering algorithm might group a bunch of points into 7 clusters (indicated by circles and color):
(source: natekohl.net)
In your case, a clustering algorithm would attempt to repeatedly merge small cracks to form larger cracks, until some stopping criteria is met. The end result would be a smaller set of joined cracks. Of course, cracks are a little different than two-dimensional points -- part of the trick in getting a clustering algorithm to work here will be defining a useful distance metric between two cracks.
Popular clustering algorithms include k-means clustering (demo) and hierarchical clustering. That second link also has a nice step-by-step explanation of how k-means works.
EDIT: This paper by some engineers at Phillips looks relevant to what you're trying to do:
Chenn-Jung Huang, Chua-Chin Wang, Chi-Feng Wu, "Image Processing Techniques for Wafer Defect Cluster Identification," IEEE Design and Test of Computers, vol. 19, no. 2, pp. 44-48, March/April, 2002.
They're doing a visual inspection for defects on silicon wafers, and use a median filter to remove noise before using a nearest-neighbor clustering algorithm to detect the defects.
Here are some related papers/books that they cite that might be useful:
M. Taubenlatt and J. Batchelder, “Patterned Wafer Inspection Using Spatial Filtering for Cluster Environment,” Applied Optics, vol. 31, no. 17, June 1992, pp. 3354-3362.
F.-L. Chen and S.-F. Liu, “A Neural-Network Approach to Recognize Defect Spatial Pattern in Semiconductor Fabrication.” IEEE Trans. Semiconductor Manufacturing, vol. 13, no. 3, Aug. 2000, pp. 366-373.
G. Earl, R. Johnsonbaugh, and S. Jost, Pattern Recognition and Image Analysis, Prentice Hall, Upper Saddle River, N.J., 1996.
Your problem falls in the very broad field of image classification. These types of problems can be notoriously difficult, and at the end of the day, solving them is an art. You must exploit every piece of knowledge you have about the problem domain to make it tractable.
One fundamental issue is normalization. You want to have similarly classified objects to be as similar as possible in their data representation. For example, if you have an image of the cracks, do all images have the same orientation? If not, then rotating the image may help in your classification. Similarly, scaling and translation (refer to this)
You also want to remove as much irrelevant data as possible from your training sets. Rather than directly working on the image, perhaps you could use edge extraction (for example Canny edge detection). This will remove all the 'noise' from the image, leaving only the edges. The exercise is then reduced to identifying which edges are the cracks and which are the natural pavement.
If you want to fast track to a solution then I suggest you first try the your luck with a Convolutional Neural Net, which can perform pretty good image classification with a minimum of preprocessing and noramlization. Its pretty well known in handwriting recognition, and might be just right for what you're doing.
I'm a bit confused by the way you've chosen to break down the problem. If your coworker isn't identifying complete cracks, and that's the spec, then that makes it your problem. But if you manage to stitch all the cracks together, and avoid his false positives, then haven't you just done his job?
That aside, I think this is an edge detection problem rather than a classification problem. If the edge detector is good, then your issues go away.
If you are still set on classification, then you are going to need a training set with known answers, since you need a way to quantify what differentiates a false positive from a real crack. However I still think it is unlikely that your classifier will be able to connect the cracks, since these are specific to each individual paving slab.
I have to agree with ire_and_curses, once you dive into the realm of edge detection to patch your co-developers crack detection, and remove his false positives, it seems as if you would be doing his job. If you can patch what his software did not detect, and remove his false positives around what he has given you. It seems like you would be able to do this for the full image.
If the spec is for him to detect the cracks, and you classify them, then it's his job to do the edge detection and remove false positives. And your job to take what he has given you and classify what type of crack it is. If you have to do edge detection to do that, then it sounds like you are not far from putting your co-developer out of work.
There are some very good answers here. But if you are unable to solve the problem, you may consider Mechanical Turk. In some cases it can be very cost-effective for stubborn problems. I know people who use it for all kinds of things like this (verification that a human can do easily but proves hard to code).
https://www.mturk.com/mturk/welcome
I am no expert by any means, but try looking at Haar Cascades. You may also wish to experiment with the OpenCV toolkit. These two things together do face detection and other object-detection tasks.
You may have to do "training" to develop a Haar Cascade for cracks in pavement.
What’s the best approach to recognize patterns in data, and what’s the best way to learn more on the topic?
The best approach is to study pattern recognition and machine learning. I would start with Duda's Pattern Classification and use Bishop's Pattern Recognition and Machine Learning as reference. It would take a good while for the material to sink in, but getting basic sense of pattern recognition and major approaches of classification problem should give you the direction. I can sit here and make some assumptions about your data, but honestly you probably have the best idea about the data set since you've been dealing with it more than anyone. Some of the useful technique for instance could be support vector machine and boosting.
Edit: An interesting application of boosting is real-time face detection. See Viola/Jones's Rapid Object Detection using a Boosted Cascade of Simple
Features (pdf). Also, looking at the sample images, I'd say you should try improving the edge detection a bit. Maybe smoothing the image with Gaussian and running more aggressive edge detection can increase detection of smaller cracks.
I suggest you pick up any image processing textbook and read on the subject.
Particularly, you might be interested in Morphological Operations like Dilation and Erosion‎, which complements the job of an edge detector. Plenty of materials on the net...
This is an image processing problem. There are lots of books written on the subject, and much of the material in these books will go beyond a line-detection problem like this. Here is the outline of one technique that would work for the problem.
When you find a crack, you find some pixels that make up the crack. Edge detection filters or other edge detection methods can be used for this.
Start with one (any) pixel in a crack, then "follow" it to make a multipoint line out of the crack -- save the points that make up the line. You can remove some intermediate points if they lie close to a straight line. Do this with all the crack pixels. If you have a star-shaped crack, don't worry about it. Just follow the pixels in one (or two) directions to make up a line, then remove these pixels from the set of crack pixels. The other legs of the star will recognized as separate lines (for now).
You might perform some thinning on the crack pixels before step 1. In other words, check the neighbors of the pixels, and if there are too many then ignore that pixel. (This is a simplification -- you can find several algorithms for this.) Another preprocessing step might be to remove all the lines that are too thin or two faint. This might help with the false positives.
Now you have a lot of short, multipoint lines. For the endpoints of each line, find the nearest line. If the lines are within a tolerance, then "connect" the lines -- link them or add them to the same structure or array. This way, you can connect the close cracks, which would likely be the same crack in the concrete.
It seems like no matter the algorithm, some parameter adjustment will be necessary for good performance. Write it so it's easy to make minor changes in things like intensity thresholds, minimum and maximum thickness, etc.
Depending on the usage environment, you might want to allow user judgement do determine the questionable cases, and/or allow a user to review the all the cracks and click to combine, split or remove detected cracks.
You got some very good answer, esp. #Nate's, and all the links and books suggested are worthwhile. However, I'm surprised nobody suggested the one book that would have been my top pick -- O'Reilly's Programming Collective Intelligence. The title may not seem germane to your question, but, believe me, the contents are: one of the most practical, programmer-oriented coverage of data mining and "machine learning" I've ever seen. Give it a spin!-)
It sounds a little like a problem there is in Rock Mechanics, where there are joints in a rock mass and these joints have to be grouped into 'sets' by orientation, length and other properties. In this instance one method that works well is clustering, although classical K-means does seem to have a few problems which I have addressed in the past using a genetic algorithm to run the interative solution.
In this instance I suspect it might not work quite the same way. In this case I suspect that you need to create your groups to start with i.e. longitudinal, transverse etc. and define exactly what the behviour of each group is i.e. can a single longitudinal crack branch part way along it's length, and if it does what does that do to it's classification.
Once you have that then for each crack, I would generate a random crack or pattern of cracks based on the classification you have created. You can then use something like a least squares approach to see how closely the crack you are checking fits against the random crack / cracks you have generated. You can repeat this analysis many times in the manner of a Monte-Carlo analysis to identify which of the randomly generated crack / cracks best fits the one you are checking.
To then deal with the false positives you will need to create a pattern for each of the different types of false positives i.e. the edge of a kerb is a straight line. You will then be able to run the analysis picking out which is the most likely group for each crack you analyse.
Finally, you will need to 'tweak' the definition of different crack types to try and get a better result. I guess this could either use an automated approach or a manual approach depending on how you define your different crack types.
One other modification that sometimes helps when I'm doing problems like this is to have a random group. By tweaking the sensitivity of a random group i.e. how more or less likely a crack is to be included in the random group, you can sometimes adjust the sensitivty of the model to complex patterns that don't really fit anywhere.
Good luck, looks to me like you have a real challenge.
You should read about data mining, specially pattern mining.
Data mining is the process of extracting patterns from data. As more data are gathered, with the amount of data doubling every three years, data mining is becoming an increasingly important tool to transform these data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery.
A good book on the subject is Data Mining: Practical Machine Learning Tools and Techniques
(source: waikato.ac.nz) ](http://www.amazon.com/Data-Mining-Ian-H-Witten/dp/3446215336 "ISBN 0-12-088407-0")
Basically what you have to do is apply statistical tools and methodologies to your datasets. The most used comparison methodologies are Student's t-test and the Chi squared test, to see if two unrelated variables are related with some confidence.

Best approach to what I think is a machine learning problem [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
I am wanting some expert guidance here on what the best approach is for me to solve a problem. I have investigated some machine learning, neural networks, and stuff like that. I've investigated weka, some sort of baesian solution.. R.. several different things. I'm not sure how to really proceed, though. Here's my problem.
I have, or will have, a large collection of events.. eventually around 100,000 or so. Each event consists of several (30-50) independent variables, and 1 dependent variable that I care about. Some independent variables are more important than others in determining the dependent variable's value. And, these events are time relevant. Things that occur today are more important than events that occurred 10 years ago.
I'd like to be able to feed some sort of learning engine an event, and have it predict the dependent variable. Then, knowing the real answer for the dependent variable for this event (and all the events that have come along before), I'd like for that to train subsequent guesses.
Once I have an idea of what programming direction to go, I can do the research and figure out how to turn my idea into code. But my background is in parallel programming and not stuff like this, so I'd love to have some suggestions and guidance on this.
Thanks!
Edit: Here's a bit more detail about the problem that I'm trying to solve: It's a pricing problem. Let's say that I'm wanting to predict prices for a random comic book. Price is the only thing I care about. But there are lots of independent variables one could come up with. Is it a Superman comic, or a Hello Kitty comic. How old is it? What's the condition? etc etc. After training for a while, I want to be able to give it information about a comic book I might be considering, and have it give me a reasonable expected value for the comic book. OK. So comic books might be a bogus example. But you get the general idea. So far, from the answers, I'm doing some research on Support vector machines and Naive Bayes. Thanks for all of your help so far.
Sounds like you're a candidate for Support Vector Machines.
Go get libsvm. Read "A practical guide to SVM classification", which they distribute, and is short.
Basically, you're going to take your events, and format them like:
dv1 1:iv1_1 2:iv1_2 3:iv1_3 4:iv1_4 ...
dv2 1:iv2_1 2:iv2_2 3:iv2_3 4:iv2_4 ...
run it through their svm-scale utility, and then use their grid.py script to search for appropriate kernel parameters. The learning algorithm should be able to figure out differing importance of variables, though you might be able to weight things as well. If you think time will be useful, just add time as another independent variable (feature) for the training algorithm to use.
If libsvm can't quite get the accuracy you'd like, consider stepping up to SVMlight. Only ever so slightly harder to deal with, and a lot more options.
Bishop's Pattern Recognition and Machine Learning is probably the first textbook to look to for details on what libsvm and SVMlight are actually doing with your data.
If you have some classified data - a bunch of sample problems paired with their correct answers -, start by training some simple algorithms like K-Nearest-Neighbor and Perceptron and seeing if anything meaningful comes out of it. Don't bother trying to solve it optimally until you know if you can solve it simply or at all.
If you don't have any classified data, or not very much of it, start researching unsupervised learning algorithms.
It sounds like any kind of classifier should work for this problem: find the best class (your dependent variable) for an instance (your events). A simple starting point might be Naive Bayes classification.
This is definitely a machine learning problem. Weka is an excellent choice if you know Java and want a nice GPL lib where all you have to do is select the classifier and write some glue. R is probably not going to cut it for that many instances (events, as you termed it) because it's pretty slow. Furthermore, in R you still need to find or write machine learning libs, though this should be easy given that it's a statistical language.
If you believe that your features (independent variables) are conditionally independent (meaning, independent given the dependent variable), naive Bayes is the perfect classifier, as it is fast, interpretable, accurate and easy to implement. However, with 100,000 instances and only 30-50 features you can likely implement a fairly complex classification scheme that captures a lot of the dependency structure in your data. Your best bet would probably be a support vector machine (SMO in Weka) or a random forest (Yes, it's a silly name, but it helped random forest catch on.) If you want the advantage of easy interpretability of your classifier even at the expense of some accuracy, maybe a straight up J48 decision tree would work. I'd recommend against neural nets, as they're really slow and don't usually work any better in practice than SVMs and random forest.
The book Programming Collective Intelligence has a worked example with source code of a price predictor for laptops which would probably be a good starting point for you.
SVM's are often the best classifier available. It all depends on your problem and your data. For some problems other machine learning algorithms might be better. I have seen problems that neural networks (specifically recurrent neural networks) were better at solving. There is no right answer to this question since it is highly situationally dependent but I agree with dsimcha and Jay that SVM's are the right place to start.
I believe your problem is a regression problem, not a classification problem. The main difference: In classification we are trying to learn the value of a discrete variable, while in regression we are trying to learn the value of a continuous one. The techniques involved may be similar, but the details are different. Linear Regression is what most people try first. There are lots of other regression techniques, if linear regression doesn't do the trick.
You mentioned that you have 30-50 independent variables, and some are more important that the rest. So, assuming that you have historical data (or what we called a training set), you can use PCA (Principal Componenta Analysis) or other dimensionality reduction methods to reduce the number of independent variables. This step is of course optional. Depending on situations, you may get better results by keeping every variables, but add a weight to each one of them based on relevant they are. Here, PCA can help you to compute how "relevant" the variable is.
You also mentioned that events that are occured more recently should be more important. If that's the case, you can weight the recent event higher and the older event lower. Note that the importance of the event doesn't have to grow linearly accoding to time. It may makes more sense if it grow exponentially, so you can play with the numbers here. Or, if you are not lacking of training data, perhaps you can considered dropping off data that are too old.
Like Yuval F said, this does look more like a regression problem rather than a classification problem. Therefore, you can try SVR (Support Vector Regression), which is regression version of SVM (Support Vector Machine).
some other stuff you can try are:
Play around with how you scale the value range of your independent variables. Say, usually [-1...1] or [0...1]. But you can try other ranges to see if they help. Sometimes they do. Most of the time they don't.
If you suspect that there are "hidden" feature vector with a lower dimension, say N << 30 and it's non-linear in nature, you will need non-linear dimensionality reduction. You can read up on kernel PCA or more recently, manifold sculpting.
What you described is a classic classification problem. And in my opinion, why code fresh algorithms at all when you have a tool like Weka around. If I were you, I would run through a list of supervised learning algorithms (I don't completely understand whey people are suggesting unsupervised learning first when this is so clearly a classification problem) using 10-fold (or k-fold) cross validation, which is the default in Weka if I remember, and see what results you get! I would try:
-Neural Nets
-SVMs
-Decision Trees (this one worked really well for me when I was doing a similar problem)
-Boosting with Decision trees/stumps
-Anything else!
Weka makes things so easy and you really can get some useful information. I just took a machine learning class and I did exactly what you're trying to do with the algorithms above, so I know where you're at. For me the boosting with decision stumps worked amazingly well. (BTW, boosting is actually a meta-algorithm and can be applied to most supervised learning algs to usually enhance their results.)
A nice thing aobut using Decision Trees (if you use the ID3 or similar variety) is that it chooses the attributes to split on in order of how well they differientiate the data - in other words, which attributes determine the classification the quickest basically. So you can check out the tree after running the algorithm and see what attribute of a comic book most strongly determines the price - it should be the root of the tree.
Edit: I think Yuval is right, I wasn't paying attention to the problem of discretizing your price value for the classification. However, I don't know if regression is available in Weka, and you can still pretty easily apply classification techniques to this problem. You need to make classes of price values, as in, a number of ranges of prices for the comics, so that you can have a discrete number (like 1 through 10) that represents the price of the comic. Then you can easily run classification it.

Resources