I'm very new to image processing libraries. I've been looking into OpenCV, but I have a question.
What sort of algorithms could I use if I want to identify a few similar objects in a room?
Let's say 3 similar tables.
With a camera I assign an identity to each of those tables. After I move the camera
to a position where the objects are out of sight and then point the camera back
at them, the system should properly identify those objects by their initial IDs and trigger an action based on each ID.
I read about ArUco markers, but I would like to try the idea without having to attach markers.
There are plenty of methods to choose from. You could use image features, color matching, shape matching, pattern matching, and so on. It really depends on the specific use case and the environment. In any case you need something unique to distinguish the tables from each other. Using markers would be one way to artificially create that uniqueness.
Maybe you want to start reading here to get a feel for how one method works:
https://docs.opencv.org/3.4.1/dc/dc3/tutorial_py_matcher.html
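For a first experiment, a minimal feature-matching sketch in Python with OpenCV could look like the following. It assumes you keep one reference photo per table taken when the ID was assigned; the filenames and the match-count threshold are placeholders you would have to tune.

```python
import cv2

# Reference photo taken when the table was assigned its ID, and a new
# camera frame (hypothetical filenames).
reference = cv2.imread("table_1_reference.jpg", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("current_frame.jpg", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors with ORB.
orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_frame, des_frame = orb.detectAndCompute(frame, None)

# Brute-force matching with Hamming distance, filtered with a ratio test.
bf = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = bf.knnMatch(des_ref, des_frame, k=2)
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Crude decision: enough good matches -> probably the same table.
# The threshold of 30 is an arbitrary starting point.
print("table_1 recognised" if len(good) > 30 else "no match")
```

Whether this works at all depends on how visually distinct the tables really are; identical tables will need context (surroundings, position) rather than appearance alone.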
Could you provide an example set of images of the scenario?
I want to recognize 3D objects from models I provide.
There is no way I can scan the objects because they are mounted on & inside aircraft engines...
But we do have the 3D models in FBX, OBJ, and other formats.
Would it be possible to somehow convert these to ARReferenceObjects so we don't have to use Unity (it's working there...), but can use iOS?
If I understand your question correctly, you want the ability to scan the real world for a 3-dimensional object recognised from a 3D model. Apple's ARKit on iOS has the ability to recognise objects in the real world based on 3D objects, though it does need an .arobject file that is created from a scan. Also, it is best suited for small objects that fit on a table, and objects with strong textures are easier to recognise than mono-coloured parts. So this might not be the solution yet. There is also the Vuforia plugin; its overview doesn't mention the need to pre-scan your object.
https://library.vuforia.com/content/vuforia-library/en/features/overview.html
The title is my direct question, but I will elaborate to provide context, or to detail what is needed for a helpful workaround if the answer is: you can't, log a Radar with Apple.
The use case is simple: have a behavior-driven GKAgent3D avoid a plane in my ARKit/SceneKit/GameplayKit app.
This requires I add a 'toAvoid' GKGoal as a behavior of the agent. Apple currently provides two things to avoid: other GKAgents or GKObstacles. The problem I am having is that I see no way to create a GKAgent3D or GKObstacle for use in SceneKit that is not a sphere. GKAgent3D only has a .radius property to define its "occupied space", GKObstacle only has one concrete 3D subclass (GKSphereObstacle), and the obstacles(from:) functions use SpriteKit objects.
I have many agents that all have complex behaviors, and there are many planes I'd like them to avoid (ARKit detected). I would rather not resort to manual collision detection, since the goal is to have the agents alter their behavior-driven path as a result of the object being in the way. It is not enough to just know that the agent is going to hit the object; I need that fact to influence its movement, considering all the other goals it has in its behavior.
I am hoping I am missing something and there is a way to do this, or that someone has a clever workaround. The only workaround I have thought of (but hate for performance reasons) is creating a massive number of small sphere obstacles in a regular array to approximate the surface of the plane.
I'm trying to use OpenCV to implement a feature in my app. Basically, my app allows users to authenticate by using their face. Live video is captured and frames are extracted. Using these extracted images, a model is trained. The next time a user logs in, frames are sent to the model to decide whether this is the authenticated user.
I found this example from the OpenCV site which uses FaceRecognizer. However, it uses an existing dataset with 10 classes (10 persons). In my case, only one class is considered (or we can consider two classes: the authenticated user and unknown users). Could you please suggest a solution?
Thank you.
First of all, I would suggest you look at other methods for face recognition (DNN-based), since the OpenCV FaceRecognizer implementations (e.g. Eigenfaces) are not particularly good.
However, if you want to use it, note that FaceRecognizer::predict has an overload that outputs a "confidence" value. This is the value you would need to look at to decide if the match was right. You'll need to experiment to find your sweet spot between false positives and false negatives.
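As a rough illustration of that idea, here is a Python sketch using the LBPH recognizer from the opencv-contrib package. The training filenames, the single label, and the confidence cutoff are placeholders you would have to replace and tune; it is only meant to show where the confidence value comes in.

```python
import cv2
import numpy as np

# Train on cropped grayscale face images of the single authenticated user
# (placeholder filenames; in practice these come from the captured frames).
user_faces = [cv2.imread(p, cv2.IMREAD_GRAYSCALE)
              for p in ["face_01.png", "face_02.png", "face_03.png"]]
labels = np.array([0] * len(user_faces))  # one class: the authenticated user

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.train(user_faces, labels)

# At login time, predict on a new face crop and look at the confidence.
# For LBPH, lower confidence means a better match; the cutoff of 60 is an
# arbitrary starting point to balance false positives and false negatives.
login_face = cv2.imread("login_face.png", cv2.IMREAD_GRAYSCALE)
label, confidence = recognizer.predict(login_face)
authenticated = (label == 0 and confidence < 60)
print("authenticated" if authenticated else "rejected")
```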
I was searching on Google and Stack Overflow to see if anyone has a solution for my problem, but I didn't find anyone with the same problem.
So, currently I'm running a Debian machine with MapServer installed on it. The server also runs a webserver for displaying map data in the browser. The map generation is dynamic: based on the layer definitions in a database, I build the mapfile in PHP, and from that generated mapfile the map is shown to the user. The data is defined in the database and as SHP files (both combined in a single mapfile).
It is fully dynamic. What I mean by that is that the user can enable/disable any of the layers, or click inside a polygon (select some points on the map) and it colors the selection (generates a new mapfile based on the selection and re-generates the tiles).
So the execution of all that code, from selecting some area to coloring the selected items, sometimes takes too much time for a good user experience.
As a solution I'd like to use some kind of temporary tile cache that can be used for a single user, with the ability to delete its contents when the user selects some items on the map or enables/disables one of the layers.
P.S. I already did all the optimizations suggested in the MapServer documentation.
Thanks for any help.
It sounds to me like your problem is not going to be helped by server-side caching. If all of the tiles depend on user selections, then you're going to be generating a bunch of new tiles every time there's an interaction.
I've been using MapCache to solve a similar problem, where I am rendering a tileset in response to a user query. But I've broken up my tiles into multiple logical layers, and I do the compositing on the browser side. This lets me cache the tiles for various queries server-side, and it sped up performance immensely. I did seed the cache down to zoom level 12, and I needed to use the BerkeleyDB cache type to keep from running out of inodes.
I'm using Leaflet.js for the browser-side rendering, but you should also consider OpenLayers.
After looking at the source code, I have some other ideas.
It looks like you're drawing each layer the same way each time. Is that right? That is, the style and predicate of a particular layer never change: each user sees the image for that layer the same way, if they have selected the layer. But the combination of layers you show does change, based on an OpenLayers control? If that's the case, you don't need per-user caching on the server. Instead, use per-layer caching, and let the user's browser handle the client-side caching.
A quick technique for finding slow layers is to turn them all off, then re-enable them one by one to find the culprit. Invoke MapServer from the command line and time the runs, for greater precision than you'll get by running it from your webserver.
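If you have the MapServer Python MapScript bindings available, a small script along these lines can do that timing per layer. This is only a sketch under assumptions about your setup; the mapfile path is a placeholder.

```python
import time
import mapscript

# Load the generated mapfile (placeholder path).
m = mapscript.mapObj("/var/www/maps/generated.map")

# Turn every layer off, then time each one in isolation.
for i in range(m.numlayers):
    m.getLayer(i).status = mapscript.MS_OFF

for i in range(m.numlayers):
    layer = m.getLayer(i)
    layer.status = mapscript.MS_ON
    start = time.perf_counter()
    m.draw()  # render the map with only this layer enabled
    print(f"{layer.name}: {time.perf_counter() - start:.3f} s")
    layer.status = mapscript.MS_OFF
```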
You mentioned you're serving the images in Google 3857 while the layers are in Gauss-Kruger/EPSG 3912. Reprojecting this on the fly is expensive. Reprojecting the rasters on the fly is very expensive. If you can, you should reproject them ahead of time, and store them in 3857 (add an additional geometry column).
I don't know what a DOF file is--maybe Digital Obstacle File? Perhaps preload the DOF file into PostGIS too? That would eliminate the two pieces you think are problematic.
Take a look at the SQL queries that PostGIS is performing, and make sure those are using indexes.
In any case, these individual layers should go into MapCache, in my opinion. Here is a video of a September 2014 talk by the MapCache project leader.
As the question states: how is it possible to process a dynamic video stream? By dynamic, I actually mean that I would like to process what is on my screen, so the image array should be some sort of "continuous screenshot".
I'd like to process the video / images based on certain patterns. How would I go about this?
It would be perfect if there already were (and there probably are) existing components. I need to be able to use the location of the matches (or partial matches). A .NET component for the different requirements could also be useful, I guess...
You will probably need to read up on computer vision before you attempt this. There is nothing really special about video that separates it from still images. The process you might want to look at is (a rough code sketch of these steps follows below):
Acquire the data
Split the data into individual frames
Remove noise (Use a Gaussian filter)
Segment the image into the sections you want
Extract the connected components of the image
Find a way to quantize the image for comparison
Store/match the components to a database of previously found components
With this database/datastore you'll have information on the matches found. Do what you like with it.
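To make the middle steps concrete, here is a rough Python/OpenCV illustration of noise removal, segmentation, connected components, and quantization on a single frame. A "continuous screenshot" would just feed such frames in a loop from a screen-capture library; the filename and threshold values are placeholders.

```python
import cv2
import numpy as np

# One frame of the "continuous screenshot" (placeholder filename).
frame = cv2.imread("screen_frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Remove noise with a Gaussian filter.
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Segment into foreground/background with an Otsu threshold.
_, mask = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Extract connected components and keep their bounding boxes.
count, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
components = []
for i in range(1, count):  # label 0 is the background
    x, y, w, h, area = stats[i]
    if area > 100:  # arbitrary size filter to drop speckle
        components.append((x, y, w, h))

# "Quantize" each component as a small fixed-size grayscale patch that
# could later be matched against previously stored components.
patches = [cv2.resize(gray[y:y + h, x:x + w], (32, 32))
           for (x, y, w, h) in components]
print(f"found {len(patches)} candidate components")
```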
As far as software goes:
Most of these algorithms are not too difficult. You can write them yourself. They do take a bit of work though.
OpenCV does a lot of the basic stuff, but it won't do everything for you.
Java: JAI, JHLabs [for filters], Various other 3rd party libraries
C#: AForge.net