OpenCV tracking people from overhead view - image-processing

I have a broad but interesting OpenCV question and I'm wondering where to start.
I am looking for any strategies or white papers that might help.
I need to get the position of people sitting at a conference table from a fixed overhead view. Ideally, I will assign a persistent ID to each person, and maintain a list of people with ID and coordinates. This problem could be easy in a specific case - for example, if designed for a single conference room table - but it gets harder in the general case, especially with people entering and leaving the scene.
My first question: is it a detection or a motion tracking problem? Or some combination of the two?

Well, it seems like both to me. I would think you would need to take a long-running average of the visible area, which becomes the background. Then, based on that background model, you can track the movement of other objects.
Assigning an ID may become difficult if objects merge (at least as far as the camera is concerned) and then separate again, say, someone removing a hat, placing it down, and putting it back on.
With all that in mind, it is possible, even if it presents a challenge. I once saw a project tracking people in a train station using a similar approach (it was in a lecture, so I can't provide a link, sorry).
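To make the ID-assignment part concrete, here is a rough sketch of a nearest-centroid tracker in Swift (this is not from that project; it assumes your background subtraction and blob extraction already give you one centroid per detected person per frame):

import CoreGraphics

// One tracked person: a persistent ID plus the last known position in the overhead view.
struct TrackedPerson {
    let id: Int
    var position: CGPoint
}

// Minimal nearest-centroid tracker. Detection is assumed to happen elsewhere
// (e.g. background subtraction + blob extraction producing one point per person).
final class CentroidTracker {
    private(set) var people: [TrackedPerson] = []
    private var nextID = 0
    private let maxJump: CGFloat  // farthest a person can plausibly move between frames

    init(maxJump: CGFloat = 80) {
        self.maxJump = maxJump
    }

    private func distance(_ a: CGPoint, _ b: CGPoint) -> CGFloat {
        let dx = a.x - b.x, dy = a.y - b.y
        return (dx * dx + dy * dy).squareRoot()
    }

    func update(with detections: [CGPoint]) {
        var unmatched = detections
        var updated: [TrackedPerson] = []

        // Greedily match each known person to the nearest new detection.
        for person in people {
            let nearest = unmatched.enumerated()
                .min { distance($0.element, person.position) < distance($1.element, person.position) }
            guard let (index, point) = nearest, distance(point, person.position) < maxJump else {
                continue  // no plausible match: the person probably left the scene
            }
            updated.append(TrackedPerson(id: person.id, position: point))
            unmatched.remove(at: index)
        }

        // Any detection left over is treated as a new person entering the scene.
        for point in unmatched {
            updated.append(TrackedPerson(id: nextID, position: point))
            nextID += 1
        }
        people = updated
    }
}

A greedy nearest-match like this will mis-assign IDs in exactly the merge-and-separate cases mentioned above, so treat it as a starting point rather than a solution.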


SceneKit objects not showing up in ARSCNView about 1 out of 100 times

I can't share much code because it's proprietary, but this is a bug that's been haunting me for a while. We have SceneKit geometry added to the ARKit face node and displayed inside an ARSCNView. It works perfectly almost all of the time, but about 1 in 100 times, nothing shows up at all. The ARSession is running, and none of the parent nodes are set to hidden. Further, when I look at the Debug Memory Graph function in Xcode, the geometry appears to be entirely present there (and doesn't seem to be set to hidden). I can see all the nodes attached to the face node perfectly within the ARSCNView in the memory graph, but on the screen, nothing shows up. This has been an issue for multiple iOS versions, so it didn't just appear with a recent update.
Has anybody run into a similar problem, or does anybody have any ideas to look into? Is it an Apple bug, or is there a timing issue I might not be aware of? It's been really hard to debug because of how infrequent it is, and I haven't found it discussed on any other forums (but point me in the right direction if there is a previous discussion). Thanks!
This is a pretty common occurrence when AR tracking is poor for some reason.
I ran into a similar problem too. I think it's a tracking error that arises from how the AR app is used. Sometimes, if you're using a world-tracking configuration in ARKit and scan the surrounding environment carelessly, or if you're tracking under inappropriate conditions, you get sloppy tracking data; as a result, your world grid/axes may be unpredictably shifted aside and your model may fly off somewhere. If that happens, look for your model somewhere nearby; maybe it's behind you.
If you're using a device with a LiDAR scanner, the aforementioned situation is almost impossible, but if you're using a device without LiDAR, you need to scan your room thoroughly. You also need good lighting conditions and high-contrast real-world objects with distinguishable, non-repetitive textures.
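If you want to detect this situation programmatically, a minimal sketch (assuming you have access to the ARSession and set its delegate yourself) is to watch the camera's tracking state:

import ARKit

final class SessionMonitor: NSObject, ARSessionDelegate {
    // Called whenever ARKit re-evaluates how well it is tracking the device.
    func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
        switch camera.trackingState {
        case .normal:
            print("Tracking is good; content should stay where it was placed.")
        case .limited(let reason):
            // reason is e.g. .insufficientFeatures, .excessiveMotion, .initializing, .relocalizing
            print("Tracking is limited (\(reason)); content may drift or appear shifted.")
        case .notAvailable:
            print("Tracking is not available.")
        }
    }
}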

Optimized RPG inventory parsing using OpenCV

I'm trying to develop an OpenCV-based Path of Exile inventory parser. The inventory looks like this, with items left and right. The round things on items are called "sockets"; they are randomized, but they can be hidden.
There are two options for this:
When you hover over an item in game and press Ctrl-C, a description of the item is copied to your clipboard. A solution would be to do this on every single inventory cell, to re-create the whole inventory, bit by bit. There is an issue with this, however: the "item copy" action is probably logged somewhere, and having 12 * 5 = 60 actions like this in under 2 seconds would definitely look fishy on GGG's (the devs') end.
Using image-recognition software to decompose the inventory the way a human being would. There are several possible methods, and I'm struggling to find the most effective one.
Method 1: Sprite detection
This is the "obvious" method. Store the sprite of every single item in the game (I think there must be around 900-ish sprites for all the bases and unique items, so probably around 250 sprites if we exclude the unique items), and perform sprite detection for each of them, on the whole inventory. This is without a doubt extremely overkill. And it requires tons of storage. Discarded.
Method 2: Reverse sprite detection
For every single sprite in the game, calculate an associated MD5 hash and store it in a file. Using OpenCV, cut out the inventory's items one by one, calculate their MD5 hashes, and match against the file to detect which item it is. It's probably faster this way, but it still needs a ton of processing power.
Method 3: Same as #2, but smarter
Use OpenCV to cut out the items one by one and, based on their size, narrow the search (a 2x2 item means it's either a helmet, boots, gloves, or a shield; a 2x1 item is always a belt; a 1x1 item is always a ring or amulet, and so on).
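For what it's worth, a rough sketch of the hash lookup in Methods 2 and 3 might look like this in Swift (everything here is hypothetical: the hash table, footprints, and cropped image data would all have to come from your own preprocessing):

import Foundation
import CryptoKit

// Hypothetical pre-computed table: footprint (e.g. "2x2") -> [MD5 hex of sprite: item name].
// Bucketing by footprint is Method 3's size filter layered on top of Method 2's hashing.
struct SpriteTable {
    var byFootprint: [String: [String: String]]

    func identify(croppedCell: Data, footprint: String) -> String? {
        let digest = Insecure.MD5.hash(data: croppedCell)
        let hex = digest.map { String(format: "%02x", $0) }.joined()
        return byFootprint[footprint]?[hex]
    }
}

Note that an exact hash only matches when the cropped pixels are byte-for-byte identical, so the randomized sockets would break it unless they are hidden first.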
Is there another method that I'm forgetting? All of these look like they need heavy processing, a lot of code, and a lot of up-front work on my part.
Thanks in advance.

Placing Virtual Object Behind the Real World Object

In ARKit for iOS, if you display a virtual item, it always appears in front of any real item. This means that if I stand in front of the virtual item, I still see the virtual item. How can I fix this scenario?
The bottle should be visible, but it is being cut off.
You cannot achieve this with ARKit alone. It offers no off-the-shelf solution for occlusion, which is a hard problem.
Ideally, you'd know the depth of each pixel projected onto the camera and use that to determine which pixels are in front and which are behind. I would not try anything with the feature points ARKit exposes, since 1) their positions are inaccurate, and 2) there's no way to know, between two frames, which feature point in frame A corresponds to which feature point in frame B. The data is far too noisy to do anything useful with.
You might be able to achieve something with third-party options that process the captured image and estimate depth, or distinct depth levels, in the scene, but I don't know of any good solution. There are SLAM techniques that yield a dense depth map, like DTAM (https://www.kudan.eu/kudan-news/different-types-visual-slam-systems/), but that would mean redoing most of what ARKit is doing. There might be other approaches I'm not aware of. Apps like Snap do this in their own way, so it is possible!
So basically your question is about mapping the coordinates of the virtual item into the real-world coordinate system; in short, you want the virtual item to be blocked by the real item, so that you only see the virtual item once you pass the real item.
If so, you need to know the physical relationship of every object in the environment, and then you need to know exactly where you are in order to decide whether the virtual item is blocked.
It's not an intuitive way to fix this; however, it's the only way I can think of.
Cheers.
What you are trying to achieve is not easy.
You need to detect the parts of the real world that "should be visible" using some kind of image processing, or perhaps using the ARKit feature points that carry depth information. Based on this, you then add an "invisible virtual object" that cuts off the drawing of anything behind it. This object represents your "real object" inside the "virtual world", so that the background (camera feed) remains visible wherever this invisible virtual object is present.
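In SceneKit terms, a rough sketch of such an invisible occluder could look like the following (the dimensions and placement are placeholders; you would have to derive them from the real object yourself):

import SceneKit

// Builds a box that writes depth but no color. The camera feed stays visible
// where the box is, while virtual content behind it fails the depth test and is hidden.
func makeOccluder(width: CGFloat, height: CGFloat, length: CGFloat) -> SCNNode {
    let node = SCNNode(geometry: SCNBox(width: width, height: height, length: length, chamferRadius: 0))
    node.geometry?.firstMaterial?.colorBufferWriteMask = []
    node.renderingOrder = -1  // draw before the virtual content it should hide
    return node
}

// Usage (sceneView is assumed to be your ARSCNView):
// let occluder = makeOccluder(width: 0.3, height: 0.6, length: 0.3)
// occluder.position = SCNVector3(0, 0, -1)  // wherever the real object actually is
// sceneView.scene.rootNode.addChildNode(occluder)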

Memory Usage of SKSpriteNodes

I'm making a tile-based adventure game in iOS. Currently my level data is stored in a 100x100 array. I'm considering two approaches for displaying my level data. The easiest approach would be to make an SKSpriteNode for each tile. However, I'm wondering if an iOS device has enough memory for 10,000 nodes. If not I can always create and delete nodes from the level data as needed.
I know this is meant to work with Tiled, but the code in there might help you optimize what you are trying to do. I have done my best to optimize it for big maps like the one you are making. The big thing to look at is how you are creating textures; I know that has been a big performance killer in the past.
Swift
https://github.com/SpriteKitAlliance/SKATiledMap
Objective-C
https://github.com/SpriteKitAlliance/SKAToolKit
Both are designed to load a JSON string too, so there is a chance you could still generate random maps without using the Tiled editor, as long as you match the expected format.
Also, you may want to look at how culling works in the Objective-C version, as we recently found that removing nodes from their parent really improves performance on iOS 9.
Hopefully you find some of that helpful and if you have any questions feel free to email me.
Edit
Another option would be to look at object pooling. The core concept is to create only the sprites you need to display and, when you are done with them, store them in a collection of sorts. When you need a new sprite, you ask the collection for one, and if it doesn't have one, you create a new one.
For example, you need a grass tile and ask the pool for one; if it doesn't already have one created and waiting to be reused, it creates one. You might do this to fill a 9 x 7 grid that covers your screen. As you move, grass that scrolls off screen gets tossed back into the collection, to be reused when a new row comes in and needs grass. This works really well if all you are doing is displaying tiles; it's not so great if tiles have dynamic properties that need to be updated and are unique in nature.
Here is a great link even if it is for Unity =)
https://unity3d.com/learn/tutorials/modules/beginner/live-training-archive/object-pooling
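To illustrate the pooling idea, here is a rough sketch in Swift (the class and method names are made up; they are not from SKATiledMap or the Unity tutorial):

import SpriteKit

// A very small object pool for SKSpriteNodes, keyed by texture name.
final class TilePool {
    private var available: [String: [SKSpriteNode]] = [:]

    // Returns a recycled node if one is waiting, otherwise creates a new one.
    func spawn(imageNamed name: String) -> SKSpriteNode {
        if let node = available[name]?.popLast() {
            node.isHidden = false
            return node
        }
        return SKSpriteNode(imageNamed: name)
    }

    // Removes the node from the scene and parks it for later reuse.
    func recycle(_ node: SKSpriteNode, imageNamed name: String) {
        node.removeFromParent()
        node.isHidden = true
        available[name, default: []].append(node)
    }
}

// Usage: when a grass tile scrolls off screen, call recycle(tile, imageNamed: "grass");
// when a new row appears, call spawn(imageNamed: "grass") and add the node back to the scene.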

Improving the performance of MKOverlayViews

I asked a similar question here a while ago about boosting the speed at which MKOverlays are added to an MKMapView by using threading during their creation, but I soon realized that the part of the process that was really dragging me down was not the creation of the overlays, but their addition to the map. Creating many overlays (even 3000+) takes an acceptable amount of time, but adding them all to the map takes far too long (15 seconds).
I know 'what is your favorite' questions usually aren't considered appropriate for Stack Overflow, but I think this question is okay because, although it is subjective in a way, there is still a 'right' answer: the one that significantly improves the performance of an MKMapView with lots of MKOverlayViews.
Basically, I'd love to know if anyone has any tips or tricks (any at all) for speeding up the addition of many different MKOverlays to a map view. Right now my alternative is combining them all into one big line, which is much faster, but then I lose the ability to treat each segment as an individual line (i.e., being able to show a callout for each segment), which is one of the cooler features of the app, so I'd really like to find a way to make this work. Right now all of the lines do load, given enough time, but even after that, scrolling is a nightmare.
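For concreteness, here is a rough sketch of the two approaches I'm weighing (the segments array and map view are placeholders for my real data):

import MapKit

let mapView = MKMapView()
let segments: [[CLLocationCoordinate2D]] = []  // in the real app, one coordinate array per line segment

// Option A: one overlay per segment keeps per-segment callouts;
// addOverlays(_:) at least adds them in a single batch rather than one call per overlay.
let individual = segments.map { MKPolyline(coordinates: $0, count: $0.count) }
mapView.addOverlays(individual)

// Option B: combining everything into one polyline renders much faster,
// but loses the per-segment identity needed for callouts.
let allCoordinates = segments.flatMap { $0 }
let combined = MKPolyline(coordinates: allCoordinates, count: allCoordinates.count)
mapView.addOverlay(combined)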
I'd really like to hear your thoughts! Thanks!
