I have written a specification for an iOS app that will include a character on the screen that guides the user through a series of steps. I've had images of the character drawn in Illustrator; however, they're currently flat drawings only.
In the final app I'd like the character to be animated with a more 3D appearance, similar to the Tommy Cat character seen in this video:
http://www.youtube.com/watch?v=BgLMdBh4-eQ
While I don't expect the animation to be as extensive, I'd like the character to make small gestures (hand waving, tail wagging) and facial expressions (happy, sad, etc.).
I don't expect the app developer to also do the character animation, so I need to know what format (type/dimensions) I should request the animations in so that they are suitable for inclusion in the finished app.
Longer term, it would be useful if the same animations could also be used in an Android version of the app; I'm not sure whether one format would work across both platforms.
The simplest and most straightforward thing to do is hire an animator and have them export a QuickTime video that uses the lossless Animation codec. Typically, a 24 bpp video is used, but if you need an alpha channel, have the artist export a 32 bpp video with the alpha channel included. Your developer will then have to deal with importing the video content; the simplest approach would be to export a series of PNG images (this is not optimal but works just fine for a prototype).
Related
As an introduction and context, I'm currently a novice iOS app developer and I want to make sure I'm not reinventing the wheel too much as I make this app (reinventing wheels can get very expensive.)
The app will allow the user to download our videos off the internet and will allow storage for offline usage. The problem with storing these videos on the device is that many of them will be too long and thus too big to be practical to store.
The videos are quite simple, however, consisting of a couple of short "real" video clips at the beginning and end, with the bulk of the video being still images animated around the screen. The animations would consist solely of opacity and simple transformation keyframes (translate, scale, rotate around a static anchor point), and would require a variety of easing functions for each transition.
The hardest part likely would be that the "video" player will also have to be able to track with an audio player's timecode, and will have to support seeking to any arbitrary point like a normal video player.
So, now that I've described the problem, here's the solution I've come up with so far. Hopefully doing it this way will reduce the probability of XY problems. :)
The idea is to basically do a dumbed-down version of what Final Cut and other editing programs do with animations—have a bunch of clips, sometimes overlapping, and be able to animate the position, scale, rotation, and opacity of each using keyframes.
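To make the idea concrete, here is a minimal sketch of how such keyframe tracks could be evaluated as a pure function of the audio timecode, which is what makes seeking to an arbitrary point cheap. It is written in Python for brevity rather than Swift, and every name in it (`EASINGS`, `sample`, `clip_state`, the clip dictionary layout) is hypothetical, not part of any iOS framework.

```python
from bisect import bisect_right
import math

# Easing functions map linear progress u in [0, 1] to eased progress.
EASINGS = {
    "linear": lambda u: u,
    "ease_in": lambda u: u * u,
    "ease_out": lambda u: 1 - (1 - u) * (1 - u),
    "ease_in_out": lambda u: 0.5 - 0.5 * math.cos(math.pi * u),
}


def sample(keyframes, t):
    """Return the value of one animated property at time t (seconds).

    keyframes: list of (time, value, easing_name) sorted by time. The easing
    of a keyframe applies to the segment that ends at that keyframe.
    """
    times = [k[0] for k in keyframes]
    i = bisect_right(times, t)
    if i == 0:                      # before the first keyframe: hold first value
        return keyframes[0][1]
    if i == len(keyframes):         # at or after the last keyframe: hold last value
        return keyframes[-1][1]
    t0, v0, _ = keyframes[i - 1]
    t1, v1, easing = keyframes[i]
    u = EASINGS[easing]((t - t0) / (t1 - t0))
    return v0 + (v1 - v0) * u


def clip_state(clip, t):
    """State of one clip at time t, or None when the clip is not on screen."""
    if not (clip["start"] <= t < clip["end"]):
        return None
    return {name: sample(track, t) for name, track in clip["tracks"].items()}
```

Because the state depends only on `t`, the player would just call `clip_state` for every clip with the audio player's current time on each display refresh; no animation state has to be rewound or replayed when the user seeks.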
My first instinct for the implementation is to use one of iOS's game engines for the animations, maybe SceneKit: although it's primarily 3D and I'm doing 2D animations, it seems to allow animations to use scene time as opposed to real time. I would then manually handle syncing time with the audio player, as well as manually adding and removing nodes from the scene when seeking through the video and when clips begin or end.
What are some built-in systems, plugins, etc. that I can take advantage of to make this easier and faster to develop and maintain? Double points if I don't have to transcode the animations by hand to some custom format.
As I mentioned in my comment, your question is rather broad and contains multiple questions in one. I will address what you described as likely the hardest part:
https://developer.apple.com/documentation/avfoundation/avplayeritem
https://developer.apple.com/documentation/avfoundation/avasset
Instead of SceneKit, take a look at SpriteKit and its SKVideoNode.
Also, research Metal video processing. There are quite a few example projects available that you could use as a starting point.
What's the common modern standard for animated video overlays? (e.g. if you want to add an animated logo to video recorded from the camera)
During research, I've found the following options:
GIF - seems to be pretty outdated technology
FLV - supports alpha-channel, but no longer supported by Adobe.
Requires FFMPEG.
PNG sequence - the downside of this is having multiple files, one for each frame.
What's the right format/technology to use?
Ideally, what is natively supported on iOS (doesn't require FFMPEG)?
If you want to overlay your custom video animation on video that the user records, I suggest using the GPUImage framework, which allows a lot of video/photo customization and various graphic effects. For an example of how to mix two videos, see this nice article. I also suggest reading about chroma key, which is something of a standard for video/photo mixing (as I understand it, you just want to make something like a watermark?). GPUImage also has a chroma key filter that you can use for this purpose.
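For readers unfamiliar with chroma keying, here is a rough per-pixel sketch of the idea in Python. This is not GPUImage's implementation (that runs as a GPU shader); the function names, the plain RGB distance metric, and the `inner`/`outer` thresholds are all assumptions chosen for illustration.

```python
def chroma_key_alpha(pixel, key=(0, 255, 0), inner=60.0, outer=120.0):
    """Alpha in [0, 1] for an (r, g, b) pixel: 0 near the key colour, 1 far from it."""
    dist = sum((p - k) ** 2 for p, k in zip(pixel, key)) ** 0.5
    if dist <= inner:
        return 0.0                  # close to the key colour: fully transparent
    if dist >= outer:
        return 1.0                  # far from the key colour: fully opaque
    return (dist - inner) / (outer - inner)   # soft edge in between


def key_over(fg, bg, key=(0, 255, 0)):
    """Blend a keyed foreground pixel over a background pixel."""
    a = chroma_key_alpha(fg, key)
    return tuple(round(a * f + (1 - a) * b) for f, b in zip(fg, bg))
```

With a pure green key, `key_over((0, 255, 0), (10, 20, 30))` lets the background through unchanged, while a red foreground pixel stays fully opaque.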
By default, Apple supports the H.264 codec in an MP4 container, so your video should use this codec.
I hope this fully answers your question.
The best way to add overlays is to use the AVFoundation framework supplied by Apple itself. As for the other options such as GIF and FLV, they are not supported natively by Apple, which puts you out of luck.
Apple provides various tools, such as AVVideoCompositionCoreAnimationTool, that let you stitch Core Animation content and videos together.
Here is a link that explains how to add various effects such as:
Colored borders with custom sizes.
Multiple overlays.
Text for subtitles or captions.
Tilt effects.
Twinkle, rotate, and fade animation effects!
I am not sure how much of this is applicable to an application that adds animations while recording. Maybe someone else can help with that. I hope this helps with the native way to add animations to recorded videos in iOS.
We want to allow the user to place animated "stickers" over video that they record in the app and are considering different ways to composite these stickers.
Create a video in code from the frame-based animated stickers (which can be rotated and have translations applied to them) using AVAssetWriter. The problem is that AVAssetWriter only writes to a file and doesn't keep transparency, which would prevent us from overlaying it on the video using AVMutableComposition.
Create .mov files ahead of time for our frame-based stickers and composite them using AVMutableComposition and layer instructions with transformations. The problem with this is that there are no tools for easily converting our PNG-based frames to a .mov while maintaining an alpha channel, and we'd have to write our own.
Create separate CALayers for each frame in the sticker animations. This could potentially create a very large number of layers, depending on the frame rate of the video.
Or any better ideas?
Thanks.
I would suggest that you take a look at my blog post on this specific subject. Basically, this example shows how RGBA video data can be loaded from a file attached to the app resources. This is imported from a .mov that contains Animation RGBA data on the desktop. A conversion step is required to get the data from the desktop into iOS, since plain H.264 cannot support an alpha channel directly (as you have discovered). Note that older hardware may have issues decoding an H.264 user-recorded video and then another one on top of that, so this approach of using the CPU instead of the H.264 hardware for the sticker is actually better.
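However the RGBA sticker frames end up being decoded, the per-pixel work is the same: pick the sticker frame for the current video time and blend it over the video pixel. The sketch below shows just that step in Python, assuming straight (non-premultiplied) 8-bit alpha and a looping sticker; `sticker_frame_index` and `over` are hypothetical helper names, not AVFoundation APIs.

```python
def sticker_frame_index(t, fps, frame_count):
    """Index of the looping sticker frame to show at video time t (seconds)."""
    return int(t * fps) % frame_count


def over(src, dst):
    """Composite a straight-alpha RGBA pixel src over an opaque RGB pixel dst."""
    r, g, b, a = src
    alpha = a / 255
    return tuple(round(alpha * s + (1 - alpha) * d)
                 for s, d in zip((r, g, b), dst))
```

For example, a fully transparent sticker pixel leaves the video pixel untouched, and a fully opaque one replaces it. On a device this blend would normally be done by Core Animation or the GPU rather than in a loop like this.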
I have a device that streams H.264 video in the following format: the top half of the picture holds the even lines of the video, and the bottom half holds the odd lines. So the question is: how can I play this video so that it looks normal, using standard players, ffplay for example?
I know about the "tinterlace:merge" filter in ffmpeg, but it combines fields from two consecutive pictures. My task is to make a correct video from a single frame.
Regards,
Alexey.
I recently had to deal with the exact same problem.
There are many different methods, and the optimum solution depends entirely on your situation.
The simplest and fastest method is weaving the two fields together, which is perfect for static parts but creates a comb effect on moving objects.
More complicated methods use motion detection.
What I did was merge the two fields and then apply Edge-based Line Averaging (ELA) to moving segments to reduce the comb effect.
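As a rough illustration of both steps, here is a small Python sketch. `weave` assumes the layout from the question (even lines stacked in the top half, odd lines in the bottom half), and `ela_row` is a basic three-direction ELA that rebuilds one missing line from the lines above and below it. Both function names are made up for this example, frames are plain lists of rows, and the motion detection that decides where to apply ELA is not shown.

```python
def weave(frame):
    """Rebuild a full frame from a stacked-field frame.

    frame: list of rows; the top half holds the even lines (0, 2, 4, ...)
    and the bottom half holds the odd lines (1, 3, 5, ...).
    """
    if len(frame) % 2:
        raise ValueError("expected an even number of rows")
    half = len(frame) // 2
    woven = []
    for even_row, odd_row in zip(frame[:half], frame[half:]):
        woven.append(even_row)
        woven.append(odd_row)
    return woven


def ela_row(above, below):
    """Interpolate a missing line between two grayscale lines using ELA.

    For each pixel, compare the pairs along the vertical and the two diagonal
    directions, and average the pair with the smallest absolute difference.
    """
    width = len(above)
    out = []
    for x in range(width):
        best = None
        for d in (0, -1, 1):        # vertical first, so ties stay vertical
            xa, xb = x + d, x - d
            if 0 <= xa < width and 0 <= xb < width:
                diff = abs(above[xa] - below[xb])
                if best is None or diff < best[0]:
                    best = (diff, (above[xa] + below[xb]) // 2)
        out.append(best[1])
    return out
```

In a moving region you would keep one field and replace each line of the other field with `ela_row` of its neighbours; in static regions the woven lines are used as they are.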
Check this link for a detailed explanation of the problem.
It would be good if you could provide a sample video file. You describe very well what the picture looks like, but the file may contain other information that is helpful for playback.
Furthermore, the format you describe doesn't sound like a standard format, so it's unlikely you will get a regular player to play it the way you want, out-of-the-box. If you're using ffplay, it's likely that you will have to write your own plugin to re-order the scanlines prior to displaying them.
Alternatively, you could re-encode the video into a standard format (interlaced or deinterlaced) using ffmpeg. You could then play it back in any regular player, like ffplay or VLC.
Finally, I recommend asking your question on the ffmpeg mailing list.
I would like to extract out all the slides from a video lecture, using OpenCV. Here is an example of a lecture: http://www.youtube.com/watch?v=-hxOpz9c0bY.
What approaches would you recommend? So far, I've tried:
Comparing the change in grayscale intensity from frame to frame. This can have problems when an object in the foreground moves around. For example, in this lecture, there's a hand that moves around: http://www.youtube.com/watch?v=mNzu42FrlHo#t=07m00s.
Using SURF features and doing comparisons frame by frame. This approach seems kind of slow.
Does anyone have other ideas?
Most of this work has most likely already been done by the video encoder. You just need to extract the key frames and check how well the frames between them are compressed.
It should also be fairly easy to distinguish still images. You can save a lot of time by examining just the key frames. Slides are likely to have high contrast, solid shapes, and a solid background, whereas a lecture hall has blurry shapes and low contrast.
What you need is scene change detection. After that, you'll have to classify scenes as "lecture hall" or "presentation". As for the problem with hands, you could use background subtraction with an adaptive background (just make sure you mask the foreground; you don't want the foreground to become part of the background).
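A minimal sketch of that combination, in pure Python so it is self-contained: frames are flat lists of grayscale values, and in practice you would feed it frames read with OpenCV and use array operations instead of list comprehensions. The function name and the three thresholds are assumptions to be tuned, not recommended values.

```python
def detect_scene_changes(frames, pixel_thresh=30, scene_thresh=0.5, rate=0.05):
    """Return the indices of frames that start a new scene.

    frames: iterable of flat lists of grayscale values (0-255).
    A pixel counts as foreground when it differs from the background model by
    more than pixel_thresh; a frame starts a new scene when more than
    scene_thresh of its pixels are foreground.
    """
    background = None
    changes = []
    for i, frame in enumerate(frames):
        if background is None:
            background = [float(p) for p in frame]
            changes.append(i)       # the first frame always starts a scene
            continue
        fg = [abs(p - b) > pixel_thresh for p, b in zip(frame, background)]
        if sum(fg) / len(frame) > scene_thresh:
            background = [float(p) for p in frame]   # new slide: reset the model
            changes.append(i)
        else:
            # Adapt only the background pixels; masked foreground pixels
            # (e.g. a moving hand) are kept out of the background model.
            background = [b if is_fg else (1 - rate) * b + rate * p
                          for p, b, is_fg in zip(frame, background, fg)]
    return changes
```

A hand covering a small part of the slide stays below `scene_thresh` and is ignored, while a real slide change flips most pixels at once and is reported.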
You could try edge detection and look for a rectangular object (the slide) above a certain area threshold. You could further reduce false positives by looking for some text within the rectangle.
There are several reasons to extract slides/frames from a video presentation, especially in the case of education or conference related videos. It allows you to access the study notes without watching the whole video.
I have faced this issue several times, so I decided to create a solution for it myself using Python. I have made the code open source; you can set up the tool and run it in a few simple steps.
Refer to this for a YouTube video tutorial. Steps on how to use this tool:
Clone this project: video2pdfslides
Set up your environment by running "pip install -r requirements.txt"
Copy your video path
Run "python video2pdfslides.py <video_path>"
Boom! The PDF slides will be available in the output folder. Make notes and enjoy!