I've integrated a libPd patch in iOS.
When entering a text field and presenting the keyboard, there are some crackling sounds.
How would I go about debugging this?
NB: I've tagged this question with Objective-C and iOS; however, this question may require knowledge of all four tags - libPd and Pure Data as well:
What is Pure Data?
Pure Data is a powerful programming language for manipulating audio from core mathematical concepts. It's widely used in games as well as DJ and other music-focused applications. Some example apps built with Pure Data and libPd are the Rj Voyager app from RjDj and the Inception App from Warner Brothers.
libPd is a way of embedding Pure Data patches (developed using Pd's visual interface) within an iOS app. Controlling the Pd patch is done via a publish/subscribe message interface similar to OSC or MIDI.
The GitHub page for libPd is here: https://github.com/libpd
What help am I looking for?
I'm not sure where to start debugging this. Someone who has integrated and used libPd on iOS could surely share their experience. It could be related to the following:
How threading works, and how it interacts with the main queue
What sample rates work best given the target devices
What debugging tools are available.
Other advice earned through deep experience.
I don't know anything about PD, but it seems likely that the presentation of the keyboard is causing you to be CPU-starved for some reason. You might try:
verifying this still happens in a release build when not attached to a debugger (log messages cause long delays when attached to the debugger, which alone can cause hiccups like this)
profiling your code using Instruments to see if you're inadvertently using a whole lot of CPU at once or
increasing buffer sizes so PD doesn't need CPU as often.
I was experiencing the same symptoms in an app I'm working on. I did manage to ascertain a couple of things early on. My recent changes involved sending a lot of messages to Pd during app init. I noticed when debugging that when I reduced the number of messages sent, the sound improved. Also, I didn't see this in the simulator, only on the device.
The libpd example PolyPatch was pretty useful in this case, if you increase the number of patches that can be generated. I found that the sound was breaking up with many patches open, in exactly the same way as in my app. This is quite simply where the overhead of using libpd takes its toll on performance. What's also clear is that simplifying a patch (so it contains fewer objects) helps performance. But by far the biggest hit is creating a new, separate patch, so you won't want to be creating huge numbers of patches. Debugging does of course take a toll too.
44.1 kHz works pretty much everywhere as far as sample rates go (it's the Pd standard too). And there's nothing to stop you debugging the libpd code right there in Xcode; I've done that a few times. Other than that, there is the issue of debugging patches. You can either set up your patch with test versions of your objects directly in Pd, or you should be able to set up libpd so that the console shows the same output you would normally see in Pd's main window (you just need to make sure that you have something like this
[PdBase setDelegate:_dispatcher];
in your code - it's all in the docs of course). Then you just pepper your patch with print messages as required...
Hope it helps, and is still relevant after 3 months...!
Related
My goal is as follows: I have to read in a video that is stored on the SD card, process it frame by frame (doing image processing on each frame), and then store the result in a new file on the SD card.
At first I wanted to use OpenCV for Android, but I did not seem to be able to read the video, as described here.
I am guessing you already know that doing this on a mobile device, or any compute-limited device, is not ideal, simply because video manipulation is very computationally intensive, which translates to slow execution and heavy battery usage on many devices. If you do have the option to do the processing on the server side, it is definitely worth considering.
Assuming that for your use case you need to do it on the mobile device, then OpenCV on Android will now allow you to read in a video and access each frame - @StephenG mentions this in his answer to the question you refer to above.
In the past, functionality like this did not get ported to Android OpenCV, as the guidance was to use ffmpeg for frame grabbing on Android devices.
According to more recent documentation, however, this should be available for Android now using the VideoCapture class (note I have not used this myself...):
http://docs.opencv.org/java/2.4.11/org/opencv/highgui/VideoCapture.html
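If your build does expose file-based capture, the frame-by-frame loop would look roughly like the sketch below. Note this is a hedged sketch rather than something I have run on a device: the class moved from org.opencv.highgui to org.opencv.videoio in OpenCV 3.x, the file path is only an example, and whether a given Android build can actually decode your container/codec is exactly the uncertainty mentioned above.

    import org.opencv.core.Core;
    import org.opencv.core.Mat;
    import org.opencv.videoio.VideoCapture; // lives in org.opencv.highgui in the 2.4.x bindings

    public class FrameLoopSketch {
        public static void main(String[] args) {
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME); // load the native OpenCV library first

            // Example path only - on Android this would be your video file on the SD card.
            VideoCapture capture = new VideoCapture("/sdcard/input.mp4");
            if (!capture.isOpened()) {
                System.err.println("Could not open video (codec may not be supported by this build)");
                return;
            }

            Mat frame = new Mat();
            while (capture.read(frame)) { // read() returns false at the end of the stream
                // ... per-frame image processing on 'frame' goes here ...
            }
            capture.release();
        }
    }

Writing the processed frames back out to a new file is a separate question (there is a VideoWriter class in the same package, but its codec support on Android is even more build-dependent), so it is worth prototyping both the reading and writing ends before committing to this route.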
It is worth noting that the OpenCV Android examples are all currently based around Eclipse, and if you want to use Android Studio, getting things up and running initially can be quite tricky. The following worked for me recently, but as both Studio and OpenCV can change over time, you may find you have to do some forum hunting if it does not work for you:
https://stackoverflow.com/a/35135495/334402
Taking a different approach, you can use ffmpeg itself, in a wrapper in Android, for tasks like this.
The advantage of the wrapper approach is that you can use all the usual command line syntax and there is a lot of info on the web to help you get the right parameters.
The disadvantage is that ffmpeg was not really designed to be wrapped in this way, so you do sometimes see issues. Having said that, it is a common approach now, and so long as you choose a well-used wrapper library you should at least have a good community to discuss any issues you come across. I have used this approach in a hand-crafted way in the past, but if I was doing it again I would use one of the popular examples such as:
https://github.com/WritingMinds/ffmpeg-android-java
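To give a feel for the wrapper approach, here is a rough sketch based on my recollection of the ffmpeg-android-java README. Treat the package, class and method names (com.github.hiteshsondhi88.libffmpeg, FFmpeg.getInstance, execute, ExecuteBinaryResponseHandler) as assumptions to verify against the current README; the command itself is just an example re-encode using the normal ffmpeg command-line syntax.

    import android.content.Context;

    import com.github.hiteshsondhi88.libffmpeg.ExecuteBinaryResponseHandler;
    import com.github.hiteshsondhi88.libffmpeg.FFmpeg;

    public class FfmpegWrapperSketch {

        // Runs one ffmpeg command through the wrapper. Per the README you would also call
        // ffmpeg.loadBinary(...) once at startup before executing any commands.
        public static void transcode(Context context) {
            String[] cmd = {
                    "-i", "/sdcard/input.mp4", // example input on the SD card
                    "-vf", "scale=640:-1",     // example filter: scale width to 640, keep aspect ratio
                    "/sdcard/output.mp4"       // example output file
            };
            try {
                FFmpeg ffmpeg = FFmpeg.getInstance(context);
                ffmpeg.execute(cmd, new ExecuteBinaryResponseHandler() {
                    @Override
                    public void onSuccess(String message) {
                        // the command finished successfully
                    }

                    @Override
                    public void onFailure(String message) {
                        // inspect ffmpeg's console output here to see what went wrong
                    }
                });
            } catch (Exception e) {
                // the library throws checked exceptions, e.g. if a command is already running
                e.printStackTrace();
            }
        }
    }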
I'm an NXJ beginner.
I have some questions about bluetooth communication between PC and brick.
First, when Bluetooth communication occurs, where does this data actually get processed?
In other words, I want to know whether this data is processed on the PC or on the brick.
Second, what exactly are the roles of the PC and the brick in Bluetooth communication?
That is, what is processed on the PC and what is processed on the brick?
I have searched almost every website, but I can't find this anywhere.
Please help me. Thanks.
You can see it in the package structure.
lejos.nxt.*
This package contains classes running on the NXT-brick. All code in this package will be compiled for the brick and will run on the brick.
lejos.pc.*
Here the difference is not that clear. This is Java code you compile for the personal computer, so most of it runs on your computer. But some classes (e.g. RemoteMotorController) only send messages to the NXT brick, which then drives the motors.
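To make the split concrete, here is a minimal sketch of the brick side, assuming a standard leJOS NXJ setup (the motor port and the simple int-based protocol are just illustrative choices). This class is compiled with the NXJ tools and runs entirely on the brick; the PC side, written against lejos.pc.comm (e.g. NXTConnector), only opens the Bluetooth connection and sends the number, while all the actual motor control happens on the brick.

    import java.io.DataInputStream;

    import lejos.nxt.LCD;
    import lejos.nxt.Motor;
    import lejos.nxt.comm.BTConnection;
    import lejos.nxt.comm.Bluetooth;

    // Runs ON THE BRICK (lejos.nxt.*): waits for a Bluetooth connection from the PC,
    // reads an int and uses it as a motor speed.
    public class BrickSide {
        public static void main(String[] args) throws Exception {
            LCD.drawString("Waiting...", 0, 0);
            BTConnection connection = Bluetooth.waitForConnection(); // blocks until the PC connects
            DataInputStream in = connection.openDataInputStream();

            int speed = in.readInt(); // value sent by the PC over the Bluetooth stream
            Motor.A.setSpeed(speed);
            Motor.A.forward();

            in.close();
            connection.close();
        }
    }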
lejos.pc.comm provides APIs that allow you to communicate with and control the NXT robot from the PC.
When importing the libs into an Android project, it allows you to build an instance of the same environment used on a PC, but within Android.
I agree it can be tough finding some things out. It would be great if there was a stronger leJOS presence on SO.
This question is months old and has remained unanswered. I actually have a lot of questions about it myself, but I might be able to provide some insight for utter novices.
When using Bluetooth with Android and NXJ robots, you use either lejos.pc.comm or lejos.NXJ.
Both provide APIs to do almost the same thing, but they work a little differently. I don't know nearly enough about the NXJ API, but I do know that it is the one that lets you manipulate the robot much more effectively, such as outputting data to its LCD screen, which you can't do with the pc.comm API.
As far as I can tell, the pc.comm API uses both the Android Bluetooth APIs and its own protocols to allow communication via Lego LCP commands.
(I want to come back to this, but I'm writing a dissertation on the topic, so I'll try to update it in a couple of days. It seems not many are interested though, which is a shame.)
There seems to be a lot of conflicting information out there. It might be that support has increased recently, or changes to adobe.com/air have made some information difficult to find - but I can't track down a definitive list of things to avoid.
I know that ActionScript won't run in loaded SWFs. I know that some people say that filters, blend modes and Halo components won't work. I've also read many posts saying they will (at least that blend modes will, and that Halo will run, but slowly, so still use Spark).
I have a large amount of AS3 code to plan for upgrading to work on iOS, but at the moment I have no idea what things will break (or what things will break when those things have been fixed!)
Is there a list of unsupported APIs, or iOS dos and don'ts?
Thanks
:S
First, yes: externally loaded SWFs will not run. You can, however, embed SWFs/SWCs into your project and include them inside your package.
As far as Flex components go, stay away from Halo. You should use Flex 4.6 and stick to components with mobile skins. I recommend downloading Tour de Flex http://www.adobe.com/devnet/flex/tourdeflex.html to get an idea of what's available.
As far as blend modes go, I'm not really sure; I haven't used them on mobile yet. Filters are supported, but they are expensive. For drop shadows on rectangles there is something called RectangularDropShadow. This is actually a component and therefore less expensive. However, it can only be used on rectangular groups.
You should have access to all of the AIR APIs. You will, however, be restricted when using some of the File-related classes, since I don't believe you can leave your Application Storage Directory.
One big performance tip I can give is to use AS3 over MXML whenever possible, ESPECIALLY when creating item renderers. Use BitmapImage over Image whenever possible, again especially in item renderers. Use cacheAsBitmap whenever you have images that don't change often. And stay away from any Flex component that doesn't have a mobile skin.
You may also want to read up on View and destruction policies.
http://www.adobe.com/devnet/flex/articles/flex-mobile-development-tips-tricks-pt1.html
This link also has some more performance tips
http://www.adobe.com/devnet/flex/articles/flex-mobile-performance-checklist.html
I want to port a good OpenCV code base to an embedded platform. Earlier, such things were very difficult to do, but now TI has come up with nice embedded platforms which are comparatively hassle-free, as they say.
I want to know the following things:
Given that:
The OpenCV code is already running on PC smoothly. (obviously)
I need to determine these things before purchasing the device.
I can't post the code here on Stack Overflow. :P
The device is to be chosen from the Texas Instruments C6000 family.
Questions:
How can I make sure that the porting can be done?
What steps should be taken to make sure that, after porting, the code will (at least) run?
How can I determine whether the code might require some changes to make it run smoothly?
Point 3 above is optional.
I need info that will at least give me a starting point in this regard.
What I thought I should do:
List the built-in functions I use.
Then find available online benchmarks of those functions for the particular device, like those shown towards the end of this doc.
...
I need to know how to proceed further.
However, the C6-Integra™ DSP+ARM processor seems the best.
The best you can do is to try a device simulator (if it is available), but what you'll see there is far from perfect.
Actually, nothing can tell you how fast and how well the app will run on the embedded device before you run your specific app on that specific device.
So:
Step 1 Buy it
Step 2 Try it
Things to consider:
Embedded CPU architecture: does your app need a big cache? How big is the embedded cache?
Algorithm: do you use a lot of floating-point operations? How good is the device at floating-point ops?
Memory transfers: the data bus on a PC is far faster than on an embedded device.
Hardware support: do you use a lot of double-precision calculations? They are emulated on ARM, and they are going to kill your app (what takes milliseconds on a PC can take seconds on an ARM).
Acceleration: do your functions use SSE? (Many OpenCV functions are SSE-optimized, even if you don't know it.) Do you have the NEON counterpart? (OpenCV does not have much support for that.) The difference between x86 with SSE and embedded without NEON can be orders of magnitude.
And many, many others.
So, again: no one can tell you how it will work. Just the combination between the specific app and the real device tells the truth.
Even a run on a similar device is not conclusive: the app can run smoothly on a given processor, and on another with a similar frequency or listed memory, it can slow down far too much.
This is an interesting question, but "run" is a very generic word in this context, so I feel the need to break it down into two other questions:
Will it compile for the embedded device?
Will it run as fast/smoothly as on a PC?
I've used OpenCV on a lot of different devices, including ARM, SH4 and MIPS, and I found out that sometimes the manufacturer of the device itself provides a compiled version of OpenCV (to my surprise), which is great. That's something you can look into; maybe the manufacturer of your device provides OpenCV binaries.
There's no way to know for sure how smooth your OpenCV application will be on the target device unless you are able to find some benchmark of OpenCV running in there. PCs have far better processing power than embedded devices, so you can expect less performance from the target device.
There are 3rd party applications, like opencv-performance, that you can use to test/benchmark the environment once you get your hands on it. And if performance is such a big deal in this project, you might also be interested in this nice article which explains some timing tests done on a couple of OpenCV features, comparing implementations using the C and C++ interfaces of OpenCV.
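Even without a dedicated tool, you can get a first impression by timing the exact OpenCV calls your application uses. The sketch below shows that measurement loop with the Java bindings on a desktop, purely to illustrate the methodology; on a C6000 target you would do the same against the C/C++ interface, and the function and frame size here are only placeholders for whatever your pipeline actually does.

    import org.opencv.core.Core;
    import org.opencv.core.CvType;
    import org.opencv.core.Mat;
    import org.opencv.core.Size;
    import org.opencv.imgproc.Imgproc;

    public class TimingSketch {
        public static void main(String[] args) {
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME); // load the native OpenCV library

            Mat src = new Mat(720, 1280, CvType.CV_8UC3); // stand-in for a real frame
            Mat dst = new Mat();

            // Warm up once, then time a batch so per-call overhead averages out.
            Imgproc.GaussianBlur(src, dst, new Size(5, 5), 1.5);

            int runs = 100;
            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                Imgproc.GaussianBlur(src, dst, new Size(5, 5), 1.5);
            }
            double msPerCall = (System.nanoTime() - start) / 1e6 / runs;
            System.out.println("GaussianBlur: " + msPerCall + " ms per call");
        }
    }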
I am searching for an algorithm to determine whether realtime audio input matches one of 144 given (and comfortably distinct) phoneme-pairs.
Preferably the lowest level that does the job.
I'm developing radical / experimental musical training software for iPhone / iPad.
My musical system comprises 12 consonant phonemes and 12 vowel phonemes, demonstrated here. That makes 144 possible phoneme pairs. The student has to sing the correct phoneme pair 'laa duu bee' etc in response to visual stimulus.
I have done a lot of research into this; it looks like my best bet may be to use one of the iOS Sphinx wrappers (iPhone App › Add voice recognition? is the best source of information I have found). However, I can't see how I would adapt such a package. Can anyone with experience using one of these technologies give a basic rundown of the steps that would be required?
Would training by the user be necessary? I would have thought not, as it is such an elementary task compared with full language models of thousands of words and a far greater and more subtle phoneme base. However, it would be acceptable (though not ideal) to have the user train 12 phoneme pairs: { consonant1+vowel1, consonant2+vowel2, ..., consonant12+vowel12 }. The full 144 would be too burdensome.
Is there a simpler approach? I feel like using a fully featured continuous speech recogniser is using a sledgehammer to crack a nut. It would be far more elegant to use the minimum technology that would solve the problem.
So really I'm hunting for any open source software that recognises phonemes.
PS: I need a solution which runs pretty much in real time, so even as they are singing the note, it first blinks to show that it picked up the phoneme pair that was sung, and then it glows to show whether they are singing the correct pitch.
If you are looking for a phone-level open source recogniser, then I would recommend HTK. Very good documentation is available with this tool in the form of the HTK Book. It also contains an entire chapter dedicated to building a phone level real-time speech recogniser. From your problem statement above, it seems to me like you might be able to re-work that example into your own solution. Possible pitfalls:
Since you want to build a phone-level recogniser, the amount of data needed to train the phone models would be very large. Also, your training database should be balanced in terms of the distribution of the phones.
Building a speaker-independent system would require data from more than one speaker. And lots of that too.
Since this is open source, you should also check the licensing info for any additional details about shipping the code. A good alternative would be to use the on-phone recorder and then send the recorded waveform over a data channel to a server for recognition, pretty much like what Google does.
I have a little bit of experience with this type of signal processing, and I would say that this is probably not the type of finite question that can be answered definitively.
One thing worth noting is that although you may restrict the phonemes you are interested in, the possibility space remains the same (i.e. infinite-ish). User training might help the algorithms along a bit, but useful training takes quite a bit of time and it seems you are averse to too much of that.
Using Sphinx is probably a great start on this problem. I haven't gotten very far in the library myself, but my guess is that you'll be working with its source code yourself to get exactly what you want. (Hooray for open source!)
...using a sledgehammer to crack a nut.
I wouldn't label your problem a nut, I'd say it's more like a beast. It may be a different beast than natural language speech recognition, but it is still a beast.
All the best with your problem solving.
Not sure if this would help: check out OpenEars' LanguageModelGenerator. OpenEars uses Sphinx and other libraries.
http://www.hfink.eu/matchbox
This page links to both a YouTube video demo and the GitHub source.
I'm guessing it would still be a lot of work to mould it into the shape I'm after, but it definitely does do a lot of the work.