I am using Stanford NER to remove identifying information from essays.
It detects names like Werner, but Indian names such as Ram, Shyam, etc. go undetected.
What should I do to make them recognizable?
You should train the NER model on Indian names. I could not find detailed information on how to achieve that, but this FAQ page ( http://nlp.stanford.edu/software/crf-faq.shtml#a ) has some information which may be a starting point for you. Questions 2-3 in particular are directly relevant to your question.
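For a rough idea of what that training looks like, the FAQ has you prepare a tab-separated file of tokens and labels plus a properties file, roughly as sketched below. The file names are my assumptions, and the feature flags are a minimal subset of the FAQ's example configuration; see the FAQ itself for the full set.

```
## train.tsv -- one token per line, tab-separated label (file name assumed)
Ram	PERSON
travelled	O
to	O
Delhi	LOCATION
.	O

## indian-names.prop -- minimal training configuration (subset of the FAQ's example)
trainFile = train.tsv
serializeTo = indian-names-model.ser.gz
map = word=0,answer=1
useWord = true
usePrev = true
useNext = true
useSequences = true
usePrevSequences = true
wordShape = chris2useLC
```

Training then runs with java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop indian-names.prop, per the FAQ, and the resulting serialized model can be loaded in place of (or alongside) the default English classifiers.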
I watched a YouTube video about voice cloning: https://www.youtube.com/watch?v=Kfr_FZof_hs
It's an interesting topic, but this project's repository only supports English: https://colab.research.google.com/drive/1NxiY3zHN4Nd8J3YAqFsbYaOB71IiLE04?usp=sharing#scrollTo=JrK20I32grP6
I want to adapt it for Italian.
I am a beginner in machine learning.
What do I need to do to get the TTS model to "learn" Italian?
Is it necessary to train the model on audio files, to rebuild the model, or something else?
Can you advise me?
You can check the following issue, where the creator answers this question.
Creator's answer:
Here is what you need to train this:
- a wav2vec or similar ASR model for your language
- at least 10,000 hours of usable spoken language, with no environmental noises, music, etc. This does not need to be transcribed. I used audiobooks and podcasts for English.
- approximately 16 months total of V100 time
https://github.com/neonbjb/tortoise-tts/issues/5
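To make the first requirement concrete, here is a minimal sketch of loading an Italian wav2vec2 ASR checkpoint with Hugging Face transformers. The model name is one publicly available community checkpoint and the audio path is a placeholder; both are my assumptions, not something the tortoise-tts author prescribes.

```python
# Sketch: load an Italian wav2vec2 ASR model (checkpoint choice is an assumption)
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_name = "jonatasgrosman/wav2vec2-large-xlsr-53-italian"
processor = Wav2Vec2Processor.from_pretrained(model_name)
model = Wav2Vec2ForCTC.from_pretrained(model_name)

# Transcribe a sample clip (placeholder path); wav2vec2 expects 16 kHz mono audio
waveform, sr = torchaudio.load("sample_it.wav")
waveform = torchaudio.functional.resample(waveform, sr, 16_000)

inputs = processor(waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
print(processor.batch_decode(torch.argmax(logits, dim=-1)))
```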
I mean data recorded from multiple neurons with a multi-electrode array. I need this data as the input for my experiment.
Google just released a beta dataset search engine. You can find data here.
A good way to tackle something like this is to email the various labs that work with or generate data of your particular interest. A lot of labs I have worked for in the past have tons of data lying around that is unused, or not useful to them in any current way, and people are generally enthusiastic about your interest in their study.
Additionally, there are many projects funded with the idea of sharing data and tools for the benefit of science. One such project is the miniscope project from UCLA, which has a ton of calcium imaging data lying around and very helpful people willing to share it and assist you in the analysis. I am sure a quick Google around can help you find similar labs more specialized in electrophysiology than calcium imaging.
I hope you find what you are looking for!
I am looking to decode intents from many strings across many conversations I have stored in a database, so I can use machine learning to create an intelligent chatbot. I have heard of and tested tools like Amazon Lex, but I am looking to extract the intent from a string, not create my own intents. Here are some sample questions from the data I am working with:
Hi, can I please find out the location of the nearest Depot to Meadow Springs WA 6210?
Is there any chance we could get 34 cases from Melbourne to Sydney by Friday. I am hoping it will be a Sydney to Sydney tomorrow but if not can anyone do this by Friday from Melbourne?
can you tell me how the warranty claim is going on booking number 9528 thanks
Intents are usually created for a specific application by providing examples. However, some services do provide pre-defined intents you can use:
- LUIS provides prebuilt "domains" which include some place-related queries
- Snips has an "intents library" that you can use
That may be able to get you started. If that doesn't work, this guide to a from-scratch implementation may be useful; a minimal sketch follows below.
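As a rough illustration of what a from-scratch implementation involves, here is a minimal sketch using scikit-learn, assuming you can label a few hundred utterances per intent from your database. The intent names and training examples below are illustrative, not from any service.

```python
# Minimal intent-classifier sketch: TF-IDF features + a linear model.
# Intent labels and example utterances are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "where is the nearest depot to Meadow Springs",
    "can we get 34 cases from Melbourne to Sydney by Friday",
    "how is the warranty claim going on booking 9528",
]
train_intents = ["find_location", "book_transport", "warranty_status"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_intents)

print(clf.predict(["what's the status of my warranty claim?"]))
```

With enough labelled examples per intent, a simple baseline like this is often competitive before reaching for anything heavier.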
I am working on speech recognition for Indian-accented English. For better recognition, I want to create a language model for the Indian accent.
The tutorial I found describes the process on Linux. Is there any way to do acoustic model adaptation on Windows? Is there any alternative to recorded sound for creating an acoustic model?
I found an online language model tool here: http://www.speech.cs.cmu.edu/tools/lmtool.html
But it is for the US accent.
Is there any online tool to create an Indian/UK accent language model?
First, a quick clarification: creating a new language model will not help you with accents. The language model only specifies the orders in which words can appear. You are looking for acoustic, not language, model adaptation.
Because adaptation is a complex process, there are no good online tools for it (the one you linked is a language model tool). I would recommend downloading SphinxTrain at http://cmusphinx.sourceforge.net as a good all-purpose adaptation/training tool. There are many tutorials and forums to help you get started.
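The adaptation itself is done offline with SphinxTrain's command-line tools (the CMUSphinx adaptation tutorial walks through them step by step). Once you have an adapted acoustic model directory, using it for recognition looks roughly like this sketch with the classic pocketsphinx Python bindings; the model directory and file paths are placeholders I made up.

```python
# Sketch: decode audio with an adapted acoustic model via pocketsphinx.
# "en-in-adapted" and the other paths are placeholder assumptions.
from pocketsphinx import Decoder

config = Decoder.default_config()
config.set_string("-hmm", "en-in-adapted")        # adapted acoustic model directory
config.set_string("-lm", "en-us.lm.bin")          # language model (unchanged)
config.set_string("-dict", "cmudict-en-us.dict")  # pronunciation dictionary

decoder = Decoder(config)
decoder.start_utt()
with open("test.raw", "rb") as f:  # 16 kHz, 16-bit mono raw PCM
    while True:
        buf = f.read(1024)
        if not buf:
            break
        decoder.process_raw(buf, False, False)
decoder.end_utt()
print(decoder.hyp().hypstr)
```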
Greetings
I may have imagined this, but does anyone know if Last.fm previously used some form of open-source project to perform analysis on music to determine similar music?
As it's now moved to a pay version, I'd like to make something which can add known music to my playlist. (I hate scanning my computer for similar music manually.)
Failing that, does anyone know of any system that I could use to replace this? Ideally I'd like some form of API / source code that I can use to automate the whole process into batch jobs.
Thanks,
[edit]
Ideally I was looking for something more along the lines of content matching. I'm the type of person who just throws all my music into one unorganized location. Being lazy, I would ideally expect a similar-music playlist to be generated for me.
Last.fm uses http://www.audioscrobbler.net/ - it also provides access to its database via an API.
[/edit]
Music similarity is not an easy problem.
There are two general approaches to solving this problem.
Approach 1.
Throw data at the problem. This is the approach Last.fm and Pandora take. It's basically one huge database which is maintained by either a community or a group of experts. Note that to use this approach you will need clean metadata or some kind of audio fingerprinting solution such as MusicBrainz. Once you have the feature database, you can use algorithms such as the Pearson correlation coefficient to find similar items.
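A toy sketch of that last step, using made-up play counts as the feature database (the data and function here are purely illustrative):

```python
# Item-item similarity from user play counts via Pearson correlation.
# The play-count matrix is fabricated for illustration.
import numpy as np
from scipy.stats import pearsonr

# rows = users, columns = tracks
plays = np.array([
    [12, 10, 0, 1],
    [ 8,  9, 1, 0],
    [ 0,  1, 7, 6],
])

def similar_tracks(track_idx):
    scores = []
    for j in range(plays.shape[1]):
        if j != track_idx:
            r, _ = pearsonr(plays[:, track_idx], plays[:, j])
            scores.append((j, r))
    return sorted(scores, key=lambda s: s[1], reverse=True)

print(similar_tracks(0))  # tracks most correlated with track 0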
Approach 2.
Throw algorithms at the problem, in particular computer audition algorithms. This means you calculate vectors of the various features a song contains, and then, using neural nets and a variety of other techniques, you find other songs with similar vectors. This approach has been used successfully for automatic genre classification and query by example.
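A minimal sketch of that idea, using librosa (one open-source audition library; Marsyas, mentioned below, plays a similar role): reduce each song to a summary vector of MFCC features and compare vectors by cosine similarity. The file paths are placeholders.

```python
# Compare two songs by acoustic feature vectors (MFCC mean/std summary).
import librosa
import numpy as np

def feature_vector(path):
    y, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = feature_vector("song_a.mp3")  # placeholder paths
v2 = feature_vector("song_b.mp3")
print(cosine(v1, v2))  # closer to 1.0 = more acoustically similar
```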
If you are looking for open-source software for music analysis, Marsyas can do pretty much everything the commercial stuff can do. It's the brainchild of George Tzanetakis, and on his web site you can find many papers about the state of affairs in computer audition.
There's a web API at The Echo Nest that includes a get_similar web service that allows you to retrieve artists similar to a set of seed artists. You can use this to help build playlists. The Echo Nest also has a set of web APIs that will perform a detailed analysis of a track (similar to the aforementioned Marsyas) that one could use as the basis for an acoustic-based song similarity method. (Caveat: I work at The Echo Nest.)
Of course, if you use iTunes, there are some canned solutions. iTunes now has a music recommender / playlist generator that will build playlists of songs from similar artists. Similarly, the company Mufin has an iTunes add-on which will perform acoustic analysis of your tracks and use this analysis to build playlists.
If you are interested in building your own music similarity system, I suggest that you take a look at the proceedings of ISMIR (the International Society for Music Information Retrieval). There's quite a bit of research around music similarity and playlisting that you'll find helpful. You can find the proceedings at ismir.net.
Wouldn't it be simpler/more efficient to query (or build?) an internet database based on genre/style/etc.? I used Last.fm and similar sites but never felt they did anything more than this (at least the results weren't indicating that) ;)
I am not very sure what exactly you want, but how about MusicBrainz?
To be clear, AudioScrobbler is the tech built by Last.fm to run their service. They collect stats on the tracks people listen to (and also 'like's of tracks and artists).
So Last.fm does social similarity... users who listened to X also listened to Y - you like X so maybe you will also like Y.
Given a large enough user base submitting stats, social similarity is likely to provide better results than computer-analysis approaches. For example, try querying the Last.fm API for artists similar to one you know: it probably comes up with some good matches and a few obscure or oddball ones, which nonetheless reflect real people's listening habits. The more obscure the artist you search for, the more likely you'll get weird matches.
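If you want to try that, here is a quick sketch against Last.fm's public artist.getsimilar endpoint; you need to register for your own API key (the one below is a placeholder).

```python
# Query Last.fm for artists similar to a seed artist.
import requests

resp = requests.get(
    "http://ws.audioscrobbler.com/2.0/",
    params={
        "method": "artist.getsimilar",
        "artist": "Miles Davis",    # any seed artist
        "api_key": "YOUR_API_KEY",  # placeholder
        "format": "json",
        "limit": 5,
    },
)
for a in resp.json()["similarartists"]["artist"]:
    print(a["name"], a["match"])
```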
Even if you could get the automatic genre classification method described by George Tzanetakis to work well, you would be missing out on the subjective judgements of quality supplied by real people. For example, two tracks may both look like 'Jazz', but there are many different kinds of Jazz... and I might be interested in non-Jazz albums that a favourite jazz musician has played on. Social similarity would be more likely to capture that info.
I used to use Predixis Magic Mixer. It performs a brief analysis of the audio in a file, produces a "fingerprint", and compares it to fingerprints in a central database. If the file is listed, it stores an identification code, the result of the analysis of the entire file, in the client copy. If not, it does a full analysis on the client computer (which takes a while), uploads that to the central database, and keeps the local copy as well. From that information it can set up a playlist that relates tunes to one another depending upon the actual sounds. I have not used it for a few years, so I don't know if the central database servers are still in operation, but a web search says no. It should still work, but every file will require full analysis.