Best 3rd Party Resume Parser Tool [closed] - parsing

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
We are working on a hiring application and need the ability to easily parse resumes. Before trying to build one, we were wondering what resume parsing tools are available out there and which is the best one, in your opinion? We need to be able to parse both Word and TXT files.

I suggest looking at some AI tools. Three that I'm aware of are
ALEX
Sovren
Resume Mirror
I think all of these products handle Word, TXT, and PDF, along with a bunch of other document types. Although I've never used it, I've heard unfavorable things about Resume Mirror's accuracy and customer support. I'm a contract recruiter and have used both Sovren's and Hireability's parsers in different ATSs. In my experience Hireability did a better job; with Sovren it seemed like I was always fixing errors. And when there was a goof with Hireability's parser, I gave it to my ATS vendor and it seemed to get fixed pretty quickly. Good luck.

Don't try to build one unless you want to dedicate your life to it. Don't re-invent wheels!
We build and sell a recruitment system. I did a long evaluation a few years ago and went for Daxtra - the other one in the frame was Burning Glass but I got the impression that Daxtra did non-US resumes better.
Anyway, we're re-evaluating it. Some parts it does brilliantly (name, address, phone numbers, work history) as long as the resume is culturally OK. But if it's not then it fails. What do I mean: Well, if the resume has as the first line:
Name: Sun Yat Sen
then Daxtra is smart enough to figure out that Sun Yat Sen is the guy's name. (Girl's?)
But if it has as the first line:
Sun Yat Sen
It can't figure it out.
On the other hand if the first line is
Johnny Rotten
then Daxtra works out his name.
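To illustrate why a labelled first line is so much easier to handle than a bare name, here is a toy sketch in Python. It is not how Daxtra or any other commercial parser works; the `extract_labelled_name` helper and its regular expression are made up for this example.

```python
import re
from typing import Optional

# Hypothetical pattern: matches lines such as "Name: Sun Yat Sen".
LABELLED_NAME = re.compile(r"^\s*name\s*[:\-]\s*(?P<name>.+?)\s*$", re.IGNORECASE)


def extract_labelled_name(resume_text: str) -> Optional[str]:
    """Return the name if the first non-empty line is explicitly labelled."""
    for line in resume_text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        match = LABELLED_NAME.match(line)
        # Only the first non-empty line is inspected in this sketch.
        return match.group("name") if match else None
    return None
```

With this sketch, a resume starting with `Name: Sun Yat Sen` yields the name, while a bare `Sun Yat Sen` or `Johnny Rotten` yields `None`. Deciding whether an unlabelled line is a person's name needs something more than a pattern, such as name lists or a trained model.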
Also, it works really well on UK addresses, fairly well on Australian addresses, crashes and burns on Indonesian addresses. That said, we've just parsed 35,000 Indonesian resumes relatively well - CERTAINLY far better than not doing it at all, or doing it manually!
On Skilling: I reckon if someone really tried to make the Skills section work then it would take 3 man-months or so and it would work really well.
Summary: Don't write it yourself, do some really good research on real resumes that you want parsing and dive in.
The key thing is: Don't expect any tool to be anywhere near 100% accurate - but it's a lot better than not having it.
Neil

FWIW I just ran 650 international resumes through Rchilli and found the accuracy to be very poor. Names & addresses were mangled and the detail fields were hit and miss.
This was a mix of pdfs & Word docs, primarily from Europe & Asia.

I have seen a lot of resumes in PDF format. Are you sure you don't care about them?
I'd recommend something simple:
Download Google Desktop Search or a similar tool (e.g. Copernic)
Drop the files in a directory
Point the index tool to that directory, and punch in your search terms.

You may want to have a look at eGrabber and RChilli; these are the two best tools on the market.

I was wondering if anyone could update this list. The answers all seem to date from 2010, almost 3 years ago.
We integrated RChilli and found no flaws in it; the support is the best, and the product is easy to use.
We tested RChilli, Hireability, and Daxtra. Sovren never responded to our emails.
Integration was smooth, and the support was the best we found.


Human annotation tool for corpora in NLP [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
I am trying to build my own training corpus for Named Entity Recognition, but I don't know if there is already an existing tool for this or if I have to implement one myself.
Basically, what I need to do is take a corpus and manually tag it word by word, which is pretty tedious, but it has to be done.
Can anyone tell me if there is already an existing one and where to get it?
I had a good experience working with BRAT.
GATE is another annotation tool, but it is very complex and has a steeper learning curve.
We had a nice experience using DataTurks. They provide a nice, intuitive UI that lets you add collaborators, and it offers insights into the data, a leaderboard for annotators and some other funky features.
https://dataturks.com
For online annotation of text or HTML corpus of relatively short documents I also recommend BRAT. You will have to go under the hood of the python web application if you want to do anything custom. It also failed to work for me on large HTML documents (100 or so pages).
I have also used stand-alone apps:
Protege + Knowtator: a bit cumbersome to set up and use, but it works;
GATE: also cumbersome, and it somewhat works. Back up your annotations at regular intervals, as you might get surprised by a stack trace that also wipes or corrupts your annotated corpus (which is just serialized Java objects).
If you are dealing with PDF documents, we built a web-based PDF Annotation Tool: NOTA. It accepts anything printed to PDF, including scans. We do commercial OCR on our end to recover text from images. There is a REST API to create color-coded annotation schemas and pre-populate documents with annotations, as well as a REST API for exporting formatted text and annotation offsets. There is also a JS API you can use to customize any annotation workflows, add metadata to annotations, etc. Relationships are not supported out of the box. Large documents, 200+ pages are supported. Email us and we can give you an API key to try it out. Details and documentation links can be found here. It is free for small research projects.
I co-develop the web-based text annotation tool tagtog.net.
There is nothing to install, and you can define the types of entities you want to annotate. Additionally, you can annotate relationships, document labels, and much more. You can upload your documents in many different formats, including PDF or Markdown. You can annotate together with your team collaboratively. We have put great care into making the interface easy and beautiful.
You can start right away with a free account. Also, I would be happy to help you with any questions or issues you may have; just ping me or write us an email at the address shown on the website, tagtog.net.
Our annotation tool Prodigy is very scriptable, and is designed for active learning. It integrates especially well with our NLP library spaCy.
We've paid particular attention to the Named Entity Recognition (NER) annotation workflows, as entity recognition can otherwise be very slow. I have a tutorial video on this:
https://www.youtube.com/watch?v=l4scwf8KeIA
There is a tool called Dataturks. It is a super simple to use, fully online NLP annotation tool, so I can even easily get my teammates to complete datasets for our projects.
Try TagEditor.
It is a desktop application designed for annotating text for training with the spaCy library.
You can tag named entities, dependencies, parts of speech and text categories, and output a JSON file.

Tools for searching full text in iOS bundle

Sorry for the generalized question...I have been hunting for a long time and haven't found anything I can use or easily adapt yet. I'd really appreciate any pointers!
I'm building a reference app that will contain several textbooks in plain-text format. I want the user to be able to perform a search, and get a table back with a list of results. I have a working prototype, but the search logic that I wrote isn't all that smart and it's been hell trying to make it better.
This is obviously a fairly common problem so I'm looking for a tool that I could adapt to the task. So far I've found Lucene (http://vafer.org/blog/20090107014544/) and Locayta (http://www.locayta.com/iOS-search-engine/locayta-search-mobile/)
Lucene appears to have been last updated for iOS 2...I don't even know if I'll be able to rework it myself. Maybe.
Locayta would probably work great, but a commercial license is $1,000 and I may not soon recoup that with this app, as it's a niche market.
Thanks!
We stumbled upon the same predicament where I work, and have yet to decide on a solution.
Locayta seems promising, but barring that, I've looked into SQLite's FTS3/FTS4 as well.
The only issue seemed to be the lack of a way to match partial words. It's easy to search for fields that contain whole words (e.g. "paper" matches "printer paper", "paper punch", and "sketch paper"), or words that start with something (e.g. "bi*" matches "binder" and "bicycle"), but there's no built-in way to match a suffix.
If you don't require that functionality, FTS3/FTS4 might work.
I see you mentioned in the follow-up that your SQLite didn't recognize FTS3(), and I had the same issue at first.
Apparently it's not bundled into the iOS version by default; instead, you have to download the SQLite3 amalgamation and include it in the project manually, as described in the question "Is FTS available in the iOS build of SQLite?"
Also note that the SQLITE_ENABLE_FTS3 compile-time option is not enabled by default; you have to add it to the build configuration as detailed at http://www.sqlite.org/fts3.html#section_2
Hope this helps.
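For a quick feel of the whole-word and prefix matching described above, here is a minimal sketch using Python's standard sqlite3 module rather than the iOS build. It assumes the SQLite library behind your Python was compiled with FTS3/FTS4 support (common, but not guaranteed); the `docs` table, the `title` column and the `search` helper are names invented for this example.

```python
import sqlite3


def search(conn, query):
    """Run a full-text MATCH query and return the matching titles, sorted."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY title", (query,)
    )
    return [row[0] for row in rows]


# In-memory database with a hypothetical FTS4 table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts4(title)")
conn.executemany(
    "INSERT INTO docs(title) VALUES (?)",
    [
        ("printer paper",),
        ("paper punch",),
        ("sketch paper",),
        ("binder",),
        ("bicycle",),
        ("newspaper",),
    ],
)

whole_word = search(conn, "paper")  # whole-token match
prefix = search(conn, "bi*")        # prefix match
```

Here `whole_word` contains the three titles with the token "paper" but not "newspaper", which is exactly the suffix case that FTS3/FTS4 does not cover, and `prefix` contains "bicycle" and "binder".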
If you can translate plain C code to iOS Objective-C, then Apache Lucy (a loose "C" port of Lucene) might be worth a look.

Music analysis software [closed]

Closed 4 years ago.
Greetings
I may have imagined this, but does anyone know if Last.fm previously used some form of open source project to perform analysis on music to determine similar music?
As it's now moved to a pay version, I'd like to make something which can add known music to my playlist. (I hate scanning my computer for similar music manually.)
Failing that, does anyone know of any system that I could use to replace this? Ideally I'd like some form of API / source code that I can use to automate the whole process into batch jobs.
Thanks,
[edit]
Ideally I was looking for something more along the lines of content matching. I'm the type of person who just throws all my music into one unorganized location. Then being lazy I would ideally expect a playlist to be generated giving me a similar music type of playlist.
Last.fm uses http://www.audioscrobbler.net/ - it also provides access to its database via an API.
[/edit]
Music similarity is not an easy problem.
There are two general approaches to solving this problem.
Approach 1.
Throw data at the problem. This is the approach LastFM and Pandora take. It's basically one huge database which is maintained by either a community or a group of experts. Note that to use this approach you will need clean metadata or some kind of audio fingerprinting solution like MusicBrainz. Once you have the feature database, you can use measures such as the Pearson correlation coefficient to find similar items.
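As a rough sketch of that last step, the Pearson correlation coefficient can be computed in plain Python and used to rank items. The data layout (one list of play counts per track, one entry per listener) and the names `pearson` and `most_similar` are assumptions for illustration, not anything LastFM or Pandora is known to use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / sqrt(var_x * var_y)


def most_similar(target, play_counts):
    """Rank the other tracks by correlation with the target, best first."""
    scores = [
        (pearson(play_counts[target], counts), name)
        for name, counts in play_counts.items()
        if name != target
    ]
    return [name for _, name in sorted(scores, reverse=True)]


# Hypothetical data: play counts per track from the same four listeners.
play_counts = {
    "track_a": [5, 3, 0, 1],
    "track_b": [10, 6, 1, 2],
    "track_c": [0, 1, 5, 4],
}
ranking = most_similar("track_a", play_counts)
```

In this made-up data the listeners who play track_a a lot also play track_b a lot, so track_b ranks first and track_c (negatively correlated) ranks last.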
Approach 2.
Throw algorithms at the problem. In particular, computer audition algorithms. This means you calculate vectors of various features a song contains and using neural nets and a variety of other techniques you find other songs with similar vectors. This approach has been used successfully for automatic genre classification and query by example.
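To make the "similar vectors" idea concrete, here is a minimal sketch that swaps the neural nets for the simplest possible technique: a cosine-similarity nearest-neighbour lookup. The three-number feature vectors and the track names are invented for the example; real systems extract far richer features from the audio.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def nearest(query, library, k=2):
    """Return the names of the k tracks whose vectors are closest to the query."""
    ranked = sorted(
        library,
        key=lambda name: cosine_similarity(query, library[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical feature vectors, e.g. (tempo, brightness, acousticness) scaled to 0..1.
library = {
    "ballad": [0.2, 0.1, 0.9],
    "techno": [0.9, 0.8, 0.1],
    "house": [0.8, 0.9, 0.2],
}
matches = nearest([0.85, 0.85, 0.15], library, k=2)
```

Query by example then amounts to computing the vector for a seed song and asking for its nearest neighbours; here the fast, bright query lands next to the two dance tracks and away from the ballad.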
If you are looking for open source software for music analysis, Marsyas can do pretty much everything the commercial stuff can do. It's the brainchild of George Tzanetakis, and on his web site you can find many papers about the state of affairs with computer audition.
There's a web API at The Echo Nest that includes a get_similar web service that allows you to retrieve similar artists to a set of seed artists. You can use this to help build playlists. The Echo Nest also has a set of web APIs that will perform a detailed analysis of a track (similar to the aforementioned Marsyas) that one could use as the basis for an acoustic-based song similarity method. (Caveat: I work at The Echo Nest.) Of course, if you use iTunes, there are some canned solutions. iTunes now has a music recommender / playlist generator that will build playlists of songs from similar artists. Similarly, the company Mufin has an iTunes add-on which will perform acoustic analysis of your tracks and use this analysis to build playlists.
If you are interested in building your own music similarity system, I suggest that you take a look at the proceedings for ISMIR (the International Society for Music Information Retrieval). There's quite a bit of research around music similarity and playlisting that you'll find helpful. You can find the proceedings at ismir.net
Wouldn't it be simpler/more efficient to query (build?) some internet database based on genre/style/etc.? I used last.fm and similar sites but never felt they did anything more than this (at least the results weren't indicating that) ;)
I am not very sure what exactly you want, but how about MusicBrainz?
To be clear, AudioScrobbler is the tech built by Last.fm to run their service. They collect stats on the tracks which people listen to (also 'Like's of tracks and artists).
So Last.fm does social similarity... users who listened to X also listened to Y - you like X so maybe you will also like Y.
Given a large enough user base submitting stats, social similarity is likely to provide better results than computer analysis approaches. For example, try querying the Last.fm API for similar artists to someone you know - probably comes up with some good matches and a few obscure or oddball ones, which nonetheless reflect real people's listening habits. The more obscure the artist you search for the more likely you'll get weird matches.
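The "users who listened to X also listened to Y" idea can be sketched with simple co-occurrence counting. This is only an illustration under made-up data, not Last.fm's actual algorithm; `histories`, `co_listen_counts` and `also_listened_to` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def co_listen_counts(histories):
    """Count, for every ordered artist pair, how many users listened to both."""
    counts = Counter()
    for artists in histories.values():
        for a, b in combinations(sorted(set(artists)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def also_listened_to(artist, histories, top=3):
    """Artists most often found in the same listening histories as `artist`."""
    counts = co_listen_counts(histories)
    related = Counter({b: n for (a, b), n in counts.items() if a == artist})
    return [name for name, _ in related.most_common(top)]


# Hypothetical listening histories: user -> artists they played.
histories = {
    "user1": ["X", "Y", "Z"],
    "user2": ["X", "Y"],
    "user3": ["X", "Y", "W"],
    "user4": ["Z", "W"],
}
suggestion = also_listened_to("X", histories, top=1)
```

With only a handful of users, one person's odd taste dominates the counts, which matches the point above about obscure artists producing weird matches.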
Even if you could get the automatic genre classification method described by George Tzanetakis to work well you are missing out on the subjective judgements of quality supplied by real people. eg two tracks both look like 'Jazz' but there are many different kinds of Jazz... and I might be interested in non-Jazz albums that a favourite jazz musician has played on. Social similarity would be more likely to capture that info.
I used to use Predixis Magic Mixer. It would perform a brief analysis of the audio in a file, produce a "fingerprint" and compare it to fingerprints in a central database. If the track was listed, it would store in the client copy an identification code that is the result of the analysis of the entire file. If not, it would do a full analysis on the client computer (which takes a while), upload that to the central database and keep the local copy as well. From that information it could set up a playlist that relates tunes, one to another, depending upon the actual sounds. I have not used it for a few years, so I don't know if the central database servers are still in operation, but a web search says no. It should still work, but every file will require full analysis.

What do you use to capture webpages, diagram/pictures and code snippets for later reference? [closed]

Closed 2 years ago.
What do you use to capture webpages, diagram/pictures and code snippets for later reference?
Evernote http://www.evernote.com and delicious http://www.delicious.com
Evernote
Notepad2's clipboard feature (Notepad2.exe /c as a link in Launchy)
Windows Clippings or PrintKey
Firefox extension Page Saver
Delicious
Microsoft OneNote.
I just have an Emacs instance running on my home machine, under screen. Wherever I am (and have network) I can connect to it remotely. I stick all useful URLs, birthday present ideas, future dates, code snippets, ideas for docs, etc. in there.
I rarely have doodles/diagrams I need to capture, I tend to draw them in ascii in my file if needed.
I must admit I'm a bit stuck if I have no network/wifi somewhere, but that's rarely the case.
I find Google Notebook very good for drive-by code snippeting, and Google Bookmarks for web pages, especially when used with the Google Toolbar.
The benefit of these tools is that they are available from any PC on the web, though good semantic organisation using labels is recommended.
Here's my response to a similar question:
The combination of OneNote with a tablet PC is awesome! I was a bit of a skeptic at first. I used the trial version and then forgot about it. A year later I had an unruly collection of files, project related emails, notebooks and scraps of paper all scattered throughout my life. I went back to OneNote and all my problems went away. Some highlights:
Everything is searchable. The character recognition is good enough that my chicken-scratch meeting notes can be searched. Text within images is searchable.
OneNote syncs with Outlook so finding meeting notes is a breeze.
I now embed all files into OneNote - pdfs, spreadsheets, word docs, images, web clippings.
OneNote is constantly saving all changes so, combined with a scheduled automated backup, everything is in one place and is safe.
There are some built-in collaboration tools I have yet to try but that look useful.
It is SO worth the price. It allows you to get started on a project and avoid all that time spent deciding how to organize things.
Zotero is a nice plugin for Firefox.
SnagIt captures everything you could want, and lets you annotate it.
I prefer to use the good old URL for Delicious.
Apart from that, I use the ScrapBook extension in Firefox when I want to save something to disk. It's possible to tag the page, edit it and remove those stupid ads before saving it.
I also have a Wiki on a Stick that I carry around on a USB key for code snippets that should go to other clients when I'm travelling around.
Mostly, my code snippets are embedded into projects I carry on the same USB key, which allows me to demonstrate some technologies right off to the client and get his advice based on a demonstration, not a listing of code...
For screen shots, I use a mix between ScrapBook and ScreenGrab. They are both firefox plugins that are pretty amazing when you need to get a screenshot of a page for editing. Works great for consulting.
https://addons.mozilla.org/en-US/firefox/addon/427
https://addons.mozilla.org/en-US/firefox/addon/1146
Delicious Bookmarks extension for Firefox
It's a little primitive, but I've been using TiddlyWiki (a self-contained, single-file wiki) http://www.tiddlywiki.com/ which works well for basic text and markup. I combine it with a plugin to sync it with Outlook's notes (http://syncoutlooknotes.tiddlyspot.com/#SyncOutlookNotes) so that I can then sync it to my BlackBerry using the standard Outlook-BlackBerry sync mechanism. This has the significant advantage that I can look at my notes and even write new notes when I'm out and about, away from my laptop, or just don't feel like lugging the laptop around to a meeting that I don't really need it for.
I'd prefer using something more advanced like OneNote, but being able to take my notes with me on the little BlackBerry has turned out to be a significant advantage.
Google Notebook is a very convenient tool. You can clip and save any parts of web pages without leaving your browser tab. The Notebook plug-in automatically saves them as separate notes in your notebooks and keeps the links back to the original web pages. You can organize your clippings later by moving them between your notebooks and/or tagging them. Very good for code snippets and references.

Touch Typing Software recommendations [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 11 years ago.
Since the keyboard is the interface we use to the computer, I've always thought touch typing should be something I should learn, but I've always been, well, lazy is the word. So, anyone recommend any good touch typing software?
It's easy enough to google, but I'd like to hear recommendations.
Typing of the Dead!
It's a good few years old so you may have to hunt around, but it's a lot of fun and as well as the main game there are numerous minigames to practice specific areas you may be weak on.
I trained my typing on GNU Typist. It comes with exercises for various languages and keyboard layouts, if you're so inclined.
One of the most fun typing programs I used is dvorak7min. It has a nastiness mode where for each typo you make, the cursor goes back by 1. So if you don't watch your typing, you'll be back to square 1….
If you want some motivation to learn to touch type read Steve Yegge's Blog rant:
Programming's Dirtiest Little Secret
Find a long document on the web, using Firefox
Press CTRL+F
Type along with the document. Try it, it works.
Mavis Beacon.
Although not nearly as fun as Typing of the Dead!
I've been touchtyping since I was 10 years old (on a real typewriter at that!) but one thing that helped my sister learn touchtyping is hanging out in IRC channels. You want to be able to "talk" as fast as you can speak and that trained her in typing a lot faster.
I know it's a lame answer and not really a software solution or what, but it worked for a lot of people I know. :)
Try http://keybr.com/? It is a little different from the usual format of free typing tutors. If you create an account, it keeps track of your progress as well. No adware, no pop-ups, and no other useless junk.
If you want to learn by getting thrown in the deep end... DasKeyboard ultimate will have you touch typing in no time :)
I use Rapid Typing to learn touch typing. It has excellent visuals and it's even somewhat relaxing to type.
About the recommendation to use the DasKeyboard, I just started using one today! But be aware that it makes a lot of noise. I was mortified how much noise it was making in my super quiet office filled with other people, who are engineers but mostly not developers. I asked the person across from me if it was too noisy. She hesitated for a fraction of a second before insisting it was fine, and when I said I would put it away she barely protested. So I packed it up. Maybe if you are just surrounded by other devs it would be OK. I'd love to hear of contrary experiences. I'm banging away from home right now though, as loud as possible, and loving it!
Oh, and you will definitely learn to touch type! Right now I have a picture of a labeled keyboard as my desktop image, but am referring to it less and less.
Mike
I have a really weird habitual way of typing where I use several fingers on my left hand but only one or two on the right. This has served me for years and apparently gives me 80+ words per minute, but it does seem an incredibly weird way to type. This is touch typing but not using the "standard" finger arrangement. While it's probably not a great idea to try and fix something that already works, I thought I'd try and relearn to type the proper way (left fingers on asdf and right fingers on jkl;).
I've been trying out Mavis Beacon and it seems alright, it slowly adds more and more letters to your repertoire allowing you to gain the muscle memory or whatever, and then focuses on speed. The "games" seem a little pointless (is this program designed for kids?), but I guess for someone who doesn't know where the keys are it does a good job showing you which fingers to use and where to move them. As I already knew where the keys are most of the program didn't really aid me. Once you know where the keys are you probably just want to practice typing out text and a program like that won't really aid any more than notepad apart from counting your words per minute and giving you something to type. I agree with Typing of the dead being pretty awesome though, and will definitely help with your speed once you've got the finger arrangement down.
Do all you touch typers use the standard finger arrangement or do you just do your own thing? I think I've come to the conclusion I'll just stick with what I know, it seems to work anyway.
For the sake of completeness, my wife used KP Typing Tutor, worked great.
+1 on chatting more
I used the TTCoach plugin for Vim and have been very happy with it. It doesn't come with any exercises for numbers and symbols, however, but it is easy to just make some text files yourself and write :TTCustom file.txt to use them for exercising.
Just learn a couple of characters at a time and when you got them nailed, learn a couple more and so on...
I've been using TypeFaster. It's not pretty, but one nice feature is that it can load lessons in different keyboard layouts, like Colemak (layout files here) or Dvorak.
