Need a local SDK tool for parsing native PDF files with large tables

I need to parse native PDFs (selectable text, not scanned, no OCR required) locally. The PDF files may be over 400 pages long, with large tables. Some tables may not have clear borders. Is there any API I could use?
Thanks!

Now that I know you don't want an API, I recommend checking out iTextSharp, available from NuGet. I have used it several times in the past, and there are many Stack Overflow threads on how to use it. https://www.nuget.org/packages/iTextSharp/5.5.13.1
EDIT: I apologize, it looks like iTextSharp has been replaced by iText 7: https://itextpdf.com/en/products/itext-7

There are several PDF parser APIs you could use. PDFTron looks promising, and they offer a free trial: https://www.pdftron.com/pdf-sdk/parsing-library/
DocParser may also be helpful: https://docparser.com/features
I found all of these through a simple Google search, so it may benefit you to do some research yourself, as we can only make broad suggestions based on the information in your question.

Related

Play Framework 2: Printable Documentation

While this question is asking for a downloadable documentation in general, I'm currently trying to find a good way to print the official Play framework documentation. My problem is that the whole documentation (available online) is split into small chunks of information and printing the whole documentation would mean hundreds of print jobs, each wasting a significant amount of paper. Is there some way to convert the whole documentation in a single/compact printable format? This would make a nice holiday reading :).
Apparently the PDF documentation disappeared as of Play version 2.1.0. I can see that it is still available for 2.0.x, if that works for you.
All I can advise is making a static copy of the pages so you can read it on a smartphone or tablet.

Charts library for Ruby

I am looking for a quite specific tool for generating charts within a Ruby on Rails application. I have done some research and couldn't find a solution that suits me.
Maybe you've come across one and could just point me to it with a link? :)
My requirements for a solution are:
it has to feature basic chart types like Pie, Bar, Stacked Bar, Line.
it has to have basic configuration of a chart like legend, axis description.
it has to be able to generate and save chart into image file without actually rendering it in a browser
being a Ruby library would be nice, but it is not obligatory
not being Gruff Graphing Library, I am looking for something more up to date, with less issues.
If you are aware of something, please post a link. It'll take you just a few seconds.
I think this is what you're looking for:
http://highcharts.com/
https://github.com/loudpixel/highcharts-rails
I've made a number of useful charts with the Google Chart API. There are a few gems: googlecharts, gchartrb. I haven't used them but they look like a good first cut.
As for not displaying it, you can just make the call and save the result. There is no need to render anything: build the URL and fetch the file.
The only one that has satisfied all of these conditions is gerbilcharts:
I was able to make it work without installing weird dependencies
It does not require an HTML context (unlike all the JS libraries)
It does not depend on Google Charts (no network traffic dependency)
It exports to a static graphic format
Chartkick!!!
It's super nice. =)
It's been a long time since I asked this question and I see new people coming and posting new answers, which is great. This small post of mine turned out to be a small compendium of available libraries.
I decided to add my two cents. Nowadays whenever I am dealing with charts I usually use Highcharts. Highcharts is a very pleasant library by itself, but additionally there is an incredible gem, highcharts_on_rails, which facilitates creating charts using a DSL written in Ruby.
If you found this question and you're looking for options, consider highcharts_on_rails.
This might be able to do what you want:
Gruff
You can use rchart for plotting various charts.
I am using openflashchart
http://pullmonkey.com/projects/open_flash_chart2/
You can save the generated JSON data in a database and render it when needed.

TIGER shapefiles - using and interpreting

I know of the US GIS TIGER file format from years ago, but have never used it.
I'm very shortly going to need to very quickly implement simple geocoding and vector graphics of roads and other features.
Where do I go for information - are there tutorials, example queries, etc?
Are there other ways to include geocoding and basic mapping in a mobile (no internet) device?
-Adam
As far as I'm aware, there aren't many applications that make use of the TIGER/Line format directly. Most apps use TIGER files that have been translated into ESRI's shapefile format.
Edited to add:
Is there information on ESRI's format available?
There's an ESRI whitepaper describing the file format.
If you're planning to use shapefiles in an application, there are various libraries out there.
The OpenStreetMap project imported TIGER data, so you might find useful code snippets there. See the TIGER page on the OpenStreetMap wiki for more information and links.

What is your preferred way to produce charts in a Ruby on Rails web application?

I'd like to add some pie, bar and scatter charts to my Ruby on Rails web application. I want them to be attractive, easy to add and not introduce much overhead.
What charting solution would you recommend?
What are its drawbacks (requires Javascript, Flash, expensive, etc)?
Google Charts is an excellent choice if you don't want to use Flash. It's pretty easy to use on its own, but for Rails, it's even easier with the gchartrb gem. An example:
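require 'google_chart'

GoogleChart::PieChart.new('320x200', "Things I Like To Eat", false) do |pc|
  # Each call adds one labelled slice to the pie
  pc.data "Broccoli", 30
  pc.data "Pizza", 20
  pc.data "PB&J", 40
  pc.data "Turnips", 10
  # Print the URL of the generated chart image
  puts pc.to_url
end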
If you don't need images, and can settle on requiring JavaScript, you could try a client-side solution like the jQuery plugin flot.
I am a fan of Gruff Graphs, but Google Charts is also good if you don't mind relying on an external server.
It requires Flash and isn't free (though inexpensive): amCharts.
I've used it successfully and like it. I evaluated a number of options a while back and chose it. At the time, however, Google Charts wasn't as mature as it seems to be now. I would consider that first if I were to re-evaluate now.
There's also Scruffy. I took a look at the code recently and it seemed easy to modify/extend. It produces svg and (by conversion) png.
Have you tried the Google Charts API? - web service APIs don't really come much simpler. It's free to use, simple to implement, and the charts don't look too shoddy.
Open Flash Chart II is a free option that gives very nice output. It does, as you'd expect, require Flash.
Fusion Charts is even nicer, but is $499. In researching this, I found a cut-down free version that might serve your needs.
I second the vote for flot. The latest version lets you do some animations and actions that I previously thought would only be possible via Flash. The documentation is fantastic. It is simple to write by hand, but for simple cases it gets even easier with a Rails plugin called flotilla. You should check out the examples page for a better idea of what it's capable of. The zooming and hover capabilities are especially impressive.
The new Google Visualization appears to produce charts that are of more varied type, better looking and interactive than Google Graphs.
http://code.google.com/apis/visualization/
Morris.js is nice and open source. I would choose it over Highcharts. There is a great new video tutorial from Railscasts.
I've just found ZiYa produces some really sexy charts and is Rails specific.
The downsides are that it uses Flash and, if you don't want the sites to link to the XML/SWF page, it costs $50 per site.
[I've not decided on it yet, but wanted to throw it out there in case people want to vote it up]
I've used Fusion Charts extensively from within a Java web application, but it should work the same way from Rails, since you're just embedding a Flash object via HTML or JavaScript and passing it XML data. It's a slick package and their support has always been very responsive.
You should take a look at Dmitry Baranovskiy's Javascript library called Raphaël.
Google Charts is very nice, but it's not a Rails-only solution. You simply use the programming language of your choice to dynamically produce URLs that contain the data, and Google returns a nice image of your chart.
http://code.google.com/apis/chart/
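To make the "URL that contains the data" idea concrete, here is a minimal sketch in plain Ruby using only the standard library. The helper names `pie_chart_url` and `save_chart` are made up for illustration. The query parameters (`cht`, `chs`, `chd`, `chl`) follow the image Chart API as I remember it, and the endpoint host is an assumption; as far as I know that image endpoint has since been retired, so the download helper is defined but deliberately not called.

```ruby
require "uri"
require "net/http"

# Hypothetical helper: builds a pie-chart URL in the style of the Google Chart API.
# The endpoint host below is an assumption and may no longer respond.
def pie_chart_url(data, size: "320x200")
  params = {
    "cht" => "p",                           # chart type: pie
    "chs" => size,                          # image size, width x height in pixels
    "chd" => "t:" + data.values.join(","),  # data values, plain text encoding
    "chl" => data.keys.join("|")            # one label per slice
  }
  "https://chart.googleapis.com/chart?" + URI.encode_www_form(params)
end

# Hypothetical helper: fetches the image behind the URL and writes it to disk.
def save_chart(url, path)
  File.binwrite(path, Net::HTTP.get(URI(url)))
end

url = pie_chart_url({ "Broccoli" => 30, "Pizza" => 20, "PB&J" => 40, "Turnips" => 10 })
puts url
# save_chart(url, "chart.png")  # needs network access and a live endpoint
```

All of the chart data lives in the query string, so nothing has to be rendered in a browser: the same URL can go into an `img` tag or be fetched and cached server-side.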
In the old days, I decided to roll my own (using RVG/RMagick), mainly because Gruff didn't have everything I wanted. The downside was that finding and eliminating all the bugs in graphing code is a pain. These days Gruff is my choice as it's really gone forward in terms of customization and flexibility.
The standard Gruff templates/color choices suck though, so you'll need to get your hands dirty for best results.
Regarding amCharts, there's a "free" version with very few restrictions that generates Flash charts including the 'chart by amCharts.com' mention.
And there's a nice plugin, ambling, that provides you with some helper methods to easily add charts to your views. Please note that amCharts.com reference documentation is still a must to tailor the chart to your requirements.
GoogleCharts and Gruff charts are great, but sometimes they lack some features that I need for more scientific plotting. There is a gem for gnuplot which may be helpful for some of these situations.
http://rgplot.rubyforge.org/
I have started using protovis to generate SVG charts with JavaScript. My basic approach in Rails is to have a controller that returns the data to be charted as JSON, and to scoop it up with a bit of JavaScript and protovis.
The only downside is that full IE support is currently unavailable out of the box, since it is based on SVG. However, current patches go a fair way toward providing IE support, details of which can be found here.
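The server side of the "controller returns JSON" approach could be sketched roughly like this. It is plain Ruby rather than a real Rails controller so that it runs on its own; the `chart_payload` helper and the `labels`/`values` field names are assumptions for illustration, not anything protovis requires.

```ruby
require "json"

# Hypothetical helper: turns rows of { label:, value: } into a chart-friendly hash.
def chart_payload(rows)
  {
    labels: rows.map { |row| row[:label] },
    values: rows.map { |row| row[:value] }
  }
end

rows = [
  { label: "Mon", value: 12 },
  { label: "Tue", value: 19 },
  { label: "Wed", value: 7 }
]

# In a Rails controller action this would typically be `render json: chart_payload(rows)`.
json = JSON.generate(chart_payload(rows))
puts json
```

The client-side script then requests this endpoint and hands the parsed arrays to whichever charting library is in use.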
I personally prefer JavaScript-based charts over Flash. If that's ok, also check out High Charts. A Rails plugin is also available.
The gchartrb gem is no longer maintained, it seems. The author points to these gems:
googlecharts
gchart (seems abandoned as well)
We do this by shelling out to gnuplot to generate the charts as PNGs server-side. It's a bit old-school and the charts aren't interactive but it works and is cacheable.
(The other reason we do this is so we can put exactly the same chart in the PDF version of the report).
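For anyone curious what shelling out to gnuplot can look like, here is a minimal sketch. It is not the setup described above, just one way to do it: `gnuplot_script` and `render_chart` are hypothetical helper names, and the sketch assumes a `gnuplot` binary with PNG terminal support is on the PATH. The script is fed to gnuplot on standard input, with the data points inlined after the `plot '-'` command.

```ruby
require "open3"

# Hypothetical helper: builds a gnuplot script that writes a PNG line chart.
# Points are inlined after "plot '-'" and terminated by a line containing "e".
def gnuplot_script(points, output:, title:)
  lines = [
    "set terminal png size 640,480",
    "set output '#{output}'",
    "set title '#{title}'",
    "plot '-' using 1:2 with lines notitle"
  ]
  points.each { |x, y| lines << "#{x} #{y}" }
  lines << "e"
  lines.join("\n") + "\n"
end

# Hypothetical helper: pipes the script to gnuplot and raises if it fails.
def render_chart(script)
  _stdout, stderr, status = Open3.capture3("gnuplot", stdin_data: script)
  raise "gnuplot failed: #{stderr}" unless status.success?
end

script = gnuplot_script([[1, 10], [2, 14], [3, 9]], output: "report.png", title: "Weekly signups")
puts script
# render_chart(script)  # uncomment on a machine where gnuplot is installed
```

Because the result is an ordinary PNG file, it can be cached on disk and reused in both the HTML page and a PDF report.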
This isn't specifically RoR; however, it is a pretty slick port of Gruff to JavaScript: http://bluff.jcoglan.com/
ChartDirector. Ugly API, but good, server-side image results. Self contained binary.
FWIW, I'm not a fan of using Google Charts when fit & finish is important. I find that the variables for sizing, in particular, are unpredictable - the chart does its own thing.
I haven't yet played with Gruff/Bluff/etc., but for a higher-profile project I won't use Google Charts.
If you want quite sexy charts, easy to generate, and you can enable Flash, then you should definitely have a look at maani.us xml/swf charts.
Some XML builder behind it and you're ready to go.
FusionCharts is a very good charting product. It works well with RoR, and their support and forums are good. The free version of this product has a limited number of charts and features, but no watermark.
I just started using googlecharts for my rails 3 project. It is nice and clean, and seems to be the only google visualization api based gem which is alive. Others are inactive and mostly use the old google charts api (released somewhere in 2007-2008).
https://github.com/mattetti/googlecharts
D3 has become my preferred way to add great-looking charts to web apps. You have to do a little more work than with some other frameworks, but the appearance and control outweigh that.
I primarily use SVG, which means no IE8, but that is becoming less of an issue.
HighChart - A charting library written in pure JavaScript
Gems like highchart-rails and lazy-high-chart make the integration with Rails easier
gem 'chart' makes it easy to add ChartJS and NVD3 charts to rails.

What do you use to capture webpages, diagram/pictures and code snippets for later reference? [closed]

What do you use to capture webpages, diagram/pictures and code snippets for later reference?
Evernote http://www.evernote.com and delicious http://www.delicious.com
Evernote
Notepad2's clipboard feature (Notepad2.exe /c as a link in Launchy)
Windows Clippings or PrintKey
Firefox extension Page Saver
Delicious
Microsoft OneNote.
I just have an emacs instance running on my home machine, under screen. Wherever I am (and have network) I can connect to it remotely. I stick all useful URLs, birthday present ideas, future dates, code snippets, ideas for docs, etc. in there.
I rarely have doodles/diagrams I need to capture, I tend to draw them in ascii in my file if needed.
I must admit I'm a bit stuck if I have no network/wifi somewhere, but that's rarely the case.
I find Google Notebook very good for drive-by code snippeting, and Google Bookmarks, especially when used with the Google Toolbar, for web pages.
The benefit of these tools is that they are available from any PC on the web, though good semantic organisation using labels is recommended.
Here's my response to a similar question:
The combination of OneNote with a tablet PC is awesome! I was a bit of a skeptic at first. I used the trial version and then forgot about it. A year later I had an unruly collection of files, project related emails, notebooks and scraps of paper all scattered throughout my life. I went back to OneNote and all my problems went away. Some highlights:
Everything is searchable. The character recognition is good enough that my chicken-scratch meeting notes can be searched. Text within images is searchable.
OneNote syncs with Outlook so finding meeting notes is a breeze.
I now embed all files into OneNote - pdfs, spreadsheets, word docs, images, web clippings.
OneNote is constantly saving all changes so, combined with a scheduled automated backup, everything is in one place and is safe.
There are some built-in collaboration tools I have yet to try but that look useful.
It is SO worth the price. It allows you to get started on a project and avoid all that time spent deciding how to organize things.
Zotero is a nice plugin for Firefox.
SnagIt
captures everything you could want, and lets you annotate it.
I prefer to use the good old URL for Delicious.
Apart from that, I use the ScrapBook extension in Firefox when I want to save something to disk. It's possible to tag the page, edit it and remove those stupid ads before saving it.
I also have a Wiki on a Stick that I carry around on a USB key for code snippets that should go to other clients when I'm travelling around.
Mostly, my code snippets are embedded into projects I carry on the same USB key, which allows me to demonstrate some technologies right away to the client and get his advice based on a demonstration, not a listing of code...
For screenshots, I use a mix of ScrapBook and ScreenGrab. They are both Firefox plugins that are pretty amazing when you need to get a screenshot of a page for editing. Works great for consulting.
https://addons.mozilla.org/en-US/firefox/addon/427
https://addons.mozilla.org/en-US/firefox/addon/1146
Delicious Bookmarks extension for Firefox
It's a little primitive, but I've been using TiddlyWiki (a self-contained, single-file wiki) http://www.tiddlywiki.com/ which works well for basic text and markup. I combine it with a plugin to sync it with Outlook's notes (http://syncoutlooknotes.tiddlyspot.com/#SyncOutlookNotes) so that I can then sync it to my BlackBerry using the standard Outlook-BlackBerry sync mechanism. This has the significant advantage that I can look at my notes and even write new notes when I'm out and about, away from my laptop, or just don't feel like lugging the laptop around to a meeting that I don't really need it for.
I'd prefer using something more advanced like OneNote, but being able to take my notes with me on the little BlackBerry has turned out to be a significant advantage.
Google Notebook is a very convenient tool. You can clip and save any part of a web page without leaving your browser tab. The Notebook plug-in automatically saves clippings as separate notes in your notebooks and keeps the links back to the original web pages. You can organize your clippings later by moving them between your notebooks and/or tagging them. Very good for code snippets and references.

Resources