I have a requirement to show the words from an OWL file in a web page as an autocomplete field.
So when the user types in a term, it should query the ontology file created using Protege and show the matches in an autocomplete text field.
How can I do this? Is this possible using the Jena API?
Can someone provide examples? I am completely new to Ontology.
You can solve this problem in multiple ways:
You can have a web service (REST API) that accepts a keyword and responds with the matching terms for the autocomplete field; the web service can use the Jena API (SPARQL) to query the ontology. A minimal sketch follows this list.
You can query the ontology using link; if your data is in the form of RDF, then use link.
The SPARQL query language (it's part of the Jena API) helps in retrieving data from an ontology.
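As a rough illustration of the first approach, here is a minimal Jena (Java) sketch that loads the ontology and returns labels matching the prefix the user has typed so far. The file name ontology.owl, the use of rdfs:label for terms, and the LIMIT are all assumptions; adapt them to your ontology.

import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class AutocompleteLookup {
    public static void main(String[] args) {
        // Load the OWL file exported from Protege (RDF/XML assumed)
        Model model = ModelFactory.createDefaultModel();
        model.read("ontology.owl");

        String prefix = "hea"; // what the user has typed so far
        // In a real service, sanitize or parameterize the user input
        String q =
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
            "SELECT DISTINCT ?label WHERE { " +
            "  ?s rdfs:label ?label . " +
            "  FILTER (STRSTARTS(LCASE(STR(?label)), \"" + prefix.toLowerCase() + "\")) " +
            "} LIMIT 10";

        try (QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(q), model)) {
            ResultSet results = qe.execSelect();
            while (results.hasNext()) {
                System.out.println(results.next().getLiteral("label").getString());
            }
        }
    }
}

A REST endpoint would wrap this lookup and return the labels as JSON for the autocomplete widget to display.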
Hope this information helps.
I have a host running neo4j engine, say: neo4jhost:7474
I would like certain users to be able to see query results in a browser, from a pre-generated link. That way, a user could explore the graph interactively without messing with the query syntax.
For example: let the query be
MATCH (n)-[r]->(m) WHERE n.id = 123 RETURN n, r, m
I need a URL link that produces the above mentioned query, but displays the result in a browser, in neo4j graph visualization format.
Currently, Neo4j Browser does not have this feature.
However, you can use a graph visualization library to embed the graph visualization into your web application. Some examples of JavaScript graph visualization libraries:
D3.js
VivaGraphJS
Sigma
KeyLines
Alchemy.js
Alternatively, since Neo4j Browser is an open-source tool, you can check out the project and modify it to achieve your goal.
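If you go the embedding route, your page still needs the graph data. Below is a hedged sketch (Java, no client library) that posts the question's Cypher query to the transactional HTTP endpoint that Neo4j 2.x exposes and prints the JSON "graph" payload, which a library such as D3.js can render; the host name comes from the question, and an unauthenticated setup is an assumption.

import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class CypherFetch {
    public static void main(String[] args) throws IOException {
        // Neo4j 2.x transactional Cypher endpoint
        URL url = new URL("http://neo4jhost:7474/db/data/transaction/commit");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Accept", "application/json");
        conn.setDoOutput(true);

        // Ask for "graph" result data: nodes and relationships ready for visualization
        String payload = "{\"statements\":[{\"statement\":"
            + "\"MATCH (n)-[r]->(m) WHERE n.id = 123 RETURN n, r, m\","
            + "\"resultDataContents\":[\"graph\"]}]}";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        }

        // Dump the JSON response; feed it to your front-end visualization
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}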
As already mentioned, Neo4j Browser does not have this functionality. However, you can have a look at popoto.js. It might not be exactly what you are looking for, but it has a dynamic natural-language query tracker and Cypher query generation.
I suggest inspecting and re-engineering its functions to figure out more details. Maybe it will give you new ideas as well.
Customers use web queries to grab data from the tables directly on our website and place it into Excel, where they can automatically work on it. While trying to grab data from our website, we noticed that table markers were not shown; Excel is unable to recognize the tables on the web page.
The website was developed using RoR.
Can someone help us with this issue?
You may want to consider using Power Query instead of regular web queries. With the "from Web" option you can get the DOM into Power Query and access any DOM element.
I'm looking at visualization options for a graph database project that I have coming up. Part of the job is to provide an interactive visualization of the data for public website visitors.
The standard Neo4j Server web interface does all I would need it to and more. I was wondering if I could simply embed it in a webpage or provide a public URL (that could be accessed without a login) that general users could use to view the visualization, without being able to edit it or add nodes/relationships. If you know of any examples of how this can be done, I would be very grateful.
Thanks!
The Neo4j browser is an Angular.js application that uses d3.js for visualization. The code is all open source and available at https://github.com/neo4j/neo4j/tree/2.2/community/browser/lib/visualization, so you can check it out there.
In general, http://maxdemarzi.com is a good source for visualization blog posts, as is http://neo4j.org/develop/visualization.
Check out Neo4j GraphGists. A GraphGist allows you to embed a Neo4j database, Cypher queries, and visualize the results in a web page. Lots of examples listed on the GraphGist wiki.
Also take a look at Mashed Datatoes, a bar-chart and pie-chart style visualization software for Neo4j.
It uses the Movie database for its demo. Try selecting "Person" as the start label.
I am using RavenDB for my intranet website, and I need to implement full-text search across the whole site. I can use RavenDB's LINQ search queries for documents, which are Lucene-based in the background.
The other approach is to use the Lucene.Net library to implement full-text search independently.
Whichever approach I choose, it should be able to search through attachments stored in blob format in RavenDB.
Any ideas or suggestions, please?
RavenDB is fully integrated with Lucene. There would be little point in using it independently.
But by definition, attachments are not searchable. You can certainly store very large documents that are fully searchable, but they wouldn't be attachments. The whole point of attachments is to hold things that you wouldn't want to search: videos, photos, music, etc.
Review:
http://ravendb.net/docs/client-api/attachments
http://ravendb.net/docs/client-api/querying/linq-extensions/search
http://ravendb.net/docs/appendixes/lucene-indexes-usage
Revised Answer
I have written a bundle that uses IFilters to have RavenDB automatically extract the contents of attachments and index them with Lucene. It is available here.
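The bundle's source is the authoritative reference; purely to illustrate the underlying idea (text extracted from a blob gets indexed as a searchable field), here is a generic Lucene (Java) sketch, not the bundle's actual code:

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class AttachmentIndexer {
    // 'extractedText' stands in for whatever an IFilter pulled out of the blob
    public static void index(String attachmentKey, String extractedText) throws Exception {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        try (IndexWriter writer = new IndexWriter(
                FSDirectory.open(Paths.get("attachment-index")), cfg)) {
            Document doc = new Document();
            doc.add(new StringField("key", attachmentKey, Field.Store.YES));  // identifies the attachment
            doc.add(new TextField("content", extractedText, Field.Store.NO)); // full-text searchable
            writer.addDocument(doc);
        }
    }
}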
Enjoy!
I want to know if there is a better way of extracting info from a web page than parsing the HTML for what I'm searching for, e.g. extracting the movie rating from imdb.com.
I'm currently using the Indy HTTP components to get the page and StrUtils to parse the text, but the content is limited.
I found plain simple regexes to be highly intuitive and simple when dealing with good websites, and IMDB is a good website.
For example, the movie rating on IMDB's movie HTML page is in a <DIV> with class="star-box-giga-star". That's VERY easy to extract using a regular expression. The following regular expression will extract the movie rating from the raw HTML into capture group 1:
star-box-giga-star[^>]*>([^<]*)<
It's not pretty, but it does the job. The regex looks for the "star-box-giga-star" class id, then it looks for the > that terminates the DIV, and then captures everything until the following <.
To create a new regex like this, use a web browser that allows inspecting elements (for example Chrome or Opera). With Chrome you can simply look at the web page, right-click on the element you want to capture, choose Inspect element, and then look around for easily identifiable elements that can be used to build a good regex. In this case the "star-box-giga-star" class is obviously easy to identify. You'll usually have no problem finding such identifiable elements on good websites, because good websites use CSS, and CSS requires IDs or classes to style elements properly.
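To make that concrete, here is a minimal sketch applying the expression (in Java purely for illustration; Delphi's TRegEx accepts the same pattern). The HTML fragment is a made-up stand-in for the markup described above.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RatingExtractor {
    public static void main(String[] args) {
        // Hypothetical fragment of a raw IMDB movie page
        String html = "<div class=\"star-box-giga-star\"> 8.8 </div>";

        // Capture group 1 grabs everything between the DIV's '>' and the next '<'
        Matcher m = Pattern.compile("star-box-giga-star[^>]*>([^<]*)<").matcher(html);
        if (m.find()) {
            System.out.println(m.group(1).trim()); // prints: 8.8
        }
    }
}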
Processing an RSS feed is more comfortable than scraping.
As of the time of posting, the only RSS feeds available on the site are:
Born on this Date
Died on this Date
Daily Poll
Still, you can request that a new one be added by getting in touch with the help desk.
Resources on RSS feed processing:
Relevant post here on SO.
Super Object
Wikipedia.
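If you do end up consuming one of the feeds, the processing itself is plain XML handling. Here is a minimal sketch using Java's standard DOM parser; the feed URL below is a placeholder, not a real IMDB address.

import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RssTitles {
    public static void main(String[] args) throws Exception {
        // Placeholder feed URL; substitute the feed you actually subscribe to
        URL feed = new URL("https://example.com/rss.xml");
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(feed.openStream());

        // Each RSS <item> carries a <title> we can list
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            System.out.println(item.getElementsByTagName("title")
                                   .item(0).getTextContent());
        }
    }
}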
When scraping websites, you cannot rely on the availability of the information. IMDB may detect your scraping and attempt to block you, or they may frequently change the format to make it more difficult.
Therefore, you should always try to use a supported API or RSS feed, or at least get permission from the website to aggregate their data, and ensure that you're abiding by their terms. Often you will have to pay for this type of access. Scraping a website without permission may open you up to liability on a couple of legal fronts (denial of service and intellectual property).
Here's IMDB's statement:
You may not use data mining, robots, screen scraping, or similar
online data gathering and extraction tools on our website.
To answer your question, the better way is to use the method provided by the website. For non-commercial use, and if you abide by their terms, you can download the IMDB database directly and use the data from there instead of scraping their site. Simply update your database frequently, and it's a better solution than scraping the site. You could even wrap your own web API around it. Ratings are available as a standalone table.
Use HTML Tidy to convert any HTML to valid XML and then use an XML parser, maybe using XPath or developing your own code (which is what I do). A sketch of this approach follows.
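For example, a hedged sketch of that pipeline using JTidy (a Java port of HTML Tidy) to repair the markup into a DOM, then XPath to pull out the rating element; the URL and the class name (taken from the earlier answer) are illustrative assumptions.

import java.io.InputStream;
import java.net.URL;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.tidy.Tidy;

public class TidyXPathRating {
    public static void main(String[] args) throws Exception {
        // Fetch the raw, possibly invalid, HTML page
        InputStream html = new URL("http://www.imdb.com/title/tt0111161/").openStream();

        // Let Tidy repair the markup and hand back a well-formed DOM
        Tidy tidy = new Tidy();
        tidy.setQuiet(true);
        tidy.setShowWarnings(false);
        Document doc = tidy.parseDOM(html, null);

        // Query the cleaned document with XPath
        XPath xpath = XPathFactory.newInstance().newXPath();
        String rating = xpath.evaluate(
            "//div[contains(@class,'star-box-giga-star')]", doc).trim();
        System.out.println(rating);
    }
}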
All the answers posted cover your generic question well. I usually follow a strategy similar to the one detailed by Cosmin, using WinInet and regexes for most of my web-extraction needs.
But let me add my two cents on the specific subquestion of extracting the IMDB rating. IMDBAPI.COM provides a query interface that returns JSON, which is very handy for this type of search.
So a very simple command-line program for getting an IMDB rating would be...
program imdbrating;

{$APPTYPE CONSOLE}

uses
  htmlutils; // assumed to provide HttpGet and UrlEncode

// Naive extraction of a string value from the JSON returned by imdbapi.com.
// Looks for '"parm":"value",' in h and returns value, or 'N/A' if not found.
function ExtractJsonParm(parm, h: string): string;
var
  r: integer;
begin
  r := Pos('"' + parm + '":', h);
  if r <> 0 then
    // Skip past '"parm":"' (quote + name + quote + colon + quote = Length(parm) + 4),
    // then copy up to, but not including, the closing quote before the comma.
    Result := Copy(h, r + Length(parm) + 4,
      Pos(',', Copy(h, r + Length(parm) + 4, Length(h))) - 2)
  else
    Result := 'N/A';
end;

var
  h: string;
begin
  // The first command-line parameter is the movie title
  h := HttpGet('http://www.imdbapi.com/?t=' + UrlEncode(ParamStr(1)));
  writeln(ExtractJsonParm('Rating', h));
end.
If the page you are crawling is valid XML, I use SimpleXML to extract info. It works pretty well.
Resource:
Download link.