List of all movie titles, actors, directors, writers on IMDb

I am working on a web app that lets users record their favourite movies, directors, movie writers, and actors. I want to provide a dropdown list or autocomplete for each of these so that users can simply pick their choices.
For this:
I need a list of all movie titles, actors, directors, and writers present on IMDb.
I checked IMDbPY, and it does not seem to provide methods to get this data.
Would using imdbpy2sql.py to create a database and then querying it with SQL provide the required data? Is there any other way to do this?
Thanks!

Using imdbpy2sql.py to create a database and then querying it with SQL will give you the required data.
You can also try Java Movie Database or imdbdumpimport to load the text files into SQL.
The last option is to parse the plain text files provided by IMDb yourself.

I think your best option is to parse the plain text files distributed here: IMDb interfaces.
You probably just need the 'movies', 'actors', 'actresses' and 'directors' files; they are quite easy to parse.
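A minimal sketch of that parsing in Python. It assumes the layout of the people files ('actors', 'actresses', 'directors') is: a person's entry starts with the name, followed by one or more tabs and the first credit, and further credits for the same person sit on lines that begin with a tab. The function name and the sample data are made up for illustration; check the layout against the real files, which also carry header and footer text that would need skipping.

```python
import io


def iter_person_names(lines):
    """Yield each distinct person name from an IMDb-style people list.

    Assumed layout (verify against the real files):
      Name<TAB(s)>First credit
      <TAB(s)>Another credit for the same person
    """
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        # Blank lines separate entries; lines starting with a tab are extra credits.
        if not line or line.startswith("\t"):
            continue
        name = line.split("\t", 1)[0].strip()
        if name and name not in seen:
            seen.add(name)
            yield name


# Tiny made-up sample in the assumed layout.
sample = io.StringIO(
    "Doe, Jane\tSome Movie (2001)\n"
    "\t\t\tAnother Movie (2003)\n"
    "\n"
    "Roe, Richard\tSome Movie (2001)\n"
)
names = list(iter_person_names(sample))
```

The resulting list of names can then be loaded into whatever store backs the dropdown or autocomplete.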

Related

Umbraco Extending Tags

I am new to Umbraco CMS, so this question may sound silly. The requirement is to show the tags (based on a specific tag group) to editors in a dropdown, so that they can choose from these tags and save them to the article. What is the best way of achieving this?
The tag property allows free text entry. If you only want editors to select from a predefined list of options, you will probably need to do this a different way. You may want to try the dropdown property with the multiple choice option turned on instead. Then you can set up the prevalues (options) ahead of time, and your editors can only select from that list.
You could also create a container in the tree for your categories and then allow your admins to select the appropriate items from that list using a multinode tree picker. Here is an example of how you might structure this for a blog:
Blog
    Authors
        Author
    Categories
        Category
    Posts
        Post
Then when an admin adds a new post you would allow them to select the appropriate authors/categories for the post using a multinode tree picker.

Apache Solr: Merging documents from two sources before indexing

I need to index data from a custom application in Solr. The custom app stores metadata in an Oracle RDBMS and documents (PDF, MS Word, etc.) in a file store. The two are linked in the sense that the metadata in the database refers to a physical document (PDF) in the file store.
I am able to index the metadata from the RDBMS without issues. Now I would like to update the indexed documents with an additional field in which I can store the parsed content from the PDFs.
I have considered and tried the following:
1. Using Update RequestHandler to try and update the indexed document with . This didn't work and the original document indexed from the RDBMS was overwritten.
2. Using SolrJ to do atomic updates, but I am not sure whether this is a good approach for something like this.
Has anyone come across this issue before and what would be the recommended approach?
You can update the document, but it requires that you know the id of the existing document. For example:
{
"id": "5",
"parsed_content":{"set": "long text field with parsed content"}
}
Instead of just saying "parsed_content":"something" you have to wrap the value in "parsed_content":{"set":"something"} to trigger adding it to the existing document.
See https://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22field.22 for documentation on how to work with multivalued fields etc.
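As a rough sketch, the same update can be built and posted from Python using only the standard library. The helper names, the core name "mycore" and the local URL are assumptions for illustration; the request body is a JSON list holding the document shown above.

```python
import json
import urllib.request


def build_atomic_update(doc_id, field, value):
    """Return the JSON body for an atomic 'set' on one field of one document."""
    return json.dumps([{"id": doc_id, field: {"set": value}}])


def send_update(solr_update_url, body):
    """POST the update body to Solr's update handler.

    solr_update_url is an assumption, for example
    http://localhost:8983/solr/mycore/update?commit=true for a local core.
    """
    request = urllib.request.Request(
        solr_update_url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


body = build_atomic_update("5", "parsed_content", "long text field with parsed content")
# Needs a running Solr instance, so it is left commented out here:
# send_update("http://localhost:8983/solr/mycore/update?commit=true", body)
```

The id has to match the one used when the metadata was indexed from the RDBMS; otherwise a new document is created instead of the existing one being updated.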

Ruby on Rails: How to have multiple controllers for one table AND multiple models

I'm new to Ruby and to Rails. I have played a bit with Sinatra but I think that Rails is a more complete framework for my project. However, I am running into trouble with this.
I am working with a fairly substantial, heavily used, existing MySQL database, and I am trying to build an API for it that will report on certain features. The features that are needed are, for the most part, counts of records by certain groupings, then drilling down into details.
For example we have a table - tableA, that contains lots of information relating to documentation. One piece of information we want to report on from that is the number of items in a given language. The language code is stored against each item and based on a get request I would like to return JSON.
Request: /languages/:code/count/:tablename
There are two variables in that most specific URL - the code we are counting and the table we are counting from.
I understand that in routes.rb I can set up a mapping:
get '/languages/:code/count/:table', :controller=>'languages', :action=>'count'
I have a controller, languages_controller.rb, with a count method in it. This then maps to a corresponding view file, count.html.erb.
In all the tutorials I have read and examples I have followed the main point seems that 'languages' would be a table in the database and would therefore be available under the 'magic' Rails approach.
My issue is that it is not a table; rather, the results of the call should be a limited subset of the fields in tableA, such as languagecode and count(id).
The description of the language needs to be looked up 'manually' as it is stored as an internal code that is not in a database anywhere (historic decision/madness).
The questions:
How do I have a model that is only a subset of fields, plus some that are manually populated: languagecode, isocode, description, count?
Am I right in thinking that once I have the model defined as such, I could use ActiveRecord to get data from the database and then add the extra information in the controller?
Can I change the table in the model based on the parameter sent in the URL?
Essentially, I am at a loss at the moment on what to do with this. I have the routes defined, the view templates in place and the controller there and ready to go. The database component - getting some data from a pre-existing table seems mysterious to me.
Any help is greatly appreciated. It seems that the framework is currently getting in my way, and I know that I can't be the only one trying this sort of thing, so if you have any advice please share.
There's really no need for a model here, at all. This isn't what ORMs are for. What you should be doing is just running raw SQL against the database, and iterating over the results. Consider doing something like this: https://stackoverflow.com/a/14840547/229044
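To illustrate the shape of that approach, here is a small sketch written in Python with an in-memory SQLite database rather than Rails and MySQL, so it is not the linked answer's code. The table whitelist, the description lookup and the sample rows are all hypothetical; tableA, languagecode and id come from the question.

```python
import sqlite3

# Hypothetical whitelist: the table name comes from the URL, so it must be
# checked before it is placed into the SQL string.
ALLOWED_TABLES = {"tableA"}
# Hypothetical manual lookup for codes that are not described in the database.
DESCRIPTIONS = {"en": "English", "de": "German"}


def count_by_language(conn, table, code):
    """Run a raw grouped count and add the manually looked-up description."""
    if table not in ALLOWED_TABLES:
        raise ValueError("unknown table: " + table)
    sql = (
        "SELECT languagecode, COUNT(id) FROM " + table +
        " WHERE languagecode = ? GROUP BY languagecode"
    )
    row = conn.execute(sql, (code,)).fetchone()
    count = row[1] if row else 0
    return {"languagecode": code, "description": DESCRIPTIONS.get(code), "count": count}


# Made-up data standing in for the existing database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER PRIMARY KEY, languagecode TEXT)")
conn.executemany("INSERT INTO tableA (languagecode) VALUES (?)", [("en",), ("en",), ("de",)])
result = count_by_language(conn, "tableA", "en")
```

The returned dictionary is already in the shape that would be serialized to JSON for the /languages/:code/count/:tablename request.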

Adding custom attributes to Task?

How can I add custom attributes/data to a Task via the API? For example, we want to add fields like customer contact number or deal amount.
We don't currently support adding arbitrary metadata to tasks, though it's something we're thinking about. In the meantime, what many customers do is simply put the data in the note field in an easily parseable form, which works well and also lets humans reading the task see, for example, the ticket number.
It's not a terribly elegant solution, but it works.
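One possible "easily parseable form", sketched in Python. The marker line, the key names and both helper functions are made up for illustration and are not part of any Asana API; the resulting string is simply what would be stored in the task's note field.

```python
MARKER = "--- data ---"  # hypothetical separator between free text and data


def encode_note(human_text, fields):
    """Append 'key: value' lines after a marker so people and code can both read the note."""
    lines = [human_text, "", MARKER]
    lines += ["{}: {}".format(key, value) for key, value in fields.items()]
    return "\n".join(lines)


def parse_note(note):
    """Return the key/value pairs found after the marker (empty dict if there is none)."""
    fields = {}
    _, marker, data = note.partition(MARKER)
    if not marker:
        return fields
    for line in data.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


note = encode_note(
    "Call back about renewal.",
    {"contact_number": "555-0100", "deal_amount": "1200"},
)
parsed = parse_note(note)
```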
https://asana.com/developers/documentation/getting-started/custom-external_data
Custom external data allows a client application to add app-specific metadata to Tasks in the API. The custom data includes a string id that can be used to retrieve objects and a data blob that can store character strings.
See the external field at https://asana.com/developers/api-reference/tasks

How to design a system like google base with dynamic attributes? in Ruby on Rails

I want to create an application that allows users to add products for sale.
I want a user to be able to add whatever type of product he/she likes, and also to create stored and searchable attributes for their products, a lot like Google Base does.
Does anyone know the best way to do this, i.e. how to model it?
I don't really want a table for each category as this would be possibly 1000s of tables.
What is the best way to do this? Has anyone had good or bad experiences with this?
Are there any plugins that do this?
Any help would be great
thanks
rick
It sounds like what you want is a tag system.
If you want something more flexible you might want to look at using a document store instead of a database, for example CouchDB.
If you do want to keep this in a relational database, I'd suggest creating a model called "Descriptor" that would contain the ID of the item being added, the name of the attribute ("Color") and the value ("Red").
To help keep things consistent, you could also define pre-set groups of descriptors (for cars: make, model, color) as well as provide autocompletes for the value entry text fields.
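A minimal sketch of that "Descriptor" idea, written in Python with an in-memory SQLite database rather than as a Rails model. The table names, column names, helper functions and sample products are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE descriptors (
    item_id INTEGER NOT NULL REFERENCES items(id),
    name    TEXT NOT NULL,   -- attribute name, e.g. "Color"
    value   TEXT NOT NULL    -- attribute value, e.g. "Red"
);
CREATE INDEX idx_descriptors_name_value ON descriptors (name, value);
""")


def add_item(title, attributes):
    """Insert an item plus one descriptor row per user-defined attribute."""
    cur = conn.execute("INSERT INTO items (title) VALUES (?)", (title,))
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO descriptors (item_id, name, value) VALUES (?, ?, ?)",
        [(item_id, name, value) for name, value in attributes.items()],
    )
    return item_id


def find_items(name, value):
    """Return titles of items that carry the given attribute name and value."""
    rows = conn.execute(
        "SELECT i.title FROM items i JOIN descriptors d ON d.item_id = i.id "
        "WHERE d.name = ? AND d.value = ? ORDER BY i.id",
        (name, value),
    )
    return [title for (title,) in rows]


add_item("Used hatchback", {"Make": "Honda", "Color": "Red"})
add_item("Mountain bike", {"Color": "Red", "Frame": "Aluminium"})
add_item("Sofa", {"Color": "Blue"})
red_items = find_items("Color", "Red")
```

Every product type shares the same two tables, so no per-category tables are needed; the trade-off is that all values are stored as text and searches go through a join.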
