IMDBpy - Director name is coming with the characters all divided - movie

I'm trying to get some details about movies from IMDB.
For that I'm using IMDBpy with following code:
import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0;
for topmovie in top250:
# First, retrieve the movie object using its ID
movie = ia.get_movie(topmovie.movieID)
cast = movie.get('cast')
topActors = 3
i = i+1;
actor_names = [actor['name'] for actor in cast[:topActors]]
#director_name = [director['director'] for director in cast[:topActors]]
if i <= 10:
print(movie, ';', ' | '.join(movie['genres']),
';', ' | '.join(actor_names),
';', ' | '.join(str(movie['director']))
);
else:
break;
However when I run my code I am getting my results with this format:
The Shawshank Redemption ; Drama ; Tim Robbins | Morgan Freeman | Bob Gunton ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 1 | 0 | 4 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | D | a | r | a | b | o | n | t | , | | F | r | a | n | k | _ | > | ]
The Godfather ; Crime | Drama ; Marlon Brando | Al Pacino | James Caan ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | C | o | p | p | o | l | a | , | | F | r | a | n | c | i | s | | F | o | r | d | _ | > | ]
The Godfather: Part II ; Crime | Drama ; Al Pacino | Robert Duvall | Diane Keaton ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | C | o | p | p | o | l | a | , | | F | r | a | n | c | i | s | | F | o | r | d | _ | > | ]
The Dark Knight ; Action | Crime | Drama | Thriller ; Christian Bale | Heath Ledger | Aaron Eckhart ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 6 | 3 | 4 | 2 | 4 | 0 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | N | o | l | a | n | , | | C | h | r | i | s | t | o | p | h | e | r | _ | > | ]
12 Angry Men ; Crime | Drama ; Martin Balsam | John Fiedler | Lee J. Cobb ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 4 | 8 | 6 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | L | u | m | e | t | , | | S | i | d | n | e | y | _ | > | ]
Schindler's List ; Biography | Drama | History ; Liam Neeson | Ben Kingsley | Ralph Fiennes ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 2 | 2 | 9 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | S | p | i | e | l | b | e | r | g | , | | S | t | e | v | e | n | _ | > | ]
The Lord of the Rings: The Return of the King ; Action | Adventure | Drama | Fantasy ; Noel Appleby | Ali Astin | Sean Astin ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 3 | 9 | 2 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | J | a | c | k | s | o | n | , | | P | e | t | e | r | _ | > | ]
Pulp Fiction ; Crime | Drama ; Tim Roth | Amanda Plummer | Laura Lovelace ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 2 | 3 | 3 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | T | a | r | a | n | t | i | n | o | , | | Q | u | e | n | t | i | n | _ | > | ]
The Good, the Bad and the Ugly ; Western ; Eli Wallach | Clint Eastwood | Lee Van Cleef ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 4 | 6 | 6 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | L | e | o | n | e | , | | S | e | r | g | i | o | _ | > | ]
Fight Club ; Drama ; Edward Norton | Brad Pitt | Meat Loaf ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 3 | 9 | 9 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | F | i | n | c | h | e | r | , | | D | a | v | i | d | _ | > | ]
As you can see the columns for Director is returning with multipli characters...
How can I solve this?
Already solve this issue:
import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0;
for topmovie in top250:
# First, retrieve the movie object using its ID
movie = ia.get_movie(topmovie.movieID)
cast = movie.get('cast')
directors = movie.get('director')
topActors = 3
i = i+1;
actor_names = [actor['name'] for actor in cast[:topActors]]
director_names = [director['name'] for director in directors[:1]]
if i <= 10:
print(movie, ' ; ', ' | '.join(movie['genres']),
' ; ', ' | '.join(actor_names),
' ; ', ' | '.join(director_names)
);
else:
break;
Thanks!

movie['director'] is a list of Movie objects; casting it to str you will get something like "[<Object1>, <Object2>]" and then you use this string as an iterable for the join method.
You should get the directors' names exactly like you do with the cast names.
For example:
print(movie, ';', ' | '.join(movie['genres']),
';', ' | '.join(actor_names),
';', ' | '.join([d['name'] for d in movie['director']])
);

Related

Finding root of a tree in a directed graph

I have a tree structure like node(1)->node(2)->node(3). I have name as an property used to retrieve a node.
Given a node say node(3), i wanna retrieve node(1).
Query tried :
MATCH (p:Node)-[:HAS*]->(c:Node) WHERE c.name = "node 3" RETURN p LIMIT 5
But, not able to get node 1.
Your query will not only return "node 1", but it should at least include one path containing it. It's possible to filter the paths to only get the one traversing all the way to the root, however:
MATCH (c:Node {name: "node 3"})<-[:HAS*0..]-(p:Node)
// The root does not have any incoming relationship
WHERE NOT (p)<-[:HAS]-()
RETURN p
Note the use of the 0 length, which matches all cases, including the one where the start node is the root.
Fun fact: even if you have an index on Node:name, it won't be used (unless you're using Neo4j 3.1, where it seems to be fixed since 3.1 Beta2 at least) and you have to explicitly specify it.
MATCH (c:Node {name: "node 3"})<-[:HAS*0..]-(p:Node)
USING INDEX c:Node(name)
WHERE NOT (p)<-[:HAS]-()
RETURN p
Using PROFILE on the first query (with a numerical id property instead of name):
+-----------------------+----------------+------+---------+-------------------------+----------------------+
| Operator | Estimated Rows | Rows | DB Hits | Variables | Other |
+-----------------------+----------------+------+---------+-------------------------+----------------------+
| +ProduceResults | 0 | 1 | 0 | p | p |
| | +----------------+------+---------+-------------------------+----------------------+
| +AntiSemiApply | 0 | 1 | 0 | anon[23], c -- p | |
| |\ +----------------+------+---------+-------------------------+----------------------+
| | +Expand(All) | 1 | 0 | 3 | anon[58], anon[67] -- p | (p)<-[:HAS]-() |
| | | +----------------+------+---------+-------------------------+----------------------+
| | +Argument | 1 | 3 | 0 | p | |
| | +----------------+------+---------+-------------------------+----------------------+
| +Filter | 1 | 3 | 3 | anon[23], c, p | p:Node |
| | +----------------+------+---------+-------------------------+----------------------+
| +VarLengthExpand(All) | 1 | 3 | 5 | anon[23], p -- c | (c)<-[:HAS*]-(p) |
| | +----------------+------+---------+-------------------------+----------------------+
| +Filter | 1 | 1 | 3 | c | c.id == { AUTOINT0} |
| | +----------------+------+---------+-------------------------+----------------------+
| +NodeByLabelScan | 3 | 3 | 4 | c | :Node |
+-----------------------+----------------+------+---------+-------------------------+----------------------+
Total database accesses: 18
and on the second one:
+-----------------------+----------------+------+---------+-------------------------+------------------+
| Operator | Estimated Rows | Rows | DB Hits | Variables | Other |
+-----------------------+----------------+------+---------+-------------------------+------------------+
| +ProduceResults | 0 | 1 | 0 | p | p |
| | +----------------+------+---------+-------------------------+------------------+
| +AntiSemiApply | 0 | 1 | 0 | anon[23], c -- p | |
| |\ +----------------+------+---------+-------------------------+------------------+
| | +Expand(All) | 1 | 0 | 3 | anon[81], anon[90] -- p | (p)<-[:HAS]-() |
| | | +----------------+------+---------+-------------------------+------------------+
| | +Argument | 1 | 3 | 0 | p | |
| | +----------------+------+---------+-------------------------+------------------+
| +Filter | 1 | 3 | 3 | anon[23], c, p | p:Node |
| | +----------------+------+---------+-------------------------+------------------+
| +VarLengthExpand(All) | 1 | 3 | 5 | anon[23], p -- c | (c)<-[:HAS*]-(p) |
| | +----------------+------+---------+-------------------------+------------------+
| +NodeUniqueIndexSeek | 1 | 1 | 2 | c | :Node(id) |
+-----------------------+----------------+------+---------+-------------------------+------------------+
Total database accesses: 13

"Extract" intervals from series in Google Sheets

If I in Google Sheets have a series defined as
[29060, 29062, 29331, 29332, 29333, 29334, 29335, 29336, 29337, 29338, 29339, 29340, 29341,
29342, 29372, 29373].
How do I make them line up in intervals like this?
|To |From |
|29060 |29062 |
|29331 |29342 |
|29372 |29373 |
I can't find any good answers for this anywhere. Please, help!
Data/Formulas
A1:
29060, 29062, 29331, 29332, 29333, 29334, 29335, 29336, 29337, 29338, 29339, 29340, 29341,
29342, 29372, 29373
B1: =transpose(split(A1,",")). Converts the input text is an a vertical array.
C1: =FILTER(B1:B16,mod(ROW(B1:B16),2)<>0). Returns values in odd rows.
D1: =FILTER(B1:B16,mod(ROW(B1:B16),2)=0). Returns values in even rows.
E1: =ArrayFormula(FILTER(C1:C8,{TRUE();C2:C8<>D1:D7+1})). Returns values that start a range.
F1: =ArrayFormula(FILTER(D1:D8,{D1:D7+2<>D2:D8;TRUE()})). Returns values that end a range.
Result
Note: A1 values are not shown for readability.
+----+---+-------+-------+-------+-------+-------+
| | A | B | C | D | E | F |
+----+---+-------+-------+-------+-------+-------+
| 1 | | 29060 | 29060 | 29062 | 29060 | 29062 |
| 2 | | 29062 | 29331 | 29332 | 29331 | 29342 |
| 3 | | 29331 | 29333 | 29334 | 29372 | 29373 |
| 4 | | 29332 | 29335 | 29336 | | |
| 5 | | 29333 | 29337 | 29338 | | |
| 6 | | 29334 | 29339 | 29340 | | |
| 7 | | 29335 | 29341 | 29342 | | |
| 8 | | 29336 | 29372 | 29373 | | |
| 9 | | 29337 | | | | |
| 10 | | 29338 | | | | |
| 11 | | 29339 | | | | |
| 12 | | 29340 | | | | |
| 13 | | 29341 | | | | |
| 14 | | 29342 | | | | |
| 15 | | 29372 | | | | |
| 16 | | 29373 | | | | |
+----+---+-------+-------+-------+-------+-------+

list of dates associated with name

What would be a good approach to report of all the dates a name occurs in a list? Can this be done with a single array formula?
Example (column A and B are input, columns C through G are to be auto-generated):
| A | B | C | D | E | F | G |
+---------+--------+--------+---------+---------+---------+---------+
| Episode | Stars | Name | Date | Date | Date | Date |
+---------+--------+--------+---------+---------+---------+---------+
| 7/24/15 | Bart | Bart | 7/24/15 | 7/18/15 | 8/15/15 | 3/29/15 |
| 8/09/15 | Maggie | Homer | 1/10/15 | | | |
| 7/24/15 | Marge | Lisa | 7/20/15 | 6/04/15 | | |
| 7/18/15 | Bart | Maggie | 8/09/15 | | | |
| 1/10/15 | Homer | Marge | 7/24/15 | | | |
| 8/15/15 | Bart | | | | | |
| 7/20/15 | Lisa | | | | | |
| 6/04/15 | Lisa | | | | | |
| 3/29/15 | Bart | | | | | |
|^^^^^^|
| |
| |
| (o)(o)
# _)
| ,___| - Thanks Dude!
| /
/___\
/ \
I don't think this is easily possible in a single arrayformula. However, as an alternative you could try this formula in cell C2:
=SORT(UNIQUE(QUERY(FILTER(B$2:B,LEN(B$2:B)))),1,1)
Then try this formula in cell D2 and drag down:
=TRANSPOSE(QUERY(A$2:B,"select A where B='"&C2&"'"))
See this example sheet to see it working: https://goo.gl/0u41u5

Truth Table For Switching Functions

Can someone explain how these concept works?
I have 1 question. But I don't know have any ideas on constructing the truth table.
f(A,B,C) = AB + A’C
The answer given was ABC + ABC' + A'BC + A'B'C
And i have no idea how it get there. :-(
1. Create a column for each of the inputs, each intermediate functions, and the final function:
A B C | AB | A' | A'C | AB + A'C
--------------------------------
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
2. Enumerate all input possibilities, and start filling in the intermediate function values and then the final function value:
A B C | AB | A' | A'C | AB + A'C
--------------------------------
0 0 0 | 0 | 1 | 0 | 0
0 0 1 | | | |
0 1 0 | | | |
0 1 1 | | | |
1 0 0 | | | |
1 0 1 | | | |
1 1 0 | | | |
1 1 1 | | | |
3. Now, you finish the truth table.
Update per OP's edit of question:
The "answer given" can be reduced as follows using Boolean Algebra:
ABC + ABC' + A'BC + A'B'C
AB(C + C') + A'C(B + B')
AB + A'C
...which is the same as the given f(A,B,C). Not sure why ABC + ABC' + A'BC + A'B'C would be considered to be the "answer," but this does show equivalence between the two formulae.

URL BNF search part does not make sense

While implementing a Java regular expression for URL based on the URL BNF published by W3C, I've failed to understand the search part. As quoted:
httpaddress h t t p : / / hostport [ / path ] [ ?
search ]
search xalphas [ + search ]
xalphas xalpha [ xalphas ]
xalpha alpha | digit | safe | extra | escape
alpha a | b | c | d | e | f | g | h | i | j | k |
l | m | n | o | p | q | r | s | t | u | v |
w | x | y | z | A | B | C | D | E | F | G |
H | I | J | K | L | M | N | O | P | Q | R |
digit 0 |1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
safe $ | - | _ | # | . | & | + | -
extra ! | * | " | ' | ( | ) | ,
Search claims it is xalphas seperated by a plus sign.
xalphas can contain plus signs by it self, as claimed by safe.
Thus according to my understanding , it should be:
search xalphas
Where am I wrong here?
That's pretty clearly a mistake (+ is a reserved delimiter for URIs), but the BNF you're linking to seems to be out of date. Probably best to use the one included at the end of the latest RFC 3986.

Resources