Python cyclic import unexpected behavior - python-import

I have discovered something unexpected when playing with cyclic imports. I have two files in the same directory:
a.py
import b
print("hello from a")
b.py
import a
print("hello from b")
Running either python3 a.py or python3 b.py does not result in a cyclic-import-related error. I know that the first imported module is imported under the name __main__, but I still do not understand this behavior. For example, running python3 a.py or python3 -m a produces the following output:
hello from a
hello from b
hello from a
Looking at the output of print(sys.modules.keys()), I can see that both modules are somehow already imported by the time I check, even when importing the sys module as the first thing in one of the modules. (As it turns out, I was not using sys.modules properly before answering my own question.)
This does not happen if neither of the cyclically imported modules is the __main__ module. My Python version is Python 3.6.3 on Ubuntu 17.10.
It still happens in that case too, but the error becomes visible only if you actually use something from one of the cyclically imported modules.
See my own answer below for clarifications.
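To see the remark about __main__ in action, consider a third, hypothetical entry file (let's call it main.py) placed next to a.py and b.py, so that neither a nor b is the __main__ module:
main.py
import sys
import a
assert 'a' in sys.modules and 'b' in sys.modules
Running python3 main.py prints 'hello from b' and then 'hello from a', each exactly once, because a and b are each imported, and therefore executed, exactly once.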

The answer to my question
I have discovered the answer. I will try to sketch an explanation:
Executing python3 a.py imports the module in file a.py as __main__:
    import b in module __main__:
        import a in module b -> imports the module in file a.py as a:
            import b in module a -> nothing happens, that module is already imported
            print('hello from a') in a.py (executing module a)
        import a in module b finished
        print('hello from b') in b.py (executing module b)
    import b in module __main__ finished
    print('hello from a') in a.py (executing module __main__)
The point is that there is no cyclic import error per se. A module is imported only once; every later import of the same module is effectively a no-op.
Importing a module can be seen as adding a key to the sys.modules dictionary, named after the imported module, and then setting attributes on the module object associated with that key as its body gets executed. So if the key is already present in the dictionary (on a second import of the same module), nothing happens on the second import; already imported above simply means already present in sys.modules. This reflects the procedural nature of Python (it was originally implemented in C) and the fact that everything in Python is an object.
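The caching in sys.modules is easy to observe with any standard-library module (math here is just an arbitrary example):
import sys
import math
assert 'math' in sys.modules           # the first import registered the module
import math                            # the second import is just a sys.modules lookup
assert math is sys.modules['math']     # both names refer to the same module object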
The lurking problem
To show that the problem associated with cyclic imports is still present, let's add a function to module b and try to use it from module a.
a.py
import b
b.f()
b.py
import a
def f():
    print('hello from b.f()')
Executing python3 a.py now imports the module in file a.py as __main__:
    import b in module __main__:
        import a in module b -> imports the module in file a.py as a:
            import b in module a -> nothing happens, that module is already imported
            b.f() -> AttributeError: module 'b' has no attribute 'f'
Note: the line b.f() can be further simplified to b.f and the error still occurs. This is because b.f() first accesses the attribute f of module object b, which happens to be a function object, and only then tries to call it. I wanted to point out again the object-oriented nature of Python.
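Both steps can be observed separately on any module object; math is again just an arbitrary stand-in:
import math
f = math.sqrt    # step 1: attribute lookup on the module object
print(f(4))      # step 2: the call itself, printing 2.0
try:
    math.no_such_attribute   # the lookup alone is enough to fail
except AttributeError as e:
    print(e)     # module 'math' has no attribute 'no_such_attribute'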
The from ... import ... statement
It is interesting to mention that the from ... import ... form gives a different error, even though the underlying reason is the same:
a.py
from b import f
f()
b.py
import a
def f():
    print('hello from b.f()')
Executing python3 a.py imports the module in file a.py as __main__:
    from b import f in module __main__ actually imports the whole module b (adds it to sys.modules and starts executing its body), but binds only the name f in the current module namespace:
        import a in module b -> imports the module in file a.py as a:
            from b import f in module a -> ImportError: cannot import name 'f' (because the first execution of from b import f has not yet reached the definition of the function object f in module b)
In this last case, the from ... import ... statement itself fails with an error, because the interpreter finds out earlier that the name you are asking for does not exist in that module yet. Compare it to the first AttributeError, where the program saw no problem until it actually tried to access the attribute f (in the expression b.f).
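The same difference can be reproduced without any cyclic import: asking for a name that a module does not define fails at the import statement itself, before anything is called:
import math
try:
    from math import no_such_name
except ImportError as e:
    print(e)   # cannot import name 'no_such_name'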
The double execution problem of the code in the main module
When the module in the file used to start the program (the one imported first, as __main__) is imported again from another module, the code in that file gets executed twice, and any side effects of that execution happen twice too. This is why it is not recommended to import the main module of the program from other modules.
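A minimal, hypothetical single-file demonstration of this double execution (let's call the file c.py):
c.py
import sys
print(f'executing the body of {__name__}')
if __name__ == '__main__':
    import c   # imports this very same file a second time, under the name 'c'
    assert sys.modules['__main__'] is not sys.modules['c']
Running python3 c.py prints the message twice, once with __name__ == '__main__' and once with __name__ == 'c': the same file yields two distinct module objects in sys.modules.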
Using sys.modules to confirm my conclusions above
I will show how checking the contents of sys.modules confirms the conclusions above:
a.py
import sys
assert '__main__' in sys.modules.keys()
print(f'{__name__}:')
print('\ta imported:', 'a' in sys.modules.keys())
print('\tb imported:', 'b' in sys.modules.keys())
import b
b.f()
b.py
import sys
assert '__main__' in sys.modules.keys()
print(f'{__name__}:')
print('\ta imported:', 'a' in sys.modules.keys())
print('\tb imported:', 'b' in sys.modules.keys())
import a
assert False  # Control flow never gets here
def f():
    print('hello from b.f()')
The output of python3 a.py:
__main__:
    a imported: False
    b imported: False
b:
    a imported: False
    b imported: True
a:
    a imported: True
    b imported: True
Traceback (most recent call last):
  File "a.py", line 8, in <module>
    import b
  File "/home/andrei/PycharmProjects/untitled/b.py", line 8, in <module>
    import a
  File "/home/andrei/PycharmProjects/untitled/a.py", line 10, in <module>
    b.f()
AttributeError: module 'b' has no attribute 'f'

Related

What are the requirements for position of custom attributes in an Erlang module?

It seems that Dialyzer and erlc both raise an error when there is a user-defined attribute in a source file before -module. A shell session demonstrating this:
~ cat sample.erl
-my_attr(my_value).
-module(sample).
-compile([export_all, nowarn_export_all]).
main(_) ->
    ok.
~ erlc sample.erl
sample.erl:1:2: no module definition
% 1| -my_attr(my_value).
% | ^
~ dialyzer sample.erl
Checking whether the PLT /Users/mheiber/Library/Caches/erlang/.dialyzer_plt is up-to-date... yes
Proceeding with analysis...
dialyzer: Analysis failed with error:
Could not scan the following file(s):
/Users/mheiber/sample.erl:1:2: no module definition
Last messages in the log cache:
Reading files and computing callgraph...
Is the contract for the order of user-defined attributes documented anywhere? Or is this a bug?
All I could find in the docs is that predefined attributes must come before function declarations:
https://www.erlang.org/doc/reference_manual/modules.html#pre-defined-module-attributes
-module(Module).
Module declaration, defining the name of the module. The name Module, an atom, is to be the same as the file name minus the extension .erl. Otherwise code loading does not work as intended.
This attribute is to be specified first and is the only mandatory attribute.

Why can the following environment module not be loaded?

I get an error loading Environment Modules (4.2.4) that I do not understand. With three modules A, B and C, where B depends on A and C, and C depends only on A:
A
#%Module1.0
B
#%Module1.0
module load A C
C
#%Module1.0
module load A
it is not possible to load the modules in the following manner:
module load A B
The error that is printed to stdout is:
Error: B cannot be loaded due to missing prereq.
HINT: the following modules must be loaded first: C
Running module load A C B works, however.
Is this a bug of the module environment or am I missing something?
You clearly hit a bug: module load A B should work as you expect. I have reported it to the project on GitHub.
As a work-around, you could also pass the --auto command-line switch:
$ module load --auto A B
Loading B
Loading requirement: C
$ module list
Currently Loaded Modulefiles:
1) A 2) C 3) B
Another work-around is to write the B modulefile with 2 separate module load commands:
#%Module1.0
module load A
module load C
UPDATE: Environment Modules 4.2.5 is now released and includes a fix for this issue, so the module load A C command in the B modulefile now correctly loads the A and C modulefiles.

Animation instances aren't cleaned up

Environment
Python 3.6.3
Kivy master
OS: Linux Mint 18.2 (based on Ubuntu 16.04 LTS)
Code
Hi, I'm writing a unit test for kivy.animation. When I run the code below,
import unittest
from time import time, sleep
from kivy.animation import Animation
from kivy.uix.widget import Widget
from kivy.clock import Clock

class AnimationTestCase(unittest.TestCase):
    SLEEP_DURATION = .3
    TIMES = 2

    def sleep(self, t):
        start = time()
        while time() < start + t:
            sleep(.01)
            Clock.tick()

    def test_animation(self):
        for index in range(self.TIMES):
            print('----------------------------------')
            with self.subTest(index=index):
                w = Widget()
                a = Animation(x=100, d=.2)
                print('a:', a)
                a.start(w)
                self.sleep(self.SLEEP_DURATION)
                print('instances_:', Animation._instances)
                self.assertEqual(len(Animation._instances), 0)
the output is:
----------------------------------
a: <kivy.animation.Animation object at 0x7f0afb31c660>
instances_: set()
----------------------------------
a: <kivy.animation.Animation object at 0x7f0afc20b180>
instances_: {<kivy.animation.Animation object at 0x7f0afc20b250>, <kivy.animation.Animation object at 0x7f0afb31c660>}
======================================================================
FAIL: test_animation (kivy.tests.test_animations.AnimationTestCase) (index=1)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/firefox/kivy/kivy/tests/test_animations.py", line 34, in test_animation
self.assertEqual(len(Animation._instances), 0)
AssertionError: 2 != 0
----------------------------------------------------------------------
Ran 1 test in 0.822s
FAILED (failures=1)
Either increasing SLEEP_DURATION (for example, SLEEP_DURATION = 2) or setting TIMES = 1 will fix this error.
Is this correct behavior or a bug?
The cause of this error is kivy.modules.inspector. After I removed this line from config.ini:
[modules]
inspector = # <= remove this line
the program behaves the way I expect. It seems the ScrollView inside the inspector creates an Animation internally, and that makes my test fail.
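As a work-around sketch (my own idea, not an official Kivy recommendation), the test can be made robust against animations created elsewhere by taking a snapshot of Animation._instances before starting the animation under test and asserting only on the difference:
def test_animation(self):
    for index in range(self.TIMES):
        with self.subTest(index=index):
            baseline = set(Animation._instances)  # animations we did not create
            w = Widget()
            a = Animation(x=100, d=.2)
            a.start(w)
            self.sleep(self.SLEEP_DURATION)
            # only the animations started by this test must have been cleaned up
            self.assertEqual(len(Animation._instances - baseline), 0)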

How do I resolve a Pickling Error on class apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum?

A PicklingError is raised when I run my data pipeline remotely: the pipeline was written using the Beam SDK for Python, and I run it on top of Google Cloud Dataflow. It works fine when I run it locally.
The following code generates the PicklingError and ought to reproduce the problem:
import apache_beam as beam
from apache_beam.transforms import pvalue
from apache_beam.io.fileio import _CompressionType
from apache_beam.utils.options import PipelineOptions
from apache_beam.utils.options import GoogleCloudOptions
from apache_beam.utils.options import SetupOptions
from apache_beam.utils.options import StandardOptions
if __name__ == "__main__":
    pipeline_options = PipelineOptions()
    pipeline_options.view_as(StandardOptions).runner = 'BlockingDataflowPipelineRunner'
    pipeline_options.view_as(SetupOptions).save_main_session = True
    google_cloud_options = pipeline_options.view_as(GoogleCloudOptions)
    google_cloud_options.project = "project-name"
    google_cloud_options.job_name = "job-name"
    google_cloud_options.staging_location = 'gs://path/to/bucket/staging'
    google_cloud_options.temp_location = 'gs://path/to/bucket/temp'
    p = beam.Pipeline(options=pipeline_options)
    p.run()
Below is a sample from the beginning and the end of the Traceback:
WARNING: Could not acquire lock C:\Users\ghousains\AppData\Roaming\gcloud\credentials.lock in 0 seconds
WARNING: The credentials file (C:\Users\ghousains\AppData\Roaming\gcloud\credentials) is not writable. Opening in read-only mode. Any refreshed credentials will only be valid for this run.
Traceback (most recent call last):
  File "formatter_debug.py", line 133, in <module>
    p.run()
  File "C:\Miniconda3\envs\beam\lib\site-packages\apache_beam\pipeline.py", line 159, in run
    return self.runner.run(self)
....
....
....
  File "C:\Miniconda3\envs\beam\lib\site-packages\apache_beam\runners\dataflow_runner.py", line 172, in run
    self.dataflow_client.create_job(self.job))
    StockPickler.save_global(pickler, obj)
  File "C:\Miniconda3\envs\beam\lib\pickle.py", line 754, in save_global
    (obj, module, name))
pickle.PicklingError: Can't pickle <class 'apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum'>: it's not found as apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum
I've found that your error gets raised when a Pipeline object is included in the context that gets pickled and sent to the cloud:
pickle.PicklingError: Can't pickle <class 'apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum'>: it's not found as apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum
Naturally, you might ask:
1. What's making the Pipeline object unpickleable when it's sent to the cloud, since normally it's pickleable?
2. If this were really the problem, then wouldn't I get this error all the time? Isn't a Pipeline object normally included in the context sent to the cloud?
3. If the Pipeline object isn't normally included in the context sent to the cloud, then why is it being included in my case?
(1)
When you call p.run() on a Pipeline with cloud=True, one of the first things that happens is that p.runner.job=apiclient.Job(pipeline.options) is set in apache_beam.runners.dataflow_runner.DataflowPipelineRunner.run.
Without this attribute set, the Pipeline is pickleable. But once this is set, the Pipeline is no longer pickleable, since p.runner.job.proto._Message__tags[17] is a TypeValueValuesEnum, which is defined as a nested class in apache_beam.internal.clients.dataflow.dataflow_v1b3_messages. AFAIK nested classes cannot be pickled (even by dill - see How can I pickle a nested class in python?).
(2)-(3)
Counterintuitively, a Pipeline object is normally not included in the context sent to the cloud. When you call p.run() on a Pipeline with cloud=True, only the following objects are pickled (and note that the pickling happens after p.runner.job gets set):
1. If save_main_session=True, then all global objects in the module designated __main__ are pickled. (__main__ is the script that you ran from the command line.)
2. Each transform defined in the pipeline is individually pickled.
In your case, you encountered #1, which is why your solution worked. I actually encountered #2 where I defined a beam.Map lambda function as a method of a composite PTransform. (When composite transforms are applied, the pipeline gets added as an attribute of the transform...) My solution was to define those lambda functions in the module instead.
A longer-term solution would be for us to fix this in the Apache Beam project. TBD!
This should be fixed in the google-dataflow 0.4.4 sdk release with https://github.com/apache/incubator-beam/pull/1485
I resolved this problem by encapsulating the body of the main block within a run() method and invoking run().
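For concreteness, a minimal sketch of that last workaround, using the same (old) apache_beam.utils.options imports as the question: building the pipeline inside run() keeps the Pipeline object out of the module-level __main__ namespace, so save_main_session no longer tries to pickle it:
import apache_beam as beam
from apache_beam.utils.options import PipelineOptions
from apache_beam.utils.options import SetupOptions

def run():
    pipeline_options = PipelineOptions()
    pipeline_options.view_as(SetupOptions).save_main_session = True
    p = beam.Pipeline(options=pipeline_options)  # the Pipeline lives only in this local scope
    p.run()

if __name__ == "__main__":
    run()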

javac will not compile enum (Windows Sun 1.6 --> OpenJDK 1.6)

package com.scheduler.process;

public class Process {
    public enum state {
        NOT_SUBMITTED, SUBMITTED, BLOCKED, READY, RUNNING, COMPLETED
    }

    private state currentState;

    public state getCurrentState() {
        return currentState;
    }

    public void setCurrentState(state currentState) {
        this.currentState = currentState;
    }
}
package com.scheduler.machine;

import com.scheduler.process.Process;
import com.scheduler.process.Process.state;

public class Machine {
    com.scheduler.process.Process p = new com.scheduler.process.Process();
    state s = state.READY; //fails if I don't also explicitly import Process.state
    p.setCurrentState(s); //says I need a declarator id after 's'... this is wrong.
    p.setCurrentState(state.READY);
}
I modified the example to direct attention to the issue. I cannot change state in this code. Eclipse suggests importing Process.state like I had in my previous example, but this doesn't work either. It allows state s = state.READY, but the call to p.setCurrentState(s); fails, as does p.setCurrentState(state.READY);.
Problem continued.... Following Oleg's suggestions I tried more permutations:
package com.scheduler.machine;

import com.scheduler.process.Process;
import com.scheduler.process.Process.*;

public class Machine {
    com.scheduler.process.Process p = new com.scheduler.process.Process();
    public state s = Process.state.READY;
    p.setCurrentState(s);
    p.setCurrentState(state.READY);
}
Okay. It's clear now that I'm a candidate for lobotomy.
package com.scheduler.machine;

import com.scheduler.process.Process;
import com.scheduler.process.Process.state;

public class Machine {
    public void doStuff(){
        com.scheduler.process.Process p = new com.scheduler.process.Process();
        state s = state.READY; //fails if I don't also explicitly import Process.state
        p.setCurrentState(s); //says I need a declarator id after 's'... this is wrong.
        p.setCurrentState(state.READY);
    }
}
I needed to have a method in the class, but we're still missing something (probably obvious) here. When I go to the command line and run javac on the Machine class AFTER compiling Process, I still get the following error:
mseil#context:/media/MULTIMEDIA/Scratch/Scratch/src/com/scheduler/machine$ javac Machine.java
Machine.java:3: package com.scheduler.process does not exist
import com.scheduler.process.Process;
^
So I guess the question now becomes: what idiot thing am I missing that prevents me from compiling this by hand, which Eclipse is doing for me behind the scenes?
======
Problem solved here:
Java generics code compiles in eclipse but not in command line
This has just worked for me:
Download latest Eclipse
Create new project
Create two packages com.scheduler.process and com.scheduler.machine
Create class Process in package com.scheduler.process and class Machine in com.scheduler.machine and copy their contents from your post modifying them to conform to Java language syntax, like this:
Everything compiles right away.
------ to answer the previous version of the question ------
To answer the question as it is right now: you need to either
import com.scheduler.process.Process.state or import com.scheduler.process.Process.* and refer to state as just state,
or
import com.scheduler.process.* or import com.scheduler.process.Process and refer to state as Process.state.
------ to answer the original version of the question ------
You can't import classes that are not inside some package. You just can't: it is a compile-time error to import a type from the unnamed package.
You don't need to import anything if your classes are in the same package, or if all of your classes are packageless.
If the Process class were inside some package, it would be possible to import just its state inner class: import a.b.c.Process.state would work just fine.
All your Windows/Linux migration issues don't have anything to do with Java and the errors that you see. import Process.state; will produce an error on any OS, because you can't import classes that don't belong to any package.
Eclipse doesn't use the Sun JDK by default. I would assume that you are using Eclipse's built-in compiler, as Sun's JDK and the OpenJDK are almost identical.
Java code compiles and runs exactly the same on Windows and Linux most of the time (unless you use one of the few platform-specific operations).
I suspect you are not building the code the same way, and when you compile Machine, the Process class has not been compiled.
I suggest you use a standard build system like Maven or Ant, and it will build the same everywhere. Failing that, run Eclipse on Linux, or just use the same .class files you use on Windows, as they don't need to be re-compiled in any case.
BTW: You don't need to import Process.state, as it is not used and it's in the same package (so you wouldn't need to even if you did use it).
