Build dkms module for specific kernel versions only - dkms

How do you define dkms.conf such that a DKMS module will only be built for specific kernel version or range of versions?
Background:
A buggy driver is present in the current kernels we are using (e.g. 4.4) but fixed in 4.10. I produced a dkms package with the 4.10 source code in it, which all works fine on kernel 4.4. But as we update to later OS releases (or HWE releases) with later kernels, e.g. 4.15, I want to avoid building the (now possibly older) 4.10 driver when the kernel version is 4.10 or higher.
Here's my base dkms.conf file
PACKAGE_NAME="cp210x"
PACKAGE_VERSION="#MODULE_VERSION#"
BUILT_MODULE_NAME[0]="$PACKAGE_NAME"
DEST_MODULE_LOCATION[0]="/updates/dkms"
AUTOINSTALL="YES"
REMAKE_INITRD="YES"
I tried BUILD_EXCLUSIVE_KERNEL matching to 4.N kernel versions
BUILD_EXCLUSIVE_KERNEL="^4\.[0-9]\.*"
Expected behaviour - will not install the kernel module for kernel 4.15.0-43-generic. Actual behaviour - installs as normal
My reading suggests an alternate might work (for this test I'm just matching my current kernel version) to change the compile rule to be a no-op.
MAKE_MATCH[1]="^4\.15\.*"
MAKE[1]=":"
I'm on Debian/Ubuntu platforms if that makes any difference.

Ok, the problem was between keyboard and chair: my BUILD_EXCLUSIVE_KERNEL regexp had an error in it. The .* suffix got merged with the \. number separator, so the trailing \.* matched zero or more literal dots and the pattern still matched 4.15. I'll document a working example here since Google didn't find any good examples before I posted:
Firstly, I wasn't sure which regexp dialect I needed to use (grep, pcre, etc.), especially since there is shell escaping mixed in, so I thought perhaps the mismatch was there.
Turns out dkms is a bash script and so uses [[ $ver =~ $match_regexp ]]. So to test the matching this worked:
re="^(3\.[0-9]+\.|4\.[0-9]\.)" ; [[ "4.15.0-43-generic" =~ $re ]] && echo true
# but this didn't: quoting the pattern makes bash compare it as a literal string
[[ "4.15.0-43-generic" =~ "^(3\.[0-9]+\.|4\.[0-9]\.)" ]] && echo true
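To check a candidate pattern before putting it into dkms.conf, a small bash sketch along these lines can help. It reuses the regex from the test above; the function name check_kernel and the list of kernel release strings are made up for illustration, and it only mimics the [[ =~ ]] comparison rather than calling dkms itself.

```shell
#!/usr/bin/env bash
# Pattern under test: any 3.x kernel, or 4.0 through 4.9.
re='^(3\.[0-9]+\.|4\.[0-9]\.)'

# check_kernel RELEASE: print "build" if RELEASE matches $re, else "skip".
# $re is deliberately left unquoted so bash treats it as a regex.
check_kernel() {
    if [[ "$1" =~ $re ]]; then
        echo build
    else
        echo skip
    fi
}

# Illustrative release strings (not taken from a real system).
for ver in 3.13.0-170-generic 4.4.0-142-generic 4.9.0-8-amd64 \
           4.10.0-42-generic 4.15.0-43-generic; do
    printf '%s -> %s\n' "$ver" "$(check_kernel "$ver")"
done
```

With this pattern the first three releases report "build" and the 4.10 and 4.15 releases report "skip".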
Here's the config file I ended up using:
PACKAGE_NAME="cp210x"
PACKAGE_VERSION="#MODULE_VERSION#"
BUILT_MODULE_NAME[0]="$PACKAGE_NAME"
DEST_MODULE_LOCATION[0]="/updates/dkms"
AUTOINSTALL="YES"
REMAKE_INITRD="YES"
# Since this code comes from 4.10 only update kernels 4.9 and earlier
BUILD_EXCLUSIVE_KERNEL="^(3\.[0-9]+\.|4\.[0-9]\.)"
Which looks like this when installed via dpkg.
First Installation: checking all kernels...
Building only for 4.15.0-43-generic
Building initial module for 4.15.0-43-generic
Error! The dkms.conf for this module includes a BUILD_EXCLUSIVE directive which
does not match this kernel/arch. This indicates that it should not be built.
Skipped.
But installs correctly against lower kernel versions.
Additionally, the wording of the BUILD_EXCLUSIVE_KERNEL documentation suggests it is an error if the kernel mismatches, which might not be desirable. However, the output above shows that the "Error" does not cause the package installation to fail; the build is just marked as skipped.


How to make Bazel correctly cache dependencies built by itself?

I have a (relatively small) Bazel rule for some configure/make based project, say xmlsec1 - you can take any other, the important thing seems to be the external tooling behind foreign_cc:
xmlsec1.BUILD:
load("@rules_foreign_cc//foreign_cc:defs.bzl", "configure_make")
filegroup(name="all_srcs", srcs=glob(["**"]))
configure_make(
    name="xmlsec1",
    lib_name="xmlsec1",
    lib_source=":all_srcs",
    configure_command="configure",
    configure_in_place=True,
    out_binaries=["xmlsec1"],
    targets=["install"],
)
xmlsec1.bzl:
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
def xmlsec1():
    http_archive(
        name = "xmlsec1",
        url = "https://www.aleksey.com/xmlsec/download/xmlsec1-1.2.37.tar.gz",
        sha256 = "5f8dfbcb6d1e56bddd0b5ec2e00a3d0ca5342a9f57c24dffde5c796b2be2871c",
        build_file = "@//:xmlsec1.BUILD",
    )
All works fine for me until Bazel's remote cache gets activated and I'm building among different Linux distributions.
To avoid cache collisions I'm running bazel build with --action_env=SYSTEM_DIGEST="$(cat /etc/os-release)", resulting in different hashes on different distros.
While this approach seems to work for the artifacts defined in xmlsec1 (I can see this from the execution logs and from observing the expected rebuilds), the foreign_cc part seems to be built without those action_env variables.
This is what I get when I try to build @xmlsec1//:xmlsec1 (line breaks for readability):
+ /home/me/.cache/bazel/_bazel_me/8f6a55c898f3ec22f87d9cee5890b9e5/sandbox/processwrapper-sandbox/5/execroot/my_project_packages/bazel-out/k8-opt-exec-2B5CBBC6/bin/external/rules_foreign_cc/toolchains\
/make/bin/make install
/home/me/.cache/bazel/_bazel_me/8f6a55c898f3ec22f87d9cee5890b9e5/sandbox/processwrapper-sandbox/5/execroot/my_project_packages/bazel-out/k8-opt-exec-2B5CBBC6/bin/external/rules_foreign_cc/toolchains\
/make/bin/make: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /home/me/.cache/bazel/_bazel_me/8f6a55c898f3ec22f87d9cee5890b9e5/sandbox/processwrapper-sandbox/5/execroot/my_project_packages/bazel-out/k8-opt-exec-2B5CBBC6/bin/external/rules_foreign_cc/toolchains/make/bin/make)
/home/me/.cache/bazel/_bazel_me/8f6a55c898f3ec22f87d9cee5890b9e5/sandbox/processwrapper-sandbox/5/execroot/my_project_packages/bazel-out/k8-opt-exec-2B5CBBC6/bin/external/rules_foreign_cc/toolchains\
/make/bin/make: /lib64/libc.so.6: version `GLIBC_2.33' not found (required by /home/me/.cache/bazel/_bazel_me/8f6a55c898f3ec22f87d9cee5890b9e5/sandbox/processwrapper-sandbox/5/execroot/my_project_packages/bazel-out/k8-opt-exec-2B5CBBC6/bin/external/rules_foreign_cc/toolchains/make/bin/make)
I get this linker error only with bazel remote cache being activated and (this is the interesting part) having xmlsec1 built on a recent distribution (say Ubuntu 22.04) and then trying to build it on Centos-8.
So I guess this is what's going on:
make from foreign_cc gets built and linked against a recent version of GLIBC, ignoring values provided with --action_env
foreign_cc artifacts are being stored in the bazel remote cache
another build on an older distro (Centos-8) also tries to build make and has no reason not to take the artifacts from the cache, since --action_env values are also ignored here, resulting in the same hash
since the binaries are linked against a version of GLIBC which is not available yet on Centos-8 they are not compatible and crash with the error you see above.
So my question(s) is(/are):
Is that intended behavior? why is --action_env being ignored for builds Bazel runs implicitly (and not for those defined explicitly)?
Is there a way to apply those rules to Bazel's own dependencies?
Is there a better way to define system properties with effects on all Builds?

Problems with updating pgfplots inside docker with tds file structure

I have a docker image with texlive installed (via apt, not tlmgr). My project contains a pgfplots plot that needs a newer pgfplots version. I'm looking for a way to update pgfplots, since I can't use tlmgr because the base install came from apt.
Initial error message if I try to compile with texlive 2014:
! Package pgfkeys Error: Choice '1.16' unknown in choice key '/pgfplots/compat/
anchors'. I am going to ignore this key.
See the pgfkeys package documentation for explanation.
Type H <return> for immediate help.
...
l.7 \pgfplotsset{compat=1.16}
?
! Emergency stop.
...
l.7 \pgfplotsset{compat=1.16}
I downloaded the pgfplots.tds and did the following steps like the manual said:
docker cp pgfplots.tds docker_container_name:/root/texmf/pgfplots
export TEXINPUTS=/root/texmf/pgfplots/tex//:
export TEXDOCS=/root/texmf/pgfplots/doc//:
export LUAINPUTS=/root/texmf/pgfplots//:
texhash
Of course the export and texhash were done inside the container and not on the host system.
After this, the error message is gone, but I have a new issue:
package pgfplots notification 'compat/show suggested version=true': you might b
enefit from \pgfplotsset{compat=1.18} (current compat level: 1.16).
! Illegal parameter number in definition of \pgfmaththisrow#.
<to be read again>
I searched online and got the response that this is because of a broken pgfplots installation. In many articles the fix was just to install the texlive new. But I can't do that.
The issue should also not be in the tex code itself. If I install texlive on my host system, which is the most recent Ubuntu distro, the tex compiles just fine.
Can somebody help me in fixing this or lead me to a better way of upgrading pgfplots?
Resolution:
The pgfplots packages 1.18.1 and 1.16 were both too recent and conflicted with the installed pgf package. I went further back and landed on \pgfplotsset{compat=1.14} and version 1.14 of pgfplots.tds.
This works fine now. I was probably pretty lucky that my plot looks and functions the same with this version as in 1.18.
This approach probably won't work for you if you're more tightly bound to version 1.18.

Building tensorflow 2.2.0 pip wheel file, for use in CentOS system (older libc)

Introduction:
I have to create a pip wheel of Tensorflow 2.2.0 with the CUDA libraries dynamically linked (specifically cudart.so). To accomplish this I am currently using the tensorflow-dev docker image.
I am able to build the tf wheel file, and am able to install and use it while inside the build container.
Issue:
The issue is that when importing the generated wheel on a CentOS server, I get the following error:
ImportError: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /home1/private/mavridis/Vineyard/tensorflowshared/test/lib64/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)
Having looked around, the issue is caused by the build container using a newer libc:
ldd --version
ldd (Ubuntu GLIBC 2.27-3ubuntu1) 2.27
Compared to CentOS older version:
ldd --version
ldd (GNU libc) 2.17
Expected behavior:
Having already tried the 'vanilla' tensorflow 2.2.0 version with no issues, installed using pip:
pip install tensorflow==2.2.0
I expected my own build to also work.
So i assume there is some configuration option or docker configuration to allow me to use the docker built wheel file, in a CentOS setup, just like the pip installed version. As this wheel file is intended to be deployed to setups beyond my control, solutions involving alternate OSes and/or libc replacement are not applicable.
Build configuration:
During build i use the following configuration/ command line:
export TF_NEED_CUDA=1
export TF_USE_XLA=0
export TF_SET_ANDROID_WORKSPACE=0
export TF_NEED_OPENCL_SYCL=0
export TF_NEED_ROCM=0
bazel build --config=opt --config=cuda --output_filter=DONT_MATCH_ANYTHING --linkopt=-L/usr/local/cuda/lib64 --linkopt=-lcudart --linkopt=-static-libstdc++ //tensorflow/tools/pip_package:build_pip_package
Regarding options used:
--output_filter=DONT_MATCH_ANYTHING : Silence warnings
--linkopt=-L/usr/local/cuda/lib64 --linkopt=-lcudart : Dynamic linking of cudart.so
--linkopt=-static-libstdc++ : Statically link libstdc++, as libstdc++ also caused the libc error; this however is not possible for libm
I expected my own build to also work.
That expectation is (obviously) incorrect. The symbols your program or library requires from GLIBC depend on exactly which functions you call.
Consider the following program:
int main() { exit(0); }
When compiled/linked on a GLIBC-2.30 system, this program only depends on GLIBC_2.2.5 (because it doesn't call any newer symbols).
Now change the program slightly:
int main() { gettid(); exit(0); }
Compile/link it again, and all of a sudden this program now requires GLIBC_2.30 (because that's where gettid() was added to GLIBC), and will not work on any system which has older GLIBC.
So i assume there is some configuration option or docker configuration
Sure: your Docker image must have a GLIBC that is not newer than what your target system has, i.e. GLIBC-2.17. Your current image contains GLIBC-2.27 (or newer).
You need a different Docker image, and you'll likely have to build it yourself, since GLIBC-2.17 is over 7 years old, and predates TensorFlow by many years.
Update:
What i don't understand is how come the pip tensorflow package (which i assumed was build with the docker image i am using) works with CentOS?
It works by accident, just like my first program would work on CentOS, but the second one wouldn't.
In short i wanted to generate a pip package that would work on 'any' linux/libc version
That is an impossible goal: Linux predates GLIBC, and it is impossible to build a single package that will work on a Linux distribution which didn't include GLIBC and on a distribution that did.
You have to draw a line somewhere. The developers of tensorflow-dev docker image drew a line at GLIBC-2.27. Packages built on this image should work on any system with 2.27 or later, and might (but are not at all guaranteed to) work on older systems.
just like the pip installed version.
You claim that the pip installed version has no "only GLIBC-xx or later" requirement, but that is not true. I am 99.9% sure that it requires at least GLIBC-2.14.
To find which GLIBC versions that package requires, run this command:
readelf -WV _pywrap_tensorflow_internal.so | grep GLIBC_
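To boil that output down to a single number, a sketch like the following can be piped after readelf. The helper name max_glibc is hypothetical, the demo lines are made-up text rather than real readelf output, and it assumes GNU sort with -V (version sort) is available.

```shell
#!/usr/bin/env bash
# max_glibc: read text on stdin (for example `readelf -WV` output) and print
# the highest GLIBC_x.y[.z] version tag it contains.
# Tags without a numeric version, such as GLIBC_PRIVATE, are ignored.
max_glibc() {
    grep -oE 'GLIBC_[0-9]+(\.[0-9]+)+' | sort -uV | tail -n 1
}

# Demo on made-up sample lines (not real readelf output).
printf '%s\n' \
    'Name: GLIBC_2.2.5  Flags: none  Version: 4' \
    'Name: GLIBC_2.17   Flags: none  Version: 3' \
    'Name: GLIBC_2.4    Flags: none  Version: 2' | max_glibc
# prints GLIBC_2.17

# Real use (file name taken from the question):
#   readelf -WV _pywrap_tensorflow_internal.so | max_glibc
```

The printed tag is the minimum glibc the target system needs in order to load that file.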
I assumed, the pip installed version was built using the publicly available tensorflow-devel docker image.
That is quite likely. And like I said, it happens to work on CentOS, but minute changes may make it not work anymore.
Update 2:
So running the readelf command as you suggested does show the most recent required versions to be:
- pip version: GLIBC_2.12
- mine: GLIBC_2.27
So from what I understand, the pip version uses an older version even than the one on CentOS, which explains why it works.
It doesn't "use" older version, it uses whatever version is available.
It requires a minimum version 2.12, while your build requires a minimum version 2.27.
How do they achieve this? Do they use a different image that has an older libc? If so, where can i get it? Or do they use the public image, but build with some bazel flag, that 'limits' symbols to the ones contained up to libc 2.12?
You are still not getting it.
The version that your program requires depends on exactly which functions you call. In my example program, if I only call exit, my program requires version 2.2.5, but if I also call gettid, then my program requires version 2.30. Note: these two programs are built on the same system with the same flags.
So no: they (most likely) didn't use a different Docker image, and didn't use "magic" bazel flags. They just happened to not call any functions which require GLIBC version > 2.12, and you did.
P.S. You can find which symbol(s) are causing "bad" dependency in your build like so:
readelf -Ws _pywrap_tensorflow_internal.so | egrep 'GLIBC_2.2[0-9]'
readelf -Ws _pywrap_tensorflow_internal.so | egrep 'GLIBC_2.1[89]'
This would produce output similar to (using my second program):
readelf -Ws a.out | egrep 'GLIBC_2.[23][0-9]'
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND gettid@GLIBC_2.30 (2)
48: 0000000000000000 0 FUNC GLOBAL DEFAULT UND gettid@@GLIBC_2.30
The output above shows that the only symbol my binary requires from GLIBC 2.20 or above is gettid.
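The two egrep patterns above hard-code the version ranges. As a rough alternative, the sketch below takes a limit such as 2.17 and lists every symbol bound to a newer GLIBC version. The function name symbols_newer_than is hypothetical, the demo input is sample text shaped like readelf -Ws output, and GNU sort -V is assumed for the version comparison.

```shell
#!/usr/bin/env bash
# symbols_newer_than LIMIT: read `readelf -Ws` output on stdin and print
# "symbol version" for every symbol bound to a GLIBC version above LIMIT.
symbols_newer_than() {
    limit=$1
    grep -oE '[A-Za-z0-9_]+@@?GLIBC_[0-9]+(\.[0-9]+)+' |
        sed -E 's/@@?GLIBC_/ /' |
        sort -u |
        while read -r sym ver; do
            # sort -V orders version strings; if LIMIT is not the larger of
            # the two, the symbol needs a newer glibc than LIMIT.
            newest=$(printf '%s\n%s\n' "$limit" "$ver" | sort -V | tail -n 1)
            if [ "$newest" != "$limit" ]; then
                echo "$sym $ver"
            fi
        done
}

# Demo on sample lines shaped like readelf -Ws output.
printf '%s\n' \
    '2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND gettid@GLIBC_2.30 (2)' \
    '3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND exit@GLIBC_2.2.5 (3)' |
    symbols_newer_than 2.17
# prints: gettid 2.30

# Real use (file name taken from the question):
#   readelf -Ws _pywrap_tensorflow_internal.so | symbols_newer_than 2.17
```

Passing the target system's glibc version as the limit (2.17 for the CentOS server in the question) shows exactly which calls would have to go before the library can load there.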
To make a counter point to what Employed Russian wrote:
The version that your program requires depends on exactly which functions you call. In my example program, if I only call exit, my program requires version 2.2.5, but if I also call gettid, then my program requires version 2.30. Note: these two programs are built on the same system with the same flags.
I don't think that's quite accurate. My understanding, which is corroborated by https://github.com/wheybags/glibc_version_header, is that things work like so (quoting that project, emphasis mine):
Glibc uses something called symbol versioning. This means that when you use e.g., malloc in your program, the symbol the linker will actually link against is malloc@GLIBC_YOUR_INSTALLED_VERSION (actually, it will link to malloc from the most recent version of glibc that changed the implementation of malloc, but you get the idea).
So my guess (I have not checked) would be that the Tensorflow releases are built against an older glibc (perhaps by way of being built on an older release of their target Linux distro).

Lua version in ZeroBraneStudio vs Torch

I am using ZeroBrane Studio as IDE to code deep learning. I have realized that the models I save when programming in the IDE (using Lua 5.1 as interpreter) do not load well when executing the same loading from Torch7. The same happens when learning from torch (./th code.lua) and then trying to load them inside the IDE. I get something like:
/opt/zbstudio/bin/linux/x64/lua: /home/dg/torch/install/share/lua/5.1/torch/File.lua:294: unknown object
Does anybody know how to check the Lua version that Torch is using? Any idea how to work around this?
Thanks!
update: It seems that I am indeed using the same Lua version (5.1) in both Torch and ZeroBrane. I still get different behaviour (one successful and the other crashing) when passing through torch.load().
To check the version of Lua that anything is running, you would usually print _VERSION. It's a global variable that stores the version of Lua (unless you overwrite it, of course).
print(_VERSION)
If this isn't available for some reason, they might state their version on their site (?)
Most command line tools on Linux understand the -v command line switch (for "version"). So do Lua and LuaJIT.
To figure out which interpreter is running a particular script, you can scan the arg table for the smallest (usually negative) index:
local exe, i = arg[ 0 ], -1
while arg[ i ] do
  exe, i = arg[ i ], i-1
end
print( exe )
Or (on Linux) you can look into the /proc file system while your script is running:
ls -l /proc/4425/exe
(substitute 4425 with real process ID).
Judging from the error message the interpreter used in ZeroBrane Studio seems to be /opt/zbstudio/bin/linux/x64/lua in your case.
@siffiejoe: thanks for posing your question regarding versions, it gave me the correct directions to explore.
/opt/zbstudio/bin/linux/x64/lua version is LuaJIT 2.0.2
"lua" command alone points to /usr/bin/lua, and it is Lua 5.1.5
~/torch/install/share/lua/5.1 seemed to contain Lua 5.1
~/torch/install/bin/luajit is 2.1.0-alpha
So after realizing that the terminal "th" is using LuaJIT 2.1.0, all I had to do was create a user.lua in ZeroBrane and add the line path.lua = "~/torch/install/bin/luajit". Now ZB is using the same LuaJIT interpreter as th.
Thanks all for your suggestions.

How to change the version of python that pyscripter uses

I am a newb with python and just learning what to do.
I am using pyscripter and have been for a while whilst learning.
I am now going through an online course which is taught in 2.6, yet my pyscripter uses the latest.
I need to know how to change it to use an older version, I have seen replies about changing the PATH variable but not where it is or how to do it.
I have 3 versions of python on my machine, 25,26 and 33.
I don't know if this is the best way to do it, but those are the two ways I did it:
WAY 1 (The best of two)
Go to PyScripter>>Tools>>Options...>>Custom Parameters... and add the following values
1. PythonDir = C:\Program Files\CustomPythonInstallation
2. PythonExe = C:\Program Files\CustomPythonInstallation\python.exe
3. PythonVer = 3.3.3
Note: Adapt the Name = Value pairs above to your case.
And close the window with OK button.
Now select PyScripter>>Run>>Python Engine>>Remote and you are ready to go.
WAY 2 (The more temporary solution)
Go to PyScripter>>Run>>Configure External Run...
set the "Application:" field to your python.exe file
Close the window with OK button.
Make sure you run your scripts with PyScripter>>Run>>External Run (Alt+F9)
I hope this helped, good luck.
The easiest way I know (on Windows) is, having used the installer executable, I select from the Start menu's PyScripter folder whichever version of Python I want to run.
You can modify the PYTHONPATH (under Pyscripter>>Tools, for instance)
You can modify your External Python Interpreter with Pyscripter>>Modify Tools>>Python &Interpreter>>Modify
You can modify the default Python engine used with Pyscripter>>Options>>IDE Options>>Python Interpreter>>Python Engine Type
You can simply redirect Pyscripter to see the environment of a different Python distribution.
In Windows, do this by assigning PYTHONDLLPATH in the Pyscripter shortcut. You can r-click on the shortcut, access its properties and then set the target to:
[Pyscripter executable dir] --PYTHONDLLPATH [Python distribution dir]
For example, in my Win10 64-bit computer I have a Python 2.7.8 installation back from when I installed ArcGIS, which is automatically recognized by my 32-bit Pyscripter installation.
In the same computer, I also have Anaconda installed with two environments that feature two 64-bit Python distributions:
2.7.14 in "C:\ProgramData\Anaconda2"
3.6 in "C:\Users\bouzi\AppData\Local\conda\conda\envs\py3"
When I installed a 64-bit version of Pyscripter, that Pyscripter version couldn't even open, as it couldn't find the conda distributions. I had to point them to it by replacing the shortcut target to:
"C:\Program Files\PyScripterx64\PyScripter.exe" --PYTHONDLLPATH "C:\ProgramData\Anaconda2"
You can create three Pyscripter shortcuts that point to these different installations of Python within your system. It's probably not the optimal way to deal with this but it works, and allows you to combine Anaconda environments with Pyscripter.
You can also read more on opening non-standard python distributions with PyScripter from this link.
Run -> Python Versions -> Setup Python Versions -> Add... and select the folder.
P.S. Python 3.7.3 was picked up fine this way. Python 3.10.5, however, could not be identified by PyScripter like this (it does work with the WAY 1 solution in this thread, but pip install then did not succeed in that environment).
