I'm curious to understand a little more about the anatomy of Docker images. I understand how this works in the context of docker build: each step in the Dockerfile creates a new layer building on the last, and both FROM clauses and layer caching can mean layers are re-used between images.
I also know that layers are effectively composited using overlayfs or similar, with edited and new files stored whole in each layer and deletions recorded as whiteout files.
What I don't know is how these layers are then bound together. I don't know if there is a back reference in each layer to its parent, or if the sequencing of layers is defined by metadata held separately.
What I'm particularly curious about is whether or not it is hypothetically possible to take layers from unrelated images and splice them together into a new (working) image. That is, splice them without creating and storing copies in the Docker repo. You may assume that the unrelated images were constructed for this purpose.
Note: this is not an XY question. I genuinely want to know the answer to this question as asked, simply because I want to know.
Docker has to know the order in which to compose the layers into a container, so something must know what order the layers go in.
Right, that information is provided by the image manifest, which is a JSON file that, among other information, contains an ordered list of layers.
For example, let's look at the official postgres:15 image. We can grab the contents for inspection like this:
mkdir postgres-image
docker image save postgres:15 | tar -C postgres-image -xf-
Which gets us:
$ ls postgres-image
1c035120f97a959821c3d682e5ddb1b826410b01f1a66ef71426085d11abbac2
1cde06c3f46bf13f0d87eeacc400abc4b80283952c96a0f586a3c4dbc53dea8d
4c6b3cc10e6bbb2afd68caa44a3eb6cef12caf483e60b28c6de2c56e8f1b99bc.json
586bebe8d837fb5b377b572f83f88d65c3c667624edf6230951a4a714c2d879b
7de3bcc2590b6a6bdf63c49e4861b2d44234de5aff1017c982b51ae8343a8f9d
866914886a5c213ca953042f6d9f15a2a803ef8318321651f45552aff0910d9b
9ec9c81974bf829a515052eb4a318630bcaf43abd02bf4e59703f796fc5df66d
a2036affeb348540e00efc39a5bc8e8f099327725b0e81ce068c4e55ac76cb50
aea69f9cbe92ef8c8efd388c69e169f036070c8ab068acf993cf9096010e4191
d10e11a29048914235b611ff7a0c93ebb529b1d1f8f0e00f883a341f0193d9ab
d24fc4b0912a24f160d52cf9273bf07f0ca18ed0ae1855e41e4c1cbefa650dd5
e72ed9131fa5b904acf7bcfe7978f97e99c9fb087308b7cefd5063edea0bf7b8
eaf7c4940ac93309a059cc3f91f847a130317a98c7380e0534b2c525c8e805a3
fc5484c1b2ce93576cc100282c8e446581c5ffd48e6199a8b2d651d7f9171124
manifest.json
repositories
The manifest is contained in manifest.json, which looks like:
[
{
"Config": "4c6b3cc10e6bbb2afd68caa44a3eb6cef12caf483e60b28c6de2c56e8f1b99bc.json",
"RepoTags": [
"postgres:15"
],
"Layers": [
"fc5484c1b2ce93576cc100282c8e446581c5ffd48e6199a8b2d651d7f9171124/layer.tar",
"d10e11a29048914235b611ff7a0c93ebb529b1d1f8f0e00f883a341f0193d9ab/layer.tar",
"a2036affeb348540e00efc39a5bc8e8f099327725b0e81ce068c4e55ac76cb50/layer.tar",
"1cde06c3f46bf13f0d87eeacc400abc4b80283952c96a0f586a3c4dbc53dea8d/layer.tar",
"7de3bcc2590b6a6bdf63c49e4861b2d44234de5aff1017c982b51ae8343a8f9d/layer.tar",
"aea69f9cbe92ef8c8efd388c69e169f036070c8ab068acf993cf9096010e4191/layer.tar",
"586bebe8d837fb5b377b572f83f88d65c3c667624edf6230951a4a714c2d879b/layer.tar",
"9ec9c81974bf829a515052eb4a318630bcaf43abd02bf4e59703f796fc5df66d/layer.tar",
"1c035120f97a959821c3d682e5ddb1b826410b01f1a66ef71426085d11abbac2/layer.tar",
"e72ed9131fa5b904acf7bcfe7978f97e99c9fb087308b7cefd5063edea0bf7b8/layer.tar",
"eaf7c4940ac93309a059cc3f91f847a130317a98c7380e0534b2c525c8e805a3/layer.tar",
"d24fc4b0912a24f160d52cf9273bf07f0ca18ed0ae1855e41e4c1cbefa650dd5/layer.tar",
"866914886a5c213ca953042f6d9f15a2a803ef8318321651f45552aff0910d9b/layer.tar"
]
}
]
That list of layers is how Docker knows how to compose the final image: you start with the first layer, and then apply each subsequent layer on top of it.
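For instance, here is a minimal Python sketch (assuming the postgres-image directory extracted above) that prints the layer order straight from the manifest:

import json

# Read the manifest from the directory extracted with `docker image save`.
with open("postgres-image/manifest.json") as f:
    manifest = json.load(f)

# Index 0 is the base layer; each later entry is overlaid on top of it.
for i, layer in enumerate(manifest[0]["Layers"]):
    print(i, layer)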
I don't know if there is a back reference in each layer to its parent, or if the sequencing of layers is defined by metadata held separately.
The answer is "both". The sequencing is defined by the list of layers in the manifest, but each layer also contains a reference to its parent. For example, if we look at the file 1cde06c3f46bf13f0d87eeacc400abc4b80283952c96a0f586a3c4dbc53dea8d/json, we see:
{
"id": "1cde06c3f46bf13f0d87eeacc400abc4b80283952c96a0f586a3c4dbc53dea8d",
"parent": "a2036affeb348540e00efc39a5bc8e8f099327725b0e81ce068c4e55ac76cb50",
"created": "1969-12-31T19:00:00-05:00",
"container_config": {
"Hostname": "",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": null,
"Cmd": null,
"Image": "",
"Volumes": null,
"WorkingDir": "",
"Entrypoint": null,
"OnBuild": null,
"Labels": null
},
"os": "linux"
}
The attribute parent is a reference to the previous layer.
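As for the splicing question: hypothetically yes, at least at the level of this exported format. Below is a rough, untested Python sketch of the idea. It assumes you've run docker image save on both images, copied one layer directory from the donor image into postgres-image/ under the hypothetical name donor-layer/, and that docker load rebuilds the chain from the manifest. One real constraint: the config's rootfs.diff_ids must list the sha256 digest of every (uncompressed) layer tar, in manifest order, or the load will be rejected.

import hashlib
import json

def sha256_of(path):
    # Digest of an uncompressed layer tar, as expected in rootfs.diff_ids.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()

with open("postgres-image/manifest.json") as f:
    manifest = json.load(f)
entry = manifest[0]

# Splice the foreign layer on top and retag.
entry["Layers"].append("donor-layer/layer.tar")
entry["RepoTags"] = ["spliced:latest"]

# Keep the config's diff_ids in sync with the new layer list.
with open("postgres-image/" + entry["Config"]) as f:
    config = json.load(f)
config["rootfs"]["diff_ids"] = [
    sha256_of("postgres-image/" + layer) for layer in entry["Layers"]
]

with open("postgres-image/" + entry["Config"], "w") as f:
    json.dump(config, f)
with open("postgres-image/manifest.json", "w") as f:
    json.dump(manifest, f)

# Re-pack and load: tar -C postgres-image -cf- . | docker load
# Note: the config file is named after its old digest; if docker load objects,
# rename it to the sha256 of the edited content and update "Config" to match.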
Related
I'm trying to translate my company's project from a legacy build tool to Bazel. I'm now facing this problem and have searched a lot, but unfortunately I haven't found a clue so far.
Here's the thing:
For compliance with an open-source audit, we must provide a list of the open-source software built into our binary. As external dependencies are introduced by repository rules, my intuitive thought is to query these rules and get the URLs. However, the query/cquery subcommands don't provide such functionality yet AFAIK: they can print rules/targets/build files, but not repository rules or their attributes.
Is there a way I can gather such information from the repository rules in WORKSPACE? Doing it manually isn't viable, as there are thousands of projects in my company and the dependencies change frequently.
For example, a workspace rule:
http_archive(
name = "testrunner",
urls = ["https://github.com/testrunner/v2.zip"],
sha256 = "..."
)
This dependency is used by a rule named "my_target", so what I expected is that the dependency could be queried like this:
> bazel queryExtDep my_target
External Dependency of my_target: name->testrunner, urls = "https://github.com/testrunner/v2.zip"
--experimental_repository_resolved_file will give you all that information in a single Starlark file, which you can easily process with Starlark or Python etc. to extract the information you're looking for.
The resolved file looks something like this:
resolved = [
...,
{
"original_rule_class": "#bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
"original_attributes": {
"name": "com_google_protobuf",
"remote": "https://github.com/google/protobuf",
"branch": "master"
},
"repositories": [
{
"rule_class": "#bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
"attributes": {
"remote": "https://github.com/google/protobuf",
"commit": "78ba021b846e060d5b8f3424259d30a1f3ae4eef",
"shallow_since": "2018-02-07",
"init_submodules": False,
"verbose": False,
"strip_prefix": "",
"patches": [],
"patch_tool": "patch",
"patch_args": [
"-p0"
],
"patch_cmds": [],
"name": "com_google_protobuf"
}
}
]
}
]
This includes the original attributes, which is where the URL you're looking for is. It also includes any additional information returned by the repository rule (i.e. for git_repository, the actual commit a given ref refers to).
I got that example from the blog post introducing that flag, which also has some more background.
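Since the resolved file is valid Python syntax (Starlark literals like False carry over), post-processing it takes only a few lines. A rough sketch, assuming the flag was given a file named resolved.bzl:

# Pull names and source URLs out of the resolved file.
ns = {}
with open("resolved.bzl") as f:
    exec(f.read(), ns)

for entry in ns["resolved"]:
    attrs = entry["original_attributes"]
    # http_archive rules carry "urls"; git_repository rules carry "remote".
    source = attrs.get("urls") or attrs.get("remote")
    print("name ->", attrs["name"], ", source =", source)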
I wonder if Serilog can be configured to create a circular log file: when a certain file size limit is reached, it deletes the older records.
I've already tried to configure it with the out-of-the-box (File) writer:
"Name": "File",
"Args": {
"path": "c:\\temp\\MyFile.txt",
"fileSizeLimitBytes": 200,
"rollOnFileSizeLimit": true,
"retainedFileCountLimit": 1
}
And I've explored the sink "Serilog.Sinks.RollingFileAlternate" already.
But all these packages give me a size-based rolling file that creates a new log file each time, rather than a circular one.
I created a new extension for TFS following the MS tutorial. For some reason, when I add an icon to my extension, I can see the icon while installing the extension and in the "Extension Manager" page,
but when I choose my extension from the build step menu, the image is missing.
In the "vss-extension.json" file I added:
"icons": {
"default": "images/icon.png"
},
"files": [
{
"path": "images",
"addressable": true
},
{
"path": "dist",
"addressable": true,
"packagePath": "scripts"
},
{
"path": "infoTab.html",
"addressable": true
},
{
"path": "node_modules/vss-web-extension-sdk/lib",
"addressable": true,
"packagePath": "lib"
},
{
"path": "buildtask"
}
],
The image file is 32x32.
Should this image be referenced in the "task.json" file as well?
The accepted answer is not correct for Microsoft Visual Studio Team Foundation Server version 15.105.25910.0. Perhaps it was correct for previous versions.
The image file must be named icon.png.
The image file must be in the same folder as task.json.
The image file should be 32 x 32; no image scaling is applied.
The task.json file does not contain any reference to this file. It is located by using these conventions.
The task itself has its own icon: it must be stored in the same directory as the task.json, must be called icon.png, and must be 32x32 pixels; optionally, an additional icon.svg can be put alongside it. This is because one extension can contain multiple build tasks, and each build task then has its own icon. It's not referenced from the task.json; the correct file name will cause it to be picked up, as in the layout sketch below.
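For example, assuming your task lives in a folder named buildtask (as in the contribution point shown further down), the expected layout is roughly:

buildtask/
    task.json
    icon.png    (32x32, picked up purely by naming convention)
    icon.svg    (optional)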
For an example, check my Azure Pipelines Snyk task. Also, if this is your complete extension manifest, then it's missing the Build task contribution point:
"contributions": [
{
"id": "buildtask",
"type": "ms.vss-distributed-task.task",
"targets": [
"ms.vss-distributed-task.tasks"
],
"properties": {
"name": "buildtask"
}
}
]
I configured a task in VSCode to compile a Delphi 2005 dpk. It is working and returning the errors in the "problems view", but it is not showing those errors in the file.
I think this is happening because, when I click on an error, I get the error message:
Unable to open 'sr075pro.pas': File not found
(...projectfolder\sr075pro.pas)
But the file is in ...projectfolder\webservices\sr075pro.pas.
I can't find a way to tell the task that the file is in a subfolder. I tried to use the "relative" option on the "fileLocation" tag without success.
The error returned:
Compiling sa_webservices...
Borland Delphi Version 13.0 Copyright (c) 1983,99 Inprise Corporation
sr075pro.pas(99) Error: Undeclared identifier: 'ni'
sa_webservices.dpk(802) Fatal: Could not compile used unit 'sr075pro.pas'
My task configuration:
{
"version": "0.1.0",
"name": "Compilar",
"command": "C:\\Compilers\\compile.bat",
"suppressTaskName": true,
"isShellCommand": true,
"isBuildCommand": true,
"tasks": [
{
"taskName": "Compile sa_webservices",
"isBuildCommand": false,
"isTestCommand": false,
"showOutput": "always",
"args": [
"sa_webservices"
],
"problemMatcher": {
"owner": "external",
"fileLocation": "relative",
"pattern": {
"regexp": "^([\\w]+\\.(pas|dpr|dpk))\\((\\d+)\\)\\s(Fatal|Error|Warning|Hint):(.*)",
"file": 1,
"line": 3,
"message": 5
}
}
}
My compile.bat:
@echo off
@P:
@set arg1=%1
shift
...
if "%arg1%" == "sa_webservices" set arg2="webservices"
...
echo Compiling %arg1%...
cd\%arg2%
dcc32.exe -H -W -Q %arg1%.dpk
Your task configuration is wrong. First of all, you don't close all the brackets, but I guess that's a mistake made when copying and pasting it here on Stack Overflow; otherwise the task configuration wouldn't have worked at all.
Now to the real problem:
DCC32 produces hints and warnings containing relative file paths. These paths are relative to the project file. In your task configuration you define the compiler's output to contain relative paths by setting
"fileLocation": "relative"
Visual Studio Code doesn't know how to build the correct absolute path from the relative paths given in the compiler messages, so it guesses that your current ${workspaceRoot} (in your case, projectfolder) is the base path.
This explains why you see errors and warnings containing wrong file paths. To get the correct paths, you'll need to tell VSCode the correct base path to combine the relative paths with.
You do this by simply adding the correct path to the fileLocation entry in your tasks.json:
"fileLocation": ["relative", "${workspaceRoot}\\webservices"]
The entire tasks.json then looks like this:
{
"version": "0.1.0",
"name": "Compilar",
"command": "C:\\Compilers\\compile.bat",
"suppressTaskName": true,
"isShellCommand": true,
"isBuildCommand": true,
"tasks": [
{
"taskName": "Compile sa_webservices",
"isBuildCommand": false,
"isTestCommand": false,
"showOutput": "always",
"args": [
"sa_webservices"
],
"problemMatcher": {
"owner": "external",
"fileLocation": ["relative", "${workspaceRoot}\\webservices"],
"pattern": {
"regexp": "^([\\w]+\\.(pas|dpr|dpk))\\((\\d+)\\)\\s(Fatal|Error|Warning|Hint):(.*)",
"file": 1,
"line": 3,
"message": 5
}
}
}
]
}
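As a side note, you can sanity-check how the pattern's capture groups line up with the compiler output outside of VSCode. A quick Python sketch using the error line from the question:

import re

# Same pattern as in the problemMatcher above.
pattern = re.compile(r"^([\w]+\.(pas|dpr|dpk))\((\d+)\)\s(Fatal|Error|Warning|Hint):(.*)")

m = pattern.match("sr075pro.pas(99) Error: Undeclared identifier: 'ni'")
# Group 2 is the extension and group 4 the severity; the matcher maps 1, 3 and 5.
print(m.group(1), m.group(3), m.group(5))  # -> sr075pro.pas 99  Undeclared identifier: 'ni'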
It might be easier to find files in the problemMatcher as of VS Code 1.74; see "file location search" in the v1.74 release notes. There is a new option, search, for the fileLocation property:
New file location method: search
Previously, problem matchers needed to know exactly where to look for
the problematic files, via the fileLocation property. The supported
methods were absolute, relative, or autoDetect (i.e., check for
relative paths first and opt to absolute paths in case of failure).
However, in workspaces that need to invoke various scripts residing in
nested sub-directories, the developers could have a hard time setting
up their tasks; since such scripts seldom report file paths in a
unified manner (e.g., relative to the workspace's base directory).
To help alleviate the problem, a new file location method, named
search, is introduced in this version. With this method, a deep file
system search will be initiated to locate any captured path. See the
example below on how to setup the search file location method
(although, all parameters are optional):
// ...
"fileLocation": [
"search",
{
"include": [ // Optional; defaults to ["${workspaceFolder}"]
"${workspaceFolder}/src",
"${workspaceFolder}/extensions"
],
"exclude": [ // Optional
"${workspaceFolder}/extensions/node_modules"
]
}
],
// ...
⚠️ Of course, users should be wary of the possibly heavy file system searches (e.g., looking inside node_modules directories) and set the exclude property with discretion.
Is there a way to remove the string "[Finished in ...s]" from the output every time I build?
The build system has the quiet attribute to do exactly that. Just open the Lua build system, e.g. via PackageResourceViewer, add "quiet": true, and save the file. It should look like this:
{
"cmd": ["lua", "$file"],
"quiet": true,
"file_regex": "^(?:lua:)?[\t ](...*?):([0-9]*):?([0-9]*)",
"selector": "source.lua"
}