Terraform CustomScriptExtension reports unable to download files - terraform-provider-azure

G'day,
I am provisioning the CustomScriptExtension VM extension through Terraform; however, it reports an error that it is unable to download the files.
resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-ad-ext-install-addom" {
count = "${var.count_ad_vm}"
name = "${var.ad_base_hostname}${format("%02d",count.index+1)}-CSE"
location = "${azurerm_resource_group.tf-rg-grpprd-core.location}"
resource_group_name = "${azurerm_resource_group.tf-rg-grpprd-core.name}"
virtual_machine_name = "${var.ad_base_hostname}${format("%02d",count.index+1)}"
publisher = "${var.extension_publisher_ad}"
type = "${var.extension_type_customerscriptextension}"
auto_upgrade_minor_version = "true"
type_handler_version = "${var.extension_version_customscriptextension}"
depends_on = ["azurerm_virtual_machine.tf-vm-grpprd-ad"]
settings = <<SETTINGS
{
"fileUris": ["https://blackbeltteam.visualstudio.com/blackbeltteam/_git/groupsprod/scripts/Install_AD_Components.ps1"],
"commandToExecute": "powershell.exe pwd",
"commandToExecute": "powershell.exe ls",
"commandToExecute": "powershell.exe -ExecutionPolicy unrestricted -NoProfile -NonInteractive -File Install_AD_Components.ps1"
}
SETTINGS
}
I get the error message as below:
2019-07-31T01:17:20.0860444Z Error: Code="VMExtensionProvisioningError" Message="VM has reported a failure when processing extension 'grpprdad02-CSE'. Error message: \"Failed to download all specified files. Exiting. Error Message: The remote server returned an error: (404) Not Found.\"."
2019-07-31T01:17:20.0861339Z   on ad.tf line 250, in resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-ad-ext-install-addom":
2019-07-31T01:17:20.0861901Z   250: resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-ad-ext-install-addom" {

2019-07-31T01:17:20.0895064Z Error: Code="VMExtensionProvisioningError" Message="VM has reported a failure when processing extension 'grpprdad01-CSE'. Error message: \"Failed to download all specified files. Exiting. Error Message: The remote server returned an error: (404) Not Found.\"."
2019-07-31T01:17:20.0896889Z   on ad.tf line 250, in resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-ad-ext-install-addom":
2019-07-31T01:17:20.0897664Z   250: resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-ad-ext-install-addom" {

2019-07-31T01:17:20.0944168Z Error: compute.VirtualMachineExtensionsClient#CreateOrUpdate: Failure sending request: StatusCode=404 -- Original Error: Code="ParentResourceNotFound" Message="Can not perform requested operation on nested resource. Parent resource 'grpprdaos01' not found."
2019-07-31T01:17:20.0945288Z   on aos.tf line 240, in resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-aos-ext-join-ad":
2019-07-31T01:17:20.0945858Z   240: resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-aos-ext-join-ad" {

2019-07-31T01:17:20.0991014Z Error: compute.VirtualMachineExtensionsClient#CreateOrUpdate: Failure sending request: StatusCode=404 -- Original Error: Code="ParentResourceNotFound" Message="Can not perform requested operation on nested resource. Parent resource 'grpprdaos02' not found."
2019-07-31T01:17:20.1002906Z   on aos.tf line 240, in resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-aos-ext-join-ad":
2019-07-31T01:17:20.1003501Z   240: resource "azurerm_virtual_machine_extension" "tf-vm-grpprd-aos-ext-join-ad" {
The output does not indicate which command failed or where it was running. How can I work out what is causing the problem?

The problem is likely that the fileUris entry returns a 404: https://blackbeltteam.visualstudio.com/blackbeltteam/_git/groupsprod/scripts/Install_AD_Components.ps1 points at an Azure DevOps git repository page, not a raw, anonymously downloadable file, so the extension cannot fetch it. The script can be hosted anywhere the VM can route to, such as a storage blob, GitHub, or an internal file server.
Make sure every URI listed in fileUris in the SETTINGS block can actually be downloaded from the VM itself. If the script has to be fetched from an external endpoint, the necessary firewall and Network Security Group ports must also be opened.
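For example, here is a minimal sketch of a working settings block, assuming the script has been uploaded to a storage account the VM can reach anonymously or via SAS token (the URL below is a placeholder). Note also that the settings payload is JSON, so repeating the "commandToExecute" key means only one of the three commands survives parsing; chain them into a single command instead:
settings = <<SETTINGS
{
  "fileUris": ["https://examplestorage.blob.core.windows.net/scripts/Install_AD_Components.ps1"],
  "commandToExecute": "powershell.exe -ExecutionPolicy Unrestricted -NoProfile -NonInteractive -File Install_AD_Components.ps1"
}
SETTINGS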
For more references:
Custom Script Extension for Windows
Bootstrapping Azure VMs with Terraform

I was hitting the same issue and eventually found it was because the file name in blob storage contained upper-case letters; blob names are case sensitive, so the URI in fileUris must match exactly. Hopefully this will help someone else avoid the same issue.

Related

Pyspark running in docker container cannot write file

I have a Docker container running PySpark, Hadoop, and all the required dependencies. I am using spark-submit to query MinIO, and I want to write the output DataFrame to a file. Reading the file works, but writing does not. If I run Python in the same container and try to create a file at the same path, it works.
Am I missing some Spark configuration?
This is the error I get:
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1109, in save
File "/usr/local/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
File "/usr/local/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o38.save
: java.net.ConnectException: Call From 10d3463d04ce/10.0.1.132 to localhost:9000 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Relevant code:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark_context = spark.sparkContext
spark_context._jsc.hadoopConfiguration().set('fs.s3a.access.key', 'minio')
spark_context._jsc.hadoopConfiguration().set(
'fs.s3a.secret.key', AWS_SECRET_ACCESS_KEY
)
spark_context._jsc.hadoopConfiguration().set('fs.s3a.path.style.access', 'true')
spark_context._jsc.hadoopConfiguration().set(
'fs.s3a.impl', 'org.apache.hadoop.fs.s3a.S3AFileSystem'
)
spark_context._jsc.hadoopConfiguration().set('fs.s3a.endpoint', AWS_S3_ENDPOINT)
spark_context._jsc.hadoopConfiguration().set(
'fs.s3a.connection.ssl.enabled', 'false'
)
df = spark.sql(query)
df.show() # this works perfectly fine
df.coalesce(1).write.format('json').save(output_path) # here I get the error
The solution was to prepend file:// to output_path. Without a scheme, Spark resolves the path against the configured default filesystem, which in this container evidently points at HDFS on localhost:9000 (hence the connection refused error), rather than the local filesystem that plain Python writes to.
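A minimal sketch of the fix, assuming output_path is an absolute path inside the container:
# Write to the container's local filesystem instead of the default FS (HDFS)
df.coalesce(1).write.format('json').save('file://' + output_path)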

Jenkins with Azure AD integration fails with "A problem occurred while processing the request"

There is a lot of help available for this, but I have not been able to fix it. After I enter my username and password, the Jenkins login with Azure AD succeeds and returns a token; the Azure side also shows the login as successful, and the Jenkins UI prints the token. After that it fails with "A problem occurred while processing the request".
In the error logs I see:
javax.net.ssl|DEBUG|13|Handling POST /securityRealm/finishLogin from x.x.x.x : Jetty (winstone)-19|2021-07-15 19:36:53.374 EDT|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=login.microsoftonline.com) was replaced with (type=host_name (0), value=login.microsoftonline.com)
2021-07-15 23:36:55.398+0000 [id=326] INFO c.m.a.a.AuthenticationAuthority#doInstanceDiscovery: [Correlation ID: e11160be-50c3-43d7-96a8-dc02c3cc2b2c] Instance discovery was successful
javax.net.ssl|ERROR|13|Handling POST /securityRealm/finishLogin from x.x.x.x : Jetty (winstone)-19|2021-07-15 19:36:55.769 EDT|TransportContext.java:344|Fatal **** (CERTIFICATE_UNKNOWN): PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target (
"throwable" : {
Also
javax.net.ssl|DEBUG|13|Handling POST /securityRealm/finishLogin from x.x.x.x : Jetty (winstone)-19|2021-07-15 19:36:55.773 EDT|SSLSocketImpl.java:1569|close the underlying socket
javax.net.ssl|DEBUG|13|Handling POST /securityRealm/finishLogin from x.x.x.x : Jetty (winstone)-19|2021-07-15 19:36:55.773 EDT|SSLSocketImpl.java:1588|close the SSL connection (initiative)
2021-07-15 23:36:55.787+0000 [id=19] SEVERE c.m.j.azuread.AzureSecurityRealm#doFinishLogin: error
sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
I have imported certs for login.microsoftonline.com and portal.azure.com, my jenkins.xml has
-Djavax.net.ssl.trustStore="C:\Program Files (x86)\Jenkins\.cacerts\jssecacerts" -Djavax.net.ssl.trustStorePassword=changeit
I am not sure which site the "unable to find valid certification path to requested target" error is referring to.
As posted in another Stack Overflow answer, I would start by debugging what is happening with your trust store, with something like:
java -Djavax.net.debug=all -Djavax.net.ssl.trustStore="C:\Program Files (x86)\Jenkins\.cacerts\jssecacerts" -Djavax.net.ssl.trustStorePassword=changeit
You may also want to have a look at this post.
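If the debug output shows the trust store is missing a certificate (for example one injected by a corporate proxy), importing it with keytool is the usual remedy. A sketch, assuming the certificate has been exported to a hypothetical file login_microsoftonline_com.cer:
keytool -importcert -trustcacerts -alias msonline -keystore "C:\Program Files (x86)\Jenkins\.cacerts\jssecacerts" -storepass changeit -file login_microsoftonline_com.cer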

Failed to start the VM error when starting a Dataflow SQL job

Getting the following error when I try to launch a Dataflow SQL job:
Failed to start the VM, launcher-____, used for launching because of status code: INVALID_ARGUMENT, reason: Error: Message: Invalid value for field 'resource.networkInterfaces[0].network': 'global/networks/default'. The referenced network resource cannot be found. HTTP Code: 400.
This issue just started today.
Adding the default network solved the issue.
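For reference, if the default VPC was deleted from the project, it can be recreated with gcloud (a sketch; assumes auto subnet mode is acceptable):
gcloud compute networks create default --subnet-mode=auto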

How to close Appium driver properly in Python?

I am using Python 3.7 with Appium 1.15.1 on real Android Device.
When my script finishes its job, I close the driver with these lines:
if p_driver:
    p_driver.close()
but I get this error output:
File "C:\Users\Nino\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 688, in close
self.execute(Command.CLOSE)
File "C:\Users\Nino\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\Nino\AppData\Roaming\Python\Python37\site-packages\appium\webdriver\errorhandler.py", line 29, in check_response
raise wde
File "C:\Users\Nino\AppData\Roaming\Python\Python37\site-packages\appium\webdriver\errorhandler.py", line 24, in check_response
super(MobileErrorHandler, self).check_response(response)
File "C:\Users\Nino\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: An unknown server-side error occurred while processing the command. Original error: Could not proxy. Proxy error: Could not proxy command to remote server. Original error: 404 - undefined
I would like to understand what I am doing wrong. What is the proper way to close the driver?
Can you help me please?
Step 1: create the driver session:
session_instance = webdriver.Remote(str(url), caps_dic)
where url is your Appium server URL (something like "http://127.0.0.1:4723/wd/hub") and caps_dic is a dictionary of all your desired capabilities.
Step 2: call quit() on that session, not close(): quit() ends the whole Appium session, while close() only closes the current window, which the Android driver cannot proxy, hence the 404:
session_instance.quit()
So the whole snippet is:
session_instance = webdriver.Remote(str(url), caps_dic)
session_instance.quit()
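To make sure the session is torn down even if the script fails midway, wrapping the work in try/finally is a common pattern. A sketch, assuming the same url and caps_dic as above:
from appium import webdriver

driver = webdriver.Remote(str(url), caps_dic)
try:
    pass  # interact with the app here
finally:
    driver.quit()  # ends the Appium session; close() only closes a window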

peer node status command not working correcly in hyperledger fabric network

I have a problem like this. I am very new to Hyperledger Fabric. I attach a shell to a running peer container in Visual Studio Code and run the peer node status command in that terminal, and it gives me an error saying:
2018-09-13 09:08:04.621 UTC [nodeCmd] status -> INFO 040 Error trying to get status from local peer: rpc error: code = Unknown desc = access denied
status:UNKNOWN
Error: Error trying to connect to local peer: rpc error: code = Unknown desc = access denied
Can someone help me solve this problem? I have searched a lot but was unable to find a solution. Thank you!
Edit: the problem is that you are using an old card with a new setup. When you create the app and then restart the environment, the certificates are regenerated, so the old card no longer matches them.
I guess the problem is the FABRIC_VERSION. When you set it to hlfv1 and bash into the peer container (docker exec -it peer0.org1.example.com bash), the peer commands work properly, but when you set it to hlfv12 some peer commands do not work. I suspect something is wrong with the startup scripts; incidentally, there is no "creds" folder under hlfv12/composer like there is under hlfv1/composer.
The peer node status command must be called by an administrator of the peer (someone who holds a private key matching one of the public keys in the MSP admincerts folder).
You need to run peer commands from a client configured with the correct authentication material; in my case that was the CLI node.
Peer node logs:
root@bba2c96e744e:/# peer node status
2019-04-04 13:26:18.407 UTC [nodeCmd] status -> INFO 001 Error trying to get status from local peer: rpc error: code = Unknown desc = access denied
status:UNKNOWN
Error: Error trying to connect to local peer: rpc error: code = Unknown desc = access denied
root@bba2c96e744e:/# peer chaincode list --installed
Error: Bad response: 500 - access denied for [getinstalledchaincodes]: Failed verifying that proposal's creator satisfies local MSP principal during channelless check policy with policy [Admins]: [This identity is not an admin]
root@bba2c96e744e:/# peer logging getlevel system
Error: rpc error: code = Unknown desc = access denied
CLI node logs:
root@4079f33980f3:/# peer node status
status:STARTED
root@4079f33980f3:/# peer chaincode list --installed
Get installed chaincodes on peer:
Name: ccc, Version: 1.0, Path: chaincode/ccc, Id: e75e5770a29401d840b46a775854a1bb8576c6d83cf2832dce650d2a984ab29a
root@4079f33980f3:/# peer logging getlevel system
2019-04-04 13:26:02.287 UTC [cli/logging] getLevel -> INFO 001 Current log level for peer module 'system': INFO
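If you do need admin commands to work from inside the peer container itself, pointing the CLI at the admin's MSP material is usually enough. A sketch, assuming crypto material at paths typical of a cryptogen-generated network (adjust for yours):
export CORE_PEER_LOCALMSPID=Org1MSP
export CORE_PEER_MSPCONFIGPATH=/etc/hyperledger/msp/users/Admin@org1.example.com/msp
peer node status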
