Migration from ES 2.x to ES 5.x (Elasticsearch 5)

We have created a new ES 5.x cluster and added the backup repository from our old ES 2.x cluster so we could restore from it.
We restored from our snapshot and everything was fine. We have a retention period for our ES snapshots.
We are using S3 storage and the repository-s3 plugin for backups.
After the retention period deleted this snapshot, we got the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "snapshot_missing_exception",
        "reason": "[s3_repository:snapshot_201701040203/snapshot_201701040203] is missing"
      }
    ],
    "type": "snapshot_exception",
    "reason": "[s3_repository:snapshot_201701040203/snapshot_201701040203] Snapshot could not be read",
    "caused_by": {
      "type": "snapshot_missing_exception",
      "reason": "[s3_repository:snapshot_201701040203/snapshot_201701040203] is missing",
      "caused_by": {
        "type": "no_such_file_exception",
        "reason": "Blob object [snap-snapshot_201701040203.dat] not found: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: DB308DF310809F58)"
      }
    }
  },
  "status": 500
}
Full log:
[2017-01-30T13:18:57,344][WARN ][r.suppressed ] path: /_snapshot/s3_repository/_all, params: {repository=s3_repository, snapshot=_all}
org.elasticsearch.snapshots.SnapshotException: [s3_repository:snapshot_201701040203/snapshot_201701040203] Snapshot could not be read
at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:187) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:122) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:50) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:86) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$3.doRun(TransportMasterNodeAction.java:170) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.1.2.jar:5.1.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: org.elasticsearch.snapshots.SnapshotMissingException: [s3_repository:snapshot_201701040203/snapshot_201701040203] is missing
at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:566) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:182) ~[elasticsearch-5.1.2.jar:5.1.2]
... 9 more
Caused by: java.nio.file.NoSuchFileException: Blob object [snap-snapshot_201701040203.dat] not found: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 38087D5B4A20B627)
at org.elasticsearch.cloud.aws.blobstore.S3BlobContainer.readBlob(S3BlobContainer.java:92) ~[?:?]
at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.readBlob(ChecksumBlobStoreFormat.java:100) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.repositories.blobstore.BlobStoreFormat.read(BlobStoreFormat.java:89) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:560) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:182) ~[elasticsearch-5.1.2.jar:5.1.2]
I have tried removing this snapshot repository and removing all indices, but if I add the repository again I get the same error.
How can I restore ES? Where does ES take the information about the old snapshot from?
Best regards.

I ran into this issue today. My situation is slightly different, but maybe this will help you. It appears that ES 5 maintains an index of snapshots in the bucket, whereas ES 2.x does not.
In my scenario, I'm transitioning from 2.x to 5.x. Yesterday, I restored a bunch of snapshots created by my 2.x environment onto the 5.x environment. Last night another snapshot was created by my 2.x environment. When I go to restore the snapshot, I get an error that the snapshot doesn't exist (even though it does).
If I rename two files in the S3 bucket, ES 5 will rebuild its index and see the new snapshot. The two files are index-0 and index.latest.
Hope that helps.
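In case it is useful, here is a minimal sketch of how such a rename could be scripted with boto3. The bucket name and base path are assumptions you would replace with your repository's settings, and since S3 has no rename operation the sketch copies each object and then deletes the original:

# Hedged sketch: "rename" the ES 5 snapshot index files in S3 so the repository
# is re-scanned. BUCKET and BASE_PATH are placeholders (assumptions).
import boto3

BUCKET = "my-es-backups"        # assumption: your repository-s3 bucket
BASE_PATH = "elasticsearch"     # assumption: the repository's base_path, "" if none

s3 = boto3.client("s3")

def rename(key, new_key):
    """S3 has no rename, so copy the object and delete the original."""
    s3.copy_object(Bucket=BUCKET,
                   CopySource={"Bucket": BUCKET, "Key": key},
                   Key=new_key)
    s3.delete_object(Bucket=BUCKET, Key=key)

for name in ("index-0", "index.latest"):
    key = f"{BASE_PATH}/{name}" if BASE_PATH else name
    rename(key, key + ".bak")   # keep a backup copy instead of deleting outright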

I ran into a similar issue and solved it by following this advice: https://discuss.elastic.co/t/migration-from-es-2-x-to-es-5-x/73211/6 (in my case, by deleting the index-0 file and re-registering the repository with my ES 5 cluster).
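For illustration only (the linked thread describes the steps in prose), re-registering the repository could look roughly like this with Python's requests library; the endpoint, bucket, and base_path are assumptions:

# Hedged sketch: re-register the S3 snapshot repository after removing index-0,
# then list the snapshots the repository sees. All values are placeholders.
import requests

ES = "http://localhost:9200"           # assumption: your ES 5.x endpoint
body = {
    "type": "s3",
    "settings": {
        "bucket": "my-es-backups",     # assumption: your repository-s3 bucket
        "base_path": "elasticsearch"   # assumption: omit if the repo has no base_path
    }
}

r = requests.put(f"{ES}/_snapshot/s3_repository", json=body)
r.raise_for_status()
print(r.json())

print(requests.get(f"{ES}/_snapshot/s3_repository/_all").json())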

Related

What does the CDKToolkit's BootstrapVersion SSM parameter represent?

I am using the AWS CDK Toolkit to create our infrastructure. I created a helloworld-stack.ts file, and when I run cdk synth, the process creates a HelloWorldStack.template.json file.
This file contains some auto-generated elements, like the one below.
Now, I am not able to understand how bootstrapping pushes this "/cdk-bootstrap/hnb659fds/version" key to the SSM Parameter Store and why it always has the value 14.
Can someone help me understand this behaviour?
"Parameters": {
"BootstrapVersion": {
"Type": "AWS::SSM::Parameter::Value<String>",
"Default": "/cdk-bootstrap/hnb659fds/version",
"Description": "Version of the CDK Bootstrap resources in this environment, automatically retrieved from SSM Parameter Store. [cdk:skip]"
}
},
After reading the official AWS doc on bootstrapping, I got the answer:
https://docs.aws.amazon.com/cdk/v2/guide/bootstrapping.html
In this doc they mention that this value is the version of the bootstrap template that was deployed to the environment.
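To see what the parameter actually contains in a bootstrapped environment, here is a small sketch of my own (assuming the default qualifier hnb659fds and configured AWS credentials):

# Hedged sketch: read the bootstrap version parameter that the generated template references.
import boto3

ssm = boto3.client("ssm")
resp = ssm.get_parameter(Name="/cdk-bootstrap/hnb659fds/version")
# Prints e.g. "14" if the environment was bootstrapped with a CDK release
# whose bootstrap template is at version 14.
print(resp["Parameter"]["Value"])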

AzureBlobStorageOnIoTEdge: Error Target container connection not specified, upload turned off

My local blob storage is not uploading blobs to my cloud storage account. It reports back
"configurationValidation": {
"deviceAutoDeleteProperties": {
"deleteOn": {
"Status": "Success"
},
"deleteAfterMinutes": {
"Status": "Warning",
"Message": "Auto Delete after minutes value not specified, auto deletion turned off."
},
"retainWhileUploading": {
"Status": "Success"
}
},
"deviceToCloudUploadProperties": {
"uploadOn": {
"Status": "Success"
},
"cloudStorageAccountName": {
"Status": "Error",
"Message": "Target container connection not specified, upload turned off."
},
"cloudStorageAccountKey": {
"Status": "Error",
"Message": "Target container connection not specified, upload turned off."
},
"uploadOrder": {
"Status": "Success"
},
"deleteAfterUpload": {
"Status": "Success"
}
}
},
I am pretty sure that it should work. My desired properties are
"deviceToCloudUploadProperties": {
"uploadOn": true,
"uploadOrder": "OldestFirst",
"cloudStorageConnectionString": "DefaultEndpointsProtocol=https;AccountName=*****;AccountKey=******;EndpointSuffix=core.windows.net",
"storageContainersForUpload": {
"***": {
"target": "***"
}
},
"deleteAfterUpload": true
}
The container exists locally and in the cloud. I copied the primary connection string from my local storage account into the configuration. The local storage is working: I can see that my container was created and contains data, but it doesn't synchronize with the cloud. Why is it saying "Target container connection not specified, upload turned off."? It sounds like this part is missing:
"storageContainersForUpload": {
"***": {
"target": "***"
}
},
but obviously it is not.
I'm using the latest Docker image of this service. Is there any chance to use an older version? Some months ago I was already able to make it work. I tried to use a different version like mcr.microsoft.com/azure-blob-storage:1.4.0, but it doesn't accept any tag other than latest.
Thx!
The difference between my working version of the local blob storage module and my non-working version was that the non-working version was deployed through a deployment plan. In a deployment plan you cannot simply paste the module twin settings from the blob storage on IoT Edge documentation (https://learn.microsoft.com/en-us/azure/iot-edge/how-to-deploy-blob?view=iotedge-2020-11).
You need to split the configuration into two parts, one for the module itself and one for its module twin (a sketch of the twin part follows below).
And that totally makes sense: if you want to update your modules, you probably want to keep your configuration, because there might have been changes made e.g. by a customer. This gives you the possibility to add properties to your initial configuration later without changing anything that was already configured. In fact, every device can keep its individual configuration at any time.
My wrongly configured reported properties were hidden under the suggested default path "properties.desired.settings", and thus the edge runtime could not find them.
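Here is a minimal sketch of that twin part, reconstructed by me rather than taken from the original screenshots. The module name "blobstorage" and all values are placeholders; the point is that the twin settings sit directly under "properties.desired" in the module's own section of the deployment, not under "properties.desired.settings":

# Hedged sketch of the module-twin part of a deployment (Python dict mirroring the JSON).
# Everything here is a placeholder; only the nesting under "properties.desired" matters.
module_twin_part = {
    "blobstorage": {                        # assumption: the module name used in the deployment
        "properties.desired": {             # NOT "properties.desired.settings"
            "deviceAutoDeleteProperties": {
                "deleteOn": False,
                "retainWhileUploading": True
            },
            "deviceToCloudUploadProperties": {
                "uploadOn": True,
                "uploadOrder": "OldestFirst",
                "cloudStorageConnectionString": "<cloud storage connection string>",
                "storageContainersForUpload": {
                    "<local container>": {"target": "<cloud container>"}
                },
                "deleteAfterUpload": True
            }
        }
    }
}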

Can't connect to local storage account

I followed the Microsoft docs for deploying a storage account edge module and my module is currently running in my VM.
However, I cannot connect the Python SDK or Storage Explorer to it.
My container create options are:
{
  "Env": [
    "localstorage",
    "92RhvUXR59Aa8h90LSHC7w=="
  ],
  "HostConfig": {
    "Binds": [
      "blobvolume:/blobroot"
    ],
    "PortBindings": {
      "11002/tcp": [
        {
          "HostPort": "11002"
        }
      ]
    }
  }
}
My module twin settings are
{
  "deviceAutoDeleteProperties": {
    "deleteOn": false,
    "deleteAfterMinutes": 0,
    "retainWhileUploading": true
  },
  "deviceToCloudUploadProperties": {
    "uploadOn": true,
    "uploadOrder": "OldestFirst",
    "cloudStorageConnectionString": "<<my connection string was here :) >>",
    "storageContainersForUpload": {
      "eds-container-pcu": {
        "target": "eds-container-pcu"
      }
    },
    "deleteAfterUpload": true
  }
}
iotedge list says
blobstorage running Up 5 hours mcr.microsoft.com/azure-blob-storage:latest
I also checked the volume. I put a file into the "BlockBlob" folder in my volume and then I went into the shell of my container and under /blobroot I could find that file.
I also copied the cloud storage connection string and used that for a file upload and that also worked. I could upload a text file into my cloud storage account successfully.
My connection string for my local storage account should be the one below, right? I created it myself as the docs said:
DefaultEndpointsProtocol=http;BlobEndpoint=http://localhost:11002/localstorage;AccountName=localstorage;AccountKey=92RhvUXR59Aa8h90LSHC7w==;
With that connection string I cannot connect anything, neither the SDK nor Storage Explorer.
Python code would be:
from azure.storage.blob import BlobClient

CONNECTION_STRING = "DefaultEndpointsProtocol=http;BlobEndpoint=http://localhost:11002/localstorage;AccountName=localstorage;AccountKey=92RhvUXR59Aa8h90LSHC7w==;"

# Create a client for one blob in the local container and upload a small test blob
blobClient = BlobClient.from_connection_string(CONNECTION_STRING, "eds-container-pcu", "test123.txt")
# Doesn't work with or without this line
blobClient._X_MS_VERSION = '2017-04-17'
blobClient.upload_blob(data="Hello World", blob_type="BlockBlob")
And I get:
HttpResponseError: Server encountered an internal error. Please try again after some time.
RequestId:fef9066d-323c-47e3-8e6f-ba006557ee65
Time:2021-04-08T14:33:23.7892893Z
ErrorCode:InternalError
Error:None
ExceptionDetails:None
ExceptionMessage:'s' cannot be null
StackTrace:AsyncHelper.ArgumentNullRethrowableException: 's' cannot be null
---> System.ArgumentNullException: Value cannot be null. (Parameter 's')
at System.Convert.FromBase64String(String s)
at Microsoft.AzureStack.Services.Storage.FrontEnd.WossProvider.WStorageAccount..ctor(WStorageStamp stamp, String accountName)
at Microsoft.AzureStack.Services.Storage.FrontEnd.WossProvider.WStorageStamp.Microsoft.Cis.Services.Nephos.Common.Storage.IStorageStamp.CreateAccountInstance(String accountName, ITableServerCommandFactory tableServerCommandFactory)
at Microsoft.Cis.Services.Nephos.Common.Storage.PerRequestStorageManager.CreateAccountInstance(String accountName)
at Microsoft.Cis.Services.Nephos.Common.Protocols.Rest.BasicHttpProcessorWithAuthAndAccountContainer`1.GetResourceAccountPropertiesImpl(String accountName, Boolean isAdminAccess, TimeSpan timeout, AsyncIteratorContext`1 context)+MoveNext()
at AsyncHelper.AsyncIteratorContextBase.ExecuteIterator(Boolean inBegin)
--- End of inner exception stack trace ---
at Microsoft.Cis.Services.Nephos.Common.Protocols.Rest.BasicHttpProcessorWithAuthAndAccountContainer`1.EndGetResourceAccountProperties(IAsyncResult asyncResult)
at Microsoft.Cis.Services.Nephos.Common.Protocols.Rest.BasicHttpProcessorWithAuthAndAccountContainer`1.ProcessImpl(AsyncIteratorContext`1 async)+MoveNext()
Why could that be? Storage Explorer is also not connecting, and the lines of code above work fine with my cloud storage connection string.
A colleague pointed out that my local account key is too short. As written in the docs, the key must be 64 bytes long. With a key of that length, connecting with the SDK or Storage Explorer works fine. My code was also missing the creation of the local container, but the error message will tell you about that.
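For illustration, here is a small sketch of generating a local account key of the required length (64 random bytes, base64-encoded) and building the matching connection string; the account name, host, and port are the ones from the question:

# Hedged sketch: generate a 64-byte, base64-encoded key for the local blob storage module.
import base64
import os

account_name = "localstorage"
account_key = base64.b64encode(os.urandom(64)).decode("ascii")  # 64 random bytes, base64-encoded

connection_string = (
    "DefaultEndpointsProtocol=http;"
    f"BlobEndpoint=http://localhost:11002/{account_name};"
    f"AccountName={account_name};"
    f"AccountKey={account_key};"
)
print(connection_string)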
In your sample for the container create options, I am missing the keys (LOCAL_STORAGE_ACCOUNT_NAME and LOCAL_STORAGE_ACCOUNT_KEY). Did you specify only the values?
"Env": [
"LOCAL_STORAGE_ACCOUNT_NAME=localstorage",
"LOCAL_STORAGE_ACCOUNT_KEY=92RhvUXR59Aa8h90LSHC7w=="
],
When I deploy a module with these settings, I can connect via Storage Explorer.
Did you look at the logs of the blob storage module?

DevOps migrate prepare fails after 'user not found' graph call

I am trying to migrate an on-premises TFS instance (DevOps Server 2019 Update 1.1) to DevOps Services using the migration tool. I have run the validate command and cleaned up its warnings, but the next command (prepare) fails mysteriously. The log file simply says:
[Error #11:18:19.488]
Exception Message: Request failed (type AadGraphTimeoutException)
Exception Stack Trace: at Microsoft.VisualStudio.Services.Identity.DataImport.AadIdentityMapper.ExecuteGraphRequest[T](Func`1 request)
at Microsoft.VisualStudio.Services.Identity.DataImport.AadIdentityMapper.GetAadTenantId()
at TfsMigrator.TfsMigratorCommandValidate.PopulateDataImportPropertiesOnContext()
at TfsMigrator.TfsMigratorCommandValidate.PopulateValidationItems(DataImportValidationContext context)
at TfsMigrator.TfsMigratorCommandValidate.RunValidations(Boolean validateFiles)
at TfsMigrator.TfsMigratorCommandPrepare.RunImpl()
at TfsMigrator.TfsMigratorCommand.Run()
A colleague pointed out this troubleshooting section in the docs, but a) we have about 10 users involved in TFS (~50 total active in local AD), so it is hard to believe we have so many users that the call would time out, and b) I ran the Get-MsolUser troubleshooting commands and successfully queried AAD via Graph.
I ran the prepare command again with Fiddler Classic attached as a proxy and discovered a failing call to the Graph API. It looked like this:
Request (simplified headers):
POST https://graph.windows.net/xxxxxxxx-xxxx-xxxx-xxxx-0664e34adcbd/$batch?api-version=1.6 HTTP/1.1
Content-Type: multipart/mixed; boundary=batch_ea471df4-db73-403d-a172-a0955ddb1575
...
--batch_ea471df4-db73-403d-a172-a0955ddb1575
GET https://graph.windows.net/xxxxxxxx-xxxx-xxxx-xxxx-0664e34adcbd/tenantDetails?api-version=1.6 HTTP/1.1
...
--batch_ea471df4-db73-403d-a172-a0955ddb1575--
Response (body):
{
  "odata.error": {
    "code": "Authentication_Unauthorized",
    "message": {
      "lang": "en",
      "value": "User was not found."
    },
    "requestId": "58c4cabc-dd67-4ce8-9735-134a7e0df60c",
    "date": "2020-09-14T20:07:49"
  }
}
So my question at this point is: are there any permissions (DevOps, Azure, Graph) that might be missing? Are there any workarounds available? I tagged this question with Microsoft Graph API, but I believe the failing call uses the older Azure AD Graph API.

IoTAgent-LoRaWAN is apparently not working as expected

I was trying to provision the IoTAgent-LoRaWAN using my TTN credentials. I'm following the official docs, and this is my POST request:
{
  "devices": [
    {
      "device_id": "{{node}}",
      "entity_name": "LORA-N-0",
      "entity_type": "LoraDevice",
      "timezone": "Europe/Madrid",
      "attributes": [
        {
          "object_id": "potVal",
          "name": "Pot_Value",
          "type": "Number"
        }
      ],
      "internal_attributes": {
        "lorawan": {
          "application_server": {
            "host": "eu.thethings.network",
            "username": "{{TTN_app_id}}",
            "password": "{{TTN_app_pw}}",
            "provider": "TTN"
          },
          "dev_eui": "{{TTN_dev_eui}}",
          "app_eui": "{{TTN_app_eui}}",
          "application_id": "{{TTN_app_id}}",
          "application_key": "{{TTN_app_skey}}"
        }
      }
    }
  ]
}
Obviously I'm using Postman to manage all these HTTP requests in a collection, and I've set up a few environment variables:
{{node}} -> the device ID, node_0
{{TTN_app_id}} -> my app ID, which I've chosen to be dendrometer
{{TTN_app_pw}} -> the application access key shown in the picture (it can be found in the same view as the Application Overview: https://console.thethingsnetwork.org/applications/<application_id>)
{{TTN_dev_eui}} and {{TTN_app_eui}} -> also shown in the picture for the device (I think these are not sensitive, since TTN does not hide them; that's why I'm posting the picture)
{{TTN_app_skey}} -> the Application Session Key, also shown in the picture (the last one)
The point is that once I've provisioned the IoT Agent using that request, docker-compose logs -f iot-agent shows the following errors:
fiware-iot-agent | {"timestamp":"2020-06-23T11:45:53.689Z","level":"info","message":"New message in topic"}
fiware-iot-agent | {"timestamp":"2020-06-23T11:45:53.690Z","level":"info","message":"IOTA provisioned devices:"}
fiware-iot-agent | {"timestamp":"2020-06-23T11:45:53.691Z","level":"info","message":"Decoding CaynneLPP message:+XQ="}
fiware-iot-agent | {"timestamp":"2020-06-23T11:45:53.691Z","level":"error","message":"Error decoding CaynneLPP message:Error: Invalid CayennLpp buffer size"}
fiware-iot-agent | {"timestamp":"2020-06-23T11:45:53.691Z","level":"error","message":"Could not cast message to NGSI"}
So I think there is something not working properly. This is my docker-compose.yml, by the way: http://ix.io/2pWd
However, I don't think the problem is caused by Docker; all containers are apparently working as expected, because I can request their versions and I don't see error messages in the logs.
Also, I feel the docs are incomplete; I'd like more info about how to subscribe those provisioned devices with Orion CB (?), or how to delete them (that's not shown in the docs, although it is just a DELETE request to the proper URL; a sketch follows below).
Anyway, what am I doing wrong? Thank you all.
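For completeness, here is a hedged sketch of what that DELETE request could look like against the IoT Agent's provisioning API; the host, port, service, and service path are assumptions matching a typical FIWARE docker-compose setup:

# Hedged sketch: delete a provisioned device from the IoT Agent. All values are placeholders.
import requests

IOTA_NORTH = "http://localhost:4041"     # assumption: IoT Agent north port
headers = {
    "fiware-service": "smartgondor",     # assumption: the service used when provisioning
    "fiware-servicepath": "/gardens",    # assumption
}

resp = requests.delete(f"{IOTA_NORTH}/iot/devices/node_0", headers=headers)
print(resp.status_code)                  # 204 indicates the device was removed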
EDIT: I feel like there is something wrong in the IoT Agent itself; there is a typo in the following error messages:
fiware-iot-agent | {"timestamp":"2020-06-23T11:45:53.691Z","level":"info","message":"Decoding CaynneLPP message:+XQ="}
fiware-iot-agent | {"timestamp":"2020-06-23T11:45:53.691Z","level":"error","message":"Error decoding CaynneLPP message:Error: Invalid CayennLpp buffer size"}
It isn't CaynneLPP but CayenneLPP. I've also opened an issue in its GitHub repo, but I don't expect them to answer any time soon. I actually feel like this project has been abandoned.
It was apparently a problem with the encoding: I was using the encoding method suggested by the arduino-lmic library, but FIWARE works with the CayenneLPP data model, so I'm going to replace that encoding method (see the sketch below).
Thank you all anyway, and especially #arjan.
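For reference, here is a minimal sketch of my own (not from the thread) of what a hand-rolled CayenneLPP payload looks like for a single value, assuming the potentiometer reading is sent as a CayenneLPP "Analog Input"; this is the kind of buffer the agent expects instead of a raw arduino-lmic style payload:

# Hedged sketch: build a CayenneLPP "Analog Input" record by hand.
# Format: 1-byte channel, 1-byte type (0x02 = Analog Input), 2-byte signed value at 0.01 resolution.
import base64
import struct

def cayenne_lpp_analog_input(channel: int, value: float) -> bytes:
    return struct.pack(">BBh", channel, 0x02, int(round(value * 100)))

payload = cayenne_lpp_analog_input(channel=1, value=3.27)   # e.g. the potentiometer reading
print(base64.b64encode(payload).decode())                   # what ends up in the uplink payload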
