Replay events and ingest them in AWS Timestream - time-series

I can't ingest records into AWS Timestream if the timestamp falls outside the memory store retention window. Because of that, I can't implement replay functionality where, if there is some issue, I replay messages, process them, and ingest them again. Are there any solutions for this?

Currently there is no way to ingest records that fall outside of the memory store retention window. Could you create the table with a memory store retention window large enough to cover the period over which you expect to be correcting data?
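For reference, the retention windows are set per table. A minimal boto3 sketch of that suggestion (database and table names are placeholders; pick a memory store window that covers your longest expected replay):

```python
import boto3

client = boto3.client("timestream-write")

# Memory store retention bounds how old an ingested timestamp may be,
# so size it to cover your replay/correction window.
client.create_table(
    DatabaseName="my_database",   # hypothetical names
    TableName="my_table",
    RetentionProperties={
        "MemoryStoreRetentionPeriodInHours": 7 * 24,  # e.g. one week of replay headroom
        "MagneticStoreRetentionPeriodInDays": 365,
    },
)
```

The same `RetentionProperties` can be applied to an existing table with `update_table`, though that only helps records ingested after the change.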

Related

Allow User to Extract Data Dumps From DW

We use Synapse in Azure as our warehouse and create reports in Power BI for our users on top of it. We currently have a request to move all of the data dumps from our production system onto our warehouse DB, as some of them cause performance issues in production when run. We've been looking to redo these as reports in Power BI; however, in some instances we still need to provide the "raw" data in CSV/Excel format. This has thrown up an issue, as some of these extracts are above 150k rows, so we can't use Power BI to provide the extract: it has a limit on the rows it can export.

Our solution would be to build a process that runs against the DB and spits out a file into SharePoint for the user to consume. We can build that, but we're unsure how to give the user a way of triggering the extract. One way I was thinking of doing it would be Power Apps, but I'm wondering if someone here can suggest an easier way. I just need to provide pages with various buttons that trigger extracts from Azure to SharePoint when clicked, controllable by security in some way. Any advice would be appreciated.
Paginated report export doesn't have that row limit.
See, e.g.:
https://learn.microsoft.com/en-us/power-bi/collaborate-share/service-automate-paginated-integration
Or you can use an ADF Copy activity to create the .csv extracts.
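If you go the paginated-report route, the export can also be driven from your own code via the Power BI Export To File REST API (the Learn link above covers the Power Automate flavour). A rough Python sketch, assuming you already have an Azure AD access token and that the workspace/report IDs are placeholders:

```python
import time
import requests

TOKEN = "<azure-ad-access-token>"   # acquired via MSAL / a service principal
GROUP = "<workspace-id>"            # placeholder IDs
REPORT = "<paginated-report-id>"
BASE = f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP}/reports/{REPORT}"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Start an export job; paginated reports are not bound by the visual
# export row limit, so 150k+ row CSV extracts are fine.
job = requests.post(f"{BASE}/ExportTo", headers=HEADERS,
                    json={"format": "CSV"}).json()

# Poll until the job finishes, then download the file.
while True:
    status = requests.get(f"{BASE}/exports/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("Succeeded", "Failed"):
        break
    time.sleep(5)

if status["status"] == "Succeeded":
    data = requests.get(f"{BASE}/exports/{job['id']}/file", headers=HEADERS)
    with open("extract.csv", "wb") as f:
        f.write(data.content)  # hand this file off to SharePoint, e.g. via Graph
```

A small Azure Function exposing this behind a button would give you the "click to extract" pages you describe, with access controlled by AAD security groups.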

How to synchronize AWS DynamoDB with a dataset in iOS/Swift?

I am using AWS Cognito and DynamoDB. I have authenticated the user with AWS Cognito and performed CRUD operations against DynamoDB successfully. I am creating a dataset when the internet is not available, but I have no idea how to synchronize the dataset with DynamoDB. Does AWS support dataset synchronization with DynamoDB?
You have several options depending on your use-case.
The most straightforward and simple option is to use DynamoDB Streams. A stream stores all updates to a DynamoDB table for up to 24 hours and lets you read those changes and reapply them in another DB.
If the 24-hour window is too strict for you, you will have to create some sort of DynamoDB snapshot. Say you create a snapshot of the table every 24 hours and store it in S3. Then you can use DynamoDB Streams to read the real-time updates and the snapshots to read the baseline data.
To create DynamoDB snapshots you can use the Data Pipeline service.
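To make the streams part concrete, here is a minimal boto3 sketch of reading every retained change off a table's stream and replaying it (the table name is a placeholder; streams must already be enabled on the table):

```python
import boto3

dynamodb = boto3.client("dynamodb")
streams = boto3.client("dynamodbstreams")

# The stream ARN comes from the table description.
table = dynamodb.describe_table(TableName="MyTable")  # placeholder name
stream_arn = table["Table"]["LatestStreamArn"]

# Walk each shard from the oldest retained record (~24h window) and
# reapply the changes to your local dataset or another store.
for shard in streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]["Shards"]:
    iterator = streams.get_shard_iterator(
        StreamArn=stream_arn,
        ShardId=shard["ShardId"],
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    while iterator:
        page = streams.get_records(ShardIterator=iterator)
        for record in page["Records"]:
            print(record["eventName"], record["dynamodb"])  # INSERT / MODIFY / REMOVE
        iterator = page.get("NextShardIterator")
        if not page["Records"]:
            break  # stop polling an open shard in this sketch
```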

iOS chat app design

I'm building a simple chat app on iOS for fun (and to have projects to gain experience from), using Socket.IO and a Node backend. I am trying to figure out the best design for messages. I was planning to use a MongoDB database where each conversation would have its message data stored. Whenever the client sends a new message to the server, the server adds it to the appropriate conversation in the database.
I was also hoping to create a user Sign Up/Log In system which would add you to the database.
However, I've googled around quite a bit and I am really not sure if creating a database made up of conversations (that get updated whenever a sentMessage event is triggered) and user data is the right way to go.
Additionally, I've seen some people talk about saving the chats on the actual devices themselves, not in a database? What is the common design pattern for a chat app like this?
For the design I would use Socket.IO for emitting messages as well; it has a great community behind it. I would also use MongoDB, because everything is in JSON format and it integrates so well with Node thanks to both using JavaScript.
Now for the part you're interested in: Redis. Redis is a database that sits in RAM and should be used alongside MongoDB if you're going to have higher traffic / need quick responses / less hanging and waiting.
Redis would be your temporary store for the chat session, because hitting the disk with a write/read/query on every message is a lot for the machine (looking at you, MongoDB). MongoDB alone would not scale all that well in the long run and is not as fast as Redis. Mind you, the Redis database will only hold a temporary chat log, say the last 1 million chat sessions or some such limit (it's all in RAM, so the size is limited; you can't have terabytes or hundreds of gigabytes of RAM on one server).
So the data flow would look something like this (see the sketch after this list):
1. User sends a message.
2. Server receives the message via an HTTP(S) POST/PUT (Ajax/Observable).
3. Server uses Socket.IO to emit the message to the designated user, while saving the message to Redis under a specific key/session/message.
4. The designated user gets the update on their screen via an IO event.
In between, there should be a check on whether the Redis DB is getting full; if it is, remove, say, the oldest 10,000 inactive messages (which could be from a year ago if the server hasn't filled up yet) to make some space.
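A minimal sketch of the Redis side of that flow (redis-py assumed; the key scheme and the cap are made up, and here the cap is enforced per conversation with LTRIM rather than by a periodic sweep):

```python
import json
import redis

r = redis.Redis()
MAX_MESSAGES = 10_000  # arbitrary per-conversation cap; RAM is finite

def save_message(conversation_id: str, sender: str, text: str) -> None:
    """Push a message onto the conversation's log and trim old entries."""
    key = f"chat:{conversation_id}"          # hypothetical key scheme
    r.lpush(key, json.dumps({"from": sender, "text": text}))
    r.ltrim(key, 0, MAX_MESSAGES - 1)        # drop the oldest once over the cap

def recent_messages(conversation_id: str, n: int = 50) -> list:
    """Fetch the n most recent messages for rendering a chat screen."""
    return [json.loads(m) for m in r.lrange(f"chat:{conversation_id}", 0, n - 1)]
```

A background job could periodically flush entries from Redis into MongoDB so the permanent conversation history still ends up on disk.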
Saving the chat on the phone is an okay idea, as it would save the user's data/bandwidth and they could look at their messages while offline.
A solution is SQLite, a lightweight library that sits inside your app acting as a database you can run queries against; if you're familiar with RDBMSs you will have no problem implementing it. But then you've got to find a good way to manage saving data across Redis, SQLite, and MongoDB.
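On the device you would reach SQLite through a Swift wrapper, but the schema idea is the same everywhere; here it is sketched with Python's stdlib sqlite3 (table and column names are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect("chat_cache.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS messages (
           id INTEGER PRIMARY KEY,
           conversation TEXT NOT NULL,
           sender TEXT NOT NULL,
           body TEXT NOT NULL,
           sent_at TEXT NOT NULL          -- ISO-8601 timestamp
       )"""
)

# Cache a message locally so it is readable offline.
conn.execute(
    "INSERT INTO messages (conversation, sender, body, sent_at) VALUES (?, ?, ?, ?)",
    ("conv-42", "alice", "hello", "2024-01-01T12:00:00Z"),
)
conn.commit()

# Load a conversation for display, oldest first.
rows = conn.execute(
    "SELECT sender, body, sent_at FROM messages WHERE conversation = ? ORDER BY sent_at",
    ("conv-42",),
).fetchall()
```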

Is there a solution like Apache ActiveMQ on top of HDFS?

I want to store webpages fetched by a web crawler. I don't need any random access, so whenever I want to read the stored data, I read it from start to end.
We have tried solutions like HBase, but one of the main selling points of HBase is random access to records, which we don't need at all. HBase has also not proved stable for us after 1.5 years of testing.
I just want a stack or queue on top of HDFS, because the number of webpages is about 1 billion. I don't even want the queue behaviour of ActiveMQ; I just want to be able to store the webpages so that I can read them all back in case of a failure.
I don't want to use raw files, because I don't want to handle things like file rotation, file consistency, and so on.
It is worth mentioning that we need HDFS so we can run MapReduce jobs on the data when we want to send all of it to a Solr cluster, and so we get redundancy and availability from HDFS.
Is there a service on top of HDFS that just stores JMS records, without any functionality for random access and without a transparent view of the records?

Amazon DynamoDB Provisioned Throughput (iOS SDK)

I am new to DynamoDB and very confused about provisioned throughput. I am creating an iPhone game in which users can chat within the game. I have a Chat table containing GameID, UserID, and Message. How do I find the size of an item to calculate throughput? The size of the item depends mostly on the Message, right? How do I calculate the size of an item?
Amazon says we can modify the throughput either through the UpdateTable API or manually from the console. If I want to change it from code, how will I know that the provisioned throughput has been exceeded for a certain table? How do I check that from code?
I am also confused about CloudWatch. How should I understand it?
Could anyone please help me? Please don't just point me to the documentation.
Thanks.
I will do my best to help with the confusion.
DynamoDB is a key-value database.
CloudWatch is Amazon's monitoring tool for its products.
Provisioned throughput is, roughly, the number of item-kilobytes you plan to read/write per second.
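As a worked example of the sizing question (the 1 KB write-unit and 4 KB strongly-consistent read-unit sizes are DynamoDB's documented unit sizes; the attribute sizes below are invented):

```python
import math

# Item size is roughly the sum of attribute-name lengths plus value sizes.
# Assume GameID/UserID are 36-char UUID strings and Message averages 500 bytes.
item_bytes = len("GameID") + 36 + len("UserID") + 36 + len("Message") + 500  # ~600 B

# One write unit = one write/sec of an item up to 1 KB;
# one read unit = one strongly consistent read/sec of an item up to 4 KB.
writes_per_sec = 50
write_units = writes_per_sec * math.ceil(item_bytes / 1024)   # -> 50
reads_per_sec = 50
read_units = reads_per_sec * math.ceil(item_bytes / 4096)     # -> 50
```

So the Message attribute dominates the item size, but as long as items stay under the unit sizes, throughput is driven by request rate rather than message length.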
Whenever you exceed your provisioned throughput, DynamoDB answers with a ProvisionedThroughputExceededException and notifies CloudWatch.
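You asked about the iOS SDK, but the shape is the same in any SDK: catch that exception, and call UpdateTable if you decide to raise capacity. A sketch in Python/boto3 (table name and capacity numbers are placeholders):

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")

try:
    dynamodb.put_item(
        TableName="Chat",  # placeholder table
        Item={
            "GameID": {"S": "game-1"},
            "UserID": {"S": "user-1"},
            "Message": {"S": "hello"},
        },
    )
except ClientError as err:
    if err.response["Error"]["Code"] == "ProvisionedThroughputExceededException":
        # The SDKs already retry this with backoff; if it still surfaces,
        # you can raise the table's provisioned throughput from code.
        dynamodb.update_table(
            TableName="Chat",
            ProvisionedThroughput={"ReadCapacityUnits": 100, "WriteCapacityUnits": 100},
        )
    else:
        raise
```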
What CloudWatch does is basically record and aggregate data points. For most applications, it will only keep track of aggregated data over consecutive 5-minute periods.
You can then access this data for "manual" monitoring or set up "alarms".
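For instance, a minimal boto3 sketch of pulling the table's consumed capacity out of CloudWatch (the table name is a placeholder):

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

# Consumed write capacity for the Chat table over the last hour,
# in the 5-minute buckets CloudWatch aggregates by default.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedWriteCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "Chat"}],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=300,
    Statistics=["Sum"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])
```

Comparing those sums against your provisioned capacity is the "manual" monitoring; a CloudWatch alarm on the same metric automates it.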
There was a really interesting question on SO a couple of weeks ago about DynamoDB auto-scaling using alarms, which is worth looking up. The error-handling documentation is also worth reading: http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/ErrorHandling.html
Knowing this, you can start building your application.
As with every AWS service, you need credentials to access DynamoDB. Even though they can be restricted to a specific table or set of actions, it is very dangerous to bundle them into an application. Would you give MySQL or MongoDB credentials, even read-only ones, to untrusted people?
May I suggest you build your application to rely on a server of your own? That server being trusted and built by you, you can safely perform any authorization checks there and grant it full access to your table.
I hope this helps. Feel free to ask for more precisions.