So IMAP has a feature where, once I've looked at a mailbox, I can efficiently look for any new messages by asking it for any new UIDs which I haven't seen yet.
But how do I efficiently find expunged messages? I haven't found any commands for doing this yet; my only option is to retrieve the full UID list of the mailbox and look for missing ones. This is rather suboptimal.
I have mailboxes with 25000+ messages. Scanning one of these mailboxes takes megabytes of traffic just to do the UID SEARCH command, so I'd like to avoid this. Is there anything hidden in the depths of the IMAP protocol that I'm missing?
Okay, for offline use this can work:
Since big mailboxes typically grow big by having many messages added and few deleted, you can optimise for there being large unchanged stretches of the mailbox. If your clientside store contains 10000 messages, then you can send 'x UID SEARCH 2000,4000,6000,8000,10000', which will return five UIDs in one SEARCH response. Depending on which ones have changed, you know whether there have been any expunges in each fifth of the mailbox, so if the mailbox hasn't changed you've verified that very cheaply. If a particular fifth has changed, you can retrieve all 2000 UIDs in that range.
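A minimal sketch of that sampling check with Python's imaplib, assuming an authenticated connection and a client-side cache mapping sequence positions to UIDs (both assumptions for the example):

import imaplib

# Assumptions for this sketch: 'conn' is an authenticated imaplib.IMAP4
# connection, and 'cached_uids' maps message sequence positions (1-based)
# to the UIDs the client last saw at those positions.
def fifths_with_expunges(conn, cached_uids, samples=(2000, 4000, 6000, 8000, 10000)):
    conn.select('INBOX')
    # Search by a sequence-number set; the UID variant of SEARCH returns
    # UIDs, so this tells us which UID currently sits at each position.
    typ, data = conn.uid('SEARCH', ','.join(str(n) for n in samples))
    if typ != 'OK':
        raise RuntimeError('UID SEARCH failed')
    server_uids = [int(u) for u in data[0].split()]
    # Expunges only shift positions down, so if the UID at position n
    # still matches the cache, nothing below n was expunged.
    return [n for n, uid in zip(samples, server_uids)
            if cached_uids.get(n) != uid]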
QRESYNC/CONDSTORE is much nicer though, and also lets you resync flag state.
The only effective answer I know to this is to change the problem.
Instead of learning which of the 25000 messages have been deleted, the client can learn which of the messages in its cache have been deleted and which ones still exist, and that can be done fairly efficiently. One approach is to keep a per-message flag in the client, "this message has been observed to exist in this IMAP connection" and when you access a cached message for which the flag isn't set, you send "x UID SEARCH UID y", which will return the message's UID if the message exists and an empty result if not. QRESYNC's SELECT arguments provide a natural improvement to this technique.
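A minimal sketch of that existence probe, again with imaplib (the flag bookkeeping around it is assumed to live in your cache layer):

import imaplib

def message_still_exists(conn, uid):
    # 'UID SEARCH UID <uid>' returns the UID itself if the message still
    # exists in the selected mailbox, and an empty result if it was expunged.
    typ, data = conn.uid('SEARCH', 'UID', str(uid))
    return typ == 'OK' and data[0].split() != []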
The problems, of course, are that with a non-QRESYNC server the client will cache messages for longer than necessary, and that depending on how you implement it the client might flicker or have unpleasant delays.
I use Twilio to operate an automated phone line that connects callers to resources for some very sensitive topics. If the phone numbers of callers were revealed due to a data breach or subpoena, it could have negative consequences for them. There's no need for us to log callers' numbers, and ideally I'd like to not store that information at all. However, these numbers show up in my usage logs:
I've searched for ways to prevent these numbers from being logged or to delete them after they've been logged, but I can't find anything documented. Is there a way to do this?
Partial answer: You can delete the record of a call using the API DELETE functionality:
DELETE https://api.twilio.com/2010-04-01/Accounts/{AccountSid}/Calls/{Sid}.json
You can have a script periodically request all calls made to the number and call DELETE for each one. If your system involves recordings, transcriptions, or texts, you need to do the same for them.
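As a rough sketch, such a cleanup script might look like this in Python with the requests library (the credentials and target number are placeholders; the list endpoint and DELETE call follow the URL form above, and pagination and error handling are omitted):

import requests

ACCOUNT_SID = 'ACXXXXXXXXXXXXXXXX'   # placeholder
AUTH_TOKEN = 'your_auth_token'       # placeholder
BASE = 'https://api.twilio.com/2010-04-01/Accounts/' + ACCOUNT_SID

def purge_calls(to_number):
    # List all calls made to the sensitive number...
    resp = requests.get(BASE + '/Calls.json',
                        params={'To': to_number},
                        auth=(ACCOUNT_SID, AUTH_TOKEN))
    resp.raise_for_status()
    # ...and delete each call record individually.
    for call in resp.json().get('calls', []):
        requests.delete(BASE + '/Calls/' + call['sid'] + '.json',
                        auth=(ACCOUNT_SID, AUTH_TOKEN)).raise_for_status()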
This is an acceptable solution for our needs, but it would be ideal if the numbers weren't logged in the first place, so I'm still interested in hearing others' answers.
Has anyone posted a response to this problem? There have been other posts with no answers. Our situation is that we are pushing messages onto a topic that is backing a KTable in the first step of our stream process. We then pull a small amount of data from those messages and pass it along. We do multiple computations on that smaller amount of data for grouping and aggregation. At the end of the streaming process, we simply want to join back to the original topic via a KTable to pick up the full message content again. The results of the join are only a subset of the data because it cannot find the entries in the KTable.
This is just the beginning of the problem. In another case, we are using KTables as indexes for lookups meant to enrich the incoming data. Think of these lookups as identifying whether we have seen a specific pattern in the streaming message before. If we have seen the pattern, we want to tag it with an ID (used for grouping) pulled from an existing KTable. If we have not seen the pattern before, we assign it an ID and place it back into the KTable to tag future messages. What we have found is that there is no guarantee that the information will be present in the KTable for future messages. This lack of a guarantee seems to make KTables useless. We cannot figure out why there is so little discussion of this on the forums.
Finally, none of this seemed to be a problem when running with a single instance of the streams application. However, as soon as our data got large and we were forced to have 10 instances of the app, everything broke. As well, there is no way that we could use things like GlobalKTables because there is too much data to be loaded into a single machine's memory.
What can we do? We are currently planning to abandon KTables altogether and use something like Hazelcast to store the lookup data. Should we just move to Hazelcast Jet and drop Kafka Streams altogether?
Adding flow: [diagram: Kafka data flow]
I'm sorry for this non-answer answer, but I don't have enough points to comment...
The behavior you describe is definitely inconsistent with my understanding and experience with streams. If you can share the topology (or a simplified one) that is causing the problem, there might be a simple mistake we can point out.
Once we get more info, I can edit this into a "real" answer...
Thanks!
-John
I built an app that is able to store OData offline by using the SAP Kapsel plugins.
More or less it's the same as what is generated by SAP Web IDE, or similar to the apps in this example: https://blogs.sap.com/2017/01/24/getting-started-with-kapsel-part-10-offline-odatasp13/
Now I am at the point of checking the error resolution potential. I created a sync conflict (changing data on the server after the offline database was stored, then changing something in the app and starting a flush).
As mentioned in the documentation, I can see the error in the ErrorArchive and could also see some details. But what I am missing is the "current" data in the backend database.
In the error details I can just see the data on the device but not the data changed on the server.
For example:
Device loads some names into the offline store
Device goes offline
User A changes some names
User B changes one of these names directly online
User A comes back online and starts a sync
User A is now informed about the entity that was changed, BUT:
not the content user B entered;
I just see the "offline" data.
Is there a solution to see the "current" and the "offline" one in a kind of compare view?
Please also note that the server communication is done by the Kapsel plugin and not with normal AJAX calls. That could be an alternative, but I am wondering whether there is a smarter way supported by the API.
Meanwhile I figured out how to load the online data (manually).
This can be done by switching the HTTP handler back to the normal one:
sap.OData.removeHttpClient();
sap.OData.applyHttpClient();
Anyhow, this does not look like a proper solution, and I also have an issue with the conflict log itself: it must be deleted before any refresh can be applied.
I could not find any proper documentation for that. ETag handling is also hardly described in the SAPUI5 and SAP Kapsel documentation.
This question is a really tricky one, due to its implications. I understand that you are simulating a synchronization error due to concurrent modification, and want to know if there is a way for the client to obtain the "current" server state in order to give the user a means to compare the local and server state.
First, let me give you the short answer: No, there is no way for the client to see the current server state "for reference" via the Offline APIs when there are synchronization errors. Doing an online query as outlined above might work, but it certainly is a bad idea.
Now for the longer answer, which explains why this is not necessarily a defect and why I said there are quite some implications to the answer.
Types of Synchronization Errors
We distinguish a number of synchronization errors, and in this context, we are clearly dealing with business-related issues. There are two subtypes here: Those that the user can correct, e.g. validation errors, and those that are issues in the business process itself.
If the user violates the input range, e.g. by putting a negative price for a product, the server would reply with the corresponding message: "-1 is not a valid input value for 'Price'". You, as a developer, can display such messages to the user from the error archive, and the ensuing fix is indeed a very easy one.
Now when we talk about concurrent modification, things get really, really nasty. In fact, I like to say that in this case there is an issue with the business process, because on one hand, we allow data to get out of sync, and on the other hand, the process allows multiple users to manipulate the same piece of information. How all relevant users should be notified and re-synchronized is no longer just a technical detail, but in fact a new business process. There is just no way to generically decide how to handle this case. In most cases, it would involve back-office experts who need to decide how the changes should be merged.
A Better Solution
Angstrom pointed out that there is no way to manipulate ETags on the client side, and you should in fact not even think about it. ETags work like version numbers in optimistic locking scenarios, and changing the ETag basically means "Just overwrite what's on the server". This is a no-go in serious scenarios.
An acceptable workaround would be the following:
Make sure the server returns verbose error messages so that the user can see what happened and what caused the conflict.
If that does not help, refresh the data. This will get you an updated ETag, and merge the local changes into the "current" server state, but only locally. "Merging" really means that local changes always overwrite remote changes.
The user now has another opportunity to review the data and can submit it again.
A Good Solution
Better is not necessarily good, so here is what you should really do: never let concurrent modification happen, because it is really expensive to handle. This implies that it is not the developer who should address this issue; the business needs to change the process.
The right question to ask is, "When you replicate data in a distributed system, why do you allow it to be modified concurrently at all?" Typically, stakeholders will not like this kind of question, and the appropriate reaction is to work out a conflict resolution process together with them. Only then will they realize how expensive fixing that kind of desynchronization is, and more often than not they will see that adjusting the process is far cheaper than insisting on yet another back-office process to fix the issues it causes. Even if they insist that the concurrent modification is needed, they will now understand that it is not your task to sort this out and that they need to invest in a conflict resolution process.
TL;DR
There is no way to compare the client state to the server state on the client, but you can do a refresh to retain the local changes and get an updated ETag. The real solution, however, is to rework the business process, because this is no longer a purely technical issue.
The default solution is that SMP or HCPms detects conflicts via ETags. On the client side there is no API to manipulate ETags in case of conflicts. A potential solution to implement a kind of diff view on the device would work like this:
Show the errors
Cache the errors (maybe only in memory?)
Delete the errors
Refresh the database
Build a diff view from the current data and the cached errors
The idea with
sap.OData.removeHttpClient();
sap.OData.applyHttpClient();
could also work but could be very tricky and may introduce side effects.
Maybe some requests are triggered against the "wrong" backend.
This is quite a broad question, but I'll try to summarise it as best I can.
I have an MVC front end which displays/allows processing of records which are classed as outstanding. I also have a scheduled console app which runs nightly and attempts to resolve each of these records using some logic I wrote.
I have a new requirement, which is to have an email sent every time the total number of outstanding records exceeds a certain amount; this amount needs to be configurable.
The table will contain every record with a flag to say whether it has been resolved or not, so I will need to count the outstanding records and then fire an email notification if the threshold is breached.
I initially thought about adding a SQL Server trigger on insert however I soon realised that if no more records were added for a few days but the total number stayed above the threshold because nobody resolved them, then no further email would be sent.
I need the email to send every day on a schedule independently of insert/update.
So now I'm thinking possibly a SQL Server job, or an SSIS package or even a service which runs, but I'm aware this threshold number needs to be configurable.
So what would be the quickest, simplest solution to my requirements? I'm open to any suggestion as long as it ticks all the boxes.
Given that the OP already has a console app running on a schedule, the most logical choice would be to simply add this check to the console app along with the email-sending logic. It will be much easier to send emails that way anyway, especially if you employ something like Postal, which lets you use MVC-style views to create your emails.
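As a rough sketch of the shape of that check (in Python for brevity; the table, column, and config names and the SMTP details are all assumptions, and the same structure translates directly to a C# console app):

import pyodbc
import smtplib
from email.message import EmailMessage

def check_threshold(conn_str):
    conn = pyodbc.connect(conn_str)
    cur = conn.cursor()
    # Threshold lives in a config table so it can be changed without a redeploy.
    cur.execute("SELECT Value FROM GeneralParams WHERE Name = 'OutstandingThreshold'")
    threshold = int(cur.fetchone()[0])
    # Count records still flagged as unresolved.
    cur.execute("SELECT COUNT(*) FROM Records WHERE Resolved = 0")
    outstanding = cur.fetchone()[0]
    if outstanding > threshold:
        msg = EmailMessage()
        msg['Subject'] = '%d outstanding records (threshold %d)' % (outstanding, threshold)
        msg['From'] = 'alerts@example.com'
        msg['To'] = 'ops@example.com'
        msg.set_content('The outstanding record count has exceeded the configured threshold.')
        with smtplib.SMTP('localhost') as smtp:
            smtp.send_message(msg)

Because this runs on the nightly schedule rather than on insert, it also covers the case where the count stays above the threshold without new records arriving.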
An SQL Server scheduled job seems to me to be the simplest way to go.
You can add a table to your database that will hold the threshold number and read its value from there.
In many cases a GeneralParams table is a good thing to have anyway.
The other option you mentioned (a Windows service) is also configurable in many ways: you can use a GeneralParams table, or the App.config file of the service (but you will have to restart it every time you change the App.config), or even a simple text file. Anything goes. The downside is that it's outside of your SQL Server, but the upside is that it is probably easier to send emails from.
RFC 3501 states in section 6.1.2 that you should use the NOOP command for polling.
Though in TIdIMAP4 there's only the KeepAlive method using it, which is implemented as a procedure, i.e. it doesn't return anything.
So how do I check for status updates, e.g. new messages or read-status changes? I.e. how can I do manual polling with TIdIMAP4? Which methods and properties are involved? And how do I get the (U)IDs of these messages?
Or is it even possible to use the IDLE command specified in RFC 2177 to avoid polling and to get updates automatically?
IMAP is technically an asynchronous protocol, but TIdIMAP4 is currently implemented as a synchronous client. As such, unexpected/out-of-order data is either discarded, treated as untagged data, or treated as error data, depending on timing and context. Untagged/extra data is accessible from the TIdIMAP4.LastCmdResult property, which you can type-cast to TIdReplyIMAP4 to access its Extra sub-property.
IDLE is not currently supported in TIdIMAP4. There are tickets in Indy's issue trackers (see here and here) to add IDLE support in a future release, maybe in Indy 11. Until then, you will have to poll the mailbox envelopes periodically, keeping track of messages you have already seen so you can detect new messages.
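The general polling pattern, stack aside, looks something like this (sketched with Python's imaplib for brevity; in Delphi you would do the equivalent with TIdIMAP4's retrieval methods, and the mailbox name and interval are assumptions):

import imaplib
import time

def poll_for_new(conn, mailbox='INBOX', interval=60):
    # Select the mailbox and note the highest UID currently present.
    conn.select(mailbox)
    typ, data = conn.uid('SEARCH', 'ALL')
    last_uid = max((int(u) for u in data[0].split()), default=0)
    while True:
        time.sleep(interval)
        conn.noop()  # RFC 3501 section 6.1.2: poll via NOOP
        # Ask only for UIDs above the highest one seen so far.
        typ, data = conn.uid('SEARCH', 'UID', '%d:*' % (last_uid + 1))
        if typ != 'OK':
            continue
        for uid in sorted(int(u) for u in data[0].split()):
            if uid > last_uid:  # n:* always matches at least the last message
                print('new message, UID', uid)
                last_uid = uid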
Yes, you can use IDLE to avoid NOOP and in general it's a good idea.
However, that won't give you any results. In a way, IMAP commands don't have results. They tell the server what you want, and the server tells you things. The server is free to tell you things for other reasons as well, including the goodness of its heart.
You might say that NOOP means "hi server, now is a good time to tell me things, I'm listening" and IDLE means "hi server, I'm listening all the time, so just tell me whatever you want whenever you want". Both also mean "and btw, restart your inactivity timeout if you have one".
The server will send you EXISTS, FETCH and other responses, which I expect TIdIMAP4 forwards to you in some way. (Yes, they're called responses even though they're not in response to any command of yours. They may be sent in response to someone else having sent you mail, for instance. Stupid naming.)
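For illustration, a hypothetical exchange might look like this (the exact responses depend entirely on the server):

C: a001 NOOP
S: * 23 EXISTS
S: * 5 FETCH (FLAGS (\Seen))
S: a001 OK NOOP completed
C: a002 IDLE
S: + idling
S: * 24 EXISTS
C: DONE
S: a002 OK IDLE terminated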