Need to send SMS to 11K people at once but don't know in advance how much $ to have in balance? - twilio

I have a Python script which sends an SMS to all 11K people at once; they are from all sorts of countries.
I don't want to have money left over in my balance, as I won't be doing this again.
The problem is that it's too difficult to estimate the cost, because the recipients are spread across 190 different countries.
I know there is auto-recharge, which is enabled for me, but the issue is that the script sends all the messages at once, so I don't think auto-recharge will work, as it would need to recharge within milliseconds.
Any solution?

I'd try a batching strategy: most numbers can't send more than 10 SMS/second (1 SMS/second for North American numbers), and anything more just gets queued, so 11K messages would take ~18 minutes anyway.
So split your pool into 5 batches of ~2,200 messages, and see how much the first 3 batches cost; that will inform how much money to load for batches 4 and 5.
NOTE: running out of money mid-batch would need to be handled adequately, too.
Sending costs vary by [destination] country, but these rates are published, e.g. US - 0.75 cents/msg, India - 1.75 cents/msg, UK - 4 cents/msg, etc.
Then the problem becomes one of parsing the country code out of each target number if they're not already split (e.g. +18005551234 vs. +1 8005551234).
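If it helps, here's a rough sketch of that flow in Python using the twilio and phonenumbers packages; the credentials, sender number, rate table, and helper names are placeholders of mine, not anything Twilio prescribes:

import phonenumbers                # pip install phonenumbers
from twilio.rest import Client     # pip install twilio

client = Client("ACXXXXXXXX", "your_auth_token")   # placeholder credentials
FROM_NUMBER = "+15005550006"                       # placeholder sender

def estimate_cost(numbers, rates):
    # rates is a {country_code: USD-per-message} table you build from the
    # published price list; unknown countries get a pessimistic default.
    total = 0.0
    for n in numbers:
        cc = phonenumbers.parse(n, None).country_code
        total += rates.get(cc, 0.10)
    return total

def send_batch(numbers, body):
    for n in numbers:
        try:
            client.messages.create(to=n, from_=FROM_NUMBER, body=body)
        except Exception as exc:   # e.g. balance ran out mid-batch
            print("stopped at", n, "-", exc)
            return False
    return True

numbers = [...]   # your ~11K E.164-formatted numbers
batches = [numbers[i:i + 2200] for i in range(0, len(numbers), 2200)]
for batch in batches:
    if not send_batch(batch, "Hello!"):
        break     # top up the balance, then resume from this batch

Comparing estimate_cost() against what the first few batches actually cost tells you how much to load before the remaining ones.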

max-series-per-database limit exceeded clarification needed / how to calculate number of series in use

We recently started to encounter this error:
{"error":"partial write: max-series-per-database limit exceeded: (1000000) dropped=1"}
When writing metric data like this:
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=1103,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
I know that Influx recommends you keep your series cardinality low, and our impression was that series cardinality would mean keeping each tag individually to a small number of values. e.g. we felt comfortable sending instance_id=1103 as a tag, because we know that there will never be more than 2000 distinct instance_id tag values.
But after running into this error... I'm afraid maybe I was mistaken here. Do we actually need to keep the cardinality of all possible combinations of all tags low? e.g. do these two things count as two separate series towards the 1,000,000 default max, because the instance_id is different?
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=1111,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=2222,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
If those count as two separate series... then is there a better way to structure this data in Influx? 1,000,000 total seems like a tiny amount if each separate combination of tags is a separate series...
Does InfluxDB 2.x help with this?
Is there a better tool that can handle a large number of tags and not bump into limits like this?
There is no way to figure out which points were not recorded; the dropped=1 in the error only tells you how many were discarded. To stop dropping data, update the max-series-per-database setting in the InfluxDB configuration to be more than 1M (or 0 for unlimited).
That error is an indication that you are creating a lot of series: a series is every unique combination of measurement name and tag set, so yes, your two example lines with different instance_id values do count as two separate series. Cardinality multiplies across tags, which is why the documentation warns that high series cardinality hurts memory use and performance.
Hope this helps!
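As a back-of-the-envelope check, the worst-case series count is the product of the number of distinct values each tag can take. A quick Python sketch (the per-tag counts here are invented for illustration):

from math import prod

# Distinct values each tag can take (illustrative guesses, not real data).
tag_cardinalities = {
    "environment": 3,
    "billing_status": 5,
    "instance_id": 2000,   # the "never more than 2000" estimate above
    "queue_priority": 4,
}
# Worst-case series count for one measurement = product over all tags.
print(prod(tag_cardinalities.values()))   # 120,000 from just these four tags

On InfluxDB 1.x you can also run the InfluxQL statement SHOW SERIES CARDINALITY against the database to see how many series actually exist.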

Array#product RangeError: too big to product

I have 93 arrays. Each array has about 18 values on average.
I need to compute the product of these arrays.
So I have a two-dimensional array that stores these 93 arrays.
Here is what I tried:
DATASET.first.product(*DATASET[1..-1])
Ruby returns
RangeError: too big to product
Does anyone know a workaround to get past this?
Some way to chunk them?
What you want is impossible.
The product of 93 arrays with ~18 elements each is an array with approximately 549975033204266172374216967425209467080301768557741749051999338598022831065169332830885722071173603516904554174087168 elements, each of which is a 93-element array.
This means you need 549975033204266172374216967425209467080301768557741749051999338598022831065169332830885722071173603516904554174087168 * 93 * 64bit of memory to store it, which is roughly 409181424703974032246417423764355843507744515806959861294687507916928986312485983626178977220953161016576988305520852992 bytes. That is about 40 orders of magnitude more than the number of particles in the universe. In other words, even if you were to convert the entire universe into RAM, you would still need to find a way to store on the order of 827180612553027 yobibyte on each and every particle in the universe; that is about 6000000000000000000000000 times the information content of the World Wide Web and 10000000000000000000000 times the information content of the dark web.
Does anyone know a workaround to get past this? Some way to chunk them?
Even if you process them in chunks, that doesn't change the fact that you still need to process 51147678087996754030802177970544480438468064475869982661835938489616123289060747953272372152619145127072123538190106624 elements. Even if you were able to process one element per CPU instruction (which is unrealistic, you will probably need dozens if not hundreds of instructions), and even if each instruction only takes one clock cycle (which is unrealistic, on current mainstream CPUs, each instruction takes multiple clock cycles), and even if you had a terahertz CPU (which is unrealistic, the fastest current CPUs top out at 5 GHz), and even if your CPU had a million cores (which is unrealistic, even GPUs only have a couple of thousand extremely simple cores), and even if your motherboard had a million sockets (which is unrealistic, mainstream motherboards only have a maximum of 4 sockets, and even the biggest supercomputers only have 10 million cores in total), and even if you had a million of those computers in a cluster, and even if you had a million of those clusters in a supercluster, and even if you had a million friends that also have a supercluster like this, it would still take you about 1621000000000000000000000000000000000000000000000000000000000000000000 years to iterate through them.
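Those figures are easy to reproduce; here is a quick sanity check (shown in Python purely for its convenient arbitrary-precision integers, though Ruby's Integer handles this just as well):

# Reproduce the figures quoted above with exact integer arithmetic.
combinations = 18 ** 93        # ~5.5e116 possible combinations
elements = combinations * 93   # each combination is a 93-element array
bytes_needed = elements * 8    # one 64-bit word per element
print(combinations)            # 549975033204266...087168
print(bytes_needed)            # 409181424703974...852992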
Right, so now that it is hopefully clear that this should not be attempted, I'll take a risk and attempt to solve your actual problem.
You've mentioned in the comments that you need this array for property testing. I'll take a massive leap of faith here and assume you want to test that every possible combination satisfies some condition, and this is the mistake here, as the number of possible combinations is just... large...
Instead, you can test that some of the combinations work. You can easily generate a short, randomized list of combinations using:
Array.new(num) { DATASET.map(&:sample) }
where num is the number of combinations you want to test. Note that there is a chance that some of the entries will be duplicated, but given your dataset size the chances are comparable to colliding UUIDs and can be safely ignored.
Generating such a subset of possible solutions is much easier, faster and, most importantly, possible. Since the output is randomized, it will test slightly different combinations on each run, so remember to have some randomization setup in your test suite if you want to be able to recreate failures.

How can I calculate the appropriate amount of channel capacity?

I am looking for a solution because the sth-channel is full.
I am having trouble calculating the appropriate channel capacity.
This document has the following description.
In order to calculate the appropriate capacity, just have in consideration the following parameters:
・The amount of events to be put into the channel by the sources per unit time (let's say 1 minute).
・The amount of events to be gotten from the channel by the sinks per unit time.
・An estimation of the amount of events that could not be processed per unit time, and thus to be reinjected into the channel (see next section).
How can I check the values of these parameters?
How can I check the values of these parameters?
You can't just check these parameters. They depend on your application.
What they are saying is that you should have a size which is large enough so the generator doesn't get stuck. This may not be possible in your application.
Say your generator receives one event per second and it takes 2 seconds for a receiver to manage that event. Now let's assume you have 3 receivers. In 1 second, each receiver can process 0.5 events. With 3 receivers, together they are capable of processing 0.5 × 3 = 1.5 events per second, which is more than what you get as input. Your capacity can be 1 or 2; using 2 will greatly increase your chances that the generator does not get blocked.
Let's review another example:
Your generator pushes 1,000 events per second
Your receivers take 3 seconds to process one event
You would need 1,000 x 3 = 3,000 receivers (3,000 goroutines that can run at full speed in parallel...)
In this example, the total number of receivers is so large that you have to either break up your code to work on multiple computers or optimize your receiver code so it can process the data in an amount of time that makes sense. Say you have 50 processors, your receivers will get 1,000 events per second, all 50 can run at full speed, you need one receiver to do its work in:
50 / 1000 = 0.05 seconds
Now let's assume that in most cases your goroutines take 0.02 seconds, but once in a while one will take 1 second. That means your goroutines can get a little behind. In that case, your capacity (so the generator doesn't get blocked) should be a little over 1,000. Again, it will depend on how many of the routines get slowed down, etc. In this last example, a single run is 0.02 seconds, so 50 goroutines can handle 2,500 events per second; if you can spread those 1,000 events over the 1-second period, you may not even need the 50 goroutines and could get away with a smaller capacity. On the other hand, if you have big bursts where you may end up sending many (say 500) events all at once, then more goroutines and a larger capacity are important so as not to get blocked.
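The arithmetic above boils down to one line; here is a Python sketch (the function name and the figures are just the examples from this answer):

import math

def receivers_needed(events_per_second, seconds_per_event):
    # Minimum number of consumers required to keep up with the producer.
    return math.ceil(events_per_second * seconds_per_event)

print(receivers_needed(1, 2))        # 2  (the example uses 3 for headroom)
print(receivers_needed(1000, 3))     # 3000 receivers
print(receivers_needed(1000, 0.05))  # 50, once each event takes 0.05 s

Anything above that minimum, plus some channel capacity for bursts and slow outliers, keeps the generator from blocking.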

Timing Advance in GSM

I have a bunch of questions concerning Timing Advance in GSM:
When is it defined?
Is it the phone or the BTS that is in charge of defining its value?
Is it dynamic? Does it depend on certain situations?
Let's say that I figured out a way to get the exact value of the Timing Advance (GSM Layer 1 transmission level) from the phone's modem.
In order to verify my solution, I'm supposed to put my phone over and over into situations where it has to use/change the Timing Advance while I log its value...
How can I do that?
Thanks
In the GSM cellular mobile phone standard, timing advance value corresponds to the length of time a signal takes to reach the base station from a mobile phone. GSM uses TDMA technology in the radio interface to share a single frequency between several users, assigning sequential timeslots to the individual users sharing a frequency. Each user transmits periodically for less than one-eighth of the time within one of the eight timeslots. Since the users are at various distances from the base station and radio waves travel at the finite speed of light, the precise arrival-time within the slot can be used by the base station to determine the distance to the mobile phone. The time at which the phone is allowed to transmit a burst of traffic within a timeslot must be adjusted accordingly to prevent collisions with adjacent users. Timing Advance (TA) is the variable controlling this adjustment.
Technical Specifications 3GPP TS 05.10[1] and TS 45.010[2] describe the TA value adjustment procedures. The TA value is normally between 0 and 63, with each step representing an advance of one bit period (approximately 3.69 microseconds). With radio waves travelling at about 300,000,000 metres per second (that is, 300 metres per microsecond), one TA step then represents a change in round-trip distance (twice the propagation range) of about 1,100 metres. This means that the TA value changes by one for each 550-metre change in the range between a mobile and the base station. The limit of 63 × 550 metres puts the maximum distance a device can be from the base station at about 35 kilometres, which is the upper bound on cell placement distance.
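For reference, the arithmetic behind those figures (a Python sketch; the constants are the standard GSM values cited above):

C = 300_000_000              # speed of light, m/s
BIT_PERIOD = 48 / 13 / 1e6   # one GSM bit period, ~3.69 microseconds
step_round_trip = C * BIT_PERIOD   # ~1,107 m of round-trip distance per TA step
step_range = step_round_trip / 2   # ~553 m of one-way range per TA step
print(63 * step_range)             # ~34,890 m, i.e. the ~35 km limit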
A continually adjusted TA value avoids interference to and from other users in adjacent timeslots, thereby minimizing data loss and maintaining Mobile QoS (call quality-of-service).
Timing Advance is significant for privacy and communications security, as its combination with other variables can allow GSM localization to find the device's position and track the mobile phone user. TA is also used to adjust transmission power in Space-Division Multiple Access systems.
This limited the original range of a GSM cell site to 35km as mandated by the duration of the standard timeslots defined in the GSM specification. The maximum distance is given by the maximum time that the signal from the mobile/BTS needs to reach the receiver of the mobile/BTS on time to be successfully heard. At the air interface the delay between the transmission of the downlink (BTS) and the uplink (mobile) has an offset of 3 timeslots. Until now the mobile station has used a timing advance to compensate for the propagation delay as the distance to the BTS changes. The timing advance values are coded by 6 bits, which gives the theoretical maximum BTS/mobile separation as 35km.
By implementing the Extended Range feature, the BTS is able to receive the uplink signal in two adjacent timeslots instead of one. When the mobile station reaches its maximum timing advance, i.e. maximum range, the BTS expands its hearing window with an internal timing advance that gives the necessary time for the mobile to be heard by the BTS even from the extended distance. This extra advance is the duration of a single timeslot, 156.25 bit periods. This gives roughly a 120 km range for a cell,[3] and is implemented in sparsely populated areas and to reach islands, for example.
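The same back-of-the-envelope math extends to the Extended Range case (again just a sketch, self-contained):

BIT_PERIOD_US = 48 / 13   # ~3.69 microseconds per bit
METERS_PER_US = 300       # radio waves travel ~300 m per microsecond
step_range = BIT_PERIOD_US * METERS_PER_US / 2   # ~553 m one-way per bit period
normal = 63 * step_range                         # ~35 km without Extended Range
extra = 156.25 * step_range                      # one extra timeslot of advance
print(round(normal / 1000), round((normal + extra) / 1000))   # 35 -> 121 km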
Hope this answers the question :)
It's defined every time the BTS needs to set the phone's transmission power, which happens quite often.
It's the core network (the BTS, in GSM) that is totally in charge of defining its value.
It's very dynamic and changes a lot. Globally, the GSM core system is constantly trying to find the exact distance between the BTS and the MS, so it constantly performs a kind of "ping" to calculate it. The result of such operations is generally not that accurate, since there are a lot of obstacles between the mobile and the BTS (it's not a direct link in open space).
Such operations happen a lot, so just use your smartphone normally.

What should I do to maintain the performance of a mobile app which is using a database?

I'm building an app that uses a database.
I have a words table, and every time the user types something, the app records the word and updates the database.
The frequency field is auto-incremented whenever the user enters a matching word.
The trouble is that the user keeps typing day after day, and I'm afraid the search performance will degrade over time, and also that the frequency Int field will someday reach its limit (the max Int value).
So, I limit the database to fewer than about 50,000 records.
I delete less-used records after a certain time.
But I don't know how to deal with the frequency Int field of each word.
How can I know exactly how frequently each word is used without the field growing forever?
I recommend that you use a logarithmic scale for the frequency values. That's what is often done in situations like this. See Wikipedia to learn about logarithmic scales.
For example, if you have a word MAN that has a frequency of 15, the value you store in the database would be log(15) ~= 1.17609125906.
If you then find 4 new occurrences of MAN, you want the stored value to represent a count of 19. You cannot simply add to the log value directly, because log(x)+log(y)=log(x*y). (See the Logarithm Rules section of this article for more information on log rules.)
Instead -- assuming you use a base 10 logarithm, you would use this formula:
SET frequency = log(10^frequency+4)
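A minimal Python sketch of that update rule (the function and variable names are mine):

from math import log10

def bump(stored_log, new_occurrences):
    # stored_log holds log10(count): recover the count, add, re-take the log.
    return log10(10 ** stored_log + new_occurrences)

f = log10(15)           # word seen 15 times -> stored as ~1.17609
f = bump(f, 4)          # 4 new occurrences  -> now represents 19
print(round(10 ** f))   # 19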
Depending on the length of your words, the few bytes for the frequency don't matter much. With an unsigned four-byte integer, you can count up to about 4.3 billion, which is far more than the number of words a user could type in their whole lifespan.
So you may want to go for a two- or three-byte integer instead, but the savings may be negligible.
Anyway, there are the following approaches for preventing overflow:
You can detect the overflow, undo the operation, scale everything down by some factor of two, and then redo it (see the sketch after this list).
You can periodically check all your numbers and do the scaling when approaching the limit.
You can do a probabilistic update like below.
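Here is a sketch of the halving idea from the first two bullets (all names are illustrative); dividing every counter by two preserves the relative frequencies, which is usually all you care about:

MAX_COUNT = 2 ** 31 - 1   # assuming a signed 32-bit frequency column

def increment(counts, word):
    counts[word] = counts.get(word, 0) + 1
    if counts[word] >= MAX_COUNT:   # about to overflow: rescale everything
        for w in counts:
            counts[w] //= 2         # halve every counter
    return counts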
Probabilistic update
Instead of simply incrementing the frequency every time by one, you do it only with a probability which gets lower and lower as the counter grows. For example, you can do the increment with a probability of 1.0 / (oldValue + 1) or 2 ** -oldValue. The latter leads to a logarithmic growth, but, unlike the idea in the other answer, it works.
There are obviously some disadvantages due to the randomness and precision loss, but when all you care about is the relative frequency, it should be good enough.
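Here is a minimal sketch of that probabilistic update (it is essentially Morris's approximate counting algorithm; the names are mine):

import random

def bump(c):
    # Increment the stored exponent c with probability 2**-c.
    return c + 1 if random.random() < 2.0 ** -c else c

def estimate(c):
    # Estimate of the true count; E[2**c - 1] equals n after n bumps.
    return 2 ** c - 1

c = 0
for _ in range(100_000):
    c = bump(c)
print(c, estimate(c))   # c stays tiny (~17); the estimate is ~100,000 on average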
