How does Redis persist data on my local Apache server even after reboot and complete power down? - memory

From what I understand, Redis stores data in memory, which I gather means my RAM when I am running a local Apache development server. I tried powering down my computer and disconnecting the power cable as well, but the Redis data on my local development website persisted when I powered my computer back up and tested my test website again. I thought RAM gets completely wiped on a system reboot, so how does Redis persist data across reboots in my local development environment? Thanks! :)

Redis serves data only out of RAM, but it provides two modes of persistence: RDB (snapshot persistence) and AOF (changelog persistence). If either mode of persistence is enabled on your Redis server, your data will persist between reboots.
The config directives you want to check are:
appendonly yes
save
More information on Redis Persistence here.
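For reference, a minimal redis.conf fragment with both modes enabled might look like this (the save thresholds below are illustrative examples, not recommendations):

```conf
# Enable AOF persistence: every write is appended to the AOF file
appendonly yes

# Enable RDB snapshots: "save <seconds> <changes>" snapshots the dataset
# if at least <changes> keys changed within <seconds>
save 900 1
save 300 10
save 60 10000
```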

Redis has persistence options that save Redis data in either RDB or AOF format (basically saving the Redis data to a file/log):
The RDB persistence performs point-in-time snapshots of your dataset at specified intervals.
The AOF persistence logs every write operation received by the server; these operations are replayed at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. Redis is able to rewrite the log in the background when it gets too big.
If you wish, you can disable persistence entirely, if you want your data to exist only for as long as the server is running.
It is possible to combine both AOF and RDB in the same instance. Notice that, in this case, when Redis restarts the AOF file will be used to reconstruct the original dataset since it is guaranteed to be the most complete.
This info was quoted from https://redis.io/topics/persistence, which goes into detail about these options.
You can read more from the Antirez weblog: Redis Persistence Demystified
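To make the distinction concrete, here is a toy Python sketch (not Redis internals, just an illustration under simplified assumptions): a snapshot dumps the whole dataset at a point in time, while an append-only log records every write and is replayed to rebuild the dataset.

```python
import json
import os
import tempfile

class ToyStore:
    """A dict-based stand-in for an in-memory store, illustrating
    RDB-style snapshots vs AOF-style command logging."""

    def __init__(self):
        self.data = {}
        self.aof = []  # in-memory stand-in for the append-only file

    def set(self, key, value):
        self.data[key] = value
        self.aof.append(("set", key, value))  # AOF: log every write

    def snapshot(self, path):
        # RDB: dump the whole dataset at a point in time
        with open(path, "w") as f:
            json.dump(self.data, f)

    @classmethod
    def restore_from_snapshot(cls, path):
        store = cls()
        with open(path) as f:
            store.data = json.load(f)
        return store

    @classmethod
    def restore_from_aof(cls, log):
        # Replay every logged command to rebuild the dataset
        store = cls()
        for op, key, value in log:
            if op == "set":
                store.set(key, value)
        return store

store = ToyStore()
store.set("user:1", "alice")
store.set("user:2", "bob")

rdb_path = os.path.join(tempfile.mkdtemp(), "dump.json")
store.snapshot(rdb_path)

from_rdb = ToyStore.restore_from_snapshot(rdb_path)
from_aof = ToyStore.restore_from_aof(store.aof)
assert from_rdb.data == from_aof.data == store.data
```

Real Redis does the same in principle: dump.rdb is the point-in-time snapshot, and the AOF is the replayable command log.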

Related

Redis clustering on same servers as app and db server

I would like to set up a cluster for a small app/db (with Galera) and was looking at session storage.
The only reason for the cluster is that we need it to be always available; resources should not be a concern.
I am wondering, if I set up my 3 hosts with the same stack:
app
db (sync with galera)
redis (1 master + 1 slave per host, clustered between the 3 hosts)
Will my app's containers be able to access every part of the Redis session store, no matter where the data is? Or are they bound to the linked redis container I set them up with at first?

pgpool node detached for no reason?

I have a two-node PostgreSQL cluster running on VMs where each VM runs both the pgpool service and a Postgres server.
Due to insufficient memory configuration, the Postgres server crashed, so I bumped the VM memory and changed the Postgres memory config in the postgresql.conf file. Since those memory changes, the slave pgpool node detaches every night at a specific time, even though node_exporter metrics for CPU, load, processes, disk usage, and memory show no spikes or sudden changes.
The slave node had detached before, but not day after day. I stumbled upon this thread and read this part of the documentation about failover. Since the Postgres server didn't crash and existing connections to the slave node kept working (it kept serving existing connections but didn't accept new ones), network issues seemed irrelevant, especially after consulting our OPS team on whether they had noticed any abnormal network or DNS activity that could explain it. Unfortunately, they didn't notice any interesting findings.
I have pg_exporter, postgres_exporter and node_exporter on each node to monitor the server and VM behavior. What should I be looking for to debug this? What should I ask our OPS team to check specifically? Our pgpool log file only states the failure to access the other node, with no exact reason; as the aforementioned docs say:
Pgpool-II does not distinguish each case and just decides that the
particular PostgreSQL node is not available if health check fails.
Could it still be a network/DNS issue? And if so, how would I confirm it?
Thanks for reading and taking the time to assist me with this conundrum.
That turned out to be interesting.
To sum up the gist of it: it was part of the OPS team's infrastructure backups.
The entire process went like this:
Setting the scene:
we run on-prem on top of a VMware vCenter cluster, backed up on the infra side with VMware VM snapshots and Veeam VM backups, where the vmdk files/ESXi datastores reside on NetApp storage served over NFS.
When checking node_exporter metrics in the Node Exporter Full dashboard, I saw network traffic drop to as little as 2 packets per second for about 5 to 15 minutes, consistently over the last few months, with the episodes growing dramatically longer in the last month (around the same late-night time).
After checking again with our OPS team, they indicated it could be the host configuration/Veeam backups.
It turns out that because the storage for the VMs (including the one that runs the Veeam backup) is attached via the network and not directly to the ESXi hosts, the final snapshot gets saved/consolidated at that late-night time -
node detaches every night at a specific time
The way NFS handles disk locking (limiting IOPS on existing data), combined with the high IOPS demands of the Veeam backup, causes the server to hang/freeze and, on rare occasions, even restart the VM. Here's the quote from the Veeam issue doc:
The snapshot removal process significantly lowers the total IOPS that can be delivered by the VM because of additional locks on the VMFS storage due to the increase in metadata updates
the snapshot removal process will easily push that into the 80%+ mark and likely much higher. Most storage arrays will see a significant latency penalty once IOP's get into the 80%+ mark which will of course be detrimental to application performance.
This issue occurs when the target virtual machine and the backup appliance [proxy] reside on two different hosts, and the NFSv3 protocol is used to mount NFS datastores. A limitation in the NFSv3 locking method causes a lock timeout, which pauses the virtual machine being backed up [during snapshot removal].
Obviously, that would interfere at the very least with Postgres functionality, especially configured as a cluster with replication that requires a near-constant connection between the Postgres servers. A similar thread on SO covers a different DB server;
a solution is suggested there, including fixing the issue described in the last quote via this link. For the time being, though, we removed Veeam backups for sensitive VMs until the solution can be verified locally (I will update in the future if we try it and it fixes the issue).
Additional incident documentation: a similar issue case, suggested solution info from Veeam, a third-party site's workaround (a temporary fix for the same problem, as I see it), and a Reddit thread acknowledging the issue and suggesting options.
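As a stopgap while such backup windows exist, pgpool's health-check behavior can also be loosened so a brief NFS stall doesn't immediately detach a node. A sketch of the relevant pgpool.conf knobs (values are illustrative, not tuned recommendations):

```conf
# pgpool.conf - health check tuning (illustrative values)
health_check_period = 30        # seconds between health checks
health_check_timeout = 20      # give up on a single check after 20 s
health_check_max_retries = 3   # retry this many times before declaring the node down
health_check_retry_delay = 5   # seconds to wait between retries
```

Raising max_retries and retry_delay trades slower genuine-failure detection for tolerance of short I/O freezes like the ones the backup caused.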

Persisting Neo4j transactions to external storage

I'm currently working on a new Java application which uses an embedded Neo4j database as its data store. Eventually we'll be deploying to a cloud host which has no persistent data storage available - we're fine while the app is running but as soon as it stops we lose access to anything written to disk.
Therefore I'm trying to come up with a means of persisting data across an application restart. We have the option of capturing any change commands as they come into our application and writing them off somewhere, but that means retaining a lifetime of changes and applying them in order as an application node comes back up. Is there any functionality in Neo4j or SDN that we could leverage to capture changes at the Neo4j level and write them off to an AWS S3 store or the like? I have looked at Neo4j clustering, but I don't think that will work either, at a technical level (limited protocol support on our cloud platform) or because of the cost of an Enterprise licence.
Any assistance would be gratefully accepted...
If you have an embedded Neo4j, you should know where in your code you are performing update/create/delete queries in Neo4j, no?
To respond to your question, Neo4j has a TransactionEventHandler (https://neo4j.com/docs/java-reference/current/javadocs/org/neo4j/graphdb/event/TransactionEventHandler.html) that captures all the transaction and tells you what node/rel has been added, updated, deleted.
In fact it's the way to implement triggers in Neo4j.
But in your case I would consider the following:
use another cloud provider that allows you to have persistent storage
if that's not possible, implement a hook on application shutdown that copies the graph.db folder to external storage (and do the opposite on startup)
use Neo4j as a remote server, and install it on a cloud provider with persistent storage.
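The shutdown-hook idea can be sketched as follows. The real app is Java, so this Python version is just the shape of the idea; the temp directories stand in for the Neo4j data directory and for S3 or other durable storage.

```python
import atexit
import os
import shutil
import tempfile

DATA_DIR = tempfile.mkdtemp(prefix="graph.db-")   # stand-in for the Neo4j data dir
BACKUP_DIR = tempfile.mkdtemp(prefix="storage-")  # stand-in for S3/durable storage

def restore_on_startup():
    # If a previous backup exists, copy it over the (empty) data dir
    if os.listdir(BACKUP_DIR):
        shutil.rmtree(DATA_DIR)
        shutil.copytree(BACKUP_DIR, DATA_DIR)

def backup_on_shutdown():
    # Copy the whole data dir out before the host's disk disappears
    shutil.rmtree(BACKUP_DIR)
    shutil.copytree(DATA_DIR, BACKUP_DIR)

restore_on_startup()
atexit.register(backup_on_shutdown)  # run the backup when the process exits

# Simulate the embedded database writing a store file
with open(os.path.join(DATA_DIR, "neostore"), "w") as f:
    f.write("graph bytes")
```

The caveat with any copy-on-shutdown scheme is that a crash (as opposed to a clean shutdown) loses everything since the last backup, which is why periodic copies or the TransactionEventHandler approach may be safer.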

Azure Redis Cache VS Application Cache

[ Azure Redis Cache ]
ConnectionMultiplexer connection = ConnectionMultiplexer.Connect(string.Format("{0},abortConnect=false,ssl=true,password={1}", "redisCacheEndpoint", "redisCachePassword"));
IDatabase cache = connection.GetDatabase();
cache.StringSet("time", DateTime.Now.ToString("o"), TimeSpan.FromMinutes(30));
Pros
Available from all your programs in Azure
Doesn't affect the performance of the application
Cons
Expensive
Slower
[ Application Cache ]
System.Web.HttpContext.Current.Cache["time"] = DateTime.Now;
Pros
Faster
Free
Cons
Can only be used in your current app
Affects the performance of the application
I don't really see a reason why you should use Azure Redis Cache instead of the application cache.
If you have performance problems because the cache takes up too many resources, you can always add more power to your app instead of scaling up your cache, which would be more expensive.
Azure Redis Cache is nice when you need it for more than one service, but mostly I only need caching for one service.
Why should I use Azure Redis Cache?
Please add more pros & cons if you feel I forgot something important.
Redis is more than just a key / value store. It supports many data types like Sorted Sets, Lists, HashSets, Strings etc. If your application needs to use the likes of such, you will benefit from Redis.
If your application is going to be on-premises accessing Redis in the cloud, it is bound to be slower in terms of latency than running Redis on your local machine or the Application cache.
Redis has in-built session state or output-cache providers that help to quickly wire up your websites with your Azure Redis Cache.
An application cache works just fine if you have one instance of your application. As soon as you scale this out, you need to have a distributed cache. At such times, Redis in VMs or Azure Redis Cache will be a solution you should consider.
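The scale-out point can be shown with a toy Python sketch (plain dicts stand in for the in-process cache and for a shared Redis instance; no real network is involved): two app instances each have their own local cache, but share one distributed store.

```python
class AppInstance:
    """Toy model of one scaled-out web app instance."""

    def __init__(self, shared_cache):
        self.local_cache = {}             # per-instance, like HttpContext.Cache
        self.shared_cache = shared_cache  # shared, like Azure Redis Cache

    def handle_login(self, user):
        # Cache the session in both places for comparison
        self.local_cache[user] = "session"
        self.shared_cache[user] = "session"

    def sees_locally(self, user):
        return user in self.local_cache

    def sees_shared(self, user):
        return user in self.shared_cache

redis_like = {}  # one store shared by all instances
a = AppInstance(redis_like)
b = AppInstance(redis_like)

a.handle_login("alice")            # request routed to instance A
# A later request for alice lands on instance B:
print(b.sees_locally("alice"))     # False - B's local cache misses
print(b.sees_shared("alice"))      # True  - the distributed cache hits
```

With only one instance the local cache is fine; as soon as a load balancer spreads requests over several instances, only the shared store gives consistent answers.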
Last but not least, if your application and Redis cache are in the same Azure region, you can get latencies of around 1 ms or less for most of your requests, if you follow some of the best practices on connection management and payload sizes.
You should use Redis because it's not only a cache for your objects; you can also cache the output of your pages. Also, Redis is replicated on Azure servers, which means you won't lose state in case of a reboot or other problem. I strongly recommend using a Redis cache (Azure Redis Cache or Redis inside a virtual machine).

How to dump contents of membase db to seed another cluster

I am using membase 1.7.1 server cluster of 3 machines (vbuckets only), and would like to be able to back up the contents for the -- presumably unlikely -- case that the entire cluster goes down.
I periodically get new data from my provider; I want to keep the old data around more or less indefinitely, and add the new data. Imagine a wine rating application. New vintages come in all the time, but I need to keep the old ones around.
Currently I have a process which does the following:
Download some data from 3rd party provider
Push data into my vbucket; some old data may be overwritten, some data will be new
Hang out until next data update; other processes will be reading the data
What I'd like to do is:
See if my bucket has any data in it
If it doesn't, load from offline storage (see step #5)
Download some data from 3rd party provider
Push data into my vbucket; some old data may be overwritten, some data will be new
Take dump of all data into offline storage
Hang out until next data update; other processes will be reading the data
Steps 1, 2, and 5 are new.
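The flow above can be sketched in Python, with a plain dict standing in for the membase bucket and a JSON file standing in for offline storage (all names here are illustrative, not membase APIs):

```python
import json
import os
import tempfile

def load_offline(path):
    """Step 2: seed the bucket from offline storage if a dump exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}

def dump_offline(bucket, path):
    """Step 5: take a dump of all data into offline storage."""
    with open(path, "w") as f:
        json.dump(bucket, f)

def update_cycle(bucket, new_data, dump_path):
    # Steps 1-2: if the bucket is empty, restore from the last dump
    if not bucket:
        bucket.update(load_offline(dump_path))
    # Steps 3-4: push provider data; overwrites old keys, adds new ones
    bucket.update(new_data)
    # Step 5: persist the merged result for the next cold start
    dump_offline(bucket, dump_path)
    return bucket

dump = os.path.join(tempfile.mkdtemp(), "wines.json")
first = update_cycle({}, {"2011": "good"}, dump)    # first run: nothing offline yet
second = update_cycle({}, {"2012": "great"}, dump)  # fresh empty bucket, e.g. after cluster loss
```

In the second cycle the old 2011 vintage is recovered from the dump before the new 2012 data is merged in, which is exactly the "keep old vintages around" requirement.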
So the question is about step #5. Is the TAP protocol a good way to dump out the contents of my membase bucket? Will it interfere with readers?
The membase documentation recommends the mbbackup tool for backups, which is invoked manually from the command line, outside of your application. The dumped data can be restored via mbrestore. The target of mbrestore can be a different cluster from the one you ran mbbackup on.
Reference: http://www.couchbase.org/wiki/display/membase/Membase+Server+version+1.7.1+and+up
If you're on AWS, you can run membase on EBS and have the option of snapshotting the EBS volumes over to Amazon S3 periodically.
Reference: http://couchbase.org/forums/thread/correct-way-back-aws-membase-ebs
