I am trying to create an iOS app, which will transfer the files from an iPhone to a server, process them there, and return the result to the app instantly.
I have noticed that AWS offers an SDK to transfer files from an iOS app to S3, but not to EC2 (or at least to EBS, which can be attached to EC2). I wonder why I have to go through S3 when my business logic doesn't warrant storing the files. I have used tools such as s3cmd and s3fs to connect to S3 from EC2, but they are very slow at transferring files. I am concerned that the route through S3 will add latency, especially when users expect a result in a split second.
Could you please guide me on how I can bypass the S3 route and transfer files in real time from an iOS app to EC2 (or EBS)?
Allowing an app to write directly to an instance's file system is a non-starter, short of treating it as a network drive, which would be pretty convoluted, not to mention the security issues you'll almost certainly have. This really is what S3 is there for. You say you are seeing bad performance between EC2 and S3; that does not sound right at all, since this is an internal datacenter connection, which should be at the very least several orders of magnitude faster than a connection from a mobile device to the Amazon datacenter. Are you sure you created your bucket and instance in the same region? Alternatively, it might be the clients you're using: don't try to set up file system access, just use the AWS CLI.
If you are really tied to the idea of going direct to the EC2 instance, you will need to do it via some network connection, either by running a web server or perhaps using some variety of copy over SSH, if that is available on iOS. It does seem pointless to set this up when S3 has already done it for you. Finally, depending on how big the files are, you may be able to get away with SQS or some kind of database store.
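If you do go the web-server route, the core of it is just an HTTP endpoint on the instance that writes the request body to disk. Here is a minimal sketch using only Python's standard library; the port, the `X-Filename` header, and the upload directory are all arbitrary choices for illustration, and a real service would need TLS, authentication, and proper multipart parsing:

```python
import os
import tempfile
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical destination directory on the EC2 instance.
UPLOAD_DIR = os.path.join(tempfile.gettempdir(), "uploads")

class UploadHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw request body; a real service would parse multipart/form-data.
        length = int(self.headers.get("Content-Length", 0))
        data = self.rfile.read(length)
        # Take the target name from a custom header (illustrative convention).
        name = os.path.basename(self.headers.get("X-Filename", "upload.bin"))
        os.makedirs(UPLOAD_DIR, exist_ok=True)
        with open(os.path.join(UPLOAD_DIR, name), "wb") as f:
            f.write(data)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"stored")

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet

def serve(port=8080):
    """Start the upload server in a background thread; 8080 is arbitrary."""
    server = HTTPServer(("0.0.0.0", port), UploadHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The iOS app would then POST the file bytes to this endpoint directly, skipping S3 entirely, with all the operational burden that implies.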
It's okay being a newbie!! I ran up against exactly the same processing problem and solved it by running a series of load-balanced web servers: the mobile app calls an upload utility, uploads the file, the server processes it, and then deploys the result to S3 using a signed URL which the mobile app can display. It is fast, reliable and secure. The results are cached using CloudFront, so once written they are blazing fast to re-access on the mobile. Hope this helps.
I am looking for a mechanism to accomplish a two-way storage mirroring.
I have two storages, both used for reads & writes at the same time.
any file written to one of these storages should be available for reading from the second one ASAP (within no more than a few seconds).
in case one storage is down, the second one is already a full copy and can serve any file requested.
new files should be synced to the failed storage once it's back up again.
For a better understanding of the case, here is my use case:
I am deploying an asp.net application into two sites (Site-A | Site-B), with a load balancer in between.
each site will have its own NAS storage (Storage-A | Storage-B).
Now, when a user uploads a file to the application, it will be saved to the storage linked to the site that handled the request; let's assume it was Storage-A.
Then another user needs to download the file, but now his request is handled by Site-B,
which means the file will be looked for inside Storage-B, and it should be available there through the two-way mirroring.
Further information:
there is a 5-kilometer distance between the sites; it's all a private network with no internet access.
network speed is 1 Gb/s but can be increased if needed.
the OS used is Windows Server 2019.
I've searched a lot, but all the solutions I found involved cloud services or clustering with one-way mirroring.
Happy to hear any suggestions, and pardon my delivery, as this is my first question here.
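For what it's worth, on Windows Server this multi-master pattern is what DFS Replication is built for. As a language-neutral illustration of what the core sync loop has to do, here is a toy Python sketch; it only handles new and updated files, and deliberately ignores deletes, conflicts, and partial writes, which real replication tools must handle:

```python
import os
import shutil

def sync_pair(dir_a, dir_b):
    """Toy two-way mirror: copy any file that is missing, or newer, on one
    side over to the other side. shutil.copy2 preserves modification times,
    so an already-synced file is not copied back in the reverse pass."""
    for src, dst in ((dir_a, dir_b), (dir_b, dir_a)):
        for root, _dirs, files in os.walk(src):
            rel = os.path.relpath(root, src)
            for name in files:
                s = os.path.join(root, name)
                d = os.path.join(dst, rel, name)
                if not os.path.exists(d) or os.path.getmtime(s) > os.path.getmtime(d):
                    os.makedirs(os.path.dirname(d), exist_ok=True)
                    shutil.copy2(s, d)
```

Running such a loop every few seconds would satisfy the "available ASAP" requirement, and the same pass naturally re-syncs a storage that was down once it is reachable again.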
I have this idea for what I think should happen with my project, but I want to check in and see if this works on a theoretical level first. Basically, I am working on a Django site that runs on Kubernetes, but I am struggling a little bit with how I should set up my ReplicaSet/StatefulSet to manage uploaded content (images).
My biggest concern is trying to figure out how to scale and maintain uploaded content. My first idea is that I need a single volume that these files are written to, but can multiple pods write to the same volume while scaling?
From what I have gathered, it doesn't seem to work that way. It sounds more like each pod, or at least each node, would have its own volume. But then how would a request for an image reach the volume it is stored on? Or should I create a custom backend program to move things around so that it is served off an NGINX server like my other static content?
FYI - this is my first scalable project lol. But I am really just trying to find the best way to manage uploads... or a way in general. I would appreciate any explanations, thoughts, or fancy diagrams on how something like this might work!
Hello! I think you should forget Kubernetes for a bit and think about the architecture and capabilities of your Django application. I guess you have built a web app that offers some 'upload image' functionality, and then you have code that 'stores' this image somewhere. In the very simple scenario where you run your app on your laptop, the web app is configured to save this content to a local folder; a more advanced example is that you deploy your application to a VM or a cloud VM, e.g. an AWS EC2 instance, and your app saves the files to the local storage of that instance. The question is: what happens if we have 2 instances of your web app deployed? Can they be configured and run so that they 'share' the same folder to save the images? I guess this is what you want; otherwise your app would not scale horizontally, and each user would have to hit one specific instance in order to upload or retrieve specific images. So, having in mind that this is a design decision of your application (which I am pretty sure you have already worked out), you need to think: how can I share a folder or a bucket so that all the instances of my web app can save files? If you spun up 3 different VMs on any cloud, you would have to use some kind of cloud storage so that all three instances point to the same physical storage location: an NFS drive, or a cloud storage service like S3!
Having all the above in mind, and clearly understanding that you need to decouple your application from the notion of local storage (especially if you want to make it as stateless as it gets, whatever that means to you): having your web app packaged as a Docker container and deployed in a Kubernetes cluster as a pod, while saving files to local storage, is not going to get you far, since each pod, i.e. each Docker container, will use the underlying Kubernetes worker's (VM's) storage to save files, so another instance will be saving files on some other VM, and so on.
Kubernetes provides this kind of abstraction for applications (pods) that want to 'share' some local storage within the Kubernetes cluster and, of course, persist it. Something that I did not add above: with pod and worker storage (meaning if you save files on the Kubernetes worker or in the pod), once this VM / instance is restarted you will lose your data. So you want something durable.
To cut a long story short,
1) You can deploy your application / pod along with a PersistentVolumeClaim, assuming that your Kubernetes cluster supports it. What happens is that you mount into your pod some kind of folder / storage which will be backed by whatever is available to your cluster, for example some kind of NFS store. https://kubernetes.io/docs/concepts/storage/persistent-volumes/
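A minimal sketch of option 1; the claim name, size, and mount path are placeholders, and note that `ReadWriteMany` (needed if several pods mount the volume at once) requires an NFS-like backing store that supports it:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: media-uploads          # placeholder name
spec:
  accessModes:
    - ReadWriteMany            # required for multiple pods writing at once
  resources:
    requests:
      storage: 10Gi            # placeholder size
---
# In the pod template of your Deployment, mount the claim:
#   volumes:
#   - name: media
#     persistentVolumeClaim:
#       claimName: media-uploads
#   and under the container:
#   volumeMounts:
#   - name: media
#     mountPath: /app/media    # placeholder path, e.g. Django's MEDIA_ROOT
```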
2) You can 'outsource' this need to share common storage to some external provider, e.g. in the common case an S3 bucket, and not tackle the problem in Kubernetes at all: just keep and provision the app within Kubernetes.
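For option 2 with Django specifically, the community `django-storages` package is the usual route; a sketch of the relevant settings, where the bucket name and region are placeholders:

```python
# settings.py -- requires `pip install django-storages[boto3]`
INSTALLED_APPS += ["storages"]

# Route all FileField / ImageField uploads to S3 instead of local disk.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
AWS_STORAGE_BUCKET_NAME = "my-app-uploads"   # placeholder
AWS_S3_REGION_NAME = "us-east-1"             # placeholder
# Credentials should come from the environment or an instance role,
# not from settings checked into version control.
```

With this in place every pod writes to the same bucket, so the "which volume has the image?" question disappears entirely.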
I hope I gave you some basic ideas.
Note: Kubernetes 1.14 (March 2019) now comes with durable local storage management in GA, which:
Makes locally attached (non-network attached) storage available as a persistent volume source.
Allows users to take advantage of the typically cheaper and improved performance of persistent local storage. kubernetes/kubernetes: #73525, #74391, #74769; kubernetes/enhancements: #121 (KEP)
That might help you secure truly persistent storage for your case.
As noted by x-yuri in the comments:
See more with "Kubernetes 1.14: Local Persistent Volumes GA", from Michelle Au (Google), Matt Schallert (Uber), Celina Ward (Uber).
You could use IPFS: https://pypi.org/project/django-ipfs-storage/
By creating a container with this image https://hub.docker.com/r/ipfs/go-ipfs/ in the same pod, you can refer to it as 'localhost'.
I have a node app that allows people to upload their profile picture. Profile pictures are stored on the file system.
I now want to turn my node app into a Docker container.
I would like to be able to deploy it pretty much anywhere (Amazon, etc.) and realise that storing files within the file system is a no-go.
So:
Option 1: store files on Amazon's S3 (or something equivalent)
Option 2: creating a "data volume". This makes me wonder: if I deploy this remotely, will this work? Would this be a viable long-term way to go about it?
Are volumes what I want to do here? Is this how you use docker volumes in Amazon?
(Damn this stuff is hard to crack...)
The answer is: that depends hehehe
Option 1 is good, resilient, and works just out-of-the-box, but it creates vendor lock-in. Meaning that if you ever decide to stop using AWS, you'll have some code to refactor. Plus, bills for S3 will be high if you perform lots and LOTS of requests.
Option 2 works partially: your Docker containers will likely run on VMs on AWS, Azure, etc., which are also ephemeral. Meaning your data can disappear just as quickly as if it were in containers, unless you back it up.
Other options I know:
Option 3: AWS has an NFS service (Amazon EFS), which seems VERY interesting. In theory, it would be like plugging a USB storage drive into VM instances, which can be mounted as volumes inside the containers. I have never done this myself, but it seems viable for reducing S3 costs. However, this also creates vendor lock-in.
Option 4: AWS S3 with a caching mechanism for files in the VMs.
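Option 4 can be as simple as a read-through cache on local disk. A toy sketch, where `fetch_from_s3` is a stand-in for a real S3 client call (e.g. boto3's `get_object`) and the cache directory is an arbitrary choice:

```python
import hashlib
import os
import tempfile

# Hypothetical local cache location on the VM.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "s3cache")

def cached_get(key, fetch_from_s3):
    """Return the object for `key`, hitting S3 only on a cache miss.
    Cached copies are keyed by a hash of the object key, so arbitrary
    key strings map to safe local filenames."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, hashlib.sha256(key.encode()).hexdigest())
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = fetch_from_s3(key)
    with open(path, "wb") as f:
        f.write(data)
    return data
```

A real version would also need cache invalidation and a size limit, but even this shape cuts repeated S3 GET costs for hot files like profile pictures.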
If you are just testing, option 1 sounds good! But later on you might have to work on that.
I am running a Rails app on Bluemix and want to use CarrierWave for file uploads. So far no problem, as I am using external storage to persist the files (FTP, S3, WebDAV, etc.). However, in order to keep performance up I need to enable caching with carrierwave_backgrounder, and here it starts to get tricky. The thing is that I need to specify a temp folder for backgrounding the upload process (the temp folder where the file remains before it is persisted to the actual storage) which is shared between all possible workers and app instances. If this is possible at all, how can it be achieved?
Check out Object Storage - you can store files and then delete them when you no longer need them. Redis is another option, as are any of the NoSQL databases available on Bluemix.
Typically, in any cloud, you never store on the file system of your VM or PaaS environment. The reason is that when you scale out, you have multiple VMs, and a file written on one VM will not be available when hundreds of VMs come up. The recommended practice is to look for storage services that the cloud platform provides. In Bluemix you have storage options such as Cloud Object Storage, File Storage and Block Storage.
As suggested before, you can take a look at Cloud Object Storage and utilize the service. Here is the documentation: https://ibm-public-cos.github.io/crs-docs/. It contains a quick start guide and covers storing, retrieving, and API usage. Hope this helps.
I'm new to server-side programming, with a background in iOS. So I want to know where to start.
Here I tried to list some specific questions:
Can I just create a local database and practice on that?
Do the local databases and databases on remote server work the same?
If no, how can I choose which server I can use? (I went through the webpages of AWS cloud service and found they are really overwhelming.)
Arslan's answer is great, but I would like to add to it a bit. You mentioned a chatroom, so in that case you should look into socket programming. The reason why I bring this up is that, while no one has outright said it, you shouldn't create a chat server by reading / writing to a database. It's much better to just keep it in memory and log to the database on an as-needed basis.
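The in-memory-with-periodic-logging idea might look like the sketch below. Python and sqlite stand in here for whatever language and database you end up using; the point is the batching, not the specific store:

```python
import sqlite3

class ChatLog:
    """Keep recent messages in memory; flush them to the database in
    batches instead of doing one write per message."""
    def __init__(self, db_path=":memory:", flush_every=100):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages (user TEXT, body TEXT)"
        )
        self.buffer = []
        self.flush_every = flush_every

    def post(self, user, body):
        # Serving reads from self.buffer is instant; the DB is just a log.
        self.buffer.append((user, body))
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        self.db.executemany("INSERT INTO messages VALUES (?, ?)", self.buffer)
        self.db.commit()
        self.buffer.clear()
```

A real chat server would also flush on a timer and on shutdown so a crash loses at most one batch.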
AWS is a fantastic solution and they have a lot of different offerings for different situations. You should look at using EC2, which is their server product. It has a free tier that you can use, and/or you can test locally. I suggest testing locally, then pushing up to a free-tier instance every now and then to make sure everything is running properly.
Also, I would look into using CloudKit for database storage. If you don't need instantaneous communication, it's far easier to use Apple's built-in system than to set up a server and manage it.
links: CloudKit, AWS EC2 Free Tier
As it happens, I'm actually working on a chatroom server program; here's the link on GitHub. It is written in C++, so I recommend using it as a reference unless you want to write your own sockets in C++.
Can I just create a local database and practice on that?
Sure. You can install a server locally on your machine (there are plenty available), and through 'localhost:3000' or 'localhost' you can access the root of your server, depending upon what you are using at the server end. You can then configure your server to respond to a particular message.
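For the database side specifically, Python's built-in sqlite3 module is an easy way to practice locally before touching any remote server; the table and data here are made up for illustration:

```python
import sqlite3

# A throwaway local database; the same SQL works against a remote
# PostgreSQL/MySQL server later, just through a different driver.
conn = sqlite3.connect(":memory:")  # swap in a file path to persist it

conn.execute(
    "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.commit()

rows = conn.execute("SELECT name FROM users").fetchall()
```

No installation or server process is needed, which makes it ideal for experimenting with queries before worrying about hosting.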
Do the local databases and databases on remote server work the same?
Of course, they work in almost the same way. The difference you have stated yourself: remote.
If no, how can I choose which server I can use? (I went through the webpages of AWS cloud service and found they are really overwhelming.)
I would suggest you start with a local server. But first you have to choose a language: PHP, Ruby, Python - it depends on you and your personal preferences. You can also use something like Parse.com. Parse.com is free up to 30 requests/second, and you can use Objective-C to send and retrieve data from the server in a few very easy steps. And of course, Parse.com handles signing up and logging in a user for you; all you have to do is write a few lines of code in your iOS app.
Download Apple's free Server.app from the App Store; it wraps one of the best database management systems: PostgreSQL. Start it with this Terminal command:
sudo serveradmin start postgres
More info on these pages:
http://support.apple.com/kb/HT5583
http://www.postgresql.org