If I make an online backup remotely with the neo4j-admin backup tool, as Neo4j advises, I have to expose a public IP and open the backup port on my Neo4j server.
However, I don't see neo4j-admin asking for any login credentials, basically making it possible for anybody to access the server and copy all the data while the port is opened.
There is no setting inside the neo4j.conf that would only accept backup requests from a certain address.
So what does it mean? When the online backups are done remotely, as is advised, the database may be vulnerable to somebody else just copying all the data.
I didn't find anything in the Neo4j documentation that addresses this flaw (only a warning), and it looks like no solution has been offered in the more than 7 years that this feature has been part of the commercial Enterprise edition.
What do you do to protect the DB then? At the moment the only solution seems to be not backing it up remotely, but that puts additional stress on the server and is not the best solution. Plus, the online backup is not stable when done locally for large DBs. Another solution could be to open the port only for the duration of the backup via some kind of API on the server, but that may still be exploited if somebody figures out the time frame in which the backup is made.
The documentation states that neo4j-admin must be invoked as the neo4j user. That is the user that owns the neo4j executables and the databases. So security is handled by the OS login, and the file permissions should be set to prevent unauthorised access to the neo4j directories and files, including the neo4j-admin executable.
I need the VPS services for hosting my ASP.NET project.
However, it's not just ASP.NET hosting: I also need SQL Server, RabbitMQ and either my running console app or my Windows service.
So I read the suggestions to use Amazon Web Services, as they provide the first year for free.
However when I registered I found that I don't have a clue of where I am:
I don't see the option of creating a virtual machine with Windows
I don't see the option of setting up SQL Server on such a machine
and so on.
So I was wondering whether I'm in the right place?
Please advise whether AWS can provide me with what I need, or whether I came to the wrong place.
AWS can provide all that you listed, but you'll need to do some learning on your end.
Basically you create an EC2 instance, and then use RDP to remote into it, and you can install software and configure it to your heart's content, just as if it were any other physical server.
If you want to use SQL Server, you'll have the choice of installing it directly on the instance using your own license, or using their 'hosted' version of SQL Server, called RDS. You'll need to read about it and decide which option is better for your project; there is no single right way.
Lastly, I will point out that although the 'free-tier' is nice, except for a really small application (i.e. a small db on a low-traffic website), you may find that the 'free-tier' does not quite give you all the power you need to run a busy application. I would not base your decision on whether or not to use AWS on how much 'free' stuff you can get. The free-tier is nice for learning, but plan on spending some money for a truly robust solution.
I'm new to server side programming with a background in iOS. So I want to know where to start.
Here I tried to list some specific questions:
Can I just create a local database and practice on that?
Do the local databases and databases on remote server work the same?
If no, how can I choose which server I can use? (I went through the webpages of AWS cloud service and found they are really overwhelming.)
Arslan's answer is great, but I would like to add to it a bit. You mentioned a chatroom, so in that case you should look into socket programming. The reason I bring this up is that, while no one has outright said it, you shouldn't build a chat server by reading from and writing to a database for every message. It's much better to keep the messages in memory and log to the database on an as-needed basis.
AWS is a fantastic option, and they have a lot of different services for different situations. You should look at EC2, which is their virtual server offering. It has a free tier that you can use, and you can also test locally. I suggest testing locally and pushing to the free tier every now and then to make sure everything is running properly.
I would also look into using CloudKit for database storage. If you don't need instantaneous communication, it's far easier to use Apple's built-in system than to set up a server and manage it.
links: CloudKit, AWS EC2 Free Tier
As it happens, I'm actually working on a chatroom server program; here's the link to GitHub. It is written in C++, so I recommend using it only as a reference unless you want to write your own socket code in C++.
Can I just create a local database and practice on that?
Sure. You can install a server locally on your machine (there are plenty available), and through 'localhost:3000' or 'localhost' you can access the root of your server, depending on what you are using at the server end. You can then configure your server to respond to a particular message.
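To make "a local server that responds to a particular message" concrete, here is a minimal sketch using the HTTP server that ships with the JDK. The class name LocalPracticeServer, the /hello path and the reply text are made up for illustration; any other server stack would work just as well.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Tiny practice server; the class name, path and reply text are illustrative only. */
final class LocalPracticeServer {

    /** Starts a server on localhost at the given port (0 lets the OS pick a free port). */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", port), 0);
        // Answer requests to http://localhost:<port>/hello with a fixed text body.
        server.createContext("/hello", exchange -> {
            byte[] body = "Hello from localhost".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Calling LocalPracticeServer.start(3000) from your own main method and opening http://localhost:3000/hello in a browser should show the reply; call stop(0) on the returned server to shut it down.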
Do the local databases and databases on remote server work the same?
Yes, they work almost the same way. The only difference is the one you stated yourself: one of them is remote.
If no, how can I choose which server I can use? (I went through the webpages of AWS cloud service and found they are really overwhelming.)
I would suggest starting with a local server. But first you have to choose a language: PHP, Ruby, Python; it depends on your personal preferences. You can also use something like Parse.com. Parse.com is free up to 30 requests/second, and you can use Objective-C to send and retrieve data from the server in a few very easy steps. And of course, Parse.com handles signing up and logging in a user for you; all you have to do is write a few lines of code in your iOS app.
Download Apple's free Server.app from the App Store; it wraps one of the best database management systems: PostgreSQL. Start it with this Terminal command:
sudo serveradmin start postgres
More info on these pages:
http://support.apple.com/kb/HT5583
http://www.postgresql.org
I have an application that connects to a database and can be used in multi-user mode, whereby multiple computers can connect to the same database server to view and modify data. One of the clients is always designated to be the 'Master' client. This master also receives text information from either RS232 or UDP input and logs this data every second to a text file on the local machine.
My issue is that the other clients need to access this data from the Master client. I am just wondering the best and most efficient way to proceed to solve this problem. I am considering two options:
Write a folder synchronize class to synchronize the folder on the remote (Master) computer with the folder on the local (client) computer. This would be a threaded, buffered file copying routine.
Implement a client/server so that the Master computer can serve this data to any client that connects and requests the data. The master would send the file over TCP/UDP to the requesting client.
The solution will have to take the following into account:
a. The log files are being written to every second. It must avoid any potential file locking issues.
b. The copying routine should only copy files that have been modified at a later date than the ones already on the client machine.
c. Be as efficient as possible
d. All machines are on a LAN
e. The synchronization need only be performed, say, every 10 minutes or so.
f. The amount of data is only in the order of ~50MB, but once the initial (first) sync is complete, then the amount of data to transfer would only be in the order of ~1MB. This will increase in the future
Which would be the better method to use? What are the pros/cons? I have also seen the Fast File Copy post, which I am considering using.
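For reference, requirement (b) of option 1 (copy only files that are newer than the client's copy) could be sketched roughly as below. This is only an illustration under assumptions: the class name IncrementalFolderSync is hypothetical, both folders are assumed to be reachable as file system paths (for example through a share), and it does nothing about the file locking in requirement (a).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: one-way sync that copies only missing or newer files. */
final class IncrementalFolderSync {

    /** Copies files from source to target when needed and returns the source files that were copied. */
    static List<Path> sync(Path source, Path target) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(source)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        List<Path> copied = new ArrayList<>();
        for (Path src : files) {
            Path dst = target.resolve(source.relativize(src).toString());
            if (isNewer(src, dst)) {
                Files.createDirectories(dst.getParent());
                // COPY_ATTRIBUTES carries the source timestamp over, so an unchanged file is skipped next run.
                Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.COPY_ATTRIBUTES);
                copied.add(src);
            }
        }
        return copied;
    }

    /** True when dst is missing or src was modified later (compared at millisecond precision). */
    static boolean isNewer(Path src, Path dst) throws IOException {
        return Files.notExists(dst)
                || Files.getLastModifiedTime(src).toMillis() > Files.getLastModifiedTime(dst).toMillis();
    }
}
```

Running sync every 10 minutes from a scheduled task would cover requirement (e); after the first full copy, only the files changed since the last run are transferred.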
If you use a database, why does the "master" write the data to a text file instead of to the database, if that data needs to be shared?
Why reinvent the wheel? Use rsync instead. Package for Windows: cwrsync.
For example, install an rsync server on the Master machine, and on the client machines install rsync clients or simply drop the files into your project directory. Whenever needed, your application on a client machine executes rsync.exe, requesting to synchronize the necessary files from the server.
In order to copy open files you will need to set up the Windows Volume Shadow Copy service. Here's a very detailed description of how the Master machine can be set up to allow copying of open files using Windows Volume Shadow Copy.
Write a web service interface, so that the clients can connect to the server and pull new data as needed. Or, you could write it as a subscribe/push mechanism so that clients connect to the server, "subscribe", and then the server pushes all new content to the registered clients. Clients would need to fully sync (get all changes since last sync) when registering, in case they were offline when updates occurred.
Both solutions would work just fine on the LAN, the choice is yours. You might want to also consider those issues related to the technology you choose:
Deployment flexibility. Using file shares and file copy requires file sharing to work, and all LAN users might gain access to the log files.
Longer term plans: File shares are only good on the local network, while IP based solutions work over routed networks, including Internet.
The file-based solution would be significantly easier to implement compared to the IP solution.
I want to know which is the best architecture to adopt for this case:
I have many shops that connect to a web application developed using Ruby on Rails.
Internet access is not available all the time.
The solution was to develop an offline system which requires installing a local copy of the distant database.
All this was already developed.
Now what I want to do:
Work always on the local copy of the database.
Any change on the local database should be synchronized with distant database.
All the local copies should have the same data in other local copies.
To resolve this problem I thought about using JMS-like software, possibly RabbitMQ.
This consists of pushing every SQL request into a JMS queue. It is then executed on the distant instance of the application, which inserts into the distant DB and pushes the insert or SQL statement into another queue that is read by all the local instances. This seems complicated and would probably slow down the application.
Is there a design or recommendation that I must apply to resolve this kind of problem ?
You can do that, but essentially you are developing your own replication engine. Those things can be a bit tricky to get right (what happens if m1 and m3 are executed on replica r1, but m2 isn't?). I wouldn't want to develop something like that unless you are sure you have the resources to make it work.
I would look into existing off-the-shelf replication solutions. If you are already using a SQL DB, it probably has some support for it. Look here for more details if you are using MySQL.
Alternatively, if you are willing to explore other backends, I heard that CouchDB has great support for replication. I also heard of people using git libraries to do that sort of thing.
Update: After your comment, I realize you already use MySQL replication and are looking for a solution for re-syncing the databases after being offline.
Even in that case RabbitMQ doesn't help you at all, since it requires a constant connection to work, so you are back to square one. The easiest solution would be to just write all the changes (SQL commands) into a text file at the remote location; then, when you get the connection back, copy that file (scp, ftp, email or whatever) to the master server, run all the commands there and then just resync all the replicas.
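A rough sketch of that "write the changes to a text file and replay them later" idea is shown below. The OfflineSqlJournal class is hypothetical, it assumes one SQL statement per line (no line breaks inside a statement), and the executor passed to replay stands in for whatever actually runs the statement on the master server.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical journal of SQL statements recorded while the shop is offline. */
final class OfflineSqlJournal {
    private final Path file;

    OfflineSqlJournal(Path file) {
        this.file = file;
    }

    /** Appends one statement as one line; assumes the statement contains no line breaks. */
    void record(String sql) throws IOException {
        Files.write(file, Collections.singletonList(sql), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Hands the recorded statements to the executor in order, then clears the journal; returns the count. */
    int replay(Consumer<String> executor) throws IOException {
        if (Files.notExists(file)) {
            return 0;
        }
        List<String> statements = Files.readAllLines(file, StandardCharsets.UTF_8);
        for (String sql : statements) {
            executor.accept(sql);
        }
        // The file is deleted only after every statement was accepted; an exception above keeps it for a retry.
        Files.delete(file);
        return statements.size();
    }
}
```

Because a failed replay leaves the whole file in place, the executor should run the statements inside a single transaction on the master; otherwise a retry would apply the earlier statements twice.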
Depending on your specific project, you may also need to make sure there are no conflicts when running commands from different remote locations, but there is no general technical solution to this. Again, depending on the project, you may want to cancel one of the transactions, notify the users that it happened and so on.
I would recommend taking a look at CouchDB. It's a non-SQL database that does exactly what you are describing automatically. It's used especially in phone applications that often don't have internet or data connectivity. The idea is that you have a local copy of a CouchDB database and one or more remote CouchDB databases. The CouchDB server then takes care of the replication between the distributed systems, and you always work off your local database. This approach is nice because you don't have to build your own distributed replication engine. For more details I would take a look at the 'Distributed Updates and Replication' section of their documentation.
What I want to do: My application has a full connection to a Derby DB, and I want to poke around in the DB (read-only) in parallel (using a different tool).
I'm not sure how Derby actually works internally, but I understand that I can have only 1 active connection to a Derby DB.
However, since the DB is only consisting of files on my HDD, shouldn't I be able to open additional connections to it, in read-only mode?
Are there any tools to do just that?
There are two possibilities how to run Apache Derby DB.
Embedded: You run the DB within your application → only that application's JVM can access the database
Client: You start the DB as a server in a separate process → classic DB with many connections
You can tell which one you are using from the size of the driver JAR: if the driver is larger than 2 MB, you are using the embedded version.
Update
When you start up the Derby engine (server or embedded), it gets exclusive access to the database files.
If you need to access a single database from more than one Java Virtual Machine (JVM), you will need to put a server solution in place. You can allow applications from multiple JVMs that need to access that database to connect to the server.
For details see Double-booting system behavior.
I realize this is an old question, but I thought I might add a little more detail on a solution since links in the currently accepted answer are broken.
It is possible to run the Derby Network Server within a JVM that is using the embedded database already. The code that is using the embedded Derby database doesn't need to change anything and can keep using the DB as is, but with the Derby Network Server started, other programs can connect to derby and access the database.
All you need to do is ensure that derbynet.jar is on the classpath.
Then you can do one of the following:
Include the following line in the derby.properties file: derby.drda.startNetworkServer=true
Specify the property as a system property at Java start:
java -Dderby.drda.startNetworkServer=true
You can use the NetworkServerControl API to start the Network Server from a separate thread within a Java application:
import java.io.PrintWriter;
import org.apache.derby.drda.NetworkServerControl;
NetworkServerControl server = new NetworkServerControl();
server.start(new PrintWriter(System.out));
More details here: http://db.apache.org/derby/docs/10.9/adminguide/tadminconfig814963.html
Keep in mind that doing this does not enable any security on this connection, so it is not a good idea to do this on a production system. It is possible to add security though and that is documented here: http://db.apache.org/derby/docs/10.9/adminguide/cadminnetservsecurity.html
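Once the Network Server is running inside the application, a second tool or program can connect with the Derby client driver instead of the embedded one. The sketch below assumes the server listens on Derby's default port 1527 on the same machine and that derbyclient.jar is on the other program's classpath; the DerbyClientUrl class is hypothetical, and the database name you pass in depends on your setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper for reaching a Derby Network Server from a second JVM. */
final class DerbyClientUrl {
    /** Default port of the Derby Network Server. */
    static final int DEFAULT_PORT = 1527;

    /** Builds a client (network) JDBC URL, as opposed to the embedded jdbc:derby:<database> form. */
    static String of(String host, int port, String database) {
        return "jdbc:derby://" + host + ":" + port + "/" + database;
    }

    /** Opens a connection to the local server; needs derbyclient.jar and a running Network Server. */
    static Connection connect(String database) throws SQLException {
        Connection connection = DriverManager.getConnection(of("localhost", DEFAULT_PORT, database));
        // Ask the driver for a read-only connection, since the goal is only to look at the data.
        connection.setReadOnly(true);
        return connection;
    }
}
```

With this, DerbyClientUrl.of("localhost", 1527, "mydb") gives the URL to paste into a database browser, while the application itself keeps using its embedded connection unchanged.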
Two other ideas:
In your application, shut down the database and close the connection when the database is not actively in use. Then your application won't interfere with another tool which is trying to open the database.
Make a copy of your database, by taking a backup (you can do this while the database is open by your application), then restore that backup to a separate place on your disk. Then you can use another tool to access the copied database at your ease.
If you can afford the memory and do not need up-to-date data, then you can access read-only databases from multiple JVMs by creating in-memory copies:
ij> connect 'jdbc:derby:memory:memdb;restoreFrom=mydb';