I'm attempting to submit a large database containing many tables to a web service by sending the data via JSON. Extracting the data and converting it to a JSON string is working fine but so far I have only implemented it to send one table at a time each with its own ASIHTTPRequest. My question is whether or not concatenating all the JSON strings generated from each table is a good idea or if I should first combine the tables in their abstract data form, before converting all of them together to JSON?
Alternatively if there is any other suggestion that would be good too.
It entirely depends on your needs. If the tables are unrelated, multiple requests may be more appropriate because if a request fails (timeouts or loss of connection), it won't affect any other requests. However if you have tables with associations with one another, it would be better to send it all in one go so either all the data transmitted wholly or did not so you don't end up with broken associations.
You can't just "concatenate" JSON strings. The result will not be legal JSON. You need to somehow "splice" them.
And, of course, the server on the other end must be capable of parsing the resulting JSON -- it may only expect one table at a time, eg.
I dont see any issue in doing any one of the two choices you proposed
But i would suggest concatenate the tables in the database before converting so that you dont deal with string concatenations and other form of processes
Related
Is it possible to deserialize a subset of fields from a large object serialized using Apache Avro without deserializing all the fields? I'm using GenericDatumReader and the GenericRecord contains all the fields.
I'm pretty sure you can't do it using GenericDatumReader, but my question is whether it is possible given the binary format of Avro.
Conceptually, binary serialization of Avro data is in-order and depth-first. As you traverse the data, record fields are serialized one after the other, lists are serialized from the top to the bottom, etc.
Within one object, there no markers to separate fields, no tags to identify specific fields, and no index into the binary data to help quickly scan to specific fields.
Depending on your schema, you could write custom code to skip some kinds of data ... for example, if a field is a LIST of FIXED bytes, you could read the size of the list and just jump over the data to the next field. This is pretty specific and wouldn't work for most Avro types though (notably integers are variable length when encoded).
Even in that unlikely case, I don't believe there are any helpers in the Java SDK that would be useful.
In brief, Avro isn't designed to do that, and you're probably not going to find a satisfactory way to do a projection on your Schema without deserializing the entire object. If you have a collection, column-oriented persistence like Parquet is probably the right thing to do!
It is possible if the fields you want to read occur first in the record. We do this in some cases where we want to read only the header fields of an object, not the full data which follows.
You can create a "subset" schema containing just those first fields, and pass this to GenericDatumReader. Avro will deserialise those fields, and anything which comes after will be ignored, because the schema doesn't "know" about it.
But this won't work for the general case where you want to pick out fields from within the middle of a record.
In my project I have to parse JSON schema, that comes from server.
It has object "Properties", which in fact like Dictionary in curly braces. And, of course, JSONSerialization.jsonObject parses it as Dictionary.
Everything looks like OK, BUT: I use these Properties for building my view (it defines fields to be fiiled by user). Finally, I have to save order of these fields! But, as we know, immediately after the object is parsed to Dictionary, it looses keys order. Anybody knows how can I parse these object, saving fields order?
Additional information:
Structure of Properties is build by user in WEB, so their count is avsolutely random for mobile client. Furthermore, Every object in properties (e.g. Group) can have its own properties, containing other objects. So we have absolutely random tree of nested objects. And their order is necessary for us.
If you don't care about interoperability, meaning 3rd parties also being able to rely on order, you can try to find a parser that preserves order (such as by reading it into an OrderedMap in Python instead of a regular dict- obviously this will differ by language.)
If you care about 3rd parties, it's trickier. As the last person to respond noted, JSON itself does not support this, and JSON Schema is just JSON as far as parsing goes.
I am building a DW. The sources are comming from rest API that returns Json. I need to design a staging area. I think I have 2 approaches:
1. Transform Json into a relational model.
2. Store the Json into a relational table using a key value. The key is going to be a field that I will use to performs join. The value is going to be the Json.
The first one is a by the book approach but I think it's harder to maintain. The second one is easier to maintain, but complicated to do complex queries.
Which are the drawbacks from each solution? Opinions are accepted.
Approach 1 is good for Data warehouse and second approach fits on Data lake scenario .
JSON - Jtore whole details intact to one document, We will be storing key unnecessary (increase size of data base performance hit) for each document , it will complicated /performance hit query where result recurred cross doc which is a common in DW scenarion ..
I want to know which is more efficient in terms of speed and property limitations of Neo4j.. (I'm using Ruby on Rails 3.2 and REST)
I'm wondering whether I should be storing node properties in a single property, much like a database table, or storing most/all for a node in a single node property but in JSON format.
Right now in a test system I have 1000 nodes with a total of 10000 properties.. Obviously the number of properties is going to skyrocket as more features and new node types are added to my system.
So I was considering storing all the non-searchable properties for a node in an embedded JSON structure.. Except this seems like it will put more burden on the web servers, having to parse the JSON after retrieving it, etc. (I'm going to use a single property field with JSON for activity feed nodes, but I'm addressing things like photo nodes, profile nodes etc).
Any advice here? Keep things in separate properties? A hybrid of JSON and individual properties?
What is your goal by storing things in JSON? Do you think you'll hit the 67B limit (which will be going up in 2.1 in a few months to something much larger)?
From a low level store standpoint, there isn't much difference between storing a long string and storing many shorter properties. The main thing you're doing is preventing yourself from using those fields in a query.
Also, if you're using REST, you're going to have to do JSON parsing anyway, so it's not like you're going to completely avoid that.
I'm trying to convert an XML document into a dataset that I can import into a database (like SQLite or MySQL) that I can query from.
It's an XML file that holds most of the stuff in attributes. This is part of a Rails project so I'm very inclined to use Ruby (and that's the language I'm most comfortable with at the moment).
I'm not sure how to go about doing that and I'd welcome both high-level and low-level contributions.
xmlsimple can convert your xml into a Ruby object (or nested object) which you can then look over and do whatever you like with. Makes working XML in Ruby really easy. As Jim says though depends on your XML complexity and your needs.
There are three basic approaches:
Use ruby's xml stream parsing facilities to process the data with ruby code and write the appropriate rows to the database.
Transform the xml using xslt to a non-XML stream format and feed that into a ruby program that updates the database
Transform the xml with xslt into a format acceptable to the bulk-loading tool for whatever database you are using.
Only you can determine the best approach depending on the XML schema complexity and the type of mapping you have to perform to get it into relational format.
It might help if you could post a sample of the XML and the DB schema you have to populate.
Will it load model data? If you're on *nix take a look at libxml-ruby. =)
With it you can load the XML, and iteration through the nodes you can create your AR objects.
You can have a look at the XMLMapping gem. It lets you define different classes depending upon the structure of your XML. Now you can create objects from those classes.
Now you will have to write some module which actually converts these XMLMapping objects into ActiveRecord objects. Once those are converted to AR objects you can simply call save to save those objects into the corresponding tables.
It is a long solution but it will let you create objects out of your XML without iterating over it. XMLMapping will do it for you.
Have you considered loading the data into an XML database?
Without knowing what the structure of the data is, I have no idea what the benefits of an RDBMS over an XML DB are.