Splitting up EDI orders - edi

We are implementing edi order processing and are currently about to check how we set up the interface.
Usually a order may carry multiple positions. For our customer it seems that an incoming order may need to be split up in multiple orders if they will be manufactured in different factories due to erp system requirements.
For me its sounds confusing if a customer does one order and gets back 3 different order numbers. One solution could be one position per order policy, however this increases the number of orders even when not needed.
The customer is sending an "ORDERS" message and will receive an order confirmation "ORDRSP".
I'm wondering whats the common solution to solve this problem? Manually keeping track of the sub orders thems to be no option has they are confirmed at different times, etc..

different solutions:
You split up orders: retailer will receive different shipments, different ASN's, etc (all referring to the same RetailersOrderNumber). But...not all retailers want this/can handle this.
Retailer splits up the orders (in most cases this means you will be 3 different manufacturers in their system ;-)
Factories deliver to one (manufacturers) DC where all the order/shipment handling is done.
(AFAIK these are the possibilities....)
kind regards, henk-jan

Related

Rails & Postgres: Best practice for many boolean relationship entries

I'm looking for input on what's best practice for my problem. In abstract terms, I need to store a lot of relationships and need to prioritise between database size, query speed and “ease of maintenance”. The setup is Ruby on Rails with PostgreSQL.
More specific description: Imagine a website that's essentially a searchable database of Products sold by Vendors. Vendors may not ship worldwide, and I want to filter out Products by Vendors who won't ship to a user based on their GeoIP country. To make matters slightly more complex, a Product page can have two different features (let’s call them F1 and F2) that are separately geo-IP-dependent.
Example: A Vendor may want their Product pages to have feature F1 for all countries worldwide except a few e.g. because of embargos; but feature F2 may only be available for countries within e.g. Europe.
Country filters will always be set at the Vendor level.
The “search” function of the website is a basic SQL search in which I want Products to show up if at least one of the features is available for the current user’s country.
The website will allow in the range of 1,000 to 3,000 Vendors (this is a hard limit), and a total of around 10,000 to 50,000 Products. Let's assume that in the beginning, filtering is only relevant to around 100 Vendors.
I had the following ideas and hope that others have feedback on these, or additional approaches:
One relation model CountryVendor with two boolean columns (in which case optionally, a Product could still be shown if the respective country_vendor does not yet exist; i.e. show if !country_vendor&.allow).
Assuming ca. 200 countries, this would imply ca. 2,000 rows in the beginning, and around 600k rows if filters were in place for every one of the 3,000 potential Vendors.
(Theoretically, if non-existence is treatet as true, I could also set up a rake task that removes rows that are false for both features, thus reigning in the table size.)
Two relation models CountryVendorF1 and CountryVendorF2, each with just one boolean column. Not sure if this will effectively be much different, but I imagine it closer to how I think of the UI for setting up the country filters (without going into detail here).
Two JSON columns in the Vendors table that would store true/false for each Country. (Maybe with an ISO code string as the index for simplicity.) There wouldn’t be thousands of new rows, although the DB would still grow in size, but querying might become slow.

Fact table linked to Slowly Changing Dimension

I'm struggling to understand the best way to model a particular scenario for a data warehouse.
I have a Person dimension, and a Tenancy dimension. A person could be on 0, 1 or (rarely) multiple tenancies at any one time, and will often have a succession of tenancies over time. A tenancy could have one or more people associated with it. The people associated with a tenancy can change over time, and tenancies generally last for many years.
One option is to add tenancy reference, start and end dates to the Person Dimension as type 2 SCD columns. This would work well as long as I ignore the possibility of multiple concurrent tenancies for a person. However, I have other areas of the data warehouse where I am facing a similar design issue and ignoring multiple relationships is not a possibility.
Another option is to model the relationship as an accumulating snapshot fact table. I'm not sure how well this would work in practice though as I could only link it to one version of a Person and Tenancy (both of which will have type 2 SCD columns) and that would seem to make it impossible to produce current or historical reports that link people and tenancies together.
Are there any recommended ways of modelling this type of relationship?
Edit based on the patient answer and comments given by SQL.Injection
I've produced a basic model showing the model as described by SQL.Injection.
I've moved tenancy start/end dates to the 'junk' dimension (Dim.Tenancy) and added Person tenancy start/end dates to the fact table as I felt that was a more accurate way to describe the relationship.
However, now that I see it visually I don't think that this is fundamentally any different from the model that I started with, other than the fact table is a periodic snapshot rather than an accumulating snapshot. It certainly seems to suffer from the same flaw that whenever I update a type 2 slowly changing attribute in any of the dimensions it is not reflected in the fact.
In order to make this work to reflect current changes and also allow historical reporting it seems that I will have to add a row to the fact table every time a SCD2 change occurs on any of the dimensions. Then, in order to prevent over-counting by joining to multiple versions of the same entity I will also need to add new versions of the other related dimensions so that I have new keys to join on.
I need to think about this some more. I'm beginning to think that the database model is right and that it's my understanding of how the model will be used that is wrong.
In the meantime any comments or suggestions are welcome!
Your problem is similar to to the sale transactions with multiple item. The difference, is that a transaction usually has multiple items and your tenancy fact usually has a single person (the tenant).
Your hydra is born because you are trying to model the tenancy as a dimension, when you should be modeling it as a fact.
The reason why I think you have a tenancy dimension, is because somewhere you have a fact rent. To model the fact rent consider use the same approach i stated above, if two persons are tenants of the same property two fact records should be inserted each month:
1) And now comes some magic (that is no magic at all), split the value of the of the rent by the number of tenants and store it the fact
2) store also the full value of the rent (you don't know how the data scientist is going to use the data)
3) check 1) with the business user (i mean people that build the risk models); there might be some advanced rule on how to do the spliting (a similar thing happens when the cost of shipping is to be divided across multiple item lines of the same order -- it might not be uniformly distributed)

How to create an Order Model with a type field that dictates other fields

I'm building a Ruby on Rails App for a business and will be utilizing an ActiveRecord database. My question really has to do with Database Architecture and really the best way I should organize all the different tables and models within my app. So the App I'm building is going to have a database of orders for an ECommerce Business that sells products through 2 different channels, a subscription service where they pick the products and sell it for a fixed monthly fee and a traditional ECommerce channel, where customers pay for their products directly. So essentially while all of these would be classified as the Order model, there are two types of Orders: Subscription Order and Regular Order.
So initially I thought I would classify all this activity in my Orders Table and include a field 'Type' that would indicate whether it is a subscription order or a regular order. My issue is that there are a bunch of fields that I would need that would be specific to each type. For instance, transaction_id, batch_id and sub_id are all fields that would only be present if that order type was a subscription, and conversely would be absent if the order type was regular.
My question is, would it be in my best interest to just create two separate tables, one for subscription orders and one for regular orders? Or is there a way that fields could only appear conditional on what the Type field is? I would hate to see so many Nil values, for instance, if the order type was a regular order.
Sorry this question isn't as technical as it is just pertaining to best practice and organization.
Thanks,
Sunny
What you've described is a pattern called Single Table Inheritance — aka, having one table store data for different types of objects with different behavior.
Generally, people will tell you not to do it, since it leads to a lot of empty fields in your database which will hurt performance long term. It also just looks gross.
You should probably instead store the data in separate tables. If you want to get fancy, you can try to implement Class Table Inheritance, in which there are actually separate but connected table for each of the child classes. This isn't supported natively by ActiveRecord. This gem and this gem might be able to help you, but I've never used either, so I can't give you a firm recommendation.
I would keep all of my orders in one table. You could create a second table for "subscription order information" that would only contain the columns transaction_id, batch_id and sub_id as well as a primary key to link it back to the main orders table. You would still want to include an order type column in the main database though to make it a little easier when debugging.
Assuming you're using Postgres, I might lean towards an Hstore for that.
Some reading:
http://www.devmynd.com/blog/2013-3-single-table-inheritance-hstore-lovely-combination
https://github.com/devmynd/hstore_accessor
Make an integer column called order_type.
In the model do:
SUBSCRIPTION = 0
ONLINE = 1
...
It'll query better than strings and whenever you want to call one you do Order:SUBSCRIPTION.
Make two+ other tables with a foreign key equal to whatever the ID of the corresponding row in orders.
Now you can keep all shared data in the orders table, for easy querying, and all unique data in the other tables so you don't have bloated models.

Handling lots of COUNT queries for a report

I am putting together a report that shows statistical information about products for a company that owns those products. This report, in the form I need, contains as many as 150 'counts', because we are filling the table with the counts for 12 product types against 15 different statistical categories.
Here's the set up of the models. I'm afraid it's a little complicated!
Company is the entity accessing the report.
Company has many Products through Matchings; and
Product has many Companies through Matchings.
Matching belongs_to Order.
Example report:
___________|_Available/Active/Light Available/Active/Heavy (+12 columns)__
Perishable |
Intangible |
(+10 rows) |
The product types are in the Product table (they run down the left side of the report).
The categories across the top of the report are combinations of three criteria: two from Product and one from Order.
Example - for one cell in the Perishable row, show me how many matchings exist for whom the order type is 'active', the product's weight is 'light' and the product status is 'available'.
On its own the above query is not too bad, but if I keep going like this I'm going to have ~170 queries for this report - both an inelegant and highly impractical solution. Is there a magic ActiveRecord way to deal with this scenario?
You could always create a background job to run regularly and pre-cache the results, or pre-generate the entire report. This would free your users from having to sit and wait for 170 queries to run, and I assume it would be acceptable to have slightly stale results.
As for the elegance and practicality of it, the only magic you could use is SQL. Your object model wasn't built for reporting, don't feel bad about using a tool that was.
There is a statistics gem that does this sort of thing. It does allow you to cache the statistics.
I've used it for lightweight statistics like counts and averages but have never taken benchmarks, which is definitely something you'll want to do if performance is a concern.

Generating sequential numbers in multi-user saas application

How do people generate auto_incrementing integers for a particular user in a typical saas application?
For example, the invoice numbers for all the invoices for a particular user should be auto_incrementing and start from 1. The rails id field can't be used in this case, as it's shared amongst all the users.
Off the top of my head, I could count all the invoices a user has, and then add 1, but does anyone know of any better solution?
Typical solution for any relation database could be a table like
user_invoice_numbers (user_id int primary key clustered, last_id int)
and a stored procedure or a SQL query like
update user_invoice_numbers set last_id = last_id + 1 where user_id = #user_id
select last_id from user_invoice_numbers where user_id = #user_id
It will work for users (if each user has a few simultaneously running transactions) but will not work for companies (for example when you need companies_invoice_numbers) because transactions from different users inside the same company may block each other and there will be a performance bottleneck in this table.
The most important functional requirement you should check is whether your system is allowed to have gaps in invoice numbering or not. When you use standard auto_increment, you allow gaps, because in most database I know, when you rollback transaction, the incremented number will not be rolled back. Having this in mind, you can improve performance using one of the following guidelines
1) Exclude the procedure that you use for getting new numbers from the long running transactions. Let's suppose that insert into invoice procedure is a long running transaction with complex server-side logic. In this case you first acquire a new id , and then, in separate transaction insert new invoice. If last transaction will be rolled back, auto-number will not decrease. But user_invoice_numbers will not be locked for long time, so a lot of simultaneous users could insert invoices at the same time
2) Do not use a traditional transactional database to store the data with last id for each user. When you need to maintain simple list of keys and values there are lot of small but fast database engines that can do that work for you. List of Key/Value databases. Probably memcached is the most popular. In the past, I saw the projects where simple key/value storages where implemented using Windows Registry or even a file system. There was a directory where each file name was the key and inside each file was the last id. And this rough solution was still better then using SQL table, because locks were issued and released very quickly and were not involved into transaction scope.
Well, if my proposal for the optimization seems to be overcomplicated for your project, forget about this now, until you will actually run into performance issues. In most projects simple method with an additional table will work pretty fast.
You could introduce another table associated with your "users" table that tracks the most recent invoice number for a user. However, reading this value will result in a database query, so you might as well just get a count of the user's invoices and add one, as you suggested. Either way, it's a database hit.
If the invoice numbers are independent for each user/customer then it seems like having "lastInvoice" field in some persistent store (eg. DB record) associated with the user is pretty unavoidable. However this could lead to some contention for the "latest" number.
Does it really matter if we send a user invoices 1, 2, 3 and 5, and never send them invoice
4? If you can relax the requirement a bit.
If the requirement is actually "every invoice number must be unique" then we can look at all the normal id generating tricks, and these can be quite efficient.
Ensuring that the numbers are sequenctial adds to the complexity, does it add to the business benefit?
I've just uploaded a gem that should resolve your need (a few years late is better than never!) :)
https://github.com/alisyed/sequenceid/
Not sure if this is the best solution, but you could store the last Invoice ID on the User and then use that to determine the next ID when creating a new Invoice for that User. But this simple solution may have problems with integrity, will need to be careful.
Do you really want to generate the invoice IDs in an incremental format? Would this not open security holes (where in, if a user can guess the invoice number generation, they can change it in the request and may lead to information disclosure).
I would ideally generate the numbers randomly (and keep track of used numbers). This prevents collisions as well (Chances of collision are reduced as the numbers are allocated randomly over a range).

Resources