Sending events (async data) in CANopen with more than 8 bytes of data - can-bus

We are developing a typical CANbus networked system with what you could call a controller organizing a number of devices.
The devices need configuration, which the controller writes (and might also read back) using regular object dictionary items (currently in the manufacturer-specific range).
The devices also perform actions (commands) with more than 8 bytes of data, and we solve this by having write-only items in the device object dictionary and relying on the regular segmentation/de-segmentation of SDOs. (I don't know if this is the CANopen way of doing things, but it seems reasonable.)
However, a device also produces events (say, some sensor data passes a certain threshold), resulting in more than 8 bytes of asynchronous data coming up from the device. PDOs are meant to be used for sending async event data, but a PDO can only carry 8 bytes. The devices could write the data into an object dictionary item on the controller, but this doesn't seem like the CANopen way. Am I right?
The best we've come up with is to send a PDO to the controller, informing it that more data is available in the object dictionary on the device.
Anyone with a CANopen background who can weigh in on the best (CANopen) way of solving this?
Since I'm repeating 8 bytes a lot, we can safely assume that this network is not running CAN-FD.

The key to any sensible CAN network design is to consider real-time behaviour, data priorities, bus load and data amounts early on. If you find yourself with a chunk of data larger than 8 bytes, that strongly suggests something is wrong in the design - it should likely be split into several packages.
Generally, you shouldn't be using SDOs for data at all, since they come with overhead. That includes writes to the object dictionary, which also means SDO access. Block transfers and the like with SDO are meant for things like bootloaders or one-time configuration, not for live data traffic in operational mode. It can be done, but it is fishy.
You can in theory map data across several PDOs with PDO mapping, but all of this really sounds like an "XY problem": you are convinced that you need to transmit larger chunks of data and are looking for a way to do it. But step 1 is to look at the fundamental network data/design and see if you actually need those large chunks, or if it makes sense to split them up. The ideal CANopen design is to have one PDO per type of data, where possible.
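If, after that analysis, you still occasionally need to move a blob larger than 8 bytes, one pragmatic option is a simple fragmentation scheme on top of plain frames. A rough sketch in Python (illustrative only, not a CANopen stack or any real API): reserve the first byte of each frame for a sequence counter so the receiver can reassemble the chunks in order.

# Hypothetical fragmentation sketch; frame_size matches classic CAN's 8-byte limit.
def fragment(payload, frame_size=8):
    chunk = frame_size - 1  # one byte reserved for the sequence counter
    for seq, offset in enumerate(range(0, len(payload), chunk)):
        yield bytes([seq]) + payload[offset:offset + chunk]

frames = list(fragment(b"\x01" * 20))  # a 20-byte event becomes 3 frames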

Related

What difference is there when sending a dictionary rather than a struct through Game Center?

I'm implementing a multiplayer mode for my iOS game using Game Center.
It is pretty much a turn-based RPG (but it uses the real-time model), so I don't really need to send data too often.
I've noticed that most examples send data using a struct. One property being the message kind, and the other properties just extra parameters to interpret the message.
I, personally, would enjoy sending a dictionary rather than a struct for no particular reason other than ease of work (in my specific case at least).
I want to know the objective differences between sending structs and dictionaries through Game Center, so I can measure whether it is actually worth doing one over the other.
Some factors:
It's a turn-based RPG. You wait for your rival to send a message with their decision. There is little dynamism, so data exchange is infrequent. If anything, I'd say it will usually take about 5 seconds for a player to make a decision.
All my data is sent in reliable mode.
My dictionary will usually contain around 3-5 keys, where the values will usually be NSString or NSNumber instances.
Dictionaries use a lot more space than a simple struct.
With dictionaries, you have a lot of flexibility: you can create your dictionary at run time and send it, as long as all the objects in your dictionary are serialisable. The code is also very easy to write.
Sending a dictionary is a great option, if you need to do things like send arrays, or send multiple objects that you are not aware of at compile time. In fact, if you know that all the objects that you use in your messages are serialisable, then you need just one encode / decode method that could be used for all the dictionaries.
Example: Convert nsdictionary to nsdata
Structs typically use less space.
You need to define your struct at compile time - you need to know the structure of your data that you plan to send. You can't do this dynamically.
If you need to send multiple arrays, you need to know the sizes beforehand, since you can't send multiple variable-length arrays at run time.
For your situation, if it is "turn based", and if a couple of seconds here or there won't make too much of a difference, then an NSDictionary may be more convenient for you.
If you were writing a first person shooter, or some kind of action game that needs to send data 30 or 60 times a second, then you would use the smallest format possible to send your data to avoid latency problems.
In an ideal world, regardless of whether it's turn based or an FPS, you wouldn't want to send more data than necessary, because you are wasting bandwidth and your app will take longer to send/receive messages. The optimal method would be to create your own format where you minimise the number of wasted bits (example: if you know that a number you send will never be greater than 10, then you only need 4 bits to represent it). In reality, though, coding such an efficient encoding/decoding method can be tricky.
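To make the size difference concrete, here is a small sketch (in Python, since the idea is language-agnostic; the key names are made up): two values that each fit in 4 bits can share a single byte, while a dictionary-style encoding of the same message costs an order of magnitude more.

import json

# Pack two small values (each 0-15) into one byte, per the 4-bit idea above.
def pack_nibbles(a, b):
    assert 0 <= a <= 15 and 0 <= b <= 15
    return bytes([(a << 4) | b])

packed = pack_nibbles(7, 3)                   # 1 byte on the wire
as_dict = json.dumps({"move": 7, "turn": 3})  # about 22 bytes as a dictionary
a, b = packed[0] >> 4, packed[0] & 0x0F       # and unpacking is one line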

Erlang in-memory linked list -- Exhausting memory?

I tried to make an Erlang in-memory datastore that would receive messages and add them to a list. Here's the current incarnation. The trouble is, I'm receiving about 200 messages per second and this easily exhausts the memory available.
Once a minute, I send a {write, Pid} message that should clear out and clean up this list, but it doesn't look like it's being garbage collected.
What am I doing wrong? I think I'm approaching this from the completely wrong direction...
datastore(Db) ->
    receive
        {put, Data} ->
            % Prepend the new data and loop with the grown list.
            datastore(lists:concat([Data, Db]));
        {write, Responder} ->
            % Dump the accumulated list to a scratch file, tell the
            % responder to load it, then restart with an empty list.
            ScratchName = "ScratchFile.dat",
            {ok, ScratchDevice} = file:open(ScratchName, [write]),
            file:write(ScratchDevice, Db),
            ok = file:close(ScratchDevice),
            Responder ! {load, ScratchName},
            datastore([])
    end.
First spontaneous comment is that file:open will open the file, truncate it, and then write to it. So every pass through the loop will overwrite any previous data. If the Responder is slow loading the file, the file could contain data you did not expect.
Second reaction is that you don't have to do this buffering yourself. If you open the file with the option {delayed_write, Size, Delay}, and set Size and Delay to values that fit your purpose, you get precisely what you are trying to implement here by just writing all the time.
Third reaction is that you are probably doing the wrong thing if you use a file to communicate between different parts of your system. What are you attempting to do?
ps.
If you need a new random filename, you can easily generate one with erlang:now/0 and io_lib:format/2. As an added bonus they will sort in creation order.
This is a very wrong way of buffering in Erlang. Data structures such as ETS (http://www.erlang.org/doc/man/ets.html) have been designed to handle thousands and millions of in-memory Erlang terms with ease. Please do not use lists or queues for handling this much data. If one part of your code produces data that other parts of the application are supposed to consume, and you know the consumers will run more slowly than the producer, then you need a more robust way of buffering: ETS tables.
Another thing is that, usually, processes are a point of failure in a system. If a process is used to buffer or hold on to very essential data, even if that data is transient but critical to the system, what happens when that process exits or dies? ETS tables (of type public) have been designed so that they can provide data access to all processes, even applications, within the same VM. That way, all processes can use the data, reading as much as they want (concurrently), while you ensure consistency by having one writer/updater.
ETS tables rarely fail in an application, compared to the frequency at which processes fail. Recently, a method was introduced that helps to salvage the data in a failing ETS table: ets:give_away/3.
Another thing: in a comment above, you mentioned that you are working for a large company. Usually, with large teams, it's better to evaluate a number of options and run intensive tests, depending on the nature of the application you are developing. To avoid side effects, it's best to identify which data structures are best to use for what. For example, for in-memory storage capable of handling 200 messages per second, if tested properly, lists and files would lose against ETS tables.

Use it for JSON data transfer

I am trying to use RabbitMQ for a distributed system that would work something like:
a producer puts in a queue a JSON-formatted list of order ids
several consumers pull out of that queue, do the business logic with those order ids, and the result (also JSON formatted) is put back into another queue
from the second queue, another consumer will take the data and pass it back to the caller
I am still very new to RabbitMQ and I am wondering if this model is the right approach, given that the data should be back as fast as possible (sometimes in a matter of seconds, 5 max), so there are real-time requirements.
Also, how large can a message passed to a queue be? The JSON that the producer will get back will be fairly large, depending on what the consumer does.
Thanks for any ideas!
See page 47 in this presentation (InfoQ) for a great comparison between different messaging formats.
There's nothing wrong with the design you suggested.
The slight wrinkle is that enforcing "real time requirements" isn't straightforward. For instance, it's not currently possible to expire messages within a queue, so this would need to be handled by the clients when consuming messages.
The total size of messages in RabbitMQ <= 1.8.1 was bounded by the amount of available RAM. As of 2.0.0, it's bounded by the amount of available disk space (i.e. rabbit will page messages to disk if it's running low on memory). Individual message sizes are recorded as 32-bit integers (IIRC), so an individual message cannot be larger than ~4GB; if this is a problem, consider saving the JSONs to network storage and passing references to them in the messages. Other than this, there aren't any constraints.
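For what it's worth, a minimal sketch of the suggested pipeline in Python using the pika client. The broker address, the queue names ('order_ids', 'results') and the stand-in business logic are all placeholders.

import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='order_ids')
channel.queue_declare(queue='results')

# Producer: put a JSON-formatted list of order ids on the first queue.
channel.basic_publish(exchange='', routing_key='order_ids',
                      body=json.dumps([101, 102, 103]))

# Consumer: pull order ids, run the business logic, push the result to queue #2.
def handle(ch, method, properties, body):
    order_ids = json.loads(body)
    result = {oid: 'processed' for oid in order_ids}  # stand-in business logic
    ch.basic_publish(exchange='', routing_key='results', body=json.dumps(result))

channel.basic_consume(queue='order_ids', on_message_callback=handle, auto_ack=True)
channel.start_consuming()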

What kind of socket server protocol is efficient?

When I was writing a simple server for a simple client <> server multiplayer game, I thought of the following text-based protocol using a translation library. Basically, each command had a certain meaning, eg:
1 = character starts turning right
2 = character starts turning left
3 = character stops turning
4 = character starts moving forward
5 = character stops moving
6 = character teleports to x, y
So, the client would simply broadcast the following to inform that the player is now moving forward and turning right:
4
1
Or, to teleport to 100x200:
6#100#200
Where # is the parameter delimiter.
The socket connection would be connected to the player identifier, so that no identifier has to be broadcasted with the protocol to know what player the message belongs to.
Of course all data would be validated server side, but that is a different subject.
Now, this seems pretty efficient to me, only 2 bytes to inform the server that I am moving forward and turning right.
However, most "professional" code snippets I saw seemed to be sending objects or xml commands. This seems to require a lot more server resources to me, doesn't it?
Is my inexperienced logic of why my text based protocol would be efficient flawed? Or what is the recommended protocol for real-time action multiplayer games?
I want to set up a protocol that is as efficient as possible, because I do not want to use multiple clusters/servers to cover excessive amounts of bandwidth for my 2D multiplayer game, and to avoid synchronization problems and hassle.
However, most "professional" code snippets I saw seemed to be sending objects or xml commands. This seems to require a lot more server resources to me, doesn't it?
Is my inexperienced logic of why my text based protocol would be efficient flawed? Or what is the recommended protocol for real-time action multiplayer games?
Plain text is more expensive to send than a binary format containing the same information. For example, if you send only 1 byte of text, you can only send 10 different commands, the digits 0 to 9. A binary format can send as many different commands as there are different values that fit in a byte, i.e. 256.
As such, although you are thinking of objects as being large, in actual fact they are almost always smaller than the plain text representation of that same object. Often they are as small as is possible without compression (and you can always add compression anyway).
The benefits of a plain text format are that they are easy to debug and understand. Unfortunately you lose those benefits if you put your own encoding in there (eg. reducing commands down to single digits instead of readable names). The downside is that the format is bigger, and that you have to write your own parser. XML formats eliminate the second problem, but they can't compete with a binary format for pure efficiency.
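As a rough illustration of the size difference (Python here, but the point is language-agnostic), compare the teleport command from the question as delimited text versus a fixed binary layout:

import struct

text = b"6#100#200"                        # 9 bytes as delimited text
binary = struct.pack("!BHH", 6, 100, 200)  # 1-byte command + two 16-bit ints = 5 bytes
cmd, x, y = struct.unpack("!BHH", binary)  # and no string parsing needed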
You are probably overthinking this issue at this stage, however. If you're only sending information about events such as the commands you mention above, bandwidth will not be a concern. It's broadcasting information about the game state that can get expensive - but even that can be mitigated by being careful who you send it to, and how frequently. I would recommend working with whatever format is easiest for now, as this will be the least of your problems. Just make sure that your code is always in a state where you can change the message writing and reading routines later if you need.
You need to be aware of the latency involved in sending your data. "Start turning"/"stop turning" will be less effective if the time between the receipt of those packets is different than the time between sending them.
I can't speak for all games, but when I've worked on this sort of code we'd send orientation and position information across the wire. That way the receiver could do smoothing and extrapolation (figure out where the object should be "now" based on data that I have that is already known to be old). Different games will want to send different data, but generally speaking you will need to figure out how to make the receiver's display of the data match the sender's, so you'll need to send data that is resilient in the face of networking problems.
Also, many games use UDP for this sort of data transfer instead of TCP. UDP is unreliable, so you may not get all of your packets. That means that "stop moving now" or "start moving now" may not be received in pairs. When coding on top of UDP then it's even more important to send "this is the state right now" every so often so that clients get ample opportunity to sync up.
The common way is to use a binary format, not text, not xml. So with only one byte you can represent one of 256 different commands.
Also use UDP and not TCP. The game will be a lot more responsive with UDP in case of packet loss, and when packets are lost you can still extrapolate the movements. With each packet, send a packet number so that the server knows when the command was sent.
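A rough sketch of that idea in Python (the address, the field layout and the state fields are made up for illustration): prepend a packet number to each UDP state update so the receiver can discard stale or out-of-order packets.

import socket
import struct

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
seq = 0

def send_state(x, y, heading, addr=("127.0.0.1", 9999)):
    """Send 'this is the state right now' with a packet number prepended."""
    global seq
    sock.sendto(struct.pack("!Ifff", seq, x, y, heading), addr)
    seq += 1

# Receiver side: ignore anything older than what we've already applied.
last_seq = -1
def on_packet(data):
    global last_seq
    pkt_seq, x, y, heading = struct.unpack("!Ifff", data)
    if pkt_seq <= last_seq:
        return  # stale or duplicate packet, drop it
    last_seq = pkt_seq
    # ... apply the new position/heading here ...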
I highly recommend that you download the Quake source code where you can learn more about network programming in modern multiplayer games. It's really easy to read and understand.
edit:
I almost forgot..
Google's Protocol Buffers can be of great help when sending complex data structures.
I thought I would give my two cents and provide a practical application of what is being referred to as binary serialization. The concept is actually incredibly simple and only seems complicated from the outside.
You can send XML and have the server route the data within it to different functions within the server itself. You can also just send the server a single number that is stored within the server as a variable; after that, it can process the rest of the data and choose the correct course of action.
As an example, some rough code:
private const MOVE_RIGHT:int = 0;
private const MOVE_LEFT:int = 1;
private const MOVE_UP:int = 2;
private const MOVE_DOWN:int = 3;

private function processData(command:int):void
{
    switch (command)
    {
        case MOVE_RIGHT:
            // move the client's player to the right
            break;
        case MOVE_LEFT:
            // move the client's player to the left
            break;
        case MOVE_UP:
            // move the client's player up
            break;
        case MOVE_DOWN:
            // move the client's player down
            break;
    }
}
This is a very simple example and would need to be modified, but as you can see, you merely store the commands encoded as whole numbers and transmit those numbers. You can parse them and create headers of information to organize them into the different sections of data that need to be transmitted.
Also, it is better to use a UDP setup for games, because missing a single packet should NOT halt the gaming experience; instead, you should be able to handle the loss client-side AND server-side.

What is a stream? [duplicate]

I understand that a stream is a representation of a sequence of bytes. Each stream provides means for reading and writing bytes to its given backing store. But what is the point of the stream? Why isn't the backing store itself what we interact with?
For whatever reason this concept just isn't clicking for me. I've read a bunch of articles, but I think I need an analogy or something.
The word "stream" has been chosen because it represents (in real life) a very similar meaning to what we want to convey when we use it.
Let's forget about the backing store for a little, and start thinking about the analogy to a water stream. You receive a continuous flow of data, just like water continuously flows in a river. You don't necessarily know where the data is coming from, and most often you don't need to; be it from a file, a socket, or any other source, it doesn't (shouldn't) really matter. This is very similar to receiving a stream of water, whereby you don't need to know where it is coming from; be it from a lake, a fountain, or any other source, it doesn't (shouldn't) really matter.
That said, once you start thinking that you only care about getting the data you need, regardless of where it comes from, the abstractions other people talked about become clearer. You start thinking that you can wrap streams, and your methods will still work perfectly. For example, you could do this:
int ReadInt(StreamReader reader) { return Int32.Parse(reader.ReadLine()); }
// in another method:
Stream fileStream = new FileStream("My Data.dat", FileMode.Open);
Stream zipStream = new ZipDecompressorStream(fileStream);
Stream decryptedStream = new DecryptionStream(zipStream);
StreamReader reader = new StreamReader(decryptedStream);
int x = ReadInt(reader);
As you see, it becomes very easy to change your input source without changing your processing logic. For example, to read your data from a network socket instead of a file:
Stream stream = new NetworkStream(mySocket);
StreamReader reader = new StreamReader(stream);
int x = ReadInt(reader);
As easy as it can be. And the beauty continues, as you can use any kind of input source, as long as you can build a stream "wrapper" for it. You could even do this:
public class RandomNumbersStreamReader : StreamReader {
    private Random random = new Random();
    public RandomNumbersStreamReader() : base(Stream.Null) { }
    public override String ReadLine() { return random.Next().ToString(); }
}
// and to call it:
int x = ReadInt(new RandomNumbersStreamReader());
See? As long as your method doesn't care what the input source is, you can customize your source in various ways. The abstraction allows you to decouple input from processing logic in a very elegant way.
Note that the stream we created ourselves does not have a backing store, but it still serves our purposes perfectly.
So, to summarize, a stream is just a source of input, hiding away (abstracting) another source. As long as you don't break the abstraction, your code will be very flexible.
A stream represents a sequence of objects (usually bytes, but not necessarily so), which can be accessed in sequential order. Typical operations on a stream:
read one byte. Next time you read, you'll get the next byte, and so on.
read several bytes from the stream into an array
seek (move your current position in the stream, so that next time you read you get bytes from the new position)
write one byte
write several bytes from an array into the stream
skip bytes from the stream (this is like read, but you ignore the data. Or if you prefer it's like seek but can only go forwards.)
push back bytes into an input stream (this is like "undo" for read - you shove a few bytes back up the stream, so that next time you read that's what you'll see. It's occasionally useful for parsers, as is:
peek (look at bytes without reading them, so that they're still there in the stream to be read later)
A particular stream might support reading (in which case it is an "input stream"), writing ("output stream") or both. Not all streams are seekable.
Push back is fairly rare, but you can always add it to a stream by wrapping the real input stream in another input stream that holds an internal buffer. Reads come from the buffer, and if you push back then data is placed in the buffer. If there's nothing in the buffer then the push back stream reads from the real stream. This is a simple example of a "stream adaptor": it sits on the "end" of an input stream, it is an input stream itself, and it does something extra that the original stream didn't.
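As a toy illustration of such an adaptor (a Python sketch, not any particular library's API), a push-back wrapper around anything with a read(n) method could look like this:

class PushbackStream:
    def __init__(self, stream):
        self.stream = stream
        self.buffer = b""  # holds pushed-back bytes

    def read(self, n):
        # Serve pushed-back bytes first, then fall through to the real stream.
        out, self.buffer = self.buffer[:n], self.buffer[n:]
        if len(out) < n:
            out += self.stream.read(n - len(out))
        return out

    def unread(self, data):
        self.buffer = data + self.buffer  # the next read sees this first

    def peek(self, n):
        data = self.read(n)
        self.unread(data)
        return data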
Stream is a useful abstraction because it can describe files (which are really arrays, hence seek is straightforward) but also terminal input/output (which is not seekable unless buffered), sockets, serial ports, etc. So you can write code which says either "I want some data, and I don't care where it comes from or how it got here", or "I'll produce some data, and it's entirely up to my caller what happens to it". The former takes an input stream parameter, the latter takes an output stream parameter.
Best analogy I can think of is that a stream is a conveyor belt coming towards you or leading away from you (or sometimes both). You take stuff off an input stream, you put stuff on an output stream. Some conveyors you can think of as coming out of a hole in the wall - they aren't seekable, reading or writing is a one-time-only deal. Some conveyors are laid out in front of you, and you can move along choosing whereabouts in the stream you want to read/write - that's seeking.
As IRBMe says, though, it's best to think of a stream in terms of the operations it offers (which vary from implementation to implementation, but have a lot in common) rather than by a physical analogy. Streams are "things you can read or write". When you start connecting up stream adaptors, you can think of them as a box with a conveyor in, and a conveyor out, that you connect to other streams and then the box performs some transformation on the data (zipping it, or changing UNIX linefeeds to DOS ones, or whatever). Pipes are another thorough test of the metaphor: that's where you create a pair of streams such that anything you write into one can be read out of the other. Think wormholes :-)
A stream is already a metaphor, an analogy, so there's really no need to provide another one. You can think of it basically as a pipe with a flow of water in it where the water is actually data and the pipe is the stream. I suppose it's kind of a 2-way pipe if the stream is bi-directional. It's basically a common abstraction that is placed upon things where there is a flow or sequence of data in one or both directions.
In languages such as C#, VB.Net, C++, Java etc., the stream metaphor is used for many things. There are file streams, in which you open a file and can read from the stream or write to it continuously; There are network streams where reading from and writing to the stream reads from and writes to an underlying established network connection. Streams for writing only are typically called output streams, as in this example, and similarly, streams that are for reading only are called input streams, as in this example.
A stream can perform transformation or encoding of data (an SslStream in .Net, for example, will eat up the SSL negotiation data and hide it from you; a TelnetStream might hide the Telnet negotiations from you but provide access to the data; a ZipOutputStream in Java allows you to write files in a zip archive without having to worry about the internals of the zip file format).
Another common thing you might find is textual streams that allow you to write strings instead of bytes, or some languages provide binary streams that allow you to write primitive types. A common thing you'll find in textual streams is a character encoding, which you should be aware of.
Some streams also support random access, as in this example. A network stream, on the other hand, for obvious reasons, wouldn't.
MSDN gives a good overview of streams in .Net.
Sun also have an overview of their general OutputStream class and InputStream class.
In C++, here is the istream (input stream), ostream (output stream) and iostream (bidirectional stream) documentation.
UNIX like operating systems also support the stream model with program input and output, as described here.
The point is that you shouldn't have to know what the backing store is - it's an abstraction over it. Indeed, there might not even be a backing store - you could be reading from a network, and the data is never "stored" at all.
If you can write code that works whether you're talking to a file system, memory, a network or anything else which supports the stream idea, your code is a lot more flexible.
In addition, streams are often chained together - you can have a stream which compresses whatever is put into it, writing the compressed form on to another stream, or one which encrypts the data, etc. At the other end there'd be the reverse chain, decrypting, decompressing or whatever.
The answers given so far are excellent. I'm only providing another to highlight that a stream is not a sequence of bytes or specific to a programming language, since the concept is universal (while its implementation may be unique). I often see an abundance of explanations online in terms of SQL, or C, or Java, which makes sense, as a file stream deals with memory locations and low-level operations. But they often address how to create a file stream and operate on the file in the given language, rather than discuss the concept of a stream.
The Metaphor
As mentioned a stream is a metaphor, an abstraction of something more complex. To get your imagination working I offer some other metaphors:
you want to fill an empty pool with water. one way to accomplish this is to attach a hose to a spigot, placing the end of the hose in the pool and turning on the water.
the hose is the stream
similarly, if you wanted to refill your car with gas, you would go to a gas pump, insert the nozzle into your gas tank and open the valve by squeezing the locking lever.
the hose, nozzle and associated mechanisms to allow the gas to flow into your tank is the stream
if you need to get to work you would start driving from your home to the office using the freeway.
the freeway is the stream
if you want to have a conversation with someone you would use your ears to hear and your mouth to speak.
your ears and mouth are streams
Hopefully you notice in these examples that the stream metaphors only exist to allow something to travel through them (or on them, in the case of the freeway) and do not themselves always possess the thing they are transferring. An important distinction. We don't refer to our ears as a sequence of words. A hose is still a hose if no water is flowing through it, but we have to connect it to a spigot for it to do its job correctly. A car is not the only kind of vehicle that can traverse a freeway.
Thus a stream can exist that has no data travelling through it as long as it is connected to a file.
Removing the Abstraction
Next, we need to answer a few questions. I'm going to use files to describe streams so... What is a file? And how do we read a file? I will attempt to answer this while maintaining a certain level of abstraction to avoid unneeded complexity and will use the concept of a file relative to a linux operating system because of its simplicity and accessibility.
What is a file?
A file is an abstraction :)
Or, as simply as I can explain, a file is one part data structure describing the file and one part data which is the actual content.
The data structure part (called an inode in UNIX/linux systems) identifies important pieces of information about the content, but does not include the content itself (or a name of the file, for that matter). One of the pieces of information it keeps is a memory address where the content starts. So with a file name (or a hard link in linux), a file descriptor (a numeric file name that the operating system cares about) and a starting location in memory, we have something we can call a file.
(the key takeaway is a 'file' is defined by the operating system since it is the OS that ultimately has to deal with it. and yes, files are much more complex).
So far so good. But how do we get the content of the file, say a love letter to your beau, so we can print it?
Reading a file
If we start from the result and move backwards: when we open a file on our computer, its entire contents are splashed on our screen for us to read. But how? Very methodically, is the answer. The content of the file is itself another data structure. Suppose an array of characters. We can also think of this as a string.
So how do we 'read' this string? By finding its location in memory and iterating through our array of characters, one character at a time until reaching an end of file character. In other words a program.
A stream is 'created' when its program is called and it has a memory location to attach to or connect to. Much like our water hose example, the hose is ineffective if it is not connected to a spigot. In the case of the stream, it must be connected to a file for it to exist.
Streams can be further refined, e.g., a stream to receive input or a stream to send a file's contents to standard output. UNIX/linux connects and keeps open 3 file streams for us right off the bat: stdin (standard input), stdout (standard output) and stderr (standard error). Streams can be built as data structures themselves, or as objects, which allows us to perform more complex operations on the data streaming through them, like opening the stream, closing the stream, or error-checking the file a stream is connected to. C++'s cin is an example of a stream object.
Surely, if you so choose, you can write your own stream.
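As a small demonstration (a Python sketch this time), here is a stream of pseudo-random bytes with no file behind it at all, usable anywhere a readable binary stream is accepted:

import io
import random

class RandomByteStream(io.RawIOBase):
    """An endless read-only stream that manufactures its own data."""
    def readable(self):
        return True

    def readinto(self, b):
        for i in range(len(b)):
            b[i] = random.randrange(256)
        return len(b)

reader = io.BufferedReader(RandomByteStream())
chunk = reader.read(16)  # 16 "random" bytes, no backing store anywhere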
Definition
A stream is a reusable piece of code that abstracts the complexity of dealing with data while providing useful operations to perform on data.
The point of the stream is to provide a layer of abstraction between you and the backing store. Thus a given block of code that uses a stream need not care if the backing store is a disk file, memory, etc...
In addition to things mentioned above there is a different kind of streams - as defined in functional programming languages such as Scheme or Haskell - a possibly infinite datastructure which is generated by some function on-demand.
The word "stream" has been chosen because it represents (in real life) a very similar meaning to what we want to convey when we use it.
Start thinking about the analogy to a water stream. You receive a continuous flow of data, just like water continuously flows in a river. You don't necessarily know where the data is coming from, and most often you don't need to; be it from a file, a socket, or any other source, it doesn't (shouldn't) really matter. This is very similar to receiving a stream of water, whereby you don't need to know where it is coming from; be it from a lake, a fountain, or any other source, it doesn't (shouldn't) really matter. source
To add to the echo chamber, the stream is an abstraction so you don't care about the underlying store. It makes the most sense when you consider scenarios with and without streams.
Files are uninteresting for the most part because streams don't do much above and beyond what non-stream-based methods I'm familiar with did. Let's start with internet files.
If I want to download a file from the internet, I have to open a TCP socket, make a connection, and receive bytes until there are no more bytes. I have to manage a buffer, know the size of the expected file, and write code to detect when the connection is dropped and handle this appropriately.
Let's say I have some sort of TcpDataStream object. I create it with the appropriate connection information, then read bytes from the stream until it says there aren't any more bytes. The stream handles the buffer management, end-of-data conditions, and connection management.
In this way, streams make I/O easier. You could certainly write a TcpFileDownloader class that does what the stream does, but then you have a class that's specific to TCP. Most stream interfaces simply provide a Read() and Write() method, and any more complicated concepts are handled by the internal implementation. Because of this, you can use the same basic code to read or write to memory, disk files, sockets, and many other data stores.
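To make that concrete, a small Python sketch (the host and request are placeholders): viewing a TCP socket as a stream lets the same read loop work for a network download as for a local file.

import socket

sock = socket.create_connection(("example.com", 80))
sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
stream = sock.makefile("rb")  # the socket, viewed as a plain binary stream

data = bytearray()
while True:
    chunk = stream.read(4096)  # buffering and end-of-data are handled for us
    if not chunk:
        break
    data.extend(chunk)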
The visualisation I use is conveyor belts, not in real factories because I don't know anything about that, but in cartoon factories where items move along lines and get stamped and boxed and counted and checked by a sequence of dumb devices.
You have simple components that do one thing, for example a device to put a cherry on a cake. This device has an input stream of cherryless cakes, and an output stream of cakes with cherries. There are three advantages worth mentioning structuring your processing in this way.
Firstly it simplifies the components themselves: if you want to put chocolate icing on a cake, you don't need a complicated device that knows everything about cakes, you can create a dumb device that sticks chocolate icing onto whatever is fed into it (in the cartoons, this goes as far as not knowing that the next item in isn't a cake, it's Wile E. Coyote).
Secondly you can create different products by putting the devices into different sequences: maybe you want your cakes to have icing on top of the cherry instead of cherry on top of the icing, and you can do that simply by swapping the devices around on the line.
Thirdly, the devices don't need to manage inventory, boxing, or unboxing. The most efficient way of aggregating and packaging things is changeable: maybe today you're putting your cakes into boxes of 48 and sending them out by the truckload, but tomorrow you want to send out boxes of six in response to custom orders. This kind of change can be accommodated by replacing or reconfiguring the machines at the start and end of the production line; the cherry machine in the middle of the line doesn't have to be changed to process a different number of items at a time, it always works with one item at a time and it doesn't have to know how its input or output is being grouped.
When I heard about streaming for the first time, it was in the context of live streaming with a webcam. So, one host is broadcasting video content, and the other host is receiving the video content. So is this streaming? Well... yes... but a live stream is a concrete concept, and I think that the question refers to the abstract concept of Streaming. See https://en.wikipedia.org/wiki/Live_streaming
So let's move on.
Video is not the only resource that can be streamed. Audio can be streamed too, so we are talking about streaming media now. See https://en.wikipedia.org/wiki/Streaming_media . Audio can be delivered from source to target in numerous ways. So let's compare some data delivery methods to each other.
Classic file downloading
Classic file downloading doesn't happen in real time. Before you can use the file, you have to wait until the download is complete.
Progressive download
Progressive download chunks data from the streamed media file into a temporary buffer. Data in that buffer is workable: audio/video data in the buffer is playable. Because of that, users can watch or listen to the streamed media file while it downloads. Fast-forwarding and rewinding are possible, of course within the buffer. Still, progressive download is not live streaming.
Streaming
Happens in real time, and chunks data. Streaming is used in live broadcasts. Clients listening to the broadcast can't fast-forward or rewind. In video streams, data is discarded after playback.
A streaming server keeps a 2-way connection with its client, while a web server closes the connection after a server response.
Audio and video are not the only thing that can be streamed. Let's have a look at the concept of streams in the PHP manual.
a stream is a resource object which exhibits streamable behavior. That is, it can be read from or written to in a linear fashion, and may be able to fseek() to an arbitrary location within the stream.
Link: https://www.php.net/manual/en/intro.stream.php
In PHP, a resource is a reference to an external source like a file, database connection. So in other words, a stream is a source that can be read from or written to. So, If you worked with fopen(), then you already worked with streams.
An example of a Text-file that is subjected to Streaming:
// Let's say that cheese.txt is a file that contains this content:
// I like cheese, a lot! My favorite cheese brand is Leerdammer.
$fp = fopen('cheese.txt', 'r');
$str8 = fread($fp, 8); // read the first 8 characters from the stream
fseek($fp, 22); // move the stream's position indicator to offset 22 (0 = first position)
$str29 = fread($fp, 29); // read the next 29 characters from the stream
echo $str8; // Output: I like c
echo $str29; // Output: My favorite cheese brand is L
Zip files can be streamed too. On top of that, streaming is not limited to files. HTTP, FTP, SSH connections and Input/Output can be streamed as well.
What does wikipedia say about the concept of Streaming?
In computer science, a stream is a sequence of data elements made available over time. A stream can be thought of as items on a conveyor belt being processed one at a time rather than in large batches.
See: https://en.wikipedia.org/wiki/Stream_%28computing%29 .
Wikipedia links to this: https://srfi.schemers.org/srfi-41/srfi-41.html
and the writers have this to say about streams:
Streams, sometimes called lazy lists, are a sequential data structure containing elements computed only on demand. A stream is either null or is a pair with a stream in its cdr. Since elements of a stream are computed only when accessed, streams can be infinite.
So a Stream is actually a data structure.
My conclusion: a stream is a source that can contain data and that can be read from or written to in a sequential way. A stream does not read everything the source contains at once; it reads/writes sequentially.
Useful links:
http://www.slideshare.net/auroraeosrose/writing-and-using-php-streams-and-sockets-zendcon-2011 Provides a very clear presentation
https://www.sk89q.com/2010/04/introduction-to-php-streams/
http://www.netlingo.com/word/stream-or-streaming.php
http://www.brainbell.com/tutorials/php/Using_PHP_Streams.htm
http://www.sitepoint.com/php-streaming-output-buffering-explained/
http://php.net/manual/en/wrappers.php
http://www.digidata-lb.com/streaming/Streaming_Proposal.pdf
http://www.webopedia.com/TERM/S/streaming.html
https://en.wikipedia.org/wiki/Stream_%28computing%29
https://srfi.schemers.org/srfi-41/srfi-41.html
It's just a concept, another level of abstraction that makes your life easier. And streams all have a common interface, which means you can combine them in a pipe-like manner. For example, encode to base64, then zip, and then write to disk - all in one line!
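For instance, a rough Python equivalent of that composition (the file name is a placeholder): a gzip stream wrapped around a file stream, fed base64-encoded data.

import base64
import gzip

with gzip.open("payload.gz", "wb") as gz:     # a gzip stream over a file stream
    gz.write(base64.b64encode(b"some data"))  # encode to base64, zip, write to disk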
The best explanation of streams I've seen is chapter 3 of SICP. (You may need to read the first 2 chapters for it to make sense, but you should anyway. :-)
They don't use streams for bytes at all, but rather integers. The big points that I got from it were:
Streams are delayed lists
The computational overhead [of eagerly computing everything ahead of time, in some cases] is outrageous
We can use streams to represent sequences that are infinitely long (see the sketch below)
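The same idea can be sketched outside Scheme; a Python generator behaves like a delayed list, producing each element only when it is asked for:

# An "infinitely long" stream of the natural numbers, computed on demand.
def naturals():
    n = 0
    while True:
        yield n
        n += 1

stream = naturals()
first_five = [next(stream) for _ in range(5)]  # [0, 1, 2, 3, 4]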
Another point (for the file-reading situation):
a stream allows you to do something else before you have finished reading all of the file's content.
you can save memory, because you do not need to load the whole file content at once.
Think of streams as of an abstract source of data (bytes, characters, etc.). They abstract actual mechanics of reading from and writing to the concrete datasource, be it a network socket, file on a disk or a response from the web server.
I think you need to consider that the backing store itself is often just another abstraction. A memory stream is pretty easy to understand, but a file is radically different depending on which file system you're using, never mind what hard drive you are using. Not all streams do in fact sit on top of a backing store: network streams pretty much just are streams.
The point of a stream is that we restrict our attention to what is important. By having a standard abstraction, we can perform common operations. Even if you don't want to, for instance, search a file or an HTTP response for URLs today, doesn't mean you won't wish to tomorrow.
Streams were originally conceived when memory was tiny compared to storage. Just reading a C file could be a significant load. Minimizing the memory footprint was extremely important. Hence, an abstraction in which very little needed to be loaded was very useful. Today, it is equally useful when performing network communication and, it turns out, rarely that restrictive when we deal with files. The ability to transparently add things like buffering in a general fashion makes it even more useful.
A stream is an abstraction of a sequence of bytes. The idea is that you don't need to know where the bytes come from, just that you can read them in a standardized manner.
For example, if you process data via a stream then it doesn't matter to your code if the data comes from a file, a network connection, a string, a blob in a database etc etc etc.
There's nothing wrong per se with interacting with the backing store itself, except for the fact that it ties you to the backing store's implementation.
A stream is an abstraction that provides a standard set of methods and properties for interacting with data. By abstracting away from the actual storage medium, your code can be written without total reliance on what that medium is or even the implementation of that medium.
An good analogy might be to consider a bag. You don't care what a bag is made of or what it does when you put your stuff in it, as long as the bag performs the job of being a bag and you can get your stuff back out. A stream defines for storage media what the concept of bag defines for different instances of a bag (such as trash bag, handbag, rucksack, etc.) - the rules of interaction.
I'll keep it short; I was just missing the word here:
Streams are queues, usually stored in a buffer, containing any kind of data.
(Now, since we all know what queues are, there's no need to explain this any further.)
A stream is a highly abstracted metaphor and a strict contract: you can manipulate objects in sequence without concern about gaps. A stream must have no vacuum or gaps; its objects are arranged one by one, continuously. As a result, we never have to handle the case of a void while processing a stream, and we must never leave one while producing a stream. If you are constructing a stream, you must not leave any gaps in it.
Put another way, if there is a gap, it is not a stream. When you refer to a sequence as a stream, either you are warranted that there are no gaps in it, or you must keep the promise that there are no gaps in the sequence you produce.
To recap, think about a water stream. What is the most prominent characteristic of it?
Continuous!
The spirit of the abstraction of a stream is all about it.
