What is a stream?
A stream is a collection of potentially unlimited data made available over time. It can be thought of as a conveyor belt with pieces of data on it. Streams are fundamentally important in computing and programming as a whole. For instance:
- Data which is transported over a network is usually streamed from the server to the client.
- Reading from and writing to the console (standard input and output) also happen through stream interfaces.
- Writing to files also happens through a stream interface.
- Linux system commands such as `grep` and `sed` have stream interfaces for efficient operations (`sed` literally means "stream editor").
- There is the word 'stream' in video streaming.
So what does this mean to us as developers? Essentially, all reading and writing in a computer system happens through stream interfaces, i.e. every external interface your program interacts with will likely pull or push data in chunks rather than all at once.
Understanding streams is vital to building memory-efficient systems which maximize performance and can scale easily.
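To make "pull or push data in chunks" concrete, here is a minimal Python sketch; the function name and chunk size are my own choices, not from any particular library:

```python
# Read a file in fixed-size chunks instead of loading it whole;
# only one chunk lives in memory at a time.
def read_in_chunks(path, chunk_size=4096):
    with open(path, "rb") as f:          # open an input stream to the file
        while True:
            chunk = f.read(chunk_size)   # pull the next piece off the "conveyor belt"
            if not chunk:                # an empty bytes object means end of stream
                break
            yield chunk
```

This is exactly the memory efficiency streams buy you: a multi-gigabyte file can be processed with only a few kilobytes resident at any moment.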
Stream Terminologies
The way programming languages work with streams is that data comes in through the standard input stream, gets processed, and is returned through the standard output stream. Interestingly, there is also a third stream, the standard error stream, which is handled separately for logging and debugging purposes. Also:
- Flushing a stream means forcing any buffered data to be delivered to its destination.
- Closing a stream means flushing any remaining data to its destination and releasing the resources the stream holds.
- Piping means connecting the output stream of one operation to the input stream of another operation (mostly seen in Linux commands). For example:

```shell
ps axwww | grep -E "code"
```

The `ps` command returns an output stream containing all the processes currently running on the system. The pipe (`|`) character passes the output stream of that first operation into the input stream of the `grep` command, which applies the "code" filter and writes the result to the standard output (console).
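The first two terms, flushing and closing, are easy to see with an ordinary file stream. A minimal Python sketch (the file name is arbitrary):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w")        # open an output stream to the file
f.write("hello, stream")   # the data may still sit in an in-memory buffer

f.flush()                  # flushing: force buffered data to its destination
with open(path) as r:
    print(r.read())        # the data has now reached the file on disk

f.close()                  # closing: flush anything left and release the stream
```

Forgetting to flush (or close) is a classic source of "why is my file empty?" bugs, precisely because writes go to a buffer first.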
Streams in Different Programming Languages
I took some time to observe how four of the most popular programming languages approach streams: Python, JavaScript, Java and C#. I focused on console as well as file interfaces and, well, one of them was rather different.
Java
The `System.out` field of the `System` class represents the standard output stream (console stream), `System.in` corresponds to the standard input stream, and `System.err` corresponds to the standard error stream.
```java
public static void main(String[] args) {
    System.out.println("Hello World");
}
```
For file interfaces, however, Java uses the `java.io.FileWriter` and `java.io.FileReader` classes to write to and read from files. They inherit from `OutputStreamWriter` and `InputStreamReader` respectively, which shows that Java works in streams.
Trust me, the object inheritance hierarchy is confusing. You can check the docs for more info.
Python
Python, the most used language of 2024.
I somehow felt betrayed when I found out that the `print` and `input` functions are wrappers over the `sys.stdout.write()` and `sys.stdin.readline()` methods respectively. One thing to note here is that these are the same `read()` and `write()` methods found on file objects, which are instances of the `io.TextIOWrapper` class, a class for interacting with text streams. Basically, `sys.stdout` and `sys.stdin` are file-like objects, i.e. streams.
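A quick way to see this for yourself, a minimal sketch using only the standard library:

```python
import io
import sys

# print() is a convenience wrapper over the write() method of a stream;
# by default that stream is sys.stdout:
sys.stdout.write("written via sys.stdout.write\n")
print("written via print")

# Because print only needs a file-like object, it can write to any stream,
# such as an in-memory io.StringIO:
buffer = io.StringIO()
print("captured", file=buffer)
print(buffer.getvalue())  # prints: captured
```

The `file=` argument makes the wrapper relationship explicit: any object with a `write()` method, i.e. any stream, will do.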
C#
The `Console.Write()` method writes a "text representation of an object to the standard output", optionally using a format string. When it comes to files, C# is a little more upfront: it uses a `StreamWriter` class, which writes to an output stream (a file).
JavaScript (Node JS)
Most programming languages treat stream operations as blocking, synchronous operations by default, but JavaScript again proves to be different by making them non-blocking and asynchronous.
```javascript
const { readFile } = require("fs");

// Node's callback convention: the error comes first, then the data
function displayContent(error, text) {
  if (error) {
    throw error;
  }
  console.log(text);
}

readFile("filename.ext", "utf-8", displayContent);
```
That is why most Node.js functions take a callback argument. Basically, when the reading is complete, the callback function (`displayContent`) is called. Until then, Node.js does things that are worth its time. This has real advantages: improved performance and scalability compared to blocking implementations in other languages.
But if you still find it awkward, you can use `readFileSync` (notice the 'Sync'; most `fs` operations have a `*Sync` variant) to read files synchronously, making it a blocking operation.
```javascript
const { readFileSync } = require("fs");

const text = readFileSync("filename.ext", "utf-8");
console.log(text);
```
Also, `console.log` is a thin wrapper over the `process.stdout.write` method.
In JavaScript, however, I mostly dealt with file and console streams rather than network streams. Network streams in Node.js involve event-driven programming, which is the weirdest implementation of them all.