Programming With Streams in Java

Pritam Panhale
Published in The Startup
7 min read · Nov 8, 2020

This article is strongly influenced by a book by Jeanne Boyarsky and Scott Selikoff.

A stream in Java is a sequence of data.

A stream pipeline is the series of operations that run on a stream to produce a result.

Let's set Java aside for a moment and think about a biscuit factory. After the biscuits are made, they are placed on a conveyor belt for further processing until they are packed. Three people stand by the production line: one checks the quality of each biscuit, one groups the biscuits into bunches of 10, and the last one wraps each bunch into a pack.

In other words, there is a pipeline in which different operations are performed: quality check, grouping, wrapping, and packing. These happen sequentially and end when the biscuits are packed for final delivery. There is a stream of biscuits on which different intermediate operations are performed to get the desired final product (a pack of biscuits).

Likewise, in Java we can create a stream, perform some operations on it, and get the desired output. Usually we need to perform multiple operations on data to get it into its final format: we may need to filter it, map it to some other format, sort it, or collect it into a different data structure. There are many other operations available; these are just a sample.

The diagram below can be a general representation of the stream and its consumption.

  1. Source: where the stream comes from (usually a collection of data).
  2. Intermediate operation: transforms or converts the data to another form.
  3. Terminal operation: produces the actual, desired result. The stream pipeline ends here, and the stream cannot be traversed again.
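
The last point, that a stream cannot be traversed again after a terminal operation, can be checked with a small sketch. The class and method names here (StreamReuseDemo, secondTraversalFails) are made up for illustration; the sketch assumes the standard behaviour of java.util.stream, where reusing a consumed stream throws an IllegalStateException.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamReuseDemo {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTraversalFails() {
        // Source plus one intermediate operation
        Stream<Integer> doubled = Stream.of(1, 2, 3).map(n -> n * 2);

        // First terminal operation: consumes the stream
        List<Integer> result = doubled.collect(Collectors.toList());
        System.out.println(result);

        try {
            // Second terminal operation on the already consumed stream
            doubled.count();
            return false;
        } catch (IllegalStateException e) {
            // The stream has already been operated upon
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails());
    }
}
```

If the same data is needed again, the usual approach is to create a fresh stream from the source.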

Let's create a simple stream in Java to understand the concept better.

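import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public static void main(String[] args) {
    Stream<Integer> integerStream = Stream.of(1, 2, 3);
    // map() multiplies each number by 10; collect() gathers the results into a list
    List<Integer> numberInto10List = integerStream.map(number -> number * 10).collect(Collectors.toList());
    System.out.println(numberInto10List);
}
Output: [10, 20, 30]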

The Stream interface and the Collectors class are in the java.util.stream package, and List is in java.util. In the above program, we created a finite stream of numbers. While traversing the stream, we multiply each number by 10 (map to 'number x 10') and then collect all the mapped numbers into a list with the help of the Collectors.toList() method. We will see the Collectors class in more detail later. The output is easy to predict from reading the program.

numbers -> map -> collect

Here map() is an intermediate operation that transforms each element of the stream, and the terminal operation collect() then gathers the results. An intermediate operation is applied to each element and keeps the stream flowing: its return value is a new stream. We can therefore store that stream in a variable and call collect() on the new reference, as in the following example. This code gives the same output as the previous snippet. The only difference is one extra reference for the intermediate operation, so the code is a line longer, which is absolutely fine.

public static void main(String[] args) {
    Stream<Integer> integerStream = Stream.of(1, 2, 3);
    // The intermediate operation returns a new stream
    Stream<Integer> integerStreamInto10 = integerStream.map(number -> number * 10);
    List<Integer> numberInto10List = integerStreamInto10.collect(Collectors.toList());
    System.out.println(numberInto10List);
}

However, the first version, without the extra reference for the intermediate operation, is a little easier to read and interpret than the second. Initially I also found it a little odd to write a statement like this, but I got used to it, and I now find it a great way to write a program and reduce the number of lines of code. Let's see one more example.

public static void main(String[] args) {
    Stream<String> stringStream = Stream.of("tiger", "lion", "monkey", "cat", "elephant");
    List<String> animals = stringStream.filter(animal -> animal.length() > 3)
            .sorted()
            .map(animal -> animal + " is a wild animal")
            .collect(Collectors.toList());
    for (String animal : animals) {
        System.out.println(animal);
    }
}
Output:
elephant is a wild animal
lion is a wild animal
monkey is a wild animal
tiger is a wild animal

In the above example, we created a stream of strings containing animal names. We then filter the stream so that only names longer than 3 characters remain, which gives us every animal except the cat. Next we sort the animals by name (this works only if the element class implements the Comparable interface, which String does). Then we map each name by appending a simple string to it, and finally we collect the results into a list of strings. Here filter(), sorted(), and map() are intermediate operations and collect() is the terminal operation.
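
For elements that do not implement Comparable, sorted() also has an overload that takes a Comparator. The sketch below illustrates this with a hypothetical Animal class that is not part of the original example; the class and method names are assumptions made only for this illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedWithComparatorDemo {

    // Hypothetical class that does NOT implement Comparable
    static class Animal {
        private final String name;

        Animal(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    static List<String> sortedNames() {
        return Stream.of(new Animal("tiger"), new Animal("lion"), new Animal("cat"))
                // sorted() without arguments would fail with a ClassCastException here,
                // so we pass a Comparator that says how to order Animal objects
                .sorted(Comparator.comparing(Animal::getName))
                .map(Animal::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortedNames()); // [cat, lion, tiger]
    }
}
```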

If you observe closely, most of the intermediate operations are given a lambda expression, because they accept a functional interface as a parameter (sorted() is used here without arguments). This makes it easy to write exactly the logic we want for each intermediate operation. You may be thinking that we are writing too much code on a single line. That is why I purposely wrote each operation on a new line, as most developers do for readability, but it is still a single statement.
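
To make the link between lambdas and functional interfaces more concrete, here is a rough sketch of the same pipeline with the lambdas first assigned to Predicate and Function variables. The variable and class names are invented for this illustration; the behaviour should match the inline version.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FunctionalInterfaceDemo {

    static List<String> describeAnimals() {
        // filter() accepts a Predicate: a function from an element to a boolean
        Predicate<String> longerThanThree = animal -> animal.length() > 3;

        // map() accepts a Function: a function from an element to a new value
        Function<String, String> addDescription = animal -> animal + " is a wild animal";

        return Stream.of("tiger", "lion", "monkey", "cat", "elephant")
                .filter(longerThanThree)
                .sorted()
                .map(addDescription)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        describeAnimals().forEach(System.out::println);
    }
}
```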

This practice of writing the code as a single statement is called chaining.

The statement containing all the chained methods is called a pipeline.

Chaining gives us a lot of flexibility when there are many operations to perform on the data. Of course, there are traditional ways to write the same program without streams and chaining, but they increase the number of lines of code. So let's use chaining, let the stream handle the data, and focus on the business logic.
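
For comparison, this is roughly what the animal example might look like in the traditional style, without streams. It is only one of several possible loop-based versions, and the class and method names are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LoopVersionDemo {

    static List<String> describeWithLoops(List<String> names) {
        // Step 1: keep only the names longer than 3 characters (the filter step)
        List<String> filtered = new ArrayList<>();
        for (String animal : names) {
            if (animal.length() > 3) {
                filtered.add(animal);
            }
        }

        // Step 2: sort the names in natural order (the sorted step)
        Collections.sort(filtered);

        // Step 3: append the description to each name (the map step)
        List<String> result = new ArrayList<>();
        for (String animal : filtered) {
            result.add(animal + " is a wild animal");
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tiger", "lion", "monkey", "cat", "elephant");
        for (String line : describeWithLoops(names)) {
            System.out.println(line);
        }
    }
}
```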

Streams also offer some advantages over the traditional way of writing code. First, as we saw, there are fewer lines. Second, streams are evaluated lazily and can be run in parallel, which can help performance in some cases, although a stream is not automatically faster than an equivalent loop.

Streams and chaining have no major disadvantages, but at first one may feel a little lost while using them. You need some prior knowledge of functional interfaces, lambda expressions, and the built-in functions provided by the Stream API, for example map(), filter(), and so on. You also need to remember which method takes which parameters (the parameters are usually functional interfaces such as Function, Predicate, Supplier, and Consumer). Streams can also feel hard to debug: because a pipeline is a single statement combining multiple operations, it is not obvious how to see the output of an intermediate operation while debugging. As usual, Java comes with a solution: a method called peek(). As its name suggests, it peeks into the stream at that particular point. Let's see an example.

public static void main(String[] args) {
    Stream<String> stringStream = Stream.of("tiger", "lion", "monkey", "cat", "elephant");
    List<String> animals = stringStream.filter(animal -> animal.length() > 3)
            .sorted()
            .peek(animal -> System.out.print(animal + " "))
            .collect(Collectors.toList());
    System.out.println();
    System.out.println("-------Printing Animals---------");
    for (String animal : animals) {
        System.out.println(animal);
    }
}
Output:
elephant lion monkey tiger
-------Printing Animals---------
elephant
lion
monkey
tiger

Here, in the peek() method, we just print each element to the console to see what is going on after sorted(). Using peek() is a good way to become familiar with streams and chaining. However, we should not write code in peek() that changes the existing data; otherwise the data may be corrupted and the pipeline may produce wrong output.
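
peek() can also be placed at more than one point in a pipeline. The sketch below, with made-up names, records what passes before and after a filter() step into a separate log list, without touching the stream elements themselves. It assumes a sequential stream, where each element travels through the steps one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PeekTraceDemo {

    static List<String> trace() {
        // Separate list used only for tracing; the stream elements are not modified
        List<String> log = new ArrayList<>();

        List<String> animals = Stream.of("tiger", "lion", "cat")
                .peek(animal -> log.add("before filter: " + animal))
                .filter(animal -> animal.length() > 3)
                .peek(animal -> log.add("after filter: " + animal))
                .collect(Collectors.toList());

        log.add("result: " + animals);
        return log;
    }

    public static void main(String[] args) {
        for (String entry : trace()) {
            System.out.println(entry);
        }
    }
}
```

In this sketch "cat" appears in a "before filter" entry but never in an "after filter" entry, which shows where the element was dropped.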

That covers streams in theory. Let's look at the important methods that can be used as intermediate and terminal operations, and the parameters they accept.

  • Intermediate Operations [table not reproduced]
  • Terminal Operations [table not reproduced]
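
As a small illustration, here are a few intermediate operations from the Stream API used together. This selection (filter, distinct, sorted, skip, limit) is my own and is not necessarily the set the original tables listed; the class and method names are invented for the sketch.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MoreIntermediateOpsDemo {

    static List<Integer> pickEvens() {
        return Stream.of(4, 2, 2, 7, 8, 4, 6, 10)
                .filter(n -> n % 2 == 0) // accepts a Predicate: 4, 2, 2, 8, 4, 6, 10
                .distinct()              // removes duplicates:  4, 2, 8, 6, 10
                .sorted()                // natural order:       2, 4, 6, 8, 10
                .skip(1)                 // drops the first:     4, 6, 8, 10
                .limit(3)                // keeps at most three: 4, 6, 8
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(pickEvens()); // [4, 6, 8]
    }
}
```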

Examples of Terminal Operations.

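import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public static void main(String[] args) {
    List<String> animalList = Arrays.asList("tiger", "lion", "monkey", "cat", "elephant");

    // collect(): gather the first letter of each name into a Set (iteration order is not guaranteed)
    Set<String> firstLetters = animalList.stream()
            .map(animal -> animal.substring(0, 1))
            .collect(Collectors.toSet());
    System.out.println(firstLetters);

    // anyMatch(): true if at least one element satisfies the Predicate
    boolean isLionPresent = animalList.stream().anyMatch(animal -> animal.equals("lion"));
    System.out.println(isLionPresent);

    // allMatch(): true only if every element satisfies the Predicate
    boolean allLionPresent = animalList.stream().allMatch(animal -> animal.equals("lion"));
    System.out.println(allLionPresent);

    // forEach(): runs the Consumer once for each element
    animalList.stream().forEach(animal -> System.out.println("Length : " + animal.length()));
}
Output:
[c, t, e, l, m]
true
false
Length : 5
Length : 4
Length : 6
Length : 3
Length : 8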

The above example demonstrates some common terminal operations. The collect() method accepts a parameter of type Collector; Java already provides commonly used collectors through methods such as toSet(), toList(), and toMap(), but we can always write our own custom collector implementations. The anyMatch() and allMatch() methods accept a Predicate, and forEach() accepts a Consumer. We can use these methods as needed.
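
As a rough sketch of both ideas, the code below uses Collectors.toMap() and then builds a small custom collector with Collector.of(). The class name, the method names, and the idea of summing the name lengths are assumptions made only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorDemo {

    // Built-in collector: maps each animal name to its length
    static Map<String, Integer> nameToLength(List<String> animals) {
        return animals.stream()
                .collect(Collectors.toMap(animal -> animal, String::length));
    }

    // Custom collector (illustrative): adds up the lengths of all names
    static Collector<String, int[], Integer> totalLength() {
        return Collector.of(
                () -> new int[1],                                       // supplier: a one-element running total
                (total, animal) -> total[0] += animal.length(),         // accumulator: add one name's length
                (left, right) -> { left[0] += right[0]; return left; }, // combiner: merge two partial totals
                total -> total[0]);                                     // finisher: unwrap the final value
    }

    public static void main(String[] args) {
        List<String> animals = Arrays.asList("tiger", "lion", "monkey", "cat", "elephant");
        System.out.println(nameToLength(animals).get("lion"));    // 4
        System.out.println(animals.stream().collect(totalLength())); // 26
    }
}
```

Note that toMap() throws an exception when two elements produce the same key, so this form suits data with unique keys, such as the animal names here.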

I have mentioned only a few methods to get you started with streams. Java provides many more APIs that you can use as per your needs; you just need to explore them.

If you want to know more about functional interfaces and lambda expression, please visit the following blogs which I have written earlier.

Keep exploring! Keep learning! Happy Programming!
