Functional programming in Java

Lambda functions, function references and the concept of a functional interface were added into Java in version 8 as part of a wider set of features that provide support for the functional programming paradigm. This course will discuss functional programming only in a very limited manner. Interested students are recommended to take a separate functional programming course offered at our university.

This brief discussion will at least in some way touch upon the following features of the functional programming paradigm:

A data set is processed with functions instead of explicit iteration (such as loops).
- In Java: process items using so-called Stream interfaces instead of containers and iteration.
It is common to pass functions as parameters.
- Data is processed with general functions that take as parameters other functions, which define some details of how the processing should be done.
  - The earlier example function filter followed this principle, as do the sorting functions we have already used extensively, which take a comparison function as a parameter.
- In Java: functions can receive function objects as parameters, and function objects that implement functional interfaces can be created easily using lambda functions or function references.
Data is immutable and functions do not cause side effects.
- Functions do not mutate existing data: they create new data (e.g. from the existing data).
- In purely functional programming a sorting function would never sort a list directly; it would produce a new list that contains the items of the original list in sorted order.
- In Java: it is encouraged to follow this principle when using Stream interfaces, but it is not mandatory.
Operations that manipulate data are performed in a “lazy” manner (so-called lazy evaluation).
- Actual data processing begins only once the result is explicitly requested. This is clarified below.
- In Java: Stream interfaces follow this principle.
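
As a first taste of these principles, the following minimal sketch contrasts an explicit loop with a stream pipeline that computes the same result. The class name LoopVersusStream, the function names and the sample data are made up for this illustration; the stream operations used here are described in the sections that follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class LoopVersusStream {
  // Explicit iteration: collect the squares of the even numbers with a loop.
  static List<Integer> evenSquaresLoop(List<Integer> numbers) {
    List<Integer> result = new ArrayList<>();
    for (int n : numbers) {
      if (n % 2 == 0) {
        result.add(n * n);
      }
    }
    return result;
  }

  // Functional style: the same processing expressed with stream operations.
  // The functions passed as parameters define the details of each step.
  static List<Integer> evenSquaresStream(List<Integer> numbers) {
    return numbers.stream()
        .filter(n -> n % 2 == 0)   // Keep only the even numbers.
        .map(n -> n * n)           // Replace each number by its square.
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Integer> numbers = List.of(4, 7, 6, 3, 8);
    System.out.println(evenSquaresLoop(numbers));    // [16, 36, 64]
    System.out.println(evenSquaresStream(numbers));  // [16, 36, 64]
    System.out.println(numbers);                     // [4, 7, 6, 3, 8] (the source is unchanged)
  }
}
```

Neither version modifies the original list: both produce a new list from the existing data.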

From now on we will concentrate on Java’s Stream interfaces as they are the most central tool for functional programming style data processing in Java.

Java’s Stream interfaces

In order to simplify the exposition, we will from now on use the term “stream” to mean a Java Stream interface or (and perhaps most often) a concrete instance of such.

Java’s streams are types that offer a fairly diverse set of operations for processing the data that a stream reads. The Java class library contains four different stream types: Stream<T>, IntStream, LongStream and DoubleStream. The main difference between these is the type of the processed items. The generic stream Stream<T> can be used for processing practically any kind of reference type data, and the latter three are specialized for processing the primitive number types corresponding to their names (int, long and double). These numeric streams offer operations that are feasible only for numbers, such as computing their sum.

A stream does not store any items itself (a stream is not a container): it reads data from some source that is defined when the stream is first created. When we talk about the items or data in a stream, we mean the items or data read by the stream. The Java class library offers, for example, the following ways to create a stream:

A stream that reads items from an array arr can be created as Arrays.stream(arr).
- The type of the created stream depends in a natural way on the array’s item type:
  - int: IntStream
  - long: LongStream
  - double: DoubleStream
  - Some other type T: Stream<T>
A stream reading items of a container cont can be created as cont.stream(). The created stream will always be of type Stream<T>, where T is the container’s item type.
- For example if the container item type is Integer, the stream will still be Stream<Integer> and not IntStream.
A stream that reads a BufferedReader object br line by line can be created as br.lines(). The stream will be of type Stream<String> since the read lines are strings.
- This allows reading data from a multitude of sources, as a BufferedReader can be initialized to read data from practically any kind of input source, such as a file or a string.
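
The three ways of creating a stream listed above could be sketched as follows. The class name StreamSources and its function names are hypothetical, and the BufferedReader reads from a string here (instead of a file) only to keep the example self-contained. The operations sum, count and collect used for producing the results are terminal operations.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamSources {
  // An array of primitive ints gives an IntStream.
  static int sumOfArray(int[] arr) {
    IntStream s = Arrays.stream(arr);
    return s.sum();
  }

  // A container gives a Stream<T>; the item type is Integer, so we get Stream<Integer>.
  static long countItems(List<Integer> cont) {
    Stream<Integer> s = cont.stream();
    return s.count();
  }

  // A BufferedReader gives a Stream<String> that reads the input line by line.
  static List<String> readLines(String text) {
    BufferedReader br = new BufferedReader(new StringReader(text));
    Stream<String> s = br.lines();
    return s.collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(sumOfArray(new int[] {4, 7, 6, 3, 8}));   // 28
    System.out.println(countItems(List.of(4, 7, 6, 3, 8)));      // 5
    System.out.println(readLines("one\ntwo\nthree"));            // [one, two, three]
  }
}
```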

Once a stream has been created, we may perform two types of stream operations on it (that is, on the items read by the stream): so-called intermediate operations and so-called terminal operations. An intermediate operation is an operation whose result is also a stream. This enables us to chain stream operations: the result of one intermediate operation can immediately be the target of a new stream operation (without e.g. needing to create a new stream). A terminal operation produces a concrete result (e.g. a value or a container that contains items) and ends the underlying chain of stream operations. If we wish to perform further stream operations on the result, we need to create a new stream that reads the previously returned result.

Stream operations are implemented as member functions of the stream types. For example, if we have already created a stream object s, the first operation listed below could be performed as s.distinct(). Stream operations can be chained by chaining member function calls: e.g. s.distinct().sorted() would first perform a distinct operation on s and then a sorted operation on the result of the first operation (which is also a stream).
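
The chain s.distinct().sorted() could be written out step by step as in the following sketch. The class name ChainingExample and the sample words are made up for illustration, and the terminal operation collect that ends the chain is described further below.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ChainingExample {
  // Each step stored in its own variable to show the types involved.
  static List<String> uniqueSorted(List<String> words) {
    Stream<String> s = words.stream();            // Create the stream.
    Stream<String> unique = s.distinct();         // Intermediate operation: the result is a stream.
    Stream<String> ordered = unique.sorted();     // Another intermediate operation on that result.
    return ordered.collect(Collectors.toList());  // Terminal operation: the chain ends here.
  }

  // The same chain in the usual compact form.
  static List<String> uniqueSortedChained(List<String> words) {
    return words.stream().distinct().sorted().collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> words = List.of("pear", "apple", "pear", "fig", "apple");
    System.out.println(uniqueSorted(words));         // [apple, fig, pear]
    System.out.println(uniqueSortedChained(words));  // [apple, fig, pear]
  }
}
```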

Some intermediate operations offered by the generic stream type Stream<T> are listed below. We have omitted the explicit functional interface parameter types to keep the presentation simpler, but such parameters are described separately.

distinct(): produces a new stream that contains the unique items of this stream (that is, this operation removes duplicate items).
filter(predicate): produces a new stream that contains only those items of this stream for which the function object predicate returns true. Hence this operation discards those items that do not fulfill some condition defined by the parameter predicate.
- The parameter predicate must implement the functional interface Predicate whose function test defines the condition.
- The basic principle is similar to the previous example function filter given in the section about interfaces.
map(function): produces a new stream where each item t of this stream has been replaced by the result of the function call function.apply(t). That is, this operation transforms all items of the stream by the provided transformation function.
- The parameter function must implement the functional interface Function.
mapToInt(function), mapToLong(function) and mapToDouble(function): work in a similar manner as map, but the produced stream is of type IntStream, LongStream or DoubleStream, respectively. The transformation function function must return a value that is compatible with the created stream type.
- The role of these is to allow changing the stream type from a generic stream into a numeric stream. This is necessary mainly when we want to use the numeric operations offered by numeric streams.
sorted() and sorted(comparator): produce a new stream that lists the items of this stream in sorted order. The first form uses natural item order and the second takes a separate comparison object that must implement the functional interface Comparator.
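
The intermediate operations listed above could be used, for example, as in the following sketch. The class name IntermediateOperations, its function names and the sample data are hypothetical. Comparator.comparingInt is a Java class library function that creates a Comparator from a function that computes an int key.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class IntermediateOperations {
  // filter + map: keep words longer than three characters and convert them to upper case.
  static List<String> longWordsUpper(List<String> words) {
    return words.stream()
        .filter(w -> w.length() > 3)   // Predicate: test(w) tells whether w is kept.
        .map(String::toUpperCase)      // Function: apply(w) gives the replacement item.
        .collect(Collectors.toList());
  }

  // mapToInt: switch from Stream<String> to IntStream in order to use sum().
  static int totalLength(List<String> words) {
    return words.stream()
        .mapToInt(String::length)
        .sum();
  }

  // sorted(comparator): order the words by length instead of their natural order.
  static List<String> byLength(List<String> words) {
    return words.stream()
        .sorted(Comparator.comparingInt(String::length))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> words = List.of("one", "two", "three", "four");
    System.out.println(longWordsUpper(words));  // [THREE, FOUR]
    System.out.println(totalLength(words));     // 15
    System.out.println(byLength(words));        // [one, two, four, three]
  }
}
```

In the last result "one" stays before "two" because sorting a stream that reads an ordered source, such as a list, keeps equal items in their original relative order.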

As stated before, streams process data in a lazy manner. Stream operations are started only when a terminal operation is performed (a terminal operation will produce a result, which might be void). Therefore the last step of processing data with streams must always be a terminal operation (there may be zero or more intermediate operations; it might e.g. be the case that only a single terminal operation suffices to produce the desired result). Some terminal operations of Stream<T> are listed below:
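
Lazy evaluation can be made visible with a small experiment such as the sketch below. The class name LazyEvaluationDemo is made up, and the transformation function deliberately has a side effect (it writes into a log), which goes against the functional style and is done here only to record when the function is actually called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyEvaluationDemo {
  // Returns a log of the events in the order in which they happened.
  static List<String> run() {
    List<String> log = new ArrayList<>();

    Stream<Integer> pipeline = List.of(1, 2, 3).stream()
        .map(n -> {
          log.add("map " + n);   // Side effect used only to make the evaluation visible.
          return n * 10;
        });
    log.add("pipeline built");   // No item has been processed at this point.

    // The terminal operation starts the actual processing.
    List<Integer> result = pipeline.collect(Collectors.toList());
    log.add("result " + result);
    return log;
  }

  public static void main(String[] args) {
    // Prints [pipeline built, map 1, map 2, map 3, result [10, 20, 30]]
    System.out.println(run());
  }
}
```

The entry "pipeline built" comes first in the log even though the call to map appears earlier in the code: the transformation function is called only once the terminal operation collect requests the result.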

void forEach(consumer): Performs the function call consumer.accept(t) for each item t of this stream. Note that this operation does not produce an explicit result: its effect depends only on the side effects produced by the performed function calls.
- The parameter consumer must implement the functional interface Consumer.
Object[] toArray(): Returns an array that contains the items of this stream.
- The corresponding numeric stream operations return numeric arrays. E.g. toArray() of IntStream returns an int array.
T reduce(T identity, accumulator): produces a sort of cumulative result from the items of this stream.
- The parameter accumulator must implement the functional interface BinaryOperator.
- The result is initialized as T result = identity and then updated as result = accumulator.apply(result, t) at each stream item t.
  - E.g. the sum of the numbers in a stream s of type Stream<Integer> could be computed by the operation s.reduce(0, (a, b) -> a + b) or alternatively, using a function reference, by the operation s.reduce(0, Integer::sum).
    - E.g. the sum of 7, 2, 6 would be computed by first initializing result = 0 and then updating result = 0 + 7 = 7, result = 7 + 2 = 9 and finally result = 9 + 6 = 15.
R collect(supplier, accumulator, combiner): collects the items of this stream (usually either literally collects them into a container or computes and returns some other type of a result).
- The parameter supplier must implement the functional interface Supplier<R>. It is used for initializing the end result as R result = supplier.get().
- The parameter accumulator must implement the functional interface BiConsumer<R,? super T>. The result is updated as accumulator.accept(result, t) at each stream item t.
- The parameter combiner must implement the functional interface BiConsumer<R, R>. The stream may use this, if necessary, to combine two partial results into a single result (this might be necessary e.g. if the stream is processed in parallel manner).
- For example the items of a stream s of some type Stream<T> could be collected into an ArrayList by the operation s.collect(() -> new ArrayList<>(), (r, t) -> r.add(t), (r1, r2) -> r1.addAll(r2)) or alternatively, using function references, the operation s.collect(ArrayList::new, ArrayList::add, ArrayList::addAll).
  - Note how a new operation that creates an object of some class className can be referred to as className::new.
R collect(collector): collects the items of this stream. Otherwise similar to the preceding collect, but now there is only one parameter collector that simultaneously provides all the three functionalities that the preceding version takes as separate parameters.
- The parameter collector must implement the interface Collector (which, unlike the parameter types above, is not a functional interface, as it has several functions). We have not introduced it before, and will not introduce it in detail here either. It should suffice for now that the Java class library class Collectors offers many static member functions that create different types of useful Collector objects that may be used with collect. We describe some of them below:
  - Collectors.toList(): collects the items into a list.
    - E.g. the items of a stream s of some type Stream<T> could be collected into a List<T> by the operation s.collect(Collectors.toList()). The returned list is of some type that implements the interface List<T>.
  - Collectors.counting(): returns the number of items in the stream.
    - E.g. if the stream s would read items from the array {4, 7, 6, 3, 8}, the operation s.collect(Collectors.counting()) would return 5.
    - This is an example of how collect may produce some kind of a result instead of literally “collecting” the items.
  - Collectors.averagingInt(mapper), Collectors.averagingLong(mapper) and Collectors.averagingDouble(mapper): return the average value of the stream items as a Double. The parameter mapper must offer a function that transforms an item into the corresponding numeric type (described by the function name). If the items already are of a correct type, the transformation may keep the items as such (but the transformation function must still be defined).
    - E.g. if the stream s would read the array {4, 7, 6, 3, 8}, the operation s.collect(Collectors.averagingInt(i -> i)) would return 5.6.
  - Collectors.summingInt(mapper), Collectors.summingLong(mapper) and Collectors.summingDouble(mapper): similar to the preceding average functions but return the sum of the stream items, and the result type corresponds to the numeric type (described by the function name).
    - E.g. if the stream s would read the array {4, 7, 6, 3, 8}, the operation s.collect(Collectors.summingInt(i -> i)) would return 28.
  - Collectors.joining(delimiter): returns a String that consists of all items in the stream converted into strings and separated by the string parameter delimiter.
    - E.g. if the stream s would read the array {"one", "two", "three"}, the operation s.collect(Collectors.joining("-")) would return the string “one-two-three”.
  - Collectors.groupingBy(classifier): groups the stream items by storing them into a dictionary container under keys defined by the parameter classifier. Each item t of the stream will get a key computed as key = classifier.apply(t), and the item t will be inserted into the dictionary, into a list stored under the key key.
    - The parameter classifier must implement the functional interface Function.
    - The result will be a dictionary that implements the interface Map and whose values are lists that implement the interface List.
    - E.g. if the stream s would read the array {"one", "two", "three", "four"}, the operation s.collect(Collectors.groupingBy(String::length)) would group the strings based on their lengths: the result would be a dictionary that holds the strings “one” and “two” in a list under the key 3, the string “four” in a list under the key 4, and the string “three” in a list under the key 5.
      - Again note the function reference used above. The same result could be obtained by using a lambda function s -> s.length().
  - Collectors.groupingBy(classifier, collector2): groups the stream items based on the parameter classifier in a similar manner as above, but now the dictionary will not store the item lists as such: each list will be replaced by the result of applying collector2 to it.
    - The parameter collector2 must implement the interface Collector. So this function in essence performs two nested Collector operations: the outer is the groupingBy operation, and the inner is the operation performed by collector2 (which can in principle be any type of a Collector operation).
    - E.g. if the stream s would read the array {"one", "two", "three", "four"}, the operation s.collect(Collectors.groupingBy(String::length, Collectors.counting())) would return a dictionary where the value corresponding to a key (string length) tells the number of strings that have that length. So here the key 3 would have value 2, the key 4 value 1 and the key 5 value 1.
  - Collectors.reducing(accumulator): performs a reduce operation to the items of this stream using the function accumulator. The result is wrapped inside an Optional object that is empty if and only if the stream is empty.
    - Optional<T> is a generic class of Java class library that is meant for representing values that do not necessarily exist. The class has e.g. the member functions boolean isEmpty() and boolean isPresent() for inspecting whether the Optional object is empty or holds a value, and the member function T get() for reading the value (if it exists).
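
The following sketch gathers several of the terminal operations and Collector objects described above into one small program that uses the same sample data as the examples in the list. The class name TerminalOperations and its function names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class TerminalOperations {
  static final List<Integer> NUMBERS = List.of(4, 7, 6, 3, 8);
  static final List<String> WORDS = List.of("one", "two", "three", "four");

  // reduce: starts from the identity 0 and adds each item to the result.
  static int sum() {
    return NUMBERS.stream().reduce(0, Integer::sum);
  }

  // Three-parameter collect: supplier, accumulator and combiner given separately.
  static ArrayList<Integer> intoArrayList() {
    return NUMBERS.stream().collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
  }

  // One-parameter collect with Collector objects created by the class Collectors.
  static long count() {
    return NUMBERS.stream().collect(Collectors.counting());
  }

  static double average() {
    return NUMBERS.stream().collect(Collectors.averagingInt(i -> i));
  }

  static String joined() {
    return WORDS.stream().collect(Collectors.joining("-"));
  }

  static Map<Integer, List<String>> byLength() {
    return WORDS.stream().collect(Collectors.groupingBy(String::length));
  }

  static Map<Integer, Long> countByLength() {
    return WORDS.stream()
        .collect(Collectors.groupingBy(String::length, Collectors.counting()));
  }

  // reducing: the result is an Optional because the stream might be empty.
  static Optional<Integer> largest(List<Integer> numbers) {
    return numbers.stream().collect(Collectors.reducing((a, b) -> a >= b ? a : b));
  }

  public static void main(String[] args) {
    WORDS.stream().forEach(w -> System.out.println(w)); // Prints one word per line.
    System.out.println(sum());               // 28
    System.out.println(intoArrayList());     // [4, 7, 6, 3, 8]
    System.out.println(count());             // 5
    System.out.println(average());           // 5.6
    System.out.println(joined());            // one-two-three-four
    System.out.println(byLength());          // Lists of words under the keys 3, 4 and 5.
    System.out.println(countByLength());     // Counts of words under the keys 3, 4 and 5.
    System.out.println(largest(NUMBERS));    // Optional[8]
    System.out.println(largest(List.of()));  // Optional.empty
  }
}
```

The last two lines illustrate why Collectors.reducing wraps its result inside an Optional: an empty stream has no largest item to return.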

Numeric streams offer operations largely similar to those listed above, and in addition numeric operations such as min(), max() and sum(), which compute the minimum, maximum and sum of the numbers in the stream, and summaryStatistics(), which returns the count, minimum, maximum, sum and average of the numbers in the stream all at once.
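
A minimal sketch of these numeric operations on an IntStream is given below. The class name NumericStreamExample and its function names are made up for illustration. Note that min() and max() return their result wrapped inside an OptionalInt, a variant of Optional for int values, because an empty stream has no minimum or maximum.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.OptionalInt;

class NumericStreamExample {
  static int sum(int[] arr) {
    return Arrays.stream(arr).sum();
  }

  // The result is an empty OptionalInt if the array has no items.
  static OptionalInt smallest(int[] arr) {
    return Arrays.stream(arr).min();
  }

  // Computes the count, minimum, maximum, sum and average in a single pass.
  static IntSummaryStatistics statistics(int[] arr) {
    return Arrays.stream(arr).summaryStatistics();
  }

  public static void main(String[] args) {
    int[] arr = {4, 7, 6, 3, 8};
    System.out.println(sum(arr));                          // 28
    System.out.println(smallest(arr).getAsInt());          // 3
    System.out.println(smallest(new int[0]).isPresent());  // false
    IntSummaryStatistics stats = statistics(arr);
    System.out.println(stats.getMin() + " " + stats.getMax() + " "
        + stats.getSum() + " " + stats.getAverage());      // 3 8 28 5.6
  }
}
```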

We have introduced only a small part of the Java class library’s stream operations and related tools. If you wish to explore further, it is a good idea to browse through the Java library documentation of the package java.util.stream.

As a final example, below is a program that processes a file using streams. The program assumes that the input file transactions.csv contains rows of form “bankAccount;transferAmount”, where bankAccount is a string that describes an account number and transferAmount is a number that describes a transaction; a negative value means an outgoing and a positive value an incoming payment (concerning the specified account). The program reads the data and creates a Map container whose keys are account numbers and values are corresponding Optional<Account> objects. Each Account object (here wrapped inside an Optional object) stores an account number and the overall balance (sum of the transactions concerning that account). The classes should again be in separate files.

public class Account {
  private String number;
  private double balance;

  public Account(String number, double balance) {
    this.number = number;
    this.balance = balance;
  }

  public String getNumber() {
    return number;
  }

  public double getBalance() {
    return balance;
  }

  public void addAmount(double amount) {
    this.balance += amount;
  }

  @Override
  public String toString() {
    return String.format("%s: %.1f", number, balance);
  }
}


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class ReadAccounts {
  public static void main(String[] args)
    throws IOException {
    try(var br = new BufferedReader(new FileReader("transactions.csv"))) {
      Map<String, Optional<Account>> accs = br.lines()     // Read file lines with a stream.
        .map(line -> line.split(";"))                      // Split the line into parts.
        .map(acc -> new Account(acc[0], Double.parseDouble(acc[1]))) // Parts -> Account object.
        .collect(Collectors.groupingBy(Account::getNumber,           // Group by account number.
                                       Collectors.reducing((a, b) -> {
                                         a.addAmount(b.getBalance()); // Sum transactions.
                                         return a;
                                       })
                                      )
                );
      System.out.println(accs);
    } // The try block closes the reader (and thus the file) automatically.
  }
}

If the contents of the input file transactions.csv were

46262;7200
26736;2500
78291;3900
46262;-1825.4
26736;-50.9
26736;-220.5
78291;-31.9
46262;-125
78291;-180.3
46262;-449.1
26736;115
78291;-1390
46262;-899
78291;-49.9
46262;25

then the preceding program would output more or less the following:

{26736=Optional[26736: 2343.6], 46262=Optional[46262: 3926.5], 78291=Optional[78291: 2247.9]}
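
If only the balances are needed, one possible variation (not the approach of the program above) is to let Collectors.summingDouble compute the sum of each group. This avoids both the Optional wrappers and the mutation of Account objects inside the reduce function. The sketch below makes some assumptions of its own: the class name AccountBalances and the function balances are hypothetical, the result maps account numbers directly to balances instead of Account objects, and the data is read from a string to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Map;
import java.util.stream.Collectors;

class AccountBalances {
  // Computes the balance of each account from lines of form "bankAccount;transferAmount".
  static Map<String, Double> balances(BufferedReader br) {
    return br.lines()
        .map(line -> line.split(";"))                   // Split the line into parts.
        .collect(Collectors.groupingBy(
            parts -> parts[0],                          // Group by account number.
            Collectors.summingDouble(parts -> Double.parseDouble(parts[1])))); // Sum amounts.
  }

  public static void main(String[] args) {
    // A few rows of the example data; a file could be read with a FileReader instead.
    String data = "46262;7200\n26736;2500\n46262;-1825.4\n26736;-50.9\n";
    Map<String, Double> result = balances(new BufferedReader(new StringReader(data)));
    // Print each account number with its balance rounded to one decimal.
    result.forEach((account, balance) -> System.out.printf("%s: %.1f%n", account, balance));
  }
}
```

As the balances are sums of floating point numbers, they should be compared or printed with suitable rounding, as is done with the format "%.1f" above.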
