This is part 2 of a blog series on Angular 2 change detection; see the first blog post for details.
As with other approaches, change detection in Angular 2 revolves around solving two main problems: how does the framework notice changes, and how are the actual changes identified? This dive into Angular 2 change detection is divided along those two aspects. First we will see how Angular 2 uses the overridable nature of browser APIs, patching them with a library called Zone.js, to hook into all possible sources of changes. After that, in the second part, we will go through what happens when a possible change is detected. This covers the actual identification of changes and the process of updating the DOM based on the bindings the developer has defined in the template.

Who Notifies Angular 2 of Changes?

So the first problem that needs to be solved is: who notifies Angular 2 of changes that may have happened? To answer this we must first explore the asynchronous nature of JavaScript and see what can actually cause change after the initial state is set. Let’s start by taking a look at how JavaScript actually works.

Asynchronous Nature of JavaScript

JavaScript is said to be an asynchronous, yet single-threaded language. These are of course just fancy technical terms, but understanding them forms the foundation for seeing how change can happen. Let’s start with the basic, synchronous flow that many programmers coming from other languages are familiar with. This flow is called imperative programming. In imperative programming the commands are executed one after another in order, and each command is executed completely before proceeding to the next one. Let’s take an example program in JavaScript:

const myNumber = 100;
doSomething(myNumber);
doSomethingElse();

This is just a nonsense program that demonstrates that each command is executed synchronously, one by one. First we assign a variable, then call a function, and after that function returns, call the last one. This is the basic way many programming languages work, and it is also true for JavaScript. But there is one major thing to notice: JavaScript is also event-driven. This means that things can happen asynchronously. There are different kinds of events, and we can subscribe to each of them and execute some code when they occur.
Let’s take the most basic example of asynchrony in JavaScript: the setTimeout browser API. What does setTimeout do? As seen from the signature, setTimeout(callback, timeout), the function takes two parameters. The first one is a so-called callback function that is executed once a certain amount of time has elapsed. The second parameter is the timeout in milliseconds. Let’s take an example of how setTimeout can be used:

setTimeout(function () {
  doSomething();
}, 1000);

So what we have here is a basic imperative, synchronous call to a function called setTimeout. This call is executed when this piece of code is executed (for example, when the browser has loaded the script file and runs it). What it does is schedule the callback function passed to it to be called at a later time. This scheduling for later execution is what we mean by asynchronous execution. The function containing the call to doSomething is executed once one second (1000 milliseconds) has elapsed and the call stack is empty. Why is the requirement of an empty call stack important? Let’s take a look at it in more detail.
The call stack is the same call stack we have in other languages, such as Java. Its purpose is to keep track of the nested function calls that occur during execution of the program. JavaScript is single-threaded, meaning that only one piece of code can be executed at a time. The main difference is that, unlike in many other languages, in JavaScript code can also be executed after the actual synchronous code has finished. In Java we would just enter the main method when the program starts, execute the code within it, and when there is no more code to be executed, the program exits. This isn’t the case for JavaScript, where we can schedule code to be executed later with the browser APIs. setTimeout is one of these APIs, but there are many more. We can for example add event listeners with addEventListener. There are multiple types of events we can subscribe to. The most common events relate to user interaction, such as mouse clicks and keyboard input. As an example, click events can be subscribed to with the following code:

addEventListener('click', function () {
  doSomething();
});
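
To make the role of the empty call stack concrete, here is a minimal sketch. The order array and its labels are only illustrative helpers, not part of any API. Even with a timeout of 0 milliseconds, the callback cannot run until the surrounding synchronous code has finished:

```javascript
// Records the order in which the pieces of code run (illustrative helper).
const order = [];

order.push('sync start');

// Scheduled right away, but it only runs once the call stack is empty.
setTimeout(function () {
  order.push('timeout callback');
  console.log(order.join(' -> '));
  // prints: sync start -> sync end -> timeout callback
}, 0);

order.push('sync end');
```

The callback is always the last entry, because the event loop only picks it up after the script itself has run to completion.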

To summarize the possible sources of asynchronous execution in JavaScript, we can divide the APIs into three categories:

  • Time-related APIs like setTimeout and setInterval
  • HTTP responses (XMLHttpRequest)
  • Event handlers registered with addEventListener

These are the sources of asynchronous execution of code, and here’s the main thing to realize: these are the only potential sources of changes. So what if we could patch these browser APIs and track calls to them? As it turns out we can, and that is exactly what we will do. This brings us to the next subject: zones.
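
The idea of patching an API can be sketched in a few lines. This is a simplified illustration and not how Zone.js is actually implemented: patchApi, fakeWindow and pending are hypothetical names, and fakeWindow stands in for the browser's global object so that the example is deterministic.

```javascript
// Replaces host[name] with a version that wraps the callback it receives,
// so that afterTask() runs every time the callback has been executed.
function patchApi(host, name, afterTask) {
  const original = host[name];
  host[name] = function (callback, ...rest) {
    const wrapped = function (...args) {
      try {
        return callback.apply(this, args);
      } finally {
        afterTask(name); // notify even if the callback throws
      }
    };
    return original.call(this, wrapped, ...rest);
  };
}

// Hypothetical stand-in for the browser: callbacks are queued, not timed.
const pending = [];
const fakeWindow = {
  setTimeout(callback) {
    pending.push(callback);
    return pending.length;
  },
};

const log = [];
patchApi(fakeWindow, 'setTimeout', (api) => log.push('after ' + api));

fakeWindow.setTimeout(() => log.push('callback'));

// Simulate the event loop running the queued callbacks.
pending.splice(0).forEach((callback) => callback());

console.log(log); // [ 'callback', 'after setTimeout' ]
```

The application code still calls setTimeout as usual; only the patched function knows that a notification is sent after each callback.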

Zone.js

Zone.js is Angular’s implementation of the concept of zones for JavaScript. Zones are originally a concept from the Dart programming language.
So how are these so-called zones used? Let’s look at example code that simply runs code inside a zone:

zone.run(() => {
  console.log('Hello world from zone!');
});

What we have here is just a simple function passed to the zone.run method. This function will be executed inside the current zone. So what is the point of running the code inside a zone?
The magic comes from the possibility of hooking into asynchronous events. Before we run any code inside our zone, we can add callbacks to be called once something interesting happens. One important example of these hooks is afterTask. The afterTask hook is executed whenever an asynchronous task has been executed. An asynchronous task here simply means a callback registered with any of the browser APIs mentioned earlier, such as setTimeout. Let’s have an example of how this works:

zone.fork({
  afterTask: () => console.log('Asynchronous task executed!')
}).run(() => {
  setTimeout(() => console.log('Hello world from zone!'), 1000);
});
// Console log:
// Hello world from zone!
// Asynchronous task executed!

There are actually quite a few of these hooks available, for example enqueueTask, dequeueTask, beforeTask and onError. There is a reason, though, that we looked at afterTask in particular: it is the key piece we need to trigger change detection in Angular 2. There is still one twist in the story, and that is NgZone, which we’ll look at next.

NgZone

As covered previously, we can use zone.js to execute some code whenever an asynchronous task’s callback has been executed. We could now trigger change detection each time any asynchronous callback has executed. There is still a simple optimization possible, and it is handled by NgZone. NgZone is a class found in the Angular 2 core module that extends the concept of zones a little further by keeping track of items in the asynchronous callback queue. It also defines a new hook called onMicrotaskEmpty which does the following (from the documentation):

Notifies when there is no more microtasks enqueued in the current VM Turn. This is a hint for Angular to do change detection, which may enqueue more microtasks. For this reason this event can fire multiple times per VM Turn.

So it basically allows us to run change detection only once there are no more asynchronous callbacks to be executed, instead of running it after every single task. Nice!
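
The optimization can be illustrated with a small counter. This is only a sketch of the idea and not NgZone's implementation: PendingTaskTracker and its methods are hypothetical names.

```javascript
// Hypothetical helper: counts scheduled callbacks and notifies
// only when none of them are left to run.
class PendingTaskTracker {
  constructor(onEmpty) {
    this.pending = 0;
    this.onEmpty = onEmpty;
  }

  taskScheduled() {
    this.pending += 1;
  }

  taskDone() {
    this.pending -= 1;
    if (this.pending === 0) {
      this.onEmpty(); // the queue is empty: time to check for changes
    }
  }
}

let changeDetectionRuns = 0;
const tracker = new PendingTaskTracker(() => {
  changeDetectionRuns += 1;
});

// Three callbacks are queued, then executed one after another.
tracker.taskScheduled();
tracker.taskScheduled();
tracker.taskScheduled();
tracker.taskDone();
tracker.taskDone();
tracker.taskDone();

console.log(changeDetectionRuns); // 1
```

Three tasks were executed, but the expensive work was triggered only once, after the last of them.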
NgZone also has some other interesting functionality that we aren’t going to go through here. For example, it allows you to run asynchronous code outside of Angular’s default zone so that it does not trigger change detection. This is especially useful when you have multiple asynchronous calls to be made sequentially and don’t want to unnecessarily trigger change detection after each of them. This can be achieved with a method called runOutsideAngular, which takes the function to be executed as a parameter.

Zones & Angular 2

Now that we know what zones are and how they can be used to track asynchronous execution, we can take a look at how Angular 2 actually triggers change detection. Let’s have a look at example pseudo-code by Pascal Precht from his excellent article on this very same topic, Angular 2 Change Detection Explained:

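// Simplified pseudo-code, not the actual Angular source
this.zone.onMicrotaskEmpty
  .subscribe(() => {
    // Run a change detection pass inside the Angular zone
    this.zone.run(() => this.tick());
  });

tick() {
  this.changeDetectorsRefs
    .forEach((ref) => ref.detectChanges());
}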

As we see here, the API of NgZone is a little different from the one we showed for zone.js hooks: as is usual in Angular 2, it uses observables instead of plain callbacks. Nevertheless, the concept is still the same: each time the microtask queue (the queue of those asynchronous callbacks to be executed) is empty, we call a method called tick. And what tick does is iterate through all the change detectors in our application. Simple, yet effective. Next, let’s take a look at what these change detectors are and how they are used to detect the changes made.

Change Happened, Now What?

Great! Now we know how Angular 2 learns that changes may have occurred. What we need to do next is identify the actual changes and then render the changed parts to the user interface (DOM). To detect changes we first need to think a little about the structure of Angular 2 applications.

Angular 2 Application Structure

As you surely know at this point (at least implicitly), every Angular 2 application is a tree of components. The tree starts from the root component, which is passed to the bootstrap method as the first parameter and is usually called AppComponent. This component then has child components, either through direct references in the template or via the router instantiating them within the <router-outlet></router-outlet> element. Be that as it may, we can visualize the application as a tree structure:

We can now see that there’s the root node (AppComponent) and some subtrees beneath it symbolizing the component hierarchy of the application.

Unique Change Detector for Each Component

An important aspect of Angular 2 change detection is that each component has its own change detector. These change detectors are customized for the data structures of each component to be highly efficient. As seen in the image below, each component in our component tree has its own change detector.

So what makes these change detectors unique? We won’t go into the details in this post, but each change detector is created specifically for its component. This makes them extremely performant, as they can be built to be something called monomorphic. This is a JavaScript virtual machine optimization that you can read more about in Vyacheslav Egorov’s in-depth article What’s up with monomorphism?. This optimization lets Angular 2 change detectors run “Hundreds of thousands simple checks in a few milliseconds” according to Angular core team member Victor Savkin.
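
The difference between a generic and a component-specific detector can be sketched as follows. This is an illustration of the idea only and not code generated by Angular: both functions and the UserCard-style component with the fields name and age are hypothetical.

```javascript
// Generic detector: works for any component, but reads properties through
// dynamic names and sees objects of many different shapes.
function genericDetect(component, previous, fields) {
  const changed = [];
  for (const field of fields) {
    if (component[field] !== previous[field]) {
      changed.push(field);
      previous[field] = component[field];
    }
  }
  return changed;
}

// Detector written for one hypothetical component with the fields
// name and age. Every call sees the same object shape and touches
// fixed properties, which JavaScript engines can typically optimize well.
function detectUserCard(component, previous) {
  const changed = [];
  if (component.name !== previous.name) {
    changed.push('name');
    previous.name = component.name;
  }
  if (component.age !== previous.age) {
    changed.push('age');
    previous.age = component.age;
  }
  return changed;
}

const card = { name: 'Ada', age: 36 };
const lastSeen = { name: 'Ada', age: 35 };

console.log(detectUserCard(card, lastSeen)); // [ 'age' ]
```

Both functions find the same changes; the specialized one trades generality for code that a virtual machine can keep on its fast path.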
One important question remains: by whom and when are the change detectors created? There are two possible ways. The first and default choice is that they are instantiated automatically by Angular 2 on application initialization. This adds some work to be done while bootstrapping the application. The second option is to use something called the offline compiler, still a work in progress by the Angular 2 core team, to generate the change detectors through a command-line interface before shipping the application. The latter can obviously speed up the booting of the application even further. To find out more about the offline compiler, see the angular2-template-generator npm package.

Change Detection Tree

Okay, now we know that each component has a unique change detector responsible for detecting the changes that have happened since the previous rendering. How is the whole process of change detection orchestrated? By default, Angular 2 needs to be conservative about the changes that might have happened and check the whole tree every time. Ways to optimize this are shown later in this blog series.
The approach of Angular 2 change detection is to always perform change detection from top to bottom. So we start by triggering the change detector of our application’s root component. After this has been done, we iterate through the whole tree starting from the second level. This is also illustrated in the image below.
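
The top-to-bottom pass can be sketched with a small tree of plain objects. This is a simplified model and not Angular's implementation: createNode, detectChanges and the binding functions are hypothetical, and pushing to the updates array stands in for updating the DOM.

```javascript
// Hypothetical component node: each binding is a function that reads
// the current value from the application state.
function createNode(name, bindings, children = []) {
  return { name, bindings, children, lastValues: {} };
}

// Checks one node and then its children: a single top-to-bottom pass.
function detectChanges(node, updates = []) {
  for (const [key, read] of Object.entries(node.bindings)) {
    const value = read();
    if (value !== node.lastValues[key]) {
      node.lastValues[key] = value;
      updates.push(`${node.name}.${key}`); // a framework would update the DOM here
    }
  }
  node.children.forEach((child) => detectChanges(child, updates));
  return updates;
}

const state = { title: 'Todos', count: 2 };
const root = createNode('AppComponent', { title: () => state.title }, [
  createNode('TodoList', { count: () => state.count }),
]);

detectChanges(root); // the first pass records the initial values
state.count = 3;
console.log(detectChanges(root)); // [ 'TodoList.count' ]
```

Every node is visited on each pass, but only the bindings whose values actually differ produce an update.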

Doing change detection this way is predictable, performant, easy to debug and controllable. Let’s explore each of these terms a little further.
Angular 2 change detection is said to be predictable because change detection never needs to be run multiple times for one set of changes. This is a major difference compared to Angular.js, where there was no guarantee whether change detection would be single-pass or multi-pass. How does Angular 2 then avoid the need for multi-pass change detection? The key thing to realize is that data only flows from top to bottom. There can’t be any cycles in the component tree, as all the data coming into a component can only come from its parent through the input mechanism (the @Input annotation). This is what is meant when we say that the structure of an Angular 2 application is always a unidirectional tree.
Needing only a single pass, combined with extremely fast change detectors for components (the VM-friendliness mentioned above), makes change detection extremely fast. Angular core team manager Brad Green stated in his talk at ng-conf 2016 that, compared to Angular.js, Angular 2 is always five times faster at rendering. This is already more than fast enough for most applications. If there are some performance corner cases, though, we can still apply the optimization techniques shown later in this series to increase the performance of change detection even further. These techniques include the use of immutables and observables, or even fully manual control over when change detection is run.
If you have done Angular.js development, the odds are that you have met some really obscure error messages stating something like “10 $digest() iterations reached. Aborting!”. These problems are often really hard to reason about and thus to debug. With the Angular 2 change detection system they can’t really happen, as running change detection is guaranteed not to trigger new passes.
Angular 2 change detection is also extremely controllable. We have multiple change detection strategies (covered later in this series) to choose from for each component separately. We can also detach and re-attach change detection manually for even further control.

Conclusions

In this blog post we saw how JavaScript’s internal event loop works, and how it can be combined with the concept of zones to trigger change detection automatically on possible changes. We also looked into how Angular 2 manages to run change detection as a single pass over a unidirectional tree.


Roope Hakulinen

As a lead software developer Roope works as team lead & software architect in projects where failure is not an option. Currently Roope is leading a project for one of the world's largest furniture retailers.


Change detection is the process of mapping the application state into the user interface. In the case of web applications this usually means mapping JavaScript data, such as objects, arrays and primitives, into the DOM (Document Object Model), which the end user can view and interact with. Even though mapping the state to the DOM is fairly simple and straightforward, great challenges arise when trying to reflect changes in the state to the DOM. This phase is called re-rendering, and it usually needs to be performed each time there is a change in the underlying data model.
In these two blog posts I will go through all the aspects of Angular 2 change detection, starting from the very basics.

If you are already familiar with the concept of change detection and its previous implementations, feel free to skip this introduction post and go straight to part 2.

Change Detection

Change detection is an important problem to solve when building a library or a framework containing view rendering functionality. Let’s first take a look at the goal of change detection and how it has been implemented earlier by other vendors.

Goal 

As stated before, the goal of change detection is to render the data model to the DOM. This also includes re-rendering when changes occur.
The ultimate goal is to make this process as performant as possible. Performance is greatly affected by the number of expensive DOM accesses needed. But even minimal DOM access doesn’t help if identifying the changed parts is slow. These two aspects are the key to understanding the different approaches taken by different libraries and frameworks, as will soon be seen.
The process of change detection centers around two main aspects:

  • Detecting that changes (may) have happened
  • Reacting to those potential changes

There are multiple approaches to both problems, and when going through the different solutions to change detection in the next section, these two will be emphasized for each solution. The word may is in parentheses because some approaches know exactly whether a change has happened, while others take a different path and check for changes even if they only may have happened. Even though the latter sounds extremely inefficient, it actually provides great flexibility for the application developer and can be implemented efficiently enough.

Evolution of Change Detection Mechanisms

The by now somewhat traditional approaches to change detection can be divided into five subcategories based on the approach they take, as done by Tero Parviainen in his commendable article Change and Its Detection in JavaScript Frameworks:

  • Server-side rendering
  • Manual re-rendering
  • Data binding
  • Dirty Checking
  • Virtual DOM

All of these approaches deserve to be introduced as they lay down the foundation of change detection to build upon.

Server-side Rendering

Before the era of SPAs, the state was stored exclusively on the backend, and all state transitions happened by navigating through links or submitting forms. Either way, they required a full page load, which is obviously slow and doesn’t offer much of a user experience.
How possible change is noticed:
No changes can happen in the client.
What happens when change may have happened:
Nothing is updated in client. New HTML is rendered each time on server.

Manual Re-rendering

With the rise of JavaScript usage came the idea of bringing the data models to the browser instead of keeping them only on the server side. This idea was popularized by frameworks such as Backbone.js, Ext JS and Dojo, which were all based on the idea of firing events on state changes. These events could then be caught by the application developer in the UI code and propagated to the actual DOM. Thus updating the DOM was still the responsibility of the application developer.
How possible change is noticed:
Frameworks define their own mechanisms for storing data so that changes can be tracked. These can be, for example, base objects that the application’s models inherit from.
What happens when change may have happened:
An event is triggered and can be handled by the application developer.
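
The event-based approach can be sketched like this. The Model class is a hypothetical miniature in the spirit of these frameworks and not the API of any of them, and titleElement is a plain object standing in for a DOM element so that the sketch runs outside a browser.

```javascript
// Hypothetical model class that fires an event whenever a value changes.
class Model {
  constructor(attributes) {
    this.attributes = { ...attributes };
    this.listeners = [];
  }

  onChange(listener) {
    this.listeners.push(listener);
  }

  get(key) {
    return this.attributes[key];
  }

  set(key, value) {
    if (this.attributes[key] === value) return; // nothing changed, no event
    this.attributes[key] = value;
    this.listeners.forEach((listener) => listener(key, value));
  }
}

// Stand-in for a DOM element.
const titleElement = { textContent: '' };

const todo = new Model({ title: 'Write post' });

// The application developer wires the event to the UI update by hand.
todo.onChange((key, value) => {
  if (key === 'title') titleElement.textContent = value;
});

todo.set('title', 'Publish post');
console.log(titleElement.textContent); // Publish post
```

The framework only reports that something changed; deciding what to do with the DOM is left entirely to the listener.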

Data Binding

The first approaches that can actually be called data binding were based on the observation that the events could also trigger an automatic update of the DOM. The main difference compared to the earlier implementations lies exactly in this built-in support for reacting to the events caused by changes. One well-known example of these approaches is Ember.js.
Even though updating the UI was now “automated”, there was still the problem of an inconvenient syntax for declaring changes, caused by the lack of support in JavaScript itself. An example of this syntax, compared to what it could be with for example ES6 Proxies, is below.

foo.set('x', 42); // The syntax required by Ember.js
foo.x = 42; // What the syntax should be, and nowadays could be

This kind of awkward syntax made it possible for these solutions to detect changes automatically with minimal effort. It does, however, require a common API to be used and binds the data model to the framework unnecessarily.
How possible change is noticed:
Frameworks have their own functions like set(key, value) which trigger the change detection automatically.
What happens when change may have happened:
The changed parts are known as they are always set with a setter function. This makes it possible to only update the changed parts to the UI without comparing what may have changed since last rendering.
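
As a sketch of what ES6 Proxies make possible, the following wraps a plain object so that ordinary assignments are reported. The helper name observable and the changes array are assumptions made for this example.

```javascript
// Wraps a plain object so that ordinary assignments report what changed.
function observable(target, onChange) {
  return new Proxy(target, {
    set(obj, key, value) {
      const previous = obj[key];
      obj[key] = value;
      if (previous !== value) onChange(key, value, previous);
      return true; // signals that the assignment succeeded
    },
  });
}

const changes = [];
const foo = observable({ x: 1 }, (key, value) => {
  changes.push(`${String(key)}=${value}`);
});

foo.x = 42; // plain assignment, no foo.set('x', 42) needed
console.log(changes); // [ 'x=42' ]
```

The data model stays a plain object from the developer's point of view, while the framework still learns exactly which property was set.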

Dirty Checking

Dirty checking is how Angular.js implements change detection. Every time a change may have happened, the set of watches generated for the bindings in the template is run. Each watch checks whether its data has changed since the last time and, if so, updates the DOM.
The name dirty checking comes from the process of checking all the bindings every time there is a possibility of a change in the state, at which point an operation called digest is launched. Iterating through and comparing all the bound values may sound like a lot performance-wise, but it is actually surprisingly fast, as it also minimizes unnecessary DOM accesses.
How possible change is noticed:
Custom implementations for possible change sources like setTimeout ($timeout in Angular.js). If a change happens outside of these primitives, the framework needs to be notified. In Angular.js this is done with $scope.$apply() and $scope.$digest().
What happens when change may have happened:
Every bound value has a watcher which stores the last value and compares it with the current value.
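
A digest loop can be sketched as follows. The Scope class here is a hypothetical miniature and not Angular.js's actual implementation; the limit of 10 passes only mirrors the well-known error message.

```javascript
// Hypothetical miniature scope with watchers and a digest loop.
class Scope {
  constructor() {
    this.watchers = [];
  }

  watch(getValue, listener) {
    this.watchers.push({ getValue, listener, last: undefined });
  }

  // Re-runs all watchers until no value changes, giving up after 10 passes.
  digest() {
    let passes = 0;
    let dirty;
    do {
      dirty = false;
      for (const watcher of this.watchers) {
        const current = watcher.getValue();
        if (current !== watcher.last) {
          watcher.listener(current, watcher.last);
          watcher.last = current;
          dirty = true;
        }
      }
      passes += 1;
      if (dirty && passes >= 10) {
        throw new Error('10 digest iterations reached');
      }
    } while (dirty);
    return passes;
  }
}

const model = { price: 10, total: 0 };
const scope = new Scope();
scope.watch(() => model.total, () => {});
// This listener changes another watched value, which forces an extra pass.
scope.watch(() => model.price, (price) => { model.total = price * 2; });

console.log(scope.digest()); // 3
```

The digest needs three passes here: the first registers the initial values, the second notices that total was changed by a listener, and the third confirms that everything is stable.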

Virtual DOM

Virtual DOM is an approach made famous by React. In it, rendering is done into a virtual DOM tree, which is still just a vanilla JavaScript data structure. This virtual tree representation can then be converted into the corresponding DOM tree. This is what is done initially, but what about when changes occur?
When a change is detected, a new virtual DOM is generated from scratch. This newly created structure is then diffed against the current virtual DOM. The patch generated by the diff is then applied to the actual DOM, thus minimizing the need to touch the actual DOM.
How possible change is noticed:
In the case of React, this.setState() needs to be called to set the new state, which is then rendered.
What happens when change may have happened:
A new virtual DOM structure is composed and diffed against the previous structure, and the patch generated from the differences is applied to the DOM.
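
The diffing step can be sketched with plain objects. This is a deliberately naive, index-based comparison and not React's reconciliation algorithm: the helper h, the node shape { tag, text, children } and the patch format are assumptions made for this example.

```javascript
// Virtual nodes are plain objects: { tag, text, children }.
function h(tag, text, children = []) {
  return { tag, text, children };
}

// Compares two virtual trees and collects the patches needed for the DOM.
function diff(oldNode, newNode, path = 'root', patches = []) {
  if (!oldNode) {
    patches.push({ type: 'create', path, node: newNode });
  } else if (!newNode) {
    patches.push({ type: 'remove', path });
  } else if (oldNode.tag !== newNode.tag) {
    patches.push({ type: 'replace', path, node: newNode });
  } else {
    if (oldNode.text !== newNode.text) {
      patches.push({ type: 'text', path, text: newNode.text });
    }
    const length = Math.max(oldNode.children.length, newNode.children.length);
    for (let i = 0; i < length; i++) {
      diff(oldNode.children[i], newNode.children[i], `${path}/${i}`, patches);
    }
  }
  return patches;
}

const before = h('ul', '', [h('li', 'Milk'), h('li', 'Eggs')]);
const after = h('ul', '', [h('li', 'Milk'), h('li', 'Bread'), h('li', 'Tea')]);

// Two patches: a text update at root/1 and a new node at root/2.
console.log(diff(before, after));
```

Only the two patches would be applied to the real DOM; the unchanged first list item is never touched.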

Conclusions

As seen here, there have been many different solutions to change detection, each with its own pros and cons. In the next part we will take a look at how Angular 2 implements change detection.


Roope Hakulinen



The power of the Java ecosystem lies in the Java Virtual Machine (JVM), which runs a variety of programming languages that are better suited to some tasks than Java. One relatively new JVM language is Kotlin, a statically typed programming language that targets the JVM and JavaScript. You can use it with Java, Android and the browser, and it’s 100% interoperable with Java. Kotlin is open source (Apache 2 License) and developed by a team at JetBrains. The name comes from Kotlin Island, near St. Petersburg. The first officially stable release, Kotlin v1.0, was released on February 15, 2016.

Why Kotlin?

“Kotlin is designed to be an industrial-strength object-oriented language, and to be a better language than Java but still be fully interoperable with Java code, allowing companies to make a gradual migration from Java to Kotlin.” – Kotlin, Wikipedia

Kotlin’s page summarizes the answer to “Why Kotlin?” as:

  • Concise: Reduce the amount of boilerplate code you need to write.
  • Safe: Avoid entire classes of errors such as null pointer exceptions.
  • Versatile: Build server-side applications, Android apps or frontend code running in the browser. You can write code in Kotlin and target JavaScript to run on Node.js or in browser.
  • Interoperable: Leverage existing frameworks and libraries of the JVM with 100% Java Interoperability.

“You can write code that’s more expressive and more concise than even a scripting language, but with way fewer bugs and with way better performance.” – Why Kotlin is my next programming language

One of the obvious applications of Kotlin is Android development, as the platform uses Java 6, although it can use most of Java 7 and some backported Java 8 features. Only the recent Android N, which switches to OpenJDK, introduces support for Java 8 language features.
For Java developers one significant feature in Kotlin is higher-order functions, that is, functions that take other functions as parameters, which make functional programming more convenient than in Java. But in general, I’m not so sure that using Kotlin is as beneficial when compared to Java 8. It smooths off a lot of Java’s rough edges, makes code leaner and costs nothing to adopt (other than using IntelliJ IDEA), so it’s at least worth trying. But if you’re stuck with legacy code and can’t upgrade from Java 6, I would jump right in.

Learning Kotlin

Coming from a Java background, Kotlin at first glance looks a lot leaner, more elegant and simpler, and the syntax is familiar if you’ve written Swift. To get to know the language it’s useful to do some Kotlin Examples and Koans, which walk you through how it works. They also have a “Convert from Java” tool which is useful for seeing how Java classes translate to Kotlin. For more detailed information you can read the complete reference to the Kotlin language and the standard library.
If you compare Kotlin to Java, you see that null references are controlled by the type system, there are no raw types, arrays are invariant (you can’t assign an Array<String> to an Array<Any>) and there are no checked exceptions. Also, semicolons are not required, and there are no static members, non-private fields or wildcard types.
And what does Kotlin have that Java doesn’t? For starters there are null safety, smart casts and extension functions, plus lots of things Java only got in recent versions, such as streams and lambdas (although the latter are “expensive”). On the other hand, Kotlin targets Java 6 bytecode and doesn’t use some of the improvements in Java 8, like invokedynamic or lambda support. Some JDK 7/8 features are going to be included in the standard library in 1.1, and in the meantime you can use the small kotlinx-support library. It provides extension and top-level functions for using JDK 7/JDK 8 features, such as calling default methods of collection interfaces and the use extension for AutoCloseable.
You can also call Java code from Kotlin, which makes it easier to write it alongside Java if you want to use it in an existing project and write some part of your codebase in Kotlin.
The Kotlin Discuss forum is also a nice place to read about experiences of using Kotlin.

Tooling: in practice IntelliJ IDEA

You can use simple text editors and compile your code from the command line, or use build tools such as Ant, Gradle and Maven, but good IDEs make development more convenient. In practice, using Kotlin is easiest with JetBrains IntelliJ IDEA, and you can use their open source Community edition for free. There’s also an Eclipse plugin for Kotlin, but naturally it’s much less sophisticated than the IntelliJ support.

Example project

The simplest way to start a Kotlin application is to use Spring Boot’s project generator: add your dependencies, choose Gradle or Maven and click on “Generate Project”.
There are some gotchas in using Spring and Kotlin together, which are listed in the Spring + Kotlin FAQ. For example, classes are final by default and you have to mark them as “open” if you want the standard Java behaviour. This is useful to know with @Configuration classes and @Bean methods. There’s also Kotlin Primavera, which is a set of libraries to support Spring portfolio projects.
For an example of a Spring Boot + Kotlin application, look at the Spring.io writeup where they build a geospatial messenger with Kotlin, Spring Boot and PostgreSQL.
What does Kotlin look like compared to Java?
Here is a simple example of using Java 6, Java 8 and Kotlin to filter a Map and return a String. Notice that Kotlin and Java 8 are quite similar.

// Java 6
String result = "";
for (Map.Entry<Integer, String> entry : someMap.entrySet()) {
	if ("something".equals(entry.getValue())) {
		result += entry.getValue();
	}
}
// Java 8
String result = someMap.entrySet().stream()
		.filter(entry -> "something".equals(entry.getValue()))
		.map(entry -> entry.getValue())
		.collect(Collectors.joining());
// Kotlin
val result = someMap
  .values
  .filter { it == "something" }
  .joinToString("")
// Kotlin, shorter
val str = "something"
val result = str.repeat(someMap.count { it.value == str })
// Kotlin, more efficient with large maps where only some elements match
val result = someMap
  .asSequence()
  .map { it.value }
  .filter { it == "something" }
  .joinToString("")

The last Kotlin example makes the evaluation lazy by converting the map to a sequence. In Kotlin, the map and filter methods of collections aren’t lazy by default but always create a new collection. So if we call filter directly after values, it’s not as efficient with large maps where only some elements match the predicate.

Using Java and Kotlin in same project

To start with Kotlin it’s easiest to mix it into an existing Java project and write some classes in Kotlin. Using Kotlin in a Maven project is explained in the Reference. To compile mixed-code applications, the Kotlin compiler should be invoked before the Java compiler. In Maven terms that means kotlin-maven-plugin should be run before maven-compiler-plugin.
Just add kotlin-stdlib and kotlin-maven-plugin to your pom.xml as follows:

<dependencies>
    <dependency>
        <groupId>org.jetbrains.kotlin</groupId>
        <artifactId>kotlin-stdlib</artifactId>
        <version>1.0.3</version>
    </dependency>
</dependencies>
<plugin>
    <artifactId>kotlin-maven-plugin</artifactId>
    <groupId>org.jetbrains.kotlin</groupId>
    <version>1.0.3</version>
    <executions>
        <execution>
            <id>compile</id>
            <phase>process-sources</phase>
            <goals> <goal>compile</goal> </goals>
        </execution>
        <execution>
            <id>test-compile</id>
            <phase>process-test-sources</phase>
            <goals> <goal>test-compile</goal> </goals>
        </execution>
    </executions>
</plugin>

Notes on testing

Almost everything is final by default in Kotlin (classes, methods, etc.), which is good as it encourages immutability and leads to fewer bugs. In most cases you use interfaces, which you can easily mock, and in integration and functional tests you’re likely to use real classes, so even then final is not an obstacle. For Mockito there’s the Mockito-Kotlin library https://github.com/nhaarman/mockito-kotlin which provides helper functions.
You can also do better than just tests by using Spek, which is a specification framework for Kotlin. It allows you to easily define specifications in a clear, understandable, human-readable way.
There are no dedicated static analyzers for Kotlin yet. Java has FindBugs, PMD, Checkstyle, SonarQube, Error Prone and FB Infer. Kotlin has kotlinc, and IntelliJ itself comes with a static analysis engine called the Inspector. FindBugs works with Kotlin but reports some issues that are already covered by the programming language itself and are impossible in Kotlin.

To use Kotlin or not?

After writing some classes in Kotlin and trying out converting existing Java classes to Kotlin, I found that it makes the code leaner and easier to read, especially with data classes such as DTOs. Less (boilerplate) code is better. You can call Java code from Kotlin, and Kotlin code can be used from Java rather smoothly as well, although there are some things to remember.
So, to use Kotlin or not? It looks like a good statically typed alternative to Java if you want to expand your horizons. It's a pragmatic evolution of Java that respects the need for good Java integration, doesn't introduce anything that's terribly hard to understand, and includes a whole bunch of features you might like. The downsides I've come across are that tooling support is rather limited, which in practice means only IntelliJ IDEA. Also, the documentation isn't always up to date or updated when the language evolves, which is an issue when searching for examples and solutions to problems. But hey, everything is fun with Kotlin 🙂

Marko Wallin

Marko works as a full stack software engineer and creates a better world through digitalization. He writes a blog about technology and software development and develops open source applications, e.g. for mobile phones. He also likes mountain biking.

Background

Our team at Gofore is developing and maintaining the Suomi.fi Data Exchange Layer, which is based on X-Road technology. Because this is a national infrastructure service that is expected to be a standard delivery mechanism for Finnish public sector organizations and also to be widely used in the private sector, the system needs to offer clear benefits for user organizations, with a high level of automation and ease of use. One example of these benefits is the API Catalogue, which lists all the organizations offering services in the Suomi.fi Data Exchange Layer together with technical details such as WSDLs for the services. Part of the data is publicly available, and the rest requires registration.
The X-Road infrastructure didn't automate collecting the data included in the system, even though it offers metaservices that can be used to query this data. Our job was to implement the data collector for the API Catalogue.

Architecture

The data required for the API Catalogue and offered by the X-Road metaservices consists of member organizations, their subsystems and services. Because the data exchange layer will potentially have hundreds or thousands of organizations, with several subsystems per organization and multiple services per subsystem, we decided to use an architecture based on concurrent Akka actors. Our data collector component processes the data received from the metaservices and caches it in a local database so that the API Catalogue receives only data that has changed since it last requested it. The processing of different organizations and their services can be done concurrently, because it does not depend on other organizations or services. Akka was also an easy choice because the X-Road technology itself is based on Akka. Spring Boot is used to provide an easy way to implement the persistence using Spring Data and JPA.
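To make the "only data that has changed since the last request" idea concrete, here is a minimal plain-Java sketch. It is an assumption-laden illustration, not the collector's actual persistence code: the class ChangeTrackingCache, its methods and the in-memory map standing in for the local database are all hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the collector's local database.
final class ChangeTrackingCache {

    // A cached value together with the time it last actually changed.
    static final class Entry {
        final String value;
        final Instant changed;

        Entry(String value, Instant changed) {
            this.value = value;
            this.changed = changed;
        }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();

    // Stores a value; the change timestamp moves only when the value differs.
    synchronized void store(String key, String value, Instant now) {
        Entry old = entries.get(key);
        if (old == null || !old.value.equals(value)) {
            entries.put(key, new Entry(value, now));
        }
    }

    // Returns the keys whose value changed strictly after the given instant.
    synchronized List<String> changedSince(Instant since) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Entry> e : entries.entrySet()) {
            if (e.getValue().changed.isAfter(since)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ChangeTrackingCache cache = new ChangeTrackingCache();
        Instant start = Instant.parse("2016-01-01T00:00:00Z");
        cache.store("subsystemA", "v1", start.plusSeconds(1));
        cache.store("subsystemA", "v1", start.plusSeconds(10)); // same value, timestamp stays
        System.out.println(cache.changedSince(start.plusSeconds(5)));
    }
}
```

Re-storing an identical value leaves the timestamp untouched, so a repeated collection run that finds nothing new would not cause the catalogue to receive the same data again.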
The data model needed for the API Catalogue is presented in the following diagram. It should be noted that the term client is used for both members and subsystems, and the metaservices offer those as one structure. However, in the catalogue, those are separated. To add to the confusion, the terms in the Catalogue differ from those used in X-Road: an X-Road member is an organization in the Catalogue, a subsystem is an API, and a service is a resource.
diagram
The high-level architecture of our actor system is shown in the picture below. The system is initiated and the Supervisor actor is scheduled in the Spring Boot main class XRoadCatalogCollector. The Supervisor creates pools for the other types of actors and sends a START_COLLECTING message to ListClientsActor. ListClientsActor calls the metaservice listClients and creates both member organizations and their subsystems based on the data. These are persisted in a form that is ready for the API Catalogue to use. ListClientsActor also sends a message to ListMethodsActor for each subsystem. ListMethodsActor processes the message based on the subsystem information and calls the metaservice listMethods for the given subsystem. The methods, or services, are persisted and each is processed by a FetchWsdlActor.
asset_1
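The fan-out described above (one listClients call, one listMethods call per subsystem, one WSDL fetch per service) can be approximated without Akka. The sketch below uses plain java.util.concurrent thread pools in place of the actor pools, so it only illustrates the shape of the flow. The metaservice calls are stubs returning made-up data, and the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CollectorPipelineSketch {

    // Stubbed metaservice calls; the real ones would go over the network.
    static List<String> listClients() {
        return Arrays.asList("org1/subA", "org1/subB", "org2/subC");
    }

    static List<String> listMethods(String subsystem) {
        return Arrays.asList(subsystem + "/getX", subsystem + "/getY");
    }

    static String fetchWsdl(String service) {
        return "<wsdl for " + service + ">";
    }

    // Runs the three stages and returns a map of service name to WSDL.
    static Map<String, String> collect(int poolSize) {
        // One pool per stage, loosely mirroring one actor pool per actor type.
        ExecutorService methodsPool = Executors.newFixedThreadPool(poolSize);
        ExecutorService wsdlPool = Executors.newFixedThreadPool(poolSize);
        Map<String, String> wsdls = new ConcurrentHashMap<>();
        try {
            List<CompletableFuture<Void>> perSubsystem = new ArrayList<>();
            for (String subsystem : listClients()) {
                perSubsystem.add(
                    CompletableFuture
                        .supplyAsync(() -> listMethods(subsystem), methodsPool)
                        .thenCompose(services -> {
                            // Each service of the subsystem is fetched independently.
                            List<CompletableFuture<Void>> fetches = new ArrayList<>();
                            for (String service : services) {
                                fetches.add(CompletableFuture.runAsync(
                                    () -> wsdls.put(service, fetchWsdl(service)), wsdlPool));
                            }
                            return CompletableFuture.allOf(
                                fetches.toArray(new CompletableFuture[0]));
                        }));
            }
            // Wait until every subsystem and all of its services are done.
            CompletableFuture.allOf(perSubsystem.toArray(new CompletableFuture[0])).join();
        } finally {
            methodsPool.shutdown();
            wsdlPool.shutdown();
        }
        return wsdls;
    }

    public static void main(String[] args) {
        System.out.println(collect(4).keySet());
    }
}
```

Because no subsystem depends on another, the only synchronization point is the final join. The actor version gets the same independence from message passing between the pooled actors.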

Spring Boot and Akka Configuration

This chapter describes with examples how Spring Boot and Akka can be configured to work together.
Everything starts with our main class XRoadCatalogCollector, where the Supervisor is scheduled.

ActorSystem is created in the ApplicationConfiguration. Note the initialization for Akka extension named SpringExtension on line 38

and the implementation of SpringExtension and SpringActorProducer. This is the way to pass the Spring context to Akka actors.

Actors

The two top-level actors are simple, so there is no need to present them here. The implementations of the other actors follow the same principle:
• Check if the received message is of correct type
• Make a call to X-Road metaservice based on the details in the message
• Process the results
• Persist the processed results
• Send a message to a further actor (or return in case of the bottom level actor)
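As a rough illustration of these five steps, the following plain-Java sketch mimics the shape of an actor's receive method without depending on Akka. It is not the project's code: the message class, the stubbed metaservice call, the in-memory list standing in for the database and the Consumer standing in for the next actor are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class ListMethodsHandlerSketch {

    // Hypothetical message type carrying the subsystem to query.
    static final class SubsystemMessage {
        final String subsystem;

        SubsystemMessage(String subsystem) {
            this.subsystem = subsystem;
        }
    }

    private final List<String> persisted = new ArrayList<>(); // stand-in for the database
    private final Consumer<String> next;                       // stand-in for the next actor

    ListMethodsHandlerSketch(Consumer<String> next) {
        this.next = next;
    }

    // Returns false for messages of an unexpected type, which an actor would
    // report as unhandled, and true once the message has been processed.
    boolean onReceive(Object message) {
        // 1. Check that the received message is of the correct type.
        if (!(message instanceof SubsystemMessage)) {
            return false;
        }
        String subsystem = ((SubsystemMessage) message).subsystem;

        // 2. Call the metaservice based on the details in the message (stubbed here).
        List<String> rawNames = callListMethods(subsystem);

        for (String rawName : rawNames) {
            // 3. Process the results.
            String service = subsystem + "/" + rawName.trim();
            // 4. Persist the processed results.
            persisted.add(service);
            // 5. Send a message to the next actor.
            next.accept(service);
        }
        return true;
    }

    // Stub returning made-up, slightly untidy data to give step 3 something to do.
    static List<String> callListMethods(String subsystem) {
        return Arrays.asList(" getX", "getY ");
    }

    List<String> persisted() {
        return persisted;
    }
}
```

A handler built this way is easy to exercise in isolation: pass in a Consumer that collects the forwarded service names into a list and compare it with the persisted entries.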
The implementation of ListClientsActor is below.

The other actor implementations are similar, although there are differences, for example in the method of calling the X-Road metaservices. Yes, for some reason some of the calls are REST and some are SOAP. Anyway, there is no point in publishing the other implementations here. The remaining details can be found on GitHub, where all the source code is published under the MIT License at https://github.com/vrk-kpa/xroad-catalog.
In addition to the data collector component described here, the GitHub repository contains the lister component, which is used by the API Catalogue to query data from the collector database, and a persistence module, which is used by both.

Conclusion

The data collector has been installed in production for the Suomi.fi Data Exchange Layer, and the production API Catalogue uses the real data provided by the collector. Currently, the number of organizations and services available in the production system is small, so we do not yet have real experience of how the collector will perform in the future, when the amount of data is expected to be much greater. A simple sequential implementation would have been sufficient for the current situation and much simpler to implement, but with the expected future data volumes it would be too slow. Our concurrent implementation could easily exhaust server resources if configured incorrectly, but with properly chosen pool sizes and collecting interval, the concurrent system should be able to handle much larger amounts of data faster than a sequential implementation. It is also possible to distribute the actors across different servers, but that would require code changes.

Sami Kallio

Sami Kallio is an experienced software architect with a passion for working with interesting technology and with the great people he has as colleagues.
