Sourcetrail 2018.4

Author: Eberhard Gräther

Sourcetrail 2018.4 brings big improvements in indexing performance and reduced memory consumption. The new tab bar in the application window lets you open multiple symbols simultaneously, just like in a web browser. We also added new type-use edges to the graph view, which make it easier to explore types when templates/generics are involved.

Sourcetrail's new tab bar, located at the top of the application window.

Sourcetrail Slack Channel

First, we want to announce that we have just started testing a public Sourcetrail Slack channel. It will help you get in touch with both us and other users. We also hope to learn more about how Sourcetrail is used and to discuss new features. Beyond that, how the channel is actually used will largely depend on the users who join.

So feel free to join:

Join Sourcetrail Slack channel

We plan to test the channel for a few months and then evaluate whether it makes sense to keep it. Please keep in mind that we are located in Salzburg, Austria (time zone GMT+1, no kangaroos).

New in this Release:

  • Faster indexing: Up to 66% speed-up! (depending on your hardware)
  • Lower memory consumption: During indexing and project loading
  • Tab bar: At the top of the main window
  • C++/Java: More type-use edges for easy templates/generics navigation
  • C/C++: Updated to LLVM/Clang 7.0.0
  • News Box: Located on the start screen

You can download Sourcetrail 2018.4 here. The full changelog is available here. In the rest of this post I will outline the most important updates in this release in more detail.

Faster Indexing and Lower Memory Consumption

Since our first official Sourcetrail release in summer 2017, we have spent most of our time improving usability, adding new UI features and extending project setup. That work was well received and we see a growing number of users.

However, many users started complaining that our indexing speed was pretty bad, far worse than that of other static analysis tools. We therefore started looking into this problem over the last few months, and we are happy with the speed-up we achieved. We released the first improvements with the maintenance version 2018.3.55. With this new release we managed to get about the same speed-up on top of that!

I also want to explain a little bit what we did and give you some rough numbers. I will mainly focus on our C/C++ indexer here, but most improvements also apply to the Java indexer.

Indexing speed comparison between our last three releases on the LLVM/Clang codebase.

Clang LibTooling AST building

As you can see in the chart above, the largest part of indexing time is spent on Clang LibTooling AST building. This is the step where all parsing, tokenization and the assembly of the Abstract Syntax Tree (AST) is done by Clang LibTooling. For you this should be close to the time that your compiler takes to build your project. If your old Sourcetrail version took much longer to process your project than your compiler, chances are that your blue bars are much larger and the improvement is even more significant for you. However, the time it takes Clang LibTooling to build the AST is somewhat out of our control.

But this is the place where your project configuration matters. As a user you can optimize this step by reducing the number of Header Search Paths, reordering them or removing unnecessary flags. This can have a big influence on indexing time.

AST traversal and recording

In this step we visit parts of the AST and record all symbols, references and source locations we need for exploring C++ source code. This part of our codebase has been in constant development for the last 4 years. As a result some things, like caching, were done in multiple places, and we were not yet using the most efficient C++ constructs. By refactoring and modernizing this code we achieved big performance improvements. These changes include:

  • Improved caching of symbol names and file paths.
  • Used integral identifiers to reference file paths and symbols.
  • Reduced number of copies and allocations when passing data.
  • Reduced accesses to the file system.

Most of these changes were easy to implement. Some needed a little refactoring. The hardest part was figuring out how data was recorded and passed through our application, which was not too hard because… you know, we have Sourcetrail to figure that out 😉.
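To illustrate the first two points of the list above, here is a minimal sketch of caching names behind integral identifiers. It is not taken from the Sourcetrail codebase: `NameCache`, `ReferenceRecord` and all member names are hypothetical and only show the general idea of storing each symbol name or file path once and passing small integers around instead of strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical interning cache: maps each distinct string (symbol name or
// file path) to a small integral id, so records can store ids instead of
// copying strings around.
class NameCache
{
public:
    using Id = std::uint32_t;

    // Returns the existing id for `text`, or assigns the next free one.
    Id intern(const std::string& text)
    {
        auto it = m_ids.find(text);
        if (it != m_ids.end())
        {
            return it->second;
        }
        const Id id = static_cast<Id>(m_texts.size());
        m_texts.push_back(text);
        m_ids.emplace(text, id);
        return id;
    }

    // Looks up the string for an id previously handed out by intern().
    const std::string& lookup(Id id) const
    {
        assert(id < m_texts.size());
        return m_texts[id];
    }

    std::size_t size() const { return m_texts.size(); }

private:
    std::unordered_map<std::string, Id> m_ids;
    std::vector<std::string> m_texts;
};

// Hypothetical record type: a reference between two symbols at a source
// location, stored purely as integers.
struct ReferenceRecord
{
    NameCache::Id sourceSymbol;
    NameCache::Id targetSymbol;
    NameCache::Id file;
    std::uint32_t line;
};
```

With a cache like this, a record such as `ReferenceRecord` is a handful of integers that is cheap to copy and compare, and each string is hashed once per `intern()` call rather than duplicated in every record.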

AST traversal and data recording comparison between our last three releases on the LLVM/Clang codebase.

The chart above shows how much we improved on AST traversal and data recording over the last versions.

Database Storing

The steps discussed above, AST building and traversal, are done within indexer processes/threads that run in parallel. They are running autonomously and do not influence each other, so none of them ever has to wait for any other.

But when an indexer is done with a translation unit, it passes all the recorded data back to the main process for storing in our SQLite database. Because the merging of data records from all indexers runs in a single thread, it is a well-known bottleneck. Especially on a high-end CPU with 12+ parallel indexers, the indexers may produce data faster than the database synchronization can keep up with. If too much data piles up in memory, the indexers have to stop and wait. For that reason one of our main concerns was improving database insertion speed.
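The waiting behavior described above can be sketched as a bounded producer/consumer queue. This is only an illustration under simplifying assumptions, not Sourcetrail's implementation: it uses threads only (no separate indexer processes), the "database insert" is just a counter, and `IndexerResult`, `BoundedResultQueue` and `runIndexing` are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the data recorded for one translation unit.
struct IndexerResult
{
    int indexerId;
    int recordCount;
};

// Bounded queue: push() blocks while the queue is full, which makes the
// producing indexers wait until the single consumer has caught up.
class BoundedResultQueue
{
public:
    explicit BoundedResultQueue(std::size_t capacity): m_capacity(capacity)
    {
        assert(capacity > 0);
    }

    void push(IndexerResult result)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notFull.wait(lock, [this] { return m_queue.size() < m_capacity; });
        m_queue.push(result);
        m_notEmpty.notify_one();
    }

    IndexerResult pop()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notEmpty.wait(lock, [this] { return !m_queue.empty(); });
        IndexerResult result = m_queue.front();
        m_queue.pop();
        m_notFull.notify_one();
        return result;
    }

private:
    const std::size_t m_capacity;
    std::mutex m_mutex;
    std::condition_variable m_notFull;
    std::condition_variable m_notEmpty;
    std::queue<IndexerResult> m_queue;
};

// Starts `indexerCount` producer threads that each push `unitsPerIndexer`
// results, while the calling thread merges them one by one. Returns the
// total number of records merged.
int runIndexing(int indexerCount, int unitsPerIndexer, std::size_t queueCapacity)
{
    BoundedResultQueue queue(queueCapacity);
    std::vector<std::thread> indexers;
    for (int id = 0; id < indexerCount; ++id)
    {
        indexers.emplace_back([&queue, id, unitsPerIndexer] {
            for (int unit = 0; unit < unitsPerIndexer; ++unit)
            {
                queue.push(IndexerResult{id, 10});  // pretend: 10 records per unit
            }
        });
    }

    int mergedRecords = 0;
    for (int i = 0; i < indexerCount * unitsPerIndexer; ++i)
    {
        mergedRecords += queue.pop().recordCount;  // single-threaded "insert"
    }

    for (std::thread& indexer: indexers)
    {
        indexer.join();
    }
    return mergedRecords;
}
```

In this model, the faster `pop()` and the work after it run, the less time the producers spend blocked in `push()`. That is why insertion speed matters more the more indexers run in parallel.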

Database insertion comparison between our last three releases on the LLVM/Clang codebase.

The chart above shows how we improved insertion times over the last versions. If you have a lot of CPU cores and database insertion was the main bottleneck on your machine, then chances are you will see an overall indexing speed improvement in line with this chart.

Tab Bar

Over the past months there have been lots of requests for multi-window or tab support, to allow looking at multiple symbols or even projects simultaneously. We made the first step with our new tab bar, located at the top of the window.

Sourcetrail's new tab bar, located at the top of the application window.

The user interaction and shortcuts are the same as in your web browser, with the one exception that tabs cannot be detached into separate windows yet. We also added Open in New Tab context menu actions to the graph and code view. Alternatively, you can use the middle mouse button.

Type-use edges to template/generics argument type from parent context

Sourcetrail makes it possible to inspect how template/generics types are composed, by showing you all types that are used as template/generics arguments. But sometimes it can be cumbersome to see which types really depend on which other types.

Let’s take a look at an example:

class Object {};

template <typename T>
class SharedPointer {};

SharedPointer<Object> object;

In Sourcetrail we display this type relationship as seen in the image below. The variable object is of type SharedPointer<Object> which is based on the template type SharedPointer and uses the type Object as template argument type.

Type relationships when using templates/generics.

This looks fine the way it is. But if you look at the variable object or at the class Object individually, the dependency between them is hidden. If you look at object, you don’t easily see from the graph that it uses the type Object. When looking at the class Object, you don’t see right away that there is a variable object using this type.

Types and variables used to only show the connection to the composed template type.

With this new release we added type-use edges from the parent context to the type used as template/generic argument. That way it is easier to spot how types are used in your codebase when templates/generics are involved.

Now variables and types show a connection when template/generics are involved.

In combination with our node bundling, this also has a nice side effect. If we have a C++ declaration of the form std::vector<std::shared_ptr<Object>> objects, it is now very easy to see that this is a container of Object instances, with all the standard library classes hidden away in the Non-Indexed Symbols bundle.

Graph hides non-indexed standard library types into a bundle, leaving only the types defined within the project.
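The idea behind the new edges can be sketched in a few lines of C++. This is a simplified model and not Sourcetrail's actual data structures: `TypeUse`, `Edge` and `collectTypeUseEdges` are hypothetical names, and the sketch assumes that the parent context (here the variable objects) gets one type-use edge to every type that appears as a template argument at any nesting depth.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a type as written in source: a base name plus
// template arguments, e.g. std::vector<std::shared_ptr<Object>>.
struct TypeUse
{
    std::string name;
    std::vector<TypeUse> arguments;
};

using Edge = std::pair<std::string, std::string>;  // (context, used type)

// Adds one type-use edge from `context` to the type itself and to every
// type that appears, at any depth, as a template argument.
void collectTypeUseEdges(const std::string& context, const TypeUse& type, std::vector<Edge>& edges)
{
    edges.emplace_back(context, type.name);
    for (const TypeUse& argument: type.arguments)
    {
        collectTypeUseEdges(context, argument, edges);
    }
}

// Builds the edges for: std::vector<std::shared_ptr<Object>> objects;
std::vector<Edge> edgesForObjectsVariable()
{
    const TypeUse objectType{"Object", {}};
    const TypeUse sharedPtrType{"std::shared_ptr", {objectType}};
    const TypeUse vectorType{"std::vector", {sharedPtrType}};

    std::vector<Edge> edges;
    collectTypeUseEdges("objects", vectorType, edges);
    return edges;
}
```

In this model the variable objects ends up with a direct edge to Object. Once the two standard library types are bundled away, that edge is the one that remains visible.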

Closing comments

Thanks for reading, we hope that you like our progress! Don’t forget to download the new release build and to join the Sourcetrail Slack channel.
