Channel: C++ Team Blog

How to Use Class Template Argument Deduction


Class Template Argument Deduction (CTAD) is a C++17 Core Language feature that reduces code verbosity. C++17’s Standard Library also supports CTAD, so after upgrading your toolset, you can take advantage of this new feature when using STL types like std::pair and std::vector. Class templates in other libraries and your own code will partially benefit from CTAD automatically, but sometimes they’ll need a bit of new code (deduction guides) to fully benefit. Fortunately, both using CTAD and providing deduction guides are pretty easy, despite template metaprogramming’s fearsome reputation!

 

CTAD support is available in VS 2017 15.7 and later with the /std:c++17 and /std:c++latest compiler options.

 

Template Argument Deduction

C++98 through C++14 performed template argument deduction for function templates. Given a function template like “template <typename RanIt> void sort(RanIt first, RanIt last);”, you can and should sort a std::vector<int> without explicitly specifying that RanIt is std::vector<int>::iterator. When the compiler sees “sort(v.begin(), v.end());”, it knows what the types of “v.begin()” and “v.end()” are, so it can determine what RanIt should be. The process of determining template arguments for template parameters (by comparing the types of function arguments to function parameters, according to rules in the Standard) is known as template argument deduction, which makes function templates far more usable than they would otherwise be.

 

However, class templates didn’t benefit from these rules. If you wanted to construct a std::pair from two ints, you had to say “std::pair<int, int> p(11, 22);”, despite the fact that the compiler already knows that the types of 11 and 22 are int. The workaround for this limitation was to use function template argument deduction: std::make_pair(11, 22) returns std::pair<int, int>. Like most workarounds, this is problematic for a few reasons: defining such helper functions often involves template metaprogramming (std::make_pair() needs to perform perfect forwarding and decay, among other things), compiler throughput is reduced (as the front-end has to instantiate the helper, and the back-end has to optimize it away), debugging is more annoying (as you have to step through helper functions), and there’s still a verbosity cost (the extra “make_” prefix, and if you want a local variable instead of a temporary, you need to say “auto”).

 

Hello, CTAD World

C++17 extends template argument deduction to the construction of an object given only the name of a class template. Now, you can say “std::pair(11, 22)” and this is equivalent to “std::pair<int, int>(11, 22)”. Here’s a full example, with a C++17 terse static_assert verifying that the declared type of p is the same as std::pair<int, const char *>:

 

C:\Temp>type meow.cpp

#include <type_traits>

#include <utility>

 

int main() {

    std::pair p(1729, "taxicab");

    static_assert(std::is_same_v<decltype(p), std::pair<int, const char *>>);

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 meow.cpp

meow.cpp

 

C:\Temp>

 

CTAD works with parentheses and braces, and named variables and nameless temporaries.

 

Another Example: array and greater

C:\Temp>type arr.cpp

#include <algorithm>

#include <array>

#include <functional>

#include <iostream>

#include <string_view>

#include <type_traits>

using namespace std;

 

int main() {

    array arr = { "lion"sv, "direwolf"sv, "stag"sv, "dragon"sv };

 

    static_assert(is_same_v<decltype(arr), array<string_view, 4>>);

 

    sort(arr.begin(), arr.end(), greater{});

 

    cout << arr.size() << ": ";

 

    for (const auto& e : arr) {

        cout << e << " ";

    }

 

    cout << "\n";

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 arr.cpp && arr

arr.cpp

4: stag lion dragon direwolf

 

This demonstrates a couple of neat things. First, CTAD for std::array deduces both its element type and its size. Second, CTAD works with default template arguments; greater{} constructs an object of type greater<void> because it’s declared as “template <typename T = void> struct greater;”.

 

CTAD for Your Own Types

C:\Temp>type mypair.cpp

#include <type_traits>

 

template <typename A, typename B> struct MyPair {

    MyPair() { }

    MyPair(const A&, const B&) { }

};

 

int main() {

    MyPair mp{11, 22};

 

    static_assert(std::is_same_v<decltype(mp), MyPair<int, int>>);

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 mypair.cpp

mypair.cpp

 

C:\Temp>

 

In this case, CTAD automatically works for MyPair. What happens is that the compiler sees that a MyPair is being constructed, so it runs template argument deduction for MyPair’s constructors. Given the signature (const A&, const B&) and the arguments of type int, A and B are deduced to be int, and those template arguments are used for the class and the constructor.

 

However, “MyPair{}” would emit a compiler error. That’s because the compiler would attempt to deduce A and B, but there are no constructor arguments and no default template arguments, so it can’t guess whether you want MyPair<int, int> or MyPair<Starship, Captain>.

 

Deduction Guides

In general, CTAD automatically works when class templates have constructors whose signatures mention all of the class template parameters (like MyPair above). However, sometimes constructors themselves are templated, which breaks the connection that CTAD relies on. In those cases, the author of the class template can provide “deduction guides” that tell the compiler how to deduce class template arguments from constructor arguments.

 

C:\Temp>type guides.cpp

#include <iterator>

#include <type_traits>

 

template <typename T> struct MyVec {

    template <typename Iter> MyVec(Iter, Iter) { }

};

 

template <typename Iter> MyVec(Iter, Iter) -> MyVec<typename std::iterator_traits<Iter>::value_type>;

 

template <typename A, typename B> struct MyAdvancedPair {

    template <typename T, typename U> MyAdvancedPair(T&&, U&&) { }

};

 

template <typename X, typename Y> MyAdvancedPair(X, Y) -> MyAdvancedPair<X, Y>;

 

int main() {

    int * ptr = nullptr;

    MyVec v(ptr, ptr);

 

    static_assert(std::is_same_v<decltype(v), MyVec<int>>);

 

    MyAdvancedPair adv(1729, "taxicab");

 

    static_assert(std::is_same_v<decltype(adv), MyAdvancedPair<int, const char *>>);

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 guides.cpp

guides.cpp

 

C:\Temp>

 

Here are two of the most common cases for deduction guides in the STL: iterators and perfect forwarding. MyVec resembles a std::vector in that it’s templated on an element type T, but it’s constructible from an iterator type Iter. Calling the range constructor provides the type information we want, but the compiler can’t possibly realize the relationship between Iter and T. That’s where the deduction guide helps. After the class template definition, the syntax “template <typename Iter> MyVec(Iter, Iter) -> MyVec<typename std::iterator_traits<Iter>::value_type>;” tells the compiler “when you’re running CTAD for MyVec, attempt to perform template argument deduction for the signature MyVec(Iter, Iter). If that succeeds, the type you want to construct is MyVec<typename std::iterator_traits<Iter>::value_type>”. That essentially dereferences the iterator type to get the element type we want.

 

The other case is perfect forwarding, where MyAdvancedPair has a perfect forwarding constructor like std::pair does. Again, the compiler sees that A and B versus T and U are different types, and it doesn’t know the relationship between them. In this case, the transformation we need to apply is different: we want decay (if you’re unfamiliar with decay, you can skip this). Interestingly, we don’t need decay_t, although we could use that type trait if we wanted extra verbosity. Instead, the deduction guide “template <typename X, typename Y> MyAdvancedPair(X, Y) -> MyAdvancedPair<X, Y>;” is sufficient. This tells the compiler “when you’re running CTAD for MyAdvancedPair, attempt to perform template argument deduction for the signature MyAdvancedPair(X, Y), as if it were taking arguments by value. Such deduction performs decay. If it succeeds, the type you want to construct is MyAdvancedPair<X, Y>.”

 

This demonstrates a critical fact about CTAD and deduction guides. CTAD looks at a class template’s constructors, plus its deduction guides, in order to determine the type to construct. That deduction either succeeds (determining a unique type) or fails. Once the type to construct has been chosen, overload resolution to determine which constructor to call happens normally. CTAD doesn’t affect how the constructor is called. For MyAdvancedPair (and std::pair), the deduction guide’s signature (taking arguments by value, notionally) affects the type chosen by CTAD. Afterwards, overload resolution chooses the perfect forwarding constructor, which takes its arguments by perfect forwarding, exactly as if the class type had been written with explicit template arguments.

 

CTAD and deduction guides are also non-intrusive. Adding deduction guides for a class template doesn’t affect existing code, which previously was required to provide explicit template arguments. That’s why we were able to add deduction guides for many STL types without breaking a single line of user code.

 

Enforcement

In rare cases, you might want deduction guides to reject certain code. Here’s how std::array does it:

 

C:\Temp>type enforce.cpp

#include <stddef.h>

#include <type_traits>

 

template <typename T, size_t N> struct MyArray {

    T m_array[N];

};

 

template <typename First, typename... Rest> struct EnforceSame {

    static_assert(std::conjunction_v<std::is_same<First, Rest>...>);

    using type = First;

};

 

template <typename First, typename... Rest> MyArray(First, Rest...)

    -> MyArray<typename EnforceSame<First, Rest...>::type, 1 + sizeof...(Rest)>;

 

int main() {

    MyArray a = { 11, 22, 33 };

    static_assert(std::is_same_v<decltype(a), MyArray<int, 3>>);

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 enforce.cpp

enforce.cpp

 

C:\Temp>

 

Like std::array, MyArray is an aggregate with no actual constructors, but CTAD still works for these class templates via deduction guides. MyArray’s guide performs template argument deduction for MyArray(First, Rest…), enforcing all of the types to be the same, and determining the array’s size from how many arguments there are.

 

Similar techniques could be used to make CTAD entirely ill-formed for certain constructors, or all constructors. The STL itself hasn’t needed to do that explicitly, though. (There are only two classes where CTAD would be undesirable: unique_ptr and shared_ptr. C++17 supports both unique_ptrs and shared_ptrs to arrays, but both “new T” and “new T[N]” return T *. Therefore, there’s insufficient information to safely deduce the type of a unique_ptr or shared_ptr being constructed from a raw pointer. As it happens, this is automatically blocked in the STL due to unique_ptr’s support for fancy pointers and shared_ptr’s support for type erasure, both of which change the constructor signatures in ways that prevent CTAD from working.)

 

Corner Cases for Experts: Non-Deduced Contexts

Here are some advanced examples that aren’t meant to be imitated; instead, they’re meant to illustrate how CTAD works in complicated scenarios.

 

Programmers who write function templates eventually learn about “non-deduced contexts”. For example, a function template taking “typename Identity<T>::type” can’t deduce T from that function argument. Now that CTAD exists, non-deduced contexts affect the constructors of class templates too.

 

C:\Temp>type corner1.cpp

template <typename X> struct Identity {

    using type = X;

};

 

template <typename T> struct Corner1 {

    Corner1(typename Identity<T>::type, int) { }

};

 

int main() {

    Corner1 corner1(3.14, 1729);

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 corner1.cpp

corner1.cpp

corner1.cpp(10): error C2672: 'Corner1': no matching overloaded function found

corner1.cpp(10): error C2783: 'Corner1<T> Corner1(Identity<X>::type,int)': could not deduce template argument for 'T'

corner1.cpp(6): note: see declaration of 'Corner1'

corner1.cpp(10): error C2641: cannot deduce template argument for 'Corner1'

corner1.cpp(10): error C2514: 'Corner1': class has no constructors

corner1.cpp(5): note: see declaration of 'Corner1'

 

In corner1.cpp, “typename Identity<T>::type” prevents the compiler from deducing that T should be double.

 

Here’s a case where some but not all constructors mention T in a non-deduced context:

 

C:\Temp>type corner2.cpp

template <typename X> struct Identity {

    using type = X;

};

 

template <typename T> struct Corner2 {

    Corner2(T, long) { }

    Corner2(typename Identity<T>::type, unsigned long) { }

};

 

int main() {

    Corner2 corner2(3.14, 1729);

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 corner2.cpp

corner2.cpp

corner2.cpp(11): error C2668: 'Corner2<double>::Corner2': ambiguous call to overloaded function

corner2.cpp(7): note: could be 'Corner2<double>::Corner2(double,unsigned long)'

corner2.cpp(6): note: or       'Corner2<double>::Corner2(T,long)'

        with

        [

            T=double

        ]

corner2.cpp(11): note: while trying to match the argument list '(double, int)'

 

In corner2.cpp, CTAD succeeds but constructor overload resolution fails. CTAD ignores the constructor taking “(typename Identity<T>::type, unsigned long)” due to the non-deduced context, so CTAD uses only “(T, long)” for deduction. Like any function template, comparing the parameters “(T, long)” to the argument types “double, int” deduces T to be double. (int is convertible to long, which is sufficient for template argument deduction; it doesn’t demand an exact match there.) After CTAD has determined that Corner2<double> should be constructed, constructor overload resolution considers both signatures “(double, long)” and “(double, unsigned long)” after substitution, and those are ambiguous for the argument types “double, int” (because int is convertible to both long and unsigned long, and the Standard doesn’t prefer either conversion).

 

Corner Cases for Experts: Deduction Guides Are Preferred

C:\Temp>type corner3.cpp

#include <type_traits>

 

template <typename T> struct Corner3 {

    Corner3(T) { }

    template <typename U> Corner3(U) { }

};

 

#ifdef WITH_GUIDE

    template <typename X> Corner3(X) -> Corner3<X *>;

#endif

 

int main() {

    Corner3 corner3(1729);

 

#ifdef WITH_GUIDE

    static_assert(std::is_same_v<decltype(corner3), Corner3<int *>>);

#else

    static_assert(std::is_same_v<decltype(corner3), Corner3<int>>);

#endif

}

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 corner3.cpp

corner3.cpp

 

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 /DWITH_GUIDE corner3.cpp

corner3.cpp

 

C:\Temp>

 

CTAD works by performing template argument deduction and overload resolution for a set of deduction candidates (hypothetical function templates) that are generated from the class template’s constructors and deduction guides. In particular, this follows the usual rules for overload resolution with only a couple of additions. Overload resolution still prefers things that are more specialized (N4713 16.3.3 [over.match.best]/1.7). When things are equally specialized, there’s a new tiebreaker: deduction guides are preferred (/1.12).

 

In corner3.cpp, without a deduction guide, the Corner3(T) constructor is used for CTAD (whereas Corner3(U) isn’t used for CTAD because it doesn’t mention T), and Corner3<int> is constructed. When the deduction guide is added, the signatures Corner3(T) and Corner3(X) are equally specialized, so paragraph /1.12 steps in and prefers the deduction guide. This says to construct Corner3<int *> (which then calls Corner3(U) with U = int).

 

Reporting Bugs

Please let us know what you think about VS. You can report bugs via the IDE’s Report A Problem and also via the web: go to the VS Developer Community and click on the C++ tab.


Standard Library Algorithms: Changes and Additions in C++17


Today we have a guest post from Marc Gregoire, Software Architect at Nikon Metrology and Microsoft MVP since 2007.

 

The C++14 standard already contains a wealth of different kinds of algorithms. C++17 adds a couple more algorithms and updates some existing ones. This article explains what’s new and what has changed in the C++17 Standard Library.

New Algorithms

Sampling

C++17 includes the following new sampling algorithm:

  • sample(first, last, out, n, gen)

It uses the given random number generator (gen) to pick n random elements from a given range [first, last) and writes them to the given output iterator (out).

Here is a simple piece of code that constructs a vector containing the integers 1 to 20. It then sets up a random number generator, and finally generates 10 sequences of 5 values, each sequence sampled without replacement from the data vector:

using namespace std;

vector<int> data(20);
iota(begin(data), end(data), 1);
copy(cbegin(data), cend(data), ostream_iterator<int>(cout, " "));
cout << '\n';

random_device seeder;
const auto seed = seeder.entropy() ? seeder() : time(nullptr);
default_random_engine generator(
       static_cast<default_random_engine::result_type>(seed));

const size_t numberOfSamples = 5;
vector<int> sampledData(numberOfSamples);

for (size_t i = 0; i < 10; ++i)
{
    sample(cbegin(data), cend(data), begin(sampledData),
           numberOfSamples, generator);
    copy(cbegin(sampledData), cend(sampledData),
         ostream_iterator<int>(cout, " "));
    cout << '\n';
}

Here is an example of a possible output:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
4 8 9 17 19
4 7 12 13 18
5 7 8 14 18
1 4 5 10 20
2 4 8 13 17
2 3 4 5 20
4 7 8 9 13
1 7 8 10 15
4 5 8 12 13
1 3 8 10 19

Iterating

The C++ Standard Library already included for_each() to process each element in a given range. C++17 adds a for_each_n(first, n, func) algorithm. It calls the given function object (func) for each element in the range given by a first iterator (first) and a number of elements (n). As such, it is very similar to for_each(), but for_each_n() only processes the first n elements of the range.

Here is a simple example that generates a vector of 20 values, then uses for_each_n() to print the first 5 values to the console:

using namespace std;

vector<int> data(20);
iota(begin(data), end(data), 1);

for_each_n(begin(data), 5,
           [](const auto& value) { cout << value << '\n'; });

Searching

C++17 includes a couple of specialized searchers, all defined in <functional>:

  • default_searcher
  • boyer_moore_searcher
  • boyer_moore_horspool_searcher

The Boyer-Moore searchers are often used to find a piece of text in a large block of text, and are usually more efficient than the default searcher. In practice, the two Boyer-Moore searchers are able to skip certain characters instead of having to compare each individual character. This gives these algorithms a sublinear complexity, making them much faster than the default searcher. See the Wikipedia article for more details of the algorithm.

To use these specialized searchers, you create an instance of one of them and pass that instance as the last parameter to std::search(), for example:

using namespace std;

const string haystack = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.";
const string needle = "consectetur";

const auto result = search(cbegin(haystack), cend(haystack),
       boyer_moore_searcher(cbegin(needle), cend(needle)));

if (result != cend(haystack))
    cout << "Found it.\n";
else
    cout << "Not found.\n";

If you want to carry out multiple searches on the same range, you can construct a single instance of std::boyer_moore_searcher and reuse it, rather than creating a new one for each std::search() call.

Generalized Sum Algorithms

Scan

The following generalized sum algorithms have been added to the C++17 Standard Library:

  • exclusive_scan(first, last, out, init[, bin_op])
  • inclusive_scan(first, last, out[, bin_op[, init]])
  • transform_exclusive_scan(first, last, out, init, bin_op, un_op)
  • transform_inclusive_scan(first, last, out, bin_op, un_op[, init])

Where bin_op is a binary operator (std::plus<>() by default), and un_op is a unary operator.

All these algorithms calculate a sequence of sums of the elements in a given range [first, last), denoted as [e0, en). The calculated sums are written to [out, out + (last-first)), denoted as [s0, sn). Suppose further that we denote the binary operator (bin_op) as ⊕. The exclusive_scan() algorithm then calculates the following sequence of sums:

s0 = init
s1 = init ⊕ e0
s2 = init ⊕ e0 ⊕ e1
…
sn-1 = init ⊕ e0 ⊕ e1 ⊕ … ⊕ en−2

While inclusive_scan() calculates the following sums:

s0 = init ⊕ e0
s1 = init ⊕ e0 ⊕ e1
…
sn-1 = init ⊕ e0 ⊕ e1 ⊕ … ⊕ en−1

The only difference is that inclusive_scan() includes the ith element in the ith sum, while exclusive_scan() does not include the ith element in the ith sum.

exclusive_scan() and inclusive_scan() are similar to partial_sum(). However, partial_sum() evaluates everything from left to right, while exclusive_scan() and inclusive_scan() evaluate in an unspecified order. That means their result is non-deterministic if the binary operator used is not associative. Because of this relaxed ordering, these algorithms can be executed in parallel by specifying a parallel execution policy (see the Parallel Algorithms section below).

The sums calculated by inclusive_scan() with init equal to 0 and an associative binary operator are exactly the same as the sums calculated by partial_sum().

transform_exclusive_scan() and transform_inclusive_scan() are very similar. The only difference is that they apply the given unary operator before calculating the sum. Suppose the unary operator is denoted as a function call f(). The transform_exclusive_scan() algorithm then calculates the following sums sequence:

s0 = init
s1 = init ⊕ f(e0)
s2 = init ⊕ f(e0) ⊕ f(e1)
…
sn-1 = init ⊕ f(e0) ⊕ f(e1) ⊕ … ⊕ f(en−2)

And the transform_inclusive_scan() algorithm calculates the following sums:

s0 = init ⊕ f(e0)
s1 = init ⊕ f(e0) ⊕ f(e1)
…
sn-1 = init ⊕ f(e0) ⊕ f(e1) ⊕ … ⊕ f(en−1)

Reduce

Additionally, the following two reduce algorithms have been added:

  • reduce(first, last[, init[, bin_op]])
  • transform_reduce(first, last, init, bin_op, un_op)

Where bin_op is a binary operator, std::plus<>() by default. These algorithms result in a single value, similar to accumulate().

Suppose again that the range [first, last) is denoted as [e0, en). reduce() then calculates the following sum:

init ⊕ e0 ⊕ e1 ⊕ … ⊕ en−1

While transform_reduce() results in the following sum, assuming the unary operator is denoted as a function call f():

init ⊕ f(e0) ⊕ f(e1) ⊕ … ⊕ f(en−1)

Unlike accumulate(), reduce() supports parallel execution. The accumulate() algorithm always evaluates everything deterministically from left to right, while the evaluation order is non-deterministic for reduce(). A consequence is that the result of reduce() will be non-deterministic in case the binary operator is not associative or not commutative.

The sum calculated by reduce() with init equal to 0 is exactly the same as the result of calling accumulate() as long as the binary operator that is used is associative and commutative.

Finally, there is another set of overloads for transform_reduce():

  • transform_reduce(first1, last1, first2, init[, bin_op1, bin_op2])

It requires two ranges: a range [first1, last1), denoted as [a0, an), and a range starting at first2, denoted as [b0, bn). Suppose bin_op1 (std::plus<>() by default) is denoted as ⊕, and bin_op2 (std::multiplies<>() by default) is denoted as ⊖, then it calculates the following sum:

init ⊕ (a0 ⊖ b0) ⊕ (a1 ⊖ b1) ⊕ … ⊕ (an-1 ⊖ bn-1)

Parallel Algorithms

A major addition to the C++17 Standard Library is support for parallel execution of more than 60 of its algorithms, such as sort(), all_of(), find(), transform(), …

If a Standard Library algorithm supports parallel execution, then it accepts an execution policy as its first parameter. This policy determines to what degree the algorithm may parallelize or vectorize its execution. Currently, the following policy types and instances are defined in the std::execution namespace in the <execution> header:

  • sequenced_policy (global instance seq): no parallel execution is allowed.
  • parallel_policy (global instance par): parallel execution is allowed.
  • parallel_unsequenced_policy (global instance par_unseq): parallel and vectorized execution is allowed; execution is also allowed to switch between different threads.

The parallel_unsequenced_policy imposes a lot of restrictions on what the algorithm’s function callbacks are allowed to do. With that policy, the calls to the callbacks are unsequenced, so they are not allowed to perform memory allocation/deallocation, acquire mutexes, and more. The other policies do not have such restrictions, and in fact guarantee that their callback calls are sequenced, although possibly in a non-deterministic order. In any case, you are responsible for preventing data races and deadlocks.

Using these parallel policies is straightforward. Here is a quick example that generates a vector of 1 billion double values, then uses the std::transform() algorithm to calculate the square root of each value in parallel:

using namespace std;

vector<double> data(1'000'000'000);
iota(begin(data), end(data), 1);

transform(execution::par_unseq, begin(data), end(data), begin(data),
          [](const auto& value) { return sqrt(value); });

If you run this piece of code on an 8-core machine, the CPU load can look as follows. The peak you see on all eight cores is the parallel execution of the call to std::transform().

Utility Functions

C++17 also includes a couple of handy utility functions that are not really algorithms, but still useful to know.

clamp()

std::clamp(value, low, high) is defined in the <algorithm> header. It ensures that a given value is within a given range [low, high]. The result of calling clamp() is:

  • a reference to low if value < low
  • a reference to high if value > high
  • otherwise, a reference to the given value

One use-case is to clamp audio samples to a 16-bit range:

using namespace std;
const int low = -32'768;
const int high = 32'767;
cout << clamp(12'000, low, high) << '\n';
cout << clamp(-36'000, low, high) << '\n';
cout << clamp(40'000, low, high) << '\n';

The output of this code snippet is as follows:

12000
-32768
32767

gcd() and lcm()

std::gcd() returns the greatest common divisor of two integer types, while lcm() returns the least common multiple of two integer types. Both algorithms are defined in the <numeric> header.

Using these algorithms is straightforward, for example:

cout << gcd(24, 44) << '\n';
cout << lcm(24, 44) << '\n';

The output is as follows:

4
264

Removed Algorithms

C++17 has removed one algorithm: std::random_shuffle(). This algorithm was already deprecated in C++14. You should use std::shuffle() instead.

Further Reading Material

Have a look at my book, “Professional C++, 4th Edition”, published by Wiley/Wrox, for a more in-depth overview of all the functionality provided by the C++17 Standard Library. It also includes a description of all language features that have been added by C++17.

Additionally, you can also read more about certain C++17 features on my C++17 blog post series.

Exploring Clang Tooling Part 1: Extending Clang-Tidy


This post is part of a regular series of posts where the C++ product team and other guests answer questions we have received from customers. The questions can be about anything C++ related: MSVC toolset, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc.

Today’s post is by guest author Stephen Kelly, who is a developer at Havok, a contributor to Qt and CMake and a blogger. This post is part of a series where he is sharing his experience using Clang tooling in his current team.

This post is part of a three-part series about using the Clang AST Matchers to mechanically refactor C++ code. In the last post, we ensured that the code is at least buildable with Clang. While that part was only interesting to those whose code was not yet building with Clang, the rest of the series appeals to a general C++ audience.

There are many other resources on the internet covering use of existing clang-tidy checks and their motivation, such as my previous employer. There is very little information online about the developer workflow and tools to use when creating custom extensions for source to source transformations with clang-tidy. This blog series aims to fill that gap.

Refactoring at Scale

Let’s start with intent and motivation to get everyone on the same page.

Imagine we wish to rename the time() method on the QDateTimeEdit class to value(). We could start with using a simple find-replace to transform all instances of the text time in our code to value. This will obviously change too much – variables or class members named time which are local to functions will be changed to value for example.

We could try to refine our pattern to find expressions which look like calls to time() and convert only those to value(), but that leaves the problem that other classes with a method named time() would be affected, even though we don’t want that. Occurrences of time() in strings and comments would also be affected, even though we may or may not want that. We only want to change the method on one class and its subclasses, and all relevant callers.

One strategy would be to make the change in the QDateTimeEdit class header and try to build the code, fixing any method calls as they are reported as build errors by the compiler. Anyone who has tried that method knows that it doesn’t scale, so we try to automate it. If the name of the method we want to port is unique enough, a naïve approach with a find-replace script can work. Even with methods which have common names, modern developer IDEs offer features for some semantic porting, so the example is a bit contrived. However, we can come up with examples which are not so generic that they are built into IDEs.

Clang tooling possibilities go far beyond what we might be able to achieve with regular expressions and text processing tools such as sed. Early presentations of the tooling such as at the LLVM conference and C++Now emphasize several goals:

  • Contextual porting of constructs spelled the same way
  • Differentiation of methods on different classes with the same name
  • A batch mode of operation as opposed to an interactive/human intervention
  • Make barrier to creating a “one-off tool solving a hard problem” very low
  • Refactor code based on implicit constructs and rules in C++

As a C++ engineer maintaining a codebase, you may immediately see things which you can use this for, such as:

  • Rename particular methods, variables or classes (including porting callers)
  • Replace a type with a new type, (including in function signatures and users of APIs)
  • Make a constructor explicit (including adding explicit construction in calling code)

This is compelling because it requires less programmer time to create such tooling than to directly attempt to make the changes in the source code. Coding practices change over time as do the people advocating them! Even a well-maintained codebase will have constructs which are worth changing if it is old enough.

Dive in

Let’s follow the steps that a C++ developer would take to create a new clang-tidy tool. We’ll follow the spirit of ‘dive in’ and see how far we can get with basic concepts, extending our knowledge as needed. This means we will gloss over some code, APIs, command line options and concepts where they get in the way of what we are focusing on.

It is possible to create stand-alone tools based on Clang, but in this case, extending clang-tidy itself provides several advantages, such as its test infrastructure and existing build system. However, because clang-tidy does not support external plugins, we are currently required to build LLVM, Clang, and clang-tidy from source.

One way to do that relatively quickly is the following:

cd ${SRC_ROOT}
git clone https://git.llvm.org/git/llvm.git
cd llvm/tools
git clone https://git.llvm.org/git/clang.git
cd clang/tools
git clone https://git.llvm.org/git/clang-tools-extra.git extra
cd ../../..
mkdir build
cd build
cmake .. -G "Visual Studio 15 2017" ^
    -DCMAKE_GENERATOR_PLATFORM=x64 ^
    -Thost=x64 ^
    -DLLVM_INCLUDE_TESTS=OFF ^
    -DLLVM_TARGETS_TO_BUILD="" ^
    -DCLANG_ENABLE_STATIC_ANALYZER=OFF ^
    -DCLANG_ENABLE_ARCMT=OFF
cmake --build . --target clang-tidy --config RelWithDebInfo
cmake --build . --target clang-query --config RelWithDebInfo

It is important to ensure that clang-tools-extra is cloned to a directory called extra, as the build system relies on that.

clang-tidy can be used both to check for issues in code and to actually implement source-to-source transformation. Extensions to it are simply called checks in the documentation and in the extension API, but those checks can be responsible for more than just “checking”.

The development loop for creating a clang-tidy extension is an iterative cycle.

Create a new check

We start by creating a new clang-tidy check. We then examine the Abstract Syntax Tree to determine how it relates to the source code, prototype a Matcher to process the AST and use the Clang FixIt system to replace patterns in the source code. This process is iterated until all relevant patterns are ported by the tool. We will cover all parts of this process as we progress through the blog series.

Once the LLVM/Clang build has finished, we can run the add_new_check.py script in the clang-tidy source directory to generate code for our porting tool.

cd ${SRC_ROOT}\llvm\tools\clang\tools\extra\clang-tidy
python add_new_check.py misc my-first-check

This generates code in the clang-tidy source. Examining the source of ./misc/MyFirstCheck.cpp, we can see that it adds the prefix awesome_ to functions which do not already have it. Examining ./misc/MiscTidyModule.cpp, we can see that our new check is registered in the tool with the name misc-my-first-check.

Rebuild clang-tidy and try it out on some test code:

void foo()
{  
}

void awesome_bar()
{  
}

Run clang-tidy from the command line with our new check:

clang-tidy.exe -checks=-*,misc-my-first-check testfile.cpp --

The -checks option accepts a comma-separated mini-language which is used to enable and disable checks to run on the specified source file. clang-tidy has several checks which are enabled by default. The -* part disables any default checks, and the misc-my-first-check part enables only our new check. Further information about enabling and disabling checks is available in the clang-tidy documentation.
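As an alternative to passing the filter on every invocation, the same mini-language can live in a .clang-tidy configuration file at the project root, which clang-tidy picks up automatically (a minimal sketch; see the clang-tidy documentation for the full set of keys):

```yaml
Checks: '-*,misc-my-first-check'
```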

The two dashes trailing the command silence a warning about a missing compilation database. Additional compile options, such as include directories and preprocessor definitions, may be specified after the dashes if needed.

The output shows that clang found our function and recommended adding the awesome_ prefix:

1 warning generated.
testfile.cpp:2:6: warning: function 'foo' is insufficiently awesome [misc-my-first-check]
void foo()
     ^~~
     awesome_

Adding the -fix command line parameter actually causes clang-tidy to rewrite our source to modify the functions:

clang-tidy.exe -checks=-*,misc-my-first-check -fix testfile.cpp

1 warning generated.
testfile.cpp:2:6: warning: function 'foo' is insufficiently awesome [misc-my-first-check]
void foo()
     ^~~
     awesome_
testfile.cpp:2:6: note: FIX-IT applied suggested code changes
clang-tidy applied 1 of 1 suggested fixes.

So, clang-tidy reports that it changed our source file, and if we check, we will see that the testfile.cpp now has updated content!

Validation

It looks like we have a semi-generic renaming tool, so let’s try something a little more complicated. Let’s revert the “fix” and call foo() from awesome_bar():

void foo()
{
}

void awesome_bar()
{
    foo();
}

If we run clang-tidy again with the -fix option, we will see that the void foo() function definition was ported but the call to foo() was not ported and the result does not build.

This may be obvious to some readers – declarations of functions are different from uses or calls of functions. The code auto-generated by the add_new_check.py script is not sophisticated enough yet to make code totally awesome.

The next blog post will explore how Clang represents C++ source code. We will then be in a position to extend this new tool to also port the function calls.

What kinds of mechanical source transformations do you intend to implement in your codebase? Let us know in the comments below or contact the author directly via e-mail at stkelly@microsoft.com, or on Twitter @steveire.

Exploring Clang Tooling Part 2: Examining the Clang AST with clang-query


This post is part of a regular series of posts where the C++ product team and other guests answer questions we have received from customers. The questions can be about anything C++ related: MSVC toolset, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc.

Today’s post is by guest author Stephen Kelly, who is a developer at Havok, a contributor to Qt and CMake and a blogger. This post is part of a series where he is sharing his experience using Clang tooling in his current team.

In the last post, we created a new clang-tidy check following documented steps and encountered the first limitation in our own knowledge – how can we change both declarations and expressions such as function calls?

In order to create an effective refactoring tool, we need to understand the code generated by the add_new_check.py script and learn how to extend it.

Exploring C++ Code as C++ Code

When Clang processes C++, it creates an Abstract Syntax Tree representing the code. The AST needs to be able to represent all of the possible complexity that can appear in C++ code – variadic templates, lambdas, operator overloading, declarations of various kinds etc. If we can use the AST representation of the code in our tooling, we won’t be discarding any of the meaning of the code in the process, as we would if we limit ourselves to processing only text.

Our goal is to harness the complexity of the AST so that we can describe patterns in it, and then replace those patterns with new text. The Clang AST Matcher API and FixIt API satisfy those requirements respectively.

The level of complexity in the AST means that detailed knowledge is required in order to comprehend it. Even for an experienced C++ developer, the number of classes and how they relate to each other can be daunting. Luckily, there is a rhythm to it all. We can identify patterns, use tools to discover what makes up the Clang model of the C++ code, and get to the point of having an instinct about how to create a clang-tidy check quickly.

Exploring a Clang AST

Let’s dive in and create a simple piece of test code so we can examine the Clang AST for it:

 
int addTwo(int num) 
{ 
    return num + 2; 
} 

int main(int, char**) 
{ 
    return addTwo(3); 
} 

There are multiple ways to examine the Clang AST, but the most useful when creating AST Matcher based refactoring tools is clang-query. We need to build up our knowledge of AST matchers and the AST itself at the same time via clang-query.

So, let’s return to MyFirstCheck.cpp which we created in the last post. The MyFirstCheckCheck::registerMatchers method contains the following line:

Finder->addMatcher(functionDecl().bind("x"), this); 

The first argument to addMatcher is an AST matcher, an Embedded Domain Specific Language of sorts. This is a predicate language which clang-tidy uses to traverse the AST and create a set of resulting ‘bound nodes’. In the above case, a bound node with the name x is created for each function declaration in the AST. clang-tidy later calls MyFirstCheckCheck::check for each set of bound nodes in the result.

Let’s start clang-query, passing our test file as a parameter followed by two dashes. As with the use of clang-tidy in Part 1, this allows us to specify compile options and avoids warnings about a missing compilation database.

This command drops us into an interactive interpreter which we can use to query the AST:

$ clang-query.exe testfile.cpp -- 

clang-query>

Type help for a full set of commands available in the interpreter. The first command we can examine is match, which we can abbreviate to m. Let’s paste in the matcher from MyFirstCheck.cpp:

clang-query> match functionDecl().bind("x") 

Match #1: 
 
testfile.cpp:1:1: note: "root" binds here 
int addTwo(int num) 
^~~~~~~~~~~~~~~~~~~ 
testfile.cpp:1:1: note: "x" binds here 
int addTwo(int num) 
^~~~~~~~~~~~~~~~~~~ 
 
Match #2: 
 
testfile.cpp:6:1: note: "root" binds here 
int main(int, char**) 
^~~~~~~~~~~~~~~~~~~~~ 
testfile.cpp:6:1: note: "x" binds here 
int main(int, char**) 
^~~~~~~~~~~~~~~~~~~~~ 
2 matches. 

clang-query automatically creates a binding for the root element in a matcher. This gets noisy when trying to match something specific, so it makes sense to turn that off if defining custom binding names:

clang-query> set bind-root false 
clang-query> m functionDecl().bind("x") 

Match #1: 

testfile.cpp:1:1: note: "x" binds here 
int addTwo(int num) 
^~~~~~~~~~~~~~~~~~~ 

Match #2: 

testfile.cpp:6:1: note: "x" binds here 
int main(int, char**) 
^~~~~~~~~~~~~~~~~~~~~ 
2 matches. 

So, we can see that for each function declaration that appeared in the translation unit, we get a resulting match. clang-tidy will later use these matches one at a time in the check method in MyFirstCheck.cpp to complete the refactoring.

Use quit to exit the clang-query interpreter. The interpreter must be restarted each time C++ code is changed in order for the new content to be matched.

Nesting matchers

The AST Matchers form a ‘predicate language’ where each matcher in the vocabulary is itself a predicate, and those predicates can be nested. The matchers fit into three broad categories as documented in the AST Matchers Reference.

functionDecl() is an AST Matcher which is invoked for each function declaration in the source code. In normal source code, there will be hundreds or thousands of results coming from external headers for such a simple matcher.

Let’s match only functions with a particular name:

clang-query> m functionDecl(hasName("addTwo")) 

Match #1: 

testfile.cpp:1:1: note: "root" binds here 
int addTwo(int num) 
^~~~~~~~~~~~~~~~~~~ 
1 match. 

This matcher will only trigger on function declarations which have the name “addTwo”. The middle column of the documentation indicates the name of each matcher, and the first column indicates the kind of matcher it can be nested inside. The hasName matcher is not documented as usable with Matcher<FunctionDecl>, but instead with Matcher<NamedDecl>.

Here, a developer without prior experience with the Clang AST needs to learn that the FunctionDecl AST class inherits from the NamedDecl AST class (as well as DeclaratorDecl, ValueDecl and Decl). Matchers documented as usable with each of those classes can also work with a functionDecl() matcher. That familiarity with the inheritance structure of Clang AST classes is essential to proficiency with AST Matchers. The names of classes in the Clang AST correspond to “node matcher” names by making the first letter lower-case. In the case of class names with an abbreviation prefix CXX such as CXXMemberCallExpr, the entire prefix is lowercased to produce the matcher name cxxMemberCallExpr.

So, instead of matching function declarations, we can match on all named declarations in our source code. Ignoring some noise in the output, we get results for each function declaration and each parameter variable declaration:

clang-query> m namedDecl() 
... 
Match #8: 

testfile.cpp:1:1: note: "root" binds here 
int addTwo(int num) 
^~~~~~~~~~~~~~~~~~~ 

Match #9: 

testfile.cpp:1:12: note: "root" binds here 
int addTwo(int num) 
           ^~~~~~~ 

Match #10: 

testfile.cpp:6:1: note: "root" binds here 
int main(int, char**) 
^~~~~~~~~~~~~~~~~~~~~ 

Match #11: 

testfile.cpp:6:10: note: "root" binds here 
int main(int, char**) 
         ^~~ 

Match #12: 

testfile.cpp:6:15: note: "root" binds here 
int main(int, char**) 
              ^~~~~~

Parameter declarations are in the match results because they are represented by the ParmVarDecl class, which also inherits NamedDecl. We can match only parameter variable declarations by using the corresponding AST node matcher:

clang-query> m parmVarDecl() 

Match #1: 

testfile.cpp:1:12: note: "root" binds here 
int addTwo(int num) 
           ^~~~~~~ 

Match #2: 

testfile.cpp:6:10: note: "root" binds here 
int main(int, char**) 
         ^~~ 

Match #3: 

testfile.cpp:6:15: note: "root" binds here 
int main(int, char**) 
              ^~~~~~

clang-query has a code-completion feature, triggered by pressing TAB, which shows the matchers which can be used in any particular context. However, this feature is not enabled on Windows.

Discovery Through Clang AST Dumps

clang-query gets most useful as a discovery tool when exploring deeper into the AST and dumping intermediate nodes.

Let’s query our testfile.cpp again, this time with the output set to dump:

clang-query> set output dump 
clang-query> m functionDecl(hasName("addTwo")) 

Match #1: 

Binding for "root": 
FunctionDecl 0x17a193726b8 <testfile.cpp:1:1, line:4:1> line:1:5 used addTwo 'int (int)' 
|-ParmVarDecl 0x17a193725f0 <col:12, col:16> col:16 used num 'int' 
`-CompoundStmt 0x17a19372840 <line:2:1, line:4:1> 
  `-ReturnStmt 0x17a19372828 <line:3:5, col:18>
    `-BinaryOperator 0x17a19372800 <col:12, col:18> 'int' '+'
      |-ImplicitCastExpr 0x17a193727e8 <col:12> 'int' <LValueToRValue>
      | `-DeclRefExpr 0x17a19372798 <col:12> 'int' lvalue ParmVar 0x17a193725f0 'num' 'int'
      `-IntegerLiteral 0x17a193727c0 <col:18> 'int' 2

There is a lot here to take in, and a lot of noise which is not relevant to what we are interested in to make a matcher, such as pointer addresses, the word used appearing inexplicably and other content whose structure is not obvious. For the sake of brevity in this blog post, I will elide such content in further listings of AST content.

The reported match has a FunctionDecl at the top level of a tree. Below that, we can see the ParmVarDecl nodes which we matched previously, and other nodes such as ReturnStmt. Each of these corresponds to a class name in the Clang AST, so it is useful to look them up to see what they inherit and know which matchers are relevant to their use.

The AST also contains source location and source range information, the latter denoted by angle brackets. While this detailed output is useful for exploring the AST, it is not as useful for exploring the source code. Diagnostic mode can be re-entered with set output diag for source code exploration. Unfortunately, both outputs (dump and diag) cannot currently be enabled at once, so it is necessary to switch between them.

Tree Traversal

We can traverse this tree using the has() matcher:

clang-query> m functionDecl(has(compoundStmt(has(returnStmt(has(callExpr())))))) 

Match #1: 

Binding for "root": 
FunctionDecl <testfile.cpp:6:1, line:9:1> line:6:5 main 'int (int, char **)' 
|-ParmVarDecl <col:10> col:13 'int' 
|-ParmVarDecl <col:15, col:20> col:21 'char **' 
`-CompoundStmt <line:7:1, line:9:1> 
  `-ReturnStmt <line:8:5, col:20> 
    `-CallExpr <col:12, col:20> 'int'
      |-ImplicitCastExpr <col:12> 'int (*)(int)'
      | `-DeclRefExpr <col:12> 'int (int)' 'addTwo'
      `-IntegerLiteral <col:19> 'int' 3

With some distracting content removed, we can see that the AST dump contains some source ranges and source locations. The ranges are denoted by angle brackets, which have a beginning and possibly an end position. To avoid repeating the filename and the keywords line and col, only differences from the previously printed source location are printed. For example, <testfile.cpp:6:1, line:9:1> describes a span from line 6 column 1 to line 9 column 1 in testfile.cpp. The range <col:15, col:20> describes the span from column 15 to column 20 on line 6 (from a few lines above) in testfile.cpp, as that is the last filename and line printed.

Because each of the nested predicates match, the top-level functionDecl() matches and we get a binding for the result. We can additionally use a nested bind() call to add nodes to the result set:

clang-query> m functionDecl(has(compoundStmt(has(returnStmt(has(callExpr().bind("functionCall"))))))) 

Match #1: 

Binding for "functionCall": 
CallExpr <testfile.cpp:8:12, col:20> 'int' 
|-ImplicitCastExpr <col:12> 'int (*)(int)'
| `-DeclRefExpr <col:12> 'int (int)' 'addTwo'
`-IntegerLiteral <col:19> 'int' 3 

Binding for "root": 
FunctionDecl <testfile.cpp:6:1, line:9:1> line:6:5 main 'int (int, char **)' 
|-ParmVarDecl <col:10> col:13 'int' 
|-ParmVarDecl <col:15, col:20> col:21 'char **' 
`-CompoundStmt <line:7:1, line:9:1> 
  `-ReturnStmt <line:8:5, col:20> 
    `-CallExpr <col:12, col:20> 'int'
      |-ImplicitCastExpr <col:12> 'int (*)(int)'
      | `-DeclRefExpr <col:12> 'int (int)' 'addTwo'
      `-IntegerLiteral <col:19> 'int' 3

The hasDescendant() matcher can be used to match the same node as above in this case:

clang-query> m functionDecl(hasDescendant(callExpr().bind("functionCall")))

Note that over-use of the has() and hasDescendant() matchers – and their complements hasParent() and hasAncestor() – is usually an anti-pattern and can lead to unintended results, particularly while matching nested Expr subclasses in source code. Usually, higher-level matchers should be used instead. For example, while has() may be used to match a desired IntegerLiteral argument in the case above, it would not be possible to specify which argument we wish to match in a function which has multiple arguments. The hasArgument() matcher should be used in the case of callExpr() to resolve this issue, as it can specify which argument should be matched if there are multiple:

clang-query> m callExpr(hasArgument(0, integerLiteral()))

The above matcher will match on every function call whose zeroth argument is an integer literal.

Usually we want to use more narrowing criteria to only match on a particular category of matches. Most matchers accept multiple arguments and behave as though they have an implicit allOf() within them. So, we can write:

clang-query> m callExpr(hasArgument(0, integerLiteral()), callee(functionDecl(hasName("addTwo"))))

to match calls whose zeroth argument is an integer literal only if the function being called has the name “addTwo“.

A matcher expression can sometimes be obvious to read and understand, but harder to write or discover. The particular node types which may be matched can be discovered by examining the output of clang-query. However, the callee() matcher here may be difficult to discover independently, because it does not appear in the AST dumps from clang-query and it is only one matcher in the long list in the reference documentation. The code of the existing clang-tidy checks is educational, both for discovering matchers which are commonly used together and for finding the contexts where particular matchers should be used.

A nested matcher creating a binding in clang-query is another important discovery technique. If we have source code such as:

int add(int num1, int num2) 
{
  return num1 + num2; 
} 

int add(int num1, int num2, int num3) 
{
  return num1 + num2 + num3; 
} 

int main(int argc, char**) 
{ 
  int i = 42; 

  return add(argc, add(42, i), 4 * 7); 
}

and we intend to introduce a safe_int type to use instead of int in the signature of add. All existing uses of add must be ported to some new pattern of code.

The basic workflow with clang-query is that we must first identify source code which is exemplary of what we want to port and then determine how it is represented in the Clang AST. We will need to identify the locations of arguments to the add function and their AST types as a first step.

Let’s start with callExpr() again:

clang-query> m callExpr() 

Match #1: 

testfile.cpp:15:10: note: "root" binds here 
    return add(argc, add(42, i), 4 * 7); 
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ 

Match #2: 

testfile.cpp:15:20: note: "root" binds here 
    return add(argc, add(42, i), 4 * 7); 
                     ^~~~~~~~~~ 

This example uses various different arguments to the add function: the first argument is a parameter of the enclosing function, the second is the return value of another call, and the third is an inline multiplication. clang-query can help us discover how to match these constructs. Using the hasArgument() matcher, we can bind to each of the three arguments (setting bind-root false for brevity):

clang-query> set bind-root false 
clang-query> m callExpr(hasArgument(0, expr().bind("a1")), hasArgument(1, expr().bind("a2")), hasArgument(2, expr().bind("a3"))) 

Match #1: 

testfile.cpp:15:14: note: "a1" binds here 
return add(argc, add(42, i), 4 * 7); 
           ^~~~ 

testfile.cpp:15:20: note: "a2" binds here 
return add(argc, add(42, i), 4 * 7); 
                 ^~~~~~~~~~ 

testfile.cpp:15:32: note: "a3" binds here 
return add(argc, add(42, i), 4 * 7); 
                             ^~~~~

Changing the output to dump and re-running the same matcher:

clang-query> set output dump 
clang-query> m callExpr(hasArgument(0, expr().bind("a1")), hasArgument(1, expr().bind("a2")), hasArgument(2, expr().bind("a3"))) 

Match #1: 

Binding for "a1": 
DeclRefExpr <testfile.cpp:15:14> 'int' 'argc'

Binding for "a2": 
CallExpr <testfile.cpp:15:20, col:29> 'int' 
|-ImplicitCastExpr <col:20> 'int (*)(int, int)' 
| `-DeclRefExpr <col:20> 'int (int, int)' 'add' 
|-IntegerLiteral <col:24> 'int' 42 
`-ImplicitCastExpr <col:28> 'int' 
  `-DeclRefExpr <col:28> 'int' 'i' 

Binding for "a3": 
BinaryOperator <testfile.cpp:15:32, col:36> 'int' '*' 
|-IntegerLiteral <col:32> 'int' 4 
`-IntegerLiteral <col:36> 'int' 7 

We can see that the top-level AST nodes of the arguments are DeclRefExpr, CallExpr and BinaryOperator respectively. When implementing our refactoring tool, we might want to wrap the argc as safe_int(argc), ignore the nested add() call, as its return type will be changed to safe_int, and change the BinaryOperator to some safe operation.

As we learn about the AST we are examining, we can also replace the expr() with something more specific to explore further. Because we now know the second argument is a CallExpr, we can use a callExpr() matcher to check the callee. The callee() matcher only works if we specify callExpr() instead of expr():

clang-query> m callExpr(hasArgument(1, callExpr(callee(functionDecl().bind("func"))).bind("a2"))) 

Match #1: 

Binding for "a2": 
CallExpr <testfile.cpp:15:20, col:29> 'int' 
|-ImplicitCastExpr <col:20> 'int (*)(int, int)'
| `-DeclRefExpr <col:20> 'int (int, int)' 'add'
|-IntegerLiteral <col:24> 'int' 42 
`-ImplicitCastExpr <col:28> 'int' 
  `-DeclRefExpr <col:28> 'int' 'i' 

Binding for "func": 
FunctionDecl <testfile.cpp:1:1, line:4:1> line:1:5 add 'int (int, int)' 
... etc 

1 match. 
clang-query> set output diag 
clang-query> m callExpr(hasArgument(1, callExpr(callee(functionDecl().bind("func"))).bind("a2"))) 

Match #1: 

testfile.cpp:15:20: note: "a2" binds here 
return add(argc, add(42, i), 4 * 7); 
                 ^~~~~~~~~~ 

testfile.cpp:1:1: note: "func" binds here 
int add(int num1, int num2) 
^~~~~~~~~~~~~~~~~~~~~~~~~~~ 

Avoiding the Firehose

Usually when you need to examine the AST it will make sense to run clang-query on your real source code instead of a single-file demo. Starting off with a callExpr() matcher will result in a firehose problem – there will be tens of thousands of results and you will not be able to determine how to make your matcher more specific for the lines of source code you are interested in. Several tricks can come to your aid in this case.

First, you can use isExpansionInMainFile() to limit the matches to only the main file, excluding all results from headers. That matcher can be used with Exprs, Stmts and Decls, so it is useful for everything you might want to start matching.

Second, if you still get too many results from your matcher, the hasAncestor() matcher can be used to limit the results further.

Third, often particular names of variables can anchor your match to some particular piece of code of interest.

Exploring the AST of code such as

 
void myFuncName() 
{ 
  int i = someFunc() + Point(4, 5).translateX(9);   
} 

might start with a matcher which anchors to the name of the variable, the function it is in and the location in the main file:

varDecl(isExpansionInMainFile(), hasAncestor(functionDecl(hasName("myFuncName"))), hasName("i"))

This starting point will make it possible to explore how the rest of the line is represented in the AST without being drowned in noise.

Conclusion

clang-query is an essential asset while developing a refactoring tool with AST Matchers. It is a prototyping and discovery tool, whose input can be pasted into the implementation of a new clang-tidy check.

In this blog post, we explored the basic use of the clang-query tool – nesting matchers and binding their results – and how the output corresponds to the AST Matcher Reference. We also saw how to limit the scope of matches to enable easy creation of matchers in real code.

In the next blog post, we will explore the corresponding consumer of AST matcher results. This will be the actual re-writing of the source code corresponding to the patterns we have identified as refactoring targets.

Which AST Matchers do you think will be most useful in your code? Let us know in the comments below or contact the author directly via e-mail at stkelly@microsoft.com, or on Twitter @steveire.

I will be showing even more new and future developments in clang-query at code::dive in November. Make sure to put it in your calendar if you are attending!

Exploring Clang Tooling Part 3: Rewriting Code with clang-tidy


In the previous post in this series, we used clang-query to examine the Abstract Syntax Tree of a simple source code file. Using clang-query, we can prototype an AST Matcher which we can use in a clang-tidy check to refactor code in bulk.

This time, we will complete the rewriting of the source code.

Let’s return to the MyFirstCheck.cpp we generated earlier and update the registerMatchers method. First, we can refactor it to port both function declarations and function calls, using the callExpr() and callee() matchers we used in the previous post:

void MyFirstCheckCheck::registerMatchers(MatchFinder *Finder) {
    
  auto nonAwesomeFunction = functionDecl(
    unless(matchesName("^::awesome_"))
    );

  Finder->addMatcher(
    nonAwesomeFunction.bind("addAwesomePrefix")
    , this);

  Finder->addMatcher(
    callExpr(callee(nonAwesomeFunction)).bind("addAwesomePrefix")
    , this);
}

Because Matchers are really C++ code, we can extract them into variables and compose them into multiple other Matchers, as done here with nonAwesomeFunction.

In this case, I have narrowed the declaration matcher to match only on function declarations which do not start with awesome_. That matcher is then used once with a binder addAwesomePrefix, then again to specify the callee() of a callExpr(), again binding the relevant expression to the name addAwesomePrefix.

Because large scale refactoring often involves primarily changing particular expressions, it generally makes sense to separately define the matchers for the declaration to match and the expressions referencing those declarations. In my experience, the matchers for declarations can get complicated for example with exclusions due to limitations of a reflection system, or with more specifics about functions with particular return types or argument types. Centralizing those cases helps keep your refactoring code maintainable.

Another change I have made is that I renamed the binding from x to addAwesomePrefix. This is notable because it uses verbs to describe what should be done with the matches. It should be clear from reading matcher bindings what the result of invoking the fix is to be. Binding names can then be seen as a weakly-typed string-based language interface between the matcher and the replacement code.

We can then implement MyFirstCheckCheck::check to consume the bindings. A first approximation might look like:

void MyFirstCheckCheck::check(const MatchFinder::MatchResult &Result) {
  if (const auto MatchedDecl = Result.Nodes.getNodeAs<FunctionDecl>("addAwesomePrefix"))
  {
    diag(MatchedDecl->getLocation(), "function is insufficiently awesome")
      << FixItHint::CreateInsertion(MatchedDecl->getLocation(), "awesome_");
  }

  if (const auto MatchedExpr = Result.Nodes.getNodeAs<CallExpr>("addAwesomePrefix"))
  {
    diag(MatchedExpr->getExprLoc(), "code is insufficiently awesome")
      << FixItHint::CreateInsertion(MatchedExpr->getExprLoc(), "awesome_");
  }
} 

Perhaps a better implementation would reduce the duplication of the diagnostic code:

void MyFirstCheckCheck::check(const MatchFinder::MatchResult &Result) {
  SourceLocation insertionLocation;
  if (const auto MatchedDecl = Result.Nodes.getNodeAs<FunctionDecl>("addAwesomePrefix"))
  {
    insertionLocation = MatchedDecl->getLocation();
  } else if (const auto MatchedExpr = Result.Nodes.getNodeAs<CallExpr>("addAwesomePrefix"))
  {
    insertionLocation = MatchedExpr->getExprLoc();
  }
  diag(insertionLocation, "code is insufficiently awesome")
      << FixItHint::CreateInsertion(insertionLocation, "awesome_");
}

Because the FunctionDecl and the CallExpr do not share an inheritance hierarchy, we need separate casting conditions for each. Even if they did share an inheritance hierarchy, we need to call getLocation in one case, and getExprLoc in another. The reason for that is that Clang records many relevant locations for each AST node. The developer of the clang-tidy check needs to know which location accessor method is appropriate or required for each situation.
A further improvement is to change the casts from the derived types FunctionDecl and CallExpr to their base types, NamedDecl and Expr respectively.

if (const auto MatchedDecl = Result.Nodes.getNodeAs<NamedDecl>("addAwesomePrefix"))
{
  insertionLocation = MatchedDecl->getLocation();
} else if (const auto MatchedExpr = Result.Nodes.getNodeAs<Expr>("addAwesomePrefix"))
{
  insertionLocation = MatchedExpr->getExprLoc();
}

This change enforces the idea that the names of bound nodes form a weakly-typed interface between the Matcher code and the Rewriter code. Because the Rewriter code now expects addAwesomePrefix to be used with the base types NamedDecl and Expr, other Matcher code can take advantage of that. We can now re-use the addAwesomePrefix binding name to add a prefix to field declarations or member expressions, for example, because their corresponding Clang AST classes also inherit NamedDecl:

auto nonAwesomeField = fieldDecl(unless(matchesName("^::awesome_")));
Finder->addMatcher(
  nonAwesomeField.bind("addAwesomePrefix")
  , this);

Finder->addMatcher(
  memberExpr(member(nonAwesomeField)).bind("addAwesomePrefix")
  , this);

Notice that this code is comparable to the matchers we wrote for the functionDecl/callExpr pairing. Taking advantage of the binding name interface, we can continue extending our matcher code to port variable declarations without changing the rewriter side of that interface:

void MyFirstCheckCheck::registerMatchers(MatchFinder *Finder) {
  
  auto nonAwesome = namedDecl(
    unless(matchesName("::awesome_.*"))
    );

  auto nonAwesomeFunction = functionDecl(nonAwesome);
  // void foo(); 
  Finder->addMatcher(
    nonAwesomeFunction.bind("addAwesomePrefix")
    , this);

  // foo();
  Finder->addMatcher(
    callExpr(callee(nonAwesomeFunction)).bind("addAwesomePrefix")
    , this);

  auto nonAwesomeVar = varDecl(nonAwesome);
  // int foo;
  Finder->addMatcher(
    nonAwesomeVar.bind("addAwesomePrefix")
    , this);

  // foo = 7;
  Finder->addMatcher(
    declRefExpr(to(nonAwesomeVar)).bind("addAwesomePrefix")
    , this);

  auto nonAwesomeField = fieldDecl(nonAwesome);
  // int m_foo;
  Finder->addMatcher(
    nonAwesomeField.bind("addAwesomePrefix")
    , this);

  // m_foo = 42;
  Finder->addMatcher(
    memberExpr(member(nonAwesomeField)).bind("addAwesomePrefix")
    , this);
}

Location, Location, Location

Let’s return to the check implementation and examine it. This method is responsible for implementing the rewriting of the source code as described by the matchers and their bound nodes.
In this case, we have inserted code at the SourceLocation returned by either getLocation() or getExprLoc() of NamedDecl or Expr respectively. Clang AST classes have many methods returning SourceLocation which refer to various places in the source code related to particular AST nodes.
For example, the CallExpr has SourceLocation accessors getBeginLoc, getEndLoc and getExprLoc. It is currently difficult to discover how a particular position in the source code relates to a particular SourceLocation accessor.

clang::VarDecl represents variable declarations in the Clang AST. clang::ParmVarDecl inherits clang::VarDecl and represents parameter declarations. Notice that in all cases, end locations indicate the beginning of the last token, not the end of it. Note also that in the second example below, the source locations of the call used to initialize the variable are not part of the variable. It is necessary to traverse to the initialization expression to access those.

clang::FunctionDecl represents function declarations in the Clang AST. clang::CXXMethodDecl inherits clang::FunctionDecl and represents method declarations. Note that the location of the return type is not always given by getBeginLoc in C++.

clang::CallExpr represents function calls in the Clang AST. clang::CXXMemberCallExpr inherits clang::CallExpr and represents method calls. Note that when calling free functions (represented by a clang::CallExpr), getExprLoc and getBeginLoc will be the same. Always choose the semantically correct location accessor, rather than a location which merely appears to indicate the correct position.

It is important to know that locations on AST classes point to the start of tokens in all cases. This can be initially confusing when examining end locations. Sometimes to get to a desired location, it is necessary to use getLocWithOffset() to advance or retreat a SourceLocation. Advancing to the end of a token can be achieved with Lexer::getLocForEndOfToken.

The source code locations of arguments to the function call are not accessible from the CallExpr, but must be accessed via AST nodes for the arguments themselves.

// Get the zeroth argument:
const Expr *arg0 = someCallExpr->getArg(0);
SourceLocation arg0Loc = arg0->getExprLoc();

Every AST node has accessors getBeginLoc and getEndLoc. Expression nodes additionally have a getExprLoc, and declaration nodes have an additional getLocation accessor. More-specific subclasses have more-specific accessors for locations relevant to the C++ construct they represent. Source code locations in Clang are comprehensive, but accessing them can get complex as requirements become more advanced. A future blog post may explore this topic in more detail if there is interest among the readership.

Once we have acquired the locations we are interested in, we need to insert, remove or replace source code fragments at those locations.

Let’s return to MyFirstCheck.cpp:

diag(insertionLocation, "code is insufficiently awesome")
    << FixItHint::CreateInsertion(insertionLocation, "awesome_");

diag is a method on the ClangTidyCheck base class. Its purpose is to issue diagnostics and messages to the user. It can be called with just a source location and a message, causing a diagnostic to be emitted at the specified location:

diag(insertionLocation, "code is insufficiently awesome");

Resulting in:

    testfile.cpp:19:5: warning: code is insufficiently awesome [misc-my-first-check]
    int addTwo(int num)
        ^

The diag method returns a DiagnosticsBuilder to which we can stream fix suggestions using FixItHint.

The CreateRemoval method creates a FixIt for removal of a range of source code. At its heart, a SourceRange is just a pair of SourceLocations. If we wanted to remove the awesome_ prefix from functions which have it, we might expect to write something like this:

void MyFirstCheckCheck::registerMatchers(MatchFinder *Finder) {
  
  Finder->addMatcher(
    functionDecl(
      matchesName("::awesome_.*")
      ).bind("removeAwesomePrefix")
    , this);
}

void MyFirstCheckCheck::check(const MatchFinder::MatchResult &Result) {

  if (const auto MatchedDecl = Result.Nodes.getNodeAs<NamedDecl>("removeAwesomePrefix"))
  {
      auto removalStartLocation = MatchedDecl->getLocation();
      auto removalEndLocation = removalStartLocation.getLocWithOffset(sizeof("awesome_") - 1);
      auto removalRange = SourceRange(removalStartLocation, removalEndLocation);

      diag(removalStartLocation, "code is too awesome")
          << FixItHint::CreateRemoval(removalRange);
  }
}

The matcher part of this code is fine, but when we run clang-tidy, we find that the removal is applied to the entire function name, not only the awesome_ prefix. The problem is that Clang extends the end of the removal range to the end of the token that the end location points into. This is symmetric with the fact that AST nodes have getEndLoc() methods which point to the start of the last token. Usually, the intent is to remove or replace entire tokens.

To make a replacement or removal in source code which extends into the middle of a token, we need to indicate that we are replacing a range of characters instead of a range of tokens, using CharSourceRange::getCharRange:

auto removalRange = CharSourceRange::getCharRange(removalStartLocation, removalEndLocation);

Conclusion

This concludes the mini-series about writing clang-tidy checks. The series has been an experiment to gauge interest, and there is a lot more content to cover in further posts if the readership is interested.

Further posts could cover topics that occur in the real world, such as:

  • Creation of compile databases
  • Creating a stand-alone buildsystem for clang-tidy checks
  • Understanding and exploring source locations
  • Completing more-complex tasks
  • Extending the matcher system with custom matchers
  • Testing refactorings
  • More tips and tricks from the trenches.

This would cover everything you need to know in order to quickly and effectively create and use custom refactoring tools on your codebase.

Do you want to see more? Let us know in the comments below or contact the author directly via e-mail at stkelly@microsoft.com, or on Twitter @steveire.

I will be showing even more new and future developments in clang-query and clang-tidy at code::dive tomorrow, including many of the items listed as future topics above. Make sure to schedule it in your calendar if you are attending code::dive!

Use the official range-v3 with MSVC 2017 version 15.9


We’re happy to announce that the ongoing conformance work in the MSVC compiler has reached a new milestone: support for Eric Niebler’s range-v3 library. It’s no longer necessary to use the range-v3-vs2015 fork that was introduced for MSVC 2015 Update 3 support; true upstream range-v3 is now usable directly with MSVC 2017.

The last push to achieve range-v3 support involved Microsoft-sponsored changes in both the MSVC compiler and range-v3.  The compiler changes involved fixing about 60 historically blocking bugs, of which 30+ were alias template bugs in /permissive- mode. Changes to range-v3 were to add support for building the test suite with MSVC, and some workarounds for roughly a dozen minor bugs that we will be working on fixing in future releases of MSVC.

How do I get range-v3 to try it out?

The range-v3 changes haven’t yet flowed into a release, so for now MSVC users should use the master branch. You can get range-v3 from its repository on GitHub, or via vcpkg.

Note that range-v3’s master branch is under active development, so it’s possible that the head of the branch may be unusable at some times. Releases after 0.4.0 will have MSVC support; until then the commit at 01ccd0e5 is known to be good. Users of vcpkg should have no issues: the range-v3 packager will ensure that vcpkg installs a known-good release.

What’s next

  • Continue fixing bugs that get reported from range-v3 usage and development.

In closing

This range-v3 announcement follows our previous announcement about supporting Boost.Hana in 15.8. The C++ team here at Microsoft is strongly motivated to continue improving our support for open source libraries.

We’d love for you to download Visual Studio 2017 version 15.9 and try out all the new C++ features and improvements. As always, we welcome your feedback. We can be reached via the comments below or via email (visualcpp@microsoft.com). If you encounter other problems with MSVC or have a suggestion for Visual Studio 2017, please let us know through Help > Send Feedback > Report A Problem / Provide a Suggestion in the product, or via Developer Community. You can also find us on Twitter (@VisualC) and Facebook (msftvisualcpp).

If you have any questions, please feel free to post in the comments below. You can also send any comments and suggestions directly to cacarter@microsoft.com, or @CoderCasey.

 

Q&A: Fine-grained friendship


This post is part of a regular series of posts where the C++ product team here at Microsoft answers questions we have received from customers. The questions can be about anything C++ related: Visual C++, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc. Today’s Q&A is by Herb Sutter.

Question 

Reader @thesamhughescom recently asked: 

Has there ever been a consideration for allowing individual private functions to whitelist other classes or functions to call them? Similar to the per class friend method, I was thinking you could annotate a function with [[friend void foo(int)]] or [[friend class baz]] just an idea, I wouldn’t know where to get started on writing my own proposal, thanks Sam 

Answer 

There are occasionally proposals, but friend itself should be used very rarely so there’s not a lot of motivation to encourage it. 

One technique you can use today is to provide a set of friend helper types, each of which provides specific access to a given subset of things and can name its own friends, be handed out to callers, etc. That’s a flexible form of access control that works for granting different degrees of friendship statically to a set of types, granting access to some parts of the class’s interface dynamically to a set of callers, and so on. 

But the last part of your question is the easiest, and perhaps the most important: 

I wouldn’t know where to get started on writing my own proposal

Always (always, always, …) start with use cases. Write two or three concrete examples of the kind of code you want to write (initially pseudocode), explain clearly why you want to express it that way, and show the workarounds if any that are available for approximating it today. Compelling examples serve two major purposes: they motivate the feature (answer “why” questions like why have the features), and they also provide reference points to guide the feature’s design (answer “what” and “how” questions like what things the feature must be able to express and how it should be able to be used). 

 Your questions? 

If you have any question about C++ in general, please comment about it below. Someone in the community may answer it, or someone on our team may consider it for a future blog post. If instead your question is about support for a Microsoft product, you can provide feedback via Help > Report A Problem in the product, or via Developer Community. 

Using Visual Studio for Cross Platform C++ Development Targeting Windows and Linux


A great strength of C++ is the ability to target multiple platforms without sacrificing performance. If you are using the same codebase for multiple targets, then CMake is the most common solution for building your software. You can use Visual Studio for your C++ cross platform development when using CMake without needing to create or generate Visual Studio projects. Just open the folder with your sources in Visual Studio (File > Open Folder). Visual Studio will recognize CMake is being used, then use metadata CMake produces to configure IntelliSense and builds automatically. You can quickly be editing, building and debugging your code locally on Windows, and then switching your configuration to do the same on Linux all from within Visual Studio.

Teams working on these types of code bases may have developers who have different primary operating systems; e.g. some people are on Linux (and may be using the Visual Studio Code editor) and some are on Windows (probably using the Visual Studio IDE). In an environment like this, the choice of tools may be up to the developers themselves. You can use Visual Studio in such an environment without disturbing your other team members or making any changes to your source. If or when additional configuration is needed, it is saved in flat JSON files that can be kept locally, or shared in source control with other developers using Visual Studio, without impacting developers who are not using it.

Visual Studio isn’t just for Windows C and C++ development anymore. If you follow the tutorial below on your own machine, you will clone an open source project from GitHub, open it in Visual Studio, and edit, build and debug it on Windows with no changes to the project. Then you will add a connection to a Linux machine in Visual Studio and edit, build and debug the same project on that remote machine.

The next section shows you how to setup Visual Studio, followed by a section on how to configure your Linux target, and last the tutorial itself – have fun!

Setting up Visual Studio for Cross Platform C++ Development

First you need to have Visual Studio installed. If you have it installed already, confirm that you have the Desktop development with C++ and Linux development with C++ workloads installed. If you don’t have Visual Studio installed, use this link to install it with the minimal set of components for this tutorial selected. This minimal install is only about 3 GB; depending on your download speed, installation should not take more than 10 minutes.

Once that is done you are ready to go on Windows.

Configuring your Linux machine for cross platform C++ development

Visual Studio does not have a requirement for a specific distribution of Linux; use any you would like to. That can be on a physical machine, in a VM, the cloud, or even running on Windows Subsystem for Linux. The tools Visual Studio requires to be present on the Linux machine are: C++ compilers, GDB, ssh, and zip. On Debian based systems you can install these dependencies as follows.

sudo apt install -y openssh-server build-essential gdb zip

Visual Studio also, of course, requires CMake. However, it needs a recent version of CMake that has server mode enabled (at least 3.8). Our team produces a universal build of CMake that you can install on any Linux distro. We recommend using this build over what may be in your package manager, as it is built from our fork of the CMake source. Using that fork ensures that you have the latest features in case they haven’t made it upstream yet. We document how to configure CMake here, and you can get the CMake binaries from here. Go to that page and download the version that matches your system architecture on your Linux machine, then mark it as an executable:

chmod +x cmake-3.11.18033000-MSVC_2-Linux-x86_64.sh

You can see the options for running it with --help. We recommend that you use the --prefix option to specify installing under the /usr/local path, as that is the default location where Visual Studio looks for CMake.

sudo ./cmake-3.11.18033000-MSVC_2-Linux-x86_64.sh --skip-license --prefix=/usr/local

Tutorial: Using the Bullet Physics SDK GitHub repo in Visual Studio

Now that you have Visual Studio and a Linux machine ready to go, let’s walk through getting a real open source C++ project working in Visual Studio, targeting Windows and Linux. For this tutorial we are going to use the Bullet Physics SDK on GitHub. This is a library that provides collision detection and physics simulations for a variety of different applications. It includes sample executable programs, so we have something to interact with without having to write additional code. You will not have to modify any of this source or the build scripts in the steps that follow.

Note that you can use any Linux distro for this tutorial; however, using Windows Subsystem for Linux for this one is not a good idea, since the executable we are going to run is graphical, which is not officially supported there.

Step 1 – Clone and open the bullet3 repo

To start, clone the bullet3 repository from GitHub on the machine where you have Visual Studio installed. If you have git installed on your command line it will be as simple as running git clone wherever you would like to keep this repository.

git clone https://github.com/bulletphysics/bullet3.git

Now open the root project folder, bullet3, that was created by cloning the repo in Visual Studio. Use the menu option File > Open > Folder which will detect and use the CMakeLists.txt file or you can use File > Open > CMake to select the desired CMakeLists.txt file directly.

You can also clone a git repo directly within Visual Studio which will automatically open the folder when you are done.

Visual Studio menu for File > Open > CMake

As soon as you open the folder your folder structure will be visible in the Solution Explorer.

Visual Studio Solution Explorer Folder View

This view shows you exactly what is on disk, not a logical or filtered view. By default, it does not show hidden files. To see them, select the show all files button in the Solution Explorer.

Visual Studio Solution Explorer Show All Files

Step 2 – Use targets view

When you open a folder that uses CMake, Visual Studio will automatically generate the CMake cache. This will take a few moments, or longer depending on the size of your project. The status of this process is placed in the output window. It is complete when you see the message “Target info extraction done”.

Visual Studio Output window showing output from CMake

After this completes, IntelliSense is configured, the project can build, and you can launch the application and debug it. Visual Studio also now understands the build targets that the CMake project produces. This enables an alternate CMake Targets View that provides a logical view of the solution. Use the Solutions and Folders button in the Solution Explorer to switch to this view.

Solutions and Folders button in the Solution Explorer to show CMake targets view

Here is what that view looks like for the Bullet SDK.

Solution Explorer CMake targets view

This gives us a more intuitive view of what is in this source base. You can see some targets are libraries and others are executables. You can expand these nodes and see the source that comprises them independent of how it is represented on disk.

Step 3 – Set breakpoint, build and run

For this tutorial, use an executable so you have something that can just run and get into the debugger. Select AppBasicExampleGui and expand it. Open the file BasicExample.cpp. This is an example program that demonstrates the Bullet Physics library by rendering a bunch of cubes, arranged as a single block, that fall and smash apart on hitting a surface. Next, set a breakpoint that will be triggered when you click in the running application. That click is handled in a method within a helper class used by this application. To quickly get there, select CommonRigidBodyBase, which the struct BasicExample is derived from, around line 30. Right click and choose Go to Definition. Now you are in the header CommonRigidBodyBase.h. In the browser view above your source you should see that you are in CommonRigidBodyBase. To the right you can select members within it to examine; drop that selection down and select mouseButtonCallback, which will take you to the definition of that function in the header.

Visual Studio member list toolbar

Place a breakpoint on the first line within this function. This will trigger when you click a mouse button within the window of the application when launched under the Visual Studio debugger.

To launch the application, select the launch dropdown with the play icon that says “Select Startup Item” in the toolbar.

Visual Studio toolbar launch drop down for Select Startup Item

In the dropdown select AppBasicExampleGui.exe. Now press the launch button. This will cause the project to build our application and necessary dependencies, then launch it with the Visual Studio debugger attached. It will take a few moments while this process starts, then the application will appear.

Visual Studio debugging a Windows application

Move your mouse into the application window and click a button, and the breakpoint will be triggered. This pauses execution of your program, brings Visual Studio back to the foreground, and leaves you at your breakpoint. You will be able to inspect the application variables, objects, threads, and memory, and step through your code interactively using Visual Studio. You can click continue to let the application resume and exit it normally, or cease execution within Visual Studio using the stop button.

What you have seen so far is that by simply cloning a C++ repo from GitHub, you can open the folder with Visual Studio and get an experience that provides IntelliSense, a file view, a logical view based on the build targets, source navigation, build, and debugging, with no special configuration or Visual Studio specific project files. If you were to make changes to the source, you would get a diff view against the upstream project, and could make commits and push them back without leaving Visual Studio. There’s more though. Let’s use this project with Linux.

Step 4 – Add a Linux configuration

So far, you have been using the default x64-Debug configuration for our CMake project. Configurations are how Visual Studio understands what platform target it is going to use for CMake. The default configuration is not represented on disk. When you explicitly add a configuration a file CMakeSettings.json is created that has parameters Visual Studio uses to control how CMake is run, as well as when it is run on a remote target like Linux. To add a new configuration, select the Configuration drop down in the toolbar and select “Manage Configurations…”

The Add Configuration to CMakeSettings dialog will appear.

Add Configuration to CMakeSettings dialog

Here you see Visual Studio has preconfigured options for many of the platforms Visual Studio can be configured to use with CMake. If you want to continue to use the default x64-Debug configuration, that should be the first one you add. You want that for this tutorial so you can switch back and forth between Windows and Linux configurations. Select x64-Debug and click Select. This creates the CMakeSettings.json file with a configuration for “x64-Debug” and switches Visual Studio to use that configuration instead of the default. This happens very quickly, as the provided settings are the same as the default. You will see the configuration drop down no longer says “(default)” as part of the name.

Launch drop down configured for X64-Debug

You can use whatever names you like for your configurations by changing the name parameter in the CMakeSettings.json.

Now that you have a configuration specified, the Manage Configurations option in the configuration dropdown opens the CMakeSettings.json file so you can adjust values there. To add a Linux configuration, right click the CMakeSettings.json file in the Solution Explorer view and select Add Configuration.

CMakeSettings.json context menu for Add Configuration

This provides the same Add Configuration to CMakeSettings dialog you saw before. This time select Linux-Debug, then save the CMakeSettings.json file. Now in the configuration drop down select Linux-Debug.

Launch configuration drop down with X64-Debug and Linux-Debug options

Since this is the first time you are connecting to a Linux system the Connect to Remote System dialog will appear.

Visual Studio Connect to Remote System dialog

 

Provide the connection information to your Linux machine and click Connect. This adds that machine as your default remote machine, which is what the CMakeSettings.json for Linux-Debug is configured to use. It also pulls down the headers from your remote machine so that you get IntelliSense specific to that machine when you use it. Then Visual Studio sends your files to the remote machine and generates the CMake cache there; when that is done, Visual Studio is configured for using the same source base with that remote Linux machine. These steps may take some time, depending on the speed of your network and the power of your remote machine. You will know this is complete when the message “Target info extraction done” appears in the CMake output window.

Step 5 – Set breakpoint, build and run on Linux

Since this is a desktop application you need to provide some additional configuration information to the debug configuration. In the CMake Targets view right click AppBasicExampleGui and choose Debug and Launch settings.

Debug and Launch Settings context menu

This will open a file launch.vs.json that is in the hidden .vs subfolder. This file is local to your development environment. You can move it into the root of your project if you wish to check it in and share it with your team. In this file a configuration has been added for AppBasicExampleGui. These default settings work in most cases, but as this is a desktop application, you need to provide some additional information to launch the program in a way you can see it on your Linux machine. You need to know the value of the environment variable DISPLAY on your Linux machine; run this command to get it.

echo $DISPLAY

In my case this was :1. In the configuration for AppBasicExampleGui there is a parameter array “pipeArgs”. Within it is the line “${debuggerCommand}”. This is the command that launches gdb on the remote machine. Visual Studio needs to export the display into this context before that command runs. Do so by modifying that line as follows, using the value of your display.

"export DISPLAY=:1;${debuggerCommand}",
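For context, a minimal launch.vs.json entry containing this change might look like the sketch below. Only the pipeArgs line comes from this tutorial; the other fields are illustrative placeholders for what Visual Studio generates, and the real generated file contains additional entries (such as the ssh plumbing in pipeArgs).

```json
{
  "version": "0.2.1",
  "configurations": [
    {
      "type": "default",
      "name": "AppBasicExampleGui",
      "project": "CMakeLists.txt",
      "pipeArgs": [
        "export DISPLAY=:1;${debuggerCommand}"
      ]
    }
  ]
}
```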

Now in order to launch and debug our application, choose the “Select Startup Item” dropdown in the toolbar and choose AppBasicExampleGui.

Select Startup Item drop down options

Now press that button or hit F5. This will build the application and its dependencies on the remote Linux machine then launch it with the Visual Studio debugger attached. On your remote Linux machine, you should see an application window appear with the same falling bunch of cubes arranged as a single block.

Linux application launched from Visual Studio

Move your mouse into the application window and click a button, and the breakpoint will be triggered. This pauses execution of your program, brings Visual Studio back to the foreground, and leaves you at your breakpoint. You should also see a Linux Console Window appear in Visual Studio. This window provides output from the remote Linux machine, and it can also accept input for stdin. It can of course be docked where you prefer to see it, and its position will be used again in future sessions.

Visual Studio Linux Console Window

You will be able to inspect the application variables, objects, threads, memory, and step through your code interactively using Visual Studio. This time on a remote Linux machine instead of your local Windows environment. You can click continue to let the application resume and exit it normally or cease execution within Visual Studio using the stop button. All the same things you’d expect if this were running locally.

Look at the Call Stack window and you will see that this time there are calls to x11OpenGLWindow, since Visual Studio has launched the application on Linux.

Call Stack window showing Linux call stack

What you learned and where to learn more

So now you have seen the same code base, cloned directly from GitHub, built, run, and debugged on Windows with no modifications. Then, with some minor configuration settings, built, run, and debugged on a remote Linux machine as well. If you are doing cross platform development, we hope you find a lot to love here. Visual Studio C and C++ development is not just for Windows anymore.

Further articles

Documentation links

This section will be updated in the future with links to new articles on Cross Platform Development with Visual Studio.

Give us feedback

Use this link to download Visual Studio 2017 with everything you need to try the steps in this tutorial, then try it with your projects.

Your feedback is very important to us. We look forward to hearing from you and seeing the things you make.

As always, we welcome your feedback. We can be reached via the comments below or via email (visualcpp@microsoft.com). If you encounter other problems with MSVC or have a suggestion for Visual Studio please let us know through Help > Send Feedback > Report A Problem / Provide a Suggestion in the product, or via Developer Community. You can also find us on Twitter (@VisualC).


Exploring Clang Tooling – Using Build Tools with clang-tidy


This post is part of a regular series of posts where the C++ product team and other guests answer questions we have received from customers. The questions can be about anything C++ related: MSVC toolset, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc.

Today’s post is by guest author Stephen Kelly, who is a developer at Havok, a contributor to Qt and CMake and a blogger. This post is part of a series where he is sharing his experience using Clang tooling in his current team.

The previous series about clang-tidy on this blog covered the basics of creating a clang-tidy extension and tooling to support that in the form of clang-query.

While the series focused on single-file examples for simplicity, developers progressing in this direction will need to run the tooling on all of the files in their project at once, or on all files which match a specific pattern.

Delayed refactoring

The first problem with processing multiple files is that we can no longer change files as we process them and discover locations to refactor. Tools like clang-tidy only work if the source code compiles, so a process which changed a header file while processing the first source file would cause the next source file to not be compilable.

To resolve this problem, clang-tidy has the ability to export refactoring changes to a .yaml file, instead of changing the files directly.

The clang-apply-replacements tool can then be run on a directory of .yaml files in order to apply the changes to all of the files at once.

The run-clang-tidy script in the clang repository helps with these tasks. It accepts a pattern of files and processes all matching files in parallel, making use of all available cores.

Build tools

Consider the similarity between using a compiler with a .cpp file to produce an object file and using clang-tidy to produce a .yaml file.

This similarity implies that we can use build tools with clang-tidy.

We can use any tool to generate a Ninja buildsystem, but generally they are not currently optimized for generating commands which invoke clang-tidy instead of a compiler. Although CMake has clang-tidy support, it doesn’t have direct support for delayed refactoring, so the CMake integration is currently more suitable to linting instead of refactoring tasks.

For now, using some tricks, we can use CMake to generate a ‘buildsystem’ from a compile_commands.json file which you have already generated. The generated ‘buildsystem’ simply invokes clang-tidy in place of the compiler, so that it outputs .yaml files instead of object files.

We can instruct CMake to generate a Ninja ‘buildsystem’ and run a ‘build’ in the normal way to invoke the refactor:

cmake .. -G Ninja -DCMAKE_CXX_COMPILER=<path_to_clang_tidy>
cmake --build .

Ninja processes the inputs in parallel, so this results in a collection of .yaml files in the fixes directory. We can use clang-apply-replacements to apply those fixes to the source code.

Using CMake and Ninja brings advantages that the run-clang-tidy script doesn’t provide. Because we are modelling mechanical refactoring as a build task, we can use other build tools which work with Ninja and CMake. To start, we can convert the log of Ninja performing the refactor to a trace which is compatible with the Chrome about:tracing tool. This gives output showing the length of time taken for each translation unit:

We can also take advantage of the fact that we are now using CMake to handle the refactoring. Using Visual Studio Code and the CMake Tools plugin, we can simply open the folder containing the CMakeLists.txt and trigger the refactoring task from there.

Add a custom kit to the CMake Tools for running clang-tidy:

{
  "name": "Clang tidy",
  "compilers": {
    "CXX": "C:/dev/prefix/bin/clang-tidy.exe"
  }
}

Now, when we invoke Build in Visual Studio Code, the refactoring is started. Diagnostics are also collected with easy navigation to the source code.

Because CMake can generate Visual Studio solutions, it is also possible to control the refactoring from within Visual Studio. As this requires creating a Toolset file to replace the compiler with clang-tidy, it is slightly out of scope of this post but it follows the same pattern to achieve the result.

Distributing the refactor

Consider how we distribute our build tasks on the network.

If we treat clang-tidy as a compiler, then we should be able to use a build-distribution tool to distribute our refactoring task on the network.

One such build distribution tool is Icecream, which is popular on Linux systems and available under the GPL. Icecream works by sending an archive of the build tools to client machines so that the actual compilation is run on the remote machine and the resulting object file is sent back to the client.

By packaging the clang-tidy executable, renamed to clang so that Icecream accepts it, we can refactor on remote machines and send resulting .obj files (named so that Icecream accepts them, but containing yaml content) to clients. The Icecream Monitor tool then shows the progress of the distributed task among the build nodes.

This work distribution brings a significant increase in speed to the refactoring task. Using this technique I have been able to make mechanical changes to the LLVM/Clang source (millions of lines of code) in minutes which would otherwise take hours if run only locally. Because there is no need to link libraries while refactoring, each refactor does not conflict with any other and the process can be embarrassingly parallel.

Conclusion

Mechanical refactoring with clang-tidy requires distribution over a network in order to complete in reasonable time on large codebases. What other build tools do you think would be adaptable for refactoring tasks? Let us know in the comments below or contact the author directly via e-mail at stkelly@microsoft.com, or on Twitter @steveire.

Better template support and error detection in C++ Modules with MSVC 2017 version 15.9


Overview

It has been a long time since we last talked about C++ Modules. We feel it is time to revisit what has been happening under the hood of MSVC for modules.

The Visual C++ Team has been dedicated to pushing conformance to the standard, with a focus on making the overall compiler implementation more robust and correct through the rejuvenation effort. This effort has given us the ability to substantially improve our modules implementation. Until now, most of this work has happened transparently under the hood. We are proud to say the work has reached a point where talking about it should give developers even more reasons to use C++ Modules with MSVC!

What is new?

  • Two-phase name lookup is now a requirement to use modules. We now store templates in a form that is far more structured than the previous token-stream model, preserving information such as the names bound at the point of declaration.
  • Better constexpr support allows users to write far more complex code in an exported module.
  • Improved diagnostics provide users with more safety and correctness when using modules with MSVC.

Two-Phase Name Lookup Support

This point is best illustrated through example. Let’s take the VS2017 15.7 compiler and build some modules. Given a module m.ixx:

#include <type_traits>
export module m;

export
template <typename T>
struct ptr_holder {
  static_assert(std::is_same_v<T, std::remove_pointer_t<T>>);
};

Now, let’s use the module in a very simple program to try and trigger the static_assert to fail, main.cpp:

import m;

int main() {
  ptr_holder<char*> p;
}

Using the command line, we can build this module and program like so:

cl /experimental:module /std:c++17 /c m.ixx
cl /experimental:module /std:c++17 /module:reference m.ifc main.cpp m.obj

However, you will quickly find this results in a failure:

m.ixx(7): error C2039: 'is_same_v': is not a member of 'std'
predefined C++ types (compiler internal)(238): note: see declaration of 'std'
main.cpp(4): note: see reference to class template instantiation 'ptr_holder' being compiled
m.ixx(7): error C2065: 'is_same_v': undeclared identifier
m.ixx(7): error C2275: 'T': illegal use of this type as an expression
m.ixx(7): error C2039: 'remove_pointer_t': is not a member of 'std'
predefined C++ types (compiler internal)(238): note: see declaration of 'std'
m.ixx(7): error C2061: syntax error: identifier 'remove_pointer_t'
m.ixx(7): error C2238: unexpected token(s) preceding ';'
main.cpp(35): fatal error C1903: unable to recover from previous error(s); stopping compilation
INTERNAL COMPILER ERROR in 'cl.exe'
    Please choose the Technical Support command on the Visual C++
    Help menu, or open the Technical Support help file for more information

It failed to compile, just not in the way we expected. It appears as though the compiler did not handle this scenario. The good news is that 15.9 is here and it is coming with some much-needed improvement! Let’s build this module and program with the 15.9 compiler:

main.cpp(4): error C2607: static assertion failed
main.cpp(4): note: see reference to class template instantiation 'ptr_holder' being compiled

This! This is what we are looking for! So what gives here? Why is 15.9 able to handle this scenario while the 15.7 compiler fails in the way that it does? It all comes down to how modules work with two-phase name lookup.

As mentioned in our two-phase name lookup blog post, templates were historically stored in the compiler as streams of tokens, which do not preserve information about which identifiers were seen during the parsing of the template declaration.

The 15.7 modules implementation did not have any awareness of two-phase name lookup, so template code compiled with it suffered from many of the same problems described in that post, compounded by the by-design lookup-hiding nature of non-exported module code (in our case, is_same_v was a non-exported declaration).

Now that MSVC supports two-phase name lookup, our modules implementation is able to handle much more complex and correct code!
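As a rough sketch of what two-phase lookup means for template code (the names here are invented for illustration, not taken from the post): non-dependent names must be visible when the template is defined, while dependent names are looked up again at the point of instantiation, including via argument-dependent lookup:

```cpp
#include <cassert>

int h() { return 1; }  // non-dependent name: must be visible when the template is defined

template <typename T>
int measure(T t) {
    // size(t) depends on T: it is looked up again in phase two, at the
    // point of instantiation, where argument-dependent lookup can find
    // geo::size declared below.
    return size(t) + h();
}

namespace geo {
    struct box { int w; };
    int size(box b) { return b.w; }
}
```

A token-stream model that defers all lookup to instantiation time would accept this code too, but it would also accept ill-formed templates that a two-phase implementation correctly rejects at definition time.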

Better Constexpr Support

Constexpr is very important to modern C++, and supporting it well in combination with new language features can dramatically affect the usability of those features. As such, we have made some significant improvements to the constexpr handling in our modules implementation. Once again, let us start with a concrete example. Given a module m.ixx:

export module m;

struct internal { int value = 42; };

export {

struct S {
  static constexpr internal value = { };
  union U {
    int a;
    double b;
    constexpr U(int a) : a{ a } { }
    constexpr U(double b) : b{ b } { }
  };
  U u = { 1. };
  U u2 = { 1 };
};
constexpr S s;
constexpr S a[2] = { {}, {.2, 2} };

}

Using the module in a program, main.cpp:

import m;

int main() {
  static_assert(S::value.value == 42);
  static_assert(s.u.b == 1. && s.u2.a == 1);
  static_assert(a[1].u.b == .2 && a[1].u2.a == 2);
  return s.u2.a + a[1].u2.a;
}

You will identify another problem:

main.cpp(5): error C3865: '__thiscall': can only be used on native member functions
main.cpp(5): error C2028: struct/union member must be inside a struct/union
main.cpp(5): fatal error C1903: unable to recover from previous error(s); stopping compilation
Internal Compiler Error in cl.exe.  You will be prompted to send an error report to Microsoft later.
INTERNAL COMPILER ERROR in 'cl.exe'
    Please choose the Technical Support command on the Visual C++
    Help menu, or open the Technical Support help file for more information

Well, those are some cryptic errors… You might discover that once you rewrite “constexpr S s;” as “constexpr S s = { };” the errors go away, only to hit a new runtime failure: the return value from main is 0 rather than the expected 3. In general, prior to 15.9, constexpr objects and arrays were a source of numerous bugs in modules. Failures such as the ones mentioned above are completely gone, due in part to our recent rework of the constexpr implementation.

Improved Diagnostics

The MSVC C++ Modules implementation is not just about exporting/importing code correctly, but also providing a safe and user-friendly experience around it.

One such feature is the ability for the compiler to diagnose if the module interface unit has been tampered with. Let us see a simple example:

C:\> cl /experimental:module /std:c++latest /c m.ixx
m.ixx
C:\> echo 1 >> "m.ifc"
C:\> cl /experimental:module /std:c++latest main.cpp
main.cpp
main.cpp(1): error C7536: ifc failed integrity checks. Expected SHA2: '66d5c8154df0c71d4cab7665bab4a125c7ce5cb9a401a4d8b461b706ddd771c6'

Here the compiler refuses to use an interface file which has failed a basic integrity check. This protects users from processing malicious or corrupted interface files with MSVC.

Another usability feature we have added is the ability to warn whenever the compiler flags used to build a module differ from the flags used when importing it. Omitting command-line switches on the import side can produce erroneous scenarios:

C:\> cl /experimental:module /std:c++17 /MDd /c m.ixx
m.ixx
C:\> cl /experimental:module /std:c++14 /MD main.cpp
main.cpp
main.cpp(1): warning C5050: Possible incompatible environment while importing module 'm': _DEBUG is defined in module command line and not in current command line
main.cpp(1): warning C5050: Possible incompatible environment while importing module 'm': mismatched C++ versions. Current "201402" module version "201703"

The compiler is telling us that the macro “_DEBUG” was defined when the module was built; that macro is implicitly defined when using the /MDd switch. The presence of this macro can affect how libraries like the STL behave; it can even affect their binary interface (ABI). Additionally, the standard C++ versions between these two components don’t agree, so a warning is produced to inform the user.

What now (call to action)?

Download Visual Studio 2017 Version 15.9 today and try out C++ Modules with your projects. Export your template metaprogramming libraries and truly hide your implementation details behind non-exported regions! No longer fear exporting constexpr objects from your interface units! Finally, enjoy using modules with better diagnostics support to create a much more user-friendly experience!

What’s next…

  • Modules standardization is in progress: Currently, MSVC supports all of the features of the current TS. As the C++ Modules TS evolves, we will continue to update our implementation and feature set to reflect the new proposal. Breaking changes will be documented as per usual via the release notes.
  • Throughput is improving: One consequence of using old infrastructure with new infrastructure is newer code often depends on older behavior. MSVC is no exception here. We are constantly updating the compiler to be faster and rely less on outdated routines and as this begins to happen our modules implementation will speedup as a nice side-effect. Stay tuned for a future blog on this.

As always, we welcome your feedback. Feel free to send any comments through e-mail at visualcpp@microsoft.com, through Twitter @visualc, or Facebook at Microsoft Visual Cpp.

If you encounter other problems with MSVC in VS 2017 please let us know via the Report a Problem option, either from the installer or the Visual Studio IDE itself. For suggestions, let us know through DevComm. Thank you!

Announcing Live Share for C++: Real-Time Sharing and Collaboration


C++ developers using Visual Studio 2019 16.0 Preview 1 or Visual Studio Code can now use Live Share. With Live Share you can share the full context of your code, enabling collaborative editing and debugging. 

Collaborative Editing:

 

Collaborative Debugging: 

 

In a Live Share session there is a host and one or more guests. The host provides each guest with everything it needs to be productive; the guest doesn’t need any of the source files locally. Furthermore, the guest doesn’t need the right compiler, external dependencies, or even the same installed components. The guest even gets IntelliSense from the host!

While in a C++ Live Share session, you can use: 

  • Member List
  • Parameter Help
  • Quick Info
  • Debugging/Breakpoints
  • Find All References
  • Go To Definition
  • Symbol Search (Ctrl+T)
  • Reference Highlighting
  • Diagnostics/Errors/Squiggles
  • Completion

How to Install Live Share 

Visual Studio 2019 16.0 Preview 1 includes Live Share by default as part of the “Desktop development with C++” workload. If you are using Visual Studio Code, you’ll need to download the Live Share Extension.

Using Live Share – Visual Studio

With Live Share installed, Visual Studio and Visual Studio Code can act as the host or the guest of the Live Share session. This gives your team a lot of flexibility. For example, you could have a Visual Studio 2019 host on Windows that is sharing with a Visual Studio Code guest on Linux.

To start a Live Share session in Visual Studio, click the Share button in the top right (or go to File > Start Collaboration Session). This generates a link that you can share with your collaborators.

To join a session in Visual Studio, simply go to File > Join Collaboration Session and enter a collaboration session invite link.  

To end a session, select “End Collaboration Session” from the “Sharing” dropdown:

Please visit the Live Share home page for more information, including videos. 

Using Live Share – Visual Studio Code

To start a Live Share session in Visual Studio Code, click the “Live Share” button in the status bar at the bottom of the window.

This will automatically copy the sharing link to your clipboard. The status bar will now display an icon next to your name to indicate you are sharing. You’ll also notice a new icon indicating how many participants have joined your session.

To join a session in Visual Studio Code, click your name in the status bar. A dropdown will appear, giving you the option to join a session. After clicking that option, paste a Live Share session invite link.

To end a session, click your name in the status bar. A dropdown will appear, giving you the option to stop the session. (You’ll also see the option to invite others if you need to recopy the session link.)

For more information, please refer to the Visual Studio Code Live Share documentation.

Known Issues

When sharing with a Visual Studio host to a Visual Studio guest, the Member List descriptions do not appear on the guest. This is a known issue and will be fixed in the next Preview.

Give Us Your Feedback 

Live Share is new for Visual Studio C++, so we are eager to hear your feedback as we continue to improve your experience. We can be reached via the comments below or via email at visualcpp@microsoft.com. If you encounter problems with Live Share for C++ or have a suggestion for Visual Studio please let us know through the Send Feedback button in the top right of Visual Studio, or via Developer Community. You can also find us on Twitter @VisualC.

AI-Assisted Code Completion Suggestions Come to C++ via IntelliCode


After reading and writing enough code, you begin to notice certain usage patterns. For example, if a stream is open, it will eventually be closed. More interestingly, if a string is used in the context of an if-statement, it will often be to check if the string is empty or if it has a certain size. You begin to identify and use these coding patterns over time, but what if Visual Studio already knew these common patterns and could suggest them to you as you code? That’s exactly what IntelliCode does. 

IntelliCode uses machine learning to train over thousands of real-world projects, including open-source projects on GitHub. As such, IntelliCode will be most helpful when you use common libraries such as the STL. Based on this training, IntelliCode saves you time by putting what you’re most likely to use at the top of your IntelliSense completion list. IntelliCode for C++ is now available as an extension for Visual Studio 2019.

As you use the IntelliCode extension, you will start to notice starred items at the top of your Member List – those are IntelliCode recommendations. For example, below we see “cend” being recommended based on the context of using “cbegin”. This is important since mixing “cbegin” with plain “end” is a compiler error with STL algorithms. 
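As a small illustration (a hypothetical helper, not taken from the extension itself), keeping the const-iterator pair consistent looks like this:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Count the even elements using a matching const-iterator pair. For a
// non-const vector, mixing v.cbegin() with v.end() would deduce two
// different iterator types for the algorithm and fail to compile.
int count_even(const std::vector<int>& v) {
    return static_cast<int>(
        std::count_if(v.cbegin(), v.cend(),
                      [](int x) { return x % 2 == 0; }));
}
```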

What’s Next 

In a future release we will give C++ developers the ability to let IntelliCode learn from your own code. We are also considering adding C++ IntelliCode support to Visual Studio Code. 

Give Us Your Feedback  

IntelliCode is new for C++ developers using Visual Studio, so we are eager to hear your feedback as we continue to improve your experience. We can be reached via the comments below or via email at visualcpp@microsoft.com. If you encounter problems with IntelliCode for C++ or have a suggestion for Visual Studio please let us know through the Send Feedback button in the top right corner of Visual Studio, or via Developer Community. You can also find us on Twitter @VisualC. 

 

Visual Studio Code C++ extension: October 2018 update and a simplified Insiders program


The October 2018 update of the Visual Studio Code C++ extension has recently shipped. It comes with a ton of bug fixes, improved Go to Definition support, integrated terminal support when debugging, and a simpler way to opt into our extension’s Insiders program. For a detailed list of this release’s improvements, check out the release notes.

Go to Definition improvements

Go to Definition now takes advantage of the full semantic information coming from the C++ IntelliSense engine. When C++ IntelliSense is enabled (the default, as long as the fallback to the Tag Parser does not kick in), you will see improved results, including correct overload resolution and more accurate navigation to the definition instead of a declaration.

VSCode editor with context menu, with Go to Definition menu item highlighted
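For example, with an invented overload set like the one below, semantic Go to Definition can navigate from each call site to the overload that actually participates in the call, rather than simply to the first declaration with that name:

```cpp
#include <cassert>
#include <string>

// Two overloads: Go to Definition on width(5) should land on the int
// overload, while width(std::string{"hello"}) should land on the
// std::string overload.
int width(int columns) { return columns; }
int width(const std::string& s) { return static_cast<int>(s.size()); }
```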

Integrated terminal support when debugging

You can now debug your C++ programs in the integrated terminal instead of an external console. To take advantage of this behavior when debugging, modify your launch.json to specify:

"externalConsole": false

This is currently supported for the cppdbg debugger type, only on Linux and Windows.

VSCode in debugging mode, with terminal visible

Insiders program

We would like to thank everyone who already tried our Insiders builds for the Visual Studio Code C++ extension. We want to make it as easy as possible to opt into this program, so we have significantly simplified the sign-up steps.

To opt-in, all you have to do is go to File > Preferences > Settings (Ctrl+,) and under Extensions > C/C++, change the “C_Cpp: Update Channel” to “Insiders”.

VSCode Settings editor, with C++ extension's Update Channel option selected

Tell us what you think

Download the C/C++ extension for Visual Studio Code today, give it a try and let us know what you think. If you run into any issues, or have any suggestions, please report them on our GitHub page. Please also take our quick survey to help us shape this extension to meet your needs. We can be reached via the comments below or via email (visualcpp@microsoft.com). You can also find us on Twitter (@VisualC).

 

Guaranteed Copy Elision Does Not Elide Copies


This post is also available on Simon Brand’s blog

C++17 merged in a paper called Guaranteed copy elision through simplified value categories. The changes mandate that no copies or moves take place in some situations where they were previously allowed, e.g.:

 
struct non_moveable { 
    non_moveable() = default; 
    non_moveable(non_moveable&&) = delete; 
}; 
non_moveable make() { return {}; } 
non_moveable x = make(); //compiles in C++17, error in C++11/14 

You can see this behavior in compiler versions Visual Studio 2017 15.6, Clang 4, GCC 7, and above.

Despite the name of the paper and what you might read on the Internet, the new rules do not guarantee copy elision. Instead, the new value category rules are defined such that no copy exists in the first place. Understanding this nuance gives a deeper understanding of the current C++ object model, so I will explain the pre-C++17 rules, what changes were made, and how they solve real-world problems.

Value Categories

To understand the before-and-after, we first need to understand what value categories are (I’ll explain copy elision in the next section). Continuing the theme of C++ misnomers, value categories are not categories of values; they are characteristics of expressions. Every expression in C++ has one of three value categories: lvalue, prvalue (pure rvalue), or xvalue (eXpiring value). There are then two super-categories, as shown in the diagram below.

diagram expression the taxonomy described above

For an explanation of what these are, we can look at the standard (C++17 [basic.lval]/1):

  1. A glvalue [(generalized lvalue)] is an expression whose evaluation determines the identity of an object, bit-field, or function.
  2. A prvalue is an expression whose evaluation initializes an object or a bit-field, or computes the value of an operand of an operator, as specified by the context in which it appears.
  3. An xvalue is a glvalue that denotes an object or bit-field whose resources can be reused (usually because it is near the end of its lifetime).
  4. An lvalue is a glvalue that is not an xvalue.
  5. An rvalue is a prvalue or an xvalue.

Some examples:

 
std::string s;  
s //lvalue: identity of an object  
s + " cake" //prvalue: could perform initialization/compute a value  

std::string f();  
std::string& g();  
std::string&& h();  

f() //prvalue: could perform initialization/compute a value  
g() //lvalue: identity of an object  
h() //xvalue: denotes an object whose resources can be reused  

struct foo {  
    std::string s;  
};  

foo{}.s //xvalue: denotes an object whose resources can be reused 

C++11

What are the properties of the expression std::string{"a pony"}?

It’s a prvalue. Its type is std::string. It has the value "a pony". It names a temporary.

That last one is the key point I want to talk about, and it’s the real difference between the C++11 rules and C++17. In C++11, std::string{"a pony"} does indeed name a temporary. From C++11 [class.temporary]/1:

Temporaries of class type are created in various contexts: binding a reference to a prvalue, returning a prvalue, a conversion that creates a prvalue, throwing an exception, entering a handler, and in some initializations. […]

Let’s look at how this interacts with this code:

 
struct copyable { 
    copyable() = default; 
    copyable(copyable const&) { /*...*/ } 
}; 
copyable make() { return {}; } 
copyable x = make(); 

make() results in a temporary. This temporary will be moved into x. Since copyable has no move constructor, this calls the copy constructor. However, this copy is unnecessary since the object constructed on the way out of make will never be used for anything else. The standard allows this copy to be elided by constructing the return value at the call-site rather than in make (C++11 [class.copy]/31). This is called copy elision.

The unfortunate part is this: even if all copies of the type are elided, the constructor still must exist.

This means that if we instead have:

 
struct non_moveable { 
    non_moveable() = default; 
    non_moveable(non_moveable&&) = delete; 
}; 
non_moveable make() { return {}; } 
auto x = make(); 

then we get a compiler error:

(7): error C2280: 'non_moveable::non_moveable(non_moveable &&)': attempting to reference a deleted function 
(3): note: see declaration of 'non_moveable::non_moveable' 
(3): note: 'non_moveable::non_moveable(non_moveable &&)': function was explicitly deleted 

Aside from returning non-moveable types by value, this presents other issues:

  1. Use of Almost Always Auto style is prevented for immobile types:
     
    auto x = non_moveable{}; //compiler error 
    
  2. The language makes no guarantees that the constructors won’t be called (in practice this isn’t too much of a worry, but guarantees are more convincing than optional optimizations).
  3. If we want to support some of these use-cases, we need to write copy/move constructors for types for which they don’t make sense (and have them do what? Throw? Abort? Cause a linker error?)
  4. You can’t pass non-moveable types to functions by value, even if you have a use-case that would benefit from it.

What’s the solution? Should the standard just say “oh, if you elide all copies, you don’t need those constructors”? Maybe, but then all this language about constructing temporaries is really a lie and building an intuition about the object model becomes even harder.

C++17

C++17 takes a different approach. Instead of guaranteeing that copies will be elided in these cases, it changes the rules such that the copies were never there in the first place. This is achieved through redefining when temporaries are created.

As noted in the value category descriptions earlier, prvalues exist for purposes of initialization. C++11 creates temporaries eagerly, eventually using them in an initialization and cleaning up copies after the fact. In C++17, the materialization of temporaries is deferred until the initialization is performed.

That’s a better name for this feature. Not guaranteed copy elision. Deferred temporary materialization.

Temporary materialization creates a temporary object from a prvalue, resulting in an xvalue. The most common places it occurs are when binding a reference to or performing member access on a prvalue. If a reference is bound to the prvalue, the materialized temporary’s lifetime is extended to that of the reference (this is unchanged from C++11, but worth repeating). If a prvalue initializes a class type of the same type as the prvalue, then the destination object is initialized directly; no temporary required.

Some examples:

 
struct foo { 
    int i; 
}; 
 
foo make(); 
auto const& a = make();  //temporary materialized and lifetime-extended 
auto&& b = make(); //ditto 
 
foo{}.i //temporary materialized 
 
auto c = make(); //no temporary materialized 

That covers the most important points of the new rules. Now on to why this is actually useful past terminology bikeshedding and trivia to impress your friends.

Who cares?

I said at the start that understanding the new rules would grant a deeper understanding of the C++17 object model. I’d like to expand on that a bit.

The key point is that in C++11, prvalues are not “pure” in a sense. That is, the expression std::string{"a pony"} names some temporary std::string object with the contents "a pony". It’s not the pure notion of the list of characters “a pony”. It’s not the Platonic ideal of “a pony”.

In C++17, however, std::string{"a pony"} is the Platonic ideal of “a pony”. It’s not a real object in C++’s object model, it’s some elusive, amorphous idea which can be passed around your program, only being given form when initializing some result object, or materializing a temporary. C++17’s prvalues are purer prvalues.

If this all sounds a bit abstract, that’s okay, but internalizing this idea will make it easier to reason about aspects of your program. Consider a simple example:

 
struct foo {}; 
auto x = foo{}; 

In the C++11 model, the prvalue foo{} creates a temporary which is used to move-construct x, but the move is likely elided by the compiler.

In the C++17 model, the prvalue foo{} initializes x.

A more complex example:

 
std::string a() { 
    return "a pony"; 
} 
 
std::string b() { 
    return a(); 
} 
 
int main() { 
    auto x = b(); 
} 

In the C++11 model, return "a pony"; initializes the temporary return object of a(), which move-constructs the temporary return object of b(), which move-constructs x. All the moves are likely elided by the compiler.

In the C++17 model, return "a pony"; initializes the result object of a(), which is the result object of b(), which is x.

In essence, rather than an initializer creating a series of temporaries which in theory move-construct a chain of return objects, the initializer is teleported to the eventual result object.
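The chain above can be sketched with an invented counted type that tallies copy and move constructions. In C++17 the counter is guaranteed to stay at zero on this path; in C++11/14 mode elision here is optional, though mainstream compilers perform it anyway:

```cpp
#include <cassert>

struct counted {
    static int copies_and_moves;
    counted() = default;
    counted(const counted&) { ++copies_and_moves; }
    counted(counted&&) { ++copies_and_moves; }
};
int counted::copies_and_moves = 0;

counted a() { return counted{}; }  // prvalue initializes a()'s result object...
counted b() { return a(); }        // ...which is b()'s result object...

int copies_observed() {
    counted x = b();               // ...which is x: no temporary is materialized
    (void)x;
    return counted::copies_and_moves;
}
```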

Closing

The “guaranteed copy elision” rules do not guarantee copy elision; instead they purify prvalues such that the copy doesn’t exist in the first place. Next time you hear or read about guaranteed copy elision, think instead about deferred temporary materialization. Even if you don’t find the terminology important, this knowledge will help reason about the behavior of your code more easily.

Deferred temporary materialization/guaranteed copy elision has been supported in MSVC since Visual Studio 2017 version 15.6. We’d love for you to download the latest version and try it out. As always, we welcome your feedback. We can be reached via the comments below or via email (visualcpp@microsoft.com). If you encounter other problems with MSVC or have a suggestion for Visual Studio 2017 please let us know through Help > Send Feedback > Report A Problem / Provide a Suggestion in the product, or via Developer Community. You can also find us on Twitter (@VisualC) and Facebook (msftvisualcpp).

If you have any questions about this article, please use the comments below or send them directly to Simon at simon.brand@microsoft.com or @TartanLlama.

Using multi-stage containers for C++ development


Containers are a great tool for configuring reproducible build environments. It’s fairly easy to find Dockerfiles that provide various C++ environments. Unfortunately, it is hard to find guidance on how to use newer techniques like multi-stage builds. This post will show you how you can leverage the capabilities of multi-stage containers for your C++ development. This is relevant to anyone doing C++ development regardless what tools you are using.

Multi-stage builds are Dockerfiles that use multiple FROM statements, where each FROM begins a new stage of the build. You can also name your build stages and copy output from earlier stages into later ones. Before this capability was available, it was common to see build container definitions where output was copied to the host and later copied into deployment containers. This spread the definition of related containers across multiple Dockerfiles that were often driven together via scripts. Multi-stage builds are a lot more convenient than that approach and less fragile. To see a full before-and-after example of multi-stage builds, I recommend looking at the official Docker multi-stage build documentation.

Let’s look at a multi-stage build Dockerfile for a C++ app. This is an app that exposes a service to receive an image, processes the image using OpenCV to circle any found faces, and exposes another endpoint to retrieve the processed image. Here is a complete multi-stage Dockerfile that produces a build container for compiling the application, followed by a runtime container that takes that output and only has the dependencies necessary for running the application as opposed to building it. Here is the source for this article.

FROM alpine:latest as build

LABEL description="Build container - findfaces"

RUN apk update && apk add --no-cache \ 
    autoconf build-base binutils cmake curl file gcc g++ git libgcc libtool linux-headers make musl-dev ninja tar unzip wget

RUN cd /tmp \
    && wget https://github.com/Microsoft/CMake/releases/download/untagged-fb9b4dd1072bc49c0ba9/cmake-3.11.18033000-MSVC_2-Linux-x86_64.sh \
    && chmod +x cmake-3.11.18033000-MSVC_2-Linux-x86_64.sh \
    && ./cmake-3.11.18033000-MSVC_2-Linux-x86_64.sh --prefix=/usr/local --skip-license \
    && rm cmake-3.11.18033000-MSVC_2-Linux-x86_64.sh

RUN cd /tmp \
    && git clone https://github.com/Microsoft/vcpkg.git -n \ 
    && cd vcpkg \
    && git checkout 1d5e22919fcfeba3fe513248e73395c42ac18ae4 \
    && ./bootstrap-vcpkg.sh -useSystemBinaries

COPY x64-linux-musl.cmake /tmp/vcpkg/triplets/

RUN VCPKG_FORCE_SYSTEM_BINARIES=1 ./tmp/vcpkg/vcpkg install boost-asio boost-filesystem fmt http-parser opencv restinio

COPY ./src /src
WORKDIR /src
RUN mkdir out \
    && cd out \
    && cmake .. -DCMAKE_TOOLCHAIN_FILE=/tmp/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-linux-musl \
    && make

FROM alpine:latest as runtime

LABEL description="Run container - findfaces"

RUN apk update && apk add --no-cache \ 
    libstdc++

RUN mkdir /usr/local/faces
COPY --from=build /src/haarcascade_frontalface_alt2.xml /usr/local/faces/haarcascade_frontalface_alt2.xml

COPY --from=build /src/out/findfaces /usr/local/faces/findfaces

WORKDIR /usr/local/faces

CMD ./findfaces

EXPOSE 8080

The first section of the Dockerfile describes the build environment for our application. We’ve used the AS keyword on the FROM line to identify this stage of the build so we can refer to it in subsequent stages. The first RUN line invokes the package manager and pulls down packages like build-base for the compilers and the dev packages for the libraries we need. We are pulling in vcpkg to get our library dependencies. The COPY line copies the host machine src folder into the container. The final RUN statement builds the application using CMake. There is no entry point for this container because all we need it to do is run once and build our output.

The second FROM statement begins the next section of our multi-stage build. Here our dependencies are reduced as we don’t need the compilers or dev packages. Since we statically linked our libraries we don’t even need those, but we do need the GCC runtime libraries since we built with GCC and Alpine uses musl libc. Note the COPY lines that use --from=build. Here we are copying content from the build container without needing to export to the host and copy back into our runtime container. The CMD statement starts our service in the container, and the EXPOSE statement documents the port the container is using for the service.

You could just go ahead and build this container. If you do, you can tag the image, but only the final image is tagged; the images from the earlier stages of the multi-stage build are not. If you’d like to use these earlier images as the base for other containers, you can tag them by specifying the target of the build stage you’d like to stop at. For example, to just run the build stage I can run this command to stop there and tag the image:

docker build --target build -t findfaces/build .

If I want to run the entire build I can run the following command and tag the image from the final stage. If you ran the earlier stage already the image cache for it will be used.

docker build -t findfaces/run .

Now to use my runtime container:

docker run -d --rm -p 8080:8080 --name findfaces findfaces/run

This command runs a container based on the findfaces/run image, detaches from it when it starts (-d), removes the container when it stops (--rm), exposes port 8080 mapped to the same port on the container (-p), and names the container findfaces.

Now that the container is up I can access the service using curl:

curl -X PUT -T mypicture.jpg localhost:8080/files?submit=picture.jpg
curl -X GET localhost:8080/files/facespicture.jpg > facespicture.jpg

If there were faces in the image that our OpenCV library could identify, they are circled in the output image, which has the name used in the submission prepended with “faces”.

When we are done with the application, we can stop it and the container is deleted:

docker stop findfaces

Alpine vs Debian

Above we used Alpine Linux as our base. This is a very small Linux distro that has some differences from more common ones like Debian. If you examine the Dockerfile you’ll notice we copied a file, x64-linux-musl.cmake, from the src directory into the vcpkg triplets directory. That’s because Alpine uses musl libc instead of glibc, so we created a new triplet targeting musl. This is experimental, which is why we have not brought it into vcpkg directly yet. One limitation we found with this triplet is that boost-locale does not compile with it today. This caused us to switch some of our libraries, notably to the restinio http library, which was one of the few http libraries we could find that would compile with this triplet. We have provided an alternate Dockerfile that targets Debian instead and does not use any experimental features.

So why would you want to try Alpine instead of Debian today? If we look at our images, this is what we see for the image sizes when using Debian.

docker image ls
REPOSITORY            TAG              IMAGE ID         CREATED            SIZE
findfaces/run         latest           0e20b1ff7f82     2 hours ago        161MB
findfaces/build       latest           7d1675936cdd     2 hours ago        6.5GB

You can see our build container is much larger than the runtime container, which is obviously desirable for optimizing our resource usage. Our application is 40MB, so the base image we’re running on accounts for the remaining 121MB. Let’s compare that to Alpine.

REPOSITORY             TAG             IMAGE ID         CREATED            SIZE
findfaces/run          latest          0ef4c0b68551     2 hours ago        50.1MB
findfaces/build        latest          fa7fe2783c58     2 hours ago        5.57GB

Not much savings for the build container, but the run container is only 10MB larger than our application.

Summary

This post showed how to use C++ code in containers. We used a multi-stage build where we compiled our assets in one container and consumed those in a runtime container defined in the same Dockerfile. This results in a runtime container optimized for size. We also used vcpkg to get the latest versions of the libraries we are using rather than what is available in the package manager. By statically linking those libraries we reduced the complexity of what binaries we need to manage in our runtime container. We used the Alpine distribution to further ensure that our final runtime container was as small as possible, with a comparison to a Debian-based container. We also exposed our application as a service using an http library so we could access it outside the container.

What next

We are planning to continue looking at containers in future posts. We will have one up soon showing how to use Visual Studio and VS Code with the containers from this post. We will follow that by showing how to deploy these containers to Azure. We will also revisit this application using Windows containers.

Give us feedback

We’d love to hear from you about what you’d like to see covered in the future about containers. We’d love it even more to see the C++ community producing its own content about using C++ with containers. There is very little material out there today, and we believe the potential for C++ in the cloud with containers is huge.

As always, we welcome your feedback. We can be reached via the comments below or via email (visualcpp@microsoft.com). If you encounter other problems or have a suggestion for Visual Studio please let us know through Help > Send Feedback > Report A Problem / Provide a Suggestion in the product, or via Developer Community. You can also find us on Twitter (@VisualC).


New Start Window and New Project Dialog Experience in Visual Studio 2019


Two features available in Visual Studio 2019 Preview 1 for C++ developers are the start window and a revamped new project dialog.

Visual Studio 2019 start window

The main goal of the start window is to make it easier to get to a state where code is loaded in the IDE by concentrating on the commands that a developer will require most often. It also aims to improve the getting-started experience with the IDE, following feedback from new users and months of research in UX labs, where we found that users’ first impression of the IDE was that it is overwhelming on first use due to the large number of features visible in the user interface.

The start window moves the core features from the Visual Studio Start Page, which normally appears in the editor space when Visual Studio is launched, out into a separate window that appears before the IDE launches. The window includes five main sections: Open recent, Clone or checkout code, Open a project or solution, Open a local folder, and Create a new project. It is also possible to continue past the window without opening any code by choosing “Continue without code”. To learn more about our motivations for creating the start window, check out the blog post: The story of the Visual Studio start window.

Let’s dig into the features of the start window:

Open recent

The start window, like the Start Page, keeps track of projects and folders of code that have been previously opened with Visual Studio. It is easy to open these again as needed by clicking on one of the options in the list on the left side of the window.

Clone or checkout code


If your code is in an online source control repository like GitHub or Azure DevOps, you can clone your code directly to a local folder and quickly open it in Visual Studio.

Open a project or solution

This button functions exactly like the File > Open > Open Project/Solution command in the IDE itself. You can use it to select a .sln or Visual Studio project file directly if you have an MSBuild-based solution. If you are using CMake or some non-MSBuild build system though, we recommend going with the Open a local folder option below.

Open a local folder

If you are working with C++ code using a build system other than MSBuild, such as CMake, opening the folder is recommended. Visual Studio 2019, like 2017, contains built-in CMake support that allows you to browse, edit, build, and debug your code without ever generating a .sln or a .vcxproj file. You can also configure a different build system to work with Open Folder. To learn more about Open Folder, check out our documentation on the subject. This button in the start window is equivalent to the File > Open > Open Folder command in the IDE.

Create a new project

Create a new project in Visual Studio 2019
Creating a new project is a common task. For Visual Studio 2019 we have cleaned up and revamped the New Project Dialog to streamline this process. Now, the New Project Dialog no longer includes a “Table of Contents” style list of nodes and sub-nodes for the different templates. This is instead replaced by a section for “Recent project templates” (coming online for Preview 2) which functions similarly to the “Open Recent” section of the main start window. Rather than the New Project Dialog only remembering the precise page you were on last, it will remember the templates you used in the past, in case you would like to use them again.

Furthermore, the overhauled New Project Dialog is designed for a search-first experience. Simply type what you are looking for and the new dialog can find it for you quickly, whether a keyword you use is contained in the template title, the description, or, from Preview 2 onward, one of the tags (boxed categories displayed under each template). You can take things even further by filtering by Language (C++, C#, etc.), Platform (Windows, Linux, Azure, etc.), or Project type (Console, Games, IoT, etc.). While the New Project Dialog will, by default, provide you with a list of templates, you can use these filtering capabilities to refine your search, and easily get back to your templates later when they are saved in the “Recents” list on the left.

Give us your feedback!

We understand that this is a big change for those of you who have been using Visual Studio for a while. We are interested in any feedback you may have on the new start window experience and the revamped New Project Dialog. Give it a try and let us know what you think!

Of course, we understand that some users may prefer going straight into the IDE and doing what they’re used to doing to load code. We provide a way to turn off the new window in Tools > Options > Startup > On startup, open, and choosing something other than the start window. To get the old Start Page back, simply select the “Start Page” option.

To send us feedback:

From the IDE, you can use Help > Send Feedback > Report A Problem to report bugs, or Help > Send Feedback > Suggest a Feature to suggest new features for us to work on. You can also leave us a comment below, or for general queries, you can email us at visualcpp@microsoft.com. Follow us on Twitter @VisualC.

Out-of-Process Debugger for C++ in Visual Studio 2019


Visual Studio 2019 Preview 1 introduces an improved debugger for C++ that uses an external 64-bit process for hosting its memory-intensive components. If you’ve experienced memory-related issues while debugging C++ applications before, these issues should now be largely resolved with Visual Studio 2019.

Background

One of the areas of feedback for the debugger in Visual Studio has been around its high memory usage during debugging of large and complex C++ applications. Most of the memory consumption in this scenario is due to the large amounts of symbol information that the debugger needs to load to show program data in the debugger windows. Memory usage tends to increase over the course of a debugging session as the debugger stops in different parts of the application and encounters new symbols that need to be loaded. Eventually the Visual Studio process might crash due to running out of memory.

We’ve made significant optimizations in Visual Studio 2017 to mitigate this problem. For example, the 15.6 update introduced memory optimizations to /Debug:fastlink, which resulted in a 30% reduction in the debugger’s memory consumption. As we look to further avoid this issue in Visual Studio 2019, we have moved memory-intensive components to a separate 64-bit process.

Case study: Debugging Gears of War 4

We worked closely with internal and external partner teams to ensure the changes we were making to the debugger were validated on large, real-world native applications. The following video shows a side-by-side comparison of the memory usage between Visual Studio 2017 and Visual Studio 2019 while debugging Gears of War 4, developed by The Coalition. Visual Studio 2017 memory usage climbs up to 1.3 GB after a few minutes of stepping through the game code and inspecting variables. Visual Studio 2019 shows far better memory usage in the same scenario: its memory usage stays flat at around 285 MB as the symbol data is kept in the debugger’s 64-bit worker process.

We’ve also tested the runtime performance and ensured there is no noticeable overhead due to calls between components that are now going over a process boundary.

Unsupported scenarios and known issues

Here are the unsupported scenarios and known issues with the updated debugger:

  • The feature is not supported on 32-bit Windows.
  • Symbols for C++/CLI modules are still being loaded in-process.
  • Symbol data requested by the Disassembly Window will be loaded in-process. This will be fixed in Preview 2.
  • Legacy C++ Expression Evaluator add-ins (here is an example) are not supported in this mode.
  • Newer C++ EE add-ins (those written using the IDkmCustomVisualizer interface) will not work unless they are modified to support loading in the external process. If you own one of these extensions, more information can be found here.
  • Finally, for compatibility, 3rd party extensions can still request and get symbols in-process.

If you need to continue using the debugger in-process due to the known issues mentioned above, you can turn the feature off by going to the Debugging tab in Tools > Options and unchecking the option “Load debug symbols in external process (native only)”.

Give us your feedback

Download Visual Studio 2019 Preview 1 today, and start using the improved debugger on your large native applications. If you run into any issues with it (e.g. stability, performance, incompatible extensions or EE add-ins) or if you have a suggestion for us, please let us know via the ‘Send Feedback’ button in the top right corner of Visual Studio or via Developer Community.

 

Lifetime Profile Update in Visual Studio 2019 Preview 2


The Lifetime Profile, part of the C++ Core Guidelines, aims to detect lifetime problems, like dangling pointers and references, in C++ code. It uses the type information already present in the source along with some simple contracts between functions to detect defects at compile time with minimal annotation.

These are the basic contracts that the profile expects code to follow:

  1. Don’t use a potentially dangling pointer.
  2. Don’t pass a potentially dangling pointer to another function.
  3. Don’t return a potentially dangling pointer from any function.

For more information on the history and goals of the profile, check out Herb Sutter’s blog post about version 1.0.

What’s New in Visual Studio 2019 Preview 2

In Preview 2, we’ve shipped a preview release of the Lifetime Profile Checker, which implements the published version of the Lifetime Profile. This checker is part of the C++ Core Checkers in Visual Studio. Highlights of this release include:

  • Support for iterators, string_views, and spans.
  • Better detection of custom Owner and Pointer types, which allows custom types that behave like Containers, Owning-Pointers, or Non-Owning Pointers to participate in the analysis.
  • Type-aware default rules for function call pre- and post-conditions help reduce false positives and improve accuracy.
  • Better support for aggregate types.
  • General correctness and performance improvements.
  • Some simple nullptr analysis.

Enabling the Lifetime Profile Checker Rules

The checker rules are not enabled by default. If you want to try out the new rules, you’ll have to update the code analysis ruleset selected for your project. You can either select the “C++ Core Check Lifetime Rules” – which enables only the Lifetime Profile rules – or you can modify your existing ruleset to enable warnings 26486 through 26489.

Screenshot of the Code Analysis properties page that shows the C++ Core Check Lifetime Rules ruleset selected.


Warnings will appear in the Error List when code analysis is run (Analyze > Run Code Analysis), or if you have Background Code Analysis enabled, lifetime errors will show up in the editor with green squiggles.

Screenshot showing a Lifetime Profile Checker warning with a green squiggle in source code.


Examples

Dangling Pointer

The simplest example – using a dangling pointer – is the best place to start. Here px points to x and then x leaves scope leaving px dangling. When px is used, a warning is issued.

void simple_test()
{
    int* px;
    {
        int x = 0;
        px = &x;
    }
    *px = 1; // error, dangling pointer to 'x'
}

Dangling Output Pointer

Returning dangling pointers is also not allowed. Here, the parameter ppx is presumed to be an output parameter. It’s set to point to x, which goes out of scope at the end of the function. This leaves *ppx dangling.

void out_parameter(int x, int** ppx)  // *ppx points to 'x' which is invalid
{
    *ppx = &x;
}

Dangling String View

The last two examples were obvious, but temporary instances can introduce subtle bugs. Can you find the bug in the following code?

std::string get_string();
void dangling_string_view()
{
    std::string_view sv = get_string();
    auto c = sv.at(0);
}

In this case, the string view sv is constructed with the temporary string instance returned from get_string(). The temporary string is then destroyed which leaves the string view referencing an invalid object.

Dangling Iterator

Another hard-to-spot lifetime issue happens when using an invalidated iterator into a container. In the case below, the call to push_back may cause the vector to reallocate its underlying storage, which invalidates the iterator it.

void dangling_iterator()
{
    std::vector<int> v = { 1, 2, 3 };
    auto it = v.begin();
    *it = 0; // ok, iterator is valid
    v.push_back(4);
    *it = 0; // error, using an invalid iterator
}

One thing to note about this example is that there is no special handling for ‘std::vector::push_back’. This behavior falls out of the default profile rules. One rule classifies containers as an ‘Owner’. Then, when a non-const method is called on the Owner, its owned memory is assumed invalidated and iterators that point at the owned memory are also considered invalid.

Modified Owner

The profile is prescriptive in its guidance. It expects that your code uses the type system idiomatically when defining function parameters. In this next example, std::unique_ptr, an ‘Owner’ type, is passed to another function by non-const reference. According to the rules of the profile, Owners that are passed by non-const reference are assumed to be modified by the callee.

void use_unique_ptr(std::unique_ptr<int>& upRef);
void assumes_modification()
{
    auto unique = std::make_unique<int>(0); // Line A
    auto ptr = unique.get();
    *ptr = 10; // ok, ptr is valid
    use_unique_ptr(unique);
    *ptr = 10; // error, dangling pointer to the memory held by 'unique' at Line A
}

In this example, we get a raw pointer, ptr, to the memory owned by unique. Then unique is passed to the function use_unique_ptr by non-const reference. Because this is a non-const use of unique where the function could do anything, the analysis assumes that unique is invalidated somehow (e.g. by unique_ptr::reset), which would cause ptr to dangle.

More Examples

There are many other cases that the analysis can detect. Try it out in Visual Studio on your own code and see what you find. Also check out Herb’s blog for more examples and, if you’re curious, read through the Lifetime Profile paper.

Known Issues

The current implementation doesn’t fully support the analysis as described in the Lifetime Profile paper. Here are the broad categories that are not implemented in this release.

  • Annotations – The paper introduces annotations (i.e. [[gsl::lifetime-const]]) which are not supported. Practically this means that if the default analysis rules aren’t working for your code, there’s not much you can do other than suppressing false positives.
  • Exceptions – Exception handling paths, including the contents of catch blocks, are not currently analyzed.
  • Default Rules for STL Types – In lieu of a lifetime-const annotation, the paper recommends that for the rare STL container member functions where we want to override the defaults, we treat them as if they were annotated. For example, one overload of std::vector::at is not const because it can return a non-const reference – however we know that calling it is lifetime-const because it doesn’t invalidate the vector’s memory. We haven’t completed the work to do this implicit annotation of all the STL container types.
  • Lambda Captures – If a stack variable is captured by reference in a lambda, we don’t currently detect if the lambda leaves the scope of the captured variable.
    auto lambda_test()
    {
        int x;
        auto captures_x = [&x] { return x; };
        return captures_x; // returns a dangling reference to 'x'
    }

Wrap Up

Try out the Lifetime Profile Checker in Visual Studio 2019 Preview 2. We hope that it will help identify lifetime problems in your projects. If you find false positives or false negatives, please report them so we can prioritize the scenarios that are important to you. If you have suggestions or problems with this check — or any Visual Studio feature — either Report a Problem or post on Developer Community and let us know. We’re also on Twitter at @VisualC.

MSVC Backend Updates in Visual Studio 2019 Preview 2: New Optimizations, OpenMP, and Build Throughput improvements


In Visual Studio 2019 Preview 2 we have continued to improve the C++ backend with new features, new and improved optimizations, build throughput improvements, and quality of life changes.

New Features

  • Added a new inlining command line switch: -Ob3. -Ob3 is a more aggressive version of -Ob2. -O2 (optimize the binary for speed) still implies -Ob2 by default, but this may change in the future. If you find the compiler is under-inlining, consider passing -O2 -Ob3.
  • Added basic support for OpenMP SIMD vectorization, which is the most widely used OpenMP feature in machine learning (ML) libraries. Our case study is the Intel MKL-DNN library, which is used as a building block for other well-known open source ML libraries including TensorFlow. This can be turned on with a new CL switch, -openmp:experimental. This allows loops annotated with “#pragma omp simd” to potentially be vectorized. The vectorization is not guaranteed, and loops that are annotated but not vectorized will get a warning reported. No SIMD clauses are supported; they will simply be ignored, with a warning reported.
  • Added a new C++ exception handler __CxxFrameHandler4 that reduces exception handling metadata overhead by 66%. This provides up to a 15% total binary size improvement on binaries that use large amounts of C++ exception handling. Currently default off, try it out by passing “/d2FH4” when compiling with cl.exe. Note that /d2FH4 is otherwise undocumented and unsupported long term. This is not currently supported on UWP apps as the UWP runtime does not have this feature yet.
  • To support hand vectorization of loops containing calls to math library functions and certain other operations like integer division, MSVC now supports Short Vector Math Library (SVML) intrinsic functions that compute the vector equivalents. Support for 128-bit, 256-bit and 512-bit vectors is available for most functions, with the exceptions listed below. Note that these functions do not set errno. See the Intel Intrinsic Guide for definitions of the supported functions.
    Exceptions include:
    • Vector integer combined division and remainder is only available for 32-bit elements and 128-bit and 256-bit vector lengths. Use separate division and remainder functions for other element sizes and vector lengths.
    • SVML square-root is only available in 128-bit and 256-bit vector lengths. You can use _mm512_sqrt_pd or _mm512_sqrt_ps functions for 512-bit vectors.
    • Only 512-bit vector versions of rint and nearbyint functions are available. In many cases you can use round functions instead, e.g. use _mm256_round_ps(x, _MM_FROUND_CUR_DIRECTION) as a 256-bit vector version of rint, or _mm256_round_ps(x, _MM_FROUND_TO_NEAREST_INT) for nearbyint.
    • Only 512-bit reciprocal is provided. You can compute the equivalent using set1 and div functions, e.g. 256-bit reciprocal could be computed as _mm256_div_ps(_mm256_set1_ps(1.0f), (x)).
    • There are SVML functions for single-precision complex square-root, logarithm and exponentiation only in 128-bit and 256-bit vector lengths.

New and Improved Optimizations

  • Unrolled memsets and block initializations will now use SSE2 instructions (or AVX instructions if allowed). The size threshold for what will be unrolled has increased accordingly (compile for size with SSE2: unroll threshold moves from 31 to 63 bytes, compile for speed with SSE2: threshold moves from 79 to 159 bytes).
  • Optimized the code-gen for small memsets, primarily targeting InitAll-protected functions.
  • Improvements to the SSA Optimizer’s redundant store elimination: better escape analysis and handling of loops.
  • The compiler recognizes memmove() as an intrinsic function and optimizes accordingly. This improves code generation for operations built on memmove(), including std::copy() and other higher-level library code such as std::vector and std::string construction.
  • The optimizer does a better job of optimizing short, fixed-length memmove(), memcpy(), and memcmp() operations.
  • Implemented switch duplication optimization for better performance of switches inside hot loops. We duplicated the switch jumps to help improve branch prediction accuracy and consequently, run time performance.
  • Added constant-folding and arithmetic simplifications for expressions using SIMD (vector) intrinsic, for both float and integer forms. Most of the usual expression optimizations now handle SSE2 and AVX2 intrinsics, either from user code or a result of automatic vectorization.
  • Several new scalar fused multiply-add (FMA) patterns are identified with /arch:AVX2 /fp:fast. These include the following common expressions: (x + 1.0) * y; (x - 1.0) * y; (1.0 - x) * y; (-1.0 - x) * y.
  • Sequences of code that initialize a __m128 SIMD (vector) value element-by-element are identified and replaced by a _mm_set_ps intrinsic. This allows the new SIMD optimizations to consider the value as part of expressions, useful especially if the value has only constant elements. A future update will support more value types.
  • Common sub-expression elimination (CSE) is more effective in the presence of variables which may be modified in indirect ways because they have their address taken.
  • Useless struct/class copies are being removed in several more cases, including copies to output parameters and functions returning an object. This optimization is especially effective in C++ programs that pass objects by value.
  • Added a more powerful analysis for extracting information about variables from control flow (if/else/switch statements), used to remove branches that can be proven to be always true or false and to improve the variable range estimation. Code using gsl::span sees improvements, with some unnecessary range checks now being removed.
  • The devirtualization optimization will now have additional opportunities, such as when classes are defined in anonymous namespaces.

Build Throughput Improvements

  • Filter debug information during compilation based on referenced symbols and types to reduce debug section size and improve linker throughput. Updating from 15.9 to 16.0 can reduce the input size to the linker by up to 40%.
  • Link time improvements in PDB type merging and creation.
  • Updating to 16.0 from 15.9 can improve link times by up to 2X. For example, linking Chrome resulted in a 1.75X link time speedup when using /DEBUG:full, and a 1.4X link time speedup when using /DEBUG:fastlink.

Quality of Life Improvements

  • The compiler displays file names and paths using user-provided casing, where previously it displayed lower-cased file names and paths.
  • The new linker will now report potentially matched symbol(s) for unresolved symbols, like:
        main.obj : error LNK2019: unresolved external symbol _foo referenced in function _main
          Hint on symbols that are defined and could potentially match:
            "int __cdecl foo(int)" (?foo@@YAHH@Z)
            "bool __cdecl foo(double)" (?foo@@YA_NN@Z)
            @foo@0
            foo@@4
        main.exe : fatal error LNK1120: 1 unresolved externals
  • When generating a static library, it is no longer required to pass the /LTCG flag to LIB.exe.
  • Added a linker option /LINKREPROTARGET:[binary_name] to generate a link repro only for the specified binary. This allows %LINK_REPRO% or /LINKREPRO:[directory_name] to be set in a large build with multiple link invocations, with the linker generating the repro only for the binary specified in /LINKREPROTARGET.

We’d love for you to download Visual Studio 2019 and give it a try. As always, we welcome your feedback. We can be reached via the comments below or via email (visualcpp@microsoft.com). If you encounter problems with Visual Studio or MSVC, or have a suggestion for us, please let us know through Help > Send Feedback > Report A Problem / Provide a Suggestion in the product, or via Developer Community. You can also find us on Twitter (@VisualC) and Facebook (msftvisualcpp).

Template IntelliSense Improvements for Visual Studio 2019 Preview 2


In the first version of Template IntelliSense, we introduced the Template Bar which allowed you to provide sample arguments for your template in order to get a richer IntelliSense experience within the template body. Since then, we’ve received a lot of great feedback and suggestions which have led to significant improvements. Our latest iteration includes the following:

  • Peek Window UI
  • Live Edits
  • Nested Template support
  • Default Argument watermarks

Peek Window UI and Live Edits

Clicking the edit button on the Template Bar no longer brings up a modal dialog; instead, it opens a Peek Window. The benefit of the Peek Window UI is that it integrates more smoothly into your workflow and allows you to perform live edits. As you type your sample template arguments, the IntelliSense in the template body updates in real time to reflect your changes. This lets you quickly see how various arguments may affect your code. In the example below, we see that we get Member List completion for std::string, but we get a red squiggle when we change the sample argument to double.

Nested Template Support and Default Argument Watermarks

We’ve also improved our Template Bar support for nested templates. Previously, the Template Bar would only appear at the top-level parent. Now, it appears at the header of the innermost template enclosing the cursor. Note that even from within the member function template you will be able to modify the sample argument of the containing class template:

You’ll also notice that we auto-populate the Peek Window textbox with a watermark if there is a default argument (as in the case of V above). Keeping that textbox as-is will use the default value for IntelliSense; otherwise, you can specify a different sample argument.

Other Productivity Features in Preview 2

C++ Productivity Improvements in Visual Studio 2019 Preview 2

Talk to Us!

We’d love for you to download Visual Studio and give Template IntelliSense a try. As always, we welcome your feedback. We can be reached via the comments below or via email (visualcpp@microsoft.com). If you encounter other problems with MSVC or have a suggestion for Visual Studio, please let us know through Help > Send Feedback > Report A Problem / Provide a Suggestion in the IDE, or via Developer Community. You can also find us on Twitter (@VisualC).
