When you want to convince people that Julia is the best language for some task, some are easily swayed by the beautiful syntax, rank polymorphism, and flirtations with functional programming. In all honesty, those people need little convincing. Others want to see hard numbers!

Here's a very small demo that shows how Julia can outperform any traditional language: computing polynomials.

Suppose we have a function described by

where we have the coefficients stored in some dense vector.

```
struct Polynomial{T}
c::Vector{T}
end
```

We can evaluate this function as follows:

```
function (f::Polynomial{T})(x::T) where T<:Number
result = f.c[1]
xpow = x
for c in f.c[2:end-1]
result += xpow * c
xpow *= x
end
result + xpow * f.c[end]
end
```

This is very much how you would implement such a function in most other languages. In Julia however, we can do something much nicer: generate a function.

```
function expand(f::Polynomial{T}) where T<:Number
:(function (x::$T)
r = $(f.c[1])
xp = x
$((:(r += xp*$c; xp*=x) for c in f.c[2:end-1])...)
r + xp * $(f.c[end])
end)
end
```

Now, when we `expand(f)`

, say we have `f = Polynomial(1.0, 0.5, 0.333, 0.25)`

the generated function will look as follows:

```
:(function (x::Float64,)
r = 1.0
xp = x
begin
r += xp * 0.5
xp *= x
end
begin
r += xp * 0.333
xp *= x
end
r + xp * 0.25
end)
```

To run this, we say `f_unroll = eval(expand(f))`

. Not only does this run much faster than our previous loopy implementation, it even outperforms C++ (GCC at `-O3`

) by more than a factor of two (depending on your machine). The reason is that we have no loop in our generated code, and more importantly, we are not accessing any memory. We have done something that a C++ programmer wouldn't dream of doing: write a code generator, compile the specialised code, and fly to the moon!

A completely literate and Documentered implementation of this demo can be found here.

Edit: Thanks to Nicos Pitsianis for point out to me that there is a more efficient method called Horner's method. This method applies to both Julia and C++ implementations, reducing the amount of multiplications needed. Using this method reduces the difference between Julia and C++ a little bit, but increasing the order of the polynomial grows it back again.

## Top comments (5)

I've tried to reproduce the benchmarks of all three versions of the algorithm, you presented in "

A completely literate and Documentered implementation...". There are three variants presented (variant 2 and 3 are shown in the post above).My environment is Julia 1.9.1 on Apple M1. I've used the

`@benchmark`

-macro from the`BenchmarkTools`

-package. This macro calls the expression to be measured several hundred times in order to obtain a reliable statistics. So there is no need for the functions`test_f1`

,`test_f2`

and`test_f3`

. I've basically evaluated the expression inside these functions (which is more or less equivalent to calling these functions with`n = 1`

).This resulted in the following run times in ms for evaluating 100,000 arguments with variants 1-3: 6.35/2.30/0.15 (

originalin the table below).Then I tried to insulate the pure algorithm a bit more, eliminating other effects which may degrade run time. There are basically two aspects:

`f`

and its arguments`xs`

) isn't recommended for the Julia compiler. It is better to pass them as function arguments of to declare them as. It did the latter.`const`

`LinRange`

for the arguments uses lazy evaluation. I.e. the numbers are only created when needed for evaluation. So I applied a`collect`

on them in order topre-allocateall arguments.Using

`const`

values is a huge improvement for variant 2. So the speedup from variant 2 to 3 gets much smaller (but still is great).The following table shows all results and the speedups based on these numbers. For comparison I've also added the numbers of the Julia standard function

`evalpoly`

.Shouldn't it be faster if you used Horner's rule to evaluate the polynomial? It would require a single multiplication (instead of two) and a single addition per coefficient.

The unrolled code will also be shorter.

Thanks for the tip! I'll add this to the code.

Note that Julia

`Base`

has a function for evaluating a polynomial given by coefficients:`Base.Math.evalpoly`

:docs.julialang.org/en/v1/base/math...

On the other hand, when you're willing to preprocess the polynomial (prior to evaluating it lots of times), techniques that may be faster than

`evalpoly`

are possible.Very interesting. Also something that most are unaware of. i.e literate programming. 👍