Steven Siew

Generating Test Data

Generating Test Data for Julia Source Code

Sometimes you need to construct some data for your Julia program, and you want to do it easily. You want the following:

  1. Deterministic. You need the same data regardless of which version of Julia you are using. You also want it to be reproducible, so that anyone who runs the source code gets exactly the same data.

  2. Looks "nice" in your source code. You do not want 15 decimal digits of precision for each data point.

  3. Easy to generate. You do not want to spend 2 hours generating 200 data points. You want it done in 2 minutes.

In this article, we show some code that does this quickly. The code requires the following packages:

  • StableRNGs
  • Distributions
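
StableRNGs matters for the first requirement because Julia's built-in random number generator does not promise the same stream across Julia versions, while StableRNGs is designed to keep its streams stable. The sketch below assumes StableRNGs is installed (for example with `using Pkg; Pkg.add("StableRNGs")`). It checks that two generators created from the same seed produce identical numbers. The seed 1234 and the variable names are illustrative only.

```julia
using StableRNGs

# Two independent generators built from the same (illustrative) seed.
rng_a = StableRNG(1234)
rng_b = StableRNG(1234)

# Draw five Float64 values from each generator.
a = rand(rng_a, 5)
b = rand(rng_b, 5)

# Same seed, same stream: the two vectors are identical.
println(a == b)   # true
```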

Here is the source code:

using Distributions,StableRNGs

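#=
Generate Float64 data deterministically: apply `formula` to each input value,
add seeded noise drawn from `noisedist`, optionally round to `digits` decimal
places, and print the result as a Julia array literal.
=#

function generatedata(formula;input=[Float64(i) for i in 0:(8-1)],seed=1234,digits=0,
preamble="rawdata = ",noisedist=Normal(0.0,0.0),verbose=true)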
    vprint(verbose,str) = print( verbose == true ? str : "")
    local myrng = StableRNG(seed)
    local array = formula.(input) + rand(myrng,noisedist,length(input))
    if digits > 0
        array = round.(array,digits=digits)
    end
    vprint(verbose,preamble)
    # Now print the beginning array symbol '['
    vprint(verbose,"[")
    local counter = 0
    local firstitem = true
    for element in array
        if firstitem == true
            firstitem = false
        else
            vprint(verbose,",")
        end
        counter +=1
        if counter % 5 == 1
            vprint(verbose,"\n  ")
        end
        vprint(verbose,element)
    end
    vprint(verbose,"]\n")
    return array
end

generatedata(x->0.4*x+0.3,input=[ Float64(i) for i in 0:(10-1)],digits=2,noisedist=Normal(0.0,0.2));
println("The end.")

We generate the raw data with the formula f(x) = 0.4*x + 0.3, the input data [0.0, 1.0, 2.0, ... , 9.0], and additive noise drawn from the normal distribution Normal(0.0, 0.2), that is, mean 0.0 and standard deviation 0.2. The output has only 5 data points per line, and each data point is rounded to 2 digits after the decimal point.

Here is the output:

rawdata = [
  0.4,0.88,0.76,1.24,1.76,
  2.16,2.72,3.08,3.21,3.52]
The end.
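
The five-values-per-line layout can also be written more compactly with `Iterators.partition` and `join` from Julia's standard library. This is a minimal sketch of an alternative, not the code used above: `formatliteral` and its `perline` keyword are hypothetical names introduced here for illustration, and the input values are copied from the output shown above.

```julia
# Hypothetical helper: format a vector as a Julia array literal,
# with `perline` values on each row.
function formatliteral(values; preamble="rawdata = ", perline=5)
    # Split the values into chunks and join each chunk with commas.
    rows = [join(chunk, ",") for chunk in Iterators.partition(values, perline)]
    # Put each row on its own indented line inside the brackets.
    return preamble * "[\n  " * join(rows, ",\n  ") * "]\n"
end

# Values taken from the article's printed output.
values = [0.4, 0.88, 0.76, 1.24, 1.76, 2.16, 2.72, 3.08, 3.21, 3.52]
print(formatliteral(values))
```

Building the whole string first and printing it once keeps the formatting logic separate from the decision to print, so the same helper could also write to a file.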

Now you can copy and paste the generated data straight into your Julia source code. If you are even lazier, you can assign the returned array directly (pass verbose=false as well if you do not want the data printed):

rawdata = generatedata(x->0.4*x+0.3,input=[ Float64(i) for i in 0:(10-1)],digits=2,noisedist=Normal(0.0,0.2))
