Native File Format

In FitBenchmarking, the native file format is used to read IVP, Horace, and SASView problems.

In this format, data is separated from the function. This allows running the same dataset against multiple different models to assess which is the most appropriate.

Examples of native problems are:

# Fitbenchmark Problem
software = 'ivp'
name = 'Lorentz'
description = 'A simple lorentz system for testing the ivp parser. Exact results should be 10, 28, 8/3.'
input_file = 'lorentz3d.txt'
function = 'module=functions/lorentz,func=lorentz3d,step=0.1,sigma=11,r=30,b=3'
plot_scale = 'linear'

# FitBenchmark Problem
software = 'SASView'
name = '1D cylinder (synthetic neutron) IV0'
description = 'A first iteration synthetic dataset generated for the 1D cylinder SASView model in the fashion of neutron small angle scattering experiments. Generated on Fri May 28 10:31:19 2021.'
input_file = '1D_cylinder_20_400_nosmearing_neutron_synth.txt'
function = 'name=cylinder,radius=35.0,length=350.0,background=0.0,scale=1.0,sld=4.0,sld_solvent=1.0'
plot_scale = 'loglog'
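
Files like the two above consist of a comment line followed by `key = 'value'` entries. A minimal sketch of reading that layout is given here for illustration only. It is not FitBenchmarking's own parser: the helper names `parse_problem_text` and `split_function_string` are hypothetical, and it assumes every entry sits on a single line.

```python
def parse_problem_text(text):
    """Collect key = 'value' pairs, skipping blank lines and # comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, since the function string contains more.
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip().strip("'\"")
    return entries


def split_function_string(function):
    """Split 'a=1,b=2' into a dict; values are kept as strings."""
    return dict(pair.split("=", 1) for pair in function.split(","))


example = """# FitBenchmark Problem
software = 'ivp'
name = 'Lorentz'
input_file = 'lorentz3d.txt'
function = 'module=functions/lorentz,func=lorentz3d,step=0.1,sigma=11,r=30,b=3'
plot_scale = 'linear'
"""

problem = parse_problem_text(example)
function = split_function_string(problem["function"])
print(problem["software"], function["func"], function["step"])
```

Splitting on the first `=` only is what keeps the function string intact, because that value itself contains further `=` characters.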

These examples show the basic structure in which the file starts with a comment indicating it is a FitBenchmark problem followed by key-value pairs. Available keys are described below:

software

Either ‘IVP’, ‘SasView’, or ‘Horace’ (case insensitive).

This defines whether to use an IVP format or SasView to generate the model.

For information on the ‘Horace’ format, see Horace File Format.

The component of SasView we use is SasModels, which is available under a BSD 3-clause licence.

name

The name of the problem.

This is used as a unique reference, so it should not match any other name in the dataset. A sanitised version of the name, with commas stripped out and spaces replaced by underscores, is also used in filenames.
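
The sanitising rule described above can be sketched in a few lines. The function name `sanitise_name` is hypothetical, and the real implementation may apply further rules that are not covered here.

```python
def sanitise_name(name):
    """Strip commas and replace spaces with underscores."""
    return name.replace(",", "").replace(" ", "_")


print(sanitise_name("1D cylinder (synthetic neutron) IV0"))
# prints: 1D_cylinder_(synthetic_neutron)_IV0
```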

description

A description of the dataset.

This will be displayed in the Problem Summary Pages and Fitting Reports produced by a benchmark.

input_file

The name of a file containing the data to fit.

The file must be in a subdirectory named data_files, and should have the form:

header

x11 [x12 [x13 ...]] y11 [y12 [y13 ...]] [e11 [e12 ...]]
x21 [x22 [x23 ...]] y21 [y22 [y23 ...]] [e21 [e22 ...]]
...

Most software uses the convention of # X Y E as the header, and SASView uses the convention <X> <Y> <E>, although neither is enforced. The error column is optional in this format.

If the data contains multiple inputs or outputs, the header must be written in one of the above conventions, with each label being “x”, “y”, or “e” followed by a number. An example of this can be seen in examples/benchmark_problems/Data_Assimilation/data_files/lorentz.txt.
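
To make the layout concrete, here is a minimal sketch of reading such a file into named columns. It is an illustration, not the reader FitBenchmarking uses: the helper `read_data` is hypothetical, the sample values are made up, and it assumes a one-line header followed by whitespace-separated numeric rows.

```python
def read_data(text):
    """Return a dict mapping lower-case header labels to columns of floats."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Accept both '# X Y E' and '<X> <Y> <E>' style headers.
    labels = [tok.strip("<>").lower() for tok in lines[0].lstrip("#").split()]
    columns = {label: [] for label in labels}
    for row in lines[1:]:
        for label, value in zip(labels, row.split()):
            columns[label].append(float(value))
    return columns


# Made-up sample data with an error column.
with_errors = read_data("# X Y E\n1 2 0.1\n2 4 0.2\n")
# Made-up sample data in the SASView header style, without errors.
without_errors = read_data("<X> <Y>\n1 2\n2 4\n")
print(with_errors)
print(without_errors)
```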

plot_scale

The scale of the x and y axes for the plots. The options are ‘loglog’, ‘logy’, ‘logx’ and ‘linear’. If this is not set, it defaults to ‘linear’.

function

This defines the function that will be used as a model for the fitting.

Inside FitBenchmarking, this is passed on to the specified software and, as such, the format is specific to the package we wish to use, as described below.

IVP

The IVP parser allows a user to define f in the following equation:

\[x' = f(t, x, *args)\]

To do this, we use a Python module to define the function. As in the above formula, the function can take the following arguments:

t (float): The time to evaluate at
x (np.array): A value for x to evaluate at
*args (floats): The parameters to fit

To link to this function we use a function string with the following parameters:

module: The path to the module
func: The name of the function within the module
step: The time step that the input data uses (currently only fixed steps are supported - if you need varying time steps please raise an issue on our GitHub)
*args: Starting values for the parameters
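
As an illustration, a module for the IVP format could contain a function like the one sketched below. It implements the standard three-variable Lorenz system with the argument order (t, x, *args) described above. It is an assumption that this resembles the module referenced by the example function string; the actual contents of functions/lorentz may differ.

```python
import numpy as np


def lorentz3d(t, x, sigma, r, b):
    """Right-hand side f(t, x, *args) of the three-variable Lorenz system.

    t is unused because the system is autonomous, but it is kept so that
    the signature matches f(t, x, *args).
    """
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (r - x[2]) - x[1],
        x[0] * x[1] - b * x[2],
    ])


# Evaluate at an arbitrary state using the exact values quoted in the example.
derivative = lorentz3d(0.0, np.array([1.0, 1.0, 1.0]), 10.0, 28.0, 8.0 / 3.0)
print(derivative)
```

With a function string such as the one in the first example, sigma, r and b would be the parameters to fit, starting from the values given in the string.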

SASView

SASView functions can be any of the models provided by SasModels.

Horace

The Horace functions are described in Horace File Format.

jacobian

This defines a dense jacobian function, a sparse jacobian function, or both. The parser for this function allows the user to define g in the following:

\[\nabla_p f(x, *args) = g(x, *args)\]

To do this, we use a Python module in which the user can define a dense jacobian function, a sparse jacobian function, or both. As in the above formula, both functions can take the following arguments:

x (np.array): A value for x to evaluate at
*args (floats): The parameters to fit

To link to these functions, we use a string with the following parameters:

module: The path to the module
dense_func: The name of the dense jacobian function within the module
sparse_func: The name of the sparse jacobian function within the module

The sparse jacobian function provided must return a matrix in a sparse format (e.g. coo, csr, csc); otherwise an error will be thrown.
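
A jacobian module along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the model f(x, a, b) = a * exp(b * x) is hypothetical, as are the function names `dense_jac` and `sparse_jac`, and SciPy's `csr_matrix` is used as one possible sparse format.

```python
import numpy as np
from scipy import sparse


def dense_jac(x, a, b):
    """Dense jacobian of the hypothetical model a * exp(b * x).

    Each row corresponds to one data point and each column to one
    parameter: [df/da, df/db].
    """
    return np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])


def sparse_jac(x, a, b):
    """The same jacobian, returned in SciPy's CSR sparse format."""
    return sparse.csr_matrix(dense_jac(x, a, b))


x = np.array([0.0, 1.0, 2.0])
print(dense_jac(x, 2.0, 0.0))
print(sparse.issparse(sparse_jac(x, 2.0, 0.0)))
```

Building the sparse matrix from the dense one keeps the sketch short. For a genuinely large and sparse jacobian, constructing it directly from its non-zero entries would avoid allocating the dense array.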

fit_ranges

This specifies the region to be fit.

It takes the form shown in the example, where the first number is the minimum in the range and the second is the maximum.

parameter_ranges

An optional setting which specifies upper and lower bounds for parameters in the problem.

As with fit_ranges, the first number is the minimum of the range and the second is the maximum.

Note

Currently in FitBenchmarking, problems with parameter_ranges can be handled by the SciPy, Bumps, Minuit, Mantid, Matlab Optimization Toolbox, DFO, Levmar and RALFit fitting software. Note that the following Mantid minimizers currently throw an exception when parameter_ranges are used: BFGS, Conjugate gradient (Fletcher-Reeves imp.), Conjugate gradient (Polak-Ribiere imp.) and SteepestDescent.