
Reads a file containing columns of data and creates a dictionary according to header information using the following format:

  1. comments start with # in the first column : # My Comment

  2. header information starts with #!, where # is also in the first column. It must precede the data.

  3. each column is represented by name[dtype,n]/, where dtype (optional) is the data type and n (starting at 0) is the column number

  4. data are separated by white space NOT commas!

  5. data types:

    • s : string

    • f : float

    • i : integer

  6. blank lines are ignored

NOTE: if data formats are entered they should be specified for all columns

Example data:

#! p_miss[0]/ siglt[1]/ s01[2]/ alt[3]/
200. 1.35e-4 -1.e-3    0.1
220. 2.56e-4 -2.e-4    -0.1
230. 3.47e-6 -3.e-5    1.1

The header can also contain data type information:

#! p_miss[f,0]/ siglt[f,1]/ s01[f,2]/ alt[f,3]/

200. 1.35e-4 -1.e-3    0.1
220. 2.56e-4 -2.e-4    -0.1
230. 3.47e-6 -3.e-5    1.1
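
The format rules above can be illustrated with a small stand-alone parser. This is only a sketch of the file syntax, not the LT.datafile implementation: the names parse_header and parse_lines, the sample lines, and the choice of float as the default data type when dtype is omitted are all assumptions made for illustration.

```python
import re

# Matches one column description such as  p_miss[f,0]/  or  p_miss[0]/
HEADER_ITEM = re.compile(r"(\w+)\[(?:([sfi]),)?(\d+)\]/")
CONVERTERS = {"s": str, "f": float, "i": int}


def parse_header(line):
    """Return a list of (name, dtype, column) tuples from a '#!' header line."""
    if not line.startswith("#!"):
        raise ValueError("header lines must start with '#!'")
    # dtype is optional; this sketch ASSUMES float when it is omitted
    return [(name, dtype or "f", int(col))
            for name, dtype, col in HEADER_ITEM.findall(line)]


def parse_lines(lines):
    """Turn data-file lines into a list of dictionaries, one per data line."""
    columns, records = [], []
    for line in lines:
        if not line.strip():
            continue                      # rule 6: blank lines are ignored
        if line.startswith("#!"):
            columns = parse_header(line)  # rule 2: header precedes the data
            continue
        if line.startswith("#"):
            continue                      # rule 1: comment line
        fields = line.split()             # rule 4: white-space separated
        records.append({name: CONVERTERS[dtype](fields[col])
                        for name, dtype, col in columns})
    return records


# Hypothetical sample content following the syntax described above
sample = [
    "# My Comment",
    "#! p_miss[f,0]/ siglt[f,1]/ s01[f,2]/ alt[f,3]/",
    "",
    "200. 1.35e-4 -1.e-3    0.1",
    "220. 2.56e-4 -2.e-4    -0.1",
]
records = parse_lines(sample)
print(records[0]["p_miss"], records[1]["alt"])
```

Each data line becomes one dictionary whose keys are the names given in the header, which mirrors how the loop examples below access values by key.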

Example that opens a file and creates a dfile object:

>>> f0 = dfile('sig_LT.dat')

Loop over content:

>>> for l in f0:
...     pm = l['p_miss']
...     sig_lt = l['siglt']*10000.
...     print(pm, sig_lt, l['alt'])

Variables can also be accessed directly as numpy arrays (if numpy is installed):

>>> pm = f0['p_miss']
>>> sig_lt = f0['siglt']*10000.

They can also be converted to attributes:

>>> f0.make_attr()
>>> print(f0.p_miss)
>>> print(f0.siglt)

Variable names are also called keys.

Save the data file as a csv file. This is useful for exporting the data and formatting them for documents (other than LaTeX):

>>> f0.write_csv(filename)
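
To show what such an export amounts to, here is a minimal sketch using only the standard csv module. It is not the write_csv method of dfile: the function records_to_csv, its arguments and the sample records are hypothetical, and the actual column order and number formatting written by the library may differ.

```python
import csv
import io


def records_to_csv(records, keys, fp):
    """Write a list of dictionaries as comma-separated values to fp."""
    # keys fixes the column order; values for other keys are ignored
    writer = csv.DictWriter(fp, fieldnames=keys, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)


# Hypothetical records, one dictionary per data line
records = [{"p_miss": 200.0, "alt": 0.5},
           {"p_miss": 220.0, "alt": -0.5}]

buf = io.StringIO()                 # stands in for an open file
records_to_csv(records, ["p_miss", "alt"], buf)
print(buf.getvalue())
```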

class LT.datafile.dfile(filename, debug=False, new=False, skip=True, use_numpy=True, fast=False, adata=None)

Open a file, read and interpret the contents and return a dfile object:

>>> df = dfile('my_datafile')


Keyword arguments:

  • debug = False (default): if True, print additional information

  • fast = False (default): if True, load the data in a fast way (using np.loadtxt). This has some limits for string entries and is best used for large, purely numerical data.

  • use_numpy = True (if numpy is installed): set automatically but can be overridden. Some attributes of datafile are not available when numpy is not present.

  • adata = None (default): if set to a list of strings (my_lines), use these strings as the data file content. The content of my_lines must follow the datafile syntax.

add_data(keys, line)

add data to the data file:

>>> d.add_data('x:y:z', 'xval yval zval')

xval, yval and zval are the values as they would be entered in a data file line. Both arguments are strings and together describe a complete line, i.e. values for all variables defined in the header line.


add a comment line to the header. The # at the beginning is added automatically.

  • for a normal comment start with a space

  • to add a parameter start the comment with a

add_key(key, format='f')

add a new key. This is useful if you want to add new data. First create the key:

>>> df.add_key('newkey', format='i')

Then assign the new values in a loop over the data:

>>> for d in df:
...     d['newkey'] = some_new_value

where df is the datafile and 'newkey' the new key. The new data set can be saved using fp = open('new_file','w') and df.write_all(fp).

add_parameter(name, value)

add a parameter to the comment section

check_data(func, data, key, *args)

function used to check whether the data fulfill a condition provided by the user. The function is assumed to return True or False.


remove a key and all values associated with it


iterator over all data evaluating an expression contained in eval_str


return a list of all data stored

get_data(key, sel_func=None, sel_args=None)

return all data for key subject to the results of a selector function.

one can define a selector function (sel_func) using the arguments stored in sel_args. This function is evaluated for each data record and only those data are returned for which the condition is fulfilled.

  1. sel_func: a user provided function returning True or False

  2. sel_args: a list of arguments used in the sel_func function


Example: assume the data contain a variable (key) called 'name' and you want to select only those data where name contains the substring 'Jo'.

Define this function:

>>> def myfind(data, key, what):
...     where = data[key]
...     return where.find(what) >= 0

now you can select the data using:

>>> df.get_data('name', myfind, ['name', 'Jo'])

This should return a list of names containing the substring 'Jo'.
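
The selection mechanism can be tried out without a data file. The following sketch applies a selector function to a plain list of dictionaries; the stand-alone function get_data and the sample records are hypothetical stand-ins, and it is an assumption that the selector receives the data record followed by the entries of sel_args, as the myfind example suggests.

```python
def get_data(records, key, sel_func=None, sel_args=None):
    """Return the values of key for all records accepted by sel_func."""
    sel_args = sel_args or []
    # ASSUMPTION: the selector is called as sel_func(record, *sel_args)
    return [rec[key] for rec in records
            if sel_func is None or sel_func(rec, *sel_args)]


def myfind(data, key, what):
    # True if the string stored under key contains the substring what
    return data[key].find(what) >= 0


# Hypothetical records, one dictionary per data line
records = [{"name": "John", "age": 31},
           {"name": "Mary", "age": 28},
           {"name": "Joan", "age": 45}]

print(get_data(records, "name", myfind, ["name", "Jo"]))
print(get_data(records, "age"))      # no selector: all values are returned
```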

get_data_eval(key, eval_str)

return all data for the key under the condition that the expression in eval_str is True.

get_data_list(keylist, sel_func=None, sel_args=None)

return all the data corresponding to the key list, given as a string of key names separated by colons:

>>> a = df.get_data_list('key1:key2:key3')

a contains the list of data. A selector function can be supplied as in get_data():

>>> a = df.get_data_list('key1:key2:key3', myfind, ['name','J'])

returns only those data where the name values contain the character J.
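
As an illustration of the colon-separated key list, the sketch below extracts several keys at once from a list of dictionaries. The stand-alone function get_data_list and the sample records are hypothetical, and returning one list of values per data record is an assumption; the layout returned by the library may differ.

```python
def get_data_list(records, keylist):
    """Return, for every record, the values of the keys named in keylist."""
    keys = keylist.split(":")        # 'key1:key2' -> ['key1', 'key2']
    # ASSUMPTION: one inner list per data record, in key-list order
    return [[rec[k] for k in keys] for rec in records]


# Hypothetical records, one dictionary per data line
records = [{"key1": 1, "key2": 2.5, "key3": "a"},
           {"key1": 4, "key2": 5.5, "key3": "b"}]

a = get_data_list(records, "key1:key3")
print(a)
```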

get_data_list_eval(keylist, eval_str)

similar to select_data_eval(), but it returns only the values for the keys defined in keylist.


return all lines up to the header line


return the header lines


return a list of keys


print the filename associated with this instance


Save the current datafile.

With the keyword file = 'new_file' the datafile is written to a file named new_file.

scale(key, factor)

multiply all values of key by a factor

select_data(sel_func=None, sel_args=None)

returns an iterator for the data. As in get_data(), a selector function and its arguments can be supplied:

  1. sel_func: a user provided function returning True or False

  2. sel_args: a list of arguments used in the sel_func function

The iterator returned can be used as follows:

>>> for d in df.select_data():
...     print(d)

or the result can be converted to a list:

>>> list(df.select_data())

Using a selector function:

Example: assume the data contain a variable (key) called 'name' and you want to select only those data where name contains the substring 'sub'.

Define this function:

>>> def myfind(data, key, what):
...     where = data[key]
...     return where.find(what) >= 0

now you can select the data using:

>>> list(df.select_data(myfind, ['name', 'sub']))

select_data_eval(eval_str)

returns an iterator for the data selected with an eval expression stored in the string eval_str. Each dataset item is accessed using the name data:

>>> df.select_data_eval("data['x'] >= 0.")

returns only those data items where the value of the x-column in the file is larger than or equal to 0.
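
How an expression string can act as a selector is sketched below on a plain list of dictionaries. This is an illustration under assumptions, not the library code: the stand-alone generator select_data_eval and the sample records are hypothetical. Since the expression is passed to eval, it should only come from a trusted source.

```python
def select_data_eval(records, eval_str):
    """Yield the records for which the expression in eval_str is true."""
    # compile once, then evaluate with each record bound to the name data
    code = compile(eval_str, "<eval_str>", "eval")
    for rec in records:
        if eval(code, {}, {"data": rec}):
            yield rec


# Hypothetical records, one dictionary per data line
records = [{"x": -1.0, "y": 2.0},
           {"x": 0.0, "y": 3.0},
           {"x": 2.5, "y": 4.0}]

selected = list(select_data_eval(records, "data['x'] >= 0."))
print([rec["x"] for rec in selected])
```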


print all data and keys stored


print all the data corresponding to the key list:

>>> df.show_data('key1:key2:key3')

print a list of variable names in the dictionary

sort(key, **kwargs)

sort the data according to the values in key


update header line of this data file including format. This makes sure that the header line is in sync with the dictionary keys

write_all(fp, complete_header=False)

write all data to the file associated with fp

if complete_header = True include the complete header including all comments


write entire header including comments and internal parameters to file with handle fp. Example:

fp = open('','w')



save the current file as a csv file

f : file name to be used


write only header line of this data file into file fp, including format

write_line(fp, i)

write data line i into file fp

write_selected(fp, index_list, complete_header=False)

write a new datafile with an identical header, but containing only those data whose index is given in index_list. If an index does not exist, a message is printed and the index is skipped.

if complete_header = True include the complete header including all comments
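
The index handling described for write_selected can be sketched on plain lists of strings. The stand-alone function below, its arguments and the sample lines are hypothetical, and the wording of the message is an assumption; only the behavior of skipping missing indices is taken from the description above.

```python
import io


def write_selected(fp, header_line, data_lines, index_list):
    """Write the header and the data lines whose index is in index_list."""
    fp.write(header_line + "\n")
    for i in index_list:
        if not 0 <= i < len(data_lines):
            # index outside the data: report it and continue with the next one
            print("index", i, "does not exist, skipped")
            continue
        fp.write(data_lines[i] + "\n")


# Hypothetical file content
header = "#! x[f,0]/ y[f,1]/"
data = ["1.0 2.0", "3.0 4.0", "5.0 6.0"]

buf = io.StringIO()                 # stands in for an open file
write_selected(buf, header, data, [2, 7, 0])
print(buf.getvalue())
```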