datafile
Reads a file containing columns of data and creates a dictionary according to the header information, using the following format:
comments start with # in the first column: # My Comment
header information starts with #!, where # is also in the first column. It must precede the data.
each column is represented by name[dtype, col.nr.]/, where dtype (optional) is the data type and col.nr. (starting at 0) is the column number
data are separated by white space, NOT commas!
data types:
s : string
f : float
i : integer
blank lines are ignored
NOTE: if data formats are entered they should be specified for all columns
Example data:
#! p_miss[0]/ siglt[2]/ s01[3]/ alt[4]/
200. 1.35e-4 -1.e-3 0.1
220. 2.56e-4 -2.e-4 -0.1
230. 3.47e-6 -3.e-5 1.1
The header can also contain data type information:
#! p_miss[f,0]/ siglt[f,2]/ s01[f,3]/ alt[f,4]/
200. 1.35e-4 -1.e-3 0.1
220. 2.56e-4 -2.e-4 -0.1
230. 3.47e-6 -3.e-5 1.1
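The format rules above can be sketched in a few lines of plain Python. This is only an illustration of the file format, not the code used by LT.datafile: the names FIELD, CONVERT, parse_header and parse_text are hypothetical, the sample columns are invented, and treating a missing dtype as float is an assumption.

```python
import re

# Hypothetical helpers that illustrate the header format; not LT.datafile code.
FIELD = re.compile(r"(\w+)\[(?:([sfi])\s*,\s*)?(\d+)\]/")
CONVERT = {"s": str, "f": float, "i": int}


def parse_header(line):
    """Return a list of (name, dtype, column) tuples from a '#!' header line."""
    # The dtype is optional; float is assumed here when it is missing.
    return [(name, dtype or "f", int(col))
            for name, dtype, col in FIELD.findall(line)]


def parse_text(text):
    """Turn the text of a data file into a list of dictionaries, one per line."""
    fields, rows = [], []
    for line in text.splitlines():
        if line.startswith("#!"):
            fields = parse_header(line)
        elif line.startswith("#") or not line.strip():
            continue  # comment or blank line
        else:
            cols = line.split()  # white space separated, not commas
            rows.append({name: CONVERT[dtype](cols[col])
                         for name, dtype, col in fields})
    return rows


sample = """# My Comment
#! p_miss[f,0]/ siglt[f,1]/ name[s,2]/

200. 1.35e-4 run_a
220. 2.56e-4 run_b
"""
rows = parse_text(sample)
```

Each entry of rows is a dictionary keyed by the variable names from the header, which mirrors how a line is accessed in the loop examples of this page.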
Example that opens a file and creates a dfile object:
>>> f0 = dfile('sig_LT.dat')
Loop over content:
>>> for l in f0:
...     pm = l['p_miss']
...     sig_lt = l['siglt']*10000.
...     print(pm, sig_lt, l['alt'])
>>> # end for l
Variables can also be accessed directly as numpy arrays (if installed):
>>> pm = f0['p_miss']
>>> sig_lt = f0['siglt']*10000.
They can also be converted to attributes:
>>> f0.make_attr()
>>> print(f0.p_miss)
>>> print(f0.siglt)
Variable names are also called keys.
Save the data file as a csv file. This is useful for exporting and formatting for documents (other than LaTeX):
>>> f0.write_csv(filename)
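To show what such an export amounts to, here is a minimal sketch that uses the standard csv module. The function rows_to_csv and the sample values are hypothetical; the exact layout produced by write_csv is not documented here and may differ.

```python
import csv
import io


def rows_to_csv(keys, rows, fp):
    """Write a header row of keys, then one CSV row per data dictionary."""
    writer = csv.writer(fp, lineterminator="\n")
    writer.writerow(keys)
    for row in rows:
        writer.writerow([row[k] for k in keys])


# Illustrative data, written to an in-memory buffer instead of a real file.
rows = [{"p_miss": 200.0, "alt": 0.1}, {"p_miss": 220.0, "alt": -0.1}]
buf = io.StringIO()
rows_to_csv(["p_miss", "alt"], rows, buf)
csv_text = buf.getvalue()
```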
-
class LT.datafile.dfile(filename, debug=False, new=False, skip=True, use_numpy=True)
Open a file, read and interpret the contents and return a dfile object:
>>> df = dfile('my_datafile')
-
add_data(self, keys, line)
add data to the data file:
d.add_data('x:y:z', 'xval yval zval')
xval, yval, zval are the values as they would be entered in a data file line
both arguments are strings!
IMPORTANT: add the values to all current lines of the data file!
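The two string arguments pair up position by position. The sketch below illustrates that pairing only; split_record is a hypothetical helper, and how LT.datafile converts and stores the values internally is not shown here.

```python
def split_record(keys, line):
    """Pair colon-separated keys with white-space-separated values."""
    names = keys.split(":")
    values = line.split()
    if len(names) != len(values):
        raise ValueError("number of keys and values differ")
    # The values stay strings here; converting them would depend on each key's format.
    return dict(zip(names, values))


record = split_record("x:y:z", "1.0 2.5 -3")
```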
-
add_header_comment(self, text)
add a comment line to the header. The # at the beginning is added automatically!
for a normal comment, start with a space
to add a parameter start the comment with a
-
add_key(self, key, format='f')
add a new key. This is useful if you want to add new data and assign the new values using a loop. First add the key:
>>> df.add_key('newkey', format='i')
The new data then need to be stored as follows:
>>> for d in df.data:
...     d['newkey'] = some_new_value
where df is the datafile and 'newkey' the new key. The new data set can be saved using
>>> fp = open('new_file', 'w')
>>> df.write_all(fp)
-
add_parameter(self, name, value)
add a parameter to the comment section
-
check_data(self, func, data, key, *args)
function used to check if data fulfill a condition provided by the user. The function is assumed to return True or False.
-
delete_key(self, key)
remove a key and all values associated with it
-
eval_data(self, eval_str)
iterator over all data evaluating an expression contained in eval_str
-
get_all_data(self)
return a list of all data stored
-
get_data(self, key, sel_func=None, sel_args=None)
return all data for key subject to the results of a selector function.
One can define a selector function (sel_func) using the arguments stored in sel_args. This function is evaluated for each data record and only those data are returned for which the condition is fulfilled.
sel_func: a user provided function returning True or False
sel_args: a list of arguments used in the sel_func function
Example:
assume the data contain a variable (key) called 'name' and you want to select only those data where name contains a certain substring 'Jo'.
Define this function:
>>> def myfind(data, key, what):
...     where = data[key]
...     return (str.find(where, what) >= 0)
now you can select the data using:
>>> df.get_data('name', myfind, ['name', 'Jo'])
This should return a list of names containing the substring 'Jo'
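The selector mechanism can be imitated on a plain list of dictionaries. The sketch below is a stand-alone illustration, not the dfile method itself: it assumes the selector is called as sel_func(record, *sel_args), which matches the myfind signature used in this entry, and the sample records are invented.

```python
def get_data(rows, key, sel_func=None, sel_args=None):
    """Return row[key] for every row accepted by sel_func(row, *sel_args)."""
    args = sel_args or []
    return [row[key] for row in rows
            if sel_func is None or sel_func(row, *args)]


def myfind(data, key, what):
    """Selector: True if the string stored under key contains what."""
    return what in data[key]


# Illustrative records.
rows = [
    {"name": "John", "age": 31},
    {"name": "Mary", "age": 28},
    {"name": "Joan", "age": 45},
]
selected = get_data(rows, "name", myfind, ["name", "Jo"])
all_ages = get_data(rows, "age")
```

Without a selector every record passes, so all_ages holds the complete column.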
-
get_data_eval(self, key, eval_str)
return all data for the key under the condition that the expression in eval_str is True.
-
get_data_list(self, keylist, sel_func=None, sel_args=None)
return all the data corresponding to the key list as follows:
>>> a = df.get_data_list('key1:key2:key3')
>>> a = df.get_data_list('key1:key2:key3', myfind, ['name', 'J'])
The second call returns only those data where the name values contain the character J.
a contains the list of data
-
get_data_list_eval(self, keylist, eval_str)
similar to select_data_eval, but it only returns the values for the keys defined in keylist.
-
get_full_header(self)
return all lines up to the header line
-
get_header(self)
return the header lines
-
get_keys(self)
return a list of keys
-
name(self)
print the filename associated with this instance
-
save(self, file=None)
Save the current datafile.
With the keyword file = 'new_file' the datafile will be written to the new file name.
-
scale(self, key, factor)
multiply all values of key by a factor
-
select_data(self, sel_func=None, sel_args=None)
returns an iterator for the data. As in get_data(), a selector function and its arguments can be supplied to apply conditions:
sel_func: a user provided function returning True or False
sel_args: a list of arguments used in the sel_func function
The iterator returned can be used as follows:
>>> for d in df.select_data():
...     print(d)
or the result can be converted to a list:
>>> list(df.select_data())
Using a selector function:
Example: assume the data contain a variable (key) called 'name' and you want to select only those data where name contains a certain substring 'sub'.
Define this function:
>>> def myfind(data, key, what):
...     where = data[key]
...     return (str.find(where, what) >= 0)
now you can select the data using:
>>> list(df.select_data(myfind, ['name', 'sub']))
-
select_data_eval(self, eval_str)
returns an iterator for data selected with an eval expression stored in the string eval_str. Each dataset item is accessed using the name data:
>>> df.select_data_eval("data['x'] >= 0.")
returns only those data items where the value of the x-column in the file is larger than or equal to 0.
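The idea of selecting with an expression string can be sketched with Python's built-in eval. The generator below works on a plain list of dictionaries and is an assumption about the general approach, not the dfile code; the sample records are invented. As with any use of eval, the expression should come from a trusted source.

```python
def select_data_eval(rows, eval_str):
    """Yield the rows for which eval_str is true; each row is visible as 'data'."""
    code = compile(eval_str, "<eval_str>", "eval")
    for data in rows:
        # Builtins are emptied so the expression only sees the current record.
        if eval(code, {"__builtins__": {}}, {"data": data}):
            yield data


# Illustrative records.
rows = [{"x": -1.0, "y": 2.0}, {"x": 0.0, "y": 3.0}, {"x": 4.5, "y": 1.0}]
kept = list(select_data_eval(rows, "data['x'] >= 0."))
```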
-
show_all_data(self)
print all data and keys stored
-
show_data(self, keylist)
print all the data corresponding to the key list:
>>> df.show_data('key1:key2:key3')
-
show_keys(self)
print a list of variable names in the dictionary
-
sort(self, key, **kwargs)
sort the data according to the values in key
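Sorting records by one key can be sketched as follows. The helper sort_rows is hypothetical, and forwarding the extra keyword arguments to list.sort (for example reverse=True) is an assumption about what **kwargs is for.

```python
from operator import itemgetter


def sort_rows(rows, key, **kwargs):
    """Sort a list of row dictionaries in place by the values stored under key."""
    rows.sort(key=itemgetter(key), **kwargs)


# Illustrative records.
rows = [{"p_miss": 230.0}, {"p_miss": 200.0}, {"p_miss": 220.0}]
sort_rows(rows, "p_miss")
ascending = [r["p_miss"] for r in rows]
sort_rows(rows, "p_miss", reverse=True)
descending = [r["p_miss"] for r in rows]
```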
-
update_header(self)
update header line of this data file including format
-
write_all(self, fp, complete_header=False)
write all data to a file associated to fp
if complete_header = True include the complete header including all comments
-
write_complete_header(self, fp)
write entire header including comments and internal parameters
-
write_csv(self, f)
save the current file as a csv file
f : file name to be used
-
write_header(self, fp)
write only header line of this data file into file fp, including format
-
write_line(self, fp, i)
write line i into file fp
-
write_selected(self, fp, index_list, complete_header=False)
write a new datafile with an identical header but enter only those data with an index given in index_list. If the index does not exist print a message and skip it.
if complete_header = True include the complete header including all comments
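The behaviour described for missing indices can be sketched on plain strings. This stand-alone function is hypothetical: it takes the header and data lines as arguments instead of reading them from a dfile object, and the message text is invented.

```python
import io


def write_selected(fp, header, lines, index_list):
    """Write the header, then only the data lines whose index is in index_list."""
    fp.write(header + "\n")
    for i in index_list:
        if not 0 <= i < len(lines):
            print("index %d does not exist, skipped" % i)
            continue
        fp.write(lines[i] + "\n")


# Illustrative header and data lines, written to an in-memory buffer.
header = "#! p_miss[f,0]/ alt[f,1]/"
lines = ["200. 0.1", "220. -0.1", "230. 1.1"]
out = io.StringIO()
write_selected(out, header, lines, [0, 2, 7])
written = out.getvalue()
```

Index 7 is outside the three data lines, so it only triggers the message and nothing is written for it.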