Plotting the Data
=================

There are several possibilities to plot data. Most important is to
include error bars for measured data. The plotting used here is based on
`matplotlib`_. In
you analysis you typically plot raw data, or calculated values and error bars
together with the result of a fit. If you analyze an experiment that involves counting of radom events, 
histogramming is frequently used together with its plotting capabilities. Some of the most frequently used
functions included in
`LT\/LT_Fit`_ are described in the following.

Plot Columns of a Data File (:class:`~LT.pdatafile.pdfile`)
------------------------------------------------------------
Very often you would like to just quickly have a look at the
data in your file. For this the function :func:`~LT.plotting.dplot_exp` is very useful.
You can plot the values of the various columns in your datafile versus
each other. As an example I plot the time versus position of the data
in ``mf`` (:ref:`my_exp_1.data <my_exp_1>`) including the errors. Remember time was
called ``time``, position was called ``dist`` and its error was called
``d_err``. You can always get the names of all the variables in you
:class:`~LT.pdatafile.pdfile` object by doing :

.. sourcecode:: ipython
 
    In [21]: mf.show_keys() 

In  order or get a quick look at the data in data file you can use 
the following plotting command:

.. sourcecode:: ipython

    In [22]: B.dplot_exp(mf, 'time', 'dist','d_err')


You can even combine the loading of the file and the plotting into a single line (this is useful if you would like
to have a quick look at data while you are taking them:

.. sourcecode:: ipython

    In [22]: B.dplot_exp(B.get_file('me_exp_1.data'), 'time', 'dist','d_err')


In this case we replace the ``mf`` by the call ``B.get_file('me_exp_1.data')``.
    
If you do not have errors or do not want to plot them just leave them
off. If you cannot see a plot enter the command:

.. sourcecode:: ipython

    In [22]: show() # or
    In [22]: B.pl.show()

This assures that the plot is shown and updated. It is especially
important to issue this command in a script (see :ref:`script`)

*Example:* clear a previous figure and plot some values in a data file with a one-line command:

.. sourcecode:: ipython

    In [22]: clf(); B.dplot_exp( B.get_file('my_exp_1.data'), 'time', 'dist','d_err'); show()

This is a simple way to take a quick look at data stored in a datafile.  


Plotting Arrays of Data
-----------------------

Since you often have the data as a :func:`numpy.array` you need 
be able to plot them. For experimental data with error bars you should use the function :func:`~LT.plotting.plot_exp`.
This will be your most frequently used function to plot data.  
As an example we use the arrays from the previous example namely ``t``, ``dexp`` and
``derr``. To plot those you do:

.. sourcecode:: ipython

    In [25]: B.pl.clf()                              # clear the figure
    In [26]: B.plot_exp(t, dexp, derr); B.pl.show()  # show the new plot wit error bars

If you have additional errors in the independent variable (``x``) you can add them with
the statement ``xerr = sigma_x`` where ``sigma_x`` stands for the array
containing your ``x`` errors.  

To make a plot where you join the data points with a line you use the function :func:`~LT.plotting.plot_line`.
This function is generally used when you plot a function representing a theoretical result or a fit result.
A simple example is show below:

.. sourcecode:: ipython

  In [26]: B.plot_line(t, dexp); B.pl.show()  # join the data points with a line

Normally one does not join experimental data with lines, unless you really want to
highlight a trend and/or has very many data points.
  

Note that this time the title and the labels are None. You can add the
labels as shown later.  The plot looks the same as before. You
can leave off the errors if you leave off ``derr`` in the arguments to
``B.plot_exp``. There are many so-called keyword arguments to further
control the appearance. You can get more information on
:func:`~LT.plotting.plot_exp` or :func:`~LT.plotting.plot_line` and
even more on :func:`matplotlib.pyplot.plot` since this is the function
that the :mod:`LT.plotting` functions are based upon. You should also
familiarize yourself with how keywords are used.


Using a log-scale
+++++++++++++++++

If you need a log scale enter:

.. sourcecode:: ipython

    In [22]: B.pl.yscale('log'); show() # show() on the same line


This is also an example where two commands are given on one line. One
can enter several commands on one line if they are separated by a
``;``.

To switch back to a linear scale enter:

.. sourcecode:: ipython

    In [23]: B.pl.yscale('linear'); show()


Labeling the Plot
+++++++++++++++++
    
Every plot needs to be properly labeled in order that the viewer knows what is shown. At the least the x-axis and y-axis
need labels but it is often good practice to add a title to the plot. You might also want to change the range of the x-
or y-axis in order to show the viewer the important part of the plot with greater detail. This can be achieved by changing the axis limits. The commands for these tasks are shown below (*make a not of these as you will use them often*):

==============  =================================
to change       do this
==============  =================================
x-axis label    B.pl.xlabel( ’ a new x axis label’ )
y-axis label    B.pl.ylabel( ’ a new y axis label’ ) 
plot title      B.pl.title(’A new Plot title’)
x-axis limits   B.pl.xlim ( (xmin, xmax ) )
y-axis limits   B.pl.ylim( (ymin, ymax) )
==============  =================================

Note that the two parenthesis are needed since the limits are entered as
so-called tuples. These are basically as sequence of objects (numbers,
strings etc.) which cannot be changed. If you enter the command
``B.pl.xlim()`` you will get back the current limits. If you want to know
more about tuples have a look at the Python documentation.

Finally, below is the final plot including all the commands necessary:

.. plot:: pyplots/plot_1.py
   :include-source: False	  
   :alt: Final plot of D as a function of t with errorbars


Save the plot
+++++++++++++

You can save the plot as a pdf file by giving the command:

.. sourcecode:: ipython

    In [24]: B.pl.savefig('my_plod.pdf')

Note that all the commands that adjust the plot start with ``B.pl``. The
reason is that the :mod:`LT.box` module itself imports the matplotlib
plotting module and gives it the name ``B.pl``. Frequently (especially if you are using Spyder)
:mod:`matplotlib.pyplot` and :mod:`numpy` have already been
imported. However when you run a script (:ref:`script`) you need to import them
explicitly and it is therefore a better practice to use ``B.pl`` in
front of the commands (assuming you did ``import LT.box as B``) to get
used to it.


1D-Histograms
-------------

Histograms are typically used in counting experiments, especially in
nuclear physics experiments. One experiment you do very early on is to
count the number of decays from a radioactive source during a certain
amount of time. This measurement is then repeated many times. I have
simulated the result of such and experiment and the data are stored in
the file ’counts.data’. To work with these data I first load them in
the same way as I did before:

.. sourcecode:: ipython

    In [27]: mcf = B.get_file('counts.data') # get the file

To see what columns have been defined you do (as before)

.. sourcecode:: ipython

    In [28]: mcf.show_keys()

And the output should be something like

.. sourcecode:: ipython

    In [28]: mcf.show_keys()
    ['n', 'counts', 'indx']

Here ``n`` is the number of the measurement (the first one would be 0) and
``counts`` is the number of counts you obtained in this experiment. The
additional variable ``indx`` is used internally and is not interesting to
you (**DO NOT TOUCH IT!**). To get an idea what the data look like, I
could just plot ``counts`` as a function of ``n``. Since I want to plot the
data that are part of a :class:`~LT.pdatafile.pdfile` object I will used the :func:`~LT.plotting.dplot_exp`
function.

.. sourcecode:: ipython

    In [29]: B.dplot_exp(mcf, 'n', 'counts')

You will notice that the data scatter around a value of 150. To do
analyze them further we need to convert them to numpy arrays:

.. sourcecode:: ipython

    In [30]: ne = mcf['n']
    In [31]: counts = mcf['counts']

Now let’s calculate the average. To do this you need a ``for`` loop

.. sourcecode:: ipython

    In [32]: N = 0              # set the counter to 0
    In [33]: sum = 0.           # set the sum to 0
    In [34]: for cc in counts:  # loop over all counts
       ....:    N += 1          # increment the counter
       ....:    sum += cc       # add the current value to the total sum

Close the loop with two returns. Now you you can get the average by
doing

.. sourcecode:: ipython

    In [35]: N_av = sum/N       # evaluate the average

This was a little programming. First I created a variable called ``N`` to
count the number of terms in the sum and, in the first statement (``N =
0``), I set it to 0. Then I did the same for another variable, called
``sum``. ``sum`` should contain the running sum of the values which is
carried out in a ``for`` loop (``for cc in counts:``). In the loop, ``cc``
contains the current value of counts. There are two strange looking
lines. The first, ``N += 1`` means: *add 1 to the current value of N* or
increment ``N`` by 1. The second does the same for the variable ``sum``
but instead of adding 1 we add the current value of the counts stored in
``cc``. As a consequence at the end of the loop ``N`` contains the number of
measurements and ``sum`` the total ``sum`` of the values. To calculate the
average all I have to do is take the ratio of the two numbers and store
them in the variable ``N_av``. This was done in the last line (``N_av =
sum/N``). To look at its value:

.. sourcecode:: ipython

    In [35]: N_av
    Out[35]: 150.41499999999999

This was a small programming example. Since the average is such an
important quantity this procedure has been built into numpy.

.. sourcecode:: ipython

    In [36]: np.average( counts )

Will give you the same result. There are many useful function defined
for :func:`numpy.array` 's. Here are just a few useful ones using ``counts`` as an example:

=============   ============================================ 
Function     	Returns
=============   ============================================ 
counts.sum()    the sum of all elements
counts.mean()   the mean (average) of all elements
counts.var()    the variance of all elements
counts.min()    the smallest value of the array
counts.max()    the larges value of the array
=============   ============================================

Try them out ! 

When you create a histogram you want to count how often a certain
value occurs within certain limits. To do this there is a special
histogram object in :mod:`LT.box` called :class:`~LT.box.histo`. The simplest way to use it is as
follows:

.. _histo:

.. sourcecode:: ipython

    In [37]: hh = B.histo(counts)

It looks as if nothing has happened. But if all the work as already been
completed. You can now make a plot of the histogram by doing

.. sourcecode:: ipython

    In [38]: hh.plot()

And you will get a (blue) graph that represents the frequency with
which certain counts values occur. There are other useful plotting commands, such as
``hh.plot_exp()`` which will plot error bars associated with the bin contents.
Check the histogram documentation for more details.

.. plot:: pyplots/histo_plot_0.py
   :alt: Plotted histogram, vertical filled bars. x-axis label "x-bin", y-axis label "content" and title "my histogram" 
	  
What you see is a relatively coarse representation of this
distribution since in its simples form :class:`~LT.box.histo`
automatically determines the range of values that you provided. This
range is divided into 10 equally spaced regions, called ``bins`` centered
around a central value (``bin_center``) with a width called the ``bin_width``. It
then goes through all the data in the array ``counts`` and checks in
which region (``bin``) each value falls. To each ``bin`` belongs a
counter that is incremented each time a values falls within the
associated region. The value of each counter is called the ``bin
content``. The graph now shows the value of the ``bin content`` as a
function of the ``bin-center`` value.  The width of each step
corresponds to the width of a bin. You have much more control on how
your histogram is setup. Lets re-define it and this time we give it a
range and also the number of bins. From the first graph we see that
most ``bin-content`` values of counts lie between 120 and 180. If we
want to get a count of how often a specific value occurs we need to
make the bin width equal to one as follows

.. sourcecode:: ipython

    In [39]: hh = B.histo(counts, range = (120., 180.), bins = 60)

This is another example of using keywords to control what an
object/function is doing. the ``range = (120., 180.)`` determines the
range of values accepted, where ``range`` is the *keyword*. The other
statement ``bins = 60``, sets the number of bins to 60. In that way the
width of a bin is 1. Now plot the new histogram again.

.. sourcecode:: ipython

    In [40]: clf(); show()
    In [41]: hh.plot(); show()

(Remember the first line ``clf()`` clears the figure). Now you have a much
better representation of the histogram:

.. plot:: pyplots/histo_plot_1.py
   :alt: Plot of histogram with bin width 1, x-axis label "x-bin", y-axis label "content", title "my histogram"
  

The central values (``bin-center``), the ``bin-content`` (number of occurrences) and the
associated errors (``bin-error``) of each bin are can be accesses as follows:

.. sourcecode:: ipython

    In [42]: hh.bin_center (return)

Produces the the output:

.. sourcecode:: ipython

    In [43]: hh.bin_center

    Out[43]: 
    array([ 120.5,  121.5,  122.5,  123.5,  124.5,  125.5,  126.5,  127.5,
            128.5,  129.5,  130.5,  131.5,  132.5,  133.5,  134.5,  135.5,
            136.5,  137.5,  138.5,  139.5,  140.5,  141.5,  142.5,  143.5,
            144.5,  145.5,  146.5,  147.5,  148.5,  149.5,  150.5,  151.5,
            152.5,  153.5,  154.5,  155.5,  156.5,  157.5,  158.5,  159.5,
            160.5,  161.5,  162.5,  163.5,  164.5,  165.5,  166.5,  167.5,
            168.5,  169.5,  170.5,  171.5,  172.5,  173.5,  174.5,  175.5,
            176.5,  177.5,  178.5,  179.5])


Below is a summary of the most important information stored in
:class:`~LT.box.histo`. I use ``hh`` as the name for the histogram here by you can use
any other variable name when you create it:

==============  ===================================================================================== 
variable             meaning
==============  ===================================================================================== 
bin_center      Central value of each bin.
bin_content     Content of each bin (the value of the counter)
bin_error       Error of the content of each bin
bin_width       Width of each bin
title           Histogram title. You can store the short description of the data (used in plotting).
xlabel          Description of the bin_center values (used in plotting).
ylabel          Description of the bin_content (used in plotting).
==============  ===================================================================================== 

And here the most important (member) functions:

========== ====================================================================== 
 Function  Arguments and Results
========== ====================================================================== 
 histo     Create a histogram. There are several ways to create one depending
           on the arguments in histo here are the main possibilities:

           -  ``h = histo( data_array, bins = 20, range = (10., 30.) )``,
              ``data_array`` is a numpy array containing the data to be
              histogrammed, ``bins`` the number of bins, in this case 20 and
              ``range`` the range of values to be sorted into ``bins`` bins , in
              this example between 10 and 30.

           -  ``h = histo(file = ’histo.data’)``, read the histogram data from
              file ``’histo.data’`` and create a new histogram. The
              ``’histo.data’`` file should have been created previously by a
              ``histo.save(’histo.data’)`` command (see below)

plot       ``(ymin = 0., filled = True, color = ’b’)``

           Plot the histogram. There are several keywords that can be set to
           control the plot. You can assign a minimal value ``ymin`` for the
           smallest content to be drawn (default is 0) . ``filled = False`` the
           area below the curve is not filled. ``color = ’b’`` draw the line in
           blue. Other colors are: red(``r``), green(``g``) , 
           magenta(``m``), yellow(``y``) or
           any other `matplotlib`_ color.

sum        ``(xmin=10., xmax=20)``

           Add the content of the histogram for the values ranging between (and
           including) ``xmin = 10`` and ``xmax = 20``. Returns a tuple with the
           value and the error. If no arguments are given it returns the total
           of all bins.
 
save       ``(filename = ’histo.data’)``

           Save the histogram to the file ``histo.data``.

========== ====================================================================== 

For more information see :class:`LT.box.histo`.


2D-Histograms
-------------

If one is interested in analyzing correlations of data 2-dimensional histograms are frequently used. In this
case one counts the number of occurrences of pairs of data. The 2D-histogram
is setup as follows:


.. _histo2d:

.. sourcecode:: ipython

    In [37]: hh2 = B.histo2d(x_vals, y_vals, bins = [15,25]) # create a 2D-histogram for the values x_vals, y_vals
                                                             # use 15 bins in the x-direction
							     # use 25 bins in the y-direction

Once the histogram is created you can plot it with the command:

.. sourcecode:: ipython

    In [38]: hh2.plot()  # plot the 2d histogram

For more detailed information on 2D-histograms see :class:`LT.box.histo2d`.							     


.. include:: include/links.rst