Biochemical fingerprint of Colorectal Cancer cell lines using label-free live single-cell Raman spectroscopy

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% MATLAB FILE FORMAT %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The data for each experiment is stored in .mat files, which can be imported directly into MATLAB or Octave. The name of each file contains the cell line and one of the following:

- "datasetraw": Contains a cell array called "RamanSpectra" with the raw spectra before any alignment and a cell array called "RananBackgroundSpectra" with the background spectra taken from the sample. A variable called "CalibrationPeakTemp" holds the silicon peak position measured that day, which is later used to correct the spectra.

- "datasetanalysed": Contains a variable called "Average" with the average spectrum of the experiment, a variable called "DataMatrix", whose columns are the corrected spectra for all the cells of the experiment, and a variable called "Error", whose values are the standard deviation of the spectra.

- "Xvaluesfinal": Contains a variable called "XvaluesTrunc2" with the final wavenumbers shared by the spectra stored as columns in DataMatrix.
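
For readers working outside MATLAB or Octave, the sketch below shows one possible way to read a "datasetanalysed" file in Python and to recompute a per-wavenumber average and standard deviation from DataMatrix. Several things here are assumptions rather than part of the dataset description: the helper names load_analysed and summarise_spectra are hypothetical, the files are assumed to be in a .mat version that scipy.io.loadmat can read (it does not read v7.3 files), and the sample standard deviation (N-1 normalisation) is assumed, which may or may not match how "Error" was computed.

```python
from statistics import mean, stdev


def summarise_spectra(data_matrix):
    """Return (average, error) for a matrix given as a list of rows.

    Each row corresponds to one wavenumber and each column to one cell,
    following the description of DataMatrix above. The sample standard
    deviation (N-1) is an assumption, not confirmed by the dataset.
    """
    average = [mean(row) for row in data_matrix]
    error = [stdev(row) for row in data_matrix]
    return average, error


def load_analysed(path):
    """Load DataMatrix, Average and Error from a "datasetanalysed" file.

    Requires the third-party SciPy package. The import is kept local so
    that summarise_spectra can be used without SciPy installed.
    """
    from scipy.io import loadmat

    mat = loadmat(path)  # dict mapping variable names to NumPy arrays
    return mat["DataMatrix"], mat["Average"], mat["Error"]
```

If SciPy is available, the output of load_analysed can be passed on as summarise_spectra(data_matrix.tolist()) and the result compared with the stored "Average" and "Error" variables.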

All multivariate methods were applied to the analysed data.

If in doubt, contact Julia Gala de Pablo, py12jg@leeds.ac.uk

%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% CSV FILE FORMAT %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%

The data for each experiment is available as raw and preprocessed csv files:

- Raw csv files contain a first header line with the experiment name and the silicon calibration peak position (used to correct the data). The second header line gives the name and units of each column. The data consists of a Wavenumber column, then multiple columns with the raw Raman spectra, then the wavenumber column of the background, and then multiple columns with the raw Raman spectra of the backgrounds.

- Preprocessed csv files contain a first header line with the experiment name and a second header line giving the name and units of each column. The first column contains the Wavenumber data, followed by the preprocessed data for each cell of the dataset. The last two columns are the average and standard deviation of that dataset.
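
The preprocessed layout described above can be read with a few lines of Python. The following is a minimal sketch, not a reference parser: the function name read_preprocessed_csv is hypothetical, and it assumes comma-separated values, exactly two header lines, and purely numeric data rows. The sample text at the end contains made-up values and column names that only illustrate the layout.

```python
import csv
import io


def read_preprocessed_csv(handle):
    """Split a preprocessed csv file into its documented parts.

    Assumes a comma delimiter, two header lines, and numeric data rows
    laid out as: wavenumber, one column per cell, average, std. dev.
    """
    reader = csv.reader(handle)
    experiment = next(reader)[0].strip()               # first header line
    columns = [name.strip() for name in next(reader)]  # second header line
    # Skip blank lines; every remaining row is converted to floats.
    rows = [[float(v) for v in row] for row in reader if row]
    return {
        "experiment": experiment,
        "columns": columns,
        "wavenumbers": [r[0] for r in rows],
        "spectra": [r[1:-2] for r in rows],  # one value per cell
        "average": [r[-2] for r in rows],
        "std": [r[-1] for r in rows],
    }


# Made-up sample that only mimics the documented layout.
sample = (
    "Example experiment\n"
    "Wavenumber (cm-1),Cell 1,Cell 2,Average,Std\n"
    "600.0,1.0,3.0,2.0,1.5\n"
    "601.0,2.0,4.0,3.0,1.5\n"
)
parsed = read_preprocessed_csv(io.StringIO(sample))
```

For a real file, the same function can be called on an open file object, for example with open(path, newline="") as handle. A raw csv file would need a different split, since it carries a second wavenumber column followed by the background spectra.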