Running Kojak

Starting Kojak

Kojak is run from the command line and requires only a single argument: the name of the configuration file. Details on setting up your configuration file and on the parameters it contains can be found by following their respective links. The recommended way to operate Kojak is to store the data and configuration file in a convenient directory, navigate to that directory, then execute the Kojak application with the configuration file:

$ cd projects/data_set1
$ /usr/bin/kojak data_set1.conf

In the above example, substitute your own data directory, configuration file name, and path to Kojak as appropriate for your installation.
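The navigate-then-run pattern above can be wrapped in a small shell function when several data sets need to be processed the same way. This is only a sketch: the function name run_kojak and the KOJAK_BIN variable are hypothetical helpers, not part of Kojak, and the default path simply mirrors the example above.

```shell
#!/bin/sh
# Hypothetical wrapper around the "cd, then run Kojak" pattern.
# KOJAK_BIN is an assumed variable; point it at your own installation.
KOJAK_BIN="${KOJAK_BIN:-/usr/bin/kojak}"

run_kojak() {
    data_dir="$1"     # directory holding the data and configuration file
    conf_file="$2"    # configuration file name, relative to data_dir

    # Fail early if the configuration file is missing.
    if [ ! -f "$data_dir/$conf_file" ]; then
        echo "configuration file not found: $data_dir/$conf_file" >&2
        return 1
    fi

    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$data_dir" && "$KOJAK_BIN" "$conf_file" )
}

# Example: run_kojak projects/data_set1 data_set1.conf
```

Running inside a subshell keeps the current working directory of the calling shell untouched, which is convenient when looping over several data directories.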

Helpful Hint

Microsoft Windows users may have difficulty finding their command line prompt. It can be quickly accessed from the Start Menu by typing "cmd" in the menu search box. Once open, it operates similarly to a Unix/Linux prompt (or perhaps more accurately, a DOS prompt).

During Execution

The first statement you should see printed to the screen is the version, date, and source information for Kojak:

Kojak version 1.3.5, April 15 2015
Copyright Michael Hoopmann, Institute for Systems Biology

This information is important as Kojak is in continuous development and different versions may produce different results. It is recommended that you use the latest version of Kojak, but it may be helpful to also maintain legacy versions so that results may be replicated at a later date. Additionally, the version number is printed inside the Kojak results for your reference.
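If you keep the console output of each run, the version can be recovered later from the saved log. The sketch below assumes that Kojak writes its banner to standard output and that the banner line has the form shown above; the function name and the log file name are hypothetical.

```shell
# Extract the version number from saved Kojak console output.
# Assumes a banner line of the form: "Kojak version 1.3.5, April 15 2015".
kojak_version_from_log() {
    # $1 = path to a log file containing the console output
    sed -n 's/^Kojak version \([^,]*\),.*/\1/p' "$1" | head -n 1
}

# Example usage (kojak_run.log is a hypothetical file name):
#   /usr/bin/kojak data_set1.conf | tee kojak_run.log
#   kojak_version_from_log kojak_run.log
```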

Analysis progress is reported during execution. The following stages are reported to the user:

Database parsing - number of proteins and peptides to be searched.
Spectra reading - number of spectra to be analyzed.
Preprocessing - precursor mapping, etc.
Spectral analysis

Spectral analysis is the core stage of execution, where cross-linked peptide spectrum matches (PSMs) and other types of PSMs are identified. The scoring algorithm in use is reported to the user, along with a start time for logging algorithm performance. Three substages are performed during analysis:

Non-linked peptide analysis - similar to database searching using standard algorithms.
Linked peptide analysis - part one of the two-pass approach described in the Kojak manuscript. This substage requires the most analysis time.
Final cross-linked analysis - part two of the two-pass approach. Relatively fast compared to part one.

The time at the completion of analysis is reported last, and the command prompt is then returned to the user.

Post Execution

After the algorithm completes, several files are written to disk. Unless otherwise specified in the configuration file, exported files are placed in the current working directory, that is, the directory you navigated to prior to starting Kojak.

All exported results are in text format. The basic set of Kojak results is provided in the file set by the output_file parameter. An additional set of files containing the same results, but formatted for use with Percolator, is exported with the base name set by the percolator_file parameter.
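To find out where a run will write its results, the two parameters can be read back from the configuration file. The following is a minimal sketch that assumes each parameter sits on its own line in "name = value" form and that "#" starts a comment; the function name conf_value is hypothetical, so check the assumed layout against your own configuration file.

```shell
# Print the value of a parameter from a configuration file.
# Assumes lines of the form "name = value" with optional "#" comments.
conf_value() {
    # $1 = parameter name, $2 = configuration file
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" \
        | sed 's/[[:space:]]*#.*$//; s/[[:space:]]*$//' \
        | head -n 1
}

# Example usage:
#   conf_value output_file data_set1.conf
#   conf_value percolator_file data_set1.conf
```

Lines that begin with "#" are skipped because the pattern only matches when the parameter name is the first non-blank text on the line.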

Additional details describing these file formats and using them in downstream analysis can be found on the blank and Validating PSMs pages.