Sample Configuration File
Overview
Below is a sample configuration file. You do not need to specify every parameter, but doing so is highly recommended so that every value used in a search is stated explicitly. For ease of reading, comments and notes are allowed after a # character. Feel free to include your own comments when creating and using your configuration files.
To obtain the latest configuration file, run kojak --config from your command line.
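If you want to inspect a configuration file from a script, the format is simple enough to read by hand: each parameter is separated from its value by an equals sign, and anything after a # is ignored. The following is a minimal sketch in Python, not Kojak's own parser. The function name `parse_kojak_params` is hypothetical, and storing every value as a list is an assumption made here because some parameters (such as mono_link) may appear more than once.

```python
from collections import defaultdict


def parse_kojak_params(text):
    """Parse 'name = value' lines, ignoring anything after a '#'.

    Hypothetical helper, not part of Kojak. Returns a dict mapping each
    parameter name to a list of string values, because parameters such
    as mono_link may be given more than once.
    """
    params = defaultdict(list)
    for raw_line in text.splitlines():
        # Drop the comment (everything from the first '#') and whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        if "=" not in line:
            raise ValueError(f"expected 'name = value': {raw_line!r}")
        name, value = line.split("=", 1)
        params[name.strip()].append(value.strip())
    return dict(params)


# A short excerpt in the same format as the sample file.
sample = """\
# Kojak parameter file (excerpt)
threads = 0 #0=all threads
cross_link = nK nK 138.0680742 DSS
mono_link = nK 156.0786442515 #DSS_H2O_monolink
mono_link = nK 155.0946286667 #DSS_NH2_monolink
"""

params = parse_kojak_params(sample)
print(params["threads"])    # ['0']
print(params["mono_link"])  # ['nK 156.0786442515', 'nK 155.0946286667']
```

Values are kept as strings; converting them to numbers is left to the caller, since the expected type differs from parameter to parameter.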
Example
# Kojak version 2.0.0alpha16 parameter file
# Please see online documentation at:
# http://www.kojak-ms.org/param
# All parameters are separated from their values by an equals sign ('=')
# Anything after a '#' will be ignored for the remainder of the line.
#
# Computational resources
#
threads = 0 #0=all threads, or specify exact number. Use -1 for all but 1 thread, etc.
#
# Data input files: include full path if not in current working directory
#
database = yourDB.fasta
export_percolator = 1
export_pepXML = 1
export_mzID = 1
MS_data_file = yourData.mzML
percolator_version = 3.04
results_path = .
#
# Parameters used to describe the data being input to Kojak
#
enrichment = 0 #Values between 0 and 1 to describe 18O atom percent excess (APE).
#For example, 0.25 equals 25 APE.
instrument = 0 #Values are: 0=Orbitrap, 1=FTICR (such as Thermo LTQ-FT)
MS1_centroid = 1 #0=no, 1=yes
MS2_centroid = 1 #0=no, 1=yes
MS1_resolution = 50000 #Resolution at 400 m/z, value ignored if data are
#already centroided
MS2_resolution = 50000 #Resolution at 400 m/z, value ignored if data are
#already centroided
#
# Cross-link and mono-link masses allowed. May have more than one of each parameter.
#
# Format for cross_link is [amino acids] [amino acids] [mass mod] [identifier]
# Format for mono_link is [amino acids] [mass mod]
# One or more amino acids (uppercase only!!) can be specified for each linkage moiety
# Use lowercase 'n' or 'c' to indicate protein N-terminus or C-terminus
#
cross_link = nK nK 138.0680742 DSS
mono_link = nK 156.0786442515 #DSS_H2O_monolink
mono_link = nK 155.0946286667 #DSS_NH2_monolink
#
# Fixed modifications. Add as many as necessary.
#
fixed_modification = C 57.02146
fixed_modification_protC = 0
fixed_modification_protN = 0
#
# Differential modifications. Add as many as necessary. Uppercase only for amino acids!
# n = peptide N-terminus, c = peptide C-terminus
#
# If more than one modification is possible for an amino acid,
# list all modifications on separate lines
#
modification = M 15.9949
modification_protC = 0
modification_protN = 0
diff_mods_on_xl = 0
max_mods_per_peptide = 2
mono_links_on_xl = 0
#
# Digestion enzyme rules.
#
# See http://www.kojak-ms.org/param/enzyme.html
#
enzyme = [KR]|{P} Trypsin
#
# Scoring algorithm parameters
#
# fragment_bin_offset and fragment_bin_size influence algorithm precision and memory usage.
# They should be set appropriately for the data analyzed.
# For ion trap ms/ms: 1.0005 size, 0.4 offset
# For high res ms/ms: 0.01 size, 0.0 offset
#
fragment_bin_offset = 0.0 #between 0.0 and 1.0
fragment_bin_size = 0.01 #in Thomsons
ion_series_A = 0
ion_series_B = 1
ion_series_C = 0
ion_series_X = 0
ion_series_Y = 1
ion_series_Z = 0 #Z-dot values are used
#
# Additional parameters used in Kojak analysis
#
decoy_filter = REVERSE 0 #First value specifies identifier for all decoys in the database.
#Second value requests Kojak to generate the decoys on-the-fly: 0=off, 1=on
e_value_depth = 5000 #minimum number of values required when computing e-values. Default=5000
isotope_error = 2 #account for errors in precursor peak identification.
#Searches this number of isotope peak offsets.
#Values are 0,1,2, or 3.
max_miscleavages = 2 #number of missed trypsin cleavages allowed
max_peptide_mass = 6000.0 #largest allowed peptide mass in Daltons
min_peptide_mass = 600.0 #lowest allowed peptide mass in Daltons
min_peptide_score = 0.5 #minimum individual peptide score to be considered for second pass analysis.
min_spectrum_peaks = 20 #skip MS/MS spectra with fewer than this number of peaks.
max_spectrum_peaks = 0 #top N peaks to use during analysis. 0 uses all peaks.
precursor_refinement = 1 #0=off, 1=on. See online documentation for details.
ppm_tolerance_pre = 8 #mass tolerance on precursor when searching
prefer_precursor_pred = 2 #prefer precursor mono mass predicted by
#instrument software.
# 0 = ignore previous predictions
# 1 = use only previous predictions
# 2 = supplement predictions with additional analysis
spectrum_processing = 1 #0=no, 1=yes. See online documentation for details.
top_count = 10 #number of top scoring alpha peptides to maintain after first pass.
#not recommended to use below 5 or more than 50.
truncate_prot_names = 0 #Max protein name characters to export, 0=off
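
To illustrate how the enzyme and max_miscleavages settings interact, here is a minimal Python sketch of an in-silico digest. It is not Kojak's implementation. It assumes the conventional reading of the trypsin rule [KR]|{P}, namely cleave after K or R unless the next residue is P, and the function name `digest` and its arguments are hypothetical. Mass filtering (min_peptide_mass, max_peptide_mass) is deliberately left out.

```python
def digest(protein, cut_after="KR", no_cut_before="P", max_miscleavages=2):
    """Return peptides from a simple enzymatic digest (illustrative only).

    Assumes the rule means: cut after any residue in cut_after,
    unless the following residue is in no_cut_before.
    """
    if not protein:
        return []

    # A site value i means a cut between protein[i-1] and protein[i].
    # The protein start and end are included as boundaries.
    sites = [0]
    for i in range(1, len(protein)):
        if protein[i - 1] in cut_after and protein[i] not in no_cut_before:
            sites.append(i)
    sites.append(len(protein))

    peptides = []
    for start in range(len(sites) - 1):
        # Spanning k internal sites corresponds to k missed cleavages.
        last = min(start + 1 + max_miscleavages, len(sites) - 1)
        for end in range(start + 1, last + 1):
            peptides.append(protein[sites[start]:sites[end]])
    return peptides


# K at position 3 is followed by P, so no cut is made there.
print(digest("AAKPGGRCCKDD", max_miscleavages=0))
# ['AAKPGGR', 'CCK', 'DD']
print(digest("AAKPGGRCCKDD", max_miscleavages=1))
# ['AAKPGGR', 'AAKPGGRCCK', 'CCK', 'CCKDD', 'DD']
```

Raising max_miscleavages increases the number of candidate peptides quickly, which is one reason to keep it no higher than the data require.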