1. SVM_ROC.py

1.1. Description

Plot Receiver operating characteristic (ROC) curves using K-fold cross-validation.

Options:
--version

show program’s version number and exit

-h, --help

show this help message and exit

-i INPUT_FILE, --input_file=INPUT_FILE

Tab or space separated file. The first column contains sample IDs; the second column contains sample labels in integer (must be 0 or 1); the third column contains sample label names (string, must be consistent with column-2). The remaining columns contain featuers used to build SVM model.

-o OUT_FILE, --output=OUT_FILE

The prefix of the output file.

-n N_FOLD, --nfold=N_FOLD

The original sample is randomly partitioned into n equal sized subsamples (2 =< n <= 10). Of the n subsamples, a single subsample is retained as the validation data for testing the model, and the remaining n − 1 subsamples are used as training data. default=5.

-C C_VALUE, --cvalue=C_VALUE

C value. default=1.0

-s RAND_SEED, --seed=RAND_SEED

random_state seed used by the random number generator. default=0

-k S_KERNEL, --kernel=S_KERNEL

Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. default=linear

--xl=X_LOW

The lower limit of X-axis (false positive rate). default=-0.05

--xu=X_UPPER

The upper limit of X-axis (false positive rate). default=0.5

--yl=Y_LOW

The lower limit of Y-axis (true positive rate). default=0.5

--yu=Y_UPPER

The upper limit of Y-axis (true positive rate). default=1.05

1.2. Input files format

ID

Label

Label_name

feature_1

feature_2

feature_3

feature_n

sample_1

1

WT

1560

795

0.9716

feature_n

sample_2

1

WT

784

219

0.4087

feature_n

sample_3

1

WT

2661

2268

1.1691

feature_n

sample_4

0

Mut

643

198

0.5458

feature_n

sample_5

0

Mut

534

87

1.0545

feature_n

sample_6

0

Mut

332

75

0.5115

feature_n

1.3. Command

$ python3  SVM_ROC.py -i lung_CES_5features.tsv -o output_ROC -C 10