4 Workflows

4.1 IOBR Workflow

User-uploaded data workflow for preprocessing, feature calculation, clustering, visualization, survival analysis, correlation analysis, and group comparison.

This workflow starts from user-provided expression and phenotype data, then connects multiple modules into a complete analysis pipeline: signature scoring or tumor microenvironment (TME) deconvolution, followed by downstream interpretation.

Overview

The workflow is organized into five parts:

Part 1 · Preprocessing & Features
- Counts to TPM
- Detect Outliers
- Calculate Features
- TME Cluster
- Combine Pdata
Part 2 · Visualization
- Heatmap
- Box Plot
- Percent Bar Plot
- Cell Bar Plot
Part 3 · Survival Analysis
- Batch Survival
- Forest Plot
- Heatmap
- Survival Plot
- Survival Group
- Time ROC
- Sig ROC
Part 4 · Correlation
- Batch Correlation
- Partial Correlation
- Single Correlation
- Correlation Matrix
Part 5 · Group Comparison
- Wilcoxon Test
- Kruskal Test
- Heatmap
- Box Plot

Data source

The workflow uses user-uploaded data and intermediate results generated inside the workflow.

Main inputs include:

Expression matrix
- Raw count matrix uploaded in Counts to TPM

                  TCGA-BR-6455  TCGA-BR-7196  TCGA-BR-8371  TCGA-BR-8380
ENSG00000000003      8006          2114          767           1556
ENSG00000000005      1             0             5             5
ENSG00000000419      3831          2600          1729          1760
ENSG00000000457      1126          745           1040          1260

Phenotype / clinical table
- Uploaded in Combine Pdata

ID            stage      status  Lauren      subtype  EBV       TMEscore_plus_binary  ARID1A     PIK3CA
TCGA-3M-AB46  Stage_I    Alive   Mixed                NE        Low                   wild_type  wild_type
TCGA-B7-5818  Stage_I    Alive   Diffuse     EBV      Positive  High                  mutant     mutant
TCGA-B7-A5TI  Stage_III  Alive   Diffuse              NE        Low                   wild_type  wild_type
TCGA-B7-A5TJ  Stage_II   Alive   Intestinal           NE        Low                   wild_type  wild_type
TCGA-B7-A5TK  Stage_II   Alive   Intestinal           NE        High                  mutant     wild_type

Intermediate data generated within the workflow include:

TPM expression matrix
outlier-filtered matrix
signature score matrix or TME deconvolution matrix
cluster annotation
combined analysis table for downstream modules

Counts to TPM

This step converts uploaded raw count data into a TPM matrix.

Counts to TPM steps

Open the Counts to TPM tab.
Upload the raw expression count matrix.
Set the required parameters for TPM conversion.
Click Run Analysis.
Review the generated TPM matrix.
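
For readers who want to see what the conversion does numerically, the following is a minimal sketch of the standard TPM calculation. It is not the application's own code: the function name and the gene lengths are illustrative assumptions, and the text above does not say how the app obtains gene lengths.

```python
def counts_to_tpm(counts, gene_lengths_bp):
    """Convert a genes x samples count table to TPM.

    counts:          {gene_id: [raw count per sample]}
    gene_lengths_bp: {gene_id: gene length in base pairs} (assumed to be known)
    """
    n_samples = len(next(iter(counts.values())))
    # Step 1: divide each count by gene length in kilobases (reads per kilobase).
    rpk = {
        gene: [c / (gene_lengths_bp[gene] / 1000.0) for c in row]
        for gene, row in counts.items()
    }
    # Step 2: rescale each sample so that its values sum to one million.
    totals = [sum(rpk[gene][j] for gene in rpk) for j in range(n_samples)]
    return {
        gene: [v / totals[j] * 1e6 if totals[j] else 0.0 for j, v in enumerate(row)]
        for gene, row in rpk.items()
    }


# Counts for two samples from the example matrix; the gene lengths are made up.
counts = {
    "ENSG00000000003": [8006, 2114],
    "ENSG00000000005": [1, 0],
    "ENSG00000000419": [3831, 2600],
}
lengths = {"ENSG00000000003": 2000, "ENSG00000000005": 1000, "ENSG00000000419": 1500}
tpm = counts_to_tpm(counts, lengths)
column_sums = [sum(tpm[g][j] for g in tpm) for j in range(2)]
```

Each sample column of the result sums to one million, which is a quick sanity check to apply to the generated TPM matrix.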

Detect Outliers

This step identifies and removes outlier samples based on the TPM matrix generated in the previous step.

Detect Outliers steps

Open the Detect Outliers tab.
Adjust the outlier detection settings if needed.
Click Run Analysis.
Review the filtered expression matrix.
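
The text does not describe the detection method used by the app, so the sketch below only illustrates one common idea: a sample whose expression profile correlates poorly with all other samples is flagged. The function names, the z-score cutoff, and the toy data are assumptions for illustration.

```python
import math
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_outliers(samples, z_cutoff=-2.0):
    """samples: {sample_id: [expression value per gene]}. Returns flagged IDs."""
    ids = list(samples)
    # Connectivity = mean correlation of one sample with every other sample.
    connectivity = {
        s: mean(pearson(samples[s], samples[t]) for t in ids if t != s)
        for s in ids
    }
    mu = mean(connectivity.values())
    sd = pstdev(connectivity.values())
    # Flag samples whose standardized connectivity falls below the cutoff.
    return [s for s in ids if sd > 0 and (connectivity[s] - mu) / sd < z_cutoff]


# Toy data: five similar profiles and one reversed profile (hypothetical values).
samples = {
    "S1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "S2": [1.1, 2.0, 2.9, 4.2, 5.0],
    "S3": [0.9, 2.1, 3.0, 3.9, 5.1],
    "S4": [1.0, 1.9, 3.1, 4.0, 4.9],
    "S5": [1.2, 2.0, 3.0, 4.1, 5.0],
    "S6": [5.0, 4.0, 3.0, 2.0, 1.0],
}
flagged = flag_outliers(samples)
filtered = {s: v for s, v in samples.items() if s not in flagged}
```

Here `filtered` plays the role of the outlier-filtered matrix that later steps consume.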

Calculate Features

This step calculates downstream feature matrices based on the cleaned expression matrix.

Two modes are available:

Calculate Sigscores
Deconvolute TME

If Calculate Features = Calculate Sigscores

The workflow runs the Calculate_sig_score module and returns a signature score matrix.

If Calculate Features = Deconvolute TME

The workflow runs the Deconvo_tme module and returns a TME deconvolution matrix.

Calculate Features steps

Open the Calculate Features tab.
Select Calculate Sigscores or Deconvolute TME.
Set the corresponding parameters.
Click Run Analysis.
Review the generated feature matrix.

TME Cluster

This step performs clustering based on the feature matrix generated in the previous step.

TME Cluster parameters

Features (features) — one or more numeric variables used for clustering
Min Clusters (min_nc) — minimum number of clusters evaluated
Max Clusters (max.nc) — maximum number of clusters evaluated

TME Cluster steps

Open the TME Cluster tab.
Select one or more clustering Features.
Set Min Clusters and Max Clusters.
Click Run Analysis.
Review the cluster assignment table and cluster summary.
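
To make the role of Min Clusters and Max Clusters concrete, the sketch below tries every cluster number in that range and keeps the one with the best average silhouette width. This is an assumption-laden illustration: the text does not state which clustering algorithm or quality index the app uses, and the k-means routine, the silhouette criterion, and all names here are stand-ins.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, n_iter=50):
    # Deterministic farthest-point initialization keeps the sketch reproducible.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(p, q) for q in grp) / len(grp)
            for other, grp in clusters.items() if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def choose_cluster_number(points, min_nc, max_nc):
    """Evaluate every k in [min_nc, max_nc] and return the best k with its labels."""
    results = {k: kmeans(points, k) for k in range(min_nc, max_nc + 1)}
    best_k = max(results, key=lambda k: silhouette(points, results[k]))
    return best_k, results[best_k]


# Toy feature matrix: three well-separated groups of samples (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
best_k, labels = choose_cluster_number(points, min_nc=2, max_nc=4)
```

The returned `labels` correspond to the cluster assignment table reviewed in the last step.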

Combine Pdata

This step merges the feature matrix and cluster results with the user-uploaded phenotype table.

This combined table is used as the main shared input for most downstream analyses.

Combine Pdata steps

Open the Combine Pdata tab.
Upload the phenotype or clinical data table.
Match the sample ID columns between phenotype data and feature data.
Click Run Analysis.
Review the combined dataset.
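
Conceptually, the merge is a join on the sample ID column. The sketch below assumes an inner join (samples missing from either table are dropped); the app's actual join behavior is not stated in the text, and the function name, feature values, and cluster labels are hypothetical.

```python
def combine_pdata(pdata_rows, feature_rows, pdata_id="ID", feature_id="ID"):
    """Inner-join two lists of row dictionaries on their sample ID columns."""
    features_by_id = {row[feature_id]: row for row in feature_rows}
    combined = []
    for prow in pdata_rows:
        frow = features_by_id.get(prow[pdata_id])
        if frow is None:
            continue  # phenotype row without a matching feature row
        merged = dict(prow)
        merged.update({k: v for k, v in frow.items() if k != feature_id})
        combined.append(merged)
    return combined


# Phenotype rows follow the example table; feature values are invented.
pdata = [
    {"ID": "TCGA-3M-AB46", "stage": "Stage_I", "status": "Alive"},
    {"ID": "TCGA-B7-5818", "stage": "Stage_I", "status": "Alive"},
    {"ID": "TCGA-B7-A5TI", "stage": "Stage_III", "status": "Alive"},
]
features = [
    {"ID": "TCGA-3M-AB46", "CD_8_T_effector": 1.2, "cluster": "TME1"},
    {"ID": "TCGA-B7-5818", "CD_8_T_effector": -0.4, "cluster": "TME2"},
]
combined = combine_pdata(pdata, features)
```

If the two tables name their ID columns differently, pass the names through `pdata_id` and `feature_id`, which mirrors the column-matching step above.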

How downstream modules use the prepared data

After preprocessing and feature generation, the workflow automatically builds a combined dataset that includes:

phenotype or clinical variables
calculated signature scores or TME deconvolution features
optional cluster assignment from TME Cluster

This combined table is then used as the shared input for downstream modules in Parts 2–5.

Part 2 · Visualization

This section provides direct plotting modules for the prepared dataset:

Heatmap — visualize selected signatures or scores across groups
Box Plot — compare one signature across categorical groups
Percent Bar Plot — display proportions of categorical annotations
Cell Bar Plot — show deconvolution-based cell composition across samples

Part 3 · Survival Analysis

This section provides survival-related modules using the combined dataset:

Batch Survival — screen multiple variables by Cox analysis
Forest Plot — visualize hazard ratios from batch survival results
Heatmap — display selected survival-associated variables
Survival Plot — Kaplan–Meier curves for a selected signature
Survival Group — Kaplan–Meier curves for a categorical variable
Time ROC — time-dependent ROC for prognostic variables
Sig ROC — ROC analysis for selected variables against outcome
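
As background for the Kaplan–Meier modules, the sketch below shows how the survival curve itself is estimated from follow-up times and event indicators. It is a generic textbook estimator with invented data, not the app's plotting code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    times:  follow-up time per sample
    events: 1 if the event (for example death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for five samples.
times = [5, 8, 12, 12, 20]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
```

To compare groups, the same function can be applied separately to each level of a categorical variable, or to samples split by a cutoff on a numeric signature.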

Part 4 · Correlation

This section provides correlation-based analyses:

Batch Correlation — correlate one target with multiple features
Partial Correlation — correlate variables while adjusting for a control variable
Single Correlation — visualize correlation between two variables
Correlation Matrix — compute and plot feature-set correlation matrices
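
The difference between batch and partial correlation can be sketched as follows. Pearson correlation and a single control variable are assumed here; the text does not specify the correlation method used by the app, and the column names and values are invented.

```python
import math
from statistics import mean


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def batch_correlation(table, target, features):
    """Correlate one target column with each feature column."""
    return {f: pearson(table[target], table[f]) for f in features}


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, adjusting for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical combined table stored column-wise.
table = {
    "score": [1, 2, 3, 4, 5],
    "featureA": [2, 4, 6, 8, 10],
    "featureB": [5, 3, 4, 1, 2],
    "control": [2, 1, 4, 3, 5],
}
batch = batch_correlation(table, "score", ["featureA", "featureB"])
adjusted = partial_correlation(table["score"], table["featureB"], table["control"])
```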

Part 5 · Group Comparison

This section provides statistical comparison modules:

Wilcoxon Test — compare numeric variables between two groups
Kruskal Test — compare numeric variables across multiple groups
Heatmap — visualize selected group-associated variables
Box Plot — visualize group differences for one selected variable

Output

The workflow returns a processed user dataset that can be reused across multiple downstream modules, including:

TPM expression matrix
outlier-filtered expression matrix
signature score matrix or TME deconvolution matrix
optional cluster annotation
combined phenotype-feature table
downstream plots and result tables generated in each part

Download

Tables and plots generated in downstream modules can be exported from their corresponding Download panels.
Each plot is given an initial size; adjust the width and height before downloading if a different layout is needed.

4.2 Mutation Workflow

User-uploaded data workflow for mutation matrix construction and mutation-associated signature analysis.

This workflow starts from a mutation annotation file and converts it into a binary mutation matrix, then combines the generated mutation matrix with a user-provided signature matrix to identify phenotype- or signature-associated mutations.

Overview

The workflow is organized into two parts:

Build Mutation Matrix
Find Mutations

Data source

The workflow uses user-uploaded data and intermediate results generated inside the workflow.

Main inputs include:

Mutation annotation file
- Uploaded in Build Mutation Matrix
- Typically a MAF-format mutation file

Mutation Annotation Format (MAF) table

                Hugo_Symbol   Tumor_Sample_Barcode   Variant_Classification
1                 TP53          TCGA-3M-AB46          Missense_Mutation
2                 ARID1A        TCGA-3M-AB46          Frame_Shift_Del
3                 PIK3CA        TCGA-3M-AB47          Missense_Mutation
4                 CDH1          TCGA-B7-5818          Nonsense_Mutation
5                 FAT4          TCGA-B7-A5TI          Frame_Shift_Ins

Signature matrix
- Uploaded in Find Mutations
- Used to test mutation-associated differences in a selected signature

              ID          CD_8_T_effector       DDR           APM    Immune_Checkpoint  CellCycle_Reg
1         TCGA-2F-A9KO         4.7093          -4.3653       3.1724          4.5259           -1.3468
2         TCGA-2F-A9KP        -1.6480           5.0614      -1.3928         -1.4447            3.2313
3         TCGA-2F-A9KQ        -2.1915         -11.1568      -1.8568         -1.7691            0.6771
4         TCGA-2F-A9KR         0.0528           3.2845       1.6877         -0.2206           -1.3867
5         TCGA-2F-A9KT        -0.9226           7.1762      -1.6106         -1.0915           -1.1749

Intermediate data generated within the workflow include:

binary mutation matrix
mutation-type-specific matrices, such as SNP, INDEL, or Frameshift tables
mutation-associated plots generated in downstream analysis

Build Mutation Matrix

This step converts the uploaded mutation annotation file into a binary mutation matrix for downstream analysis.

The generated matrix is automatically passed to the next step in the workflow.

Parameters

TCGA (isTCGA)
- True
- False
Type to show and download (table_type)
- All
- SNP
- INDEL
- Frameshift

Build Mutation Matrix steps

Open the Build Mutation Matrix tab.
Upload the mutation annotation file.
Set whether the file uses TCGA sample naming.
Select the mutation table type to display.
Click Run Analysis.
Review the generated mutation matrix.
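
The conversion from MAF rows to a binary matrix can be pictured with the sketch below, which uses the example rows shown earlier. It is an illustration only: the function name is hypothetical, and treating Frame_Shift_Del and Frame_Shift_Ins as the Frameshift table is an assumption about how that table type is defined.

```python
FRAMESHIFT = {"Frame_Shift_Del", "Frame_Shift_Ins"}  # assumed definition


def build_mutation_matrix(maf_rows, keep=None):
    """Return {sample: {gene: 0 or 1}}; 1 means the gene is mutated in the sample.

    keep: optional set of Variant_Classification values to retain.
    """
    rows = [r for r in maf_rows if keep is None or r["Variant_Classification"] in keep]
    samples = sorted({r["Tumor_Sample_Barcode"] for r in maf_rows})
    genes = sorted({r["Hugo_Symbol"] for r in rows})
    matrix = {s: {g: 0 for g in genes} for s in samples}
    for r in rows:
        matrix[r["Tumor_Sample_Barcode"]][r["Hugo_Symbol"]] = 1
    return matrix


maf = [
    {"Hugo_Symbol": "TP53", "Tumor_Sample_Barcode": "TCGA-3M-AB46", "Variant_Classification": "Missense_Mutation"},
    {"Hugo_Symbol": "ARID1A", "Tumor_Sample_Barcode": "TCGA-3M-AB46", "Variant_Classification": "Frame_Shift_Del"},
    {"Hugo_Symbol": "PIK3CA", "Tumor_Sample_Barcode": "TCGA-3M-AB47", "Variant_Classification": "Missense_Mutation"},
    {"Hugo_Symbol": "CDH1", "Tumor_Sample_Barcode": "TCGA-B7-5818", "Variant_Classification": "Nonsense_Mutation"},
    {"Hugo_Symbol": "FAT4", "Tumor_Sample_Barcode": "TCGA-B7-A5TI", "Variant_Classification": "Frame_Shift_Ins"},
]
all_mutations = build_mutation_matrix(maf)
frameshift_only = build_mutation_matrix(maf, keep=FRAMESHIFT)
```

Samples without any retained mutation stay in the matrix as all-zero rows, so the different table types remain comparable sample by sample.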

Find Mutations

This step identifies mutations associated with a selected signature using the mutation matrix generated in the previous step.

It produces at least two plots:

Oncoprint
Box Plot

Parameters

ID Column (id_signature_matrix)
- Sample ID column in the uploaded signature matrix
Signature (signature)
- Numeric signature or score column selected from the signature matrix
Min Mutation Frequency (min_mut_freq)
- 0.01
- 0.05
- 0.1
Method (method)
- Multi(Cuzick and Wilcoxon)
- Wilcoxon

Oncoprint parameters

Group By (oncoprint_group_by)
- Mean
- Quantile
Gene Counts (gene_counts)
- Number of top mutated genes displayed in the oncoprint

Box Plot parameters

Point Size (point_size)
- Controls point size in the box plot
Point Transparency (point_alpha)
- Controls point transparency in the box plot
Show Jitter (jitter)
- True
- False

Find Mutations steps

Open the Find Mutations tab.
Upload the signature matrix.
Set the ID Column and select the target Signature.
Choose the mutation frequency threshold and statistical method.
Adjust oncoprint and box plot parameters if needed.
Click Run Analysis.
Review the generated Oncoprint and Box Plot results.
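
The logic behind the frequency threshold and the two-group comparison can be sketched as below. Only the rank-based U statistic underlying the Wilcoxon rank-sum test is computed; p-values and the Cuzick trend test are omitted. Sample IDs, scores, and function names are invented for illustration.

```python
def mutation_frequency(matrix, gene):
    """Fraction of samples in which the gene is mutated."""
    return sum(row[gene] for row in matrix.values()) / len(matrix)


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: each pair with a > b counts 1, each tie counts 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )


def screen_mutations(matrix, scores, min_mut_freq=0.05):
    """Compare signature scores between mutated and wild-type samples per gene."""
    results = {}
    genes = list(next(iter(matrix.values())))
    for gene in genes:
        if mutation_frequency(matrix, gene) < min_mut_freq:
            continue  # too rarely mutated to test
        mutated = [scores[s] for s in matrix if s in scores and matrix[s][gene] == 1]
        wild = [scores[s] for s in matrix if s in scores and matrix[s][gene] == 0]
        if mutated and wild:
            results[gene] = mann_whitney_u(mutated, wild)
    return results


# Twelve hypothetical samples: TP53 mutated in four, CDH1 mutated in one.
sample_ids = [f"S{i:02d}" for i in range(1, 13)]
matrix = {s: {"TP53": int(i < 4), "CDH1": int(i == 11)} for i, s in enumerate(sample_ids)}
scores = {s: float(12 - i) for i, s in enumerate(sample_ids)}
results = screen_mutations(matrix, scores, min_mut_freq=0.1)
```

With a threshold of 0.1, CDH1 (mutated in 1 of 12 samples) is skipped, while TP53 is tested; its U statistic equals the number of mutated/wild-type pairs because every mutated sample scores higher.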

How the workflow connects the two parts

After Build Mutation Matrix is completed, the generated mutation matrix is automatically used as the mutation input for Find Mutations.

You only need to upload the signature matrix in the second step and then run the mutation-association analysis.

Output

The workflow returns results that can be reused for mutation-related interpretation, including:

binary mutation matrix
mutation-type-specific matrix tables
mutation-associated oncoprint
mutation-associated box plot
result files generated by the mutation analysis step

Download

Tables and plots generated in the workflow can be exported from their corresponding Download panels.
Each plot is given an initial size; adjust the width and height before downloading if a different layout is needed.

4.3 Signature-Gene Workflow

User-uploaded data workflow for preprocessing, signature calculation, and signature-gene correlation analysis. It supports correlation analysis between built-in IOBR signatures and their related genes, as well as screening correlations between selected signatures and all genes in the TPM matrix.

This workflow starts from a user-provided count matrix, converts it to TPM, removes outlier samples if needed, calculates signature scores, and then combines signature scores with gene expression values for downstream correlation analysis.

Overview

The workflow is organized into two parts:

Part 1 · Preprocessing & Features
- Counts to TPM
- Detect Outliers
- Calculate Signatures
Part 2 · Correlation
- Batch Correlation
- Single Correlation
- Correlation Matrix

Data source

The workflow uses user-uploaded data and intermediate results generated inside the workflow.

Main inputs include:

Expression matrix
- Raw count matrix uploaded in Counts to TPM

                  TCGA-BR-6455  TCGA-BR-7196  TCGA-BR-8371  TCGA-BR-8380
ENSG00000000003      8006          2114          767           1556
ENSG00000000005      1             0             5             5
ENSG00000000419      3831          2600          1729          1760
ENSG00000000457      1126          745           1040          1260

Intermediate data generated within the workflow include:

TPM expression matrix
outlier-filtered TPM matrix
signature score matrix
combined signature-gene matrix for downstream correlation analysis

Counts to TPM

This step converts uploaded raw count data into a TPM matrix.

Counts to TPM steps

Open the Counts to TPM tab.
Upload the raw count matrix.
Set the required TPM conversion parameters.
Click Run Analysis.
Review the generated TPM matrix.

Detect Outliers

This step identifies and removes outlier samples based on the TPM matrix generated in the previous step.

Detect Outliers steps

Open the Detect Outliers tab.
Adjust the outlier detection settings if needed.
Click Run Analysis.
Review the filtered TPM matrix.

Calculate Signatures

This step calculates signature scores from the cleaned TPM matrix.

The generated signature score matrix is used together with the TPM matrix in downstream correlation analysis.

Calculate Signatures steps

Open the Calculate Signatures tab.
Select the signature scoring parameters.
Click Run Analysis.
Review the generated signature score matrix.
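
The scoring method is chosen through the parameters above and is not detailed in the text. As one simple possibility, the sketch below scores a signature as the mean z-score of its genes after a log2(TPM + 1) transform. The function name is hypothetical, and the two-gene signature and TPM values are invented for illustration.

```python
import math
from statistics import mean, pstdev


def zscore_signature(tpm, signature_genes):
    """tpm: {gene: [TPM per sample]}. Returns one score per sample."""
    n_samples = len(next(iter(tpm.values())))
    standardized = []
    for gene in signature_genes:
        if gene not in tpm:
            continue  # signature gene absent from the matrix
        logged = [math.log2(v + 1) for v in tpm[gene]]
        sd = pstdev(logged)
        if sd == 0:
            continue  # constant genes carry no information
        mu = mean(logged)
        standardized.append([(v - mu) / sd for v in logged])
    if not standardized:
        return [0.0] * n_samples
    # Average the per-gene z-scores within each sample.
    return [mean(col) for col in zip(*standardized)]


tpm = {
    "CD8A": [10, 20, 40],
    "GZMB": [5, 5, 50],
    "ACTB": [1000, 1000, 1000],
}
scores = zscore_signature(tpm, ["CD8A", "GZMB", "NOT_IN_MATRIX"])
```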

How the workflow builds the analysis table

After signature calculation, the workflow automatically constructs a combined matrix by:

transposing the TPM matrix so that samples are rows and genes are columns
keeping the signature score matrix in sample-wise format
merging the two tables by ID

The final combined table contains:

ID
signature score columns
gene expression columns

This combined table is then used as the shared input for downstream correlation modules.
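
The construction described above can be sketched in a few lines. The function name, the TPM values, and the score values are illustrative assumptions; only the transpose-then-merge-by-ID logic comes from the text.

```python
def build_signature_gene_table(tpm, sample_ids, signature_rows):
    """Merge sample-wise signature scores with a transposed TPM matrix by ID.

    tpm:            {gene: [TPM per sample]}, ordered like sample_ids
    signature_rows: list of {"ID": sample_id, signature_name: score, ...}
    """
    # Transpose: one dictionary of gene values per sample.
    genes_by_sample = {
        sid: {gene: values[j] for gene, values in tpm.items()}
        for j, sid in enumerate(sample_ids)
    }
    table = []
    for row in signature_rows:
        gene_values = genes_by_sample.get(row["ID"])
        if gene_values is None:
            continue  # sample has no expression data
        table.append({**row, **gene_values})  # ID, signature columns, gene columns
    return table


sample_ids = ["TCGA-BR-6455", "TCGA-BR-7196"]
tpm = {
    "ENSG00000000003": [35.1, 12.4],  # invented TPM values
    "ENSG00000000419": [22.8, 19.6],
}
signature_rows = [
    {"ID": "TCGA-BR-6455", "CD_8_T_effector": 0.8},
    {"ID": "TCGA-BR-7196", "CD_8_T_effector": -0.3},
]
table = build_signature_gene_table(tpm, sample_ids, signature_rows)
```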

Part 2 · Correlation

This section provides correlation-based analyses between signatures and genes:

Batch Correlation — correlate one selected signature with multiple genes
Single Correlation — visualize correlation between one signature and one gene
Correlation Matrix — compute and plot correlations between selected signature set and gene set

In this workflow, the correlation modules are configured so that signature variables are used as targets and gene expression variables are used as features.

Output

The workflow returns processed data that can be reused across correlation analyses, including:

TPM expression matrix
outlier-filtered TPM matrix
signature score matrix
merged signature-gene matrix
downstream correlation plots and result tables

Download

Tables and plots generated in downstream modules can be exported from their corresponding Download panels.
Each plot is given an initial size; adjust the width and height before downloading if a different layout is needed.