STRING cross-references proteins with several databases (see the "Details" section). By providing your input set of proteins (and, optionally, a background or universe protein set), you can use this function to perform an enrichment test and retrieve a list of enriched terms in each database, along with pertinent information for each term.

rba_string_enrichment(
  ids,
  species = NULL,
  background = NULL,
  split_df = FALSE,
  ...
)

Arguments

ids

Your protein ID(s). It is strongly recommended to supply STRING IDs. See rba_string_map_ids for more information.

species

Numeric: NCBI Taxonomy identifier; the human Taxonomy ID is 9606. (Recommended, but optional if your input contains fewer than 100 IDs.)

background

Character vector: A set of STRING protein IDs to be used as the statistical background (or universe) when computing the p-values for the terms. Only STRING IDs are accepted. (See rba_string_map_ids to map your IDs.)

split_df

Logical (default = FALSE): If TRUE, instead of one data frame, the results will be split into multiple data frames based on their 'category'.

...

rbioapi option(s). See rba_options's arguments manual for more information on available options.

Value

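A data frame in which every row is an enriched term with a p-value smaller than 0.1, and the columns are the term's category, description, number of genes, p-value, fdr, and other pertinent information. If split_df = TRUE, the results are instead returned as multiple data frames, one per category.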
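
As a rough illustration of what split_df = TRUE implies, the sketch below applies base R's split() to a small made-up data frame. The rows, term labels, and values are placeholders rather than real STRING output. That the function splits on the 'category' column in this way is inferred from the argument description and is an assumption here.

```r
# Toy stand-in for an enrichment result; all values are placeholders,
# not real STRING output.
toy_result <- data.frame(
  category = c("Process", "Process", "KEGG", "Pfam"),
  term = c("term_A", "term_B", "term_C", "term_D"),
  fdr = c(0.001, 0.02, 0.004, 0.03),
  stringsAsFactors = FALSE
)

# Split the single data frame into a named list, one data frame per category.
by_category <- split(toy_result, toy_result$category)

# Each element can then be accessed by its category name.
names(by_category)
nrow(by_category[["Process"]])
```

Working with a named list like this makes it easy to inspect or export one category at a time without filtering the combined data frame repeatedly.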

Details

STRING currently maps to and retrieves enrichment results based on Gene Ontology (GO), KEGG pathways, UniProt Keywords, PubMed publications, Pfam domains, InterPro domains, and SMART domains.
Note that this function only returns the enriched terms pertinent to your proteins that have a p-value smaller than 0.1. To retrieve the full list of terms (unfiltered by enrichment p-values), use rba_string_annotations.
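
Since the returned terms are filtered only by the 0.1 p-value threshold, a stricter cutoff on the multiple-testing-corrected values may be wanted afterwards. The sketch below shows one way to do that. filter_enriched is a hypothetical helper, not part of rbioapi. It assumes the result data frame has a numeric column named fdr, and the toy values are placeholders.

```r
# Hypothetical helper (not part of rbioapi): keep terms at or below an FDR
# cutoff and order them from most to least significant.
filter_enriched <- function(enrichment_df, fdr_cutoff = 0.05) {
  stopifnot(is.data.frame(enrichment_df), "fdr" %in% names(enrichment_df))
  # which() drops rows whose fdr is NA instead of returning NA rows.
  keep <- which(enrichment_df$fdr <= fdr_cutoff)
  kept <- enrichment_df[keep, , drop = FALSE]
  kept[order(kept$fdr), , drop = FALSE]
}

# Placeholder values for illustration only, not real STRING output.
toy_result <- data.frame(
  term = c("term_A", "term_B", "term_C"),
  fdr = c(0.03, 0.2, 0.001),
  stringsAsFactors = FALSE
)

filtered <- filter_enriched(toy_result, fdr_cutoff = 0.05)
filtered$term
```

The same helper could be applied to each element of the list returned with split_df = TRUE, for example with lapply().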

Corresponding API Resources

"POST https://string-db.org/api/[output_format]/enrichment?identifiers=[your_identifiers]&[optional_parameters]"

References

  • Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P, Jensen LJ, Mering CV. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019 Jan 8;47(D1):D607-D613. doi: 10.1093/nar/gky1131. PMID: 30476243; PMCID: PMC6323986.

  • STRING API Documentation

Examples

# \donttest{
rba_string_enrichment(ids = c("TP53", "TNF", "EGFR"), species = 9606)
# }
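
The background argument only accepts STRING IDs, so a typical workflow maps identifiers first. The sketch below is one possible way to combine the two steps. enrich_with_background is a hypothetical wrapper, not part of rbioapi. It assumes that rba_string_map_ids returns a data frame with a stringId column, and taking the union of both sets is a design choice of this sketch so that the query is always a subset of the background.

```r
# Hypothetical wrapper (not part of rbioapi): map identifiers to STRING IDs,
# then run the enrichment test against a user-defined background.
enrich_with_background <- function(query, universe, species = 9606) {
  # Map both sets to STRING IDs; the `stringId` column name is an assumption.
  query_ids <- rbioapi::rba_string_map_ids(ids = query, species = species)$stringId
  universe_ids <- rbioapi::rba_string_map_ids(ids = universe, species = species)$stringId

  rbioapi::rba_string_enrichment(
    ids = query_ids,
    species = species,
    # Make sure the background also contains the query proteins.
    background = union(universe_ids, query_ids)
  )
}

# Requires the rbioapi package and an internet connection. The tiny universe
# below only shows the call shape; it is far too small for a meaningful test.
# enrich_with_background(
#   query = c("TP53", "TNF", "EGFR"),
#   universe = c("TP53", "TNF", "EGFR", "BRCA1", "MYC", "AKT1")
# )
```

In practice the universe would be the full set of proteins that could have been detected in the experiment, for example all proteins quantified in a proteomics run.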