pengsl-lab/CellMsg

 
 

CellMsg: graph convolutional networks for ligand-receptor-mediated cell-cell communication analysis

=========================================================================================

CellMsg is a method for analyzing cell-cell communication (CCC) mediated by ligand-receptor interactions (LRIs). (i) CellMsg is a framework that captures potential LRIs and uses them to infer CCCs. (ii) CellMsg organizes ligand-receptor pairs into an adjacency matrix, obtains protein features with iFeature, and extracts node features with GCNConv layers, followed by linear layers that perform binary classification. Multiple GCNConv layers with skip connections are stacked so that each node aggregates information from a wider neighborhood while mitigating over-smoothing and vanishing gradients, which improves model accuracy. The overview figure of CellMsg is shown below.

[Figure: overview of CellMsg]

The overview of CellMsg. a. Ligand-Receptor Interaction Prediction, including Data Preprocessing and Ligand-Receptor Interaction Classification. The data preprocessing section extracts multimodal features of ligands and receptors to construct an initial feature matrix and constructs an adjacency matrix based on known associations of ligands and receptors. The Ligand-Receptor Interaction Classification section uses these matrices as inputs to the graph convolutional networks to classify LRIs. b. Cell-Cell Communication Inference, including Ligand-Receptor Interaction Screening and Cell-Cell Communication Strength Measurement. The ligand-receptor interaction screening section filters high-confidence LRIs and then calculates the thresholding result, product result, and cell result for each scRNA-seq dataset. On this basis, the cell-cell communication strength measurement section uses a three-point estimation method to calculate the CCC strength between different cell types. c. Cell-Cell Communication Visualization, including the communication heatmap between different cell types, the communication network between different cell types, and the communication heatmap of the most active LR pairs between different cell types.
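The graph step in panel a can be illustrated with a small sketch. It is written in plain Python as a stand-in for what a GCNConv layer with a skip connection computes, and it is not the repository's code: the node indexing, the toy features, and the helper names (`build_adjacency`, `normalize`, `gcn_layer_with_skip`) are assumptions made for illustration.

```python
import math

def build_adjacency(n_nodes, edges):
    """Symmetric 0/1 adjacency matrix with self-loops from (i, j) index pairs."""
    adj = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        adj[i][i] = 1.0  # self-loop, as in the usual GCN formulation
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0  # known ligand-receptor association
    return adj

def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer_with_skip(a_hat, h, w):
    """ReLU(A_hat @ H @ W) + H; W must be square so the skip term lines up."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, z_ij) + h_ij for z_ij, h_ij in zip(z_row, h_row)]
            for z_row, h_row in zip(z, h)]

# Toy graph (assumption): nodes 0 and 1 are ligands, node 2 is a receptor.
a_hat = normalize(build_adjacency(3, [(0, 2), (1, 2)]))
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = gcn_layer_with_skip(a_hat, features, identity)
```

Adding the input back after the convolution keeps each node's own features in play when several layers are stacked, which is the usual motivation for skip connections against over-smoothing.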

Installation

CellMsg is tested to work under:

* Anaconda 24.1.2
* Python 3.11.7
* Torch 2.0.1
* Scanpy 1.10.1
* Anndata 0.10.8
* R 4.2.2
* Numpy 1.24.4
* iFeature
* Other basic Python and R toolkits

Installation of other dependencies

If any of the following packages is missing or causes problems, install it with pip:

  • LIANA+: pip install liana
  • torch_geometric: pip install torch_geometric
  • seaborn 0.13.2: pip install seaborn
  • networkx 2.8.8: pip install networkx
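To compare a local environment with the versions listed above, a short check along these lines may help. It uses only the standard library; the distribution names in `TESTED` are assumed to match the names on PyPI, and `installed_versions` and `report` are illustrative helpers that are not part of CellMsg.

```python
from importlib import metadata

# Versions listed in this README; PyPI distribution names are assumptions.
TESTED = {
    "torch": "2.0.1",
    "scanpy": "1.10.1",
    "anndata": "0.10.8",
    "numpy": "1.24.4",
    "seaborn": "0.13.2",
    "networkx": "2.8.8",
}

def installed_versions(packages):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found

def report(tested=TESTED):
    """One line per package: tested version, installed version, and status."""
    lines = []
    for name, have in installed_versions(tested).items():
        want = tested[name]
        if have is None:
            status = "missing"
        elif have == want:
            status = "ok"
        else:
            status = "differs"
        lines.append(f"{name}: tested {want}, installed {have} ({status})")
    return lines
```

Calling `print("\n".join(report()))` lists every package with its status. A "differs" line does not necessarily mean CellMsg will fail, only that the version is not the one listed above.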

Quick start

To reproduce our results:

Notes: Because some data files are large, we uploaded them to Google Drive. If a file cannot be found in this repository, please look for it there.

Data Description

| File name | Description |
| --- | --- |
| CellTalkDB/human_lr_pair.rds and CellTalkDB/mouse_lr_pair.rds | The human and mouse LRI databases provided by CellTalkDB, available from https://github.com/ZJUFanLab/CellTalkDB/tree/master/database. |
| CellTalkDB/human_lr_pair.csv and CellTalkDB/mouse_lr_pair.csv | The CSV files converted from CellTalkDB/human_lr_pair.rds and CellTalkDB/mouse_lr_pair.rds using R. |
| Connectome/ncomms8866_human.rda and Connectome/ncomms8866_mouse.rda | The human and mouse LRI databases provided by Connectome, available from https://github.com/msraredon/Connectome/tree/master/data. |
| Connectome/human_lr.csv and Connectome/mouse_lr.csv | The CSV files converted from Connectome/ncomms8866_human.rda and Connectome/ncomms8866_mouse.rda using R. |
| Cytotalk/lrp_human.rda and Cytotalk/lrp_mouse.rda | The human and mouse LRI databases provided by Cytotalk, available from https://github.com/tanlabcode/CytoTalk/tree/master/data. |
| Cytotalk/lrp_human.csv and Cytotalk/lrp_mouse.csv | The CSV files converted from Cytotalk/lrp_human.rda and Cytotalk/lrp_mouse.rda using R. |
| NATMI/human_lr.csv | The human LRI database provided by NATMI, available from https://github.com/forrest-lab/NATMI/blob/master/lrdbs/lrc2p.csv. |
| SingleCellSignalR/LRdb.rda | The human LRI database provided by SingleCellSignalR, available from https://github.com/SCA-IRCM/SingleCellSignalR/tree/master/data. |
| SingleCellSignalR/human_lr.csv | The CSV file converted from SingleCellSignalR/LRdb.rda using R. |
| LRI.csv | The LR pairs identified by CellMsg. |
| cell2ct.csv and cell2ct.txt | Cell annotation files, including mappings from cells to cell types (both represented numerically). |
| mart_export.txt, uniprotid2gn.txt, ensmusg.txt and ensmusp.txt | Mapping files from protein identifiers to gene names. |
| ligand_sequence.txt and receptor_sequence.txt | Ligand and receptor sequence files. They serve as input files for iFeature to generate the corresponding ligand or receptor features. |
| ligand_res_fea.csv and receptor_res_fea.csv (stored in Google Drive) | Ligand and receptor feature files obtained after processing with iFeature. |
| ligand-receptor-interaction.csv | Information about the ligand-receptor interactions that we collected. |
| final_model.pth (stored in Google Drive) | The final model for predicting LRIs. |
| LRI_predicted.csv | LRIs predicted by CellMsg. |
| original_LRI.csv, LRI_ori_.csv, origin_LRI.csv | LRIs that we collected. |

1. Acquiring feature files from sequence files using iFeature

Notes: The processing steps are identical for all sequence files, so we show them for one sequence file only.

python iFeature.py --file CellMsg/dataset1/ligand_sequence.txt --type AAC --out ligand_aac.csv
python iFeature.py --file CellMsg/dataset1/ligand_sequence.txt --type CKSAAP --out ligand_cksaap.csv
python iFeature.py --file CellMsg/dataset1/ligand_sequence.txt --type CTriad --out ligand_ctriad.csv
python iFeature.py --file CellMsg/dataset1/ligand_sequence.txt --type PAAC --out ligand_paac.csv
Then, the four feature files are merged to generate the final feature file for all ligands, in which each row holds the features of one ligand and the number of rows equals the number of ligands.
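The merge step is not shown as a command, so here is a minimal sketch of one way to do it. It assumes each iFeature output has a header row, the sequence ID in the first column, and tab-separated values (pass `delimiter=","` if your files are comma-separated). The function names are illustrative and not part of CellMsg or iFeature.

```python
import csv

def read_feature_table(lines, delimiter="\t"):
    """Parse one feature file into {sequence_id: [float, ...]}.

    `lines` is any iterable of text lines, for example an open file handle.
    The first row is treated as a header and skipped.
    """
    rows = csv.reader(lines, delimiter=delimiter)
    next(rows, None)  # header: ID column followed by feature names
    table = {}
    for row in rows:
        if row:
            table[row[0]] = [float(value) for value in row[1:]]
    return table

def merge_feature_tables(sources, delimiter="\t"):
    """Concatenate the feature vectors of each sequence across all sources.

    Row order follows the first source; IDs missing from any source are dropped.
    """
    tables = [read_feature_table(src, delimiter) for src in sources]
    if not tables:
        return {}
    merged = {}
    for seq_id in tables[0]:
        if all(seq_id in table for table in tables):
            merged[seq_id] = [v for table in tables for v in table[seq_id]]
    return merged
```

With real files, open the four outputs (`ligand_aac.csv`, `ligand_cksaap.csv`, `ligand_ctriad.csv`, `ligand_paac.csv`) and pass the handles as `sources`. Each value in the result is one row of the final ligand feature matrix.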

2. Training an LRI prediction model

Notes: The steps for training the LRI prediction model are the same for all datasets, so we use Dataset 1 as the example.

python CellMsg/dataset1/CellMsg.py

3. Predicting LRIs using the trained model

Notes: The steps for predicting LRIs are the same for all datasets, so we use Dataset 1 as the example.

python CellMsg/dataset1/generate_lr.py
python CellMsg/dataset1/ensp_to_gname.py

Through the above steps, we obtained predicted LRIs with high confidence, which are then merged with the LRIs previously collected to serve as LRIs identified by CellMsg.
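The merge of predicted and collected LRIs can be pictured with a short sketch. The 0.9 cut-off and the `merge_lris` helper are hypothetical; the README does not state how "high confidence" is defined, so adjust the threshold to whatever the prediction script actually uses.

```python
def merge_lris(known_pairs, predicted_scores, threshold=0.9):
    """Union of collected LR pairs and predicted pairs scoring >= threshold.

    known_pairs: iterable of (ligand, receptor) tuples collected from databases.
    predicted_scores: {(ligand, receptor): predicted probability}.
    threshold: hypothetical confidence cut-off (an assumption, not from CellMsg).
    """
    confident = {pair for pair, score in predicted_scores.items()
                 if score >= threshold}
    return sorted(set(known_pairs) | confident)

# Toy identifiers, not real genes.
known = [("L1", "R1")]
predicted = {("L1", "R1"): 0.95, ("L2", "R2"): 0.99, ("L3", "R3"): 0.20}
final_lris = merge_lris(known, predicted)
```

Using a set union means a pair that is both collected and predicted appears only once in the final list.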

4. Measuring cell-cell communication strength

python CellMsg/CCC_Analysis/Processing_scRNA-seq_data.py
python CellMsg/CCC_Analysis/The_three-point_estimation_method.py

Through the above steps, we obtain the cell-cell communication strength matrix computed with the three-point estimation method, and we generate the cell communication heatmap and the cell communication network.
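This README does not give the formula behind the three-point estimation, so the sketch below is only a guess at its general shape. It assumes the classic PERT weighting (low + 4 × middle + high) / 6 applied to the thresholding, product, and cell results for each pair of cell types. The script The_three-point_estimation_method.py may weight or order the three scores differently.

```python
def three_point_estimate(a, b, c):
    """PERT-style estimate (low + 4 * middle + high) / 6 of three scores.

    Assumption: the three scores are ranked by value before weighting.
    """
    low, middle, high = sorted((a, b, c))
    return (low + 4.0 * middle + high) / 6.0

def combine_score_matrices(thresholding, product, cell):
    """Apply the estimate element-wise to three equally shaped matrices.

    Each matrix is a list of rows indexed as [sender cell type][receiver cell type].
    """
    return [
        [three_point_estimate(t, p, c) for t, p, c in zip(t_row, p_row, c_row)]
        for t_row, p_row, c_row in zip(thresholding, product, cell)
    ]

# Toy 1x2 matrices, for illustration only.
strength = combine_score_matrices([[1.0, 0.0]], [[2.0, 6.0]], [[3.0, 3.0]])
```

Weighting the middle value most heavily makes the combined strength less sensitive to any single scoring scheme that produces an extreme value.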

5. Visualization analysis of cell-cell communication

python CellMsg/CCC_Analysis/The_number_of_LRIs.py

Through the above command, we obtain Three_LRi_num.pdf and Three_LRi_num.csv, which show the number of LRIs mediating communication between cell types in human melanoma tissues.
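Conceptually, the count behind such a table is a tally of active LR pairs for each ordered pair of cell types. The sketch below assumes a hypothetical score dictionary keyed by (sender, receiver, ligand, receptor) and a `min_score` cut-off. Neither is taken from The_number_of_LRIs.py.

```python
from collections import Counter

def count_active_lris(scores, min_score=0.0):
    """Count LR pairs scoring above min_score for each (sender, receiver) pair.

    scores: {(sender, receiver, ligand, receptor): communication score}.
    """
    counts = Counter()
    for (sender, receiver, _ligand, _receptor), score in scores.items():
        if score > min_score:
            counts[(sender, receiver)] += 1
    return counts

# Toy labels, for illustration only.
toy_scores = {
    ("A", "B", "L1", "R1"): 0.5,
    ("A", "B", "L2", "R2"): 0.0,
    ("B", "A", "L1", "R1"): 0.3,
}
lri_counts = count_active_lris(toy_scores)
```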

python CellMsg/CCC_Analysis/Top.py

Through the above command, we obtained Top.pdf and Top_data.csv, which display the three most likely LR pairs mediating communication between melanoma cancer cells and six other cell types.
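Selecting the most likely LR pairs for one sender cell type can be sketched as a top-k ranking per receiver. The score dictionary layout and the `top_lr_pairs` helper are assumptions for illustration and are not taken from Top.py.

```python
import heapq

def top_lr_pairs(scores, sender, k=3):
    """For each receiver of `sender`, return the k highest-scoring LR pairs.

    scores: {(sender, receiver, ligand, receptor): communication score}.
    Returns {receiver: [(score, ligand, receptor), ...]} sorted by score, descending.
    """
    by_receiver = {}
    for (src, dst, ligand, receptor), score in scores.items():
        if src == sender:
            by_receiver.setdefault(dst, []).append((score, ligand, receptor))
    return {dst: heapq.nlargest(k, items) for dst, items in by_receiver.items()}
```

With k=3 and the melanoma cancer cells as `sender`, the result would hold up to three LR pairs for each of the other cell types, which is the shape of data a heatmap like Top.pdf summarizes.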

=========================================================================================

Contributing

All authors were involved in the conceptualization of the CellMsg method. BYJ and SLP conceived and supervised the project. BYJ and HX collected the data. HX completed the coding for the project. BYJ, HX and SLP contributed to the review of the manuscript before submission for publication. All authors read and approved the final manuscript.

cite

Contacts

If you have any questions or comments, please feel free to email: Shaoliang Peng ([email protected]); Boya Ji ([email protected]); Hong Xia ([email protected]).

License

MIT © Richard McRichface.
