这篇文章主要介绍“DESeq2如何使用”的相关知识,小编通过实际案例向大家展示操作过程,操作方法简单快捷,实用性强,希望这篇“DESeq2如何使用”文章能帮助大家解决问题。
deseq_analysis.r 差异基因分析DESeq2
$Rscript $scriptdir/deseq_analysis.r -h
usage: /work/my_stad_immu/scripts/deseq_analysis.r [-h] -i filepath -m
filepath -t treatname
--control CONTROL --case
CASE [-f fdr] [-c fc]
[-s size] [-a alpha]
[-X x.lab] [-Y y.lab]
[-T title] [-H height]
[-W width] [-o path]
[-p prefix]
DESeq2 analysis : https://www.亿速云.com/article/1500
optional arguments:
-h, --help show this help message and exit
-i filepath, --input filepath
input read count file [required]
-m filepath, --metadata filepath
metadata file , required
-t treatname, --treatname treatname
treat colname in group file, required
--control CONTROL set control group name required
--case CASE set case group name required
-f fdr, --fdr fdr set fdr threshold [default 0.05]
-c fc, --fc fc set fold change threshold [default 2]
-s size, --size size point size [optional, default: 0.7]
-a alpha, --alpha alpha
point transparency [0-1] [optional, default: 1]
-X x.lab, --x.lab x.lab
the label for x axis [optional, default: log2FC]
-Y y.lab, --y.lab y.lab
the label for y axis [optional, default: -log10(FDR)]
-T title, --title title
the label for main title [optional, default: Volcano]
-H height, --height height
the height of pic inches [default 5]
-W width, --width width
the width of pic inches [default 5]
-o path, --outdir path
output file directory [default
/work/my_stad_immu/05.enrich]
-p prefix, --prefix prefix
out file name prefix [default Volcano]
-i 输入基因表达矩阵文件,必须为count表达文件:
ID | TCGA-B7-A5TK-01A-12R-A36D-31 | TCGA-BR-7959-01A-11R-2343-13 | TCGA-IN-8462-01A-11R-2343-13 | TCGA-BR-A4CR-01A-11R-A24K-31 | TCGA-CG-4443-01A-01R-1157-13 | TCGA-KB-A93J-01A-11R-A39E-31 | TCGA-BR-4371-01A-01R-1157-13 |
TSPAN6 | 5951 | 4036 | 2834 | 3484 | 2537 | 2027 | 4749 |
TNMD | 3 | 4 | 0 | 1 | 0 | 1 | 8 |
DPM1 | 4672 | 4330 | 1725 | 4370 | 6523 | 3094 | 4415 |
SCYL3 | 1260 | 2057 | 702 | 1483 | 924 | 1451 | 982 |
C1orf112 | 523 | 992 | 172 | 1400 | 234 | 733 | 958 |
FGR | 1249 | 1127 | 285 | 148 | 56 | 941 | 208 |
CFH | 12831 | 11435 | 5387 | 995 | 4571 | 2189 | 1795 |
FUCA2 | 5896 | 7857 | 3208 | 5625 | 1527 | 7530 | 3290 |
GCLC | 2682 | 5509 | 1447 | 9323 | 6422 | 5265 | 2418 |
-k 输入基因表达矩阵文件,为fpkm 或者 tpm文件 用于相关性分析:
ID | TCGA-B7-A5TK-01A-12R-A36D-31 | TCGA-BR-7959-01A-11R-2343-13 | TCGA-IN-8462-01A-11R-2343-13 | TCGA-BR-A4CR-01A-11R-A24K-31 | TCGA-CG-4443-01A-01R-1157-13 |
TSPAN6 | 59.65411 | 32.83064 | 40.7596 | 39.84131 | 53.37611 |
TNMD | 0.084708 | 0.091652 | 0 | 0.032211 | 0 |
DPM1 | 175.9638 | 132.3385 | 93.21574 | 187.7617 | 515.6368 |
SCYL3 | 8.321862 | 11.02458 | 6.652222 | 11.17368 | 12.80849 |
C1orf112 | 3.984496 | 6.132832 | 1.880095 | 12.1676 | 3.741654 |
FGR | 16.34408 | 11.96739 | 5.350846 | 2.209351 | 1.53802 |
-m metadata文件路径,样本的分组信息,第一列必须和表达文件的样本名称对应:
barcode | subtype.hclust | StromalScore | ImmuneScore | ESTIMATEScore | TumourPurity |
TCGA-B7-A5TK-01A-12R-A36D-31 | S1 | 1026.057 | 2386.835 | 3412.892 | 0.448276 |
TCGA-BR-7959-01A-11R-2343-13 | S2 | 1130.722 | 729.402 | 1860.124 | 0.638667 |
TCGA-IN-8462-01A-11R-2343-13 | S2 | 112.2318 | 683.9349 | 796.1667 | 0.750581 |
TCGA-BR-A4CR-01A-11R-A24K-31 | S2 | -1060.35 | -766.618 | -1826.97 | 0.943814 |
TCGA-CG-4443-01A-01R-1157-13 | S2 | -261.577 | -258.629 | -520.206 | 0.8635 |
TCGA-KB-A93J-01A-11R-A39E-31 | S1 | -202.255 | 1605.12 | 1402.865 | 0.688838 |
TCGA-BR-4371-01A-01R-1157-13 | S2 | -828.231 | 711.3379 | -116.893 | 0.832147 |
TCGA-IN-A6RO-01A-12R-A33Y-31 | S2 | -1406.57 | 68.58307 | -1337.98 | 0.917683 |
TCGA-HU-A4H3-01A-21R-A251-31 | S2 | -619.208 | 538.7225 | -80.4854 | 0.829171 |
TCGA-RD-A8MV-01A-11R-A36D-31 | S1 | 113.4127 | 2309.647 | 2423.06 | 0.572976 |
TCGA-VQ-A91X-01A-12R-A414-31 | S2 | -1845.85 | -590.017 | -2435.87 | 0.969545 |
TCGA-D7-8575-01A-11R-2343-13 | S2 | -206.112 | 1392.799 | 1186.687 | 0.711491 |
TCGA-BR-4257-01A-01R-1131-13 | S1 | 861.029 | 1676.148 | 2537.177 | 0.559167 |
TCGA-BR-8485-01A-11R-2402-13 | S1 | 373.0961 | 1110.516 | 1483.612 | 0.680198 |
TCGA-BR-4370-01A-01R-1157-13 | S1 | 1300.495 | 1802.327 | 3102.822 | 0.488483 |
-t subtype.hclust --case S1 --control S2 : 指定metadata 分组列名,分组里面的比较组名字 ,如果分组名字有空格,应该用引号引起来: “Stage IA”
--fdr 0.01 --fc 2 设置差异基因的筛选条件: 显著性和差异倍数
Rscript $scriptdir/deseq_analysis.r -i ../01.TCGA_download/TCGA-STAD_gene_expression_Counts.tsv \
-k ../01.TCGA_download/TCGA-STAD_gene_expression_TPM.tsv --fdr 0.01 --fc 2 \
-m ../03.TIME/metadata.group.tsv -t subtype.hclust --case S1 --control S2 -p S1_vs_S2
关于“DESeq2如何使用”的内容就介绍到这里了,感谢大家的阅读。如果想了解更多行业相关的知识,可以关注亿速云行业资讯频道,小编每天都会为大家更新不同的知识点。
亿速云「云服务器」,即开即用、新一代英特尔至强铂金CPU、三副本存储NVMe SSD云盘,价格低至29元/月。点击查看>>
免责声明:本站发布的内容(图片、视频和文字)以原创、转载和分享为主,文章观点不代表本网站立场,如果涉及侵权请联系站长邮箱:is@yisu.com进行举报,并提供相关证据,一经查实,将立刻删除涉嫌侵权内容。