Comparing runs of Oxford Nanopore sequencing data and alignments

Project description

NanoComp

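Compare multiple runs of long-read sequencing data and alignments. Creates violin plots or box plots of read length, quality and percent identity, and creates dynamic, overlaid read length histograms and a cumulative yield plot.

As of version 1.1.0, NanoComp also creates static png images of the dynamic html plots, because the latter can become very large and slow to load for big datasets. This requires orca to be installed. Without orca the script still works, but no static copies of the dynamic plots are created.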
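Whether static copies can be produced is easy to check before a run. The sketch below assumes orca is installed as an executable named `orca` on the PATH; the helper `orca_available` is illustrative and not part of NanoComp.

```python
import shutil


def orca_available(executable="orca"):
    """Return True if an executable with the given name is found on the PATH."""
    # Assumption: orca is exposed as a command called "orca".
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if orca_available():
        print("orca found: static png copies of the dynamic plots can be created")
    else:
        print("orca not found: only the dynamic html plots will be created")
```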

Installation

pip install NanoComp

The script is written for Python 3.

Usage

NanoComp [-h] [-v] [-t THREADS] [-o OUTDIR] [-p PREFIX] [--verbose]
                [--raw] [--readtype {1D,2D,1D2}] [--barcoded]
                [--split_runs TSV_FILE]
                [-f {png,jpg,jpeg,webp,svg,pdf,eps,json}]
                [-n names [names ...]] [-c colors [colors ...]]
                [--plot {violin,box,ridge,false}] [--title TITLE]
                (--fastq files [files ...] | --fasta files [files ...] | --summary files [files ...] | --bam files [files ...])

General options:
  -h, --help            show the help and exit
  -v, --version         Print version and exit.
  -t, --threads THREADS
                        Set the allowed number of threads to be used by the script
  -o, --outdir OUTDIR   Specify directory in which output has to be created.
  -p, --prefix PREFIX   Specify an optional prefix to be used for the output files.
  --verbose             Write log messages also to terminal.
  --raw                 Store the extracted data in tab separated file.

Options for filtering or transforming input prior to plotting:
  --readtype {1D,2D,1D2}
                        Which read type to extract information about from summary. Options are 1D, 2D,
                        1D2
  --barcoded            Barcoded experiment in summary format, splitting per barcode.
  --split_runs TSV_FILE
                        File: Split the summary on run IDs and use names in tsv file. Mandatory header
                        fields are 'NAME' and 'RUN_ID'.

Options for customizing the plots created:
  -f, --format {'png'(default),'jpg','jpeg','webp','svg','pdf','eps','json'}
                        Specify the output format of the plots. JSON output allows for customisation by the end-user after plotting the figures (https://plotly.com/python-api-reference/generated/plotly.io.read_json.html).
  -n, --names names     Specify the names to be used for the datasets.
  -c, --colors colors   Specify the colors to be used for the datasets.
  --plot {violin,box,ridge,false}
                        Which plot type to use: 'box', 'violin' (default), 'ridge' (joyplot) or 'false' (no plots)
  --title TITLE         Add a title to all plots, requires quoting if using spaces

Input data sources, one of these is required.:
  --fastq files [files ...]
                        Data is in (compressed) fastq format.
  --fasta files [files ...]
                        Data is in (compressed) fasta format.
  --summary files [files ...]
                        Data is in (compressed) summary files generated by albacore or guppy.
  --bam files [files ...]
                        Data is in sorted bam files.

Example file for --split_runs

Examples
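A minimal sketch of how such a file could be written with the Python standard library. The help text only fixes the header fields `NAME` and `RUN_ID`; the column order, the dataset names, the run IDs and the file name `split_runs.tsv` used here are placeholders, not values taken from the NanoComp documentation.

```python
import csv


def write_split_runs(path, runs):
    """Write a tab-separated file with the mandatory NAME and RUN_ID header fields.

    `runs` is an iterable of (name, run_id) pairs.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["NAME", "RUN_ID"])
        writer.writerows(runs)


if __name__ == "__main__":
    # Placeholder run IDs; replace them with the run IDs from your own summary file.
    write_split_runs(
        "split_runs.tsv",
        [("sample_A", "run_id_placeholder_1"), ("sample_B", "run_id_placeholder_2")],
    )
```

The resulting file would then be passed as `--split_runs split_runs.tsv` together with `--summary`.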

NanoComp --bam alignment1.bam alignment2.bam alignment3.bam --outdir compare-runs
NanoComp --fastq reads1.fastq.gz reads2.fastq.gz reads3.fastq.gz reads4.fastq.gz --names run1 run2 run3 run4
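For scripted use, the same command lines can be assembled from Python. The sketch below only builds the argument list from options shown under Usage and prints it; the helper `build_nanocomp_command` is illustrative and not part of NanoComp, and the file names are the placeholders from the examples above.

```python
import shlex


def build_nanocomp_command(input_flag, files, names=None, outdir=None):
    """Assemble a NanoComp command line as a list of arguments.

    `input_flag` is one of the input options from the usage text,
    for example "--fastq" or "--bam".
    """
    command = ["NanoComp", input_flag, *files]
    if names:
        command += ["--names", *names]
    if outdir:
        command += ["--outdir", outdir]
    return command


if __name__ == "__main__":
    command = build_nanocomp_command(
        "--fastq",
        ["reads1.fastq.gz", "reads2.fastq.gz"],
        names=["run1", "run2"],
        outdir="compare-runs",
    )
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(command))
```

To actually execute the comparison, the list can be handed to `subprocess.run(command, check=True)`, provided NanoComp is installed and the input files exist.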

Example output

[Figures: example plot of log-transformed read lengths; example box plot of percent identity]

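See more examples

I welcome all suggestions, bug reports, feature requests and contributions. Please leave an issue or open a pull request. I usually respond within a day, and only rarely does it take a few days.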

Citation

If you use this tool, please consider citing our publication.
