snakemake - I use seqkit to compute statistics on my sequences, but only DM8909_R1_clean.fastq.gz has a result and DM89

Updated: 2025-04-07

samples = ["DM8909"]

rule all:
    input:
        "/home/huangqiang/whole_genome_pipline/temp/seqkit/second_sequence_statistics.txt"
      
rule fastp_map:
    input:
        r1 = "seq/{sample}_1.fastq.gz",
        r2 = "seq/{sample}_2.fastq.gz"
    output:
        r1_clean = "temp/fastp_output/{sample}_R1_clean.fastq.gz",
        r2_clean = "temp/fastp_output/{sample}_R2_clean.fastq.gz",
        html_report = "temp/fastp_output/{sample}_fastp.html",
        json_stats = "temp/fastp_output/{sample}_fastp.json"
    shell:
        "fastp \
        -i {input.r1} \
        -I {input.r2} \
        -o {output.r1_clean} \
        -O {output.r2_clean} \
        -h {output.html_report} \
        -j {output.json_stats} \
        -q 15 -u 40 -l 50 --dedup"

rule mkdir_seqkit:
    output:
        "temp/seqkit"
    shell:
        "mkdir -p {output}"

rule second_sequence_statistics:
    input:
        expand("/home/huangqiang/whole_genome_pipline/temp/fastp_output/{sample}_{pair}_clean.fastq.gz", sample=samples, pair=["R2","R1"])
    output:
        "/home/huangqiang/whole_genome_pipline/temp/seqkit/second_sequence_statistics.txt"
    shell:
        "seqkit \
        stats -aT -i {input} > {output}"

The result contains statistics for R1, but R2 is missing:

file    format  type    num_seqs    sum_len min_len avg_len max_len Q1  Q2  Q3  sum_gap N50 N50_num Q20(%)  Q30(%)  AvgQual GC(%)   sum_n
/home/huangqiang/whole_genome_pipline/temp/fastp_output/DM8909_R1_clean.fastq.gz    FASTQ   DNA 4121308 618107404   50  150.0   150 150.0   150.0   150.0   0   150 1   98.24   94.91   27.70   49.67   9786

Asked Mar 30 at 1:31 by ye ye; edited Mar 31 at 7:57 by halfer.

1 Answer

I am not sure this solves your problem, but three things come to mind when looking at your code. A) Be consistent with absolute and relative paths (I would use absolute paths only if really necessary). B) You do not need to create folders yourself; Snakemake creates missing output directories for you, so the mkdir_seqkit rule can go. C) Use triple quotes for multi-line shell commands.
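Point A matters because, as far as I know, Snakemake decides which rule produces a file by matching the requested path against each rule's output pattern as a plain string, without resolving absolute and relative paths to a common form. The following is a minimal Python sketch of that idea, not Snakemake's actual implementation: `pattern_to_regex` and `match_output` are hypothetical helpers, and the assumption is that every wildcard behaves like `.+`.

```python
import re


def pattern_to_regex(pattern):
    """Turn an output pattern such as 'dir/{sample}.txt' into an anchored regex.

    Simplified stand-in: every {wildcard} becomes a named group matching '.+'.
    """
    # re.split with a capturing group alternates: literal, wildcard name, literal, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)       # literal text must match exactly
        else:
            regex += f"(?P<{part}>.+)"     # wildcard captures one or more characters
    return re.compile(regex + "$")


def match_output(pattern, requested):
    """Return the wildcard values if 'requested' fits 'pattern', else None."""
    m = pattern_to_regex(pattern).match(requested)
    return m.groupdict() if m else None


output_pattern = "temp/fastp_output/{sample}_R1_clean.fastq.gz"
relative = "temp/fastp_output/DM8909_R1_clean.fastq.gz"
absolute = "/home/huangqiang/whole_genome_pipline/" + relative

print(match_output(output_pattern, relative))  # {'sample': 'DM8909'}
print(match_output(output_pattern, absolute))  # None
```

Under that assumption, the absolute input path in second_sequence_statistics is not recognised as an output of fastp_map, so the cleaned reads would only be found if they already exist on disk.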

Applying these three changes results in:

samples = ["DM8909"]

rule all:
    input:
        "temp/seqkit/second_sequence_statistics.txt"
      
rule fastp_map:
    input:
        r1 = "seq/{sample}_1.fastq.gz",
        r2 = "seq/{sample}_2.fastq.gz"
    output:
        r1_clean = "temp/fastp_output/{sample}_R1_clean.fastq.gz",
        r2_clean = "temp/fastp_output/{sample}_R2_clean.fastq.gz",
        html_report = "temp/fastp_output/{sample}_fastp.html",
        json_stats = "temp/fastp_output/{sample}_fastp.json"
    shell:
        """
        fastp \
        -i {input.r1} \
        -I {input.r2} \
        -o {output.r1_clean} \
        -O {output.r2_clean} \
        -h {output.html_report} \
        -j {output.json_stats} \
        -q 15 -u 40 -l 50 --dedup
        """


rule second_sequence_statistics:
    input:
        expand("temp/fastp_output/{sample}_{pair}_clean.fastq.gz", sample=samples, pair=["R2","R1"])
    output:
        "temp/seqkit/second_sequence_statistics.txt"
    shell:
        """
        seqkit \
        stats -aT -i {input} > {output}
        """

I don't know the syntax of seqkit and fastp, but the Snakefile above will run the following commands:

fastp -i seq/DM8909_1.fastq.gz -I seq/DM8909_2.fastq.gz -o temp/fastp_output/DM8909_R1_clean.fastq.gz -O temp/fastp_output/DM8909_R2_clean.fastq.gz -h temp/fastp_output/DM8909_fastp.html -j temp/fastp_output/DM8909_fastp.json -q 15 -u 40 -l 50 --dedup

seqkit stats -aT -i temp/fastp_output/DM8909_R2_clean.fastq.gz temp/fastp_output/DM8909_R1_clean.fastq.gz > temp/seqkit/second_sequence_statistics.txt
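One thing worth checking in the second command: as far as I can tell from the `seqkit stats` help text, `-i` is the short form of `--stdin-label` and expects a value. If that holds for your seqkit version, `-i` would take the first path (the R2 file) as its value, which would explain why only R1 is reported. The sketch below uses Python's `argparse` as a stand-in for seqkit's option parser under that assumption; it does not call seqkit itself.

```python
import argparse

# Stand-in parser that mirrors only the three options used in the command above.
# Assumption: -i takes a value (a label for stdin); -a and -T are plain switches.
parser = argparse.ArgumentParser(prog="seqkit-stats-sketch")
parser.add_argument("-a", "--all", action="store_true")
parser.add_argument("-T", "--tabular", action="store_true")
parser.add_argument("-i", "--stdin-label", default="-")
parser.add_argument("files", nargs="*")

r2 = "temp/fastp_output/DM8909_R2_clean.fastq.gz"
r1 = "temp/fastp_output/DM8909_R1_clean.fastq.gz"

# Arguments as in the generated command: -aT -i <R2> <R1>
with_i = parser.parse_args(["-aT", "-i", r2, r1])
print("stdin label:", with_i.stdin_label)  # the R2 path ends up here
print("files:      ", with_i.files)        # only the R1 path is left

# The same arguments without -i: both paths are treated as input files.
without_i = parser.parse_args(["-aT", r2, r1])
print("files:      ", without_i.files)
```

If this matches your seqkit version, dropping `-i` from the shell command (`seqkit stats -aT {input} > {output}`) should report both files.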
