I have two sets of files:

#1

Axiom_AgDivMS2_apple.r1.annot.csv
Axiom_AgDivMS2_favabean.r1.annot.csv
Axiom_AgDivMS2_gardenpea.r1.annot.csv
Axiom_AgDivMS2_pear.r1.annot.csv
Axiom_AgDivMS2_white_lupin.r1.annot.csv

#2

feverole_Axiom_AgDivMS2_apple.tag        lupin_Axiom_AgDivMS2_apple.tag        poire_Axiom_AgDivMS2_apple.tag        pois_Axiom_AgDivMS2_apple.tag        pomme_Axiom_AgDivMS2_apple.tag
feverole_Axiom_AgDivMS2_favabean.tag     lupin_Axiom_AgDivMS2_favabean.tag     poire_Axiom_AgDivMS2_favabean.tag     pois_Axiom_AgDivMS2_favabean.tag     pomme_Axiom_AgDivMS2_favabean.tag
feverole_Axiom_AgDivMS2_gardenpea.tag    lupin_Axiom_AgDivMS2_gardenpea.tag    poire_Axiom_AgDivMS2_gardenpea.tag    pois_Axiom_AgDivMS2_gardenpea.tag    pomme_Axiom_AgDivMS2_gardenpea.tag
feverole_Axiom_AgDivMS2_pear.tag         lupin_Axiom_AgDivMS2_pear.tag         poire_Axiom_AgDivMS2_pear.tag         pois_Axiom_AgDivMS2_pear.tag         pomme_Axiom_AgDivMS2_pear.tag
feverole_Axiom_AgDivMS2_white_lupin.tag  lupin_Axiom_AgDivMS2_white_lupin.tag  poire_Axiom_AgDivMS2_white_lupin.tag  pois_Axiom_AgDivMS2_white_lupin.tag  pomme_Axiom_AgDivMS2_white_lupin.tag

I need to match the files in set #2 (*_apple.tag, *_favabean.tag, *_gardenpea.tag, *_pear.tag and *_white_lupin.tag) to their corresponding file in set #1. I cannot show all the files here, but the idea is as follows:

The files matching "*_apple.tag" should only be paired with Axiom_AgDivMS2_apple.r1.annot.csv, because they share the "apple" pattern.

In the *.csv file names, the tag is the part between "Axiom_AgDivMS2_" and ".r1.annot.csv":

for i in *.csv; do
    a=${i%.r1.annot.csv}    # drop the ".r1.annot.csv" suffix
    b=${a#*_*_}             # drop the "Axiom_AgDivMS2_" prefix
    echo "$b"
done

apple
favabean
gardenpea
pear
white_lupin
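
To make the two expansions above easier to follow, here is a minimal sketch that applies them to a single name without touching any files. The helper name `extract_tag` is hypothetical and used only for illustration; it assumes the `Axiom_AgDivMS2_<tag>.r1.annot.csv` naming shown above.

```shell
# extract_tag NAME: print the tag embedded in a *.csv file name.
# (hypothetical helper; assumes names like Axiom_AgDivMS2_<tag>.r1.annot.csv)
extract_tag() (
    stem=${1%.r1.annot.csv}        # % removes the shortest matching suffix
    printf '%s\n' "${stem#*_*_}"   # # removes the shortest prefix matching *_*_
)

extract_tag Axiom_AgDivMS2_apple.r1.annot.csv         # apple
extract_tag Axiom_AgDivMS2_white_lupin.r1.annot.csv   # white_lupin
```

Because `#` removes the shortest matching prefix, only the first two underscore-delimited fields are dropped, so a tag that itself contains an underscore (white_lupin) stays intact.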

For example, for "apple", the combinations I should get are:

Axiom_AgDivMS2_apple.r1.annot.csv feverole_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv lupin_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv poire_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv pois_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv pomme_Axiom_AgDivMS2_apple.tag

At this step I generate all the combinations, not only the necessary ones:

for N in *.csv; do
    for S in *.tag; do
        echo "${N} ${S}"
    done
done

EDIT:

In this case, all the *.csv files have the same structure, Axiom_AgDivMS2_*.r1.annot.csv. The same goes for the tag files, with *_Axiom_AgDivMS2_*.tag. That structure could change for another project, but what matters is the pattern that links the *.csv and *.tag files. A *.tag file will necessarily match a *.csv file, and vice versa. All the *.tag files will also have the same number of combinations (which is the number of *.csv files).

Any help?

Asked Mar 13 at 21:49 by pedro; edited Mar 13 at 23:14.
  • 1 There's no need to use ls. for N in $(ls *csv) should just be for N in *csv – Barmar Commented Mar 13 at 22:39
  • Could you describe, textually, how you determine which part of the file name is considered the tag? It's obviously not the last underscore-delimited field, otherwise the white_lupin tag would be processed as just lupin; it doesn't appear to be a prefix match, since the *.tag files all start with different prefixes. – markp-fuso Commented Mar 13 at 22:46
  • Are you guaranteed to have at least one match between *.tag and *.csv file names? If a non-match is possible, what, if anything, should be displayed? Can a *.tag file exist without a *.csv match? Can a *.csv file exist without a *.tag match? – markp-fuso Commented Mar 13 at 22:48
  • Thanks for the interest. I made an edit and tried to be as clear as I can. I added an image which shows how the matches should work (I cannot show all the matches and the datasets). – pedro Commented Mar 13 at 23:03

1 Answer

OP's update makes a good start by extracting the 'tag' from the *.csv file names (i.e., the assignment to the b variable). We'll build on this, with a change in variable names:

for file1 in *.csv
do
    [[ ! -f "${file1}" ]] && continue             # in case there are no files ending in *.csv

    tag="${file1%.r1.annot.csv}"
    tag="${tag#*_*_}"

    for file2 in *_"${tag}".tag                   # wrap ${tag} in double quotes in case of embedded white space
    do
        [[ ! -f "${file2}" ]] && continue         # again, just in case there are no files ending in _${tag}.tag

        echo "${file1} ${file2}"
    done
done

This generates:

Axiom_AgDivMS2_apple.r1.annot.csv feverole_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv lupin_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv poire_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv pois_Axiom_AgDivMS2_apple.tag
Axiom_AgDivMS2_apple.r1.annot.csv pomme_Axiom_AgDivMS2_apple.tag

Axiom_AgDivMS2_favabean.r1.annot.csv feverole_Axiom_AgDivMS2_favabean.tag
Axiom_AgDivMS2_favabean.r1.annot.csv lupin_Axiom_AgDivMS2_favabean.tag
Axiom_AgDivMS2_favabean.r1.annot.csv poire_Axiom_AgDivMS2_favabean.tag
Axiom_AgDivMS2_favabean.r1.annot.csv pois_Axiom_AgDivMS2_favabean.tag
Axiom_AgDivMS2_favabean.r1.annot.csv pomme_Axiom_AgDivMS2_favabean.tag

Axiom_AgDivMS2_gardenpea.r1.annot.csv feverole_Axiom_AgDivMS2_gardenpea.tag
Axiom_AgDivMS2_gardenpea.r1.annot.csv lupin_Axiom_AgDivMS2_gardenpea.tag
Axiom_AgDivMS2_gardenpea.r1.annot.csv poire_Axiom_AgDivMS2_gardenpea.tag
Axiom_AgDivMS2_gardenpea.r1.annot.csv pois_Axiom_AgDivMS2_gardenpea.tag
Axiom_AgDivMS2_gardenpea.r1.annot.csv pomme_Axiom_AgDivMS2_gardenpea.tag

Axiom_AgDivMS2_pear.r1.annot.csv feverole_Axiom_AgDivMS2_pear.tag
Axiom_AgDivMS2_pear.r1.annot.csv lupin_Axiom_AgDivMS2_pear.tag
Axiom_AgDivMS2_pear.r1.annot.csv poire_Axiom_AgDivMS2_pear.tag
Axiom_AgDivMS2_pear.r1.annot.csv pois_Axiom_AgDivMS2_pear.tag
Axiom_AgDivMS2_pear.r1.annot.csv pomme_Axiom_AgDivMS2_pear.tag

Axiom_AgDivMS2_white_lupin.r1.annot.csv feverole_Axiom_AgDivMS2_white_lupin.tag
Axiom_AgDivMS2_white_lupin.r1.annot.csv lupin_Axiom_AgDivMS2_white_lupin.tag
Axiom_AgDivMS2_white_lupin.r1.annot.csv poire_Axiom_AgDivMS2_white_lupin.tag
Axiom_AgDivMS2_white_lupin.r1.annot.csv pois_Axiom_AgDivMS2_white_lupin.tag
Axiom_AgDivMS2_white_lupin.r1.annot.csv pomme_Axiom_AgDivMS2_white_lupin.tag

NOTE: blank lines manually added for readability
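
Since the question mentions that the naming structure could change for another project, the same idea can be sketched as a small function that takes the fixed prefix and suffix as arguments. This is only a sketch under assumptions: the name `pair_files` and its parameters are hypothetical, and it assumes each tag file is named `<something>_<csv name without suffix>.tag`, as in this project. It also matches on the whole stem rather than only on the tag. With `*_"${tag}".tag`, a tag such as lupin would also match the `*_white_lupin.tag` files if both tags existed; matching on the full stem avoids that.

```shell
# pair_files PREFIX SUFFIX [DIR]: print "csv tagfile" pairs, one per line.
# (hypothetical helper; PREFIX and SUFFIX are the fixed parts of the *.csv names)
pair_files() (
    prefix=$1 suffix=$2
    cd "${3:-.}" || exit 1
    for csv in "$prefix"*"$suffix"; do
        [ -f "$csv" ] || continue            # the glob matched nothing
        stem=${csv%"$suffix"}                # e.g. Axiom_AgDivMS2_apple
        for tagfile in *_"$stem".tag; do
            [ -f "$tagfile" ] || continue    # no tag file for this stem
            printf '%s %s\n' "$csv" "$tagfile"
        done
    done
)

pair_files Axiom_AgDivMS2_ .r1.annot.csv
```

The function body runs in a subshell (parentheses instead of braces), so the `cd` and the variables do not leak into the calling shell.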
