BLASTP 2.2.25+
Reference:
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics:
Alejandro A. Schäffer, L. Aravind, Thomas L. Madden, Sergei
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST
protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
15,229,318 sequences; 5,219,829,388 total letters
Query= Rv2817c
Length=338
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value
gi|167968189|ref|ZP_02550466.1| hypothetical protein MtubH3_0920... 695 0.0
gi|15609954|ref|NP_217333.1| hypothetical protein Rv2817c [Mycob... 694 0.0
gi|298526284|ref|ZP_07013693.1| conserved hypothetical protein [... 692 0.0
gi|331004038|ref|ZP_08327520.1| CRISPR-associated protein cas1 [... 223 3e-56
gi|224543481|ref|ZP_03684020.1| hypothetical protein CATMIT_0269... 210 3e-52
gi|229826471|ref|ZP_04452540.1| hypothetical protein GCWU000182_... 206 4e-51
gi|291460045|ref|ZP_06599435.1| CRISPR-associated protein Cas1 [... 204 2e-50
gi|240143668|ref|ZP_04742269.1| CRISPR-associated protein Cas1 [... 199 7e-49
gi|294782686|ref|ZP_06748012.1| CRISPR-associated protein Cas1 [... 196 7e-48
gi|253578035|ref|ZP_04855307.1| CRISPR-associated protein cas1 [... 192 5e-47
gi|340752436|ref|ZP_08689235.1| CRISPR-associated protein cas1 [... 192 7e-47
gi|315925051|ref|ZP_07921268.1| conserved hypothetical protein [... 191 1e-46
gi|121533432|ref|ZP_01665260.1| CRISPR-associated protein Cas1 [... 189 6e-46
gi|339890605|gb|EGQ79706.1| hypothetical protein HMPREF9094_1263... 186 4e-45
gi|296133514|ref|YP_003640761.1| CRISPR-associated protein Cas1 ... 186 6e-45
gi|121533442|ref|ZP_01665270.1| CRISPR-associated protein Cas1 [... 185 1e-44
gi|114567264|ref|YP_754418.1| hypothetical protein Swol_1749 [Sy... 181 1e-43
gi|237741581|ref|ZP_04572062.1| CRISPR-associated protein [Fusob... 181 2e-43
gi|258645680|ref|ZP_05733149.1| CRISPR-associated protein Cas1 [... 177 2e-42
gi|339278110|emb|CCC19858.1| CRISPR-associated protein cas1 [Str... 176 7e-42
gi|325695839|gb|EGD37730.1| hypothetical protein HMPREF9384_1727... 174 3e-41
gi|294794257|ref|ZP_06759393.1| CRISPR-associated protein Cas1 [... 170 3e-40
gi|333976353|gb|EGL77222.1| CRISPR-associated endonuclease Cas1 ... 169 4e-40
gi|238018273|ref|ZP_04598699.1| hypothetical protein VEIDISOL_00... 168 1e-39
gi|116627764|ref|YP_820383.1| hypothetical protein STER_0970 [St... 167 3e-39
gi|303231960|ref|ZP_07318668.1| CRISPR-associated endonuclease C... 166 4e-39
gi|342214546|ref|ZP_08707233.1| CRISPR-associated endonuclease C... 164 1e-38
gi|341822659|emb|CCC73583.1| CRISPR-associated endonuclease CaS1... 163 3e-38
gi|159899002|ref|YP_001545249.1| CRISPR-associated Cas1 family p... 161 2e-37
gi|291539925|emb|CBL13036.1| CRISPR-associated protein Cas1 [Ros... 159 6e-37
gi|125718075|ref|YP_001035208.1| hypothetical protein SSA_1255 [... 159 7e-37
gi|327470946|gb|EGF16402.1| CRISPR-associated protein cas1 [Stre... 158 1e-36
gi|323141545|ref|ZP_08076431.1| CRISPR-associated endonuclease C... 154 2e-35
gi|309791951|ref|ZP_07686429.1| CRISPR-associated Cas1 family pr... 154 3e-35
gi|334126733|ref|ZP_08500681.1| CRISPR-associated protein Cas1 [... 154 3e-35
gi|313894905|ref|ZP_07828465.1| CRISPR-associated endonuclease C... 154 3e-35
gi|327474433|gb|EGF19839.1| hypothetical protein HMPREF9391_0559... 152 8e-35
gi|209526394|ref|ZP_03274922.1| CRISPR-associated protein Cas1 [... 151 1e-34
gi|156741961|ref|YP_001432090.1| CRISPR-associated Cas1 family p... 151 2e-34
gi|163846146|ref|YP_001634190.1| CRISPR-associated Cas1 family p... 150 2e-34
gi|291568436|dbj|BAI90708.1| CRISPR-associated protein Cas1 [Art... 148 1e-33
gi|284052685|ref|ZP_06382895.1| hypothetical protein AplaP_14548... 148 1e-33
gi|320161859|ref|YP_004175084.1| hypothetical protein ANT_24580 ... 147 2e-33
gi|312899092|ref|ZP_07758470.1| CRISPR-associated protein Cas1 [... 144 1e-32
gi|219850296|ref|YP_002464729.1| CRISPR-associated protein Cas1 ... 144 2e-32
gi|337286709|ref|YP_004626182.1| CRISPR-associated protein Cas1 ... 142 9e-32
gi|312278318|gb|ADQ62975.1| CRISPR-associated protein, Cas1 fami... 141 1e-31
gi|292669134|ref|ZP_06602560.1| CRISPR-associated fusion protein... 140 3e-31
gi|328953000|ref|YP_004370334.1| CRISPR-associated protein Cas1 ... 139 5e-31
gi|254417359|ref|ZP_05031101.1| CRISPR-associated protein Cas1 [... 139 5e-31
>gi|167968189|ref|ZP_02550466.1| hypothetical protein MtubH3_09204 [Mycobacterium tuberculosis
H37Ra]
gi|253798098|ref|YP_003031099.1| hypothetical protein TBMG_01156 [Mycobacterium tuberculosis KZN
1435]
gi|254551879|ref|ZP_05142326.1| hypothetical protein Mtube_15707 [Mycobacterium tuberculosis
'98-R604 INH-RIF-EM']
22 more sequence titles
Length=341
Score = 695 bits (1793), Expect = 0.0, Method: Compositional matrix adjust.
Identities = 338/338 (100%), Positives = 338/338 (100%), Gaps = 0/338 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE
Sbjct 4 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 63
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR
Sbjct 64 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 123
Query 121 AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS 180
AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS
Sbjct 124 AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS 183
Query 181 TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP 240
TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP
Sbjct 184 TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP 243
Query 241 IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR 300
IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR
Sbjct 244 IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR 303
Query 301 YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA 338
YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA
Sbjct 304 YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA 341
>gi|15609954|ref|NP_217333.1| hypothetical protein Rv2817c [Mycobacterium tuberculosis H37Rv]
gi|15842358|ref|NP_337395.1| hypothetical protein MT2884 [Mycobacterium tuberculosis CDC1551]
gi|31793993|ref|NP_856486.1| hypothetical protein Mb2841c [Mycobacterium bovis AF2122/97]
41 more sequence titles
Length=338
Score = 694 bits (1790), Expect = 0.0, Method: Compositional matrix adjust.
Identities = 338/338 (100%), Positives = 338/338 (100%), Gaps = 0/338 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE
Sbjct 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR
Sbjct 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
Query 121 AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS 180
AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS
Sbjct 121 AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS 180
Query 181 TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP 240
TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP
Sbjct 181 TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP 240
Query 241 IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR 300
IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR
Sbjct 241 IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR 300
Query 301 YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA 338
YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA
Sbjct 301 YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA 338
>gi|298526284|ref|ZP_07013693.1| conserved hypothetical protein [Mycobacterium tuberculosis 94_M4241A]
gi|298496078|gb|EFI31372.1| conserved hypothetical protein [Mycobacterium tuberculosis 94_M4241A]
Length=338
Score = 692 bits (1787), Expect = 0.0, Method: Compositional matrix adjust.
Identities = 337/338 (99%), Positives = 338/338 (100%), Gaps = 0/338 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE
Sbjct 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
RDIQLFTTDGHYQGRISTPD+SYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR
Sbjct 61 RDIQLFTTDGHYQGRISTPDLSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
Query 121 AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS 180
AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS
Sbjct 121 AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS 180
Query 181 TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP 240
TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP
Sbjct 181 TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP 240
Query 241 IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR 300
IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR
Sbjct 241 IIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHR 300
Query 301 YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA 338
YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA
Sbjct 301 YTFQYALDLQLQSLVRVIEAGHPSRLVDIDITSEPSGA 338
>gi|331004038|ref|ZP_08327520.1| CRISPR-associated protein cas1 [Lachnospiraceae oral taxon 107
str. F0167]
gi|330411624|gb|EGG91032.1| CRISPR-associated protein cas1 [Lachnospiraceae oral taxon 107
str. F0167]
Length=334
Score = 223 bits (568), Expect = 3e-56, Method: Compositional matrix adjust.
Identities = 110/323 (35%), Positives = 190/323 (59%), Gaps = 2/323 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+++ S+I+ + G++I+ + + +P ETL+ I +FG +MT P L++
Sbjct 1 MACLYITEQGSKITTSAGKIIIECRDGTKKSFPKETLESIMIFGNSSMTVPVKKFCLEKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ +T G Y GR+++ YA RL++QV+ +D + CL +K+I + KI NQ+ +++
Sbjct 61 IKVTFLSTKGKYFGRLASTSHFYAERLKKQVYLSDSNSDCLEFAKKIQAAKIHNQRIILK 120
Query 121 AHT--SGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
+ S +D+ E + + + + S+ E+ G+EG AA+ YF AL ++ +EFAF G
Sbjct 121 RYEKHSDKDIKEELDRISIYENEISQCKSVDEVLGYEGMAAREYFKALSKIIREEFAFDG 180
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
RS +PPLDAFNSM+S GY++++ I +E L+ YIGF+H+ R H TL SD++E WR
Sbjct 181 RSRQPPLDAFNSMISFGYTIVFYEIFAEVESRDLSPYIGFIHKIKRNHPTLVSDMLEEWR 240
Query 239 APIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDP 298
A ++D T+L LI + FS + +TGAV+ + A + R ++ + Y++
Sbjct 241 ALLVDSTILSLIQGNEISIYEFSHDEETGAVYLSDNAIKICVRKIEEKMRKEMNYLEYLD 300
Query 299 HRYTFQYALDLQLQSLVRVIEAG 321
+F+ A+ Q++SL I+ G
Sbjct 301 SPVSFRRAIWWQIKSLAGCIDNG 323
>gi|224543481|ref|ZP_03684020.1| hypothetical protein CATMIT_02690 [Catenibacterium mitsuokai
DSM 15897]
gi|224523608|gb|EEF92713.1| hypothetical protein CATMIT_02690 [Catenibacterium mitsuokai
DSM 15897]
Length=332
Score = 210 bits (534), Expect = 3e-52, Method: Compositional matrix adjust.
Identities = 121/320 (38%), Positives = 177/320 (56%), Gaps = 4/320 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+ S I D R+ V ++ P+E++DGITL G +TT I E LK+
Sbjct 1 MSILYIDKSDCVIGKQDNRITVKYKDGMFRTIPVESIDGITLLGHAQVTTQCIQECLKKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ F+ GHY GR+S+ A R+Q D P F L LSK+I+ KI NQ ++R
Sbjct 61 ISLSYFSKGGHYFGRLSSTGHIKASLQRKQAGLYDQP-FALELSKKIIYAKIHNQIVVLR 119
Query 121 AHT--SGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
++ + ++++ MK +L V+ ++ EL G+EG+AA+ YF L + + F F+G
Sbjct 120 RYSRSTNHNISDIELHMKSALRKVNYVKNIEELMGYEGSAARYYFKGLSMCIDEAFKFEG 179
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
RS RPP DAFNSM+SLGYS+L + G IE LNAY GFLH+D+ H TLASD+ME WR
Sbjct 180 RSKRPPHDAFNSMLSLGYSILMNELYGEIEIKGLNAYFGFLHRDAEKHPTLASDMMEEWR 239
Query 239 APIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDP 298
A +ID TV+ +I V F + G V+ ++A + N+ Y+
Sbjct 240 AVLIDSTVMSMINGHEVHIDEFYSDDGEG-VYIFKQALNKFIKKLENKFQICQKYLDYID 298
Query 299 HRYTFQYALDLQLQSLVRVI 318
+ +F+ A+ Q+ SLV I
Sbjct 299 YPVSFRSAISFQMSSLVDAI 318
>gi|229826471|ref|ZP_04452540.1| hypothetical protein GCWU000182_01844 [Abiotrophia defectiva
ATCC 49176]
gi|229789341|gb|EEP25455.1| hypothetical protein GCWU000182_01844 [Abiotrophia defectiva
ATCC 49176]
Length=340
Score = 206 bits (524), Expect = 4e-51, Method: Compositional matrix adjust.
Identities = 111/324 (35%), Positives = 181/324 (56%), Gaps = 2/324 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV + S+I G+ I+ ++ P E L+ I++FG +TT I L++
Sbjct 7 MSCLYVVEQGSKIKHIGGQFILEVKDGENRVVPDEILESISIFGNSVLTTQAIKACLEKN 66
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
++ +T G Y G++ + + RL+ Q + +D+ CL +K I+ KI NQ ++R
Sbjct 67 INVSFLSTKGRYFGKLMSNTATNPDRLKAQAYLSDNIDECLKFAKIILKAKINNQDVILR 126
Query 121 --AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
A +S D++ I+ +K +++ + ++ G+EG AA+ YF AL L+ EF F G
Sbjct 127 RYAKSSEADISSHIKDLKIYEEHIEKGKDINKIMGYEGIAARTYFEALSKLIKPEFKFSG 186
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
R+ RPP DAFNSM+SLGYSL+Y I IE +L+ YIGF+H+ H L SDL+E WR
Sbjct 187 RNKRPPKDAFNSMLSLGYSLIYNEIFSEIENRNLSPYIGFIHKLKDRHPALVSDLIEEWR 246
Query 239 APIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDP 298
A ++D T++ LI + F+K+ + AVF + A + I R N++ Y++
Sbjct 247 AVLVDATMMSLIQGNEILIEEFTKDEYSEAVFISDLAVKQIVRKIENKLRSQNNYLEYLN 306
Query 299 HRYTFQYALDLQLQSLVRVIEAGH 322
+F+ A+ Q++SL IE+G+
Sbjct 307 EPISFRKAIWWQVKSLASCIESGN 330
>gi|291460045|ref|ZP_06599435.1| CRISPR-associated protein Cas1 [Oribacterium sp. oral taxon 078
str. F0262]
gi|291417386|gb|EFE91105.1| CRISPR-associated protein Cas1 [Oribacterium sp. oral taxon 078
str. F0262]
Length=332
Score = 204 bits (518), Expect = 2e-50, Method: Compositional matrix adjust.
Identities = 119/326 (37%), Positives = 177/326 (55%), Gaps = 6/326 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV + + I R V ++ P ETL+ I +FG+ +TT + E LKR
Sbjct 1 MSYLYVCEQGAVIGCEANRFQVCYKDGMLKSVPGETLEVIEIFGKVQLTTQCMTECLKRG 60
Query 61 RDIQLFTTDGHYQGR-ISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ ++++G Y GR IST V+ A RQ+ F L++RI+ KI NQ ++
Sbjct 61 ITVLFYSSNGAYYGRLISTNHVNVA---RQRSQAALKEEFKAGLARRIIRAKIRNQTVIL 117
Query 120 RAHTSGQDVA--ESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
R + Q A ++ M +D +G ++ G+EG AA+ YF ALG LV +EF F+
Sbjct 118 RRYARKQAAAVEGTVSEMLRLAEKLDWTGDTEKIMGYEGMAARVYFAALGGLVDREFCFK 177
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
GRS RPP D FNS++SLGYS+L I G +E LN Y G LH+D H TLASDLME W
Sbjct 178 GRSKRPPKDPFNSLISLGYSILLGEIYGKLEGKGLNPYFGVLHKDREKHPTLASDLMEEW 237
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RA ++D T + ++ + F ++ +TGAV ++A + R ++ Y+
Sbjct 238 RAVLVDSTAMSILNGHELHREDFFRDEETGAVLLVKDAFKEYLRKLEAKLHTDMKYLSYV 297
Query 298 PHRYTFQYALDLQLQSLVRVIEAGHP 323
+R F+ ALDLQ+ L + IE+ +P
Sbjct 298 DYRVNFRSALDLQVDRLAKAIESENP 323
>gi|240143668|ref|ZP_04742269.1| CRISPR-associated protein Cas1 [Roseburia intestinalis L1-82]
gi|257204345|gb|EEV02630.1| CRISPR-associated protein Cas1 [Roseburia intestinalis L1-82]
Length=334
Score = 199 bits (505), Expect = 7e-49, Method: Compositional matrix adjust.
Identities = 117/322 (37%), Positives = 172/322 (54%), Gaps = 4/322 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYVS+ + I R V ++ P ETL+ I +FG +TT + E LKR
Sbjct 1 MSYLYVSEQGASIGIEANRFQVNYKDGMIKSIPAETLEMIEVFGSVQITTRCLTECLKRG 60
Query 61 RDIQLFTTDGHYQGR-ISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+I ++T G Y GR IST V+ R R Q + F L +SKRI+ KI NQ ++
Sbjct 61 VNILFYSTSGAYYGRLISTSHVN-VQRQRIQAEIGHNETFKLEMSKRIIDAKIRNQVVVL 119
Query 120 RAHTSG--QDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
R + G +D+ I M++ + + S+ ++ G+EG AAK YF LG L+ ++F F+
Sbjct 120 RRYARGRDEDIHRMIIEMQNMQKKLLYAKSVEQVMGYEGTAAKIYFKVLGKLIDEQFVFE 179
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
GRS RPP+D FNS++SLGYS++ + G IE LN Y G +H+D H TLASDLME W
Sbjct 180 GRSRRPPMDPFNSLISLGYSIILNELYGKIEGKGLNPYFGVMHKDREKHPTLASDLMEEW 239
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RA +ID T L ++ + F D VF ++ R + + Y+
Sbjct 240 RAVLIDTTALSMLNGHELVKEDFYTGIDQPGVFLEKDGFRKYIQKLEGKFRTENRYLSYI 299
Query 298 PHRYTFQYALDLQLQSLVRVIE 319
+ +F+ A+DLQ+ V+ IE
Sbjct 300 DYSVSFRRAMDLQVNQFVKAIE 321
>gi|294782686|ref|ZP_06748012.1| CRISPR-associated protein Cas1 [Fusobacterium sp. 1_1_41FAA]
gi|294481327|gb|EFG29102.1| CRISPR-associated protein Cas1 [Fusobacterium sp. 1_1_41FAA]
Length=335
Score = 196 bits (497), Expect = 7e-48, Method: Compositional matrix adjust.
Identities = 97/325 (30%), Positives = 176/325 (55%), Gaps = 3/325 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+ + + + + R+++ PIE +D + +FG ++T I +L +
Sbjct 1 MSNLYIYEQGIVLRYKENRLLITYANDDSKSIPIENIDNVVIFGGIQLSTSCIHNLLAKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQA-LI 119
+ + +G Y GR+ + R R+Q ++DD FCL ++K+ + K NQ+ LI
Sbjct 61 IHVTFLSKNGSYFGRLESTSNINIDRQREQFRKSDDKEFCLEIAKKFIKGKGTNQRTILI 120
Query 120 RAHTSGQD--VAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
RA+ ++ +A +I TM + ++ + ++ EL G EG AK YF AL H++ ++++F+
Sbjct 121 RANKELKNEVLATTITTMFGIIKDINDTKTIEELMGIEGYLAKLYFNALNHIIDKKYSFK 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D FN+++S GY+LL+ I + LN Y FLH D H L SDLME W
Sbjct 181 TRTKRPPKDPFNAVISFGYTLLHYEIFTILVTKGLNPYAAFLHSDRHKHPALCSDLMEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
R+ ++D + L+ + + F + ++G VF ++A F R+ + +YIK
Sbjct 241 RSILVDSLAIALLNNNKIAYEDFDFDEESGGVFLNKKACEKFVEQFEKRLRQEVSYIKEV 300
Query 298 PHRYTFQYALDLQLQSLVRVIEAGH 322
P++ +F+ ++ Q+ L++ +EA +
Sbjct 301 PYKMSFRRIIEYQVMLLIKALEANN 325
>gi|253578035|ref|ZP_04855307.1| CRISPR-associated protein cas1 [Ruminococcus sp. 5_1_39B_FAA]
gi|251850353|gb|EES78311.1| CRISPR-associated protein cas1 [Ruminococcus sp. 5_1_39BFAA]
Length=333
Score = 192 bits (489), Expect = 5e-47, Method: Compositional matrix adjust.
Identities = 113/327 (35%), Positives = 170/327 (52%), Gaps = 5/327 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+DS + I V ++ + PIE+LDGIT+ G+ MTT E ++R
Sbjct 1 MSLLYVNDSGATIGIEGNCCTVKQKDGSKRMLPIESLDGITIMGQSQMTTQCAEECMQRG 60
Query 61 RDIQLFTTDGHYQGR-ISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ F+ G Y GR IST V+ R R+Q D F + L+ +I+S KI NQ ++
Sbjct 61 IPVSYFSKGGKYFGRLISTGHVN-VERQRKQCALYD-TGFAVELAMKILSAKIKNQSVVL 118
Query 120 RAH--TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
R + + G ++ E + + V + E+ GFEG AAK YF L + + F FQ
Sbjct 119 RRYEKSKGLNLEEEQKMLAICRNKVLTCDRIEEMIGFEGQAAKYYFKGLSACIDENFTFQ 178
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
GR+ RPP D FNSM+SLGYS+L + +E LN Y GF+H+D+ H TLASD++E W
Sbjct 179 GRNRRPPRDEFNSMISLGYSILMNEVYCKVEMKGLNPYFGFIHRDAEKHPTLASDMIEEW 238
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RA I+D T + +I + F N D + T++ + + Y+K
Sbjct 239 RAIIVDATAMSMINGHEILKDHFYFNMDEPGCYITKDGLKLYLNKLERKFQTEVRYLKYV 298
Query 298 PHRYTFQYALDLQLQSLVRVIEAGHPS 324
+ +F+ + LQ++ L + IE G S
Sbjct 299 DYAVSFRRGIFLQMEHLAKAIEEGDAS 325
>gi|340752436|ref|ZP_08689235.1| CRISPR-associated protein cas1 [Fusobacterium sp. 2_1_31]
gi|229422235|gb|EEO37282.1| CRISPR-associated protein cas1 [Fusobacterium sp. 2_1_31]
Length=335
Score = 192 bits (488), Expect = 7e-47, Method: Compositional matrix adjust.
Identities = 96/324 (30%), Positives = 172/324 (54%), Gaps = 3/324 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+ + + + + R+++ PIE +D I +FG ++T + +L +
Sbjct 1 MSNLYIYEQGIVLRYKENRLLITYTNDDSKSIPIENIDNIVIFGGIQLSTTCMHNLLAKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQA-LI 119
+ + +G Y GR+ + R R+Q ++DD FCL ++K+ + K NQ+ LI
Sbjct 61 IHVTFLSKNGSYFGRLESTSNINIDRQREQFRKSDDKEFCLEIAKKFIKGKATNQRTILI 120
Query 120 RAHTSGQD--VAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
RA+ ++ ++ +I TM + ++ + ++ EL G EG AK YF AL ++ ++++F+
Sbjct 121 RANKELKNDVLSSTITTMFGIIKDINNAKTIEELMGVEGYLAKLYFNALNQIIDKKYSFK 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D FN+++S GY+LL+ I + LN Y FLH D H L SDLME W
Sbjct 181 TRTKRPPKDPFNAVISFGYTLLHYEIFTILVTKGLNPYAAFLHSDRHKHPALCSDLMEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RA ++D + L+ + + F + +G VF ++A F R+ + +YIK
Sbjct 241 RAILVDSLAIALLNNNKITYEDFDFDEKSGGVFLNKKACEKFVEQFEKRLRQEVSYIKEV 300
Query 298 PHRYTFQYALDLQLQSLVRVIEAG 321
P++ +F+ ++ Q+ L++ EA
Sbjct 301 PYKMSFRRIIEYQVMLLIKAFEAN 324
>gi|315925051|ref|ZP_07921268.1| conserved hypothetical protein [Pseudoramibacter alactolyticus
ATCC 23263]
gi|315621950|gb|EFV01914.1| conserved hypothetical protein [Pseudoramibacter alactolyticus
ATCC 23263]
Length=332
Score = 191 bits (485), Expect = 1e-46, Method: Compositional matrix adjust.
Identities = 109/324 (34%), Positives = 175/324 (55%), Gaps = 8/324 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYVS++ + I R++V +++ P+ETL+GITL +TT + +K+
Sbjct 1 MSLLYVSENGAVIQSQANRIVVNTKDGVSRSIPVETLEGITLLAPAQLTTQCMETCMKKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQ--VHRTDDPAFCLSLSKRIVSRKILNQQAL 118
D+ F+ GHY GR++ A R+Q ++ +D F + LS+R+V+ K+ NQ +
Sbjct 61 IDVVFFSKGGHYFGRLTATGYQKAGLQRRQAKLYHSD---FAIDLSRRMVAAKLNNQAVM 117
Query 119 IR--AHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAF 176
+R A DV + + ++ + ++ E+ G+EG+ AK+YF L V +FAF
Sbjct 118 LRRYARNHAVDVNQEVMHIQWAKERAASERAIPEIMGYEGHGAKSYFKGLSRCVDSDFAF 177
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
GRS RPP D FN+++SLGY++L + G IE H LN Y GF+HQD+ H TLASDLME
Sbjct 178 SGRSRRPPRDPFNALISLGYAILLNELYGEIESHGLNPYFGFMHQDAEKHPTLASDLMEE 237
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKG 296
WRA ++D + L+ + F ++ D G + T++ ++ + Y+
Sbjct 238 WRAVLVDSLAMSLLNGHELAVGDF-ESDDAGGCYLTKKGLAVFLNKMEKKMQTSIRYLHY 296
Query 297 DPHRYTFQYALDLQLQSLVRVIEA 320
+ TF+ AL Q L + IEA
Sbjct 297 LDYPVTFRRALGFQAGRLAKAIEA 320
>gi|121533432|ref|ZP_01665260.1| CRISPR-associated protein Cas1 [Thermosinus carboxydivorans Nor1]
gi|121307991|gb|EAX48905.1| CRISPR-associated protein Cas1 [Thermosinus carboxydivorans Nor1]
Length=332
Score = 189 bits (480), Expect = 6e-46, Method: Compositional matrix adjust.
Identities = 114/326 (35%), Positives = 165/326 (51%), Gaps = 6/326 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+D+ S I GR +V + + P+E LD + LFG ++ I E LKR
Sbjct 1 MRTLYVTDAGSHIQKNAGRFLVCKGDTILREIPLELLDNVVLFGSIQVSAKTITEFLKRG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ + G + GR+ + RQQ+ D P FCL +++ I+ KI N ++R
Sbjct 61 ITLTWLSKTGEFYGRLESTRHIDIFLHRQQIRMGDRPDFCLKIAQAIIDAKIANCMTILR 120
Query 121 AH---TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ + +VA+ I M + + L G EG+AA+ YFTAL LVP +FAF+
Sbjct 121 RYQRTANSPEVADHIHAMGIIAEKIPNVDKIETLLGLEGSAARHYFTALACLVPDDFAFK 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
GR+ +PP D FNS++S GY+LL + ++ L+ Y GFLH+D +GH TL SDLME W
Sbjct 181 GRNKQPPKDPFNSLLSFGYTLLMYDFYTIVQNAGLHPYAGFLHKDRQGHPTLVSDLMEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
R IID V+ LI + F G V+ REA+ A+ R+ R Y
Sbjct 241 RPSIIDSLVMSLIHRREIQPLDFLPPDKNGGVYLNREASAEFIAAYEKRMTRLNKY---G 297
Query 298 PHRYTFQYALDLQLQSLVRVIEAGHP 323
TF+ L Q + L + IE P
Sbjct 298 GKELTFRQLLARQAKLLSQAIENEDP 323
>gi|339890605|gb|EGQ79706.1| hypothetical protein HMPREF9094_1263 [Fusobacterium nucleatum
subsp. animalis ATCC 51191]
Length=335
Score = 186 bits (473), Expect = 4e-45, Method: Compositional matrix adjust.
Identities = 96/325 (30%), Positives = 169/325 (52%), Gaps = 3/325 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+ + + + + R+++ PIE +D + +FG ++T I + +
Sbjct 1 MSNLYIYEQGIVLRYKENRLLITYTNGDYKSIPIENVDNVVIFGGIQLSTACIHNLPTKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQAL-I 119
+ + G Y GR+ + R R+Q ++DD FCL+L K+ + K NQ+ L I
Sbjct 61 IHVTFLSKTGSYFGRLESTSNINIDRQREQFRKSDDKEFCLALGKKFIKGKATNQRTLLI 120
Query 120 RAHTSGQDVAES--IRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
RA+ ++ S I TM + ++ S ++ EL G EG AK YFT + H++ +++ F+
Sbjct 121 RANKDLKNTTLSNIIATMFGIIKNINDSKTIEELMGVEGYLAKVYFTGINHIIDKKYNFK 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D FN+++S GY+LL+ I + LN Y FLH D H L SDLME W
Sbjct 181 TRTKRPPKDPFNAVISFGYTLLHYEIFTTLVTKGLNPYAAFLHSDRHKHPALCSDLMEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RA ++D + L+ + + F + +G VF ++A F R+ + +YI
Sbjct 241 RAILVDSMAIALLNNNKIAYEDFDFDEKSGGVFLNKKACGKFVEQFEKRLRQEVSYITEV 300
Query 298 PHRYTFQYALDLQLQSLVRVIEAGH 322
P++ +F+ ++ Q+ L++ +E+ +
Sbjct 301 PYKMSFRRIVEYQIMLLIKALESNN 325
>gi|296133514|ref|YP_003640761.1| CRISPR-associated protein Cas1 [Thermincola sp. JR]
gi|296032092|gb|ADG82860.1| CRISPR-associated protein Cas1 [Thermincola potens JR]
Length=335
Score = 186 bits (471), Expect = 6e-45, Method: Compositional matrix adjust.
Identities = 113/322 (36%), Positives = 171/322 (54%), Gaps = 3/322 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV + +R+ DGR+ ++ E P+E L+G+ L G MT VE+L++
Sbjct 1 MSFLYVCEPDTRVRIKDGRITAEQKDGMEVSIPLELLEGVVLMGSAQMTAACSVELLEKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ + G + GR+ + R R+Q DD FC + IV+ KI NQ ++R
Sbjct 61 IPVTFLSRSGFFYGRLESTRHVNILRQRKQFRAGDDEEFCFKFTCMIVAAKIHNQAVILR 120
Query 121 A---HTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
H + V E I M+ + R+GS+A++ GFEG A+K YF AL +V + FAF
Sbjct 121 RYNRHVNSPAVDECISRMQLLEENIARAGSIAQVMGFEGAASKHYFKALSLMVDRRFAFS 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
GR+ PPLD FNS++SLGY+LL A+ L+ Y G +H+D +GH LASDLME W
Sbjct 181 GRNRMPPLDPFNSLLSLGYTLLLYETYTAVVNKGLHPYAGLMHRDRQGHPALASDLMEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
R I+D V+ ++ G++ F K+ +GAV +A R + F +I A Y+
Sbjct 241 RPVIVDSLVMSIVQGGILAPGDFYKDEASGAVLLKNDALRKFIKHFEQKIRSEANYLSYL 300
Query 298 PHRYTFQYALDLQLQSLVRVIE 319
+R +++ A+ Q L IE
Sbjct 301 DYRVSYRRAVQHQAGVLANCIE 322
>gi|121533442|ref|ZP_01665270.1| CRISPR-associated protein Cas1 [Thermosinus carboxydivorans Nor1]
gi|121308001|gb|EAX48915.1| CRISPR-associated protein Cas1 [Thermosinus carboxydivorans Nor1]
Length=332
Score = 185 bits (469), Expect = 1e-44, Method: Compositional matrix adjust.
Identities = 104/296 (36%), Positives = 156/296 (53%), Gaps = 3/296 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+D+ S + + GR +V E P E LD + LFG +T I E LKR
Sbjct 1 MRSLYVTDAGSHLQKSGGRFLVCKGEQILHAIPAEQLDNVVLFGSVQVTAKTITEFLKRG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ ++ G + GR+ + + RQQ + FCL L+K I+ KI N ++R
Sbjct 61 ITLTWLSSAGEFYGRLESTRHVDIHKQRQQFKMGERFDFCLKLAKSIIGAKIANCLTILR 120
Query 121 AH---TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ ++VA I +K L +D + ++ ++ G EG AA+ YF AL HLVP +F F
Sbjct 121 RYQRTAQKEEVAHYIEVIKVYLDRIDSAETIEKVLGLEGIAARNYFQALSHLVPDDFHFS 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
GR+ +PP D FNS++S GY+LL ++ ++ L+ Y G +H+D +GH TL SDLME W
Sbjct 181 GRNRQPPKDPFNSLLSFGYTLLMYDLYTIVQNAGLHPYAGLIHKDRQGHPTLVSDLMEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATY 293
R IID V+ +I + F + G V+ REA S A+ R+ + Y
Sbjct 241 RPTIIDALVMSVIQRREIQPCDFLPPDEKGGVYLCREAAASFIAAYEKRLTKLNKY 296
>gi|114567264|ref|YP_754418.1| hypothetical protein Swol_1749 [Syntrophomonas wolfei subsp.
wolfei str. Goettingen]
gi|114338199|gb|ABI69047.1| CRISPR-associated protein, Cas1 family [Syntrophomonas wolfei
subsp. wolfei str. Goettingen]
Length=336
Score = 181 bits (460), Expect = 1e-43, Method: Compositional matrix adjust.
Identities = 100/328 (31%), Positives = 175/328 (54%), Gaps = 4/328 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQ-YPIETLDGITLFGRPTMTTPFIVEMLKR 59
M LYV + ++I + V+V S++ + PIE ++ + +FG ++++ + + ++R
Sbjct 1 MSFLYVYERSAKIGVQENCVVVESKKENLKRILPIEGVENVIIFGDASLSSNCVKQFMER 60
Query 60 ERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ ++ ++ G + GR+ + R R+Q +D FCL+L+KRI+ K+ NQ ++
Sbjct 61 DINLTWLSSRGKFYGRLESTRNVNIYRQRKQFACGEDDEFCLALAKRIILAKVKNQITIL 120
Query 120 RAHTSG---QDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAF 176
R + + V + I M L ++R + EL G EG AA+ Y+ L LV +FAF
Sbjct 121 RRYRRNRPEKSVQKIIDAMAKLLPIMERVHNKDELMGHEGMAARYYYQGLAELVEPDFAF 180
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
GR+ +PP D FNS++S Y+LL ++ A LN Y FLH RGH L SDLME
Sbjct 181 SGRNRQPPRDPFNSLLSFAYTLLMYDLYTAAVNRGLNPYASFLHSIRRGHPALCSDLMEE 240
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKG 296
WRA + D L + + G++ F K ++ G V+ +++ + ++ + Y+
Sbjct 241 WRAILADSLALYVTSKGIIKRENFEKPNEEGGVYLDGIGSKAFIAEYEKKVRGRSNYLAY 300
Query 297 DPHRYTFQYALDLQLQSLVRVIEAGHPS 324
+ +F+ A+++Q Q L + IE G PS
Sbjct 301 VDYSVSFRRAMEMQCQRLAKAIEEGDPS 328
>gi|237741581|ref|ZP_04572062.1| CRISPR-associated protein [Fusobacterium sp. 4_1_13]
gi|229429229|gb|EEO39441.1| CRISPR-associated protein [Fusobacterium sp. 4_1_13]
Length=335
Score = 181 bits (458), Expect = 2e-43, Method: Compositional matrix adjust.
Identities = 91/325 (28%), Positives = 170/325 (53%), Gaps = 3/325 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+ + + + + R+++ PIE +D + +FG ++T + +L +
Sbjct 1 MSNLYIYEQGIVLRYKENRLLITYTNGDYKSIPIENIDNVVIFGGIQLSTACMHNLLIKG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQAL-I 119
+ + G Y GR+ + R R+Q ++DD FCL++ K+ + K NQ+ L I
Sbjct 61 IHVTFLSKTGSYFGRLESTSNINIDRQREQFRKSDDKKFCLAIGKKFIKGKATNQRTLLI 120
Query 120 RAHT--SGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
RA+ + ++ I +M + ++ S ++ EL G EG A+ YF A+ H++ ++++F+
Sbjct 121 RANKDLKSEILSSVINSMFGIIKDINDSKTIEELMGVEGYLARLYFNAINHIIDKKYSFK 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D FN+++S GY+LL+ I + LN Y FLH D H L SDLME W
Sbjct 181 TRTKRPPKDPFNAVISFGYTLLHYEIFTTLVTKGLNPYAAFLHSDRHKHPALCSDLMEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RA ++D + L+ + + F+ + +G VF ++A F R+ + +YI
Sbjct 241 RAILVDSMAIALLNNNKIAYEDFNFDEKSGGVFLNKKACGKFVEQFEKRLRQEVSYITEV 300
Query 298 PHRYTFQYALDLQLQSLVRVIEAGH 322
++ +F+ ++ Q+ L++ +E +
Sbjct 301 SYKMSFRRIIEYQVMLLIKALENNN 325
>gi|258645680|ref|ZP_05733149.1| CRISPR-associated protein Cas1 [Dialister invisus DSM 15470]
gi|260403048|gb|EEW96595.1| CRISPR-associated protein Cas1 [Dialister invisus DSM 15470]
Length=331
Score = 177 bits (449), Expect = 2e-42, Method: Compositional matrix adjust.
Identities = 98/316 (32%), Positives = 171/316 (55%), Gaps = 7/316 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M +YV++ ++++ GR ++ E + P ++G+TLF +++ IV+ L+R
Sbjct 1 MSWIYVTEPGAKLNRQGGRYVISRENETICEVPSAVVEGVTLFDSIQISSSVIVDFLERN 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ ++ G + GR+ + D R ++Q D FCL+L+KR+V K+ NQ+ ++R
Sbjct 61 IPLTWISSTGRFFGRLESTDHQNVLRQKEQFDALADKDFCLALAKRVVFGKVYNQRTILR 120
Query 121 AHTSGQD--VAESIRTMKHSLA-WVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ + E +R+ LA + + S+ E+ G+EG A+ YF A+GH++P+EF F+
Sbjct 121 NYNRRAEDPFIEKVRSDIRILADKLHMAHSVEEVMGYEGMMARIYFQAIGHILPEEFRFE 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D FNS++S GY+LL + AI L+ YIGFLH GH LASDLME W
Sbjct 181 KRTKRPPRDYFNSLLSFGYTLLMYDFYSAIVNCGLHPYIGFLHALRNGHPALASDLMEPW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
R ++D L L+ + F K + G ++ R R +A+ ++ Y +G
Sbjct 241 RPAVVDAFCLSLVTHREISKDYFVK-GENGGIYLNRIGRRIFLQAYERKMRTVNRYFQGT 299
Query 298 PHRYTFQYALDLQLQS 313
Y++++ + ++ S
Sbjct 300 ---YSWRHTIQMECDS 312
>gi|339278110|emb|CCC19858.1| CRISPR-associated protein cas1 [Streptococcus thermophilus JIM
8232]
Length=334
Score = 176 bits (445), Expect = 7e-42, Method: Compositional matrix adjust.
Identities = 108/326 (34%), Positives = 171/326 (53%), Gaps = 4/326 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELG-ESQYPIETLDGITLFGRPTMTTPFIVEMLKR 59
M LY+ S +S ++ R+I+ ++ + I +D + LFG +TT I + K
Sbjct 1 MSDLYIQRSNYYLSLSEQRIIIKNDNKEIVKEVSISLVDNVLLFGNAQLTTQLIKALSKN 60
Query 60 ERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ ++ F+ G + I T + Q + F L +++ I + K+ +Q AL+
Sbjct 61 KVNVYYFSNVGQFISSIETHRQDEFQKQELQAKAYFEEDFRLEVARSIATTKVRHQIALL 120
Query 120 RAH-TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
R T G + S+ + ++ S+ E+ G+EG AK+YF L LVP +F F G
Sbjct 121 REFDTDGLLDTSDYSRFEDSVNDIQKAYSITEIMGYEGRLAKSYFYYLNLLVPDDFHFNG 180
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
RS RP D FNS ++ GYS+LY ++G I+++ L+ G +H+ + HATLASDLME WR
Sbjct 181 RSRRPAEDCFNSALNFGYSILYSCLMGLIKKNGLSLGFGVIHKHHQHHATLASDLMEEWR 240
Query 239 APIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDP 298
I+D+T++ LI +G + F +N D + T E ARA +RI YI+ D
Sbjct 241 PIIVDNTLMELIRNGKLLLSHF-ENKDQDFIL-TDEGREIFARALRSRILEVHQYIELDK 298
Query 299 HRYTFQYALDLQLQSLVRVIEAGHPS 324
RY+F Y D Q++SL+R PS
Sbjct 299 KRYSFLYTADRQIKSLIRAFRELDPS 324
>gi|325695839|gb|EGD37730.1| hypothetical protein HMPREF9384_1727 [Streptococcus sanguinis
SK160]
Length=308
Score = 174 bits (440), Expect = 3e-41, Method: Compositional matrix adjust.
Identities = 100/310 (33%), Positives = 165/310 (54%), Gaps = 4/310 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQ-YPIETLDGITLFGRPTMTTPFIVEMLKR 59
M Y+ +S +S +D ++++ +++ + + +D I +FG ++T + + +
Sbjct 1 MAFFYIQNSSYSLSISDRKLMIKNQDRTMLKAISLGLIDNILIFGNSQLSTQLLKSLSRH 60
Query 60 ERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ F++ G + + + + + R+Q + D +FCL +S+RI S KI+NQ L+
Sbjct 61 GIPVFYFSSKGEFLFSMDSFKEADYEKQREQAQSSFDKSFCLKMSQRIASAKIMNQLNLL 120
Query 120 RAHTS-GQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
+A+ + G E + K + + + S++E+ G EG AK+YF L LV ++F F
Sbjct 121 KAYDAQGIFDEEDFKRFKAACESLKSAKSISEIMGIEGRIAKSYFYYLNLLVEEDFQFYC 180
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
R+ RP LD FN++++ GYS+LY IG I ++ L+A G HQ HA LASDLME WR
Sbjct 181 RNRRPSLDRFNALLNFGYSILYSCFIGLIRKNGLSAGFGVTHQPHTHHAVLASDLMEEWR 240
Query 239 APIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDP 298
I+DDTV+ LI G + F K D + T E +R RI Y++ D
Sbjct 241 PVIVDDTVMSLIKHGDIRGEHFEKMGDE--MHLTSEGIEVFSRTMRERILEIHHYVELDK 298
Query 299 HRYTFQYALD 308
+RYTF Y D
Sbjct 299 NRYTFLYMAD 308
>gi|294794257|ref|ZP_06759393.1| CRISPR-associated protein Cas1 [Veillonella sp. 3_1_44]
gi|294454587|gb|EFG22960.1| CRISPR-associated protein Cas1 [Veillonella sp. 3_1_44]
Length=331
Score = 170 bits (431), Expect = 3e-40, Method: Compositional matrix adjust.
Identities = 95/298 (32%), Positives = 161/298 (55%), Gaps = 6/298 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+++ S I G V+V + P+E ++ IT+F ++T+ + + ++R
Sbjct 1 MSSLYVTEAGSFIKRDGGHVVVGRNNEVLFEVPLERIEDITVFDSVSITSSLVTDFIERG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
I + G Y G I + + ++Q DD AF +++S++I+ K+ NQ ++R
Sbjct 61 VPITWLSGYGKYFGTIINTNTIDINKHKKQFDLLDDNAFRVAMSRKIIRAKVRNQLTILR 120
Query 121 AHTSGQD----VAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAF 176
+ + + I +K + + ++EL G+EG ++ YF ALG +VP FAF
Sbjct 121 RYARNLEEDINIDVQIANIKSVRSHIGECMRVSELMGYEGLISRLYFEALGKIVPSAFAF 180
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
R+ +PP D FN+M+ LGYS+L+ I+ + L+ ++G +H ++GH LASDL+E
Sbjct 181 TKRTKQPPRDPFNAMLGLGYSMLFNEILAGVINAGLHPFVGVMHSLAKGHPALASDLIEE 240
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYI 294
WRAPIID VL +++ +VD F NSD G + T E ++ A+ +I YI
Sbjct 241 WRAPIIDSMVLSMVSRNMVDLSEFD-NSDKGC-YLTAEGRKAFLMAYNKKIRSENQYI 296
>gi|333976353|gb|EGL77222.1| CRISPR-associated endonuclease Cas1 [Veillonella parvula ACS-068-V-Sch12]
Length=331
Score = 169 bits (429), Expect = 4e-40, Method: Compositional matrix adjust.
Identities = 95/298 (32%), Positives = 161/298 (55%), Gaps = 6/298 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+++ S I G V+V + P+E ++ IT+F ++T+ + + ++R
Sbjct 1 MSSLYVTEAGSFIKRDGGHVVVGRNNEVLFEVPLERIEDITVFDSVSITSSLVTDFIERG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
I + G Y G I + + ++Q DD AF +++S++I+ K+ NQ ++R
Sbjct 61 VPITWLSGYGKYFGTIINTNTIDINKHKKQFDLLDDNAFRVAMSRKIIRAKVRNQLTILR 120
Query 121 AHTSGQD----VAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAF 176
+ + + I +K + + ++EL G+EG ++ YF ALG +VP FAF
Sbjct 121 RYARNLEEDINIDVQIANIKSVRSHIGECMRVSELMGYEGLISRLYFEALGKIVPPAFAF 180
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
R+ +PP D FN+M+ LGYS+L+ I+ + L+ ++G +H ++GH LASDL+E
Sbjct 181 TKRTKQPPRDPFNAMLGLGYSMLFNEILAGVINAGLHPFVGVMHSLAKGHPALASDLIEE 240
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYI 294
WRAPIID VL +++ +VD F NSD G + T E ++ A+ +I YI
Sbjct 241 WRAPIIDSMVLSMVSRNMVDLSEFD-NSDKGC-YLTAEGRKAFLMAYNKKIRSENQYI 296
>gi|238018273|ref|ZP_04598699.1| hypothetical protein VEIDISOL_00097 [Veillonella dispar ATCC
17748]
gi|237864744|gb|EEP66034.1| hypothetical protein VEIDISOL_00097 [Veillonella dispar ATCC
17748]
Length=331
Score = 168 bits (426), Expect = 1e-39, Method: Compositional matrix adjust.
Identities = 94/298 (32%), Positives = 160/298 (54%), Gaps = 6/298 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+++ S I G V+V + P+E ++ IT+F ++T+ + + ++R
Sbjct 1 MSSLYVTEAGSFIKRDGGHVVVGRNNEVLFEVPLERIEDITVFDSVSITSSLVTDFIERG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
I + G Y G + + + ++Q DD AF +++S++I+ K+ NQ ++R
Sbjct 61 VPITWLSGYGKYFGTLINTNTIDINKHKKQFDLLDDNAFRVAMSRKIIRAKVRNQLTILR 120
Query 121 AHTSGQD----VAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAF 176
+ + + I +K + + ++EL G+EG ++ YF ALG +VP FAF
Sbjct 121 RYARNLEEDINIDAQIANIKSVRSHIGECMRVSELMGYEGLISRLYFEALGKIVPSAFAF 180
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
R+ +PP D FN+M+ LGYS+L+ I+ + L+ ++G +H ++GH LASDL+E
Sbjct 181 TKRTKQPPRDPFNAMLGLGYSMLFNEILAGVINAGLHPFVGIMHSLAKGHPALASDLIEE 240
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYI 294
WRAPIID VL +++ +VD F NSD G + T E + A+ +I YI
Sbjct 241 WRAPIIDSMVLSMVSRNMVDLAEFD-NSDKGC-YLTAEGRKVFLTAYNKKIRSENQYI 296
>gi|116627764|ref|YP_820383.1| hypothetical protein STER_0970 [Streptococcus thermophilus LMD-9]
gi|116101041|gb|ABJ66187.1| CRISPR-associated protein, Cas1 family [Streptococcus thermophilus
LMD-9]
Length=334
Score = 167 bits (422), Expect = 3e-39, Method: Compositional matrix adjust.
Identities = 106/326 (33%), Positives = 168/326 (52%), Gaps = 4/326 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELG-ESQYPIETLDGITLFGRPTMTTPFIVEMLKR 59
M LY S +S ++ R+I+ ++ + I +D + LFG +TT I + K
Sbjct 1 MSDLYSQRSNYYLSLSEQRIIIKNDNKEIVKEVSISLVDNVLLFGNAQLTTQLIKALSKN 60
Query 60 ERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ ++ F+ G + I T + Q + F L +++ I + K+ +Q AL+
Sbjct 61 KVNVYYFSNVGQFISSIETHRQDEFQKQELQAKAYFEEDFRLEVARSIATTKVRHQIALL 120
Query 120 RAH-TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
R T G + S+ + ++ S+ E+ G+EG AK+YF L LVP +F F G
Sbjct 121 REFDTDGLLDTSDYSRFEDSVNDIQKAYSITEIMGYEGRLAKSYFYYLNLLVPDDFHFNG 180
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
RS R D FNS ++ GYS+LY ++G I+++ L+ G +H+ + HATLASDLME WR
Sbjct 181 RSRRTAEDCFNSALNFGYSILYSCLMGLIKKNGLSLGFGVIHKHHQHHATLASDLMEEWR 240
Query 239 APIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDP 298
I+D+T++ LI +G + F +N D + T E A A +RI YI+ D
Sbjct 241 PIIVDNTLMELIRNGKLLLSHF-ENKDQDFIL-TDEGREIFAWALRSRILEVHRYIELDK 298
Query 299 HRYTFQYALDLQLQSLVRVIEAGHPS 324
RY+F Y D Q++SL+R PS
Sbjct 299 KRYSFLYTADRQIKSLIRAFRELDPS 324
>gi|303231960|ref|ZP_07318668.1| CRISPR-associated endonuclease Cas1 [Veillonella atypica ACS-049-V-Sch6]
gi|302513389|gb|EFL55423.1| CRISPR-associated endonuclease Cas1 [Veillonella atypica ACS-049-V-Sch6]
Length=331
Score = 166 bits (421), Expect = 4e-39, Method: Compositional matrix adjust.
Identities = 97/298 (33%), Positives = 159/298 (54%), Gaps = 6/298 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV++S S I G V+V + P+E ++ IT+F ++T+ + + ++R
Sbjct 1 MSSLYVTESGSFIKRNGGHVVVGRNNEVLFEVPLERIEDITVFDTVSITSSLVTDFIERG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
I + G Y G I + + ++Q D+ F L++S++I+ K+ NQ ++R
Sbjct 61 IPITWLSGYGKYFGTIINTNTIDINKHKKQFDLLDNHEFRLAISRKIIRAKVRNQLTILR 120
Query 121 AHTSGQDVAESIRT----MKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAF 176
+ D SI T +K + + ++EL G+EG ++ YF ALG +VP F+F
Sbjct 121 RYARNLDEDISIDTQIDNIKSVRSHIGECVRISELMGYEGIISRLYFEALGKIVPPIFSF 180
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
RS +PP D FN+M+ LGYS+L+ I+ + L+ ++G +H +GH LASDL+E
Sbjct 181 TKRSKQPPRDEFNAMLGLGYSMLFNEILAGLINAGLHPFVGVMHSLGKGHPALASDLIEE 240
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYI 294
WRAPIID VL +++ +V+ F NSD G + T E + A+ +I YI
Sbjct 241 WRAPIIDSMVLSMVSRNMVELSDFD-NSDKGC-YLTTEGRKGFLVAYNKKIRSENQYI 296
>gi|342214546|ref|ZP_08707233.1| CRISPR-associated endonuclease Cas1 [Veillonella sp. oral taxon
780 str. F0422]
gi|341592059|gb|EGS34954.1| CRISPR-associated endonuclease Cas1 [Veillonella sp. oral taxon
780 str. F0422]
Length=331
Score = 164 bits (416), Expect = 1e-38, Method: Compositional matrix adjust.
Identities = 92/297 (31%), Positives = 151/297 (51%), Gaps = 4/297 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV++S S I G VIV + P E++D +T+F +++ + + +
Sbjct 1 MTTLYVTESGSFIKRKGGHVIVGRNHEVLFEVPFESIDDVTVFDSVHISSSLLTDFISNG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ + G Y G + + + ++Q D FCL+LSK++++ KI NQ ++R
Sbjct 61 IPVTWLSGYGKYFGTLINTNTVDIHKHQRQFTIRGDKEFCLALSKKLINAKINNQLTILR 120
Query 121 AHTSG---QDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ V SI + H + ++ S+ EL G+EG ++ YF LG +VP EF F
Sbjct 121 RYERNLLDDSVMMSINNICHIRKNIHKATSIEELMGYEGIISRLYFEGLGKIVPFEFTFT 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
RS +PPLD FN+M+ LGYS+L+ I+ + L+ ++G +H GH L SDL+E W
Sbjct 181 KRSKQPPLDPFNAMLGLGYSMLFNEIMAGVINAGLHPFVGCMHSIKGGHPALVSDLIEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYI 294
RAP+ID VL L+ ++D + S G + E + A+ +I Y+
Sbjct 241 RAPVIDSLVLNLVKRKMIDVEEDFQYSGEGC-YLNGEGRKLFLSAYNKKIKSMNQYM 296
>gi|341822659|emb|CCC73583.1| CRISPR-associated endonuclease CaS1 [Megasphaera elsdenii DSM
20460]
Length=331
Score = 163 bits (413), Expect = 3e-38, Method: Compositional matrix adjust.
Identities = 98/323 (31%), Positives = 162/323 (51%), Gaps = 7/323 (2%)
Query 5 YVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRERDIQ 64
Y+++ + IS DGR +V + P ETL+G+ + +T+ IV +L +
Sbjct 5 YITEKGATISKKDGRFVVGRNHETLLEIPEETLEGLLVTDTVQLTSHAIVSLLHLGIPVT 64
Query 65 LFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIRAHTS 124
++ G Y GR+ + + +QQ D P F L +S+R++ K+ NQ L+R +
Sbjct 65 WLSSHGKYFGRLESTRHVSVFKQKQQFLLQDQP-FSLEMSRRVLLAKVHNQLTLLRRYNR 123
Query 125 GQDVAESIRTMKHSLAWVDR---SGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRST 181
+ + + + + + D + L G+EG AAK YF+ALG LV FAF+ RS
Sbjct 124 DRKIPSVMIDIHNMMTMADHLKIAEDCESLMGYEGMAAKIYFSALGKLVDPTFAFEKRSK 183
Query 182 RPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAPI 241
RPPLD FNS++S Y+L+ + AI L+ Y+GFLH H LASDL+E WRA +
Sbjct 184 RPPLDPFNSLLSFAYTLIMYELFTAITNEGLHPYVGFLHTLKEHHPALASDLLEEWRAVL 243
Query 242 IDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGDPHRY 301
D V+ L+ + F + ++ T E + RA+ ++ YI G ++
Sbjct 244 ADSFVMSLVQHHEIKEEHFCCDEANHGIYLTPEGRKIFFRAYEKKMRSINQYIDG---KH 300
Query 302 TFQYALDLQLQSLVRVIEAGHPS 324
+F+ +L+ Q+ + + A P
Sbjct 301 SFRRSLNYQVAQYGQALMAREPK 323
>gi|159899002|ref|YP_001545249.1| CRISPR-associated Cas1 family protein [Herpetosiphon aurantiacus
DSM 785]
gi|159892041|gb|ABX05121.1| CRISPR-associated protein Cas1 [Herpetosiphon aurantiacus DSM
785]
Length=339
Score = 161 bits (407), Expect = 2e-37, Method: Compositional matrix adjust.
Identities = 96/330 (30%), Positives = 161/330 (49%), Gaps = 9/330 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV + + I R+ +W + P+ L+ I + G +TP I +L ++
Sbjct 1 MATLYVLEQGAEIRCDGERLAIWQTDQELGNVPMAKLEDIVVMGNIGFSTPAIKRLLDQQ 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
++ T G Y GR+ ++ R Q R DD + L++++ VS K+ N +A+++
Sbjct 61 IEVTFLTIHGRYHGRLIGEATAHVALRRNQYRRADDEVWALAMAQACVSGKLRNCRAVLQ 120
Query 121 AHTSG-----QDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFA 175
++V ESI + H + VDR+ ++ L G EG+ + AYF L L E+
Sbjct 121 RFARNRQQVEKEVLESIEALDHFIDRVDRTTKISSLVGVEGSGSAAYFGGLRGLFDSEWM 180
Query 176 FQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLME 235
F R+ RPP D N ++SLGY+LL +GA++ + Y GFLHQ +L DL+E
Sbjct 181 FNNRNRRPPTDPVNVLLSLGYTLLVHKTLGAVQAVGFDPYQGFLHQLDYNRPSLVLDLIE 240
Query 236 VWRAPIIDDTVLRLIADGVVDTRAFSKNSDTG-AVFATREATRSIARAFGNRIARTATY- 293
+R ++D V+R DG + FS + D + + E + AF R+ T+
Sbjct 241 EFRPILVDALVIRCCNDGRLTANDFSPSDDPKHPILLSNEGKKRFVVAFEERMRTEVTHP 300
Query 294 --IKGDPHRYTFQYALDLQLQSLVRVIEAG 321
G P + ++ ++LQ + L R I+ G
Sbjct 301 DGADGRPGKVSYWRCIELQARLLARAIQTG 330
>gi|291539925|emb|CBL13036.1| CRISPR-associated protein Cas1 [Roseburia intestinalis XB6B4]
Length=232
Score = 159 bits (402), Expect = 6e-37, Method: Compositional matrix adjust.
Identities = 81/219 (37%), Positives = 123/219 (57%), Gaps = 2/219 (0%)
Query 103 LSKRIVSRKILNQQALIRAHTSG--QDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAK 160
+SKRI+ KI NQ ++R + G +D+ I M++ + + S+ ++ G+EG AAK
Sbjct 1 MSKRIIDAKIRNQVVVLRRYARGRDEDIHRMIIEMQNMQKKLLYAKSVEQVMGYEGTAAK 60
Query 161 AYFTALGHLVPQEFAFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLH 220
YF LG L+ ++F F+GRS RPP+D FNS++SLGYS++ + G IE LN Y G +H
Sbjct 61 IYFKVLGKLIDEQFVFEGRSRRPPMDPFNSLISLGYSIILNELYGKIEGKGLNPYFGVMH 120
Query 221 QDSRGHATLASDLMEVWRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIA 280
+D H TLASDLME WRA +ID T L ++ + F D VF ++ R
Sbjct 121 KDREKHPTLASDLMEEWRAVLIDTTALSMLNGHELVKEDFYTGIDQPGVFLEKDGFRKYI 180
Query 281 RAFGNRIARTATYIKGDPHRYTFQYALDLQLQSLVRVIE 319
+ + Y+ + +F+ A+DLQ+ V+ IE
Sbjct 181 QKLEGKFRTENKYLSYIDYSVSFRRAMDLQVNQFVKAIE 219
>gi|125718075|ref|YP_001035208.1| hypothetical protein SSA_1255 [Streptococcus sanguinis SK36]
gi|125497992|gb|ABN44658.1| Conserved hypothetical protein [Streptococcus sanguinis SK36]
Length=262
Score = 159 bits (402), Expect = 7e-37, Method: Compositional matrix adjust.
Identities = 88/262 (34%), Positives = 146/262 (56%), Gaps = 2/262 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQ-YPIETLDGITLFGRPTMTTPFIVEMLKR 59
M LY+ +S +S +D ++++ ++E + + +D I +FG ++T + + +
Sbjct 1 MADLYIQNSSYSLSISDRKLMIKNQERTMLKAISLGLIDNILIFGNSQLSTQLLKSLSRH 60
Query 60 ERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ F++ G + + + + + R+Q + D +FCL +SKRI S KI+NQ L+
Sbjct 61 GIPVFYFSSKGEFLFSMDSFKEADYEKQREQAQASFDKSFCLKMSKRIASAKIMNQLNLL 120
Query 120 RAH-TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
+A+ G E + K + ++ + S++E+ G EG AK+YF L LV ++F F
Sbjct 121 KAYDEQGLFDEEDFKRFKSACESLESAKSISEIMGIEGRIAKSYFYYLNLLVEEDFQFYS 180
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
R+ P LD FN++++ GYS+LY IG I ++ L+A G HQ HA LASDLME WR
Sbjct 181 RNRSPSLDRFNALLNFGYSILYSCFIGLIRKNGLSAGFGVTHQPHTHHAVLASDLMEEWR 240
Query 239 APIIDDTVLRLIADGVVDTRAF 260
I+DDTV+ LI G + AF
Sbjct 241 PVIVDDTVMSLIKHGDIRGGAF 262
>gi|327470946|gb|EGF16402.1| CRISPR-associated protein cas1 [Streptococcus sanguinis SK330]
Length=230
Score = 158 bits (399), Expect = 1e-36, Method: Compositional matrix adjust.
Identities = 86/219 (40%), Positives = 126/219 (58%), Gaps = 3/219 (1%)
Query 103 LSKRIVSRKILNQQALIRAH-TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKA 161
+S+RI S KI+NQ L++AH G E + K + ++ + S++E+ G EG AK+
Sbjct 1 MSQRIASAKIMNQLNLLKAHDEQGLFDEEDFKRFKAACESLESAKSISEIIGIEGRIAKS 60
Query 162 YFTALGHLVPQEFAFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQ 221
YF L LV ++F F R+ RP LD FN++++ GY +LY IG I ++ L+A G HQ
Sbjct 61 YFYYLNLLVKEDFQFYCRNRRPSLDRFNALLNFGYLILYSCFIGLIRKNGLSAGFGVTHQ 120
Query 222 DSRGHATLASDLMEVWRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIAR 281
HA LASDLME WR I+DDTV+ LI G + F KN + + T + +R
Sbjct 121 PHTHHAVLASDLMEEWRPVIVDDTVMSLIKQGDIRGEHFEKNGEE--MHLTSKGIEVFSR 178
Query 282 AFGNRIARTATYIKGDPHRYTFQYALDLQLQSLVRVIEA 320
A RI Y++ D +RYTF Y D Q++SL+R ++
Sbjct 179 AMRERILEIHHYVELDKNRYTFLYIADQQVKSLIRCFKS 217
>gi|323141545|ref|ZP_08076431.1| CRISPR-associated endonuclease Cas1 [Phascolarctobacterium sp.
YIT 12067]
gi|322414004|gb|EFY04837.1| CRISPR-associated endonuclease Cas1 [Phascolarctobacterium sp.
YIT 12067]
Length=330
Score = 154 bits (390), Expect = 2e-35, Method: Compositional matrix adjust.
Identities = 92/321 (29%), Positives = 161/321 (51%), Gaps = 8/321 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+++S + + G V+V + P+E ++ +TL +++ I E L+R
Sbjct 1 MTSLYITESGAYLRKRGGHVLVGRNNEVLLEVPLERIEDVTLVDSVQISSGLITEFLERN 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ + G + G + + + ++Q + L+K+I+ K+ NQ ++R
Sbjct 61 IPLSWLSGRGRFFGSLLSNGSIDIIKHQKQFELLQEGKLYFELAKKIIYAKVHNQLTILR 120
Query 121 AHTSG---QDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ +V SIR + + ++ L L GFEG ++ YF ALG +VP EF F+
Sbjct 121 RYNRNLKLDNVDTSIRNILAIRKNICQTDDLHSLMGFEGIISRIYFCALGAIVPDEFKFE 180
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ PP D FNSM+SLGYS+L+ I+ + L+ Y+GF+H+ ++GH L SDL+E W
Sbjct 181 KRTKMPPRDPFNSMLSLGYSMLFNEIMSNVLALGLHPYVGFMHKIAKGHPALVSDLIEEW 240
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RAP+ID VL +I ++ F N F EA + + + ++ Y +
Sbjct 241 RAPLIDSMVLAMIKRNMLTRDMFEINE--AGCFLNTEARKIYLQTYNKKLRSDNQYFED- 297
Query 298 PHRYTFQYALDLQLQSLVRVI 318
+YT++ ++ Q + VI
Sbjct 298 --KYTYRESIRQQCRKYASVI 316
>gi|309791951|ref|ZP_07686429.1| CRISPR-associated Cas1 family protein [Oscillochloris trichoides
DG6]
gi|308225945|gb|EFO79695.1| CRISPR-associated Cas1 family protein [Oscillochloris trichoides
DG6]
Length=339
Score = 154 bits (388), Expect = 3e-35, Method: Compositional matrix adjust.
Identities = 109/333 (33%), Positives = 167/333 (51%), Gaps = 11/333 (3%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV + + I R++V + PI LD I + G ++TP + + +R
Sbjct 1 MATLYVIEQGAEIGCDGERIVVRRQGQEIGSVPISRLDDILIIGNIGISTPALKRLFERG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI- 119
++ T G YQGR+ +A R Q R DDPA+ LS ++ V+ K+ N + L+
Sbjct 61 IEVTFLTVHGRYQGRLVGATTPHAALRRAQYRRADDPAWSLSQAQACVTGKLRNARVLLQ 120
Query 120 -----RAHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEF 174
R++ S DV+ + + +A ++R+ L+ L G EG+A YF L L ++
Sbjct 121 RFARNRSNVS-PDVSIAADDLSTYIARIERTTQLSSLLGVEGSATARYFGGLRALFEPDW 179
Query 175 AFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLM 234
F RS RPP D N ++SLGY+LL + AI L+ Y+G+LHQ G A+LA D+M
Sbjct 180 QFHARSRRPPGDPVNVLLSLGYTLLLHKVTAAIAASGLDPYMGYLHQIEYGRASLALDMM 239
Query 235 EVWRAPIIDDTVLRLIADGVVDTRAF-SKNSDTGAVFATREATRSIARAFGNRIARTATY 293
E +R ++D VLR DG V F S A+ + E R AF R+ AT+
Sbjct 240 EEFRPLLVDSLVLRCCGDGRVQAEDFRSGGEGERAIVFSPEGQRRFISAFEERMRTEATH 299
Query 294 IKG---DPHRYTFQYALDLQLQSLVRVIEAGHP 323
+G P + ++ L+LQ + LVR I+ P
Sbjct 300 PEGADSGPGKVSYMRCLELQARRLVRAIQGSTP 332
>gi|334126733|ref|ZP_08500681.1| CRISPR-associated protein Cas1 [Centipeda periodontii DSM 2778]
gi|333391143|gb|EGK62264.1| CRISPR-associated protein Cas1 [Centipeda periodontii DSM 2778]
Length=331
Score = 154 bits (388), Expect = 3e-35, Method: Compositional matrix adjust.
Identities = 98/317 (31%), Positives = 161/317 (51%), Gaps = 8/317 (2%)
Query 5 YVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRERDIQ 64
Y+++ + + G ++ + P E L+ +TL +++ +VE+L+ +
Sbjct 4 YITEEGAYVQKRGGNFVIGRNNECVMEIPEEVLESLTLIDSVQVSSQAMVELLRLGVPVT 63
Query 65 LFTTDGHYQGRI-STPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIRAHT 123
+ G + GR+ ST V+ + RQ + + F L + +++++ K+ NQ L+R +
Sbjct 64 WLSRTGFFFGRLESTRHVNVFRQERQVLMKGS--GFYLRMGRKVIAAKVHNQLTLLRRYN 121
Query 124 SGQDVAESIRTMKHSLAWVDR---SGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQGRS 180
++ + + LA R + + +L G+EG AK YF ALG LVP+EFAF RS
Sbjct 122 RNAELPGVQQAIDEILALRKRIPLAETSEQLMGYEGAIAKVYFRALGLLVPEEFAFMRRS 181
Query 181 TRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRAP 240
RPPLD FN+M+S GY+LL +I A+ L+ Y GFLH H LASDLME WRA
Sbjct 182 KRPPLDPFNAMLSFGYTLLMYDIYTALSNEGLHPYFGFLHALKNRHPALASDLMEEWRAV 241
Query 241 IIDDTVLRLIADGVVDTRAFSK-NSDTGAVFATREATRSIARAFGNRIARTATYIKGD-P 298
++D VL L++ + F+ D + TRE RA+ ++ Y++G
Sbjct 242 LVDAMVLSLVSHHEIKREHFAAMKEDEPGIILTREGRAIFLRAYEKKLRTANRYVEGKHS 301
Query 299 HRYTFQYALDLQLQSLV 315
+R T Y Q+L+
Sbjct 302 YRRTLAYQARQYAQALL 318
>gi|313894905|ref|ZP_07828465.1| CRISPR-associated endonuclease Cas1 [Selenomonas sp. oral taxon
137 str. F0430]
gi|312976586|gb|EFR42041.1| CRISPR-associated endonuclease Cas1 [Selenomonas sp. oral taxon
137 str. F0430]
Length=333
Score = 154 bits (388), Expect = 3e-35, Method: Compositional matrix adjust.
Identities = 93/330 (29%), Positives = 166/330 (51%), Gaps = 11/330 (3%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M +Y++D +++ + +V + P E ++G+ L +++ +VE+LK
Sbjct 1 MSFIYITDEGAKLQKKGDKFLVGRNLEILMEIPKEIIEGLVLIDSVQISSDAVVELLKLG 60
Query 61 RDIQLFTTDGHYQGRI-STPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+T G + GR+ ST +V + RQ + + + F + L K+I + K+ NQ L+
Sbjct 61 VPTTWISTHGKFYGRLESTRNVDVFKQRRQIL--SQESEFAVKLCKKIAAAKVHNQLTLL 118
Query 120 RAHT-----SGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEF 174
R + +++A + ++ +D ++ G+EG +A+ YF ALG + P F
Sbjct 119 RRYNRREEEHAKEIASLVTRLQILQKNIDFVSEKEKIMGYEGASARHYFKALGMMTPSPF 178
Query 175 AFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLM 234
+F+ R+ +PP DAFNSM+S GY+LL I A+ L+ Y GF H H LASDLM
Sbjct 179 SFERRTRQPPRDAFNSMLSFGYTLLMYEIYTALCNQGLSPYFGFFHALKNRHPALASDLM 238
Query 235 EVWRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYI 294
E WR +ID V+ L+ V F ++ + V+ TRE RA+ ++ Y+
Sbjct 239 EEWRPVLIDSMVMSLVHHHEVQPEHFMRSEENDGVYMTREGRTIFLRAYEKKLRTMNRYL 298
Query 295 KGDPHRYTFQYALDLQLQSLVRVIEAGHPS 324
G+ ++++ +L +Q + + + A P
Sbjct 299 TGE---HSYRKSLTIQAKKFSQALMAEEPE 325
>gi|327474433|gb|EGF19839.1| hypothetical protein HMPREF9391_0559 [Streptococcus sanguinis
SK408]
Length=242
Score = 152 bits (384), Expect = 8e-35, Method: Compositional matrix adjust.
Identities = 81/225 (36%), Positives = 127/225 (57%), Gaps = 1/225 (0%)
Query 37 LDGITLFGRPTMTTPFIVEMLKRERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDD 96
+D I +FG ++T + + + + F++ G + + + + + R+Q + D
Sbjct 18 IDNILIFGNSQLSTQLLKSLSRHGIPVFYFSSKGEFLFSMDSFKEADYEKQREQAQASFD 77
Query 97 PAFCLSLSKRIVSRKILNQQALIRAH-TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFE 155
+FCL +SKRI S KI+NQ L++A+ G E + K + ++ + S++E+ G E
Sbjct 78 KSFCLKMSKRIASAKIMNQLNLLKAYDEQGLFDEEDFKRFKSACESLESAKSISEIMGIE 137
Query 156 GNAAKAYFTALGHLVPQEFAFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAY 215
G AK+YF L LV ++F F R+ P LD FN++++ GYS+LY IG I ++ L+A
Sbjct 138 GRIAKSYFYYLNLLVEEDFQFYSRNRSPSLDRFNALLNFGYSILYSCFIGLIRKNGLSAG 197
Query 216 IGFLHQDSRGHATLASDLMEVWRAPIIDDTVLRLIADGVVDTRAF 260
G HQ HA LASDLME WR I+DDTV+ LI G + AF
Sbjct 198 FGVTHQPHTHHAVLASDLMEEWRPVIVDDTVMSLIKHGDIRGGAF 242
>gi|209526394|ref|ZP_03274922.1| CRISPR-associated protein Cas1 [Arthrospira maxima CS-328]
gi|209493167|gb|EDZ93494.1| CRISPR-associated protein Cas1 [Arthrospira maxima CS-328]
Length=654
Score = 151 bits (382), Expect = 1e-34, Method: Compositional matrix adjust.
Identities = 96/316 (31%), Positives = 157/316 (50%), Gaps = 8/316 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+D + + + V + P+ +D I LFG ++ I L+R
Sbjct 320 MTTLYVTDQGAYVKVKHQQFQVLLGNDLKVSIPVNVVDYIILFGCCNLSHGAIGLALRRR 379
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
I + G Y GR+ T ++ L +QVH +D F L +K IV+ K+ N + L+R
Sbjct 380 IPILFLSYQGRYFGRLQTDGMTRVDYLSRQVHCAEDETFVLRQAKVIVAGKLHNCRILLR 439
Query 121 AHTSGQDVAESIRTMKHSLAWVDRSGS---LAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ +++ I ++ W ++ L L G+EG + YF ALG LV F F+
Sbjct 440 RLNRDRQISQVIEAIEELGVWQEKIAEVELLESLLGYEGFGTRIYFQALGALVQPPFTFE 499
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D NS++SLGY+LL++NI I L+ + G LH H L SDL+E +
Sbjct 500 HRTRRPPTDPVNSLLSLGYTLLHQNIHSLILAVGLHPHYGNLHVPRSNHPALVSDLIEEF 559
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RAP++D V+ L+ G+ F+ + + G V+ +A + + + ++++ T+
Sbjct 560 RAPVVDSLVIYLVNSGIFTPEDFTPSDERGGVYLYSDALKKYLKHWQDKLSLKTTH---- 615
Query 298 PHR-YTFQYALDLQLQ 312
PH Y Y L+LQ
Sbjct 616 PHTGYKVSYYRCLELQ 631
>gi|156741961|ref|YP_001432090.1| CRISPR-associated Cas1 family protein [Roseiflexus castenholzii
DSM 13941]
gi|156233289|gb|ABU58072.1| CRISPR-associated protein Cas1 [Roseiflexus castenholzii DSM
13941]
Length=339
Score = 151 bits (381), Expect = 2e-34, Method: Compositional matrix adjust.
Identities = 97/330 (30%), Positives = 161/330 (49%), Gaps = 9/330 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV++ S I R+ V + + P+ ++ I + G ++TP I ML
Sbjct 1 MDTLYVTEQGSEIGCDGERLAVRRDNAIIASIPLIKIEDIVIIGNVGLSTPAIKRMLDNG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
++ T G YQGR+ ++A Q R DD A+ L L++R V K+ N +AL+R
Sbjct 61 INVTFLTVHGRYQGRLVGSVSAHAALRAAQYRRADDRAWSLRLAQRFVEGKLRNCRALLR 120
Query 121 AHTSGQ-----DVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFA 175
+ + ++ + + V R+ +L L G EG+A YF + L+ E+
Sbjct 121 RFARNRADAPAEAGQAADDLDRFIDRVPRTTTLNALMGVEGSATARYFAGVRALIGAEWR 180
Query 176 FQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLME 235
F+ R RPP D N+++S GY+LL ++GA+E + Y+G+LH G +LA DL+E
Sbjct 181 FEARIRRPPPDRVNALLSFGYTLLVHKMLGAVEAAGFDPYLGYLHHIDYGRPSLALDLIE 240
Query 236 VWRAPIIDDTVLRLIADGVVDTRAFSKNSDTG-AVFATREATRSIARAFGNRIARTATY- 293
+R ++D V+R DG + F++ D V + + R AF R+ AT+
Sbjct 241 EFRPILVDSLVIRCCNDGRIAFDDFTETPDGDYPVLLSDDGKRRFVAAFEERMRTEATHP 300
Query 294 --IKGDPHRYTFQYALDLQLQSLVRVIEAG 321
G P + ++ L LQ + L R ++ G
Sbjct 301 DGADGRPGKVSYLRCLALQARRLARAVQGG 330
>gi|163846146|ref|YP_001634190.1| CRISPR-associated Cas1 family protein [Chloroflexus aurantiacus
J-10-fl]
gi|222523888|ref|YP_002568358.1| CRISPR-associated protein Cas1 [Chloroflexus sp. Y-400-fl]
gi|163667435|gb|ABY33801.1| CRISPR-associated protein Cas1 [Chloroflexus aurantiacus J-10-fl]
gi|222447767|gb|ACM52033.1| CRISPR-associated protein Cas1 [Chloroflexus sp. Y-400-fl]
Length=339
Score = 150 bits (380), Expect = 2e-34, Method: Compositional matrix adjust.
Identities = 99/326 (31%), Positives = 155/326 (48%), Gaps = 8/326 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+ + + I R++V P+ +D I +FG ++TP I +L R
Sbjct 1 MATLYLIEQGAEIGCDGERIVVRRAGEIIGSVPLVKVDDIVIFGNIGISTPAIKRLLDRS 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
++ T DG YQGR+ ++ + Q D L L++ V K+ NQ+AL++
Sbjct 61 IEVTFMTVDGSYQGRLVGQVTAHVALRQAQYACAADSDRTLRLAQSFVEGKLRNQRALLQ 120
Query 121 AHTSGQDVAESIRTMKHS-----LAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFA 175
+ + + + V R+ L+ L G EG+A YF L L+ E+
Sbjct 121 RFSRNRATPPAEALAAADDLDAYIKRVRRTTRLSALLGVEGSATARYFAGLRSLIEPEWD 180
Query 176 FQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLME 235
F+ R RPP D N ++S GY+LL +GA++ + Y+GFLH G +LA DLME
Sbjct 181 FRSRQRRPPPDPVNLLLSFGYTLLTHKTLGAVQAAGFDPYLGFLHSLDYGRPSLALDLME 240
Query 236 VWRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIK 295
+R +ID V+R+ DG + F + V T E R+ AF R+ AT+ +
Sbjct 241 EFRPLLIDSLVVRVCNDGRLRLEHFQPGDEARPVIITDEGKRAFLTAFEERMRTEATHPE 300
Query 296 G---DPHRYTFQYALDLQLQSLVRVI 318
G P + T+Q + LQ + L RVI
Sbjct 301 GADSGPGKVTYQRCIALQARRLARVI 326
>gi|291568436|dbj|BAI90708.1| CRISPR-associated protein Cas1 [Arthrospira platensis NIES-39]
Length=599
Score = 148 bits (374), Expect = 1e-33, Method: Compositional matrix adjust.
Identities = 95/316 (31%), Positives = 156/316 (50%), Gaps = 8/316 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+D + + + V + P+ +D I LFG ++ I L+R
Sbjct 266 MTTLYVTDQGAYVKVKHQQFQVLLGNDLKVSIPVNVVDYIILFGCCNLSHGAIGLALRRR 325
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
I + G Y GR+ T ++ L +QVH +D F L +K IV+ K+ N + L+R
Sbjct 326 IPILFLSDQGRYFGRLQTDGMTRVDYLSRQVHCAEDETFVLRQAKVIVAGKLHNCRILLR 385
Query 121 AHTSGQDVAESIRTMKHSLAWVDRSGS---LAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ +++ I ++ W ++ L L G+EG + YF AL LV F F+
Sbjct 386 RLNRDRQISQVIEAIEELGVWQEKIAEVELLESLLGYEGFGTRIYFQALRALVQPPFTFE 445
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D NS++SLGY+LL++NI I L+ + G LH H L SDL+E +
Sbjct 446 HRTRRPPTDPVNSLLSLGYTLLHQNIHSLILAVGLHPHYGNLHVPRSNHPALVSDLIEEF 505
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RAP++D V+ L+ G+ F+ + + G V+ +A + + + ++++ T+
Sbjct 506 RAPVVDSLVIYLVNSGIFTPEDFTPSDERGGVYIYSDALKKYLKHWHDKLSLKTTH---- 561
Query 298 PHR-YTFQYALDLQLQ 312
PH Y Y L+LQ
Sbjct 562 PHTGYKVSYYRCLELQ 577
>gi|284052685|ref|ZP_06382895.1| hypothetical protein AplaP_14548 [Arthrospira platensis str.
Paraca]
Length=592
Score = 148 bits (374), Expect = 1e-33, Method: Compositional matrix adjust.
Identities = 95/316 (31%), Positives = 156/316 (50%), Gaps = 8/316 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV+D + + + V + P+ +D I LFG ++ I L+R
Sbjct 259 MTTLYVTDQGAYVKVKHQQFQVLLGNDLKVSIPVNVVDYIILFGCCNLSHGAIGLALRRR 318
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
I + G Y GR+ T ++ L +QVH +D F L +K IV+ K+ N + L+R
Sbjct 319 IPILFLSDQGRYFGRLQTDGMTRVDYLSRQVHCAEDETFVLRQAKVIVAGKLHNCRILLR 378
Query 121 AHTSGQDVAESIRTMKHSLAWVDRSGS---LAELNGFEGNAAKAYFTALGHLVPQEFAFQ 177
+ +++ I ++ W ++ L L G+EG + YF AL LV F F+
Sbjct 379 RLNRDRQISQVIEAIEELGVWQEKIAEVELLESLLGYEGFGTRIYFQALRALVQPPFTFE 438
Query 178 GRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVW 237
R+ RPP D NS++SLGY+LL++NI I L+ + G LH H L SDL+E +
Sbjct 439 HRTRRPPTDPVNSLLSLGYTLLHQNIHSLILAVGLHPHYGNLHVPRSNHPALVSDLIEEF 498
Query 238 RAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKGD 297
RAP++D V+ L+ G+ F+ + + G V+ +A + + + ++++ T+
Sbjct 499 RAPVVDSLVIYLVNSGIFTPEDFTPSDERGGVYIYSDALKKYLKHWHDKLSLKTTH---- 554
Query 298 PHR-YTFQYALDLQLQ 312
PH Y Y L+LQ
Sbjct 555 PHTGYKVSYYRCLELQ 570
>gi|320161859|ref|YP_004175084.1| hypothetical protein ANT_24580 [Anaerolinea thermophila UNI-1]
gi|319995713|dbj|BAJ64484.1| hypothetical protein ANT_24580 [Anaerolinea thermophila UNI-1]
Length=344
Score = 147 bits (372), Expect = 2e-33, Method: Compositional matrix adjust.
Identities = 94/296 (32%), Positives = 146/296 (50%), Gaps = 11/296 (3%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGE-----SQYPIETLDGITLFGRPTMTTPFIVE 55
M LYV S++ + RV V +E E +Q PI + I LFG +TTP +
Sbjct 8 MPPLYVVQQNSKLRLNNRRVQV-EQETDEGIQVLAQIPIGQVSEIILFGNVGLTTPLMDA 66
Query 56 MLKRERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQ 115
+L + T DG Y+G +S + P R Q + P+F L ++K V K+ +Q
Sbjct 67 LLYEGIPVIFLTRDGDYRGILSGGLTPHVPLRRAQYRALEKPSFSLEMAKGFVRAKLRHQ 126
Query 116 QALI----RAHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVP 171
+ L+ R ++E I M+H++ V R SL+ L G EG+A AYF+ L L
Sbjct 127 RTLLQRQNRPPKQDASLSEVIERMEHAIDEVQRKTSLSSLRGLEGSATAAYFSGLRQLFN 186
Query 172 QEFAFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLAS 231
E+ F R RPP D N ++SLGY+LL ++ + A++ L+ Y GFLH+ + L
Sbjct 187 PEWKFDARLRRPPPDPVNVLLSLGYTLLAQDCVAAVQAVGLDPYAGFLHEVAYNRPALGL 246
Query 232 DLMEVWRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRI 287
DL+E +R P++D VL + G + + F+ V + R +AF R+
Sbjct 247 DLLEEFR-PLVDGVVLWICHSGKICPQQFTPGPPERPVVLDDQGKRDFIKAFEERM 301
>gi|312899092|ref|ZP_07758470.1| CRISPR-associated protein Cas1 [Megasphaera micronuciformis F0359]
gi|310619759|gb|EFQ03341.1| CRISPR-associated protein Cas1 [Megasphaera micronuciformis F0359]
Length=346
Score = 144 bits (364), Expect = 1e-32, Method: Compositional matrix adjust.
Identities = 95/300 (32%), Positives = 148/300 (50%), Gaps = 14/300 (4%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M +Y+++ ++IS G I+ + P E L+ +TL GR ++ I +L++E
Sbjct 18 MSHVYITEDGAKISKRGGHFILSRNSEVLFEIPEEGLESLTLIGRVQLSATVIERLLQKE 77
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
+ + G++ GR+ + A + +QV T + LS+ K ++ KI NQQ L+R
Sbjct 78 IPVTWLSKGGYFFGRLESTRHCNAVKQAKQVVLTGGSLY-LSMGKSMIEAKIHNQQVLLR 136
Query 121 AHT------SGQDVAESIRTMKHSLAWV-DRSGSLAELNGFEGNAAKAYFTALGHLVPQE 173
+ S + E + +KH + V +RS EL G EG AA+ YF AL L+ E
Sbjct 137 RYNRELESDSVRQKIEQLSRIKHKIMQVPNRS----ELMGNEGLAARIYFDALSELIEPE 192
Query 174 FAFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDL 233
F F GR+ RPP D FN+++ GY+LL + A+ L+ Y G LH H LASDL
Sbjct 193 FRFNGRTKRPPQDPFNAVIGFGYTLLLYELYTALSNVGLHPYFGCLHALKHRHPALASDL 252
Query 234 MEVWRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATY 293
ME WR IID + L + F + D V+ RE + +A+ R+ + Y
Sbjct 253 MEEWRPVIIDSLAMSLFNRHQLKAEHFERTED--GVYLNREGRYTFLQAYEKRLRTSNKY 310
>gi|219850296|ref|YP_002464729.1| CRISPR-associated protein Cas1 [Chloroflexus aggregans DSM 9485]
gi|219544555|gb|ACL26293.1| CRISPR-associated protein Cas1 [Chloroflexus aggregans DSM 9485]
Length=339
Score = 144 bits (363), Expect = 2e-32, Method: Compositional matrix adjust.
Identities = 100/326 (31%), Positives = 160/326 (50%), Gaps = 8/326 (2%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LYV + + I R+ V P+ LD I +FG ++TP + +L R
Sbjct 1 MATLYVIEQGAEIGCDGERIEVRRGADIIGSVPLVKLDDIVIFGNVGISTPAMKRLLDRG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIR 120
++ T DG YQGR+ ++ Q DPA L+L++R V K+ NQ+AL++
Sbjct 61 IEVTFMTVDGRYQGRLIGQVTAHVALRHAQYACAADPARALALAQRFVEGKLRNQRALLQ 120
Query 121 AHTSGQ-----DVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFA 175
+ + + + ++ + V R+ L+ L G EG+A YF L L+ E++
Sbjct 121 RFSRNRAEPPPEAQAAADDLEAYIKRVKRTTQLSSLLGVEGSATARYFAGLRSLIGPEWS 180
Query 176 FQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLME 235
F GR RPP D N ++SLGY+LL ++GA++ + Y+GFLH G +LA D+ME
Sbjct 181 FSGRQRRPPPDPVNLLLSLGYTLLAHKVLGAVQAAGFDPYLGFLHSLDYGRPSLALDIME 240
Query 236 VWRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIK 295
+R +ID V+R+ DG + F + T E R+ AF R+ AT+ +
Sbjct 241 EFRPILIDSLVVRICNDGRIRPEHFRPGEGERPIIITDEGKRAFLTAFEERMRTEATHPE 300
Query 296 G---DPHRYTFQYALDLQLQSLVRVI 318
G P + + + LQ + L RV+
Sbjct 301 GADSGPGKVPYTRCIALQARRLARVV 326
>gi|337286709|ref|YP_004626182.1| CRISPR-associated protein Cas1 [Thermodesulfatator indicus DSM
15286]
gi|335359537|gb|AEH45218.1| CRISPR-associated protein Cas1 [Thermodesulfatator indicus DSM
15286]
Length=338
Score = 142 bits (358), Expect = 9e-32, Method: Compositional matrix adjust.
Identities = 90/326 (28%), Positives = 151/326 (47%), Gaps = 10/326 (3%)
Query 4 LYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRERDI 63
LY+++ ++ R+ + E+ ++ ++ LD I +FGR + + +LK E +
Sbjct 3 LYITEQGLKVRKEGQRLQFYKEKNVVREFRLDDLDEIYVFGRLNFSAAALQALLKHEIKV 62
Query 64 QLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALIRAHT 123
T G Y GR++ P Q D+ L +++ +++ KI NQ+ +R
Sbjct 63 HFLTASGKYLGRLAPPRGKNVELRLAQFRAFDNEKRRLEIARAVIAGKIRNQKNFLRRQN 122
Query 124 ---SGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQE-FAFQGR 179
+ + ++I ++H + + + SL L G EG AA+ YF G L E F GR
Sbjct 123 RKLKNEKIGQAILKLRHKIKEAEDAQSLESLRGIEGQAAQVYFDVFGKLFQVEGLKFPGR 182
Query 180 STRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWRA 239
RPP D N+++SLGY+LL+ I E + Y+GFLH G +L DL E WR
Sbjct 183 IRRPPPDPINALLSLGYTLLFAQIWSVAESTGFDPYLGFLHVPEYGRPSLVLDLAEEWRP 242
Query 240 PIIDDTVLRLIADGVVDTRAFSK-----NSDTGAVFATREATRSIARAFGNRIARTATYI 294
I+D V+RL V F++ D + T + R F R+ A Y
Sbjct 243 LIVDSLVVRLFNWKAVKPEDFTEEPWDDEEDFTSFKLTPDGLRKFLAKFRERLDEEALYA 302
Query 295 KGDPHRYTFQYALDLQLQSLVRVIEA 320
+ R +++Y + Q+ L RV++
Sbjct 303 PLN-KRLSYRYIMQQQVWHLARVLDG 327
>gi|312278318|gb|ADQ62975.1| CRISPR-associated protein, Cas1 family [Streptococcus thermophilus
ND03]
Length=266
Score = 141 bits (356), Expect = 1e-31, Method: Compositional matrix adjust.
Identities = 84/255 (33%), Positives = 137/255 (54%), Gaps = 2/255 (0%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELG-ESQYPIETLDGITLFGRPTMTTPFIVEMLKR 59
M LY S +S ++ R+I+ ++ + I +D + LFG +TT I + K
Sbjct 1 MSDLYSQRSNYYLSLSEQRIIIKNDNKEIVKEVSISLVDNVLLFGNAQLTTQLIKALSKN 60
Query 60 ERDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI 119
+ + F+ G + I T + Q + F L +++ I + K+ NQ AL+
Sbjct 61 KVNGYYFSNVGQFISSIETHRQDEFQKQELQAKAYFEEDFRLEVARSIATTKVRNQIALL 120
Query 120 RAH-TSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAFQG 178
R T G + S++ + ++ S+ E+ G+EG AK+YF L LVP +F F G
Sbjct 121 REFDTDGVLDTSDYSRFEDSVSDIQKAYSITEIMGYEGRLAKSYFYYLNLLVPDDFHFNG 180
Query 179 RSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEVWR 238
RS RP D FNS ++ GYS+LY ++G I+++ L+ G +H+ + HATLASDL+E WR
Sbjct 181 RSRRPAEDCFNSALNFGYSILYSCLMGLIKKNGLSLGFGVIHKHHQHHATLASDLIEEWR 240
Query 239 APIIDDTVLRLIADG 253
I+D+T++ LI +G
Sbjct 241 PIIVDNTLMELIRNG 255
>gi|292669134|ref|ZP_06602560.1| CRISPR-associated fusion protein cas1 [Selenomonas noxia ATCC
43541]
gi|292649186|gb|EFF67158.1| CRISPR-associated fusion protein cas1 [Selenomonas noxia ATCC
43541]
Length=281
Score = 140 bits (353), Expect = 3e-31, Method: Compositional matrix adjust.
Identities = 86/250 (35%), Positives = 133/250 (54%), Gaps = 8/250 (3%)
Query 53 IVEMLKRERDIQLFTTDGHYQGRI-STPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRK 111
+VE+L+ + + G++ GR+ ST V+ + RQ + R D F L++++++++ K
Sbjct 1 MVELLRLGIPVTWLSRTGYFFGRLESTRHVNVFRQERQILLR--DSFFYLAMARKVIAAK 58
Query 112 ILNQQALIRAHTSGQDVAESIRTMKHSLAW---VDRSGSLAELNGFEGNAAKAYFTALGH 168
NQ L+R + + E M A + R + +L G+EG AK YF ALG
Sbjct 59 AHNQFILLRRYNRSASLPEVRTAMAEITALSKHIPRCETNTQLMGYEGAIAKVYFRALGL 118
Query 169 LVPQEFAFQGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHAT 228
LVP+ FAF RS RPP+D FN+M+S GY+LL ++ + L+ Y GFLH H
Sbjct 119 LVPEAFAFVKRSRRPPMDPFNTMLSFGYTLLMYDLYTVVNNEGLHPYFGFLHALKNRHPA 178
Query 229 LASDLMEVWRAPIIDDTVLRLIADGVVDTRAFSKNSDTG--AVFATREATRSIARAFGNR 286
LASDLME WR ++D VL L+ + F+ + + G +F TRE RA+ +
Sbjct 179 LASDLMEEWRPVLVDAMVLSLVHHHEMRPEHFAPSEEEGRPGIFLTREGRAIFLRAYEKK 238
Query 287 IARTATYIKG 296
+ T+ Y G
Sbjct 239 MRATSLYGGG 248
>gi|328953000|ref|YP_004370334.1| CRISPR-associated protein Cas1 [Desulfobacca acetoxidans DSM
11109]
gi|328453324|gb|AEB09153.1| CRISPR-associated protein Cas1 [Desulfobacca acetoxidans DSM
11109]
Length=333
Score = 139 bits (351), Expect = 5e-31, Method: Compositional matrix adjust.
Identities = 88/281 (32%), Positives = 139/281 (50%), Gaps = 5/281 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M LY+S+ + + R++V E P+ ++ + +FG TT +L++
Sbjct 1 MAFLYLSEQGACLQKTGERLVVAKEGETLLDLPVGKVEAVLIFGNVQFTTQAAHLLLQQG 60
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILNQQALI- 119
++ LFT G G++++P + Q R DP F L L+K IV K+ N + L+
Sbjct 61 VEMALFTRRGRLVGQLTSPFTKNVTLRQAQYDRAADPEFALDLAKIIVGAKLTNSRGLLQ 120
Query 120 ---RAHTSGQDVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQEFAF 176
R H E I + + + S +LA L G EG AA YF L +V F F
Sbjct 121 EFARNHPESGLKGE-IERLTELILQIGGSPNLAALLGLEGAAAHTYFQGLARMVRHGFGF 179
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
GR P D N+++SLGY+L+Y I ++ + Y+GF HQ GHATLASDL+E
Sbjct 180 SGRQHHPAPDPVNALLSLGYTLVYNEISSLLDGMGFDPYMGFYHQPRYGHATLASDLLEE 239
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATR 277
+RA ++D L LI + V + F ++ +G ++ E +
Sbjct 240 FRALLVDRLTLSLINNRVFGEQDFFRHEPSGGMYLGDEPRK 280
>gi|254417359|ref|ZP_05031101.1| CRISPR-associated protein Cas1 [Microcoleus chthonoplastes PCC
7420]
gi|196175794|gb|EDX70816.1| CRISPR-associated protein Cas1 [Microcoleus chthonoplastes PCC
7420]
Length=354
Score = 139 bits (351), Expect = 5e-31, Method: Compositional matrix adjust.
Identities = 92/325 (29%), Positives = 160/325 (50%), Gaps = 5/325 (1%)
Query 1 MVQLYVSDSVSRISFADGRVIVWSEELGESQYPIETLDGITLFGRPTMTTPFIVEMLKRE 60
M +Y+ + + I R I++ E + + PI + I +FG ++TP + L+ +
Sbjct 21 MAAIYLIEQGTTIYKEYQRFIIYVSEKPKLEVPIREVQQILVFGNIQLSTPVMQVCLREQ 80
Query 61 RDIQLFTTDGHYQGRISTPDVSYAPRLRQQVHRTDDPAFCLSLSKRIVSRKILN-QQALI 119
+ + G Y G + + + + QV R D AF +S+ IV K++N +Q L+
Sbjct 81 IAVVFLSQSGRYHGHLWSSEFRDLDQELVQVRRWGDAAFQFQVSQAIVYGKLMNSKQLLL 140
Query 120 RAHTSGQ--DVAESIRTMKHSLAWVDRSGSLAELNGFEGNAAKAYFTALGHLVPQ-EFAF 176
R + + DV +I + + ++ S SL L G+EG A YF ALG L+ F F
Sbjct 141 RFNRKRKLPDVERAIIGINQDIEALEFSESLDRLRGYEGIGAARYFPALGQLITNSRFEF 200
Query 177 QGRSTRPPLDAFNSMVSLGYSLLYKNIIGAIERHSLNAYIGFLHQDSRGHATLASDLMEV 236
R+ +PP D NS++S GY+LL+ N++G I L+ Y+G H R LA DLME
Sbjct 201 SLRNRQPPTDPVNSLLSFGYTLLFNNVLGFIIAEGLSPYLGNFHYGERQKPYLAFDLMEE 260
Query 237 WRAPIIDDTVLRLIADGVVDTRAFSKNSDTGAVFATREATRSIARAFGNRIARTATYIKG 296
R+ ++D VL ++ + + F TG V+ + A R + F R+ ++
Sbjct 261 MRSVVVDSLVLNIVNHSLFKPQDFDTVPSTGGVYLNQSARRVFLKQFETRMNEEVSH-PD 319
Query 297 DPHRYTFQYALDLQLQSLVRVIEAG 321
+ T++ A+ LQ++ + + +G
Sbjct 320 LQSKVTYRQAIQLQVRRYKQSLLSG 344
Lambda K H
0.321 0.135 0.390
Gapped
Lambda K H
0.267 0.0410 0.140
Effective search space used: 611369523864
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
Posted date: Sep 5, 2011 4:36 AM
Number of letters in database: 5,219,829,388
Number of sequences in database: 15,229,318
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40