BLASTP 2.2.25+
Reference:
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics:
Alejandro A. Schäffer, L. Aravind, Thomas L. Madden, Sergei
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST
protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
15,229,318 sequences; 5,219,829,388 total letters
Query= Rv3122
Length=156
Sequences producing significant alignments: Score (Bits) E Value
gi|254233746|ref|ZP_04927071.1| hypothetical protein TBCG_03059 ... 322 1e-86
gi|15610259|ref|NP_217638.1| hypothetical protein Rv3122 [Mycoba... 321 3e-86
gi|31794297|ref|NP_856790.1| hypothetical protein Mb3145 [Mycoba... 319 7e-86
gi|167970213|ref|ZP_02552490.1| hypothetical protein MtubH3_2018... 288 2e-76
gi|289575834|ref|ZP_06456061.1| conserved hypothetical protein [... 287 5e-76
gi|339633129|ref|YP_004724771.1| hypothetical protein MAF_31290 ... 205 1e-51
gi|308271934|emb|CBX28542.1| hypothetical protein N47_G38660 [un... 112 1e-23
gi|186474617|ref|YP_001863588.1| hypothetical protein Bphy_7610 ... 66.2 1e-09
gi|255595127|ref|XP_002536233.1| conserved hypothetical protein ... 60.5 8e-08
gi|308272218|emb|CBX28824.1| unknown protein [uncultured Desulfo... 56.2 2e-06
gi|258652785|ref|YP_003201941.1| hypothetical protein Namu_2600 ... 49.7 2e-04
gi|218289478|ref|ZP_03493706.1| hypothetical protein AaLAA1DRAFT... 45.1 0.004
gi|257793012|ref|YP_003186411.1| hypothetical protein Aaci_3019 ... 42.7 0.018
gi|340959603|gb|EGS20784.1| helicase-like protein [Chaetomium th... 35.8 2.4
gi|156100605|ref|XP_001616030.1| erythrocyte membrane protein 3 ... 34.3 7.0
gi|301101531|ref|XP_002899854.1| conserved hypothetical protein ... 34.3 7.1
>gi|254233746|ref|ZP_04927071.1| hypothetical protein TBCG_03059 [Mycobacterium tuberculosis C]
gi|124599275|gb|EAY58379.1| hypothetical protein TBCG_03059 [Mycobacterium tuberculosis C]
Length=161
Score = 322 bits (825), Expect = 1e-86, Method: Compositional matrix adjust.
Identities = 156/156 (100%), Positives = 156/156 (100%), Gaps = 0/156 (0%)
Query 1 VYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 60
VYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC
Sbjct 6 VYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 65
Query 61 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI 120
AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI
Sbjct 66 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI 125
Query 121 CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPPR 156
CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPPR
Sbjct 126 CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPPR 161
>gi|15610259|ref|NP_217638.1| hypothetical protein Rv3122 [Mycobacterium tuberculosis H37Rv]
gi|15842694|ref|NP_337731.1| hypothetical protein MT3205 [Mycobacterium tuberculosis CDC1551]
gi|148662976|ref|YP_001284499.1| hypothetical protein MRA_3154 [Mycobacterium tuberculosis H37Ra]
32 more sequence titles
Length=156
Score = 321 bits (822), Expect = 3e-86, Method: Compositional matrix adjust.
Identities = 155/156 (99%), Positives = 156/156 (100%), Gaps = 0/156 (0%)
Query 1 VYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 60
+YSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC
Sbjct 1 MYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 60
Query 61 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI 120
AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI
Sbjct 61 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI 120
Query 121 CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPPR 156
CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPPR
Sbjct 121 CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPPR 156
>gi|31794297|ref|NP_856790.1| hypothetical protein Mb3145 [Mycobacterium bovis AF2122/97]
gi|121639003|ref|YP_979227.1| hypothetical protein BCG_3143 [Mycobacterium bovis BCG str. Pasteur
1173P2]
gi|224991495|ref|YP_002646184.1| hypothetical protein JTY_3138 [Mycobacterium bovis BCG str. Tokyo
172]
gi|31619892|emb|CAD96832.1| HYPOTHETICAL PROTEIN Mb3145 [Mycobacterium bovis AF2122/97]
gi|121494651|emb|CAL73132.1| Hypothetical protein BCG_3143 [Mycobacterium bovis BCG str. Pasteur
1173P2]
gi|224774610|dbj|BAH27416.1| hypothetical protein JTY_3138 [Mycobacterium bovis BCG str. Tokyo
172]
gi|341603042|emb|CCC65720.1| hypothetical protein BCGM_3127 [Mycobacterium bovis BCG str.
Moreau RDJ]
Length=156
Score = 319 bits (818), Expect = 7e-86, Method: Compositional matrix adjust.
Identities = 154/155 (99%), Positives = 155/155 (100%), Gaps = 0/155 (0%)
Query 1 VYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 60
+YSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC
Sbjct 1 MYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 60
Query 61 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI 120
AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI
Sbjct 61 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAI 120
Query 121 CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPP 155
CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPP
Sbjct 121 CEARPPNPAATAPPAGTTGHKKGGSATRSRRSSPP 155
>gi|167970213|ref|ZP_02552490.1| hypothetical protein MtubH3_20183 [Mycobacterium tuberculosis
H37Ra]
gi|254552206|ref|ZP_05142653.1| hypothetical protein Mtube_17423 [Mycobacterium tuberculosis
'98-R604 INH-RIF-EM']
gi|294995484|ref|ZP_06801175.1| hypothetical protein Mtub2_13472 [Mycobacterium tuberculosis
210]
30 more sequence titles
Length=141
Score = 288 bits (737), Expect = 2e-76, Method: Compositional matrix adjust.
Identities = 140/141 (99%), Positives = 141/141 (100%), Gaps = 0/141 (0%)
Query 16 VGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWT 75
+GEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWT
Sbjct 1 MGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWT 60
Query 76 RTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARPPNPAATAPPA 135
RTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARPPNPAATAPPA
Sbjct 61 RTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARPPNPAATAPPA 120
Query 136 GTTGHKKGGSATRSRRSSPPR 156
GTTGHKKGGSATRSRRSSPPR
Sbjct 121 GTTGHKKGGSATRSRRSSPPR 141
>gi|289575834|ref|ZP_06456061.1| conserved hypothetical protein [Mycobacterium tuberculosis K85]
gi|289540265|gb|EFD44843.1| conserved hypothetical protein [Mycobacterium tuberculosis K85]
Length=141
Score = 287 bits (734), Expect = 5e-76, Method: Compositional matrix adjust.
Identities = 139/140 (99%), Positives = 140/140 (100%), Gaps = 0/140 (0%)
Query 16 VGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWT 75
+GEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWT
Sbjct 1 MGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWT 60
Query 76 RTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARPPNPAATAPPA 135
RTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARPPNPAATAPPA
Sbjct 61 RTVAGRGTKGRQLSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARPPNPAATAPPA 120
Query 136 GTTGHKKGGSATRSRRSSPP 155
GTTGHKKGGSATRSRRSSPP
Sbjct 121 GTTGHKKGGSATRSRRSSPP 140
>gi|339633129|ref|YP_004724771.1| hypothetical protein MAF_31290 [Mycobacterium africanum GM041182]
gi|339332485|emb|CCC28199.1| hypothetical protein MAF_31290 [Mycobacterium africanum GM041182]
Length=156
Score = 205 bits (522), Expect = 1e-51, Method: Compositional matrix adjust.
Identities = 97/98 (99%), Positives = 98/98 (100%), Gaps = 0/98 (0%)
Query 1 VYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 60
+YSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC
Sbjct 1 MYSGCWINNQNGETRVGEDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVC 60
Query 61 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRA 98
AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRA
Sbjct 61 AQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEVDKVRA 98
>gi|308271934|emb|CBX28542.1| hypothetical protein N47_G38660 [uncultured Desulfobacterium
sp.]
gi|308271952|emb|CBX28560.1| hypothetical protein N47_G38840 [uncultured Desulfobacterium
sp.]
gi|308272635|emb|CBX29239.1| hypothetical protein N47_J02200 [uncultured Desulfobacterium
sp.]
gi|308273746|emb|CBX30348.1| hypothetical protein N47_D31570 [uncultured Desulfobacterium
sp.]
gi|308275192|emb|CBX31789.1| hypothetical protein N47_N26140 [uncultured Desulfobacterium
sp.]
Length=128
Score = 112 bits (281), Expect = 1e-23, Method: Compositional matrix adjust.
Identities = 51/98 (53%), Positives = 65/98 (67%), Gaps = 0/98 (0%)
Query 28 RARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWTRTVAGRGTKGRQ 87
R LY QL TGDFRRG IS YR+CGK NC CA+EGHPGHGP++LW T+ G+
Sbjct 3 REHLYRQLQETGDFRRGIISVVYRKCGKKNCACAKEGHPGHGPQHLWNTTIKGKSYAKSV 62
Query 88 LSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARP 125
E+ K E+AN+ R+ ++ E+IV VNE IC+ RP
Sbjct 63 KLGPELQKYLEEIANHQRYVKLCEEIVLVNERICDLRP 100
>gi|186474617|ref|YP_001863588.1| hypothetical protein Bphy_7610 [Burkholderia phymatum STM815]
gi|184198576|gb|ACC76538.1| conserved hypothetical protein [Burkholderia phymatum STM815]
Length=110
Score = 66.2 bits (160), Expect = 1e-09, Method: Compositional matrix adjust.
Identities = 35/82 (43%), Positives = 46/82 (57%), Gaps = 2/82 (2%)
Query 24 LEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWTRTVAGRGT 83
L +RRA+L Q+ A RGS+ E Y+RCGKP C CA PGHGP+Y + + GR
Sbjct 9 LRRRRAQLLKQMPALETLLRGSLIERYKRCGKPGCKCADG--PGHGPKYYLSVSFPGRRP 66
Query 84 KGRQLSVEEVDKVRAELANYHR 105
+ + + V LANYHR
Sbjct 67 QMDYVPQADYTDVAEHLANYHR 88
>gi|255595127|ref|XP_002536233.1| conserved hypothetical protein [Ricinus communis]
gi|223520361|gb|EEF26150.1| conserved hypothetical protein [Ricinus communis]
Length=110
Score = 60.5 bits (145), Expect = 8e-08, Method: Compositional matrix adjust.
Identities = 36/89 (41%), Positives = 47/89 (53%), Gaps = 3/89 (3%)
Query 18 EDSLED-LEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWTR 76
ED L L +RRA L Q+ A RGS+ E Y+RCGKP C CA PGHGP+Y +
Sbjct 2 EDILSSTLRKRRAELLRQMPALDTLLRGSLIERYKRCGKPGCKCAD--GPGHGPKYYLSV 59
Query 77 TVAGRGTKGRQLSVEEVDKVRAELANYHR 105
+ GR + + + V L +YHR
Sbjct 60 SFPGRRPQMDYVPQADHADVVERLESYHR 88
>gi|308272218|emb|CBX28824.1| unknown protein [uncultured Desulfobacterium sp.]
Length=70
Score = 56.2 bits (134), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 29/67 (44%), Positives = 37/67 (56%), Gaps = 0/67 (0%)
Query 18 EDSLEDLEQRRARLYDQLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWTRT 77
E+++E LE R LY QL TGDFRRG IS YR+CGK + G G G T+
Sbjct 2 EETIESLEIEREHLYRQLQETGDFRRGIISVVYRKCGKKTAHVLKRGIQGMGRNIYGTQQ 61
Query 78 VAGRGTK 84
G+ T+
Sbjct 62 SKGKATR 68
>gi|258652785|ref|YP_003201941.1| hypothetical protein Namu_2600 [Nakamurella multipartita DSM
44233]
gi|258652834|ref|YP_003201990.1| hypothetical protein Namu_2654 [Nakamurella multipartita DSM
44233]
gi|258556010|gb|ACV78952.1| hypothetical protein Namu_2600 [Nakamurella multipartita DSM
44233]
gi|258556059|gb|ACV79001.1| hypothetical protein Namu_2654 [Nakamurella multipartita DSM
44233]
Length=112
Score = 49.7 bits (117), Expect = 2e-04, Method: Compositional matrix adjust.
Identities = 28/92 (31%), Positives = 47/92 (52%), Gaps = 1/92 (1%)
Query 34 QLAATGDFRRGSISENYRRCGKPNCVCAQEGHPGHGPRYLWTRTVAGRGTKGRQLSVEEV 93
QLAA G GS++ RCGKP+C C + H P + WTR +AGR T R+L+ +++
Sbjct 17 QLAALGFVLPGSVTHRQARCGKPSCRCHADPPVLHPPFWSWTRKLAGR-TVTRRLTEDQL 75
Query 94 DKVRAELANYHRFAQVSEQIVAVNEAICEARP 125
+ N R + ++ ++ + + P
Sbjct 76 RDYQPWFDNSRRLRALVTELEDLSLRVLDQDP 107
>gi|218289478|ref|ZP_03493706.1| hypothetical protein AaLAA1DRAFT_1292 [Alicyclobacillus acidocaldarius
LAA1]
gi|218240346|gb|EED07528.1| hypothetical protein AaLAA1DRAFT_1292 [Alicyclobacillus acidocaldarius
LAA1]
Length=300
Score = 45.1 bits (105), Expect = 0.004, Method: Compositional matrix adjust.
Identities = 33/110 (30%), Positives = 54/110 (50%), Gaps = 23/110 (20%)
Query 45 SISENYRRCGKPNCVCAQEGHPGHGPRYLWTRTVAGRG---------------TKGRQLS 89
SI+ YR+CGKP C +EG PGHGP + ++TV GR T+ +
Sbjct 9 SIAAQYRKCGKPTCRVCREG-PGHGPYWYGSKTVDGRRISKYFGKTPPMAQEITQDKPSV 67
Query 90 VEEVDKVRAELANYH-RFAQVSEQIVAVNEAICEARPPNPAATAPPAGTT 138
++E+ ++R E A+ + AQ+ ++ A+ PP P+ + P T
Sbjct 68 LQELAQLREENASLRAQVAQLQAELAALRT------PPAPSNSPHPLEET 111
>gi|257793012|ref|YP_003186411.1| hypothetical protein Aaci_3019 [Alicyclobacillus acidocaldarius
subsp. acidocaldarius DSM 446]
gi|257479704|gb|ACV60022.1| hypothetical protein Aaci_3019 [Alicyclobacillus acidocaldarius
subsp. acidocaldarius DSM 446]
Length=300
Score = 42.7 bits (99), Expect = 0.018, Method: Compositional matrix adjust.
Identities = 33/100 (33%), Positives = 49/100 (49%), Gaps = 17/100 (17%)
Query 45 SISENYRRCGKPNCVCAQEGHPGHGPRYLWTRTVAGRGTK---GRQLSVE---------- 91
SI+ YRRCGK C +EG PGHGP + ++TV GR G+ VE
Sbjct 9 SIAAQYRRCGKSACRVCREG-PGHGPYWYGSKTVDGRRLTKYFGKVPPVEQEIAQDEPSV 67
Query 92 --EVDKVRAELANYH-RFAQVSEQIVAVNEAICEARPPNP 128
E+ ++R E N + AQ+ ++ A+ + PP+P
Sbjct 68 LKELSRLREENENLRAQVAQLQAELAALRTPPALSDPPHP 107
>gi|340959603|gb|EGS20784.1| helicase-like protein [Chaetomium thermophilum var. thermophilum
DSM 1495]
Length=1886
Score = 35.8 bits (81), Expect = 2.4, Method: Composition-based stats.
Identities = 17/31 (55%), Positives = 18/31 (59%), Gaps = 0/31 (0%)
Query 125 PPNPAATAPPAGTTGHKKGGSATRSRRSSPP 155
P NPA P+G T G S T SRRSSPP
Sbjct 674 PMNPALFQKPSGGTYSLPGASQTNSRRSSPP 704
>gi|156100605|ref|XP_001616030.1| erythrocyte membrane protein 3 [Plasmodium vivax SaI-1]
gi|148804904|gb|EDL46303.1| erythrocyte membrane protein 3, putative [Plasmodium vivax]
Length=1034
Score = 34.3 bits (77), Expect = 7.0, Method: Composition-based stats.
Identities = 16/34 (48%), Positives = 20/34 (59%), Gaps = 0/34 (0%)
Query 122 EARPPNPAATAPPAGTTGHKKGGSATRSRRSSPP 155
E P NP +PP+G + K SAT S +SSPP
Sbjct 105 EGMPDNPLTKSPPSGLKANIKDQSATHSLKSSPP 138
>gi|301101531|ref|XP_002899854.1| conserved hypothetical protein [Phytophthora infestans T30-4]
gi|262102856|gb|EEY60908.1| conserved hypothetical protein [Phytophthora infestans T30-4]
Length=543
Score = 34.3 bits (77), Expect = 7.1, Method: Compositional matrix adjust.
Identities = 25/63 (40%), Positives = 34/63 (54%), Gaps = 6/63 (9%)
Query 88 LSVEEVDKVRAELANYHRFAQVSEQIVAVNEAICEARPPNPAATAPPAGTTGHKKGGSAT 147
LS E VD++ AEL+ + + V E +A+ + PP +ATA G G GGSAT
Sbjct 208 LSPEFVDRLAAELSRWQFWVDVREMYIAL---LSRQHPPG-SATATEGGPIG--TGGSAT 261
Query 148 RSR 150
SR
Sbjct 262 NSR 264
Lambda K H
0.315 0.131 0.405
Gapped
Lambda K H
0.267 0.0410 0.140
Effective search space used: 130065254832
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
Posted date: Sep 5, 2011 4:36 AM
Number of letters in database: 5,219,829,388
Number of sequences in database: 15,229,318
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40
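
The bit scores and E-values in this report can be approximately reproduced from the gapped Lambda and K values and the effective search space listed above, using the standard Karlin-Altschul relations: bits = (Lambda * raw - ln K) / ln 2 and E = search_space * 2^(-bits). The sketch below is illustrative only. The function and constant names are hypothetical, and because hits marked "Compositional matrix adjust" are rescaled per alignment, the E-values it computes should be read as estimates rather than exact reproductions.

```python
import math

# Values copied from the "Gapped" statistics and the
# "Effective search space used" line of this report.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
EFFECTIVE_SEARCH_SPACE = 130065254832


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score (hypothetical helper)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, search_space=EFFECTIVE_SEARCH_SPACE):
    """Estimate the E-value for a raw score over the given search space."""
    return search_space * 2.0 ** (-bit_score(raw_score))


if __name__ == "__main__":
    # Raw scores are the numbers in parentheses on the "Score =" lines.
    for raw in (825, 281, 160):
        print(raw, round(bit_score(raw), 1), f"{expect_value(raw):.1e}")
```

For the top hit (raw score 825) this gives roughly 322.4 bits, consistent with the reported 322 if the report truncates bit scores of 100 or more to whole numbers, which is an assumption about the formatter rather than something stated in the output.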