BLASTP 2.2.25+
Reference:
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics:
Alejandro A. Schäffer, L. Aravind, Thomas L. Madden, Sergei
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST
protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
15,229,318 sequences; 5,219,829,388 total letters
Query= Rv2998
Length=153
Sequences producing significant alignments: Score (Bits) E Value
gi|15610135|ref|NP_217514.1| hypothetical protein Rv2998 [Mycoba... 301 2e-80
gi|148824188|ref|YP_001288942.1| hypothetical protein TBFG_13013... 299 1e-79
gi|340627987|ref|YP_004746439.1| hypothetical protein MCAN_30201... 276 6e-73
gi|15842554|ref|NP_337591.1| hypothetical protein MT3076 [Mycoba... 266 7e-70
gi|308232310|ref|ZP_07664053.1| hypothetical protein TMAG_01205 ... 254 4e-66
gi|308369924|ref|ZP_07419540.2| hypothetical protein TMBG_03149 ... 238 3e-61
gi|167967836|ref|ZP_02550113.1| hypothetical protein MtubH3_0730... 223 7e-57
gi|254552076|ref|ZP_05142523.1| hypothetical protein Mtube_16732... 137 4e-31
gi|298526468|ref|ZP_07013877.1| hypothetical protein TBAG_01887 ... 101 4e-20
gi|288921100|ref|ZP_06415389.1| hypothetical protein FrEUN1fDRAF... 42.0 0.031
gi|71907444|ref|YP_285031.1| formate dehydrogenase accessory pro... 37.4 0.75
>gi|15610135|ref|NP_217514.1| hypothetical protein Rv2998 [Mycobacterium tuberculosis H37Rv]
gi|31794174|ref|NP_856667.1| hypothetical protein Mb3022 [Mycobacterium bovis AF2122/97]
gi|121638879|ref|YP_979103.1| hypothetical protein BCG_3019 [Mycobacterium bovis BCG str. Pasteur
1173P2]
39 more sequence titles
Length=153
Score = 301 bits (771), Expect = 2e-80, Method: Compositional matrix adjust.
Identities = 152/153 (99%), Positives = 153/153 (100%), Gaps = 0/153 (0%)
Query 1 VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL 60
+DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL
Sbjct 1 MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL 60
Query 61 SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE 120
SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE
Sbjct 61 SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE 120
Query 121 QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN 153
QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN
Sbjct 121 QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN 153
>gi|148824188|ref|YP_001288942.1| hypothetical protein TBFG_13013 [Mycobacterium tuberculosis F11]
gi|148722715|gb|ABR07340.1| hypothetical protein TBFG_13013 [Mycobacterium tuberculosis F11]
Length=153
Score = 299 bits (765), Expect = 1e-79, Method: Compositional matrix adjust.
Identities = 151/153 (99%), Positives = 152/153 (99%), Gaps = 0/153 (0%)
Query 1 VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL 60
+DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL
Sbjct 1 MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL 60
Query 61 SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE 120
SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESR DCGE
Sbjct 61 SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRPDCGE 120
Query 121 QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN 153
QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN
Sbjct 121 QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN 153
>gi|340627987|ref|YP_004746439.1| hypothetical protein MCAN_30201 [Mycobacterium canettii CIPT
140010059]
gi|340006177|emb|CCC45351.1| hypothetical protein MCAN_30201 [Mycobacterium canettii CIPT
140010059]
Length=154
Score = 276 bits (707), Expect = 6e-73, Method: Compositional matrix adjust.
Identities = 140/142 (99%), Positives = 141/142 (99%), Gaps = 0/142 (0%)
Query 1 VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL 60
+DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL
Sbjct 1 MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL 60
Query 61 SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE 120
SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE
Sbjct 61 SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE 120
Query 121 QFGVASWTPQGEFEFGGQEAKG 142
QFG ASWTPQGEFEFGGQEAKG
Sbjct 121 QFGDASWTPQGEFEFGGQEAKG 142
>gi|15842554|ref|NP_337591.1| hypothetical protein MT3076 [Mycobacterium tuberculosis CDC1551]
gi|13882866|gb|AAK47405.1| hypothetical protein MT3076 [Mycobacterium tuberculosis CDC1551]
Length=186
Score = 266 bits (680), Expect = 7e-70, Method: Compositional matrix adjust.
Identities = 134/135 (99%), Positives = 135/135 (100%), Gaps = 0/135 (0%)
Query 19 KPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAV 78
+PRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAV
Sbjct 52 QPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAV 111
Query 79 SGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQ 138
SGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQ
Sbjct 112 SGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQ 171
Query 139 EAKGVRSSWPASLTN 153
EAKGVRSSWPASLTN
Sbjct 172 EAKGVRSSWPASLTN 186
>gi|308232310|ref|ZP_07664053.1| hypothetical protein TMAG_01205 [Mycobacterium tuberculosis SUMu001]
gi|308214342|gb|EFO73741.1| hypothetical protein TMAG_01205 [Mycobacterium tuberculosis SUMu001]
Length=129
Score = 254 bits (648), Expect = 4e-66, Method: Compositional matrix adjust.
Identities = 129/129 (100%), Positives = 129/129 (100%), Gaps = 0/129 (0%)
Query 25 MPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAG 84
MPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAG
Sbjct 1 MPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAG 60
Query 85 VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR 144
VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR
Sbjct 61 VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR 120
Query 145 SSWPASLTN 153
SSWPASLTN
Sbjct 121 SSWPASLTN 129
>gi|308369924|ref|ZP_07419540.2| hypothetical protein TMBG_03149 [Mycobacterium tuberculosis SUMu002]
gi|308325986|gb|EFP14837.1| hypothetical protein TMBG_03149 [Mycobacterium tuberculosis SUMu002]
Length=121
Score = 238 bits (606), Expect = 3e-61, Method: Compositional matrix adjust.
Identities = 121/121 (100%), Positives = 121/121 (100%), Gaps = 0/121 (0%)
Query 33 MVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGV 92
MVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGV
Sbjct 1 MVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGV 60
Query 93 DDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLT 152
DDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLT
Sbjct 61 DDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLT 120
Query 153 N 153
N
Sbjct 121 N 121
>gi|167967836|ref|ZP_02550113.1| hypothetical protein MtubH3_07307 [Mycobacterium tuberculosis
H37Ra]
Length=114
Score = 223 bits (568), Expect = 7e-57, Method: Compositional matrix adjust.
Identities = 114/114 (100%), Positives = 114/114 (100%), Gaps = 0/114 (0%)
Query 40 MSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKP 99
MSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKP
Sbjct 1 MSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKP 60
Query 100 GAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN 153
GAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN
Sbjct 61 GAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN 114
>gi|254552076|ref|ZP_05142523.1| hypothetical protein Mtube_16732 [Mycobacterium tuberculosis
'98-R604 INH-RIF-EM']
gi|289448671|ref|ZP_06438415.1| conserved hypothetical protein [Mycobacterium tuberculosis CPHL_A]
gi|289421629|gb|EFD18830.1| conserved hypothetical protein [Mycobacterium tuberculosis CPHL_A]
Length=69
Score = 137 bits (346), Expect = 4e-31, Method: Compositional matrix adjust.
Identities = 68/69 (99%), Positives = 69/69 (100%), Gaps = 0/69 (0%)
Query 85 VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR 144
+VALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR
Sbjct 1 MVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR 60
Query 145 SSWPASLTN 153
SSWPASLTN
Sbjct 61 SSWPASLTN 69
>gi|298526468|ref|ZP_07013877.1| hypothetical protein TBAG_01887 [Mycobacterium tuberculosis 94_M4241A]
gi|298496262|gb|EFI31556.1| hypothetical protein TBAG_01887 [Mycobacterium tuberculosis 94_M4241A]
Length=59
Score = 101 bits (251), Expect = 4e-20, Method: Compositional matrix adjust.
Identities = 50/51 (99%), Positives = 51/51 (100%), Gaps = 0/51 (0%)
Query 1 VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLG 51
+DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLG
Sbjct 1 MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLG 51
>gi|288921100|ref|ZP_06415389.1| hypothetical protein FrEUN1fDRAFT_5087 [Frankia sp. EUN1f]
gi|288347476|gb|EFC81764.1| hypothetical protein FrEUN1fDRAFT_5087 [Frankia sp. EUN1f]
Length=47
Score = 42.0 bits (97), Expect = 0.031, Method: Composition-based stats.
Identities = 21/32 (66%), Positives = 22/32 (69%), Gaps = 0/32 (0%)
Query 6 SATIATTVATGMRKPRMHGMPPITSGSMVTRV 37
SA I TTVATGMR PRM G P I +G V RV
Sbjct 3 SAIIPTTVATGMRNPRMQGTPLIYAGFTVMRV 34
>gi|71907444|ref|YP_285031.1| formate dehydrogenase accessory protein [Dechloromonas aromatica
RCB]
gi|71847065|gb|AAZ46561.1| Formate dehydrogenase accessory protein [Dechloromonas aromatica
RCB]
Length=302
Score = 37.4 bits (85), Expect = 0.75, Method: Compositional matrix adjust.
Identities = 25/89 (29%), Positives = 38/89 (43%), Gaps = 12/89 (13%)
Query 8 TIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKP 67
T+ A + + R HGMPP+ + S+ T RF+ +L L K+ P
Sbjct 66 TLPLPDAASLEQARTHGMPPLNASSL------------SRPTAWRFALQQLALRLEKTAP 113
Query 68 EGDFGTACGAVSGGDAGVVALAEGVDDGQ 96
EG G S DA + LA+ + G+
Sbjct 114 EGAKKALKGLFSASDADLEKLADMLLTGE 142
Lambda K H
0.314 0.131 0.392
Gapped
Lambda K H
0.267 0.0410 0.140
Effective search space used: 127769454500
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
Posted date: Sep 5, 2011 4:36 AM
Number of letters in database: 5,219,829,388
Number of sequences in database: 15,229,318
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40