BLASTP 2.2.25+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters



Query= Rv2998

Length=153
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|15610135|ref|NP_217514.1| hypothetical protein Rv2998 [Mycoba...   301     2e-80
gi|148824188|ref|YP_001288942.1| hypothetical protein TBFG_13013...   299     1e-79
gi|340627987|ref|YP_004746439.1| hypothetical protein MCAN_30201...   276     6e-73
gi|15842554|ref|NP_337591.1| hypothetical protein MT3076 [Mycoba...   266     7e-70
gi|308232310|ref|ZP_07664053.1| hypothetical protein TMAG_01205 ...   254     4e-66
gi|308369924|ref|ZP_07419540.2| hypothetical protein TMBG_03149 ...   238     3e-61
gi|167967836|ref|ZP_02550113.1| hypothetical protein MtubH3_0730...   223     7e-57
gi|254552076|ref|ZP_05142523.1| hypothetical protein Mtube_16732...   137     4e-31
gi|298526468|ref|ZP_07013877.1| hypothetical protein TBAG_01887 ...   101     4e-20
gi|288921100|ref|ZP_06415389.1| hypothetical protein FrEUN1fDRAF...   42.0    0.031
gi|71907444|ref|YP_285031.1| formate dehydrogenase accessory pro...    37.4    0.75


>gi|15610135|ref|NP_217514.1| hypothetical protein Rv2998 [Mycobacterium tuberculosis H37Rv]
 gi|31794174|ref|NP_856667.1| hypothetical protein Mb3022 [Mycobacterium bovis AF2122/97]
 gi|121638879|ref|YP_979103.1| hypothetical protein BCG_3019 [Mycobacterium bovis BCG str. Pasteur 1173P2]
 39 more sequence titles
Length=153

 Score = 301 bits (771), Expect = 2e-80, Method: Compositional matrix adjust.
 Identities = 152/153 (99%), Positives = 153/153 (100%), Gaps = 0/153 (0%)

Query  1    VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL  60
            +DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL
Sbjct  1    MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL  60

Query  61   SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE  120
            SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE
Sbjct  61   SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE  120

Query  121  QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN  153
            QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN
Sbjct  121  QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN  153


>gi|148824188|ref|YP_001288942.1| hypothetical protein TBFG_13013 [Mycobacterium tuberculosis F11]
 gi|148722715|gb|ABR07340.1| hypothetical protein TBFG_13013 [Mycobacterium tuberculosis F11]
Length=153

 Score = 299 bits (765), Expect = 1e-79, Method: Compositional matrix adjust.
 Identities = 151/153 (99%), Positives = 152/153 (99%), Gaps = 0/153 (0%)

Query  1    VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL  60
            +DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL
Sbjct  1    MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL  60

Query  61   SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE  120
            SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESR DCGE
Sbjct  61   SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRPDCGE  120

Query  121  QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN  153
            QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN
Sbjct  121  QFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN  153


>gi|340627987|ref|YP_004746439.1| hypothetical protein MCAN_30201 [Mycobacterium canettii CIPT 140010059]
 gi|340006177|emb|CCC45351.1| hypothetical protein MCAN_30201 [Mycobacterium canettii CIPT 140010059]
Length=154

 Score = 276 bits (707), Expect = 6e-73, Method: Compositional matrix adjust.
 Identities = 140/142 (99%), Positives = 141/142 (99%), Gaps = 0/142 (0%)

Query  1    VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL  60
            +DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL
Sbjct  1    MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGL  60

Query  61   SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE  120
            SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE
Sbjct  61   SSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGE  120

Query  121  QFGVASWTPQGEFEFGGQEAKG  142
            QFG ASWTPQGEFEFGGQEAKG
Sbjct  121  QFGDASWTPQGEFEFGGQEAKG  142


>gi|15842554|ref|NP_337591.1| hypothetical protein MT3076 [Mycobacterium tuberculosis CDC1551]
 gi|13882866|gb|AAK47405.1| hypothetical protein MT3076 [Mycobacterium tuberculosis CDC1551]
Length=186

 Score = 266 bits (680), Expect = 7e-70, Method: Compositional matrix adjust.
 Identities = 134/135 (99%), Positives = 135/135 (100%), Gaps = 0/135 (0%)

Query  19   KPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAV  78
            +PRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAV
Sbjct  52   QPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAV  111

Query  79   SGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQ  138
            SGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQ
Sbjct  112  SGGDAGVVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQ  171

Query  139  EAKGVRSSWPASLTN  153
            EAKGVRSSWPASLTN
Sbjct  172  EAKGVRSSWPASLTN  186


>gi|308232310|ref|ZP_07664053.1| hypothetical protein TMAG_01205 [Mycobacterium tuberculosis SUMu001]
 gi|308214342|gb|EFO73741.1| hypothetical protein TMAG_01205 [Mycobacterium tuberculosis SUMu001]
Length=129

 Score = 254 bits (648), Expect = 4e-66, Method: Compositional matrix adjust.
 Identities = 129/129 (100%), Positives = 129/129 (100%), Gaps = 0/129 (0%)

Query  25   MPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAG  84
            MPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAG
Sbjct  1    MPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAG  60

Query  85   VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR  144
            VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR
Sbjct  61   VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR  120

Query  145  SSWPASLTN  153
            SSWPASLTN
Sbjct  121  SSWPASLTN  129


>gi|308369924|ref|ZP_07419540.2| hypothetical protein TMBG_03149 [Mycobacterium tuberculosis SUMu002]
 gi|308325986|gb|EFP14837.1| hypothetical protein TMBG_03149 [Mycobacterium tuberculosis SUMu002]
Length=121

 Score = 238 bits (606), Expect = 3e-61, Method: Compositional matrix adjust.
 Identities = 121/121 (100%), Positives = 121/121 (100%), Gaps = 0/121 (0%)

Query  33   MVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGV  92
            MVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGV
Sbjct  1    MVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGV  60

Query  93   DDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLT  152
            DDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLT
Sbjct  61   DDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLT  120

Query  153  N  153
            N
Sbjct  121  N  121


>gi|167967836|ref|ZP_02550113.1| hypothetical protein MtubH3_07307 [Mycobacterium tuberculosis H37Ra]
Length=114

 Score = 223 bits (568), Expect = 7e-57, Method: Compositional matrix adjust.
 Identities = 114/114 (100%), Positives = 114/114 (100%), Gaps = 0/114 (0%)

Query  40   MSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKP  99
            MSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKP
Sbjct  1    MSIRLAGDSTLGRFSTSRLGLSSAKSKPEGDFGTACGAVSGGDAGVVALAEGVDDGQSKP  60

Query  100  GAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN  153
            GAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN
Sbjct  61   GAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVRSSWPASLTN  114


>gi|254552076|ref|ZP_05142523.1| hypothetical protein Mtube_16732 [Mycobacterium tuberculosis '98-R604 INH-RIF-EM']
 gi|289448671|ref|ZP_06438415.1| conserved hypothetical protein [Mycobacterium tuberculosis CPHL_A]
 gi|289421629|gb|EFD18830.1| conserved hypothetical protein [Mycobacterium tuberculosis CPHL_A]
Length=69

 Score = 137 bits (346), Expect = 4e-31, Method: Compositional matrix adjust.
 Identities = 68/69 (99%), Positives = 69/69 (100%), Gaps = 0/69 (0%)

Query  85   VVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR  144
            +VALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR
Sbjct  1    MVALAEGVDDGQSKPGAAGGARGVGGFRESRADCGEQFGVASWTPQGEFEFGGQEAKGVR  60

Query  145  SSWPASLTN  153
            SSWPASLTN
Sbjct  61   SSWPASLTN  69


>gi|298526468|ref|ZP_07013877.1| hypothetical protein TBAG_01887 [Mycobacterium tuberculosis 94_M4241A]
 gi|298496262|gb|EFI31556.1| hypothetical protein TBAG_01887 [Mycobacterium tuberculosis 94_M4241A]
Length=59

 Score = 101 bits (251), Expect = 4e-20, Method: Compositional matrix adjust.
 Identities = 50/51 (99%), Positives = 51/51 (100%), Gaps = 0/51 (0%)

Query  1   VDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLG  51
           +DVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLG
Sbjct  1   MDVIWSATIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLG  51


>gi|288921100|ref|ZP_06415389.1| hypothetical protein FrEUN1fDRAFT_5087 [Frankia sp. EUN1f]
 gi|288347476|gb|EFC81764.1| hypothetical protein FrEUN1fDRAFT_5087 [Frankia sp. EUN1f]
Length=47

 Score = 42.0 bits (97), Expect = 0.031, Method: Composition-based stats.
 Identities = 21/32 (66%), Positives = 22/32 (69%), Gaps = 0/32 (0%)

Query  6   SATIATTVATGMRKPRMHGMPPITSGSMVTRV  37
           SA I TTVATGMR PRM G P I +G  V RV
Sbjct  3   SAIIPTTVATGMRNPRMQGTPLIYAGFTVMRV  34


>gi|71907444|ref|YP_285031.1| formate dehydrogenase accessory protein [Dechloromonas aromatica RCB]
 gi|71847065|gb|AAZ46561.1| Formate dehydrogenase accessory protein [Dechloromonas aromatica RCB]
Length=302

 Score = 37.4 bits (85), Expect = 0.75, Method: Compositional matrix adjust.
 Identities = 25/89 (29%), Positives = 38/89 (43%), Gaps = 12/89 (13%)

Query  8    TIATTVATGMRKPRMHGMPPITSGSMVTRVTRMSIRLAGDSTLGRFSTSRLGLSSAKSKP  67
            T+    A  + + R HGMPP+ + S+               T  RF+  +L L   K+ P
Sbjct  66   TLPLPDAASLEQARTHGMPPLNASSL------------SRPTAWRFALQQLALRLEKTAP  113

Query  68   EGDFGTACGAVSGGDAGVVALAEGVDDGQ  96
            EG      G  S  DA +  LA+ +  G+
Sbjct  114  EGAKKALKGLFSASDADLEKLADMLLTGE  142



Lambda      K        H
   0.314    0.131    0.392

Gapped
Lambda      K        H
   0.267   0.0410    0.140

Effective search space used: 127769454500


  Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
    Posted date:  Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database:  15,229,318



Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40