BLASTP 2.2.25+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf,
Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of
PSI-BLAST protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters

Query= Rv2307B
Length=143

                                                                     Score    E
Sequences producing significant alignments:                          (Bits)  Value

gi|31793488|ref|NP_855981.1| glycine rich protein [Mycobacterium...   288    2e-76
gi|289754410|ref|ZP_06513788.1| glycine rich protein [Mycobacter...   286    7e-76
gi|15841802|ref|NP_336839.1| hypothetical protein MT2365.1 [Myco...   271    2e-71
gi|120404998|ref|YP_954827.1| hypothetical protein Mvan_4044 [My...   62.8   2e-08
gi|15839438|ref|NP_334475.1| hypothetical protein MT0066.1 [Myco...   59.3   2e-07
gi|254233460|ref|ZP_04926786.1| hypothetical protein TBCG_00060 ...   57.8   6e-07
gi|308232627|ref|ZP_07664136.1| hypothetical protein TMAG_00731 ...   55.8   2e-06
gi|118619016|ref|YP_907348.1| hypothetical protein MUL_3771 [Myc...   55.1   4e-06
gi|183983814|ref|YP_001852105.1| hypothetical protein MMAR_3839 ...   53.9   7e-06
gi|183983815|ref|YP_001852106.1| hypothetical protein MMAR_3840 ...   52.8   2e-05
gi|109521896|ref|YP_655333.1| gp56 [Mycobacterium phage Pipefish...   52.8   2e-05
gi|222435724|ref|YP_002564152.1| gp54 [Mycobacterium phage Phlye...   47.4   7e-04
gi|194303248|ref|YP_002014662.1| gp51 [Mycobacterium phage Phaed...   47.4   7e-04
gi|326905821|gb|EGE52754.1| hypothetical protein TBPG_03787 [Myc...   45.8   0.002
gi|240170091|ref|ZP_04748750.1| hypothetical protein MkanA1_1231...   41.2   0.045
gi|29566245|ref|NP_817813.1| gp52 [Mycobacterium phage Rosebush]...   40.8   0.073
gi|296532124|ref|ZP_06894891.1| hypothetical protein HMPREF0731_...   35.0   4.0
gi|120405868|ref|YP_955697.1| hypothetical protein Mvan_4918 [My...   33.5   9.9


>gi|31793488|ref|NP_855981.1| glycine rich protein [Mycobacterium bovis AF2122/97]
 gi|57116965|ref|YP_177666.1| glycine rich protein [Mycobacterium tuberculosis H37Rv]
 gi|121638191|ref|YP_978415.1| hypothetical protein BCG_2326c [Mycobacterium bovis BCG str. Pasteur 1173P2]
 49 more sequence titles
Length=143

Score = 288 bits (738), Expect = 2e-76, Method: Compositional matrix adjust.
Identities = 142/143 (99%), Positives = 143/143 (100%), Gaps = 0/143 (0%)

Query  1    VEEVPTGPPAMGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAE  60
            +EEVPTGPPAMGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAE
Sbjct  1    MEEVPTGPPAMGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAE  60

Query  61   PAPPPGQDPHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTL  120
            PAPPPGQDPHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTL
Sbjct  61   PAPPPGQDPHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTL  120

Query  121  SCVIDDGSPVPPLAAPGSCGGGA  143
            SCVIDDGSPVPPLAAPGSCGGGA
Sbjct  121  SCVIDDGSPVPPLAAPGSCGGGA  143


>gi|289754410|ref|ZP_06513788.1| glycine rich protein [Mycobacterium tuberculosis EAS054]
 gi|289694997|gb|EFD62426.1| glycine rich protein [Mycobacterium tuberculosis EAS054]
Length=143

Score = 286 bits (732), Expect = 7e-76, Method: Compositional matrix adjust.
Identities = 141/143 (99%), Positives = 142/143 (99%), Gaps = 0/143 (0%)

Query  1    VEEVPTGPPAMGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAE  60
            +EEVPTGPPAMGHRACGGQKAAFPTR NSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAE
Sbjct  1    MEEVPTGPPAMGHRACGGQKAAFPTRTNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAE  60

Query  61   PAPPPGQDPHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTL  120
            PAPPPGQDPHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTL
Sbjct  61   PAPPPGQDPHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTL  120

Query  121  SCVIDDGSPVPPLAAPGSCGGGA  143
            SCVIDDGSPVPPLAAPGSCGGGA
Sbjct  121  SCVIDDGSPVPPLAAPGSCGGGA  143


>gi|15841802|ref|NP_336839.1| hypothetical protein MT2365.1 [Mycobacterium tuberculosis CDC1551]
 gi|13882064|gb|AAK46653.1| hypothetical protein MT2365.1 [Mycobacterium tuberculosis CDC1551]
Length=133

Score = 271 bits (693), Expect = 2e-71, Method: Compositional matrix adjust.
Identities = 133/133 (100%), Positives = 133/133 (100%), Gaps = 0/133 (0%)

Query  11   MGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPH  70
            MGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPH
Sbjct  1    MGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPH  60

Query  71   MPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTLSCVIDDGSPV  130
            MPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTLSCVIDDGSPV
Sbjct  61   MPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTLSCVIDDGSPV  120

Query  131  PPLAAPGSCGGGA  143
            PPLAAPGSCGGGA
Sbjct  121  PPLAAPGSCGGGA  133


>gi|120404998|ref|YP_954827.1| hypothetical protein Mvan_4044 [Mycobacterium vanbaalenii PYR-1]
 gi|119957816|gb|ABM14821.1| hypothetical protein Mvan_4044 [Mycobacterium vanbaalenii PYR-1]
Length=128

Score = 62.8 bits (151), Expect = 2e-08, Method: Compositional matrix adjust.
Identities = 47/134 (36%), Positives = 62/134 (47%), Gaps = 13/134 (9%)

Query  9    PAMGHRACGGQKAAFPTRMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQD  68
            PA G R C   K     R+  G   M +  + +A+      V  +   A A+P
Sbjct  5    PATGERPCFNVKR---LRLLGG-RTMVRAKLYVAVLVALSCVLAAPGVAEADPT------  54

Query  69   PHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVP-APFVGTTLTLSCVIDDG  127
               PN A G CPGG  GGI    +C+G ++PDGSYWH V +    F      ++CVI+D
Sbjct  55   -QKPNIATGDCPGG-TGGILAVAWCNGEKFPDGSYWHNVAMTGGTFATPRFEMNCVINDA  112

Query  128  SPVPPLAAPGSCGG  141
             P    A PG CGG
Sbjct  113  FPSGTPAPPGGCGG  126


>gi|15839438|ref|NP_334475.1| hypothetical protein MT0066.1 [Mycobacterium tuberculosis CDC1551]
 gi|148821251|ref|YP_001286005.1| hypothetical protein TBFG_10060 [Mycobacterium tuberculosis F11]
 gi|253796976|ref|YP_003029977.1| hypothetical protein TBMG_00060 [Mycobacterium tuberculosis KZN 1435]
 20 more sequence titles
Length=126

Score = 59.3 bits (142), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 37/100 (37%), Positives = 49/100 (49%), Gaps = 11/100 (11%)

Query  26   RMNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPHMPNSAMGYCPGG--G  83
            R+     K+    ++ AI     A+ F    A+A+P      DPH P+   GYCPGG  G
Sbjct  9    RLTEFEMKLKFARLSTAILGCAAALVFPASVASADPP-----DPHQPDMTKGYCPGGRWG  63

Query  84   FGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTLSCV  123
            FG +     CDG +YPDGS+WHQ  +   F G      CV
Sbjct  64   FGDL---AVCDGEKYPDGSFWHQW-MQTWFTGPQFYFDCV  99


>gi|254233460|ref|ZP_04926786.1| hypothetical protein TBCG_00060 [Mycobacterium tuberculosis C]
 gi|289445587|ref|ZP_06435331.1| conserved hypothetical protein [Mycobacterium tuberculosis CPHL_A]
 gi|124603253|gb|EAY61528.1| hypothetical protein TBCG_00060 [Mycobacterium tuberculosis C]
 gi|289418545|gb|EFD15746.1| conserved hypothetical protein [Mycobacterium tuberculosis CPHL_A]
Length=112

Score = 57.8 bits (138), Expect = 6e-07, Method: Compositional matrix adjust.
Identities = 36/93 (39%), Positives = 47/93 (51%), Gaps = 11/93 (11%)

Query  33   KMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPHMPNSAMGYCPGG--GFGGITGW  90
            K+    ++ AI     A+ F    A+A+P      DPH P+   GYCPGG  GFG +
Sbjct  2    KLKFARLSTAILGCAAALVFPASVASADPP-----DPHQPDMTKGYCPGGRWGFGDL---  53

Query  91   GYCDGIRYPDGSYWHQVRVPAPFVGTTLTLSCV  123
              CDG +YPDGS+WHQ  +   F G      CV
Sbjct  54   AVCDGEKYPDGSFWHQ-WMQTWFTGPQFYFDCV  85


>gi|308232627|ref|ZP_07664136.1| hypothetical protein TMAG_00731 [Mycobacterium tuberculosis SUMu001]
 gi|308213493|gb|EFO72892.1| hypothetical protein TMAG_00731 [Mycobacterium tuberculosis SUMu001]
Length=93

Score = 55.8 bits (133), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 33/72 (46%), Positives = 39/72 (55%), Gaps = 8/72 (11%)

Query  54   MVSANAEPAPPPGQDPHMPNSAMGYCPGG--GFGGITGWGYCDGIRYPDGSYWHQVRVPA  111
            M  A+   A PP  DPH P+   GYCPGG  GFG +     CDG +YPDGS+WHQ  +
Sbjct  1    MFPASVASADPP--DPHQPDMTKGYCPGGRWGFGDL---AVCDGEKYPDGSFWHQW-MQT  54

Query  112  PFVGTTLTLSCV  123
             F G      CV
Sbjct  55   WFTGPQFYFDCV  66


>gi|118619016|ref|YP_907348.1| hypothetical protein MUL_3771 [Mycobacterium ulcerans Agy99]
 gi|118571126|gb|ABL05877.1| conserved hypothetical secreted protein [Mycobacterium ulcerans Agy99]
Length=136

Score = 55.1 bits (131), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 37/82 (46%), Positives = 44/82 (54%), Gaps = 13/82 (15%)

Query  29   SGVEKMYKNS-IAIAIGTLTMAVEFSMVSANAEPAPPPGQDPHMPNSAMGYCPGGGFGGI  87
             GV+ M K S +  AI     A+ FS   A A P      DPH P+   GYCPGG +G
Sbjct  13   DGVKMMLKLSRLGAAILGGVAALMFSTAVATAGPP-----DPHQPDMTKGYCPGGRWG--  65

Query  88   TGWG---YCDGIRYPDGSYWHQ  106
              WG    CDG +YPDGS+WHQ
Sbjct  66   --WGELAVCDGEKYPDGSFWHQ  85


>gi|183983814|ref|YP_001852105.1| hypothetical protein MMAR_3839 [Mycobacterium marinum M]
 gi|183177140|gb|ACC42250.1| conserved hypothetical secreted protein [Mycobacterium marinum M]
Length=114

Score = 53.9 bits (128), Expect = 7e-06, Method: Compositional matrix adjust.
Identities = 30/61 (50%), Positives = 36/61 (60%), Gaps = 12/61 (19%)

Query  49   AVEFSMVSANAEPAPPPGQDPHMPNSAMGYCPGGGFGGITGWG---YCDGIRYPDGSYWH  105
            A+ FS   A A+P      DPH P+   GYCPGG +G    WG    CDG +YPDGS+WH
Sbjct  18   ALMFSTAVATADPP-----DPHQPDMTKGYCPGGRWG----WGELAVCDGEKYPDGSFWH  68

Query  106  Q  106
            Q
Sbjct  69   Q  69


>gi|183983815|ref|YP_001852106.1| hypothetical protein MMAR_3840 [Mycobacterium marinum M]
 gi|183177141|gb|ACC42251.1| conserved hypothetical secreted protein [Mycobacterium marinum M]
Length=120

Score = 52.8 bits (125), Expect = 2e-05, Method: Compositional matrix adjust.
Identities = 29/56 (52%), Positives = 33/56 (59%), Gaps = 9/56 (16%)

Query  54   MVSANAEPAPPPGQDPHMPNSAMGYCPGGGFGGITGWG---YCDGIRYPDGSYWHQ  106
            M S     A PP  DPH P+   GYCPGG +G    WG    CDG +YPDGS+WHQ
Sbjct  20   MFSTAVATAGPP--DPHQPDMTKGYCPGGRWG----WGELAVCDGEKYPDGSFWHQ  69


>gi|109521896|ref|YP_655333.1| gp56 [Mycobacterium phage Pipefish]
 gi|88910627|gb|ABD58553.1| gp56 [Mycobacterium phage Pipefish]
Length=111

Score = 52.8 bits (125), Expect = 2e-05, Method: Compositional matrix adjust.
Identities = 46/118 (39%), Positives = 57/118 (49%), Gaps = 9/118 (7%)

Query  27   MNSGVEKMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPHMPNSAMGYCPGGGFGG  86
            MN    KM      +A  T+ + V     +  A  AP    DP+ P   + +CPGGG
Sbjct  1    MNLSRWKMAGRLFLVAAATVVLIVCAVGWTGRANAAP----DPYWPIPPV-WCPGGG--T  53

Query  87   ITGWG-YCDGIRYPDGSYWHQVRVPAPFVGTTLT-LSCVIDDGSPVPPLAAPGSCGGG  142
            +T WG YCDG  YPDG+ WH     APFVG     + CV+      PPLA P  CG G
Sbjct  54   MTSWGGYCDGTPYPDGTKWHMDSFVAPFVGRVWNPIVCVVHPAPAPPPLAPPTGCGRG  111


>gi|222435724|ref|YP_002564152.1| gp54 [Mycobacterium phage Phlyer]
 gi|222088277|gb|ACM42218.1| gp54 [Mycobacterium phage Phlyer]
Length=104

Score = 47.4 bits (111), Expect = 7e-04, Method: Compositional matrix adjust.
Identities = 35/93 (38%), Positives = 46/93 (50%), Gaps = 9/93 (9%)

Query  34   MYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPHMPNSAMGYCPGGGFGGITGWG-Y  92
            M    + +A  T  + V     +  A  AP    DP+ P   + +CPGGG   +T WG Y
Sbjct  1    MAGRLLLVAAATAVLIVCAVGWTGRANAAP----DPYWPIPPV-WCPGGGT--MTSWGGY  53

Query  93   CDGIRYPDGSYWHQVRVPAPFVGTTLT-LSCVI  124
            CDG  YPDG+ WH     APFVG     + CV+
Sbjct  54   CDGTPYPDGTKWHMDSFVAPFVGRVWNPIVCVV  86


>gi|194303248|ref|YP_002014662.1| gp51 [Mycobacterium phage Phaedrus]
 gi|194150980|gb|ACF34015.1| gp51 [Mycobacterium phage Phaedrus]
 gi|339754713|gb|AEJ94726.1| gp54 [Mycobacterium phage Daisy]
Length=115

Score = 47.4 bits (111), Expect = 7e-04, Method: Compositional matrix adjust.
Identities = 40/89 (45%), Positives = 47/89 (53%), Gaps = 11/89 (12%)

Query  56   SANAEPAPPPGQDPHMPNSAMGYCPGGGFGGITGWG-YCDGIRYPDGSYWHQVRVPAPFV  114
             ANA P      DP+ P   + +CPGGG   +T WG YCDG  YPDG+ WH     APFV
Sbjct  36   RANATP------DPYWPIPPV-WCPGGG--TMTSWGGYCDGTPYPDGTKWHMDSFVAPFV  86

Query  115  GTTLT-LSCVIDDGSPVPPLAAPGSCGGG  142
            G     + CV+      PPLA P  CG G
Sbjct  87   GRVWNPIVCVVHPAPAPPPLAPPTGCGRG  115


>gi|326905821|gb|EGE52754.1| hypothetical protein TBPG_03787 [Mycobacterium tuberculosis W-148]
Length=66

Score = 45.8 bits (107), Expect = 0.002, Method: Compositional matrix adjust.
Identities = 29/73 (40%), Positives = 39/73 (54%), Gaps = 10/73 (13%)

Query  33   KMYKNSIAIAIGTLTMAVEFSMVSANAEPAPPPGQDPHMPNSAMGYCPGG--GFGGITGW  90
            K+    ++ AI     A+ F    A+A+P      DPH P+   GYCPGG  GFG +
Sbjct  2    KLKFARLSTAILGCAAALVFPASVASADP-----PDPHQPDMTKGYCPGGRWGFGDLA--  54

Query  91   GYCDGIRYPDGSY  103
              CDG +YPDGS+
Sbjct  55   -VCDGEKYPDGSF  66


>gi|240170091|ref|ZP_04748750.1| hypothetical protein MkanA1_12311 [Mycobacterium kansasii ATCC 12478]
Length=77

Score = 41.2 bits (95), Expect = 0.045, Method: Compositional matrix adjust.
Identities = 25/54 (47%), Positives = 29/54 (54%), Gaps = 6/54 (11%)

Query  75   AMGYCPGG--GFGGITGWGYCDGIRYPDGSYWHQVRVPAPFVGTTLTLSCVIDD  126
             MGYCPGG  GFG +     CDG +YPDGS+WHQ  +     G      CV  D
Sbjct  2    TMGYCPGGRWGFGELA---VCDGEKYPDGSFWHQW-MRTYMTGPQWYYDCVSGD  51


>gi|29566245|ref|NP_817813.1| gp52 [Mycobacterium phage Rosebush]
 gi|109521833|ref|YP_655729.1| gp49 [Mycobacterium phage Qyrzula]
 gi|29424970|gb|AAN01894.1| gp52 [Mycobacterium phage Rosebush]
 gi|91980777|gb|ABE67495.1| gp49 [Mycobacterium phage Qyrzula]
 gi|345102612|gb|AEN70214.1| hypothetical protein [Mycobacterium phage AnnaL29]
Length=111

Score = 40.8 bits (94), Expect = 0.073, Method: Compositional matrix adjust.
Identities = 29/71 (41%), Positives = 40/71 (57%), Gaps = 7/71 (9%)

Query  68   DPHMPNSAMGYCPGGGFG-GITGWG-YCDGIRYPDGSYWHQVRVPAPFVGTTLTLSCVID  125
            DP +P     +CPG G G   +GWG YC+G  +PDG+  +  R+   +      L C+I
Sbjct  38   DPRVPAPPF-WCPGNGPGMSASGWGGYCEGQSFPDGTRLNTFRIGYWWQ----PLRCIIP  92

Query  126  DGSPVPPLAAP  136
            +GSP PPLA P
Sbjct  93   NGSPTPPLAGP  103


>gi|296532124|ref|ZP_06894891.1| hypothetical protein HMPREF0731_0371 [Roseomonas cervicalis ATCC 49957]
 gi|296267548|gb|EFH13406.1| hypothetical protein HMPREF0731_0371 [Roseomonas cervicalis ATCC 49957]
Length=180

Score = 35.0 bits (79), Expect = 4.0, Method: Compositional matrix adjust.
Identities = 27/89 (31%), Positives = 38/89 (43%), Gaps = 4/89 (4%)

Query  55   VSANAEPAPPPGQDPHMPNSAMGYCPGGGFGGITGWGYCDGIRYPDGSYWHQVRVPAPFV  114
            V+++ +P P   ++ HM  + M    GG        G C+     DG  WH+    A
Sbjct  89   VASDRQPRP---EEYHMTLARMHTAIGGQSTAKDVEGSCELSLSADGQTWHRATCEATDR  145

Query  115  GTTLTLSCVIDDGSPVPPLAAPGSCGGGA  143
               +T    I +G PV   A PG  GGGA
Sbjct  146  SQRVTRMTFIGNGQPVRA-ARPGQEGGGA  173


>gi|120405868|ref|YP_955697.1| hypothetical protein Mvan_4918 [Mycobacterium vanbaalenii PYR-1]
 gi|119958686|gb|ABM15691.1| hypothetical protein Mvan_4918 [Mycobacterium vanbaalenii PYR-1]
Length=94

Score = 33.5 bits (75), Expect = 9.9, Method: Compositional matrix adjust.
Identities = 25/53 (48%), Positives = 26/53 (50%), Gaps = 4/53 (7%)

Query  85   GGITGWG---YCDGIRYPDGSYWHQVRVPAP-FVGTTLTLSCVIDDGSPVPPL  133
            GG T WG   YCDG  Y DGSY H V V    F GT     C  D  +P  PL
Sbjct  30   GGWTPWGGGEYCDGYIYEDGSYDHCVSVTVLGFGGTQCNRVCPPDPANPAVPL  82


Lambda     K        H
   0.317    0.138    0.456

Gapped
Lambda     K        H
   0.267   0.0410    0.140

Effective search space used: 129250525032

  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
  excluding environmental samples from WGS projects
    Posted date: Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database: 15,229,318

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40
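
The Expect values in the hit table can be approximately recovered from the bit scores and the "Effective search space used" figure in the footer, using the standard relation E = N * 2^(-S') between a normalized bit score S' and a search space N. The Python sketch below is illustrative only: the function name expected_hits is hypothetical and not part of BLAST, and because the report rounds bit scores, the computed values should land close to, not exactly on, the printed ones.

```python
def expected_hits(bit_score: float, search_space: float) -> float:
    """Approximate E-value from a normalized bit score: E = N * 2**(-S')."""
    return search_space * 2.0 ** (-bit_score)


# "Effective search space used" as printed in the report footer.
SEARCH_SPACE = 129250525032

# (accession, bit score) pairs copied from the hit table above.
hits = [
    ("NP_855981.1", 288.0),   # reported Expect: 2e-76
    ("YP_954827.1", 62.8),    # reported Expect: 2e-08
    ("ZP_06894891.1", 35.0),  # reported Expect: 4.0
]

for accession, bits in hits:
    print(f"{accession}: bits={bits}, E~{expected_hits(bits, SEARCH_SPACE):.1e}")
```

Running this gives values of the same order as the report's Expect column; small differences are expected because only the rounded bit scores are available here.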