BLASTP 2.2.25+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters

Query= Rv2078
Length=104

                                                                     Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|15609215|ref|NP_216594.1| hypothetical protein Rv2078 [Mycoba...  211     4e-53
gi|15841570|ref|NP_336607.1| hypothetical protein MT2139 [Mycoba...  207     3e-52
gi|289443584|ref|ZP_06433328.1| conserved hypothetical protein [...  165     2e-39
gi|240168385|ref|ZP_04747044.1| hypothetical protein MkanA1_0368...  125     2e-27
gi|296167164|ref|ZP_06849571.1| conserved hypothetical protein [...  121     3e-26
gi|240173421|ref|ZP_04752079.1| hypothetical protein MkanA1_2917...  119     2e-25
gi|183985439|ref|YP_001853730.1| hypothetical protein MMAR_5469 ...  112     2e-23
gi|254823071|ref|ZP_05228072.1| hypothetical protein MintA_24295...  112     3e-23
gi|41410427|ref|NP_963263.1| hypothetical protein MAP4329c [Myco...  109     1e-22
gi|254777638|ref|ZP_05219154.1| hypothetical protein MaviaA2_236...  108     3e-22
gi|342862330|ref|ZP_08718971.1| hypothetical protein MCOL_25688 ...  107     6e-22
gi|118620061|ref|YP_908393.1| hypothetical protein MUL_5058 [Myc...  107     8e-22
gi|333992295|ref|YP_004524909.1| hypothetical protein JDM601_365...  88.2    4e-16
gi|333989561|ref|YP_004522175.1| hypothetical protein JDM601_092...  70.1    1e-10
gi|118467290|ref|YP_884402.1| hypothetical protein MAV_5291 [Myc...  65.9    2e-09
gi|86741641|ref|YP_482041.1| bacteriophage resistance gene PglY ...  33.9    8.3

>gi|15609215|ref|NP_216594.1| hypothetical protein Rv2078 [Mycobacterium tuberculosis H37Rv]
 gi|148661893|ref|YP_001283416.1| hypothetical protein MRA_2092 [Mycobacterium tuberculosis H37Ra]
 gi|167967779|ref|ZP_02550056.1| hypothetical protein MtubH3_06997 [Mycobacterium tuberculosis H37Ra]
 9 more sequence titles
Length=104

Score = 211 bits (536), Expect = 4e-53, Method: Compositional matrix adjust.
Identities = 103/104 (99%), Positives = 104/104 (100%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM
Sbjct  1    MFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
            RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT
Sbjct  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104

>gi|15841570|ref|NP_336607.1| hypothetical protein MT2139 [Mycobacterium tuberculosis CDC1551]
 gi|31793261|ref|NP_855754.1| hypothetical protein Mb2104 [Mycobacterium bovis AF2122/97]
 gi|121637963|ref|YP_978187.1| hypothetical protein BCG_2097 [Mycobacterium bovis BCG str. Pasteur 1173P2]
 51 more sequence titles
Length=104

Score = 207 bits (528), Expect = 3e-52, Method: Compositional matrix adjust.
Identities = 102/104 (99%), Positives = 103/104 (99%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVDV LLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM
Sbjct  1    MFVDVGLLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
            RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT
Sbjct  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104

>gi|289443584|ref|ZP_06433328.1| conserved hypothetical protein [Mycobacterium tuberculosis T46]
 gi|289416503|gb|EFD13743.1| conserved hypothetical protein [Mycobacterium tuberculosis T46]
Length=94

Score = 165 bits (417), Expect = 2e-39, Method: Compositional matrix adjust.
Identities = 81/83 (98%), Positives = 82/83 (99%), Gaps = 0/83 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVDV LLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM
Sbjct  1    MFVDVGLLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60

Query  61   RNLHAHRQALITVGEKARHAATG  83
            RNLHAHRQALITVGEKARHAATG
Sbjct  61   RNLHAHRQALITVGEKARHAATG  83

>gi|240168385|ref|ZP_04747044.1| hypothetical protein MkanA1_03682 [Mycobacterium kansasii ATCC 12478]
Length=104

Score = 125 bits (314), Expect = 2e-27, Method: Compositional matrix adjust.
Identities = 66/104 (64%), Positives = 75/104 (73%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD   L  GAN+SH AG+HA  GA  LSRGPLLSGMFG F  A+ FH AV +AHAQQ+
Sbjct  1    MFVDTGSLRLGANDSHRAGDHAQDGAGCLSRGPLLSGMFGEFAAAEAFHGAVTSAHAQQV  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
            + L AH+ AL  VG  AR AA GFT MDD NAAEL+AV CS AT
Sbjct  61   KTLQAHQDALTAVGGNARRAAVGFTGMDDRNAAELRAVRCSSAT  104

>gi|296167164|ref|ZP_06849571.1| conserved hypothetical protein [Mycobacterium parascrofulaceum ATCC BAA-614]
 gi|295897486|gb|EFG77085.1| conserved hypothetical protein [Mycobacterium parascrofulaceum ATCC BAA-614]
Length=106

Score = 121 bits (304), Expect = 3e-26, Method: Compositional matrix adjust.
Identities = 63/104 (61%), Positives = 70/104 (68%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD  LLHSG NESH AG HA  GADQL+RGPL SGMFG F  A  FH+AV  AH Q +
Sbjct  1    MFVDTALLHSGGNESHRAGGHAQEGADQLARGPLASGMFGDFAAADAFHEAVTTAHTQHV  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
            +NL  H+Q L  VG KA +AA GFT MD  NA ELKAV     T
Sbjct  61   QNLQGHKQTLTDVGAKAHYAAKGFTSMDQQNAGELKAVRPRSGT  104

>gi|240173421|ref|ZP_04752079.1| hypothetical protein MkanA1_29171 [Mycobacterium kansasii ATCC 12478]
Length=123

Score = 119 bits (297), Expect = 2e-25, Method: Compositional matrix adjust.
Identities = 59/104 (57%), Positives = 75/104 (73%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD +LLHSG N+SH AG HA  GADQL++GPL SG FG F   +TFH  V A+  + +
Sbjct  20   MFVDTDLLHSGGNQSHQAGGHAREGADQLAQGPLPSGTFGEFAAGETFHGVVSASLTKHV  79

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
            + L AH +AL  +G+KA HAA GFTDMD+ NAA+L+AV CS  T
Sbjct  80   QTLQAHHEALSAIGDKAHHAAAGFTDMDERNAAKLRAVRCSSVT  123

>gi|183985439|ref|YP_001853730.1| hypothetical protein MMAR_5469 [Mycobacterium marinum M]
 gi|183178765|gb|ACC43875.1| conserved hypothetical protein [Mycobacterium marinum M]
Length=104

Score = 112 bits (279), Expect = 2e-23, Method: Compositional matrix adjust.
Identities = 60/104 (58%), Positives = 71/104 (69%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD  LLH G NESH AG HA  GAD+L+ GPL+SGMFG F  A  FH+ V +AHAQ +
Sbjct  1    MFVDTGLLHLGGNESHRAGGHAQEGADRLALGPLMSGMFGDFAAADAFHNGVHSAHAQHV  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
            RNL AH++AL  VG  A  AA GFT MD+ NA  L+AV  S  T
Sbjct  61   RNLQAHQEALTAVGSNAHLAAKGFTAMDEHNAEALQAVRWSAGT  104

>gi|254823071|ref|ZP_05228072.1| hypothetical protein MintA_24295 [Mycobacterium intracellulare ATCC 13950]
Length=109

Score = 112 bits (279), Expect = 3e-23, Method: Compositional matrix adjust.
Identities = 58/102 (57%), Positives = 69/102 (68%), Gaps = 0/102 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FV+ E LHSG N+SH AG HA  GAD L+ G L SGMFG F  A +FH+AV  AH Q +
Sbjct  1    MFVNTEQLHSGGNQSHRAGGHAQEGADHLAGGTLESGMFGDFEAADSFHNAVTTAHGQHV  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSC  102
            +NL  H + L +VG KA HAA GFT+MD  NA ELKAV  S
Sbjct  61   KNLQGHSETLTSVGTKAHHAANGFTNMDQHNAEELKAVRPSS  102

>gi|41410427|ref|NP_963263.1| hypothetical protein MAP4329c [Mycobacterium avium subsp. paratuberculosis K-10]
 gi|41399261|gb|AAS06879.1| hypothetical protein MAP_4329c [Mycobacterium avium subsp.
 paratuberculosis K-10]
 gi|336459794|gb|EGO38708.1| Protein of unknown function (DUF2563) [Mycobacterium avium subsp. paratuberculosis S397]
Length=109

Score = 109 bits (272), Expect = 1e-22, Method: Compositional matrix adjust.
Identities = 56/98 (58%), Positives = 68/98 (70%), Gaps = 0/98 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD +LLHSG N+SH AG HA  GADQL+ G + SGMFG F  A  FH AV  AH Q +
Sbjct  1    MFVDTDLLHSGGNQSHRAGGHARDGADQLAGGTVASGMFGDFAAADAFHSAVVVAHEQHV  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAV  98
            +NL AH + L  VG KA HAA GFT+MD  NA E++A+
Sbjct  61   QNLQAHSETLTGVGTKAHHAANGFTNMDQQNATEMRAL  98

>gi|254777638|ref|ZP_05219154.1| hypothetical protein MaviaA2_23616 [Mycobacterium avium subsp. avium ATCC 25291]
Length=111

Score = 108 bits (270), Expect = 3e-22, Method: Compositional matrix adjust.
Identities = 55/98 (57%), Positives = 67/98 (69%), Gaps = 0/98 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD +LLHSG N+SH AG HA  GADQL+ G + SGMFG F  A  FH AV  AH Q +
Sbjct  3    MFVDTDLLHSGGNQSHRAGGHARDGADQLAGGTVASGMFGDFAAADPFHSAVAVAHEQHV  62

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAV  98
             NL AH + L  VG KA HAA  FT+MD+ NA E++A+
Sbjct  63   NNLQAHSETLTGVGTKAHHAANSFTNMDEQNATEMRAL  100

>gi|342862330|ref|ZP_08718971.1| hypothetical protein MCOL_25688 [Mycobacterium colombiense CECT 3035]
 gi|342130187|gb|EGT83515.1| hypothetical protein MCOL_25688 [Mycobacterium colombiense CECT 3035]
Length=109

Score = 107 bits (267), Expect = 6e-22, Method: Compositional matrix adjust.
Identities = 55/102 (54%), Positives = 67/102 (66%), Gaps = 0/102 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD +LLHSG N+SH AG HA  GADQL+ G + SGMFG F  A  FH AV AAH Q +
Sbjct  1    MFVDTDLLHSGGNQSHRAGGHAQDGADQLAGGTVASGMFGDFAAADAFHSAVAAAHGQHV  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSC  102
            + L  H + L  VG KA  AA GFT+MD  NA E++A+  S
Sbjct  61   KTLQGHSETLTGVGTKAHTAANGFTNMDKNNATEMQALRPSS  102

>gi|118620061|ref|YP_908393.1| hypothetical protein MUL_5058 [Mycobacterium ulcerans Agy99]
 gi|118572171|gb|ABL06922.1| conserved hypothetical protein [Mycobacterium ulcerans Agy99]
Length=104

Score = 107 bits (266), Expect = 8e-22, Method: Compositional matrix adjust.
Identities = 58/104 (56%), Positives = 69/104 (67%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD  LLH G NESH AG HA  GAD+L+ GPL+SGMFG F  A  F ++V +AHAQ +
Sbjct  1    MFVDTGLLHLGGNESHRAGGHAQEGADRLALGPLMSGMFGDFAAADAFRNSVHSAHAQHV  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
            RNL AH++AL  VG  A  AA GFT MD  NA  L+AV     T
Sbjct  61   RNLQAHQEALTAVGSNAHLAAKGFTAMDGHNAQALQAVRWGAVT  104

>gi|333992295|ref|YP_004524909.1| hypothetical protein JDM601_3655 [Mycobacterium sp. JDM601]
 gi|333488263|gb|AEF37655.1| conserved hypothetical protein [Mycobacterium sp. JDM601]
Length=104

Score = 88.2 bits (217), Expect = 4e-16, Method: Compositional matrix adjust.
Identities = 46/104 (45%), Positives = 67/104 (65%), Gaps = 0/104 (0%)

Query  1    VFVDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQM  60
            +FVD  +L +G + SH A  HAH GA  L +  + +G+FG+F  A  +H+A+  AH+  +
Sbjct  1    MFVDPAMLTAGESHSHSAANHAHTGAASLHQPGVTAGIFGSFGAADVYHNAICTAHSDHI  60

Query  61   RNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVVCSCAT  104
              L+ HR+ L  VG+KA +AA  FT MD  NAAEL+AV C+ +T
Sbjct  61   TILNGHRRTLTDVGDKAHYAARAFTGMDQHNAAELRAVQCNSST  104

>gi|333989561|ref|YP_004522175.1| hypothetical protein JDM601_0921 [Mycobacterium sp. JDM601]
 gi|333485529|gb|AEF34921.1| conserved hypothetical protein [Mycobacterium sp.
 JDM601]
Length=105

Score = 70.1 bits (170), Expect = 1e-10, Method: Compositional matrix adjust.
Identities = 41/97 (43%), Positives = 55/97 (57%), Gaps = 0/97 (0%)

Query  3    VDVELLHSGANESHYAGEHAHGGADQLSRGPLLSGMFGTFPVAQTFHDAVGAAHAQQMRN  62
            VD + L +GAN S  AG HA  GA +L    + +G+FG F  A +FH A+G+A
Sbjct  3    VDPQALRTGANVSDDAGGHARNGAQRLGSAGVAAGIFGDFDDAHSFHAALGSAKDGHRDA  62

Query  63   LHAHRQALITVGEKARHAATGFTDMDDGNAAELKAVV  99
            L  H Q L  + E  R AAT FT MD+ NA +L+ V+
Sbjct  63   LQGHHQNLTGIAENVRTAATAFTRMDNDNAEQLRDVI  99

>gi|118467290|ref|YP_884402.1| hypothetical protein MAV_5291 [Mycobacterium avium 104]
 gi|118168577|gb|ABK69474.1| conserved hypothetical protein [Mycobacterium avium 104]
Length=72

Score = 65.9 bits (159), Expect = 2e-09, Method: Compositional matrix adjust.
Identities = 32/61 (53%), Positives = 39/61 (64%), Gaps = 0/61 (0%)

Query  38   MFGTFPVAQTFHDAVGAAHAQQMRNLHAHRQALITVGEKARHAATGFTDMDDGNAAELKA  97
            MFG F  A  FH AV  AH Q + NL AH + L  VG KA HAA  FT+MD+ NA E++A
Sbjct  1    MFGDFAAADAFHSAVAVAHEQHVNNLQAHSETLTGVGTKAHHAANSFTNMDEQNATEMRA  60

Query  98   V  98
            +
Sbjct  61   L  61

>gi|86741641|ref|YP_482041.1| bacteriophage resistance gene PglY [Frankia sp. CcI3]
 gi|86568503|gb|ABD12312.1| bacteriophage (phiC31) resistance gene PglY [Frankia sp. CcI3]
Length=1227

Score = 33.9 bits (76), Expect = 8.3, Method: Compositional matrix adjust.
Identities = 22/51 (44%), Positives = 29/51 (57%), Gaps = 2/51 (3%)

Query  45   AQTFHDAVGAAHAQQMRN--LHAHRQALITVGEKARHAATGFTDMDDGNAA  93
            A  F  A+ AA A Q R   +    +AL+   E AR AATG+ ++DDG AA
Sbjct  201  ADRFEQALTAAPAAQERRDLVGTLEKALVGFAELARGAATGYVNIDDGLAA  251

Lambda     K        H
   0.320    0.131    0.392

Gapped
Lambda     K        H
   0.267    0.0410   0.140

Effective search space used: 127350764394

  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
  excluding environmental samples from WGS projects
    Posted date: Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database: 15,229,318

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40