BLASTP 2.2.25+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schäffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           21,062,489 sequences; 7,218,481,314 total letters

Query= Rv3032A Rv3032A Conserved protein 3392812:3393201 forward MW:14298
Length=129

                                                                     Score   E
Sequences producing significant alignments:                          (Bits)  Value

gi|15842596|ref|NP_337633.1| hypothetical protein MT3117 [Mycoba...  263   1e-68
gi|408780906|ref|ZP_11192679.1| hypothetical protein MkanA1_0948...  233   7e-60
gi|118617512|ref|YP_905844.1| hypothetical protein MUL_1914 [Myc...  230   6e-59
gi|385992288|ref|YP_005910586.1| hypothetical protein [Mycobacte...  221   3e-56
gi|386692298|ref|ZP_10091047.1| Conserved hypothetical protein [...  84.3  8e-15
gi|271964862|ref|YP_003339058.1| hypothetical protein [Streptosp...  83.6  1e-14
gi|330468717|ref|YP_004406460.1| hypothetical protein VAB18032_2...  75.1  5e-12
gi|315505624|ref|YP_004084511.1| hypothetical protein ML5_4885 [...  74.7  6e-12
gi|302867977|ref|YP_003836614.1| hypothetical protein Micau_3511...  73.9  9e-12
gi|300786144|ref|YP_003766435.1| hypothetical protein AMED_4259 ...  73.6  1e-11
gi|331698010|ref|YP_004334249.1| hypothetical protein Psed_4234 ...  73.6  1e-11
gi|84497363|ref|ZP_00996185.1| hypothetical protein JNB_14253 [J...  67.0  1e-09
gi|404612847|gb|EKB09904.1| hypothetical protein HMPREF1167_0371...  35.0  4.9
gi|254436964|ref|ZP_05050458.1| AP endonuclease, family 2 [Octad...  34.7  6.6

>gi|15842596|ref|NP_337633.1| hypothetical protein MT3117 [Mycobacterium tuberculosis CDC1551]
 gi|121638916|ref|YP_979140.1| hypothetical protein BCG_3056 [Mycobacterium bovis BCG str. Pasteur 1173P2]
 gi|148662885|ref|YP_001284408.1| hypothetical protein MRA_3064 [Mycobacterium tuberculosis H37Ra]
 62 more sequence titles
Length=129

 Score = 263 bits (671), Expect = 1e-68, Method: Compositional matrix adjust.
 Identities = 128/129 (99%), Positives = 129/129 (100%), Gaps = 0/129 (0%)

Query  1    VKPQDQGLHFPYRYDLRLAPMWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDA  60
            +KPQDQGLHFPYRYDLRLAPMWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDA
Sbjct  1    MKPQDQGLHFPYRYDLRLAPMWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDA  60

Query  61   HITGPYRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADP  120
            HITGPYRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADP
Sbjct  61   HITGPYRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADP  120

Query  121  EGLVAALSS  129
            EGLVAALSS
Sbjct  121  EGLVAALSS  129

>gi|408780906|ref|ZP_11192679.1| hypothetical protein MkanA1_09487 [Mycobacterium kansasii ATCC 12478]
Length=131

 Score = 233 bits (595), Expect = 7e-60, Method: Compositional matrix adjust.
Identities = 111/128 (87%), Positives = 120/128 (94%), Gaps = 0/128 (0%) Query 1 VKPQDQGLHFPYRYDLRLAPMWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDA 60 +KPQ++G HFPYRYD RLA MWLPFRWPG QGVT+T+DGRFVARYGPFRVEAPLSSVRDA Sbjct 1 MKPQNRGRHFPYRYDPRLAAMWLPFRWPGGQGVTLTDDGRFVARYGPFRVEAPLSSVRDA 60 Query 61 HITGPYRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADP 120 H+TGPYRWWTAVGPRLSMVDDGLTFGTNA AGVC+HFEP +HRV+GLRDHSALTVTVADP Sbjct 61 HVTGPYRWWTAVGPRLSMVDDGLTFGTNAHAGVCVHFEPPVHRVLGLRDHSALTVTVADP 120 Query 121 EGLVAALS 128 E LVAAL Sbjct 121 EALVAALK 128 >gi|118617512|ref|YP_905844.1| hypothetical protein MUL_1914 [Mycobacterium ulcerans Agy99] gi|183981690|ref|YP_001849981.1| hypothetical protein MMAR_1676 [Mycobacterium marinum M] gi|118569622|gb|ABL04373.1| conserved hypothetical protein [Mycobacterium ulcerans Agy99] gi|183175016|gb|ACC40126.1| conserved hypothetical protein [Mycobacterium marinum M] Length=135 Score = 230 bits (587), Expect = 6e-59, Method: Compositional matrix adjust. Identities = 108/129 (84%), Positives = 117/129 (91%), Gaps = 0/129 (0%) Query 1 VKPQDQGLHFPYRYDLRLAPMWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDA 60 + PQD+G +FPYRYD RLAPMWLPFRWPG QGVT+T+DGRFVARYGPF EAPLSSV D+ Sbjct 1 MTPQDRGEYFPYRYDARLAPMWLPFRWPGRQGVTLTDDGRFVARYGPFHAEAPLSSVTDS 60 Query 61 HITGPYRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADP 120 H+TGPYRWWTAVGPRLSMVDDGLTFGTNA AG C+HFEPRIHRV+GLRDHSALTVTVADP Sbjct 61 HVTGPYRWWTAVGPRLSMVDDGLTFGTNAQAGACVHFEPRIHRVLGLRDHSALTVTVADP 120 Query 121 EGLVAALSS 129 GLVAAL Sbjct 121 AGLVAALKK 129 >gi|385992288|ref|YP_005910586.1| hypothetical protein [Mycobacterium tuberculosis CCDC5180] gi|385995914|ref|YP_005914212.1| hypothetical protein [Mycobacterium tuberculosis CCDC5079] gi|339295868|gb|AEJ47979.1| hypothetical protein CCDC5079_2789 [Mycobacterium tuberculosis CCDC5079] gi|339299481|gb|AEJ51591.1| hypothetical protein CCDC5180_2754 [Mycobacterium tuberculosis CCDC5180] gi|358233180|dbj|GAA46672.1| hypothetical protein NCGM2209_3315 [Mycobacterium 
tuberculosis NCGM2209] gi|379029363|dbj|BAL67096.1| hypothetical protein ERDMAN_3319 [Mycobacterium tuberculosis str. Erdman = ATCC 35801] Length=109 Score = 221 bits (564), Expect = 3e-56, Method: Compositional matrix adjust. Identities = 109/109 (100%), Positives = 109/109 (100%), Gaps = 0/109 (0%) Query 21 MWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHITGPYRWWTAVGPRLSMVD 80 MWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHITGPYRWWTAVGPRLSMVD Sbjct 1 MWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHITGPYRWWTAVGPRLSMVD 60 Query 81 DGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADPEGLVAALSS 129 DGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADPEGLVAALSS Sbjct 61 DGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVADPEGLVAALSS 109 >gi|386692298|ref|ZP_10091047.1| Conserved hypothetical protein [Micromonospora lupini str. Lupac 08] gi|385885203|emb|CCH18931.1| Conserved hypothetical protein [Micromonospora lupini str. Lupac 08] Length=129 Score = 84.3 bits (207), Expect = 8e-15, Method: Compositional matrix adjust. Identities = 53/124 (43%), Positives = 70/124 (57%), Gaps = 6/124 (4%) Query 9 HFPYRYD--LRLAPMWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHITGPY 66 FP+R+D R A L R P + V VT D V RYGP+R+ +V A + GPY Sbjct 7 RFPFRFDPAFRPALALLGVR-PATAWVAVT-DRDLVIRYGPWRLRTGRDNVLGAEVAGPY 64 Query 67 RWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVI--GLRDHSALTVTVADPEGLV 124 RWW +GP LS+ D G++FG++ A GVC+ F R+ + G H A TVTVADP L Sbjct 65 RWWRVIGPHLSLADGGVSFGSSTAGGVCLRFGVRVPALAPGGWPRHPAATVTVADPPALA 124 Query 125 AALS 128 L+ Sbjct 125 RLLA 128 >gi|271964862|ref|YP_003339058.1| hypothetical protein [Streptosporangium roseum DSM 43021] gi|270508037|gb|ACZ86315.1| hypothetical protein Sros_3377 [Streptosporangium roseum DSM 43021] Length=129 Score = 83.6 bits (205), Expect = 1e-14, Method: Compositional matrix adjust. 
Identities = 47/123 (39%), Positives = 67/123 (55%), Gaps = 6/123 (4%) Query 13 RYDLRLAPMW-LPFRWPG---SQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHITGPYRW 68 R+D + P W +P R G + + E+G R+G + + PLS+V +TGPY Sbjct 3 RFDFAIEPAWRIPLRLFGVTPERAFALVEEGALTVRFGHWLLRTPLSNVAGTTLTGPYST 62 Query 69 WTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVI--GLRDHSALTVTVADPEGLVAA 126 +G LS+ D G+TFGTN GVC+ F + ++ GL H T+T+ADPEGLV A Sbjct 63 LKVIGAHLSLADRGITFGTNPRRGVCVRFHTPVPALLPGGLLTHPGATLTLADPEGLVRA 122 Query 127 LSS 129 L Sbjct 123 LEK 125 >gi|330468717|ref|YP_004406460.1| hypothetical protein VAB18032_23810 [Verrucosispora maris AB-18-032] gi|328811688|gb|AEB45860.1| hypothetical protein VAB18032_23810 [Verrucosispora maris AB-18-032] Length=136 Score = 75.1 bits (183), Expect = 5e-12, Method: Compositional matrix adjust. Identities = 45/116 (39%), Positives = 63/116 (55%), Gaps = 8/116 (6%) Query 13 RYDLRLAPMWLPFRW-----PGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHITGPYR 67 R++ R P W P P + V V D R+GP+R+ +V +GPYR Sbjct 5 RFEFRFDPPWRPVLALLGVRPSTAWVDVDAD-EVTVRFGPWRLRTTRDNVTGVQESGPYR 63 Query 68 WWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLR--DHSALTVTVADPE 121 WW A+GP LS D G+TFG++ A G+CI F + ++ R H A+TVTVADP+ Sbjct 64 WWRAIGPHLSAADVGVTFGSSTARGLCIRFGRPVPALLPGRWLRHPAMTVTVADPD 119 >gi|315505624|ref|YP_004084511.1| hypothetical protein ML5_4885 [Micromonospora sp. L5] gi|315412243|gb|ADU10360.1| hypothetical protein ML5_4885 [Micromonospora sp. L5] Length=145 Score = 74.7 bits (182), Expect = 6e-12, Method: Compositional matrix adjust. 
Identities = 51/126 (41%), Positives = 68/126 (54%), Gaps = 10/126 (7%) Query 9 HFPYRYDLRLAPMWLPFRWPGSQGVTVTED---GRFVARYGPFRVEAPLSSVRDAHITGP 65 FP+R+D LP G + T D V R+GP+ + +V A ++GP Sbjct 19 RFPFRFDPAFR---LPLALLGVRPATAWLDWGPDALVVRFGPWLLRTTPGNVTGAELSGP 75 Query 66 YRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHF-EPRIHRVIG--LRDHSALTVTVADPEG 122 YRWW A+GP LS D G+TFG + A G+C+ F EP G LR H A+TVTVADP Sbjct 76 YRWWRAIGPHLSAADGGVTFGASVAGGLCLRFAEPVPALAPGPWLR-HPAVTVTVADPAA 134 Query 123 LVAALS 128 + AL+ Sbjct 135 VRDALA 140 >gi|302867977|ref|YP_003836614.1| hypothetical protein Micau_3511 [Micromonospora aurantiaca ATCC 27029] gi|302570836|gb|ADL47038.1| hypothetical protein Micau_3511 [Micromonospora aurantiaca ATCC 27029] Length=154 Score = 73.9 bits (180), Expect = 9e-12, Method: Compositional matrix adjust. Identities = 51/126 (41%), Positives = 68/126 (54%), Gaps = 10/126 (7%) Query 9 HFPYRYDLRLAPMWLPFRWPGSQGVTVTED---GRFVARYGPFRVEAPLSSVRDAHITGP 65 FP+R+D LP G + T D V R+GP+ + +V A ++GP Sbjct 28 RFPFRFDPAFR---LPLALLGVRPATAWLDWGPDALVVRFGPWLLRTTPGNVTGAELSGP 84 Query 66 YRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHF-EPRIHRVIG--LRDHSALTVTVADPEG 122 YRWW A+GP LS D G+TFG + A GVC+ F EP G LR H A+TVTVADP Sbjct 85 YRWWRAIGPHLSAADGGVTFGASVAGGVCLRFAEPVPGLAPGPWLR-HPAVTVTVADPAA 143 Query 123 LVAALS 128 + A++ Sbjct 144 VRDAVA 149 >gi|300786144|ref|YP_003766435.1| hypothetical protein AMED_4259 [Amycolatopsis mediterranei U32] gi|384149459|ref|YP_005532275.1| hypothetical protein RAM_21690 [Amycolatopsis mediterranei S699] gi|399538027|ref|YP_006550689.1| hypothetical protein AMES_4208 [Amycolatopsis mediterranei S699] gi|299795658|gb|ADJ46033.1| conserved hypothetical protein [Amycolatopsis mediterranei U32] gi|340527613|gb|AEK42818.1| hypothetical protein RAM_21690 [Amycolatopsis mediterranei S699] gi|398318797|gb|AFO77744.1| hypothetical protein AMES_4208 [Amycolatopsis mediterranei S699] Length=142 Score = 73.6 bits (179), Expect = 1e-11, Method: Compositional matrix adjust. 
Identities = 40/87 (46%), Positives = 53/87 (61%), Gaps = 2/87 (2%) Query 44 RYGPFRVEAPLSSVRDAHITGPYRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHR 103 R+GP+ VE PLS++ A TGPYR G RLS+ D GLTFGT GVC+ F + Sbjct 50 RFGPWLVETPLSNLAGAEATGPYRALRVFGVRLSLADRGLTFGTTTRGGVCLRFREPVRG 109 Query 104 VI--GLRDHSALTVTVADPEGLVAALS 128 + GL H LTVTV++PE + A++ Sbjct 110 IDPWGLVRHPGLTVTVSEPELVAEAIN 136 >gi|331698010|ref|YP_004334249.1| hypothetical protein Psed_4234 [Pseudonocardia dioxanivorans CB1190] gi|326952699|gb|AEA26396.1| hypothetical protein Psed_4234 [Pseudonocardia dioxanivorans CB1190] Length=162 Score = 73.6 bits (179), Expect = 1e-11, Method: Compositional matrix adjust. Identities = 52/123 (43%), Positives = 62/123 (51%), Gaps = 8/123 (6%) Query 13 RYDLRLAPMWLPFRW-----PGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHITGPYR 67 RYD A + P P + V VT D R+GP+RV PL +V A TGP Sbjct 22 RYDFSFAAVARPMLAALGVRPATAWVAVTHD-LLDVRFGPWRVRTPLVNVFSAEPTGPLN 80 Query 68 WWTAVGPRLSMVDDGLTFGTNAAAGVCIHFE--PRIHRVIGLRDHSALTVTVADPEGLVA 125 T +GPRLS+ D GLTFG++ GVCI F R GL H LTVTV P LV Sbjct 81 AVTVLGPRLSLADLGLTFGSDTRGGVCIRFRRPVRGFEPFGLLHHPGLTVTVTTPGLLVT 140 Query 126 ALS 128 L+ Sbjct 141 RLN 143 >gi|84497363|ref|ZP_00996185.1| hypothetical protein JNB_14253 [Janibacter sp. HTCC2649] gi|84382251|gb|EAP98133.1| hypothetical protein JNB_14253 [Janibacter sp. HTCC2649] Length=126 Score = 67.0 bits (162), Expect = 1e-09, Method: Compositional matrix adjust. 
Identities = 45/128 (36%), Positives = 64/128 (50%), Gaps = 5/128 (3%) Query 4 QDQGLHFPYRYDLRLAPMWLPFRWPGSQGVTVTEDGRFVARYGPFRVEAPLSSVRDAHIT 63 ++ F + RL + L R P + VTVT D R+GP+R+ PL+++ IT Sbjct 1 MNRRFEFAFAPAYRLPALILGIR-PRTAHVTVTAD-ELRVRFGPWRLVTPLTNIATTEIT 58 Query 64 GPYRWWTAVG-PRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLR--DHSALTVTVADP 120 G + W G P LS D G+TF TN +C+ F + + R H T+TVADP Sbjct 59 GNFGWLKTAGPPHLSFADRGVTFATNGERALCVRFLEPVAGIDPTRTIKHPGATLTVADP 118 Query 121 EGLVAALS 128 E L AL+ Sbjct 119 ESLQRALA 126 >gi|404612847|gb|EKB09904.1| hypothetical protein HMPREF1167_03713 [Aeromonas veronii AER39] Length=157 Score = 35.0 bits (79), Expect = 4.9, Method: Compositional matrix adjust. Identities = 18/54 (34%), Positives = 26/54 (49%), Gaps = 0/54 (0%) Query 65 PYRWWTAVGPRLSMVDDGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVTVA 118 P ++ PR+ DG + GTN VCI EPR+ VI S+ + T + Sbjct 101 PDATISSTRPRVLFAADGTSLGTNMTVRVCISEEPRVDVVIAASGRSSKSETTS 154 >gi|254436964|ref|ZP_05050458.1| AP endonuclease, family 2 [Octadecabacter antarcticus 307] gi|198252410|gb|EDY76724.1| AP endonuclease, family 2 [Octadecabacter antarcticus 307] Length=296 Score = 34.7 bits (78), Expect = 6.6, Method: Compositional matrix adjust. Identities = 20/53 (38%), Positives = 27/53 (51%), Gaps = 5/53 (9%) Query 69 WTAVGPRLSM-----VDDGLTFGTNAAAGVCIHFEPRIHRVIGLRDHSALTVT 116 WTA R+ VD GLT G +A AG + FEP + R++ D S L + Sbjct 131 WTAYRDRIKESAKIGVDHGLTVGIHAHAGGFMDFEPELERLLNEVDESILKIC 183 Lambda K H 0.324 0.140 0.463 Gapped Lambda K H 0.267 0.0410 0.140 Effective search space used: 177396525206 Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects Posted date: Oct 14, 2012 4:13 PM Number of letters in database: 7,218,481,314 Number of sequences in database: 21,062,489 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Neighboring words threshold: 11 Window for multiple hits: 40
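
Note on the reported scores: the bit scores and Expect values in this report can be approximately recomputed from the raw scores (the numbers in parentheses after "bits") using the gapped Lambda and K and the effective search space listed in the footer. The Python sketch below applies the standard Karlin-Altschul conversion, bits = (Lambda * S - ln K) / ln 2 and E = search_space * 2^(-bits). The function and constant names are illustrative and not part of BLAST. Because the footer parameters are rounded and the "Compositional matrix adjust" method rescales scores per alignment, the results should be expected to match the report only approximately.

```python
import math

# Values copied from the footer of this report (gapped statistics).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
EFFECTIVE_SEARCH_SPACE = 177396525206


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space=EFFECTIVE_SEARCH_SPACE):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


# Raw scores taken from four of the alignments above.
for raw in (671, 595, 207, 79):
    bits = bit_score(raw)
    print(f"raw={raw:4d}  bits={bits:6.1f}  E={expect_value(bits):.1e}")
```

The report appears to print bit scores of 100 or more as whole numbers without rounding up, so a computed value slightly below the next integer still corresponds to the lower printed figure.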