BLASTP 2.2.25+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schäffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters

Query= Rv3008
Length=207

                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|15610145|ref|NP_217524.1| hypothetical protein Rv3008 [Mycoba...  420     7e-116
gi|148824198|ref|YP_001288952.1| hypothetical protein TBFG_13023...  418     2e-115
gi|340627998|ref|YP_004746450.1| hypothetical protein MCAN_30311...  416     9e-115
gi|339295842|gb|AEJ47953.1| hypothetical protein CCDC5079_2763 [...  389     2e-106
gi|308375958|ref|ZP_07445655.2| hypothetical protein TMGG_02550 ...  325     3e-87
gi|167967822|ref|ZP_02550099.1| hypothetical protein MtubH3_0723...  302     2e-80
gi|221201097|ref|ZP_03574137.1| putative membrane protein [Burkh...  40.8    0.13
gi|154420390|ref|XP_001583210.1| F5/8 type C domain containing p...  39.3    0.40
gi|339482550|ref|YP_004694336.1| putative DNA repair protein [Ni...  37.7    0.92
gi|170692106|ref|ZP_02883270.1| hypothetical protein BgramDRAFT_...  36.6    2.5
gi|118577270|ref|YP_899510.1| hypothetical protein Ppro_3665 [Pe...  36.2    3.2
gi|163859078|ref|YP_001633376.1| hypothetical protein Bpet4757 [...  35.8    4.0
gi|225874898|ref|YP_002756357.1| MgtC family protein [Acidobacte...  35.4    5.1
gi|299136520|ref|ZP_07029703.1| MgtC/SapB transporter [Acidobact...  35.0    6.5

>gi|15610145|ref|NP_217524.1| hypothetical protein Rv3008 [Mycobacterium tuberculosis H37Rv]
 gi|15842566|ref|NP_337603.1| hypothetical protein MT3088 [Mycobacterium tuberculosis CDC1551]
 gi|31794185|ref|NP_856678.1| hypothetical protein Mb3033 [Mycobacterium bovis AF2122/97]
 47 more sequence titles
Length=207

 Score = 420 bits (1079), Expect = 7e-116, Method: Compositional matrix adjust.
 Identities = 207/207 (100%), Positives = 207/207 (100%), Gaps = 0/207 (0%)

Query  1    MLTVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI  60
            MLTVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI
Sbjct  1    MLTVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI  60

Query  61   TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAG  120
            TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAG
Sbjct  61   TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAG  120

Query  121  SFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD  180
            SFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD
Sbjct  121  SFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD  180

Query  181  RAATIGTLNVYRRNSPDGDEPLPADGN  207
            RAATIGTLNVYRRNSPDGDEPLPADGN
Sbjct  181  RAATIGTLNVYRRNSPDGDEPLPADGN  207

>gi|148824198|ref|YP_001288952.1| hypothetical protein TBFG_13023 [Mycobacterium tuberculosis F11]
 gi|148722725|gb|ABR07350.1| hypothetical protein TBFG_13023 [Mycobacterium tuberculosis F11]
Length=207

 Score = 418 bits (1075), Expect = 2e-115, Method: Compositional matrix adjust.
 Identities = 206/207 (99%), Positives = 207/207 (100%), Gaps = 0/207 (0%)

Query  1    MLTVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI  60
            MLTVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI
Sbjct  1    MLTVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI  60

Query  61   TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAG  120
            TAAGLVSGARIGPGAAAKRDPQLAQWN+IRSHYQEIAEWIDHDTATAHPAVAATQISAAG
Sbjct  61   TAAGLVSGARIGPGAAAKRDPQLAQWNKIRSHYQEIAEWIDHDTATAHPAVAATQISAAG  120

Query  121  SFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD  180
            SFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD
Sbjct  121  SFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD  180

Query  181  RAATIGTLNVYRRNSPDGDEPLPADGN  207
            RAATIGTLNVYRRNSPDGDEPLPADGN
Sbjct  181  RAATIGTLNVYRRNSPDGDEPLPADGN  207

>gi|340627998|ref|YP_004746450.1| hypothetical protein MCAN_30311 [Mycobacterium canettii CIPT 140010059]
 gi|340006188|emb|CCC45362.1| hypothetical protein MCAN_30311 [Mycobacterium canettii CIPT 140010059]
Length=207

 Score = 416 bits (1070), Expect = 9e-115, Method: Compositional matrix adjust.
 Identities = 205/207 (99%), Positives = 206/207 (99%), Gaps = 0/207 (0%)

Query  1    MLTVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI  60
            MLTVVAVIGILECGL+LHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI
Sbjct  1    MLTVVAVIGILECGLLLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI  60

Query  61   TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAG  120
            TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAG
Sbjct  61   TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAG  120

Query  121  SFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD  180
            SFGRANMVDYLGLLDSRADETVRR EFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD
Sbjct  121  SFGRANMVDYLGLLDSRADETVRRGEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYD  180

Query  181  RAATIGTLNVYRRNSPDGDEPLPADGN  207
            RAATIGTLNVYRRNSPDGDEPLPADGN
Sbjct  181  RAATIGTLNVYRRNSPDGDEPLPADGN  207

>gi|339295842|gb|AEJ47953.1| hypothetical protein CCDC5079_2763 [Mycobacterium tuberculosis CCDC5079]
 gi|339299455|gb|AEJ51565.1| hypothetical protein CCDC5180_2728 [Mycobacterium tuberculosis CCDC5180]
Length=192

 Score = 389 bits (999), Expect = 2e-106, Method: Compositional matrix adjust.
 Identities = 191/192 (99%), Positives = 192/192 (100%), Gaps = 0/192 (0%)

Query  16   VLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAITAAGLVSGARIGPGA  75
            +LHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAITAAGLVSGARIGPGA
Sbjct  1    MLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAITAAGLVSGARIGPGA  60

Query  76   AAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAGSFGRANMVDYLGLLD  135
            AAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAGSFGRANMVDYLGLLD
Sbjct  61   AAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAGSFGRANMVDYLGLLD  120

Query  136  SRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYDRAATIGTLNVYRRNS  195
            SRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYDRAATIGTLNVYRRNS
Sbjct  121  SRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRHAYDRAATIGTLNVYRRNS  180

Query  196  PDGDEPLPADGN  207
            PDGDEPLPADGN
Sbjct  181  PDGDEPLPADGN  192

>gi|308375958|ref|ZP_07445655.2| hypothetical protein TMGG_02550 [Mycobacterium tuberculosis SUMu007]
 gi|308344699|gb|EFP33550.1| hypothetical protein TMGG_02550 [Mycobacterium tuberculosis SUMu007]
Length=162

 Score = 325 bits (833), Expect = 3e-87, Method: Compositional matrix adjust.
 Identities = 161/162 (99%), Positives = 162/162 (100%), Gaps = 0/162 (0%)

Query  46   VWRGDRVATPLAVAITAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTA  105
            +WRGDRVATPLAVAITAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTA
Sbjct  1    MWRGDRVATPLAVAITAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTA  60

Query  106  TAHPAVAATQISAAGSFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSV  165
            TAHPAVAATQISAAGSFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSV
Sbjct  61   TAHPAVAATQISAAGSFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSV  120

Query  166  DAATIALPEFRHAYDRAATIGTLNVYRRNSPDGDEPLPADGN  207
            DAATIALPEFRHAYDRAATIGTLNVYRRNSPDGDEPLPADGN
Sbjct  121  DAATIALPEFRHAYDRAATIGTLNVYRRNSPDGDEPLPADGN  162

>gi|167967822|ref|ZP_02550099.1| hypothetical protein MtubH3_07237 [Mycobacterium tuberculosis H37Ra]
 gi|254552088|ref|ZP_05142535.1| hypothetical protein Mtube_16816 [Mycobacterium tuberculosis '98-R604 INH-RIF-EM']
 gi|308232314|ref|ZP_07415641.2| hypothetical protein TMAG_01215 [Mycobacterium tuberculosis SUMu001]
 22 more sequence titles
Length=150

 Score = 302 bits (774), Expect = 2e-80, Method: Compositional matrix adjust.
 Identities = 149/150 (99%), Positives = 150/150 (100%), Gaps = 0/150 (0%)

Query  58   VAITAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQIS  117
            +AITAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQIS
Sbjct  1    MAITAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQIS  60

Query  118  AAGSFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRH  177
            AAGSFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRH
Sbjct  61   AAGSFGRANMVDYLGLLDSRADETVRRDEFSRWLSAKPDYLVTTEQSVDAATIALPEFRH  120

Query  178  AYDRAATIGTLNVYRRNSPDGDEPLPADGN  207
            AYDRAATIGTLNVYRRNSPDGDEPLPADGN
Sbjct  121  AYDRAATIGTLNVYRRNSPDGDEPLPADGN  150

>gi|221201097|ref|ZP_03574137.1| putative membrane protein [Burkholderia multivorans CGD2M]
 gi|221206451|ref|ZP_03579464.1| putative membrane protein [Burkholderia multivorans CGD2]
 gi|221173760|gb|EEE06194.1| putative membrane protein [Burkholderia multivorans CGD2]
 gi|221178947|gb|EEE11354.1| putative membrane protein [Burkholderia multivorans CGD2M]
Length=487

 Score = 40.8 bits (94), Expect = 0.13, Method: Compositional matrix adjust.
 Identities = 51/204 (25%), Positives = 87/204 (43%), Gaps = 23/204 (11%)

Query  15   LVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAV----AITAAGLVSGAR  70
            + L +P N WY P+ +++ + S G +R + LA+ + AA + A
Sbjct  292  VFLRIP-NYHWYYAPFFYFLL----LFSALGTYRVIEMLVKLAIRQKTYLLAAVPMISAT  346

Query  71   IGPGAAAKR--DPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAGSFGRANMV  128
            + GA + D + ++ Y+ I WI+ D + VAA +I G +G ++
Sbjct  347  VAFGAYNLKLADIERGSFDP----YKNIGIWIN-DNTPRNAVVAAAEIGTVGWYGNRYII  401

Query  129  DYLGLLDSRADETVRRDEFSRWLSA-KPDYLVTTEQ--SVDAATIALPEFRHAY---DRA  182
            D LGL + + + + + WL+ PDY++ E +AAT L AY R
Sbjct  402  DILGLTNKYNADFIANKDVHSWLTKYSPDYILVHEPLWPFEAATSCLTR-TAAYAPAPRF  460

Query  183  ATIGTLNVYRRNSPDGDEPLPADG  206
            G + + PD +E + A G
Sbjct  461  NFPGYQLLVKSTEPDTNERITACG  484

>gi|154420390|ref|XP_001583210.1| F5/8 type C domain containing protein [Trichomonas vaginalis G3]
 gi|121917450|gb|EAY22224.1| F5/8 type C domain containing protein [Trichomonas vaginalis G3]
Length=1128

 Score = 39.3 bits (90), Expect = 0.40, Method: Composition-based stats.
 Identities = 19/56 (34%), Positives = 27/56 (49%), Gaps = 0/56 (0%)

Query  61   TAAGLVSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQI  116
            TA ++G R G A + +AQWNEIR Y E ++D D H T++
Sbjct  270  TAGACLTGIRWGKNAGLDLNVAMAQWNEIRDLYLEYLGYVDDDPNPVHKKQYGTRV  325

>gi|339482550|ref|YP_004694336.1| putative DNA repair protein [Nitrosomonas sp. Is79A3]
 gi|338804695|gb|AEJ00937.1| putative DNA repair protein [Nitrosomonas sp. Is79A3]
Length=921

 Score = 37.7 bits (86), Expect = 0.92, Method: Compositional matrix adjust.
 Identities = 32/113 (29%), Positives = 45/113 (40%), Gaps = 17/113 (15%)

Query  73   PGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAGSFGRANMVDYLG  132
            P A + R Q+ QW +R + EW+D + H AV AT+ + A G + L
Sbjct  714  PNALSGRFAQIEQWRLLRL----VREWLDEEKKRGHFAVIATEETRAIRIGELVLNARLD  769

Query  133  LLDSRA-------DETVRRDEFSRWLSAKPD------YLVTTEQSVDAATIAL  172
            +D D R+ L +PD YLV TE AA +A
Sbjct  770  RVDELEDGQHIVIDYKTRKQSVQTMLGERPDEPQLPLYLVMTEVQQQAAGVAF  822

>gi|170692106|ref|ZP_02883270.1| hypothetical protein BgramDRAFT_2079 [Burkholderia graminis C4D1M]
 gi|170143390|gb|EDT11554.1| hypothetical protein BgramDRAFT_2079 [Burkholderia graminis C4D1M]
Length=490

 Score = 36.6 bits (83), Expect = 2.5, Method: Compositional matrix adjust.
 Identities = 35/153 (23%), Positives = 61/153 (40%), Gaps = 13/153 (8%)

Query  15   LVLHMPDNDLWYCGP----WTLWVMAGRGVASGAGVWRGDRVATPLAVAITAAGLVSGAR  70
            + LH+P N WY P W L+V G R A + + + A + A
Sbjct  293  VFLHIP-NYHWYYAPFYFFWLLFVSVGAWKVLKFTYVRSQESAVFMTLFL--AFFLITAA  349

Query  71   IGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAGSFGRANMVDY  130
            G + + Q + Y+ I W+ +T + +A +I G + ++D
Sbjct  350  FGYRSFQISNVQRGSMDA----YRNIGGWLKDNTVN-NSVIAMVEIGTVGWYSHRYIIDI  404

Query  131  LGLLDSRADETVRRDEFSRWLSA-KPDYLVTTE  162
            LGL + + + R + WLS PDY++ +
Sbjct  405  LGLTNRYNADYIARGDVYSWLSKYSPDYILVHQ  437

>gi|118577270|ref|YP_899510.1| hypothetical protein Ppro_3665 [Pelobacter propionicus DSM 2379]
 gi|118504775|gb|ABL01257.1| protein of unknown function DUF955 [Pelobacter propionicus DSM 2379]
Length=372

 Score = 36.2 bits (82), Expect = 3.2, Method: Compositional matrix adjust.
 Identities = 40/154 (26%), Positives = 59/154 (39%), Gaps = 32/154 (20%)

Query  66   VSGARIGPGAAAKRDPQLAQWNEIRSHYQEIAEWIDH------------DTATAHPAVAA  113
            VS + AAKRD LA + E+A WI+H D P AA
Sbjct  78   VSFRALSSMTAAKRDSALAAG----ALAVELASWIEHRFVLPGWKLTDVDFRGYQPEAAA  133

Query  114  TQISAAGSFGRANMVDYLGLLD-------SRADETVRRDEFSRWLSAKPDYLVTTEQSVD  166
            + + G + + + LL+ S A++ D FS W A P + T +S +
Sbjct  134  ASVRSHWGLGERPIKNTVHLLEANGVRVFSLAEDCKEVDAFSYWQEATPFIFLNTMKSGE  193

Query  167  AATIALPEFRHAYDRAATIGTLNVYRRNSPDGDE  200
            R +D +G L ++R PDG E
Sbjct  194  ---------RSRFDAMHELGHLILHRHGGPDGRE  218

>gi|163859078|ref|YP_001633376.1| hypothetical protein Bpet4757 [Bordetella petrii DSM 12804]
 gi|163262806|emb|CAP45109.1| unnamed protein product [Bordetella petrii]
Length=477

 Score = 35.8 bits (81), Expect = 4.0, Method: Compositional matrix adjust.
 Identities = 20/55 (37%), Positives = 27/55 (50%), Gaps = 3/55 (5%)

Query  73   PGAAAKRDPQLAQWNEIRSHYQEIAEWIDHDTATAHPAVAATQISAAGSFGRANM  127
            P A D L E+R HYQ ++W+ +A A+AA QI A SF R +
Sbjct  6    PAPTAAYDEMLDAHGEVRPHYQAFSQWLSQQSAD---AMAARQIEADLSFRRVGI  57

>gi|225874898|ref|YP_002756357.1| MgtC family protein [Acidobacterium capsulatum ATCC 51196]
 gi|225791858|gb|ACO31948.1| MgtC family protein [Acidobacterium capsulatum ATCC 51196]
Length=227

 Score = 35.4 bits (80), Expect = 5.1, Method: Compositional matrix adjust.
 Identities = 20/58 (35%), Positives = 29/58 (50%), Gaps = 0/58 (0%)

Query  3    TVVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLAVAI  60
            +V IG L GL+LH D G T+W +A G+ GAG++ AT L + +
Sbjct  75   NIVQGIGFLGAGLILHNRDRISGLTGAATVWAVASIGMCCGAGLYLPAAFATALVLLV  132

>gi|299136520|ref|ZP_07029703.1| MgtC/SapB transporter [Acidobacterium sp. MP5ACTX8]
 gi|298601035|gb|EFI57190.1| MgtC/SapB transporter [Acidobacterium sp. MP5ACTX8]
Length=258

 Score = 35.0 bits (79), Expect = 6.5, Method: Compositional matrix adjust.
 Identities = 21/64 (33%), Positives = 35/64 (55%), Gaps = 1/64 (1%)

Query  4    VVAVIGILECGLVLHMPDNDLWYCGPWTLWVMAGRGVASGAGVWRGDRVATPLA-VAITA  62
            +V +G L GL+LH + L T++V+A G+ GAG++ +AT L +++ A
Sbjct  83   IVQGVGFLGAGLILHTKNRVLGLTSAATVFVVAAIGMTCGAGLYIEALIATVLVLISLQA  142

Query  63   AGLV  66
            G V
Sbjct  143  VGAV  146

Lambda      K        H
   0.319    0.134    0.420

Gapped
Lambda      K        H
   0.267    0.0410   0.140

Effective search space used: 236380426956

  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
    Posted date: Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database: 15,229,318

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40