BLASTP 2.2.25+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.


Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters


Query= Rv1671

Length=130
                                                                     Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|253799290|ref|YP_003032291.1| hypothetical protein TBMG_02323...  256     7e-67
gi|15608809|ref|NP_216187.1| hypothetical protein Rv1671 [Mycoba...  256     1e-66
gi|15841126|ref|NP_336163.1| hypothetical protein MT1709 [Mycoba...  255     1e-66
gi|339631725|ref|YP_004723367.1| hypothetical protein MAF_16880 ...  254     3e-66
gi|331695162|ref|YP_004331401.1| hypothetical protein Psed_1305 ...  161     4e-38
gi|240171059|ref|ZP_04749718.1| hypothetical protein MkanA1_1723...  158     3e-37
gi|83814518|ref|YP_444386.1| hypothetical protein SRU_0240 [Sali...  52.4    2e-05
gi|294506129|ref|YP_003570187.1| hypothetical protein SRM_00314 ...  52.4    2e-05
gi|91215129|ref|ZP_01252101.1| hypothetical protein P700755_1256...  50.4    8e-05
gi|91215124|ref|ZP_01252096.1| hypothetical protein P700755_1254...  47.0    8e-04
gi|307106705|gb|EFN54950.1| hypothetical protein CHLNCDRAFT_2411...  36.2    1.7
gi|255536713|ref|XP_002509423.1| protein with unknown function [...  35.4    2.5
gi|170744595|ref|YP_001773250.1| EmrB/QacA family drug resistanc...  34.7    4.2
gi|288959868|ref|YP_003450208.1| branched-chain amino acid trans...  34.7    4.7
gi|299138489|ref|ZP_07031668.1| MotA/TolQ/ExbB proton channel [A...  34.3    6.2
gi|297180449|gb|ADI16664.1| hypothetical protein [uncultured Rho...  33.9    9.0


>gi|253799290|ref|YP_003032291.1| hypothetical protein TBMG_02323 [Mycobacterium tuberculosis KZN 1435]
 gi|254364513|ref|ZP_04980559.1| hypothetical membrane protein [Mycobacterium tuberculosis str. Haarlem]
 gi|289554555|ref|ZP_06443765.1| membrane protein [Mycobacterium tuberculosis KZN 605]
 25 more sequence titles
Length=131

 Score = 256 bits (654), Expect = 7e-67, Method: Compositional matrix adjust.
 Identities = 130/130 (100%), Positives = 130/130 (100%), Gaps = 0/130 (0%)

Query  1    MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  60
            MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA
Sbjct  2    MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  61

Query  61   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  120
            NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS
Sbjct  62   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  121

Query  121  AVTTWFGTFV  130
            AVTTWFGTFV
Sbjct  122  AVTTWFGTFV  131


>gi|15608809|ref|NP_216187.1| hypothetical protein Rv1671 [Mycobacterium tuberculosis H37Rv]
 gi|31792857|ref|NP_855350.1| hypothetical protein Mb1698 [Mycobacterium bovis AF2122/97]
 gi|121637578|ref|YP_977801.1| hypothetical protein BCG_1709 [Mycobacterium bovis BCG str. Pasteur 1173P2]
 48 more sequence titles
Length=130

 Score = 256 bits (653), Expect = 1e-66, Method: Compositional matrix adjust.
 Identities = 130/130 (100%), Positives = 130/130 (100%), Gaps = 0/130 (0%)

Query  1    MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  60
            MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA
Sbjct  1    MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  60

Query  61   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  120
            NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS
Sbjct  61   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  120

Query  121  AVTTWFGTFV  130
            AVTTWFGTFV
Sbjct  121  AVTTWFGTFV  130


>gi|15841126|ref|NP_336163.1| hypothetical protein MT1709 [Mycobacterium tuberculosis CDC1551]
 gi|13881344|gb|AAK45977.1| hypothetical protein MT1709 [Mycobacterium tuberculosis CDC1551]
Length=152

 Score = 255 bits (652), Expect = 1e-66, Method: Compositional matrix adjust.
 Identities = 130/130 (100%), Positives = 130/130 (100%), Gaps = 0/130 (0%)

Query  1    MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  60
            MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA
Sbjct  23   MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  82

Query  61   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  120
            NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS
Sbjct  83   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  142

Query  121  AVTTWFGTFV  130
            AVTTWFGTFV
Sbjct  143  AVTTWFGTFV  152


>gi|339631725|ref|YP_004723367.1| hypothetical protein MAF_16880 [Mycobacterium africanum GM041182]
 gi|339331081|emb|CCC26759.1| putative membrane protein [Mycobacterium africanum GM041182]
Length=130

 Score = 254 bits (649), Expect = 3e-66, Method: Compositional matrix adjust.
 Identities = 129/130 (99%), Positives = 129/130 (99%), Gaps = 0/130 (0%)

Query  1    MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  60
            MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA
Sbjct  1    MPTVGPADHAAGLDRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWA  60

Query  61   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  120
            NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMA LAIAVGTYGVLS
Sbjct  61   NDLYDNYAWWFRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAALAIAVGTYGVLS  120

Query  121  AVTTWFGTFV  130
            AVTTWFGTFV
Sbjct  121  AVTTWFGTFV  130


>gi|331695162|ref|YP_004331401.1| hypothetical protein Psed_1305 [Pseudonocardia dioxanivorans CB1190]
 gi|326949851|gb|AEA23548.1| hypothetical protein Psed_1305 [Pseudonocardia dioxanivorans CB1190]
Length=125

 Score = 161 bits (407), Expect = 4e-38, Method: Compositional matrix adjust.
 Identities = 76/116 (66%), Positives = 94/116 (82%), Gaps = 0/116 (0%)

Query  14   DRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWANDLYDNYAWWFRV  73
              +   D+LP+WRIG+  GLVG+LCCVGPT+LAL+G+++A TAFAWA +LY+ YAWWFR+
Sbjct  9    SEKTKADRLPVWRIGLAGGLVGILCCVGPTVLALLGLVTAGTAFAWATNLYNGYAWWFRL  68

Query  74   SGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLSAVTTWFGTF  129
             GLAVLA LVW +LR RN+CSV  +RR RWRL+AVL IAVGTY  L A+TTW GTF
Sbjct  69   GGLAVLAGLVWLSLRRRNQCSVAGMRRWRWRLLAVLGIAVGTYAALYALTTWLGTF  124


>gi|240171059|ref|ZP_04749718.1| hypothetical protein MkanA1_17239 [Mycobacterium kansasii ATCC 12478]
Length=219

 Score = 158 bits (399), Expect = 3e-37, Method: Compositional matrix adjust.
 Identities = 77/110 (70%), Positives = 89/110 (81%), Gaps = 0/110 (0%)

Query  20   DQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWANDLYDNYAWWFRVSGLAVL  79
            ++LPIWRIGI  GLVG+LCCVGPT+LA+ GIIS ATA AWAN+LY NYAWWFR+SGL VL
Sbjct  109  NRLPIWRIGITGGLVGILCCVGPTVLAMFGIISGATALAWANNLYGNYAWWFRLSGLGVL  168

Query  80   AILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLSAVTTWFGTF  129
            A+L W ALR RN+CS+  +RRLRWRL  +LAIA GTY VL  VTTW   F
Sbjct  169  ALLAWIALRRRNQCSLGGVRRLRWRLATMLAIAAGTYAVLYGVTTWLERF  218


>gi|83814518|ref|YP_444386.1| hypothetical protein SRU_0240 [Salinibacter ruber DSM 13855]
 gi|83755912|gb|ABC44025.1| conserved hypothetical protein [Salinibacter ruber DSM 13855]
Length=151

 Score = 52.4 bits (124), Expect = 2e-05, Method: Compositional matrix adjust.
 Identities = 42/128 (33%), Positives = 64/128 (50%), Gaps = 17/128 (13%)

Query  14   DRRATPDQLPIW----RIGIISGLVGMLCCVGPTILALVGIISAATAFAWANDLY--DNY  67
            + RA      +W    ++    GL GMLCCV P +L  VG++    A ++A+  Y  D
Sbjct  7    EDRADESPPSLWDWGLKVAASVGLAGMLCCVAPAVLFAVGLMGGIYAISFADLFYAPDGS  66

Query  68   ----AWWFRVSGLAVLAILV--WWALRHRNRCSVN-AIRRLRWRLMAVLAIAVGT--YGV  118
                AW  R  G+A L  LV  W   R +N+CS++ A +R    L A+L + +GT  Y
Sbjct  67   AGLGAWLLR--GVAALVGLVGTWLYHRRQNQCSIDPARKRKNLALFALLVVVLGTGFYLS  124

Query  119  LSAVTTWF  126
               +T+W+
Sbjct  125  FEELTSWY  132


>gi|294506129|ref|YP_003570187.1| hypothetical protein SRM_00314 [Salinibacter ruber M8]
 gi|294342457|emb|CBH23235.1| Conserved hypothetical protein, membrane [Salinibacter ruber M8]
Length=151

 Score = 52.4 bits (124), Expect = 2e-05, Method: Compositional matrix adjust.
 Identities = 42/128 (33%), Positives = 64/128 (50%), Gaps = 17/128 (13%)

Query  14   DRRATPDQLPIW----RIGIISGLVGMLCCVGPTILALVGIISAATAFAWANDLY--DNY  67
            + RA      +W    ++    GL GMLCCV P +L  VG++    A ++A+  Y  D
Sbjct  7    EDRANESPPSLWDWGLKVAASVGLAGMLCCVAPAVLFAVGLMGGIYAISFADLFYAPDGS  66

Query  68   ----AWWFRVSGLAVLAILV--WWALRHRNRCSVN-AIRRLRWRLMAVLAIAVGT--YGV  118
                AW  R  G+A L  LV  W   R +N+CS++ A +R    L A+L + +GT  Y
Sbjct  67   AGLGAWILR--GVAALVGLVGTWLYHRRQNQCSIDPARKRKNLALFALLVVVLGTGFYLS  124

Query  119  LSAVTTWF  126
               +T+W+
Sbjct  125  FEELTSWY  132


>gi|91215129|ref|ZP_01252101.1| hypothetical protein P700755_12567 [Psychroflexus torquis ATCC 700755]
 gi|91186734|gb|EAS73105.1| hypothetical protein P700755_12567 [Psychroflexus torquis ATCC 700755]
Length=150

 Score = 50.4 bits (119), Expect = 8e-05, Method: Compositional matrix adjust.
 Identities = 27/105 (26%), Positives = 55/105 (53%), Gaps = 9/105 (8%)

Query  31   SGLVGMLCCVGPTILALVGIISAATAFAWANDLYDNY------AWWFRVSGLAVLAILVW  84
            +G+ G+LCCV P +L + G++    A ++A+  Y +       +W  R   LA+    V+
Sbjct  25   AGIAGILCCVAPAVLFMFGLMGGIYAISFADFFYADDGSIGLGSWILRGLALAIGLFGVY  84

Query  85   WALRHRNRCSVNAIRRLRWRLMA---VLAIAVGTYGVLSAVTTWF  126
               + +N+CS+N  R+ +  ++     L + VG +  L  +++W+
Sbjct  85   MFRKKQNQCSINPKRKKKNLILMTIITLVLGVGIFLSLEKLSSWY  129


>gi|91215124|ref|ZP_01252096.1| hypothetical protein P700755_12542 [Psychroflexus torquis ATCC 700755]
 gi|91186729|gb|EAS73100.1| hypothetical protein P700755_12542 [Psychroflexus torquis ATCC 700755]
Length=150

 Score = 47.0 bits (110), Expect = 8e-04, Method: Compositional matrix adjust.
 Identities = 26/105 (25%), Positives = 54/105 (52%), Gaps = 9/105 (8%)

Query  31   SGLVGMLCCVGPTILALVGIISAATAFAWANDLYDNY------AWWFRVSGLAVLAILVW  84
            +G+ G+LCCV P +L + G++    A ++A+  Y +       +   R   LA+    V+
Sbjct  25   AGIAGILCCVAPAVLFMFGLMGGIYAISFADFFYADDGSIGLGSLILRGLALAIGLFGVY  84

Query  85   WALRHRNRCSVNAIRRLRWRLMA---VLAIAVGTYGVLSAVTTWF  126
               + +N+CS+N  R+ +  ++     L + VG +  L  +++W+
Sbjct  85   MFRKKQNQCSINPKRKKKNLILMTTITLVLGVGIFLSLEKLSSWY  129


>gi|307106705|gb|EFN54950.1| hypothetical protein CHLNCDRAFT_24117 [Chlorella variabilis]
Length=697

 Score = 36.2 bits (82), Expect = 1.7, Method: Composition-based stats.
 Identities = 20/53 (38%), Positives = 28/53 (53%), Gaps = 5/53 (9%)

Query  36   MLCCVGPTILALVGIISAATAFAWANDLYDNYAWWFRVSGLAVLAILVWWALR  88
            +LCCVG     +V    AA AF+WA  + D Y    R+ GL V + ++W   R
Sbjct  404  VLCCVGRAFFEMVDYPEAAKAFSWARQV-DPY----RLRGLEVYSTVLWHCKR  451


>gi|255536713|ref|XP_002509423.1| protein with unknown function [Ricinus communis]
 gi|223549322|gb|EEF50810.1| protein with unknown function [Ricinus communis]
Length=994

 Score = 35.4 bits (80), Expect = 2.5, Method: Composition-based stats.
 Identities = 16/62 (26%), Positives = 34/62 (55%), Gaps = 4/62 (6%)

Query  65   DNYAWW----FRVSGLAVLAILVWWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLS  120
            + YAW     F ++ L ++  +VW+  ++RN  +  AI + RW LM+   +    + +L+
Sbjct  621  EGYAWLLKSIFILAALVLVIGVVWFYFKYRNYKNARAIDKSRWTLMSFHKLGFSEFEILA  680

Query  121  AV  122
            ++
Sbjct  681  SL  682


>gi|170744595|ref|YP_001773250.1| EmrB/QacA family drug resistance transporter [Methylobacterium sp. 4-46]
 gi|168198869|gb|ACA20816.1| drug resistance transporter, EmrB/QacA subfamily [Methylobacterium sp. 4-46]
Length=529

 Score = 34.7 bits (78), Expect = 4.2, Method: Compositional matrix adjust.
 Identities = 32/102 (32%), Positives = 48/102 (48%), Gaps = 13/102 (12%)

Query  22   LPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWANDLYDNYAWWFRVSGLAVLAI  81
             P  + GI+S ++G++  + PTI   VG         +  DL+D + W F V+ L  L +
Sbjct  145  FPPSKRGIVSPMIGLVATLAPTIGPTVG--------GYLTDLFD-WHWLFLVNILPGLCV  195

Query  82   LV-WWALRHRNRCSVNAIRRLRWRLMAVLAIAVGTYGVLSAV  122
             V  WAL   +  ++  +RR  W   A LA   G  G L  V
Sbjct  196  TVSTWALVDFDEPNLALLRRFDW---AGLAFMAGFLGCLEYV  234


>gi|288959868|ref|YP_003450208.1| branched-chain amino acid transport system permease protein [Azospirillum sp. B510]
 gi|288912176|dbj|BAI73664.1| branched-chain amino acid transport system permease protein [Azospirillum sp. B510]
Length=612

 Score = 34.7 bits (78), Expect = 4.7, Method: Compositional matrix adjust.
 Identities = 27/73 (37%), Positives = 35/73 (48%), Gaps = 10/73 (13%)

Query  8    DHAAGL----DRRATPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWANDL  63
            D +A L    D RA P QLP WRI I  G +G L      ++A VGI SAA  + +
Sbjct  112  DQSAQLLFSPDPRALPTQLPTWRIAIGGGSIGALDL----LIAGVGIGSAALLYGFLR--  165

Query  64   YDNYAWWFRVSGL  76
            +    W  R + L
Sbjct  166  FTKLGWAVRATAL  178


>gi|299138489|ref|ZP_07031668.1| MotA/TolQ/ExbB proton channel [Acidobacterium sp. MP5ACTX8]
 gi|298599735|gb|EFI55894.1| MotA/TolQ/ExbB proton channel [Acidobacterium sp. MP5ACTX8]
Length=259

 Score = 34.3 bits (77), Expect = 6.2, Method: Compositional matrix adjust.
 Identities = 25/72 (35%), Positives = 33/72 (46%), Gaps = 13/72 (18%)

Query  17   ATPDQLPIWRIGIISGLVG-MLCCVGPTILALVGIISAATAFAW------------ANDL  63
            ATPDQL    +   SG +  ML   GPT L ++GI+  A+ F+W            AN
Sbjct  13   ATPDQLGPPPVAAHSGAIAEMLHNSGPTALTVLGILLLASIFSWAIMLSKWRSFGAANKQ  72

Query  64   YDNYAWWFRVSG  75
               +   FR SG
Sbjct  73   NRRFVRAFRKSG  84


>gi|297180449|gb|ADI16664.1| hypothetical protein [uncultured Rhodobacterales bacterium HF0010_04M21]
Length=158

 Score = 33.9 bits (76), Expect = 9.0, Method: Compositional matrix adjust.
 Identities = 36/116 (32%), Positives = 58/116 (50%), Gaps = 12/116 (10%)

Query  18   TPDQLPIWRIGIISGLVGMLCCVGPTILALVGIISAATAFAWANDLY--DNYAWWFR---  72
            + D   +WR    S  +  +CC    +L   G+ S ++A A +NDLY   N   W R
Sbjct  24   STDVKSLWRWIGGSAFLASMCCFPSVVLVFFGLASVSSAAALSNDLYWGTNGMGWVRPTL  83

Query  73   --VSGLAVLAILVWWALRHRNRCSVNAIRRLRWRL----MAVLAIAVGTYGVLSAV  122
              +S L VLA LV +  R    CS++  +R R R+    + VL++A+ +Y V + +
Sbjct  84   MVLSTLLVLAGLVMY-FRGEGICSLDEAKRQRKRIINTSIVVLSLALLSYIVFNFI  138


Lambda      K        H
   0.330    0.140    0.489

Gapped
Lambda      K        H
   0.267   0.0410    0.140

Effective search space used: 127765705240


  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
  excluding environmental samples from WGS projects
    Posted date:  Sep 5, 2011  4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database:  15,229,318


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40
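
Note on the statistics above: the bit scores and E-values in this report can be roughly cross-checked from the footer values. The sketch below is an illustration, not part of the BLAST output. It assumes the standard Karlin-Altschul relations, bits = (lambda * S - ln K) / ln 2 and E = search_space * 2^(-bits), and uses the gapped Lambda and K together with the effective search space printed in the footer. The function names are hypothetical. Because this search used compositional score adjustment, the results should only land close to the reported values, not match them digit for digit.

```python
import math

# Values copied from the report footer (gapped statistics).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
EFFECTIVE_SEARCH_SPACE = 127765705240


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space=EFFECTIVE_SEARCH_SPACE):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)


if __name__ == "__main__":
    # Raw scores are the parenthesised numbers on the "Score =" lines.
    for raw in (654, 124, 76):
        bits = bit_score(raw)
        print(f"raw={raw:4d}  bits={bits:6.1f}  E={expect_value(bits):.1e}")
```

For the Salinibacter ruber hits (raw score 124) this gives a bit score near the reported 52.4 and an E-value of the same order as the reported 2e-05.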