BLASTP 2.2.25+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.


Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters


Query= Rv3893c
Length=77

                                                                       Score   E
Sequences producing significant alignments:                            (Bits)  Value

gi|340628865|ref|YP_004747317.1|  PE family protein [Mycobacteriu...   148     2e-34
gi|15843524|ref|NP_338561.1|  PE family protein [Mycobacterium tu...   147     5e-34
gi|339633883|ref|YP_004725525.1|  PE family protein [Mycobacteriu...   146     1e-33
gi|297733574|ref|ZP_06962692.1|  PE family protein [Mycobacterium...   136     9e-31
gi|296167009|ref|ZP_06849422.1|  PE2 family protein [Mycobacteriu...   103     8e-21
gi|41406255|ref|NP_959091.1|  hypothetical protein MAP0157 [Mycob...   103     8e-21
gi|342860128|ref|ZP_08716780.1|  PE family protein [Mycobacterium...   102     3e-20
gi|254773504|ref|ZP_05215020.1|  PE family protein [Mycobacterium...   92.8    2e-17
gi|240168373|ref|ZP_04747032.1|  PE family protein [Mycobacterium...   89.0    2e-16
gi|254822563|ref|ZP_05227564.1|  PE family protein [Mycobacterium...   85.1    3e-15
gi|333988696|ref|YP_004521310.1|  hypothetical protein JDM601_005...   68.6    3e-10
gi|240170857|ref|ZP_04749516.1|  hypothetical protein MkanA1_1620...   40.0    0.11
gi|240170359|ref|ZP_04749018.1|  hypothetical protein MkanA1_1368...   36.6    1.4
gi|169245932|gb|ACA50953.1|  hypothetical protein MUDP_036 [Mycob...   34.7    5.4


>gi|340628865|ref|YP_004747317.1| PE family protein [Mycobacterium canettii CIPT 140010059]
 gi|340007055|emb|CCC46246.1| PE family protein [Mycobacterium canettii CIPT 140010059]
Length=101

 Score = 148 bits (374), Expect = 2e-34, Method: Compositional matrix adjust.
 Identities = 77/77 (100%), Positives = 77/77 (100%), Gaps = 0/77 (0%)

Query  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60
           MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS
Sbjct  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60

Query  61  YLGVVAEHASQRGLFAG  77
           YLGVVAEHASQRGLFAG
Sbjct  61  YLGVVAEHASQRGLFAG  77


>gi|15843524|ref|NP_338561.1| PE family protein [Mycobacterium tuberculosis CDC1551]
 gi|31795066|ref|NP_857559.1| PE family protein [Mycobacterium bovis AF2122/97]
 gi|57117167|ref|YP_178025.1| PE family protein [Mycobacterium tuberculosis H37Rv]
 60 more sequence titles
Length=77

 Score = 147 bits (371), Expect = 5e-34, Method: Compositional matrix adjust.
 Identities = 77/77 (100%), Positives = 77/77 (100%), Gaps = 0/77 (0%)

Query  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60
           MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS
Sbjct  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60

Query  61  YLGVVAEHASQRGLFAG  77
           YLGVVAEHASQRGLFAG
Sbjct  61  YLGVVAEHASQRGLFAG  77


>gi|339633883|ref|YP_004725525.1| PE family protein [Mycobacterium africanum GM041182]
 gi|339333239|emb|CCC28976.1| PE family protein [Mycobacterium africanum GM041182]
Length=77

 Score = 146 bits (368), Expect = 1e-33, Method: Compositional matrix adjust.
 Identities = 76/77 (99%), Positives = 76/77 (99%), Gaps = 0/77 (0%)

Query  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60
           MVWSVQPE VLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS
Sbjct  1   MVWSVQPETVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60

Query  61  YLGVVAEHASQRGLFAG  77
           YLGVVAEHASQRGLFAG
Sbjct  61  YLGVVAEHASQRGLFAG  77


>gi|297733574|ref|ZP_06962692.1| PE family protein [Mycobacterium tuberculosis KZN R506]
Length=73

 Score = 136 bits (343), Expect = 9e-31, Method: Compositional matrix adjust.
 Identities = 72/73 (99%), Positives = 73/73 (100%), Gaps = 0/73 (0%)

Query  5   VQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGASYLGV  64
           +QPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGASYLGV
Sbjct  1   MQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGASYLGV  60

Query  65  VAEHASQRGLFAG  77
           VAEHASQRGLFAG
Sbjct  61  VAEHASQRGLFAG  73


>gi|296167009|ref|ZP_06849422.1| PE2 family protein [Mycobacterium parascrofulaceum ATCC BAA-614]
 gi|295897639|gb|EFG77232.1| PE2 family protein [Mycobacterium parascrofulaceum ATCC BAA-614]
Length=112

 Score = 103 bits (257), Expect = 8e-21, Method: Compositional matrix adjust.
 Identities = 64/77 (84%), Positives = 70/77 (91%), Gaps = 0/77 (0%)

Query  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60
           M  S+QPEAVLASA AESAISAETEAAA+ AAPALL T PMG DPDSAMF+AALNACGAS
Sbjct  10  MTMSMQPEAVLASAGAESAISAETEAAASAAAPALLGTLPMGSDPDSAMFAAALNACGAS  69

Query  61  YLGVVAEHASQRGLFAG  77
           YLGVV+EH++QRGLFAG
Sbjct  70  YLGVVSEHSAQRGLFAG  86


>gi|41406255|ref|NP_959091.1| hypothetical protein MAP0157 [Mycobacterium avium subsp. paratuberculosis K-10]
 gi|118462731|ref|YP_879446.1| PE family protein [Mycobacterium avium 104]
 gi|41394603|gb|AAS02474.1| PE_2 [Mycobacterium avium subsp. paratuberculosis K-10]
 gi|118164018|gb|ABK64915.1| PE family protein [Mycobacterium avium 104]
 gi|212595750|gb|ACJ35520.1| PE2 [Mycobacterium avium subsp. avium]
 gi|336457763|gb|EGO36759.1| PE family protein [Mycobacterium avium subsp. paratuberculosis S397]
Length=103

 Score = 103 bits (257), Expect = 8e-21, Method: Compositional matrix adjust.
 Identities = 64/77 (84%), Positives = 69/77 (90%), Gaps = 0/77 (0%)

Query  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60
           MV SVQ E VLASA AESAISAETEAAA+ A+PALL   PMGGDPDSAMF+AALNACGAS
Sbjct  1   MVVSVQSEMVLASAGAESAISAETEAAASAASPALLGVMPMGGDPDSAMFAAALNACGAS  60

Query  61  YLGVVAEHASQRGLFAG  77
           YLGVV+EHA+QRGLFAG
Sbjct  61  YLGVVSEHAAQRGLFAG  77


>gi|342860128|ref|ZP_08716780.1| PE family protein [Mycobacterium colombiense CECT 3035]
 gi|342132506|gb|EGT85735.1| PE family protein [Mycobacterium colombiense CECT 3035]
Length=102

 Score = 102 bits (253), Expect = 3e-20, Method: Compositional matrix adjust.
 Identities = 60/74 (82%), Positives = 67/74 (91%), Gaps = 0/74 (0%)

Query  4   SVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGASYLG  63
           SVQ EAVLAS++AE+AISAETEAAA+ AAP LL   PMGGDPDSAMF+AALNACG SYLG
Sbjct  5   SVQTEAVLASSSAETAISAETEAAASAAAPVLLGVLPMGGDPDSAMFAAALNACGGSYLG  64

Query  64  VVAEHASQRGLFAG  77
           VV+EHA+QRGLFAG
Sbjct  65  VVSEHAAQRGLFAG  78


>gi|254773504|ref|ZP_05215020.1| PE family protein [Mycobacterium avium subsp. avium ATCC 25291]
Length=95

 Score = 92.8 bits (229), Expect = 2e-17, Method: Compositional matrix adjust.
 Identities = 58/68 (86%), Positives = 63/68 (93%), Gaps = 0/68 (0%)

Query  10  VLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGASYLGVVAEHA  69
           VLASA AESAISAETEAAA+ A+PALL   PMGGDPDSAMF+AALNACGASYLGVV+EHA
Sbjct  2   VLASAGAESAISAETEAAASAASPALLGVMPMGGDPDSAMFAAALNACGASYLGVVSEHA  61

Query  70  SQRGLFAG  77
           +QRGLFAG
Sbjct  62  AQRGLFAG  69


>gi|240168373|ref|ZP_04747032.1| PE family protein [Mycobacterium kansasii ATCC 12478]
Length=96

 Score = 89.0 bits (219), Expect = 2e-16, Method: Compositional matrix adjust.
 Identities = 42/45 (94%), Positives = 43/45 (96%), Gaps = 0/45 (0%)

Query  33  PALLSTTPMGGDPDSAMFSAALNACGASYLGVVAEHASQRGLFAG  77
           PALL TTPMG DPDSAMFSAALNACGASYLGVVAEHA+QRGLFAG
Sbjct  29  PALLGTTPMGDDPDSAMFSAALNACGASYLGVVAEHAAQRGLFAG  73


>gi|254822563|ref|ZP_05227564.1| PE family protein [Mycobacterium intracellulare ATCC 13950]
 gi|212596249|gb|ACJ35563.1| PE2 [Mycobacterium intracellulare ATCC 13950]
Length=101

 Score = 85.1 bits (209), Expect = 3e-15, Method: Compositional matrix adjust.
 Identities = 64/77 (84%), Positives = 70/77 (91%), Gaps = 0/77 (0%)

Query  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60
           MV SVQ EAVLASAAAESAISAETEAAA+ A+P LL   PMGGDPDSAMF+AALNACGAS
Sbjct  1   MVVSVQSEAVLASAAAESAISAETEAAASAASPVLLGVLPMGGDPDSAMFAAALNACGAS  60

Query  61  YLGVVAEHASQRGLFAG  77
           YLGVV+EHA+QRGLF+G
Sbjct  61  YLGVVSEHAAQRGLFSG  77


>gi|333988696|ref|YP_004521310.1| hypothetical protein JDM601_0056 [Mycobacterium sp. JDM601]
 gi|333484664|gb|AEF34056.1| conserved hypothetical protein [Mycobacterium sp. JDM601]
Length=97

 Score = 68.6 bits (166), Expect = 3e-10, Method: Compositional matrix adjust.
 Identities = 46/73 (64%), Positives = 59/73 (81%), Gaps = 0/73 (0%)

Query  5   VQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGASYLGV  64
           ++PEAVLAS+  E+AI+AET AAA+ A+PALL   PMG DPDS  F AAL A G+ YLG+
Sbjct  1   MEPEAVLASSGVEAAITAETAAAASSASPALLGVLPMGNDPDSIAFQAALLASGSEYLGI  60

Query  65  VAEHASQRGLFAG  77
           VAEH++QRGL++G
Sbjct  61  VAEHSAQRGLYSG  73


>gi|240170857|ref|ZP_04749516.1| hypothetical protein MkanA1_16207 [Mycobacterium kansasii ATCC 12478]
Length=81

 Score = 40.0 bits (92), Expect = 0.11, Method: Compositional matrix adjust.
 Identities = 29/75 (39%), Positives = 44/75 (59%), Gaps = 4/75 (5%)

Query  1   MVWSVQPEAVLASAAAESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGAS  60
           +V++V+P  V ASA +++ ++A+  A  AG A AL+   PMG   DSA F+      GA+
Sbjct  2   VVFAVEPAVVGASAVSQAGLAAQHGAGVAGCAAALVGVVPMGEVADSAAFA----GVGAA  57

Query  61  YLGVVAEHASQRGLF  75
           Y+    EHA + G F
Sbjct  58  YVSAAGEHARREGRF  72


>gi|240170359|ref|ZP_04749018.1| hypothetical protein MkanA1_13685 [Mycobacterium kansasii ATCC 12478]
Length=85

 Score = 36.6 bits (83), Expect = 1.4, Method: Compositional matrix adjust.
 Identities = 28/59 (48%), Positives = 38/59 (65%), Gaps = 0/59 (0%)

Query  17  ESAISAETEAAAAGAAPALLSTTPMGGDPDSAMFSAALNACGASYLGVVAEHASQRGLF  75
           ++ ++A   A  A AAP L +  PMG D DSA F+AAL A GA+Y+    EHA+ RG+F
Sbjct  3   QAGLAARLGAGVAAAAPTLSAVAPMGEDADSAAFTAALAAVGAAYVSTAGEHAAARGVF  61


>gi|169245932|gb|ACA50953.1| hypothetical protein MUDP_036 [Mycobacterium marinum DL240490]
Length=98

 Score = 34.7 bits (78), Expect = 5.4, Method: Compositional matrix adjust.
 Identities = 20/44 (46%), Positives = 27/44 (62%), Gaps = 0/44 (0%)

Query  33  PALLSTTPMGGDPDSAMFSAALNACGASYLGVVAEHASQRGLFA  76
           PA+ +  PMGG+  SAM + A+ A GA +L V A   +QR  FA
Sbjct  32  PAMGAMVPMGGEEVSAMLAQAIAAHGAQFLAVGAVGVAQREAFA  75


Lambda      K        H
   0.311    0.120    0.334

Gapped
Lambda      K        H
   0.267   0.0410    0.140

Effective search space used: 130175841596


  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
    Posted date:  Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database:  15,229,318


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40