BLASTP 2.2.25+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters


Query= Rv2023c
Length=119

                                                                    Score      E
Sequences producing significant alignments:                         (Bits)   Value

gi|15841506|ref|NP_336543.1| hypothetical protein MT2079 [Mycoba...  230     6e-59
gi|15609160|ref|NP_216539.1| hypothetical protein Rv2023c [Mycob...  228     2e-58
gi|167970448|ref|ZP_02552725.1| hypothetical protein MtubH3_2141...  196     9e-49
gi|91201361|emb|CAJ74421.1| predicted orf [Candidatus Kuenenia s...  45.4    0.003
gi|91199948|emb|CAJ72990.1| hypothetical protein kuste2245 [Cand...  43.1    0.014
gi|153820608|ref|ZP_01973275.1| cadherin domain protein [Vibrio ...  34.3    6.8
gi|154300115|ref|XP_001550474.1| hypothetical protein BC1G_10433...  33.9    7.3
gi|156044546|ref|XP_001588829.1| hypothetical protein SS1G_10377...  33.9    7.6


>gi|15841506|ref|NP_336543.1| hypothetical protein MT2079 [Mycobacterium tuberculosis CDC1551]
 gi|13881748|gb|AAK46357.1| hypothetical protein MT2079 [Mycobacterium tuberculosis CDC1551]
Length=151

 Score = 230 bits (586), Expect = 6e-59, Method: Compositional matrix adjust.
 Identities = 119/119 (100%), Positives = 119/119 (100%), Gaps = 0/119 (0%)

Query  1    VAARHARAGRWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVE  60
            VAARHARAGRWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVE
Sbjct  33   VAARHARAGRWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVE  92

Query  61   RVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS  119
            RVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS
Sbjct  93   RVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS  151


>gi|15609160|ref|NP_216539.1| hypothetical protein Rv2023c [Mycobacterium tuberculosis H37Rv]
 gi|31793203|ref|NP_855696.1| hypothetical protein Mb2046c [Mycobacterium bovis AF2122/97]
 gi|121637907|ref|YP_978130.1| hypothetical protein BCG_2040c [Mycobacterium bovis BCG str. Pasteur 1173P2]
 45 more sequence titles
Length=119

 Score = 228 bits (581), Expect = 2e-58, Method: Compositional matrix adjust.
 Identities = 118/119 (99%), Positives = 119/119 (100%), Gaps = 0/119 (0%)

Query  1    VAARHARAGRWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVE  60
            +AARHARAGRWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVE
Sbjct  1    MAARHARAGRWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVE  60

Query  61   RVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS  119
            RVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS
Sbjct  61   RVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS  119


>gi|167970448|ref|ZP_02552725.1| hypothetical protein MtubH3_21418 [Mycobacterium tuberculosis H37Ra]
 gi|254232194|ref|ZP_04925521.1| hypothetical protein TBCG_01976 [Mycobacterium tuberculosis C]
 gi|254551046|ref|ZP_05141493.1| hypothetical protein Mtube_11376 [Mycobacterium tuberculosis '98-R604 INH-RIF-EM']
 26 more sequence titles
Length=102

 Score = 196 bits (498), Expect = 9e-49, Method: Compositional matrix adjust.
 Identities = 102/102 (100%), Positives = 102/102 (100%), Gaps = 0/102 (0%)

Query  18   MLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVERVDAHSRFSSSNLPKSS  77
            MLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVERVDAHSRFSSSNLPKSS
Sbjct  1    MLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGLVERVDAHSRFSSSNLPKSS  60

Query  78   RRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS  119
            RRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS
Sbjct  61   RRISGRVSLSGMSNSAAKVVASTSSSPWGQPLSVGLRRRWRS  102


>gi|91201361|emb|CAJ74421.1| predicted orf [Candidatus Kuenenia stuttgartiensis]
Length=139

 Score = 45.4 bits (106), Expect = 0.003, Method: Compositional matrix adjust.
 Identities = 25/84 (30%), Positives = 40/84 (48%), Gaps = 3/84 (3%)

Query  10   RWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRLGL-VERVDAHSRF  68
            +W  Q  PM  +  + YE+ AN  A   GGI  +H++  R GLV  +   +E +  H  +
Sbjct  52   QWGDQKHPMFTAKNIHYEIAANSQAIACGGIGVIHQMAIRSGLVKEIDENLELLKRHIPY  111

Query  69   SSSN--LPKSSRRISGRVSLSGMS  90
              S+  L  +   +SG V L  +
Sbjct  112  HESDHILNIAYNVLSGNVRLEDIE  135


>gi|91199948|emb|CAJ72990.1| hypothetical protein kuste2245 [Candidatus Kuenenia stuttgartiensis]
Length=507

 Score = 43.1 bits (100), Expect = 0.014, Method: Composition-based stats.
 Identities = 17/47 (37%), Positives = 25/47 (54%), Gaps = 0/47 (0%)

Query  10  RWAAQPRPMLGSGAVRYEVGANIDATGFGGIAAVHRLVTRLGLVTRL  56
           +W  Q  PM  +  + YE+ AN  A   GGI  +H++  R GLV  +
Sbjct  28  QWGDQKHPMFTAKNIHYEIAANSQAIACGGIGVIHQMAIRSGLVKEI  74


>gi|153820608|ref|ZP_01973275.1| cadherin domain protein [Vibrio cholerae NCTC 8457]
 gi|126508848|gb|EAZ71442.1| cadherin domain protein [Vibrio cholerae NCTC 8457]
Length=287

 Score = 34.3 bits (77), Expect = 6.8, Method: Compositional matrix adjust.
 Identities = 27/80 (34%), Positives = 32/80 (40%), Gaps = 4/80 (5%)

Query  29   GANIDATGFGGIAAVHRLVTRLGLVTRLGLVERVDAHSRFSSSNL----PKSSRRISGRV  84
            GA   A  F  +A VH LV        LG V+  D   + +  NL    PK      G
Sbjct  202  GAEAAANDFEALANVHSLVVTATEDAGLGGVKTTDITVKLNEQNLDDNAPKFEGTTDGEY  261

Query  85   SLSGMSNSAAKVVASTSSSP  104
            S S   NSAA  V  T  +P
Sbjct  262  SFSYDENSAADTVLGTVKAP  281


>gi|154300115|ref|XP_001550474.1| hypothetical protein BC1G_10433 [Botryotinia fuckeliana B05.10]
 gi|150856722|gb|EDN31914.1| hypothetical protein BC1G_10433 [Botryotinia fuckeliana B05.10]
Length=717

 Score = 33.9 bits (76), Expect = 7.3, Method: Compositional matrix adjust.
 Identities = 22/72 (31%), Positives = 31/72 (44%), Gaps = 0/72 (0%)

Query  34   ATGFGGIAAVHRLVTRLGLVTRLGLVERVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSA  93
            A+     A +HR    + L    G+V     H     S+L K +RR SGR S + MS  +
Sbjct  446  ASSTNHYAEIHRQQAEMALNGSSGIVSPPSGHKESFFSHLRKRARRFSGRQSTTPMSPKS  505

Query  94   AKVVASTSSSPW  105
              + A     PW
Sbjct  506  MDLEAQAGCGPW  517


>gi|156044546|ref|XP_001588829.1| hypothetical protein SS1G_10377 [Sclerotinia sclerotiorum 1980]
 gi|154694765|gb|EDN94503.1| hypothetical protein SS1G_10377 [Sclerotinia sclerotiorum 1980 UF-70]
Length=795

 Score = 33.9 bits (76), Expect = 7.6, Method: Compositional matrix adjust.
 Identities = 22/72 (31%), Positives = 31/72 (44%), Gaps = 0/72 (0%)

Query  34   ATGFGGIAAVHRLVTRLGLVTRLGLVERVDAHSRFSSSNLPKSSRRISGRVSLSGMSNSA  93
            A+     A +HR    + L    GLV     H     S+L K +RR SGR S + MS  +
Sbjct  513  ASSTNHYAEIHRQQAEMALSGNSGLVSPPSGHKESFFSHLRKRARRFSGRQSQTPMSPKS  572

Query  94   AKVVASTSSSPW  105
              + +     PW
Sbjct  573  MDLESQAGCGPW  584


Lambda     K        H
   0.318    0.129    0.381

Gapped
Lambda     K        H
   0.267    0.0410   0.140

Effective search space used: 129033565320


  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
  excluding environmental samples from WGS projects
    Posted date: Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database: 15,229,318


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40