BLASTP 2.2.25+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters


Query= Rv3222c

Length=183
                                                                     Score    E
Sequences producing significant alignments:                         (Bits)    Value

gi|15610358|ref|NP_217738.1| hypothetical protein Rv3222c [Mycob...    355    1e-96
gi|289763409|ref|ZP_06522787.1| conserved hypothetical protein [...      353    6e-96
gi|294993840|ref|ZP_06799531.1| hypothetical protein Mtub2_04816...      344    4e-93
gi|323718090|gb|EGB27272.1| hypothetical protein TMMG_02369 [Myc...    337    5e-91
gi|339296052|gb|AEJ48163.1| hypothetical protein CCDC5079_2973 [...    260    6e-68
gi|289751913|ref|ZP_06511291.1| conserved hypothetical protein [...      202    1e-50
gi|340628201|ref|YP_004746653.1| hypothetical protein MCAN_32401...    136    1e-30
gi|340628576|ref|YP_004747028.1| hypothetical protein MCAN_36231...    136    2e-30
gi|340626926|ref|YP_004745378.1| hypothetical protein MCAN_19331...    135    2e-30
gi|289570116|ref|ZP_06450343.1| predicted protein [Mycobacterium...     96.3    2e-18
gi|296169593|ref|ZP_06851213.1| conserved hypothetical protein [...     45.8    0.003
gi|169851680|ref|XP_001832529.1| hypothetical protein CC1G_03543...   34.3    7.6


>gi|15610358|ref|NP_217738.1| hypothetical protein Rv3222c [Mycobacterium tuberculosis H37Rv]
 gi|15842810|ref|NP_337847.1| hypothetical protein MT3319 [Mycobacterium tuberculosis CDC1551]
 gi|31794401|ref|NP_856894.1| hypothetical protein Mb3249c [Mycobacterium bovis AF2122/97]
 62 more sequence titles
Length=183

 Score = 355 bits (911), Expect = 1e-96, Method: Compositional matrix adjust.
 Identities = 183/183 (100%), Positives = 183/183 (100%), Gaps = 0/183 (0%)

Query  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60
            MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA
Sbjct  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60

Query  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG  120
            VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG
Sbjct  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG  120

Query  121  VRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRM  180
            VRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRM
Sbjct  121  VRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRM  180

Query  181  VHQ  183
            VHQ
Sbjct  181  VHQ  183


>gi|289763409|ref|ZP_06522787.1| conserved hypothetical protein [Mycobacterium tuberculosis GM 1503]
 gi|289710915|gb|EFD74931.1| conserved hypothetical protein [Mycobacterium tuberculosis GM 1503]
Length=183

 Score = 353 bits (906), Expect = 6e-96, Method: Compositional matrix adjust.
 Identities = 182/183 (99%), Positives = 182/183 (99%), Gaps = 0/183 (0%)

Query  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60
            MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA
Sbjct  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60

Query  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG  120
            VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG
Sbjct  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG  120

Query  121  VRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRM  180
            VRGESFGA RRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRM
Sbjct  121  VRGESFGAHRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRM  180

Query  181  VHQ  183
            VHQ
Sbjct  181  VHQ  183


>gi|294993840|ref|ZP_06799531.1| hypothetical protein Mtub2_04816 [Mycobacterium tuberculosis 210]
Length=217

 Score = 344 bits (882), Expect = 4e-93, Method: Compositional matrix adjust.
 Identities = 179/179 (100%), Positives = 179/179 (100%), Gaps = 0/179 (0%)

Query  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60
            MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA
Sbjct  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60

Query  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG  120
            VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG
Sbjct  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVG  120

Query  121  VRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTR  179
            VRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTR
Sbjct  121  VRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTR  179


>gi|323718090|gb|EGB27272.1| hypothetical protein TMMG_02369 [Mycobacterium tuberculosis CDC1551A]
Length=174

 Score = 337 bits (863), Expect = 5e-91, Method: Compositional matrix adjust.
 Identities = 173/174 (99%), Positives = 174/174 (100%), Gaps = 0/174 (0%)

Query  10   LANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGAVVQVGGLEV  69
            +ANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGAVVQVGGLEV
Sbjct  1    MANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGAVVQVGGLEV  60

Query  70   GSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVGVRGESFGAR  129
            GSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVGVRGESFGAR
Sbjct  61   GSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVGVRGESFGAR  120

Query  130  RRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ  183
            RRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ
Sbjct  121  RRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ  174


>gi|339296052|gb|AEJ48163.1| hypothetical protein CCDC5079_2973 [Mycobacterium tuberculosis CCDC5079]
 gi|339299662|gb|AEJ51772.1| hypothetical protein CCDC5180_2935 [Mycobacterium tuberculosis CCDC5180]
Length=134

 Score = 260 bits (665), Expect = 6e-68, Method: Compositional matrix adjust.
 Identities = 133/134 (99%), Positives = 134/134 (100%), Gaps = 0/134 (0%)

Query  50   VPVVVAAGSGAVVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAG  109
            +PVVVAAGSGAVVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAG
Sbjct  1    MPVVVAAGSGAVVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAG  60

Query  110  QAGQQFGLGVGVRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYI  169
            QAGQQFGLGVGVRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYI
Sbjct  61   QAGQQFGLGVGVRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYI  120

Query  170  GNPSEITDTRMVHQ  183
            GNPSEITDTRMVHQ
Sbjct  121  GNPSEITDTRMVHQ  134


>gi|289751913|ref|ZP_06511291.1| conserved hypothetical protein [Mycobacterium tuberculosis T92]
 gi|289692500|gb|EFD59929.1| conserved hypothetical protein [Mycobacterium tuberculosis T92]
Length=115

 Score = 202 bits (515), Expect = 1e-50, Method: Compositional matrix adjust.
 Identities = 99/102 (98%), Positives = 100/102 (99%), Gaps = 0/102 (0%)

Query  82   VAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVGVRGESFGARRRLALSTVGASG  141
            +  LFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVGVRGESFGARRRLALSTVGASG
Sbjct  14   LRRLFVCRPTEPDVGDFVGLAGGAGDAGQAGQQFGLGVGVRGESFGARRRLALSTVGASG  73

Query  142  ATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ  183
            ATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ
Sbjct  74   ATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ  115


>gi|340628201|ref|YP_004746653.1| hypothetical protein MCAN_32401 [Mycobacterium canettii CIPT 140010059]
 gi|340006391|emb|CCC45571.1| putative uncharacterized protein [Mycobacterium canettii CIPT 140010059]
Length=593

 Score = 136 bits (343), Expect = 1e-30, Method: Compositional matrix adjust.
 Identities = 85/95 (90%), Positives = 87/95 (92%), Gaps = 0/95 (0%)

Query  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60
            MSSPVSSRRLANLV ESLQGSVLGG+VSDAVLPAV DDVKPGAGEDAY V VVVAAGSGA
Sbjct  1    MSSPVSSRRLANLVTESLQGSVLGGIVSDAVLPAVPDDVKPGAGEDAYGVRVVVAAGSGA  60

Query  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDV  95
            VV+VGG  VGSAAVAGEVAD VAELFVC PTEPDV
Sbjct  61   VVKVGGPGVGSAAVAGEVADGVAELFVCGPTEPDV  95

 Score = 122 bits (306), Expect = 2e-26, Method: Compositional matrix adjust.
 Identities = 58/58 (100%), Positives = 58/58 (100%), Gaps = 0/58 (0%)

Query  126  FGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ  183
            FGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ
Sbjct  536  FGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYIGNPSEITDTRMVHQ  593

 Score = 75.9 bits (185), Expect = 2e-12, Method: Compositional matrix adjust.
 Identities = 37/42 (89%), Positives = 38/42 (91%), Gaps = 0/42 (0%)

Query  126  FGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRL  167
            FGARRR  LSTVGASGATAGLRKTHDGHHGCQA  A+TQRRL
Sbjct  451  FGARRRSVLSTVGASGATAGLRKTHDGHHGCQASRAITQRRL  492


>gi|340628576|ref|YP_004747028.1| hypothetical protein MCAN_36231 [Mycobacterium canettii CIPT 140010059]
 gi|340006766|emb|CCC45954.1| putative uncharacterized protein mb3249c [Mycobacterium canettii CIPT 140010059]
Length=641

 Score = 136 bits (342), Expect = 2e-30, Method: Compositional matrix adjust.
 Identities = 85/95 (90%), Positives = 87/95 (92%), Gaps = 0/95 (0%)

Query  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60
            MSSPVSSRRLANLV ESLQGSVLGG+VSDAVLPAV DDVKPGAGEDAY V VVVAAGSGA
Sbjct  1    MSSPVSSRRLANLVTESLQGSVLGGIVSDAVLPAVPDDVKPGAGEDAYGVRVVVAAGSGA  60

Query  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDV  95
            VV+VGG  VGSAAVAGEVAD VAELFVC PTEPDV
Sbjct  61   VVKVGGPGVGSAAVAGEVADGVAELFVCGPTEPDV  95

 Score = 76.3 bits (186), Expect = 2e-12, Method: Compositional matrix adjust.
 Identities = 37/42 (89%), Positives = 38/42 (91%), Gaps = 0/42 (0%)

Query  126  FGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRL  167
            FGARRR  LSTVGASGATAGLRKTHDGHHGCQA  A+TQRRL
Sbjct  451  FGARRRSVLSTVGASGATAGLRKTHDGHHGCQASRAITQRRL  492


>gi|340626926|ref|YP_004745378.1| hypothetical protein MCAN_19331 [Mycobacterium canettii CIPT 140010059]
 gi|340005116|emb|CCC44265.1| putative uncharacterised protein [Mycobacterium canettii CIPT 140010059]
Length=883

 Score = 135 bits (341), Expect = 2e-30, Method: Compositional matrix adjust.
 Identities = 85/95 (90%), Positives = 87/95 (92%), Gaps = 0/95 (0%)

Query  1    MSSPVSSRRLANLVKESLQGSVLGGVVSDAVLPAVSDDVKPGAGEDAYRVPVVVAAGSGA  60
            MSSPVSSRRLANLV ESLQGSVLGG+VSDAVLPAV DDVKPGAGEDAY V VVVAAGSGA
Sbjct  240  MSSPVSSRRLANLVTESLQGSVLGGIVSDAVLPAVPDDVKPGAGEDAYGVRVVVAAGSGA  299

Query  61   VVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDV  95
            VV+VGG  VGSAAVAGEVAD VAELFVC PTEPDV
Sbjct  300  VVKVGGPGVGSAAVAGEVADGVAELFVCGPTEPDV  334

 Score = 76.3 bits (186), Expect = 2e-12, Method: Compositional matrix adjust.
 Identities = 37/42 (89%), Positives = 38/42 (91%), Gaps = 0/42 (0%)

Query  126  FGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRL  167
            FGARRR  LSTVGASGATAGLRKTHDGHHGCQA  A+TQRRL
Sbjct  690  FGARRRSVLSTVGASGATAGLRKTHDGHHGCQASRAITQRRL  731


>gi|289570116|ref|ZP_06450343.1| predicted protein [Mycobacterium tuberculosis T17]
 gi|289543870|gb|EFD47518.1| predicted protein [Mycobacterium tuberculosis T17]
Length=153

 Score = 96.3 bits (238), Expect = 2e-18, Method: Compositional matrix adjust.
 Identities = 65/129 (51%), Positives = 77/129 (60%), Gaps = 18/129 (13%)

Query  56   AGSGAVVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGAGDAGQAGQQF  115
            A SG VV    L+ G+A +A    D    + + RP +        L G         Q  
Sbjct  42   AQSGGVV----LDRGAANLAPTRIDDRHRVIISRPIDSTRDAVPRLVG---------QGS  88

Query  116  GLGVGVRGESFGARRRLALSTVGASGATAGLRKTHDGHHGCQARGALTQRRL-YIGNPSE  174
            GLG  +    FGARRR  LSTVGASGATAGLR+TH GHHGC+A  A+TQRRL  IGNPS+
Sbjct  89   GLGRSL----FGARRRSVLSTVGASGATAGLRRTHRGHHGCRASRAITQRRLGCIGNPSK  144

Query  175  ITDTRMVHQ  183
            +TDTRMVHQ
Sbjct  145  VTDTRMVHQ  153


>gi|296169593|ref|ZP_06851213.1| conserved hypothetical protein [Mycobacterium parascrofulaceum ATCC BAA-614]
 gi|295895859|gb|EFG75554.1| conserved hypothetical protein [Mycobacterium parascrofulaceum ATCC BAA-614]
Length=127

 Score = 45.8 bits (107), Expect = 0.003, Method: Compositional matrix adjust.
 Identities = 44/127 (35%), Positives = 57/127 (45%), Gaps = 28/127 (22%)

Query  85   LFVCRPT----EP-DVGDFVGLAGGAGD----AGQAGQQFGLGVGVR---GESFGARRRL  132
            + + RP     EP D   + G++G   D    A  +G+      G R       GA+RRL
Sbjct  1    MIITRPVDSTREPVDGFRWQGISGKLHDSLLAAKPSGEAPSCDTGARLLVRSPCGAQRRL  60

Query  133  ALSTVGASGATAGLRKTHDGHHGCQARGALTQRRLYI---------------GNPSE-IT  176
            ALSTVG    TAG R+TH G    +A  A+ Q+   +                 PSE I 
Sbjct  61   ALSTVGVPRVTAGPRRTHAGRQRRRASRAMNQQPPRVHRRSIQDHRHTGSCASQPSEQIN  120

Query  177  DTRMVHQ  183
            DTRMVHQ
Sbjct  121  DTRMVHQ  127


>gi|169851680|ref|XP_001832529.1| hypothetical protein CC1G_03543 [Coprinopsis cinerea okayama7#130]
 gi|116506383|gb|EAU89278.1| hypothetical protein CC1G_03543 [Coprinopsis cinerea okayama7#130]
Length=166

 Score = 34.3 bits (77), Expect = 7.6, Method: Compositional matrix adjust.
 Identities = 28/80 (35%), Positives = 38/80 (48%), Gaps = 3/80 (3%)

Query  46   DAYRVPVVVAAGSGAVVQVGGLEVGSAAVAGEVADTVAELFVCRPTEPDVGDFVGLAGGA  105
            DA  + V  A  + AV+ VGGL VG   +AG V DT  E  +C    PD      +  GA
Sbjct  18   DAVVLAVFAAQSAVAVIPVGGLCVG---IAGPVNDTCVEGSICCNVSPDRSLCTQVEEGA  74

Query  106  GDAGQAGQQFGLGVGVRGES  125
                +  ++ GL  G+ G S
Sbjct  75   ECPPKVIEENGLCAGIAGPS  94


Lambda      K        H
   0.315    0.134    0.381

Gapped
Lambda      K        H
   0.267   0.0410    0.140

Effective search space used: 167689013960


  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
  excluding environmental samples from WGS projects
    Posted date:  Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database: 15,229,318


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40