BLASTP 2.2.25+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schäffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           15,229,318 sequences; 5,219,829,388 total letters

Query= Rv3845

Length=119
                                                                     Score    E
Sequences producing significant alignments:                         (Bits)  Value

gi|15610981|ref|NP_218362.1| hypothetical protein Rv3845 [Mycoba...  237    4e-61
gi|254233333|ref|ZP_04926659.1| hypothetical protein TBCG_03772 ...  234    3e-60
gi|294995160|ref|ZP_06800851.1| hypothetical protein Mtub2_11757...  233    5e-60
gi|41409912|ref|NP_962748.1| hypothetical protein MAP3814c [Myco...  63.9   7e-09
gi|134287685|ref|YP_001109851.1| transposase IS116/IS110/IS902 f...  63.9   7e-09
gi|336460223|gb|EGO39126.1| transposase [Mycobacterium avium sub...  63.9   7e-09
gi|330990301|ref|ZP_08314272.1| Insertion element IS110 43.6 kDa...  63.9   8e-09
gi|146279518|ref|YP_001169676.1| IstB ATP binding domain-contain...  62.8   1e-08
gi|296169316|ref|ZP_06850946.1| conserved hypothetical protein [...  60.1   1e-07
gi|254777260|ref|ZP_05218776.1| hypothetical protein MaviaA2_216...  59.7   2e-07
gi|84497869|ref|ZP_00996666.1| hypothetical protein JNB_17318 [J...  58.2   4e-07
gi|84496567|ref|ZP_00995421.1| hypothetical protein JNB_03570 [J...  57.8   5e-07
gi|84497066|ref|ZP_00995888.1| hypothetical protein JNB_12768 [J...  57.8   5e-07
gi|111024172|ref|YP_707144.1| hypothetical protein RHA1_ro07222 ...  56.6   1e-06
gi|317122374|ref|YP_004102377.1| transposase IS116/IS110/IS902 f...  53.9   8e-06
gi|240168199|ref|ZP_04746858.1| hypothetical protein MkanA1_0272...  50.4   1e-04
gi|302380480|ref|ZP_07268946.1| transposase [Finegoldia magna AC...  37.4   0.74
gi|303234648|ref|ZP_07321278.1| transposase [Finegoldia magna BV...  36.2   1.7
gi|301626696|ref|XP_002942525.1| PREDICTED: hypothetical protein...  36.2   1.7
gi|311747638|ref|ZP_07721423.1| transposase [Algoriphagus sp. PR...  35.0   3.6
gi|218295527|ref|ZP_03496340.1| transposase IS116/IS110/IS902 fa...  33.9   9.0

>gi|15610981|ref|NP_218362.1| hypothetical protein Rv3845 [Mycobacterium tuberculosis H37Rv]
 gi|31795019|ref|NP_857512.1| hypothetical protein Mb3875 [Mycobacterium bovis AF2122/97]
 gi|121639763|ref|YP_979987.1| hypothetical protein BCG_3908 [Mycobacterium bovis BCG str. Pasteur 1173P2]
 75 more sequence titles
Length=119

Score = 237 bits (604), Expect = 4e-61, Method: Compositional matrix adjust.
Identities = 118/119 (99%), Positives = 119/119 (100%), Gaps = 0/119 (0%)

Query  1    VDRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE  60
            +DRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE
Sbjct  1    MDRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE  60

Query  61   RTRVTITTGRPYQLRDTNGDPVTARGAKELIDAHYHVDTRTHPHNRAHTDTMQNSKPAR  119
            RTRVTITTGRPYQLRDTNGDPVTARGAKELIDAHYHVDTRTHPHNRAHTDTMQNSKPAR
Sbjct  61   RTRVTITTGRPYQLRDTNGDPVTARGAKELIDAHYHVDTRTHPHNRAHTDTMQNSKPAR  119

>gi|254233333|ref|ZP_04926659.1| hypothetical protein TBCG_03772 [Mycobacterium tuberculosis C]
 gi|124603126|gb|EAY61401.1| hypothetical protein TBCG_03772 [Mycobacterium tuberculosis C]
Length=119

Score = 234 bits (598), Expect = 3e-60, Method: Compositional matrix adjust.
Identities = 116/119 (98%), Positives = 119/119 (100%), Gaps = 0/119 (0%)

Query  1    VDRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE  60
            +DRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE
Sbjct  1    MDRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE  60

Query  61   RTRVTITTGRPYQLRDTNGDPVTARGAKELIDAHYHVDTRTHPHNRAHTDTMQNSKPAR  119
            RTRVTITTGRPYQLRDTNGDPVTARGA+ELIDAHYHVDTRTHPHNRAHT+TMQNSKPAR
Sbjct  61   RTRVTITTGRPYQLRDTNGDPVTARGAEELIDAHYHVDTRTHPHNRAHTNTMQNSKPAR  119

>gi|294995160|ref|ZP_06800851.1| hypothetical protein Mtub2_11757 [Mycobacterium tuberculosis 210]
Length=119

Score = 233 bits (595), Expect = 5e-60, Method: Compositional matrix adjust.
Identities = 116/119 (98%), Positives = 117/119 (99%), Gaps = 0/119 (0%)

Query  1    VDRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE  60
            +DRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE
Sbjct  1    MDRVRRVVTDRDSGAGALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAE  60

Query  61   RTRVTITTGRPYQLRDTNGDPVTARGAKELIDAHYHVDTRTHPHNRAHTDTMQNSKPAR  119
            RTRVTITTGRPYQLRDTNGDPVT RGAKELIDAHYHVDTRTHPHNRAHTDTMQNS PAR
Sbjct  61   RTRVTITTGRPYQLRDTNGDPVTGRGAKELIDAHYHVDTRTHPHNRAHTDTMQNSNPAR  119

>gi|41409912|ref|NP_962748.1| hypothetical protein MAP3814c [Mycobacterium avium subsp. paratuberculosis K-10]
 gi|41398745|gb|AAS06364.1| hypothetical protein MAP_3814c [Mycobacterium avium subsp. paratuberculosis K-10]
Length=493

Score = 63.9 bits (154), Expect = 7e-09, Method: Compositional matrix adjust.
Identities = 36/76 (48%), Positives = 46/76 (61%), Gaps = 1/76 (1%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R+TDPQLAA Y RLMTT+RH H  A   +A  L  R      TG  Y LRDT+G P+T
Sbjct  376  RKTDPQLAAKYKRLMTTERH-HDSAICHIATTLLTRIATCWRTGAHYVLRDTDGRPITFE  434

Query  86   GAKELIDAHYHVDTRT  101
              + ++ AH+ VD +T
Sbjct  435  EGRRIVRAHHAVDKKT  450

>gi|134287685|ref|YP_001109851.1| transposase IS116/IS110/IS902 family protein [Burkholderia vietnamiensis G4]
 gi|134287727|ref|YP_001109893.1| transposase IS116/IS110/IS902 family protein [Burkholderia vietnamiensis G4]
 gi|134287782|ref|YP_001109948.1| transposase IS116/IS110/IS902 family protein [Burkholderia vietnamiensis G4]
 39 more sequence titles
Length=453

Score = 63.9 bits (154), Expect = 7e-09, Method: Compositional matrix adjust.
Identities = 30/72 (42%), Positives = 41/72 (57%), Gaps = 0/72 (0%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R+ DP LAA Y RLM  + H H QA  AVA +L  R    + TG PY LRD  G+ ++ +
Sbjct  365  RKIDPDLAAVYWRLMVNKGHHHKQAICAVATRLINRIYRVLKTGEPYVLRDQEGNSISVQ  424

Query  86   GAKELIDAHYHV  97
              K ++ A + V
Sbjct  425  EGKRIVAAQFKV  436

>gi|336460223|gb|EGO39126.1| transposase [Mycobacterium avium subsp. paratuberculosis S397]
Length=461

Score = 63.9 bits (154), Expect = 7e-09, Method: Compositional matrix adjust.
Identities = 36/76 (48%), Positives = 46/76 (61%), Gaps = 1/76 (1%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R+TDPQLAA Y RLMTT+RH H  A   +A  L  R      TG  Y LRDT+G P+T
Sbjct  344  RKTDPQLAAKYKRLMTTERH-HDSAICHIATTLLTRIATCWRTGAHYVLRDTDGRPITFE  402

Query  86   GAKELIDAHYHVDTRT  101
              + ++ AH+ VD +T
Sbjct  403  EGRRIVRAHHAVDKKT  418

>gi|330990301|ref|ZP_08314272.1| Insertion element IS110 43.6 kDa protein [Gluconacetobacter sp. SXCC-1]
 gi|329762605|gb|EGG79078.1| Insertion element IS110 43.6 kDa protein [Gluconacetobacter sp. SXCC-1]
Length=445

Score = 63.9 bits (154), Expect = 8e-09, Method: Compositional matrix adjust.
Identities = 31/72 (44%), Positives = 43/72 (60%), Gaps = 0/72 (0%) Query 26 RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR 85 R+TDP+LA Y RLMT + H H QA AVA ++ R + G+PY++RD G+ +T Sbjct 357 RKTDPELAELYWRLMTAKGHHHKQALCAVANRIVNRIFSVLKRGKPYEVRDREGNGITMC 416 Query 86 GAKELIDAHYHV 97 AK +I Y V Sbjct 417 EAKAIILERYTV 428 >gi|146279518|ref|YP_001169676.1| IstB ATP binding domain-containing protein [Rhodobacter sphaeroides ATCC 17025] gi|145557759|gb|ABP72371.1| IstB domain protein ATP-binding protein [Rhodobacter sphaeroides ATCC 17025] Length=385 Score = 62.8 bits (151), Expect = 1e-08, Method: Compositional matrix adjust. Identities = 31/72 (44%), Positives = 42/72 (59%), Gaps = 0/72 (0%) Query 26 RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR 85 R+ DP LA Y RLMT + H H QA AVA +L R + +GRPY LRD +G ++ Sbjct 297 RKIDPDLAEVYWRLMTGKGHHHKQALCAVANRLVNRIFSVLRSGRPYVLRDADGRGISVA 356 Query 86 GAKELIDAHYHV 97 AK ++ Y+V Sbjct 357 EAKAIVAERYNV 368 >gi|296169316|ref|ZP_06850946.1| conserved hypothetical protein [Mycobacterium parascrofulaceum ATCC BAA-614] gi|295896020|gb|EFG75706.1| conserved hypothetical protein [Mycobacterium parascrofulaceum ATCC BAA-614] Length=455 Score = 60.1 bits (144), Expect = 1e-07, Method: Compositional matrix adjust. Identities = 35/76 (47%), Positives = 44/76 (58%), Gaps = 1/76 (1%) Query 26 RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR 85 R+ DPQLAA Y RLM+T+RH H A +A L R TG+ Y LRDT+G PVT Sbjct 353 RKIDPQLAAKYQRLMSTERH-HDSAICHIATILLTRIATCWRTGQHYVLRDTDGRPVTEE 411 Query 86 GAKELIDAHYHVDTRT 101 + I H+ VD +T Sbjct 412 EGRRTIRTHHTVDKKT 427 >gi|254777260|ref|ZP_05218776.1| hypothetical protein MaviaA2_21684 [Mycobacterium avium subsp. avium ATCC 25291] Length=442 Score = 59.7 bits (143), Expect = 2e-07, Method: Compositional matrix adjust. 
Identities = 33/76 (44%), Positives = 45/76 (60%), Gaps = 1/76 (1%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R++DPQLAA Y RLM+T+RH H  A   +A  L  R       G  Y LRDT+G P+T
Sbjct  325  RKSDPQLAAKYKRLMSTERH-HDSAICHIATILLTRIATCWRAGAHYVLRDTDGRPITFE  383

Query  86   GAKELIDAHYHVDTRT  101
              + ++ AH+ VD +T
Sbjct  384  EGRRIVRAHHTVDKKT  399

>gi|84497869|ref|ZP_00996666.1| hypothetical protein JNB_17318 [Janibacter sp. HTCC2649]
 gi|84381369|gb|EAP97252.1| hypothetical protein JNB_17318 [Janibacter sp. HTCC2649]
Length=277

Score = 58.2 bits (139), Expect = 4e-07, Method: Compositional matrix adjust.
Identities = 29/75 (39%), Positives = 41/75 (55%), Gaps = 1/75 (1%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R+ DPQ+AA Y RLM   RH H  A   +A +L  R    + TG+PY LRD +G P+T
Sbjct  160  RKIDPQIAAKYQRLMVGDRH-HDSAICHLATQLLTRIATCMRTGQPYALRDVDGTPITEA  218

Query  86   GAKELIDAHYHVDTR  100
              + ++   Y +  R
Sbjct  219  EGRAIVKERYQIPAR  233

>gi|84496567|ref|ZP_00995421.1| hypothetical protein JNB_03570 [Janibacter sp. HTCC2649]
 gi|84383335|gb|EAP99216.1| hypothetical protein JNB_03570 [Janibacter sp. HTCC2649]
Length=384

Score = 57.8 bits (138), Expect = 5e-07, Method: Compositional matrix adjust.
Identities = 29/75 (39%), Positives = 41/75 (55%), Gaps = 1/75 (1%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R+ DPQ+AA Y RLM   RH H  A   +A +L  R    + TG+PY LRD +G P+T
Sbjct  267  RKIDPQIAAKYQRLMVGDRH-HDSAICHLATQLLTRIATCMRTGQPYALRDVDGTPITEA  325

Query  86   GAKELIDAHYHVDTR  100
              + ++   Y +  R
Sbjct  326  EGRAIVKERYQIPAR  340

>gi|84497066|ref|ZP_00995888.1| hypothetical protein JNB_12768 [Janibacter sp. HTCC2649]
 gi|84498258|ref|ZP_00997055.1| hypothetical protein JNB_19263 [Janibacter sp. HTCC2649]
 gi|84381758|gb|EAP97641.1| hypothetical protein JNB_19263 [Janibacter sp. HTCC2649]
 gi|84381954|gb|EAP97836.1| hypothetical protein JNB_12768 [Janibacter sp. HTCC2649]
Length=384

Score = 57.8 bits (138), Expect = 5e-07, Method: Compositional matrix adjust.
Identities = 29/75 (39%), Positives = 41/75 (55%), Gaps = 1/75 (1%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R+ DPQ+AA Y RLM   RH H  A   +A +L  R    + TG+PY LRD +G P+T
Sbjct  267  RKIDPQIAAKYQRLMVGDRH-HDSAICHLATQLLTRIATCMRTGQPYALRDVDGTPITEA  325

Query  86   GAKELIDAHYHVDTR  100
              + ++   Y +  R
Sbjct  326  EGRAIVKERYQIPAR  340

>gi|111024172|ref|YP_707144.1| hypothetical protein RHA1_ro07222 [Rhodococcus jostii RHA1]
 gi|110823702|gb|ABG98986.1| conserved hypothetical protein [Rhodococcus jostii RHA1]
Length=271

Score = 56.6 bits (135), Expect = 1e-06, Method: Compositional matrix adjust.
Identities = 30/76 (40%), Positives = 39/76 (52%), Gaps = 1/76 (1%)

Query  25   GRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTA  84
             RR D Q AA Y RLM   RH H  A   +A  L  R    +  G+PY LRD +G P+T
Sbjct  153  ARRVDTQFAAKYQRLMVGDRH-HESAICHLATHLVTRIAACMRAGQPYALRDVDGTPITE  211

Query  85   RGAKELIDAHYHVDTR  100
               + ++ A Y +D R
Sbjct  212  AEGRAIVTARYKIDPR  227

>gi|317122374|ref|YP_004102377.1| transposase IS116/IS110/IS902 family protein [Thermaerobacter marianensis DSM 12885]
 gi|315592354|gb|ADU51650.1| transposase IS116/IS110/IS902 family protein [Thermaerobacter marianensis DSM 12885]
Length=443

Score = 53.9 bits (128), Expect = 8e-06, Method: Compositional matrix adjust.
Identities = 31/75 (42%), Positives = 40/75 (54%), Gaps = 0/75 (0%)

Query  25   GRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTA  84
             RR DPQLA  Y R +  + H H+ A   VA  LA+R       GRPY LRD +G P+
Sbjct  350  ARRQDPQLAQIYQRCLLEKGHHHSHAVCVVAVHLADRLYAVRRDGRPYVLRDAHGTPLDR  409

Query  85   RGAKELIDAHYHVDT  99
            + A+ L  A+   DT
Sbjct  410  QTARRLAQAYTVPDT  424

>gi|240168199|ref|ZP_04746858.1| hypothetical protein MkanA1_02722 [Mycobacterium kansasii ATCC 12478]
Length=468

Score = 50.4 bits (119), Expect = 1e-04, Method: Compositional matrix adjust.
Identities = 30/76 (40%), Positives = 41/76 (54%), Gaps = 1/76 (1%)

Query  26   RRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTAR  85
            R+TDPQLAA Y R + T+RH H  A   +A  L  R      TG    LRDT+G P+T
Sbjct  351  RKTDPQLAAKYKRFLATERH-HDSAICHIATILLTRIATCWRTGEHCVLRDTDGRPLTPE  409

Query  86   GAKELIDAHYHVDTRT  101
              + ++   + VD +T
Sbjct  410  QGRRIVRTRHAVDKQT  425

>gi|302380480|ref|ZP_07268946.1| transposase [Finegoldia magna ACS-171-V-Col3]
 gi|302311716|gb|EFK93731.1| transposase [Finegoldia magna ACS-171-V-Col3]
Length=389

Score = 37.4 bits (85), Expect = 0.74, Method: Composition-based stats.
Identities = 20/59 (34%), Positives = 30/59 (51%), Gaps = 1/59 (1%)

Query  17   ALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLR  75
            AL +  L     DP  +  +Y + ++  +H H  AT AVARKL       +T   PYQ++
Sbjct  331  ALFQSALRAEFCDPVFSDYYQKKISEGKH-HLVATNAVARKLCHTIFAVLTKNEPYQVQ  388

>gi|303234648|ref|ZP_07321278.1| transposase [Finegoldia magna BVS033A4]
 gi|302494238|gb|EFL54014.1| transposase [Finegoldia magna BVS033A4]
Length=389

Score = 36.2 bits (82), Expect = 1.7, Method: Composition-based stats.
Identities = 19/59 (33%), Positives = 30/59 (51%), Gaps = 1/59 (1%)

Query  17   ALARHPLAGRRTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLR  75
            AL +  L     DP  +  +Y + ++  +H H  AT AV+RKL       +T   PYQ++
Sbjct  331  ALFQSALRAEFCDPVFSDYYQKKISEGKH-HLVATNAVSRKLCHTIFAVLTKNEPYQVQ  388

>gi|301626696|ref|XP_002942525.1| PREDICTED: hypothetical protein LOC100497668 [Xenopus (Silurana) tropicalis]
Length=706

Score = 36.2 bits (82), Expect = 1.7, Method: Compositional matrix adjust.
Identities = 19/57 (34%), Positives = 26/57 (46%), Gaps = 0/57 (0%)

Query  38   RLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQLRDTNGDPVTARGAKELIDAH  94
            RL T QR CH        ++  E   +     +P Q  DT GDP T  G +E+ + H
Sbjct  21   RLPTLQRLCHRWGIDPEGKRRHELMELLTPHAQPTQEGDTQGDPETPEGGEEMCNLH  77

>gi|311747638|ref|ZP_07721423.1| transposase [Algoriphagus sp. PR1]
 gi|311747964|ref|ZP_07721749.1| transposase [Algoriphagus sp. PR1]
 gi|126575623|gb|EAZ79933.1| transposase [Algoriphagus sp. PR1]
 gi|311302728|gb|EFQ79253.1| transposase [Algoriphagus sp. PR1]
Length=351

Score = 35.0 bits (79), Expect = 3.6, Method: Compositional matrix adjust.
Identities = 18/46 (40%), Positives = 25/46 (55%), Gaps = 2/46 (4%)

Query  29   DPQLAAFYHRLMTTQRHCHTQATIAVARKLAERTRVTITTGRPYQL  74
            DP +   Y  L+  QRH   +A + +ARKL  R    + TG PYQ+
Sbjct  304  DPVMLNRYEELL--QRHTGKRAIVIIARKLLSRIYHVLKTGEPYQV  347

>gi|218295527|ref|ZP_03496340.1| transposase IS116/IS110/IS902 family protein [Thermus aquaticus Y51MC23]
 gi|218244159|gb|EED10685.1| transposase IS116/IS110/IS902 family protein [Thermus aquaticus Y51MC23]
Length=317

Score = 33.9 bits (76), Expect = 9.0, Method: Compositional matrix adjust.
Identities = 17/36 (48%), Positives = 23/36 (64%), Gaps = 1/36 (2%)

Query  27   RTDPQLAAFYHRLMTTQRHCHTQATIAVARKLAERT  62
            R DP++ AFYHRL++  +    QA +AVA KL  R
Sbjct  266  RHDPEMGAFYHRLLSRGKR-KKQALVAVAHKLLRRM  300

Lambda      K        H
   0.321    0.131    0.391

Gapped
Lambda      K        H
   0.267   0.0410    0.140

Effective search space used: 129033565320

  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
  excluding environmental samples from WGS projects
    Posted date:  Sep 5, 2011 4:36 AM
  Number of letters in database: 5,219,829,388
  Number of sequences in database: 15,229,318

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40