BLASTP 2.2.25+
Reference:
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics:
Alejandro A. Schäffer, L. Aravind, Thomas L. Madden, Sergei
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST
protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
15,229,318 sequences; 5,219,829,388 total letters
Query= Rv3472
Length=168
Sequences producing significant alignments: Score (Bits) E Value
gi|15610608|ref|NP_217989.1| hypothetical protein Rv3472 [Mycoba... 338 1e-91
gi|340628442|ref|YP_004746894.1| hypothetical protein MCAN_34881... 336 5e-91
gi|344221308|gb|AEN01939.1| hypothetical protein MTCTRI2_3540 [M... 335 2e-90
gi|146386422|pdb|2CHC|A Chain A, Structure Of Rv3472(D26n), A Fu... 334 3e-90
gi|254552576|ref|ZP_05143023.1| hypothetical protein Mtube_19370... 330 6e-89
gi|158314545|ref|YP_001507053.1| hypothetical protein Franean1_2... 78.6 3e-13
gi|297181518|gb|ADI17704.1| hypothetical protein [uncultured Oce... 67.8 5e-10
gi|288916396|ref|ZP_06410774.1| aromatic ring hydroxylating diox... 67.8 5e-10
gi|91783074|ref|YP_558280.1| aromatic ring hydroxylating dioxyge... 67.8 5e-10
gi|345137697|dbj|BAK67306.1| hypothetical protein SLG_26310 [Sph... 64.7 4e-09
gi|255591142|ref|XP_002535448.1| conserved hypothetical protein ... 62.4 2e-08
gi|312196375|ref|YP_004016436.1| aromatic-ring-hydroxylating dio... 59.3 2e-07
gi|343927340|ref|ZP_08766813.1| hypothetical protein GOALK_092_0... 58.9 2e-07
gi|254284387|ref|ZP_04959355.1| conserved hypothetical protein [... 58.9 3e-07
gi|312195909|ref|YP_004015970.1| hypothetical protein FraEuI1c_2... 58.2 4e-07
gi|103486353|ref|YP_615914.1| hypothetical protein Sala_0863 [Sp... 56.6 1e-06
gi|345012268|ref|YP_004814622.1| aromatic-ring-hydroxylating dio... 56.6 1e-06
gi|339321911|ref|YP_004680805.1| short-chain dehydrogenase/reduc... 56.6 1e-06
gi|116694598|ref|YP_728809.1| hypothetical protein H16_B0647 [Ra... 55.8 2e-06
gi|288915941|ref|ZP_06410323.1| hypothetical protein FrEUN1fDRAF... 55.1 4e-06
gi|284167046|ref|YP_003405324.1| hypothetical protein Htur_3789 ... 54.7 4e-06
gi|262194952|ref|YP_003266161.1| hypothetical protein Hoch_1719 ... 53.5 9e-06
gi|288916397|ref|ZP_06410775.1| conserved hypothetical protein [... 53.5 1e-05
gi|119504553|ref|ZP_01626632.1| hypothetical protein MGP2080_131... 53.1 1e-05
gi|329897222|ref|ZP_08271961.1| hypothetical protein IMCC3088_26... 51.6 4e-05
gi|148553282|ref|YP_001260864.1| hypothetical protein Swit_0355 ... 51.2 5e-05
gi|254821966|ref|ZP_05226967.1| hypothetical protein MintA_18682... 51.2 6e-05
gi|312198020|ref|YP_004018081.1| hypothetical protein FraEuI1c_4... 50.8 7e-05
gi|114761430|ref|ZP_01441345.1| hypothetical protein 11000110013... 50.4 8e-05
gi|158314546|ref|YP_001507054.1| hypothetical protein Franean1_2... 50.4 9e-05
gi|240170442|ref|ZP_04749101.1| hypothetical protein MkanA1_1411... 50.4 9e-05
gi|78059878|ref|YP_366453.1| hypothetical protein Bcep18194_C676... 50.1 1e-04
gi|146275952|ref|YP_001166112.1| hypothetical protein Saro_3727 ... 50.1 1e-04
gi|312141326|ref|YP_004008662.1| polyketide cyclase [Rhodococcus... 49.7 1e-04
gi|341613724|ref|ZP_08700593.1| hypothetical protein CJLT1_02175... 49.3 2e-04
gi|342859388|ref|ZP_08716042.1| hypothetical protein MCOL_10928 ... 49.3 2e-04
gi|254283111|ref|ZP_04958079.1| conserved hypothetical protein, ... 49.3 2e-04
gi|312141738|ref|YP_004009074.1| hypothetical protein REQ_44330 ... 48.9 2e-04
gi|312141743|ref|YP_004009079.1| hypothetical protein REQ_44380 ... 48.9 3e-04
gi|325672997|ref|ZP_08152691.1| ring hydroxylating beta subunit ... 48.5 3e-04
gi|325524074|gb|EGD02248.1| hypothetical protein B1M_22467 [Burk... 48.5 3e-04
gi|342858501|ref|ZP_08715156.1| hypothetical protein MCOL_06486 ... 48.5 3e-04
gi|336178861|ref|YP_004584236.1| nuclear transport factor 2 [Fra... 48.5 4e-04
gi|108799571|ref|YP_639768.1| hypothetical protein Mmcs_2604 [My... 48.1 4e-04
gi|148554687|ref|YP_001262269.1| hypothetical protein Swit_1769 ... 48.1 5e-04
gi|254818640|ref|ZP_05223641.1| hypothetical protein MintA_01889... 48.1 5e-04
gi|126432646|ref|YP_001068337.1| hypothetical protein Mjls_0033 ... 47.8 5e-04
gi|325673836|ref|ZP_08153526.1| hypothetical protein HMPREF0724_... 47.8 5e-04
gi|297563578|ref|YP_003682552.1| aromatic-ring-hydroxylating dio... 47.8 5e-04
gi|94495150|ref|ZP_01301731.1| hypothetical protein SKA58_01615 ... 47.8 5e-04
>gi|15610608|ref|NP_217989.1| hypothetical protein Rv3472 [Mycobacterium tuberculosis H37Rv]
gi|15843084|ref|NP_338121.1| hypothetical protein MT3578 [Mycobacterium tuberculosis CDC1551]
gi|31794648|ref|NP_857141.1| hypothetical protein Mb3501 [Mycobacterium bovis AF2122/97]
76 more sequence titles
Length=168
Score = 338 bits (867), Expect = 1e-91, Method: Compositional matrix adjust.
Identities = 168/168 (100%), Positives = 168/168 (100%), Gaps = 0/168 (0%)
Query 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA
Sbjct 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
Query 61 DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW
Sbjct 61 DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
Query 121 RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 168
RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT
Sbjct 121 RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 168
>gi|340628442|ref|YP_004746894.1| hypothetical protein MCAN_34881 [Mycobacterium canettii CIPT
140010059]
gi|340006632|emb|CCC45819.1| conserved hypothetical protein [Mycobacterium canettii CIPT 140010059]
Length=168
Score = 336 bits (862), Expect = 5e-91, Method: Compositional matrix adjust.
Identities = 167/168 (99%), Positives = 167/168 (99%), Gaps = 0/168 (0%)
Query 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA
Sbjct 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
Query 61 DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW
Sbjct 61 DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
Query 121 RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 168
RIAYRRLRNDRLVSDPSVAVNVADADVA VVGHLLAAARRLGTQMSDT
Sbjct 121 RIAYRRLRNDRLVSDPSVAVNVADADVAVVVGHLLAAARRLGTQMSDT 168
>gi|344221308|gb|AEN01939.1| hypothetical protein MTCTRI2_3540 [Mycobacterium tuberculosis
CTRI-2]
Length=168
Score = 335 bits (858), Expect = 2e-90, Method: Compositional matrix adjust.
Identities = 167/168 (99%), Positives = 167/168 (99%), Gaps = 0/168 (0%)
Query 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA
Sbjct 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
Query 61 DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
DAHARVVRG HLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW
Sbjct 61 DAHARVVRGCHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
Query 121 RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 168
RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT
Sbjct 121 RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 168
>gi|146386422|pdb|2CHC|A Chain A, Structure Of Rv3472(D26n), A Function Unknown Protein
From Mycobacterium Tuberculosis
gi|146386423|pdb|2CHC|B Chain B, Structure Of Rv3472(D26n), A Function Unknown Protein
From Mycobacterium Tuberculosis
gi|146386424|pdb|2CHC|C Chain C, Structure Of Rv3472(D26n), A Function Unknown Protein
From Mycobacterium Tuberculosis
Length=170
Score = 334 bits (856), Expect = 3e-90, Method: Compositional matrix adjust.
Identities = 166/168 (99%), Positives = 167/168 (99%), Gaps = 0/168 (0%)
Query 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
M PVDEQWIEILRIQALCARYCLTI+TQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA
Sbjct 3 MGPVDEQWIEILRIQALCARYCLTINTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 62
Query 61 DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW
Sbjct 63 DAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 122
Query 121 RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 168
RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT
Sbjct 123 RIAYRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 170
>gi|254552576|ref|ZP_05143023.1| hypothetical protein Mtube_19370 [Mycobacterium tuberculosis
'98-R604 INH-RIF-EM']
Length=165
Score = 330 bits (845), Expect = 6e-89, Method: Compositional matrix adjust.
Identities = 164/165 (99%), Positives = 165/165 (100%), Gaps = 0/165 (0%)
Query 4 VDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAH 63
+DEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAH
Sbjct 1 MDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAH 60
Query 64 ARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIA 123
ARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIA
Sbjct 61 ARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIA 120
Query 124 YRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 168
YRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT
Sbjct 121 YRRLRNDRLVSDPSVAVNVADADVAAVVGHLLAAARRLGTQMSDT 165
>gi|158314545|ref|YP_001507053.1| hypothetical protein Franean1_2721 [Frankia sp. EAN1pec]
gi|158109950|gb|ABW12147.1| conserved hypothetical protein [Frankia sp. EAN1pec]
Length=134
Score = 78.6 bits (192), Expect = 3e-13, Method: Compositional matrix adjust.
Identities = 50/123 (41%), Positives = 61/123 (50%), Gaps = 2/123 (1%)
Query 5 DEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREY-ADAH 63
D ++ L +QA A YC D D G FT DG F FDG RG AL E+ +
Sbjct 3 DSALLDELAVQATLAHYCHRCDDGDLAGVVALFTPDGVFSFDGRTARGSQALLEFFQSSQ 62
Query 64 ARV-VRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRI 122
R RG+HLT + + DGDVA S V A A G G Y D L++ DG+WRI
Sbjct 63 GRPDQRGKHLTVNTVVRPDGDVARSVSDFVFLRAGAEGLVPAIVGRYHDELVRLDGEWRI 122
Query 123 AYR 125
A R
Sbjct 123 ARR 125
>gi|297181518|gb|ADI17704.1| hypothetical protein [uncultured Oceanospirillales bacterium
HF0130_25G24]
Length=153
Score = 67.8 bits (164), Expect = 5e-10, Method: Compositional matrix adjust.
Identities = 47/147 (32%), Positives = 72/147 (49%), Gaps = 19/147 (12%)
Query 3 PVDEQ-WIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALRE--- 58
PV E+ + EIL L RY ID E WA CFT DG+FE + I+GR RE
Sbjct 2 PVSEKDYAEILH---LAGRYAFAIDHNKPEEWADCFTSDGSFESN---IQGRFTGREDLV 55
Query 59 ------YADAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDR 112
A + R RH + EV+G+ A + +++ A+ + + G+Y D+
Sbjct 56 YFCETVMAFCESEGQRPRHWNNQWVIEVEGNKAVSKCYALII--DASNFSPISVGQYNDK 113
Query 113 LIKQDGQWRIAYRRLRNDRLVSDPSVA 139
LI+++G+W R D + DPS++
Sbjct 114 LIRENGEWLFKERIYNFDGEI-DPSLS 139
>gi|288916396|ref|ZP_06410774.1| aromatic ring hydroxylating dioxygenase beta subunit [Frankia
sp. EUN1f]
gi|288352167|gb|EFC86366.1| aromatic ring hydroxylating dioxygenase beta subunit [Frankia
sp. EUN1f]
Length=138
Score = 67.8 bits (164), Expect = 5e-10, Method: Compositional matrix adjust.
Identities = 46/123 (38%), Positives = 59/123 (48%), Gaps = 2/123 (1%)
Query 9 IEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADA-HARV- 66
I++L +Q++ ARYC D D G FT DG F + GR L + R
Sbjct 12 IDVLAVQSVLARYCHRCDDGDFAGLVDLFTPDGVFTYGDRTAHGRSELLAFFQGTQGRPG 71
Query 67 VRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRR 126
RG+HLT +L DGD A S V A G + +G Y D L + DGQWRIA R
Sbjct 72 QRGKHLTFNLDVRPDGDTARSVSDYVFLRTGADGPVLRLAGRYHDELRRLDGQWRIARRD 131
Query 127 LRN 129
+ N
Sbjct 132 VIN 134
>gi|91783074|ref|YP_558280.1| aromatic ring hydroxylating dioxygenase beta subunit [Burkholderia
xenovorans LB400]
gi|91687028|gb|ABE30228.1| Predicted aromatic ring hydroxylating dioxygenase beta subunit
[Burkholderia xenovorans LB400]
Length=145
Score = 67.8 bits (164), Expect = 5e-10, Method: Compositional matrix adjust.
Identities = 39/120 (33%), Positives = 62/120 (52%), Gaps = 5/120 (4%)
Query 13 RIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDG-WVIRGRPALREYADAHARV----V 67
+I L + +C +D Q + +A FTEDG F D +GR A+R +A+ V
Sbjct 10 KIHDLLSSFCDNMDLQRFDEFAALFTEDGVFYADDVGQPKGRAAIRAFAEGILPVEGEGP 69
Query 68 RGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
+ +H+ T++ VDGD A S ++ +A+ I +G Y+D L + +G WR RRL
Sbjct 70 KRKHIMTNVFVNVDGDTARSNSTFIMVRESASEIVIAAAGRYEDELARDNGVWRFKTRRL 129
>gi|345137697|dbj|BAK67306.1| hypothetical protein SLG_26310 [Sphingobium sp. SYK-6]
Length=202
Score = 64.7 bits (156), Expect = 4e-09, Method: Compositional matrix adjust.
Identities = 41/134 (31%), Positives = 62/134 (47%), Gaps = 15/134 (11%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPAL--------------REY 59
I+ L RY D D +A F EDG ++ I+GR A+ R
Sbjct 37 IEDLHGRYLFAFDWHDAASYAATFAEDGILDYGAGEIKGRKAIAAFIEEGRKRTAEARAK 96
Query 60 ADAHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGS-GEYQDRLIKQDG 118
A A R GRH+ ++++ ++DG+ A G + + GY + G Y+D L+K DG
Sbjct 97 APAGERPKGGRHIISNIVVKLDGNKAHGLAYWTHMTSGPTGYGTVDFWGHYEDELVKVDG 156
Query 119 QWRIAYRRLRNDRL 132
QW A RR+ N +
Sbjct 157 QWLYARRRIYNQAI 170
>gi|255591142|ref|XP_002535448.1| conserved hypothetical protein [Ricinus communis]
gi|223523073|gb|EEF26936.1| conserved hypothetical protein [Ricinus communis]
Length=144
Score = 62.4 bits (150), Expect = 2e-08, Method: Compositional matrix adjust.
Identities = 44/123 (36%), Positives = 62/123 (51%), Gaps = 11/123 (8%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTED-----GAFEFDGWVIRGRPALREYADAHARVVR 68
I+ + YC D D E W FTED GAF +++G+ A+RE+ A +
Sbjct 16 IREVLYGYCYGTDGGDTELWVEGFTEDCVWDGGAFG----MLKGKDAMREFHRASGEGSK 71
Query 69 G-RHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
RHLT + + ++DGD A S V +A I G Y D+ +K DGQWRI R++
Sbjct 72 ALRHLTLNSVIDLDGDTAHVVS-YVAVVAQGQPAAIYFLGHYDDQFVKVDGQWRIKSRKV 130
Query 128 RND 130
R D
Sbjct 131 RAD 133
>gi|312196375|ref|YP_004016436.1| aromatic-ring-hydroxylating dioxygenase subunit beta [Frankia
sp. EuI1c]
gi|311227711|gb|ADP80566.1| aromatic-ring-hydroxylating dioxygenase beta subunit [Frankia
sp. EuI1c]
Length=139
Score = 59.3 bits (142), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 41/119 (35%), Positives = 57/119 (48%), Gaps = 5/119 (4%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA-DAHARV-VRGRH 71
I ARYC D D EG FT D F + G L + D +R RG+H
Sbjct 14 IHMTLARYCHRCDDADFEGLVALFTADAVFTYGNRSAHGSAELLAFFRDTQSRPEQRGKH 73
Query 72 LTTDLLYEVDGDVA--TGRSASVVTLATAAGYKILG-SGEYQDRLIKQDGQWRIAYRRL 127
LT + +YE DGD ++ V L A+G + +G Y D+ ++ DG+WRIA R +
Sbjct 74 LTVNEVYEPDGDRGDRVLAASDFVFLRFASGRLVPAIAGRYHDQFVRVDGEWRIARREV 132
>gi|343927340|ref|ZP_08766813.1| hypothetical protein GOALK_092_00030 [Gordonia alkanivorans NBRC
16433]
gi|343762677|dbj|GAA13739.1| hypothetical protein GOALK_092_00030 [Gordonia alkanivorans NBRC
16433]
Length=162
Score = 58.9 bits (141), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 42/122 (35%), Positives = 62/122 (51%), Gaps = 6/122 (4%)
Query 10 EILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWV-IRGRPALREY--ADAHARV 66
+I I+ L ARYC +D + + FTEDG EFDG RG+ +R + A +
Sbjct 16 DIEAIRTLDARYCRHLDDGNWDELIALFTEDG--EFDGLSNPRGKAEMRAFFAGLADGGL 73
Query 67 VRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRR 126
H T+L ++DGD+AT RS + G + +G Y D L+KQDG W ++
Sbjct 74 TSFWHFITNLEIDIDGDMATARSF-LWQPCVLDGVASIAAGRYTDTLVKQDGHWLYRVKK 132
Query 127 LR 128
+R
Sbjct 133 VR 134
>gi|254284387|ref|ZP_04959355.1| conserved hypothetical protein [gamma proteobacterium NOR51-B]
gi|219680590|gb|EED36939.1| conserved hypothetical protein [gamma proteobacterium NOR51-B]
Length=135
Score = 58.9 bits (141), Expect = 3e-07, Method: Compositional matrix adjust.
Identities = 36/128 (29%), Positives = 57/128 (45%), Gaps = 4/128 (3%)
Query 5 DEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHA 64
D ++ L I+AL RYC ++ +D W G + ED +E + G A
Sbjct 3 DNTLMDQLAIRALLDRYCDGVNQRDATIWGGTWAEDAVWELPHLEMSGIQGRENIVSAWV 62
Query 65 RVVRGRHLTTDL----LYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQW 120
++ + + EV GD AT RS + T G +I GEY D +K DG+W
Sbjct 63 EAMKLFPFVNMMAQPGVIEVTGDKATMRSYTTEVAVTQDGNEIRPRGEYHDECVKVDGEW 122
Query 121 RIAYRRLR 128
+ + R+ +
Sbjct 123 KFSLRKFK 130
>gi|312195909|ref|YP_004015970.1| hypothetical protein FraEuI1c_2050 [Frankia sp. EuI1c]
gi|311227245|gb|ADP80100.1| hypothetical protein FraEuI1c_2050 [Frankia sp. EuI1c]
Length=154
Score = 58.2 bits (139), Expect = 4e-07, Method: Compositional matrix adjust.
Identities = 46/131 (36%), Positives = 59/131 (46%), Gaps = 12/131 (9%)
Query 12 LRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRGRH 71
L I L ARYC+ +D D +GW G FT D F+ DG RG LR RG H
Sbjct 13 LAITGLLARYCVLLDLVDVDGWVGLFTTDAGFDIDGRTYRGHDGLRRLMRT---AQRGTH 69
Query 72 LTTDLLYEVDGD--VATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRI---AYRR 126
L + E G V T R+A V +AA L Y+D +++ WRI R
Sbjct 70 LANPPVIEAVGPDRVRTTRNALFVNRHSAALRHTL----YRDEVVRTADGWRIQSVVCRF 125
Query 127 LRNDRLVSDPS 137
+ D LV+ P
Sbjct 126 VTGDGLVTWPE 136
>gi|103486353|ref|YP_615914.1| hypothetical protein Sala_0863 [Sphingopyxis alaskensis RB2256]
gi|98976430|gb|ABF52581.1| conserved hypothetical protein [Sphingopyxis alaskensis RB2256]
Length=146
Score = 56.6 bits (135), Expect = 1e-06, Method: Compositional matrix adjust.
Identities = 43/120 (36%), Positives = 56/120 (47%), Gaps = 7/120 (5%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRG---R 70
I+ L ARY D + A CF DG E+ G G A++ + R R R
Sbjct 11 IRDLLARYTYHGDRGRIDDLAACFAPDGVLEYPGAAPCGPAAIKASLSSGTRDRRLTFIR 70
Query 71 HLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLRND 130
H T+ L VDGDVAT RS V + + SG Y DR+++ WR A RR+R D
Sbjct 71 HHITNPLIIVDGDVATARSYFCV----HSNFGPDHSGTYDDRIVRTAEGWRFARRRVRID 126
>gi|345012268|ref|YP_004814622.1| aromatic-ring-hydroxylating dioxygenase beta subunit [Streptomyces
violaceusniger Tu 4113]
gi|344038617|gb|AEM84342.1| aromatic-ring-hydroxylating dioxygenase beta subunit [Streptomyces
violaceusniger Tu 4113]
Length=157
Score = 56.6 bits (135), Expect = 1e-06, Method: Compositional matrix adjust.
Identities = 44/135 (33%), Positives = 66/135 (49%), Gaps = 16/135 (11%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALRE--------YADAHAR 65
I+ L ARY +D D G G + F G + GR A+ + YAD R
Sbjct 18 IENLIARYAELVDDGDFAGL-GVLLAEATFTGVGEPVSGRDAIEKMFQDTLIVYADGTPR 76
Query 66 VVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGY---KILGSGEYQDRLIKQDGQWRI 122
+H+TT++ EVD T S S VT+ A + + + +G Y+DR +++GQWR
Sbjct 77 T---QHVTTNVAIEVDEQAGTAVSRSYVTVFQALPHLPLQPIAAGRYRDRFERREGQWRF 133
Query 123 AYRRLRNDRLVSDPS 137
RR+R + L+ D S
Sbjct 134 VERRVRIN-LIGDVS 147
>gi|339321911|ref|YP_004680805.1| short-chain dehydrogenase/reductase SDR [Cupriavidus necator
N-1]
gi|338168518|gb|AEI79572.1| short-chain dehydrogenase/reductase SDR [Cupriavidus necator
N-1]
Length=139
Score = 56.6 bits (135), Expect = 1e-06, Method: Compositional matrix adjust.
Identities = 40/120 (34%), Positives = 54/120 (45%), Gaps = 3/120 (2%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRGRHLT 73
I AL RY +D +D A FT D G+ + G A+ + + + +H
Sbjct 10 IHALTCRYAQAVDRRDFPKLAALFTADAWLSGPGFRMDGPQAIADGMASLGQYSATQHHV 69
Query 74 TDLLYEVDGDVATGRSASVVTL---ATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLRND 130
L EVDGD ATG + V K+ YQDR +++DGQWRIA R L D
Sbjct 70 HQQLVEVDGDTATGETYCVANHLYEQDCVPRKLDWGIRYQDRFVRRDGQWRIAARELLVD 129
>gi|116694598|ref|YP_728809.1| hypothetical protein H16_B0647 [Ralstonia eutropha H16]
gi|113529097|emb|CAJ95444.1| conserved hypothetical protein [Ralstonia eutropha H16]
Length=139
Score = 55.8 bits (133), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 41/117 (36%), Positives = 55/117 (48%), Gaps = 3/117 (2%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRGRHLT 73
I AL RY +D +D A FT D G+ + G A+ + + + +H
Sbjct 10 IHALTCRYAQAVDRRDFAKLAALFTADAWLSGPGFRMDGAQAIADGMASLGQYSATQHHV 69
Query 74 TDLLYEVDGDVATGRSASVVT-LATAAGY--KILGSGEYQDRLIKQDGQWRIAYRRL 127
L EVDGD ATG + V L G K+ YQDR +++DGQWRIA R L
Sbjct 70 HQQLVEVDGDTATGETYCVANHLYEQDGVPRKLDWGIRYQDRFVRRDGQWRIAAREL 126
>gi|288915941|ref|ZP_06410323.1| hypothetical protein FrEUN1fDRAFT_0016 [Frankia sp. EUN1f]
gi|288352570|gb|EFC86765.1| hypothetical protein FrEUN1fDRAFT_0016 [Frankia sp. EUN1f]
Length=175
Score = 55.1 bits (131), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 44/134 (33%), Positives = 59/134 (45%), Gaps = 20/134 (14%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVI-----------------RGRPAL 56
I+ L RYC D QD E + FT DGA F+ + RGR L
Sbjct 29 IRDLVHRYCHAADQQDHEQLSTLFTADGALSFERSPLATPPSSSPLPPSAFVTYRGRADL 88
Query 57 REYADAHARVVRGRHLTTDLLYEVDG-DVATGRSASVVTLATAAGYKILGSGEYQDRLIK 115
R + + RG H+TT+ + DG D ATG S V L GY+I +G Y+DR +
Sbjct 89 RSM--PRSPLPRGLHVTTNTVITFDGQDDATGLSYFVRLLTDDDGYRIGNAGLYRDRYRR 146
Query 116 QDGQWRIAYRRLRN 129
W I R + +
Sbjct 147 TPAGWFIYQRTIHS 160
>gi|284167046|ref|YP_003405324.1| hypothetical protein Htur_3789 [Haloterrigena turkmenica DSM
5511]
gi|284016701|gb|ADB62651.1| hypothetical protein Htur_3789 [Haloterrigena turkmenica DSM
5511]
Length=156
Score = 54.7 bits (130), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 42/121 (35%), Positives = 59/121 (49%), Gaps = 6/121 (4%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGW-VIRGRPALREYADAHARVV-RGRH 71
IQ L + YC ID +D E FTED + ++ +GR +RE+AD A + R H
Sbjct 16 IQDLRSNYCYAIDDRDFESLPHLFTEDVSLDYGALGTYQGRDGVREFADFVAESLERTTH 75
Query 72 LTTDLLYEV----DGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
L + V D D ATGR + ++ A G GEY+D + D +WRIA +
Sbjct 76 LLANPTVSVGVDGDRDRATGRLYVIASITYADGTGGWRIGEYRDEYRRVDDEWRIADATM 135
Query 128 R 128
R
Sbjct 136 R 136
>gi|262194952|ref|YP_003266161.1| hypothetical protein Hoch_1719 [Haliangium ochraceum DSM 14365]
gi|262078299|gb|ACY14268.1| hypothetical protein Hoch_1719 [Haliangium ochraceum DSM 14365]
Length=152
Score = 53.5 bits (127), Expect = 9e-06, Method: Compositional matrix adjust.
Identities = 44/148 (30%), Positives = 69/148 (47%), Gaps = 13/148 (8%)
Query 10 EILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGW------VIRGRPALREYADAH 63
+IL I+ L ARYCLT D D +G+ C+ E EF G+ + LRE+ H
Sbjct 6 DILEIRNLIARYCLTTDNADADGFMDCWVEPD--EFGGYESGPFGSMATWQELREFEVHH 63
Query 64 ---ARVVRG-RHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQ 119
+ G RH T+++ E G+ A + ++ L A +++ +G Y ++ + +
Sbjct 64 VGPGGMANGKRHQATNIMIEAAGEGAAHVTHDLLVLEVAQEPRVIATGRYNKSVVVRTAK 123
Query 120 -WRIAYRRLRNDRLVSDPSVAVNVADAD 146
WR R L+ D S A N A AD
Sbjct 124 GWRFKSRSLQVDPGFFVLSGASNQAQAD 151
>gi|288916397|ref|ZP_06410775.1| conserved hypothetical protein [Frankia sp. EUN1f]
gi|288352168|gb|EFC86367.1| conserved hypothetical protein [Frankia sp. EUN1f]
Length=133
Score = 53.5 bits (127), Expect = 1e-05, Method: Compositional matrix adjust.
Identities = 41/116 (36%), Positives = 52/116 (45%), Gaps = 7/116 (6%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRGRHLT 73
I L ARYCLT+D D EGW FTED +++ G G LR+ A G HL
Sbjct 11 IGDLLARYCLTLDLDDVEGWVALFTEDASYQVYGRSFDGHAGLRKMMGA---APGGLHLG 67
Query 74 TDLLYEVDG-DVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLR 128
+ E+DG D A R + T + S Y D L++ WRI R R
Sbjct 68 GPPVIEMDGADTARTRRNLLFVDRTDG---VSRSAVYTDELVRTADGWRIRNTRCR 120
>gi|119504553|ref|ZP_01626632.1| hypothetical protein MGP2080_13143 [marine gamma proteobacterium
HTCC2080]
gi|119459575|gb|EAW40671.1| hypothetical protein MGP2080_13143 [marine gamma proteobacterium
HTCC2080]
Length=133
Score = 53.1 bits (126), Expect = 1e-05, Method: Compositional matrix adjust.
Identities = 32/121 (27%), Positives = 52/121 (43%), Gaps = 4/121 (3%)
Query 12 LRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRGRH 71
L I+AL RYC ++ +D E W + D +E + G A ++
Sbjct 8 LEIRALLERYCDGVNQRDAEIWGSTWANDAVWELPHLEMSGITGRDNIVSAWLEAMQLFP 67
Query 72 LTTDL----LYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
+ +DGD A RS + G +I GEY+D I+++G+W+ + RR
Sbjct 68 FVNMMAQPGYISIDGDHAVMRSYTSEVAVMQDGTQIEPRGEYEDECIRENGEWKFSLRRF 127
Query 128 R 128
R
Sbjct 128 R 128
>gi|329897222|ref|ZP_08271961.1| hypothetical protein IMCC3088_2632 [gamma proteobacterium IMCC3088]
gi|328921284|gb|EGG28679.1| hypothetical protein IMCC3088_2632 [gamma proteobacterium IMCC3088]
Length=154
Score = 51.6 bits (122), Expect = 4e-05, Method: Compositional matrix adjust.
Identities = 41/123 (34%), Positives = 56/123 (46%), Gaps = 9/123 (7%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWV----IRGR-PALREYADAHARVVR 68
I+ L ARY +DT + EG FTED F G ++GR L+ YA +
Sbjct 10 IKQLKARYFRFLDTGNQEGLESVFTEDATAHFIGGHYDIDVQGRDKLLKFYAYSFTEEKF 69
Query 69 GRHLTTDLLYEVDGDVATGR-SASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIA---Y 124
G H VDGD ATG + + ++GS Y+D +K DG+W+I Y
Sbjct 70 GMHNGHHPEISVDGDNATGLWYLQDIFINLEENTTVMGSAIYEDTYVKVDGEWKIKTTNY 129
Query 125 RRL 127
RL
Sbjct 130 ERL 132
>gi|148553282|ref|YP_001260864.1| hypothetical protein Swit_0355 [Sphingomonas wittichii RW1]
gi|148498472|gb|ABQ66726.1| hypothetical protein Swit_0355 [Sphingomonas wittichii RW1]
Length=181
Score = 51.2 bits (121), Expect = 5e-05, Method: Compositional matrix adjust.
Identities = 41/128 (33%), Positives = 51/128 (40%), Gaps = 19/128 (14%)
Query 14 IQALCARYCLTIDTQDGE-----GWAGCFTEDGAFE-----FDGWVIRGRPALREYADAH 63
IQ RY +D D E W F E G E F W I EY H
Sbjct 23 IQQCLLRYTRGVDRHDRELMLSAYWPNAFDEHGVAEGEAAAFVDWAIGWHG---EYQTRH 79
Query 64 ARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIA 123
++ L E+DGD A G + V G L G Y DR K+DG+WRIA
Sbjct 80 QHIIANHSL------ELDGDTAHGETYYVFWGENRMGPPTLAFGRYVDRFEKRDGEWRIA 133
Query 124 YRRLRNDR 131
+R N++
Sbjct 134 HRVCVNEK 141
>gi|254821966|ref|ZP_05226967.1| hypothetical protein MintA_18682 [Mycobacterium intracellulare
ATCC 13950]
Length=136
Score = 51.2 bits (121), Expect = 6e-05, Method: Compositional matrix adjust.
Identities = 43/115 (38%), Positives = 49/115 (43%), Gaps = 6/115 (5%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRGRHLT 73
I+ L A Y L +D D +G F DG F G G + A AR G HLT
Sbjct 12 IRDLIAAYALALDAGDVDGCVRLFASDGEFLVYGKTFAGHDGIAGMFRAAAR---GLHLT 68
Query 74 TDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLR 128
VDGD A+ RS L G L Y D L + DGQWR A RR R
Sbjct 69 GVSRITVDGDRASARSQ---VLFVRCGDLQLRPALYDDELTRVDGQWRFARRRCR 120
>gi|312198020|ref|YP_004018081.1| hypothetical protein FraEuI1c_4212 [Frankia sp. EuI1c]
gi|311229356|gb|ADP82211.1| hypothetical protein FraEuI1c_4212 [Frankia sp. EuI1c]
Length=143
Score = 50.8 bits (120), Expect = 7e-05, Method: Compositional matrix adjust.
Identities = 41/126 (33%), Positives = 58/126 (47%), Gaps = 5/126 (3%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEF-DGWVIRGRPALREYADA-HARVVRGRH 71
I + RY ID +D + C+T D ++ D G A+ E+ A H + RH
Sbjct 11 ISEVLIRYATGIDRRDWSLFRTCWTPDVEADYGDMGRFGGADAITEFMTAVHKDMGSTRH 70
Query 72 LTTDLLYEVDGDVATGRSASVVTLA---TAAGYKILGSGEYQDRLIKQDGQWRIAYRRLR 128
++ + EVDGD AT S LA T I G Y+D L++ WRI+ R R
Sbjct 71 QLSNFVIEVDGDRATASSYVHAVLALSPTDPALWIDAVGGYEDELVRTPEGWRISRRTFR 130
Query 129 NDRLVS 134
RL+S
Sbjct 131 PTRLIS 136
>gi|114761430|ref|ZP_01441345.1| hypothetical protein 1100011001310_R2601_03868 [Pelagibaca bermudensis
HTCC2601]
gi|114545678|gb|EAU48680.1| hypothetical protein R2601_03868 [Roseovarius sp. HTCC2601]
Length=148
Score = 50.4 bits (119), Expect = 8e-05, Method: Compositional matrix adjust.
Identities = 40/133 (31%), Positives = 63/133 (48%), Gaps = 14/133 (10%)
Query 14 IQALCAR----YCLTIDTQDGEGWAGCFTEDGAFE--FDGWVIRGRPALREYADAHARVV 67
I+ CA+ YC +D D + +A +TED ++ + I GR A+R++ + +
Sbjct 3 IEHECAKLTVLYCRHLDHLDPDAFAAIYTEDAVYKPAVEPVPIEGRAAIRDWIGRYPKDR 62
Query 68 RGRHLTTDLLYEV-DGDVATGRSASVVTLATAAGYKILGSG-------EYQDRLIKQDGQ 119
GRH+ T+ + EV D D ATG S ++V A ++ EY D +
Sbjct 63 LGRHVATNQIVEVIDEDSATGSSYAIVFREPAPREGVISDRVTPRSLVEYSDSYRRTAEG 122
Query 120 WRIAYRRLRNDRL 132
W+IA R R D L
Sbjct 123 WKIARRVYRFDFL 135
>gi|158314546|ref|YP_001507054.1| hypothetical protein Franean1_2722 [Frankia sp. EAN1pec]
gi|158109951|gb|ABW12148.1| conserved hypothetical protein [Frankia sp. EAN1pec]
Length=133
Score = 50.4 bits (119), Expect = 9e-05, Method: Compositional matrix adjust.
Identities = 43/129 (34%), Positives = 60/129 (47%), Gaps = 10/129 (7%)
Query 1 MRPVDEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYA 60
M P E + I L ARYCLT+D D +GW G FTE +++ G G LR
Sbjct 1 MLPPPEDQVAI---GDLLARYCLTLDLDDVDGWVGLFTEGASYQVYGRSFDGHEGLRAM- 56
Query 61 DAHARVVRGRHLTTDLLYE-VDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQ 119
A G HL + E + DVA S +++ +++A G + S Y D L++
Sbjct 57 --MAAAPGGLHLGGPPVIEMLSADVAR-TSRNLLFVSSADG--VSRSAVYTDELVRTPDG 111
Query 120 WRIAYRRLR 128
WRI R R
Sbjct 112 WRIRSCRCR 120
>gi|240170442|ref|ZP_04749101.1| hypothetical protein MkanA1_14110 [Mycobacterium kansasii ATCC
12478]
Length=139
Score = 50.4 bits (119), Expect = 9e-05, Method: Compositional matrix adjust.
Identities = 36/118 (31%), Positives = 52/118 (45%), Gaps = 7/118 (5%)
Query 10 EILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRG 69
+++ IQ L ARY +TI +D EG FT DG + G L + + A +G
Sbjct 7 DLVEIQQLLARYAVTITREDIEGLLSVFTPDGTYSAFGDTYH----LDRFPELVAAAPKG 62
Query 70 RHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
LT L E+DGD ATG +I G Y+D ++ WR+ R +
Sbjct 63 LFLTGTALVELDGDAATGTQPLCFIEHATHDMRI---GYYRDSYVRTADGWRLKTRAM 117
>gi|78059878|ref|YP_366453.1| hypothetical protein Bcep18194_C6762 [Burkholderia sp. 383]
gi|77964428|gb|ABB05809.1| hypothetical protein Bcep18194_C6762 [Burkholderia sp. 383]
Length=151
Score = 50.1 bits (118), Expect = 1e-04, Method: Compositional matrix adjust.
Identities = 36/132 (28%), Positives = 58/132 (44%), Gaps = 2/132 (1%)
Query 13 RIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHAR-VVRGRH 71
I L ++YC ID +D + + +DG +E G +R+ R V + H
Sbjct 20 EITTLMSKYCHGIDKKDEAIFMSIWADDGVYELPRGQTSGIDGIRQLVHKVWREVPKCHH 79
Query 72 LTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLRNDR 131
T+ L ++DGD AT ++ + T G L SG Y R + G+W+ Y + +
Sbjct 80 HITNPLIDIDGDRATAKTDVIYYRQTDDGVLQLLSGTYAFRFARIAGEWKTTYLKFASFD 139
Query 132 LVSDPSVAVNVA 143
VS P N+
Sbjct 140 TVS-PVFKENIG 150
>gi|146275952|ref|YP_001166112.1| hypothetical protein Saro_3727 [Novosphingobium aromaticivorans
DSM 12444]
gi|145322643|gb|ABP64586.1| hypothetical protein Saro_3727 [Novosphingobium aromaticivorans
DSM 12444]
Length=153
Score = 50.1 bits (118), Expect = 1e-04, Method: Compositional matrix adjust.
Identities = 42/135 (32%), Positives = 62/135 (46%), Gaps = 10/135 (7%)
Query 5 DEQWIEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGW--VIRGRPA--LREYA 60
D + L IQAL Y +D+ D +G F F+G + PA R +A
Sbjct 6 DFSIADYLAIQALVHSYPRRLDSGDLQGLGALFAH-ATVHFEGRDDPVVNDPAEVTRMFA 64
Query 61 D---AHARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKI--LGSGEYQDRLIK 115
D + V R RH+ +L+ E DG A +++V L A G + + +G+Y+DR K
Sbjct 65 DFVRLYDGVPRTRHMICNLIVEPDGPGAAVATSAVFVLQDAPGVPLQPIITGDYRDRFEK 124
Query 116 QDGQWRIAYRRLRND 130
WR A R + ND
Sbjct 125 VGDTWRFAERFITND 139
>gi|312141326|ref|YP_004008662.1| polyketide cyclase [Rhodococcus equi 103S]
gi|311890665|emb|CBH49984.1| putative polyketide cyclase [Rhodococcus equi 103S]
Length=165
Score = 49.7 bits (117), Expect = 1e-04, Method: Compositional matrix adjust.
Identities = 41/134 (31%), Positives = 56/134 (42%), Gaps = 12/134 (8%)
Query 10 EILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEF------DGWVIRGRPALREYADAH 63
+ L + L Y D+ D + WA FT DG FE G IRGR ALR++
Sbjct 26 DTLAVLQLEGAYSPAWDSGDADAWAALFTVDGVFELAEVGAVPGTTIRGRDALRQFCVDF 85
Query 64 ARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILG---SGEYQDRLIKQDGQW 120
G HL +DGD AT R ++ + +G Y + + W
Sbjct 86 TATTSGIHLLNTPSIVLDGDEATARVHFEFRSGASSDTETRHAHVAGHYTVQYRRTPEGW 145
Query 121 RIAYRR---LRNDR 131
RIA+RR +R DR
Sbjct 146 RIAHRREVAVRRDR 159
>gi|341613724|ref|ZP_08700593.1| hypothetical protein CJLT1_02175 [Citromicrobium sp. JLT1363]
Length=142
Score = 49.3 bits (116), Expect = 2e-04, Method: Compositional matrix adjust.
Identities = 39/120 (33%), Positives = 57/120 (48%), Gaps = 7/120 (5%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRG---- 69
IQ A+YC+ +DT + A F +D EF G G+ A+ +A A+ +R
Sbjct 9 IQGHLAKYCILVDTAEPAAIAELFWDDARLEFGG-EYEGKDAILACFEAWAKDMREPIEG 67
Query 70 -RHLTTDLLYEVDGDVATGRS-ASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
RHL E+DG+ AT +S A A +G I Y D L K++G W+ RR+
Sbjct 68 LRHLLHIPHIEIDGETATSKSYADADGHAKRSGKPIRNRAMYVDVLEKREGAWKFRDRRI 127
>gi|342859388|ref|ZP_08716042.1| hypothetical protein MCOL_10928 [Mycobacterium colombiense CECT
3035]
gi|342133629|gb|EGT86832.1| hypothetical protein MCOL_10928 [Mycobacterium colombiense CECT
3035]
Length=136
Score = 49.3 bits (116), Expect = 2e-04, Method: Compositional matrix adjust.
Identities = 45/129 (35%), Positives = 54/129 (42%), Gaps = 9/129 (6%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRGRHLT 73
I+ L A Y L +D D + F DG F G G A+ A AR G HL
Sbjct 12 IRELLAGYALALDVGDADECVHLFAPDGEFLVYGKTFAGHDAIAGMFRAAAR---GLHLN 68
Query 74 TDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLR---ND 130
EV G+ AT RS L AG L Y D L + GQWR A RR R +
Sbjct 69 GSARIEVVGERATARSQ---VLFVRAGDLQLRPAIYDDELTRTAGQWRFARRRCRFVTSA 125
Query 131 RLVSDPSVA 139
L + P V+
Sbjct 126 GLANSPEVS 134
>gi|254283111|ref|ZP_04958079.1| conserved hypothetical protein, putative [gamma proteobacterium
NOR51-B]
gi|219679314|gb|EED35663.1| conserved hypothetical protein, putative [gamma proteobacterium
NOR51-B]
Length=185
Score = 49.3 bits (116), Expect = 2e-04, Method: Compositional matrix adjust.
Identities = 39/117 (34%), Positives = 55/117 (48%), Gaps = 5/117 (4%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAF-EFDGWV-IRGRPALREYADAHARVV-RGR 70
I+AL ARY + +D +D EG FT+D DG + RGR A E V+
Sbjct 20 IKALVARYGIVMDDRDIEGMPDLFTDDVHIRSLDGVMDSRGRDAAVELFKGRFEVLGPSN 79
Query 71 HLTTDLLYEVDGDVATGRSASVVTLA--TAAGYKILGSGEYQDRLIKQDGQWRIAYR 125
H T D + E D + +V++ A G +L + Y DR + G+WRIA R
Sbjct 80 HFTHDKIIEFDENDPDSARGTVLSHAEMNRKGQPMLAAIRYHDRYRRDAGKWRIAER 136
>gi|312141738|ref|YP_004009074.1| hypothetical protein REQ_44330 [Rhodococcus equi 103S]
gi|311891077|emb|CBH50396.1| conserved hypothetical protein [Rhodococcus equi 103S]
Length=162
Score = 48.9 bits (115), Expect = 2e-04, Method: Compositional matrix adjust.
Identities = 39/136 (29%), Positives = 62/136 (46%), Gaps = 6/136 (4%)
Query 9 IEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVR 68
I+ I + RY +D +D + A C+ D G++ + + HA +
Sbjct 4 IDKQEITEVLYRYARAVDRKDFDRVADCYFPDAIDNHGGYIGTVAGLIEDMKSRHATIDS 63
Query 69 GRHLTTDLLYEVDGDVATGRSASVVTL----ATAAGYKILGSGE--YQDRLIKQDGQWRI 122
H T++L ++DGD A S + L A A G + + + Y DR ++DGQWRI
Sbjct 64 SLHYVTNVLIDLDGDTADVESYCLCYLRQAPAVAGGPQSRATVKCRYVDRFERRDGQWRI 123
Query 123 AYRRLRNDRLVSDPSV 138
A R + D V+D V
Sbjct 124 ADRIVVFDESVTDEIV 139
>gi|312141743|ref|YP_004009079.1| hypothetical protein REQ_44380 [Rhodococcus equi 103S]
gi|311891082|emb|CBH50401.1| conserved hypothetical protein [Rhodococcus equi 103S]
Length=150
Score = 48.9 bits (115), Expect = 3e-04, Method: Compositional matrix adjust.
Identities = 40/131 (31%), Positives = 62/131 (48%), Gaps = 9/131 (6%)
Query 9 IEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWV--IRGRPALREYADA---H 63
+ I+ I L R ID WA FT DG F + + G AL ++ADA +
Sbjct 14 VAIVEIHQLYGRQSHLIDEGHASEWAATFTADGEFHSPSYPAPVVGVEALTQFADAFFTN 73
Query 64 ARVV--RGRHLTTDLLYEVDGDVATGRSASVVTLATAAG--YKILGSGEYQDRLIKQDGQ 119
A V RH+ +++ + GD A + +AT G +++ DR++++ GQ
Sbjct 74 ASVAGEAHRHVLSNIAVDRVGDDELEVHAYLQIVATRIGGDSRLVRFTTVTDRVVREGGQ 133
Query 120 WRIAYRRLRND 130
WRIA R +R D
Sbjct 134 WRIARRVVRRD 144
>gi|325672997|ref|ZP_08152691.1| ring hydroxylating beta subunit superfamily protein [Rhodococcus
equi ATCC 33707]
gi|325556250|gb|EGD25918.1| ring hydroxylating beta subunit superfamily protein [Rhodococcus
equi ATCC 33707]
Length=149
Score = 48.5 bits (114), Expect = 3e-04, Method: Compositional matrix adjust.
Identities = 40/131 (31%), Positives = 62/131 (48%), Gaps = 9/131 (6%)
Query 9 IEILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWV--IRGRPALREYADA---H 63
+ I+ I L R ID WA FT DG F + + G AL ++ADA +
Sbjct 13 VAIVEIHQLYGRQSHLIDEGHASEWAATFTADGEFHSPSYPAPVVGVEALTQFADAFFTN 72
Query 64 ARVV--RGRHLTTDLLYEVDGDVATGRSASVVTLATAAG--YKILGSGEYQDRLIKQDGQ 119
A V RH+ +++ + GD A + +AT G +++ DR++++ GQ
Sbjct 73 ASVAGEAHRHVLSNIAVDRVGDDELEVHAYLQIVATRIGGDSRLVRFTTVTDRVVREGGQ 132
Query 120 WRIAYRRLRND 130
WRIA R +R D
Sbjct 133 WRIARRVVRRD 143
>gi|325524074|gb|EGD02248.1| hypothetical protein B1M_22467 [Burkholderia sp. TJI49]
Length=151
Score = 48.5 bits (114), Expect = 3e-04, Method: Compositional matrix adjust.
Identities = 37/130 (29%), Positives = 56/130 (44%), Gaps = 2/130 (1%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHAR-VVRGRHL 72
I+ L ++YC ID D + + +DG +E G +R+ R V + H
Sbjct 21 IRTLMSKYCHGIDKHDEALFMSIWADDGIYELPRGQTAGIDGIRQLVHKVWREVPKCHHH 80
Query 73 TTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLRNDRL 132
T+ L E+DGD AT + + T G L SG Y R + G W+ Y + +
Sbjct 81 ITNPLIEIDGDRATAATDVIYYRQTDDGVLQLLSGTYAFRFARIAGAWKTTYLKFSSFGT 140
Query 133 VSDPSVAVNV 142
VS P N+
Sbjct 141 VS-PVFKENI 149
>gi|342858501|ref|ZP_08715156.1| hypothetical protein MCOL_06486 [Mycobacterium colombiense CECT
3035]
gi|342134205|gb|EGT87385.1| hypothetical protein MCOL_06486 [Mycobacterium colombiense CECT
3035]
Length=187
Score = 48.5 bits (114), Expect = 3e-04, Method: Compositional matrix adjust.
Identities = 45/120 (38%), Positives = 56/120 (47%), Gaps = 7/120 (5%)
Query 10 EILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADA-HARVVR 68
EIL A AR C D D + A + DG E G+ PA +A+A HA+ R
Sbjct 22 EILDCIARHARGC---DRHDVDLIAAAYHSDGIDEH-GYATNAGPAYGAWANATHAQTSR 77
Query 69 -GRHLTTDLLYEVDGDVATGRSASVVTL-ATAAGYKILGSGEYQDRLIKQDGQWRIAYRR 126
H T E+ GD A S +V L A +G Y DRL ++DGQWRIA RR
Sbjct 78 VHTHNITTHTCEIGGDTAHAESYVIVVLIGPDAKSAQFITGRYLDRLERRDGQWRIAVRR 137
>gi|336178861|ref|YP_004584236.1| nuclear transport factor 2 [Frankia symbiont of Datisca glomerata]
gi|334859841|gb|AEH10315.1| nuclear transport factor 2 [Frankia symbiont of Datisca glomerata]
Length=150
Score = 48.5 bits (114), Expect = 4e-04, Method: Compositional matrix adjust.
Identities = 36/116 (32%), Positives = 50/116 (44%), Gaps = 2/116 (1%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAF-EFDGWVIRGRPALREYA-DAHARVVRGRH 71
I L A+Y D E A FT+DG F D + GR AL E AR RH
Sbjct 15 ITELYAQYTHAFDDNSPEDLADLFTDDGIFVRDDAEPVHGRAALAELVRGVAARGAGSRH 74
Query 72 LTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
L + ++ E A+G + V A +++ G Y D + +G+WR RR
Sbjct 75 LVSSVVVEPSATGASGSAYVQVISIDADTVRLVVIGRYHDEFARSEGRWRFRSRRF 130
>gi|108799571|ref|YP_639768.1| hypothetical protein Mmcs_2604 [Mycobacterium sp. MCS]
gi|119868681|ref|YP_938633.1| hypothetical protein Mkms_2648 [Mycobacterium sp. KMS]
gi|126435214|ref|YP_001070905.1| hypothetical protein Mjls_2633 [Mycobacterium sp. JLS]
gi|108769990|gb|ABG08712.1| hypothetical protein Mmcs_2604 [Mycobacterium sp. MCS]
gi|119694770|gb|ABL91843.1| conserved hypothetical protein [Mycobacterium sp. KMS]
gi|126235014|gb|ABN98414.1| conserved hypothetical protein [Mycobacterium sp. JLS]
Length=149
Score = 48.1 bits (113), Expect = 4e-04, Method: Compositional matrix adjust.
Identities = 42/131 (33%), Positives = 57/131 (44%), Gaps = 9/131 (6%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALRE-YADAHARVVR-GRH 71
I+ L A + ID + G G A FT DG +E + GR + E Y HA R RH
Sbjct 14 IETLVAEFAWLIDHESGRGVAELFTHDGEYEMGPVSLSGRSEIEEFYRRRHAAGPRTSRH 73
Query 72 LTTDL-LYEVDGDVATGRSASVVTLATAAGYKILGS---GEYQDRLIK-QDGQWRIAYRR 126
L T+L L +VDGD G + A L +Y D ++ +DG W +RR
Sbjct 74 LFTNLRLRDVDGDSVRGTCVLSLHAANGVPPHPLSPVIVADYDDEYVRGEDGSW--LFRR 131
Query 127 LRNDRLVSDPS 137
L +P
Sbjct 132 RTVTVLFGEPP 142
>gi|148554687|ref|YP_001262269.1| hypothetical protein Swit_1769 [Sphingomonas wittichii RW1]
gi|148499877|gb|ABQ68131.1| hypothetical protein Swit_1769 [Sphingomonas wittichii RW1]
Length=139
Score = 48.1 bits (113), Expect = 5e-04, Method: Compositional matrix adjust.
Identities = 37/115 (33%), Positives = 53/115 (47%), Gaps = 9/115 (7%)
Query 17 LCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREY----ADAHARVVRGRHL 72
L ARY + G FT DGAFE +RGR L + A H VV L
Sbjct 20 LVARYAWLVGQGRGAEVPALFTADGAFEGRNQQLRGREELARFYARSASGHGEVV---PL 76
Query 73 TTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
++++++ GD ATG S ++ +G + SG Y+D + DG+W RR
Sbjct 77 VGNMIFDLAGDRATGVS-TLTGHKRGSGVHVF-SGVYEDDFARVDGRWLFDRRRF 129
>gi|254818640|ref|ZP_05223641.1| hypothetical protein MintA_01889 [Mycobacterium intracellulare
ATCC 13950]
Length=156
Score = 48.1 bits (113), Expect = 5e-04, Method: Compositional matrix adjust.
Identities = 32/118 (28%), Positives = 54/118 (46%), Gaps = 7/118 (5%)
Query 10 EILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGWVIRGRPALREYADAHARVVRG 69
+++ IQ L RY +TI QD +G FT DG + G +L + + A +G
Sbjct 16 DLVEIQQLLGRYAVTITQQDIDGLVAVFTPDGTYSAFGETY----SLSRFPELVAAAPKG 71
Query 70 RHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRL 127
LT + ++DGD A+G ++ +I G Y+D ++ WR+ R +
Sbjct 72 LFLTGTAVVDLDGDAASGTQPLCFIDHSSHDMRI---GYYRDTYVRTADGWRLKTRAM 126
>gi|126432646|ref|YP_001068337.1| hypothetical protein Mjls_0033 [Mycobacterium sp. JLS]
gi|126232446|gb|ABN95846.1| conserved hypothetical protein [Mycobacterium sp. JLS]
Length=135
Score = 47.8 bits (112), Expect = 5e-04, Method: Compositional matrix adjust.
Identities = 38/122 (32%), Positives = 54/122 (45%), Gaps = 3/122 (2%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEF-DGWVIRGRPALREYAD-AHARVVRGRH 71
I + RY ID +D + FT+D A ++ D G A+ E+ D AHA H
Sbjct 7 IADVLIRYATGIDRRDWPLFRTVFTDDCALDYGDIGTFDGVDAVTEFMDQAHAMAGHTLH 66
Query 72 LTTDLLYEVDGDVATGRS-ASVVTLATAAGYKILGSGEYQDRLIKQDGQWRIAYRRLRND 130
T+ +VDGD A R+ + A + G Y D L++ WRIA RR
Sbjct 67 RLTNFAIDVDGDSARARAYVDALIFAPDNASGVNAVGFYDDELVRTPAGWRIARRRFTTV 126
Query 131 RL 132
R+
Sbjct 127 RV 128
>gi|325673836|ref|ZP_08153526.1| hypothetical protein HMPREF0724_11308 [Rhodococcus equi ATCC
33707]
gi|325555101|gb|EGD24773.1| hypothetical protein HMPREF0724_11308 [Rhodococcus equi ATCC
33707]
Length=166
Score = 47.8 bits (112), Expect = 5e-04, Method: Compositional matrix adjust.
Identities = 40/134 (30%), Positives = 55/134 (42%), Gaps = 12/134 (8%)
Query 10 EILRIQALCARYCLTIDTQDGEGWAGCFTEDGAFEF------DGWVIRGRPALREYADAH 63
+ L + L Y D+ D + WA FT DG FE G +RGR ALR++
Sbjct 27 DTLAVLQLEGAYSPAWDSGDADAWAALFTVDGVFELAQVGAVPGTTVRGRDALRQFCVDF 86
Query 64 ARVVRGRHLTTDLLYEVDGDVATGRSASVVTLATAAGYKILG---SGEYQDRLIKQDGQW 120
G HL +DGD AT R ++ + +G Y + W
Sbjct 87 TATTSGIHLLNTPSIVLDGDEATARVHFEFRSGASSDTETRHAHVAGHYTVLYRRTPEGW 146
Query 121 RIAYRR---LRNDR 131
RIA+RR +R DR
Sbjct 147 RIAHRREVAVRRDR 160
>gi|297563578|ref|YP_003682552.1| aromatic-ring-hydroxylating dioxygenase subunit beta [Nocardiopsis
dassonvillei subsp. dassonvillei DSM 43111]
gi|296848026|gb|ADH70046.1| aromatic-ring-hydroxylating dioxygenase beta subunit [Nocardiopsis
dassonvillei subsp. dassonvillei DSM 43111]
Length=137
Score = 47.8 bits (112), Expect = 5e-04, Method: Compositional matrix adjust.
Identities = 40/126 (32%), Positives = 52/126 (42%), Gaps = 9/126 (7%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFDGW--VIRGRPALREYADAHARVVRG-- 69
I L AR ID + WA FT DG F + + GR LR +A+ A R
Sbjct 9 IHQLYARQSHLIDGGHADAWARTFTPDGEFHSPTYPAPVVGRDRLRAFAEDFAHAAREAG 68
Query 70 ---RHLTTDLLYEVDGDVATGRSASVVTLAT--AAGYKILGSGEYQDRLIKQDGQWRIAY 124
RH+ T+L E SA + + T IL DRL++ WRIA
Sbjct 69 EVRRHVVTNLFVESADTTQAQVSAYLQVIGTRVTGDTAILRFTTVSDRLVRDGQDWRIAR 128
Query 125 RRLRND 130
R +R D
Sbjct 129 RDVRRD 134
>gi|94495150|ref|ZP_01301731.1| hypothetical protein SKA58_01615 [Sphingomonas sp. SKA58]
gi|94425416|gb|EAT10436.1| hypothetical protein SKA58_01615 [Sphingomonas sp. SKA58]
Length=151
Score = 47.8 bits (112), Expect = 5e-04, Method: Compositional matrix adjust.
Identities = 36/125 (29%), Positives = 55/125 (44%), Gaps = 11/125 (8%)
Query 14 IQALCARYCLTIDTQDGEGWAGCFTEDGAFEFD----GWVIRGRPALREYADAHARVVRG 69
I+ + ARYC +DT+D +G+ FT D + + + G A+ E + R
Sbjct 14 IRNVKARYCRFLDTKDWDGFLSLFTPDAVLDVEEDTGNPPMSGHAAILEQVRSAVIDARS 73
Query 70 RHLTTDLLYEVDGDVA---TGRSASVV----TLATAAGYKILGSGEYQDRLIKQDGQWRI 122
H ++ GD A T VV G I G G Y +R +++DGQW+I
Sbjct 74 AHQIHSSEIDLRGDEADMITAMQDRVVWAPGKCPIPGGQSITGFGHYTERYVRRDGQWKI 133
Query 123 AYRRL 127
A +L
Sbjct 134 AALKL 138
Lambda      K        H
   0.322    0.137    0.419
Gapped
Lambda      K        H
   0.267   0.0410    0.140
Effective search space used: 130819067360
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
Posted date: Sep 5, 2011 4:36 AM
Number of letters in database: 5,219,829,388
Number of sequences in database: 15,229,318
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40