BLASTP 2.2.25+
Reference:
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics:
Alejandro A. Schäffer, L. Aravind, Thomas L. Madden, Sergei
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST
protein database searches with composition-based statistics and
other refinements", Nucleic Acids Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
15,229,318 sequences; 5,219,829,388 total letters
Query= Rv1954A
Length=100
                                                                     Score   E
Sequences producing significant alignments:                          (Bits)  Value
gi|332321901|sp|P0CV86.1|Y954A_MYCTU RecName: Full=Uncharacteriz...  199     1e-49
gi|308231991|ref|ZP_07663967.1| hypothetical protein TMAG_02132 ...  145     2e-33
gi|336120575|ref|YP_004575361.1| hypothetical protein MLP_49440 ...  89.7    1e-16
gi|312193910|ref|YP_004013971.1| hypothetical protein FraEuI1c_0...  81.6    3e-14
gi|257074511|ref|YP_003162908.1| hypothetical protein CAP2UW1_45...  56.2    1e-06
gi|229494973|ref|ZP_04388723.1| conserved hypothetical protein [...  54.3    5e-06
gi|294667860|ref|ZP_06733069.1| hypothetical protein XAUC_38210 ...  51.6    4e-05
gi|291543334|emb|CBL16443.1| hypothetical protein RUM_01900 [Rum...  50.8    7e-05
gi|289165202|ref|YP_003455340.1| hypothetical protein LLO_1865 [...  50.1    1e-04
gi|330719200|ref|ZP_08313800.1| hypothetical protein LfalK3_0824...  36.2    1.8
gi|296117367|ref|ZP_06835957.1| hypothetical protein GXY_16162 [...  35.8    2.2
gi|257898390|ref|ZP_05678043.1| predicted protein [Enterococcus ...  35.0    3.6
gi|330991891|ref|ZP_08315840.1| hypothetical protein SXCC_01796 ...  33.9    8.0
>gi|332321901|sp|P0CV86.1|Y954A_MYCTU RecName: Full=Uncharacterized protein Rv1954A
Length=100
Score = 199 bits (506), Expect = 1e-49, Method: Compositional matrix adjust.
Identities = 100/100 (100%), Positives = 100/100 (100%), Gaps = 0/100 (0%)
Query  1   MARGRVVCIGDAGCDCTPGVFRATAGGMPVLVVIESGTGGDQMARKATSPGKPAPTSGQY  60
           MARGRVVCIGDAGCDCTPGVFRATAGGMPVLVVIESGTGGDQMARKATSPGKPAPTSGQY
Sbjct  1   MARGRVVCIGDAGCDCTPGVFRATAGGMPVLVVIESGTGGDQMARKATSPGKPAPTSGQY  60

Query  61  RPVGGGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKSGRG  100
           RPVGGGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKSGRG
Sbjct  61  RPVGGGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKSGRG  100
>gi|308231991|ref|ZP_07663967.1| hypothetical protein TMAG_02132 [Mycobacterium tuberculosis SUMu001]
gi|308369570|ref|ZP_07666753.1| hypothetical protein TMBG_00489 [Mycobacterium tuberculosis SUMu002]
gi|308370873|ref|ZP_07667034.1| hypothetical protein TMCG_00038 [Mycobacterium tuberculosis SUMu003]
20 more sequence titles
Length=73
Score = 145 bits (366), Expect = 2e-33, Method: Compositional matrix adjust.
Identities = 73/73 (100%), Positives = 73/73 (100%), Gaps = 0/73 (0%)
Query  28  MPVLVVIESGTGGDQMARKATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPPSPKPGQKW  87
           MPVLVVIESGTGGDQMARKATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPPSPKPGQKW
Sbjct  1   MPVLVVIESGTGGDQMARKATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPPSPKPGQKW  60

Query  88  VNVDPTKNKSGRG  100
           VNVDPTKNKSGRG
Sbjct  61  VNVDPTKNKSGRG  73
>gi|336120575|ref|YP_004575361.1| hypothetical protein MLP_49440 [Microlunatus phosphovorus NM-1]
gi|334688373|dbj|BAK37958.1| hypothetical protein MLP_49440 [Microlunatus phosphovorus NM-1]
Length=57
Score = 89.7 bits (221), Expect = 1e-16, Method: Compositional matrix adjust.
Identities = 45/54 (84%), Positives = 46/54 (86%), Gaps = 0/54 (0%)
Query  46  KATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKSGR  99
           K   PG PAPTSGQYRPVGGG EVTVPKGHRLPP P+PG  WVNVDPTKNKSGR
Sbjct  3   KGNKPGTPAPTSGQYRPVGGGPEVTVPKGHRLPPGPRPGVTWVNVDPTKNKSGR  56
>gi|312193910|ref|YP_004013971.1| hypothetical protein FraEuI1c_0013 [Frankia sp. EuI1c]
gi|311225246|gb|ADP78101.1| hypothetical protein FraEuI1c_0013 [Frankia sp. EuI1c]
Length=58
Score = 81.6 bits (200), Expect = 3e-14, Method: Compositional matrix adjust.
Identities = 42/58 (73%), Positives = 44/58 (76%), Gaps = 0/58 (0%)
Query  43  MARKATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKSGRG  100
           M  K   PG PAP SGQYRP GGG EVTVPKGHRLPP+P+PGQ W  VD TKN SGRG
Sbjct  1   MLAKPVKPGTPAPASGQYRPKGGGAEVTVPKGHRLPPTPRPGQVWKIVDRTKNASGRG  58
>gi|257074511|ref|YP_003162908.1| hypothetical protein CAP2UW1_4501 [Candidatus Accumulibacter
phosphatis clade IIA str. UW-1]
gi|257048732|gb|ACV37917.1| hypothetical protein CAP2UW1_4501 [Candidatus Accumulibacter
phosphatis clade IIA str. UW-1]
Length=54
Score = 56.2 bits (134), Expect = 1e-06, Method: Compositional matrix adjust.
Identities = 31/51 (61%), Positives = 36/51 (71%), Gaps = 0/51 (0%)
Query  46  KATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNK  96
           K   PG PAP SGQY+  G GNEVT  KG  LPP+P+PGQ +  VDPTK+K
Sbjct  3   KILKPGTPAPRSGQYKNPGTGNEVTGVKGKPLPPTPRPGQGYTLVDPTKHK  53
>gi|229494973|ref|ZP_04388723.1| conserved hypothetical protein [Rhodococcus erythropolis SK121]
gi|229318125|gb|EEN83996.1| conserved hypothetical protein [Rhodococcus erythropolis SK121]
Length=63
Score = 54.3 bits (129), Expect = 5e-06, Method: Compositional matrix adjust.
Identities = 33/62 (54%), Positives = 39/62 (63%), Gaps = 5/62 (8%)
Query  43  MARKATSPGKPAPTSGQYRPVG-----GGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKS  97
           MA K   PG PAP SGQ   VG      G+E T  +G  LPP+PKPGQ ++ VDPTKN +
Sbjct  1   MASKPLKPGTPAPRSGQVEIVGPRGGRTGDERTTVRGKPLPPTPKPGQGYILVDPTKNGA  60

Query  98  GR  99
           GR
Sbjct  61  GR  62
>gi|294667860|ref|ZP_06733069.1| hypothetical protein XAUC_38210 [Xanthomonas fuscans subsp. aurantifolii
str. ICPB 10535]
gi|292602363|gb|EFF45805.1| hypothetical protein XAUC_38210 [Xanthomonas fuscans subsp. aurantifolii
str. ICPB 10535]
Length=156
Score = 51.6 bits (122), Expect = 4e-05, Method: Compositional matrix adjust.
Identities = 30/56 (54%), Positives = 37/56 (67%), Gaps = 5/56 (8%)
Query  50   PGKPAPTSGQYRPVG-----GGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKSGRG  100
            PG+ AP SGQY  VG      G E+T  +G  LPP+P PGQ +V VDP+KN +GRG
Sbjct  100  PGQAAPRSGQYERVGPRGGATGQEITGVRGKPLPPTPGPGQGYVLVDPSKNGAGRG  155
>gi|291543334|emb|CBL16443.1| hypothetical protein RUM_01900 [Ruminococcus sp. 18P13]
Length=54
Score = 50.8 bits (120), Expect = 7e-05, Method: Compositional matrix adjust.
Identities = 28/53 (53%), Positives = 35/53 (67%), Gaps = 1/53 (1%)
Query  43  MARKATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPPSPKPGQKWVNVDPTKN  95
           MA K T  G+ AP SGQY+PVG   EVT+ KG  +PP+PK    +V VD TK+
Sbjct  1   MATK-TKTGQKAPVSGQYKPVGSKTEVTLVKGKTVPPTPKGATTFVLVDKTKH  52
>gi|289165202|ref|YP_003455340.1| hypothetical protein LLO_1865 [Legionella longbeachae NSW150]
gi|288858375|emb|CBJ12243.1| hypothetical protein LLO_1865 [Legionella longbeachae NSW150]
Length=59
Score = 50.1 bits (118), Expect = 1e-04, Method: Compositional matrix adjust.
Identities = 30/55 (55%), Positives = 35/55 (64%), Gaps = 5/55 (9%)
Query  50  PGKPAPTSGQYRPVG-----GGNEVTVPKGHRLPPSPKPGQKWVNVDPTKNKSGR  99
           PG  A  SGQY  VG      G E TV KG  LPP+PKPGQ ++ VD +KNKSG+
Sbjct  4   PGSIADKSGQYEIVGPRGGKTGEERTVTKGEPLPPTPKPGQGYILVDSSKNKSGK  58
>gi|330719200|ref|ZP_08313800.1| hypothetical protein LfalK3_08244 [Leuconostoc fallax KCTC 3537]
Length=56
Score = 36.2 bits (82), Expect = 1.8, Method: Compositional matrix adjust.
Identities = 20/50 (40%), Positives = 27/50 (54%), Gaps = 7/50 (14%)
Query  46  KATSPGKPAPTSGQYRPVG-------GGNEVTVPKGHRLPPSPKPGQKWV  88
           K  +PG     +G+Y+ VG        G  V + +G RLPP+ KPG KWV
Sbjct  5   KLINPGTDNQPAGRYQEVGPRGGTVPDGKNVMIDQGDRLPPTNKPGDKWV  54
>gi|296117367|ref|ZP_06835957.1| hypothetical protein GXY_16162 [Gluconacetobacter hansenii ATCC
23769]
gi|295976133|gb|EFG82921.1| hypothetical protein GXY_16162 [Gluconacetobacter hansenii ATCC
23769]
Length=78
Score = 35.8 bits (81), Expect = 2.2, Method: Compositional matrix adjust.
Identities = 18/39 (47%), Positives = 21/39 (54%), Gaps = 0/39 (0%)
Query  41  DQMARKATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPP  79
           D    +   P  PAP SG YR VG G E+  P+G  LPP
Sbjct  15  DAEFDRLHHPRTPAPHSGIYRCVGCGFEIATPEGQLLPP  53
>gi|257898390|ref|ZP_05678043.1| predicted protein [Enterococcus faecium Com15]
gi|257836302|gb|EEV61376.1| predicted protein [Enterococcus faecium Com15]
Length=54
Score = 35.0 bits (79), Expect = 3.6, Method: Compositional matrix adjust.
Identities = 21/47 (45%), Positives = 27/47 (58%), Gaps = 7/47 (14%)
Query  49  SPGKPAPTSGQYRPVG--GGN-----EVTVPKGHRLPPSPKPGQKWV  88
            PG+    +G+Y+ VG  GGN       T+ KG RLPP+ KPG KW
Sbjct  6   KPGEDNKPAGKYKEVGPKGGNVPKGHNATIDKGDRLPPTSKPGNKWT  52
>gi|330991891|ref|ZP_08315840.1| hypothetical protein SXCC_01796 [Gluconacetobacter sp. SXCC-1]
gi|329760912|gb|EGG77407.1| hypothetical protein SXCC_01796 [Gluconacetobacter sp. SXCC-1]
Length=80
Score = 33.9 bits (76), Expect = 8.0, Method: Compositional matrix adjust.
Identities = 18/34 (53%), Positives = 20/34 (59%), Gaps = 0/34 (0%)
Query  46  KATSPGKPAPTSGQYRPVGGGNEVTVPKGHRLPP  79
           K   P   AP SG YR VG G E+ V +GH LPP
Sbjct  20  KIHPPRTVAPHSGIYRCVGCGVEIAVAEGHLLPP  53
Lambda   K        H
0.314    0.137    0.436

Gapped
Lambda   K        H
0.267    0.0410   0.140
Effective search space used: 129239199826
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
Posted date: Sep 5, 2011 4:36 AM
Number of letters in database: 5,219,829,388
Number of sequences in database: 15,229,318
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40