20000316 18.44 Leszek:
Dear Aleksander,

Scientific comments: Thanks for the article. As I understand Your point, You say that to predict the structure of a protein it is better to know the DNA sequence (there is more information) than the amino acid sequence. I would probably not agree with this, because if You modify the DNA of a gene so that the amino acid sequence does not change (the protein before modification and the protein after modification have 100% identical residues), then I expect the structure of the new protein to be identical. I may be wrong, and there could still be some additional (especially evolutionary) information encoded in the DNA ... but that's irrelevant. We should think about how to integrate Your prediction method into CAFASP.

Technical comments: As You know, there will be 2 events: CASP and CAFASP. You can compete in CASP if You have a method for protein structure prediction, or even if You don't have any and You think You can use public methods or Your own intuition better than other people. CAFASP is a slightly different event because it does not allow any intervention by a thinking human expert. To compete in CAFASP You need a method. You have one. The problem is that there are some technical requirements for participation. We cannot run all the methods ourselves (I won't find the time to do this). That's why CAFASP will focus only on servers which give an automated answer to the query we send them. We would probably send only the amino acid sequence, because we believe that the structure depends only on this and not on the underlying DNA sequence, but we could try to make an exception (no promise). But You need a server to send us the response in an automated fashion. In the worst case You can set up a fake server:
1) give us an E-mail
2) we will send requests to this E-mail
3) You will read it, run Your program and send the results back to the address we provide
... but this is only if You want to compete in CAFASP.
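[Editorial illustration] The scientific comment above, that a gene's DNA can be modified without changing the amino acid sequence, can be sketched in a few lines of Python. This is a minimal sketch and not part of the correspondence. It assumes the standard genetic code. The 27-base fragment is the start of the T0086 coding region as it appears in the E. coli record AE000477 quoted later in this log (bases 4604 to 4630). The synonymous variant and the names `CODON_TABLE`, `translate`, `ORIGINAL` and `VARIANT` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (frame 1, length divisible by 3) into one-letter residues."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Fragment of the E. coli record quoted later in this log (AE000477, bases 4604-4630).
ORIGINAL = "TCACACCCCGCGTTAACGCAACTGCGT"
# Hypothetical variant: four codons swapped for synonymous ones (illustration only).
VARIANT = "AGCCACCCTGCGCTGACGCAACTGCGC"

if __name__ == "__main__":
    print(translate(ORIGINAL))  # SHPALTQLR
    print(translate(VARIANT))   # SHPALTQLR
```

Both strings translate to the first nine residues of target T0086, which is the situation Leszek describes: a server that receives only the amino acid sequence cannot tell the two DNA sequences apart.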
If You have the best method and You do not compete in CAFASP, You can still take part in the CASP event and be the best there. CASP is better known; CAFASP is a special discipline for servers.

Summarizing: Can You set up a server for the method?

Best regards, Leszek

. . .

20 19.41 Leszek:
nanoworld wrote:
Dear Sir, If the DNA-sequence is changed, the ribosome builds a different conformation of the protein; however, it is hardly stable and renatures into the stable form. You are right that the structure of the protein isn't changed; however, time is lost. During the process of evolution the slow processes are eliminated, therefore only those DNA-sequences remain from which stable conformations of proteins are created at once. We are ready to place the server at your disposal, but the DNA-sequence is necessary as an input. Would you be so kind as to give us a sample of input data with a DNA-sequence, and we will prepare the program on the server according to it. With best regards A.Kushelev

Great. You can decide on the input format. I would recommend using 2 parameters as input:
"dna-sequence" the DNA-sequence of the query
"email" the E-mail where to send back the prediction results
The server can be either an HTTP server or an E-mail server. We can send requests to both types.

Best regards, Leszek

PS: a simple example: http://imtech.ernet.in/raghava/pssp/

. . .

20000405 20.44 Leszek:
Dear Alexander Kushelev,

I have listed Your server in our server database (http://cafasp.bioinfo.pl/server/) but I cannot couple it to the meta server since it is still a Java applet. To take part in CAFASP You can participate in the category "2) Non-available-programs predictions". Please check the CAFASP announcement and the evaluation rules (http://www.cs.bgu.ac.il/~dfischer/CAFASP2/evalr.html) for details.
1) You should also register Your server with CASP as a server: http://PredictionCenter.llnl.gov/casp4/predictors/forms/casp4-reg-ser.html
2) You should submit Your predictions manually to the CASP site (we will automatically obtain a copy).

Best regards
Leszek

. . .

20000517 14.48 Leszek:
Nanoworld wrote:
Dear Sir, We are quite ready to receive targets on our server. We have solved the problems with our program. With best regards A.Kushelev

Dear Sir,

Since the server is a Java applet I will not be able to couple the meta server to Your server. The best solution will be that I send You the target sequences and ask You to e-mail me back the results to nanoworld@leszek.bioinfo.pl. There are 3 targets available now (the CAFASP deadline has already expired, but please e-mail me back the results so I can include them in the database):

target: T0086
SHPALTQLRA LRYCKEIPAL DPQLLDWLLL EDSMTKRFEQ QGKTVSVTMI REGFVEQNEI PEELPLLPKE SRYWLREILL CADGEPWLAG RTVVPVSTLS GPELALQKLG KTPLGRYLFT SSTLTRDFIE IGRDAGLWGR RSRLRLSGKP LLLTELFLPA SPLY

target: T0087
MSKILVFGHQ NPDSDAIGSS MAYAYLKRQL GVDAQAVALG NPNEETAFVL DYFGIQAPPV VKSAQAEGAK QVILTDHNEF QQSIADIREV EVVEVVDHHR VANFETANPL YMRLEPVGSA SSIVYRLYKE NGVAIPKEIA GVMLSGLISD TLLLKSPTTH ASDPAVAEDL AKIAGVDLQE YGLAMLKAGT NLASKTAAQL VDIDAKTFEL NGSQVRVAQV NTVDINEVLE RQNEIEEAIK ASQAANGYSD FVLMITDILN SNSEILALGN NTDKVEAAFN FTLKNNHAFL AGAVSRKKQV VPQLTESFNG

target: T0088
AVSFIGSTEN DVGPSQGSYS STHAMDNLPF VYNTGYNIGY QNANVWRISG GFCVGLDGKV DLPVVGSLDG QSIYGLTEEV GLLIWMGDTN YSRGTAMSGN SWENVFSGWC VGNYVSTQGL SVHVRPVILK RNSSAQYSVQ KTSIGSIRMR PYNGSS

You can find more information about the targets on: http://predictioncenter.llnl.gov/casp4/targets/templates/targets.html

Please don't forget to register Your method in CASP as well: http://PredictionCenter.llnl.gov/casp4/predictors/forms/casp4-reg-ser.html

Please don't forget that You have to submit CASP-formatted results to CASP before the official CASP target deadline (please read the instructions on:
http://www.cs.bgu.ac.il/~dfischer/CAFASP2/).

Best regards, Leszek

. . .

20000518 14.44 Leszek:
Nanoworld wrote:
-Yes! Yours, Alexander

Dear Alexander,

I attach 3 files. They contain the alignment of the query (top part) with a hit from the DNA database. Please send me the results to my E-mail. Tonight I will send You 2 more targets, which have to be processed within 48 hours.

Leszek

>gb|AE000477.1|AE000477 Escherichia coli K-12 MG1655 section 367 of 400 of the complete genome
Length = 11314
Score = 233 bits (588), Expect = 4e-60
Identities = 123/164 (75%), Positives = 123/164 (75%)
Frame = +2

Query: 1    SHPALTQLRALRYCKEIPAXXXXXXXXXXXXXSMTKRFEQQGKTVSVTMIREGFVEQNXX 60
            SHPALTQLRALRYCKEIPA             SMTKRFEQQGKTVSVTMIREGFVEQN
Sbjct: 4604 SHPALTQLRALRYCKEIPALDPQLLDWLLLEDSMTKRFEQQGKTVSVTMIREGFVEQNEI 4783

Query: 61   XXXXXXXXKESRYWLREILLCADGEPWLAGRTVVPVSTLSGPELALQKLGKTPLGRYLFT 120
                    KESRYWLREILLCADGEPWLAGRTVVPVSTLSGPELALQKLGKTPLGRYLFT
Sbjct: 4784 PEELPLLPKESRYWLREILLCADGEPWLAGRTVVPVSTLSGPELALQKLGKTPLGRYLFT 4963

Query: 121  SSTLTRDFIEIGRDAXXXXXXXXXXXXXXXXXXTELFLPASPLY 164
            SSTLTRDFIEIGRDA                  TELFLPASPLY
Sbjct: 4964 SSTLTRDFIEIGRDAGLWGRRSRLRLSGKPLLLTELFLPASPLY 5095

1: AE000477. Escherichia coli K...[gi:2367338] LOCUS AE000477 11314 bp DNA BCT 12-NOV-1998 DEFINITION Escherichia coli K-12 MG1655 section 367 of 400 of the complete genome. ACCESSION AE000477 U00096 VERSION AE000477.1 GI:2367338 KEYWORDS . SOURCE Escherichia coli. ORGANISM Escherichia coli Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia. REFERENCE 1 (bases 1 to 11314) AUTHORS Blattner,F.R., Plunkett,G. III, Bloch,C.A., Perna,N.T., Burland,V., Riley,M., Collado-Vides,J., Glasner,J.D., Rode,C.K., Mayhew,G.F., Gregor,J., Davis,N.W., Kirkpatrick,H.A., Goeden,M.A., Rose,D.J., Mau,B. and Shao,Y. TITLE The complete genome sequence of Escherichia coli K-12 JOURNAL Science 277 (5331), 1453-1474 (1997) MEDLINE 97426617 PUBMED 97426617 REFERENCE 2 (bases 1 to 11314) AUTHORS Blattner,F.R.
TITLE Direct Submission JOURNAL Submitted (16-JAN-1997) Guy Plunkett III, Laboratory of Genetics, University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA. Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax: 608-263-7459 REFERENCE 3 (bases 1 to 11314) AUTHORS Blattner,F.R. TITLE Direct Submission JOURNAL Submitted (02-SEP-1997) Guy Plunkett III, Laboratory of Genetics, University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA. Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax: 608-263-7459 REFERENCE 4 (bases 1 to 11314) AUTHORS Plunkett,G. III. TITLE Direct Submission JOURNAL Submitted (13-OCT-1998) Laboratory of Genetics, University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA COMMENT On Sep 9, 1997 this sequence version replaced gi:1790468. This sequence was determined by the E. coli Genome Project at the University of Wisconsin-Madison (Frederick R. Blattner, director). Supported by NIH grants HG00301 and HG01428 (from the Human Genome Project and NCHGR). The entire sequence was independently determined from E. coli K-12 strain MG1655. Predicted open reading frames were determined using GeneMark software, kindly supplied by Mark Borodovsky, Georgia Institute of Technology, Atlanta, GA, 30332 [e-mail: mark@amber.gatech.edu]. Open reading frames that have been correlated with genetic loci are being annotated with CG Site Nos., unique ID nos. for the genes in the E. coli Genetic Stock Center (CGSC) database at Yale University, kindly supplied by Mary Berlyn. A public version of the database is accessible (http://cgsc.biology.yale.edu). Annotation of the genome is an ongoing task whose goal is to make the genome sequence more useful by correlating it with other data. Comments to the authors are appreciated. Updated information will be available at the E. coli Genome Project's World Wide Web site (http://www.genetics.wisc.edu). *** The E. coli K-12 sequence and its annotations are periodically updated; this is version M54. 
No sequence changes. Annotation updates: updated gene identifications and products; all new functional assignments courtesy of Monica Riley; added promoters, protein binding sites, and repeated sequences described in reference 1. The unique numeric identifiers beginning with a lowercase 'b' assigned to each gene (protein- or RNA-encoding) are now designated as gene synonyms instead of labels. This should allow them to be searched for in Entrez as gene names. FEATURES Location/Qualifiers source 1..11314 /organism="Escherichia coli" /strain="K-12" /sub_strain="MG1655" /db_xref="taxon:562" gene 66..1406 /gene="lamB" /note="b4036" CDS 66..1406 /gene="lamB" /function="IS, phage, Tn; Transport of small molecules: Carbohydrates, organic acids, alcohols" /note="o446; 99 pct identical amino acid sequence and equal length to LAMB_ECOLI SW: P02943; CG Site No. 575" /codon_start=1 /transl_table=11 /product="phage lambda receptor protein; maltose high-affinity receptor" /protein_id="AAC77006.1" /db_xref="GI:1790469" /db_xref="PID:g1790469" /translation="MMITLRKLPLAVAVAAGVMSAQAMAVDFHGYARSGIGWTGSGGE QQCFQTTGAQSKYRLGNECETYAELKLGQEVWKEGDKSFYFDTNVAYSVAQQNDWEAT DPAFREANVQGKNLIEWLPGSTIWAGKRFYQRHDVHMIDFYYWDISGPGAGLENIDVG FGKLSLAATRSSEAGGSSSFASNNIYDYTNETANDVFDVRLAQMEINPGGTLELGVDY GRANLRDNYRLVDGASKDGWLFTAEHTQSVLKGFNKFVVQYATDSMTSQGKGLSQGSG VAFDNEKFAYNINNNGHMLRILDHGAISMGDNWDMMYVGMYQDINWDNDNGTKWWTVG IRPMYKWTPIMSTVMEIGYDNVESQRTGDKNNQYKITLAQQWQAGDSIWSRPAIRVFA TYAKWDEKWGYDYTGNADNNANFGKAVPADFNGGSFGRGDSDEWTFGAQMEIWW" repeat_region 1416..1618 /note="REP (repetitive extragenic palindromic) element; contains 4 REP sequences" gene 1649..2569 /gene="malM" /note="b4037" CDS 1649..2569 /gene="malM" /function="phenotype; Degradation of small molecules: Carbon compounds" /note="o306; CG Site No. 
18178; alternate gene name molA" /codon_start=1 /transl_table=11 /product="periplasmic protein of mal regulon" /protein_id="AAC77007.1" /db_xref="GI:1790470" /db_xref="PID:g1790470" /translation="MKMNKSLIVLCLSAGLLASAPGISLADVNYVPQNTSDAPAIPSA ALQQLTWTPVDQSKTQTTQLATGGQQLNVPGISGPVAAYSVPANIGELTLTLTSEVNK QTSVFAPNVLILDQNMTPSAFFPSSYFTYQEPGVMSADRLEGVMRLTPALGQQKLYVL VFTTEKDLQQTTQLLDPAKAYAKGVGNSIPDIPDPVARHTTDGLLKLKVKTNSSSSVL VGPLFGSSAPAPVTVGNTAAPAVAAPAPAPVKKSEPMLNDTESYFNTAIKNAVAKGDV DKALKLLDEAERLGSTSARSTFISSVKGKG" promoter 3001..3028 /note="factor Sigma70; predicted +1 start at 4248519" gene 3050..4378 /gene="yjbI" /note="b4038" CDS 3050..4378 /gene="yjbI" /function="orf; Unknown" /note="o442; 100 pct identical amino acid sequence and equal length to YJBI_ECOLI SW: P32690" /codon_start=1 /transl_table=11 /product="orf, hypothetical protein" /protein_id="AAC77008.1" /db_xref="GI:1790471" /db_xref="PID:g1790471" /translation="MKKIECACNFLMDKDAQGYIDLSDLDLTSCHFKGDVISKVSFLS SNLQHVTFECKEIGDCNFTTAIVDNVIFRCRRLHNVIFIKASGECVDFSKNILDTVDF SQSQLGHSNFRECQIRNSNFDNCYLYASHFTRAEFLSAKEISFIKSNLTAVMFDYVRM STGNFKDCITEQLELTIDYSDIFWNEDLDGYINNIIKMIDTLPDNAMILKSVLAVKLV MQLKILNIVNKNFIENMKKIFSHCPYIKDPIIRSYIHSDEDNKFDDFMRQHRFSEVNF DTQQMIDFINRFNTNKWLIDKNNNFFIQLIDQALRSTDDMIKANVWHLYKEWIRSDDV SPIFIETEDNLRTFNTNELTRNDNIFILFSSVDDGPVMVVSSQRLHDMLNPTKDTNWN STYIYKSRHEMLPVNLTQETLFSSKSHGKYALFPIFTASWRAHRIMNKGV" promoter 4330..4361 /gene="yjbI" /note="factor Sigma70; predicted +1 start at 4249852" gene 4490..5098 /gene="ubiC" /note="b4039" CDS 4490..5098 /gene="ubiC" /function="enzyme; Biosynthesis of cofactors, carriers: Menaquinone, ubiquinone" /note="o202a; CG Site No. 
48" /codon_start=1 /transl_table=11 /product="chorismate lyase" /protein_id="AAC77009.1" /db_xref="GI:1790472" /db_xref="PID:g1790472" /translation="MRLLRFCCVLDHLICFTSPVNTFLRYNAFTLCNGEFGMSHPALT QLRALRYCKEIPALDPQLLDWLLLEDSMTKRFEQQGKTVSVTMIREGFVEQNEIPEEL PLLPKESRYWLREILLCADGEPWLAGRTVVPVSTLSGPELALQKLGKTPLGRYLFTSS TLTRDFIEIGRDAGLWGRRSRLRLSGKPLLLTELFLPASPLY" gene 5111..5983 /gene="ubiA" /note="b4040" CDS 5111..5983 /gene="ubiA" /EC_number="2.5.1.-" /function="enzyme; Biosynthesis of cofactors, carriers: Menaquinone, ubiquinone" /note="o290b; 99 pct identical amino acid sequence and equal length to UBIA_ECOLI SW: P26601; CG Site No. 50" /codon_start=1 /transl_table=11 /product="4-hydroxybenzoate-octaprenyltransferase" /protein_id="AAC77010.1" /db_xref="GI:1790473" /db_xref="PID:g1790473" /translation="MEWSLTQNKLLAFHRLMRTDKPIGALLLLWPTLWALWVATPGVP QLWILAVFVAGVWLMRAAGCVVNDYADRKFDGHVKRTANRPLPSGAVTEKEARALFVV LVLISFLLVLTLNTMTILLSIAALALAWVYPFMKRYTHLPQVVLGAAFGWSIPMAFAA VSESVPLSCWLMFLANILWAVAYDTQYAMVDRDDDVKIGIKSTAILFGQYDKLIIGIL QIGVLALMAIIGELNGLGWGYYWSILVAGALFVYQQKLIANREREACFKAFMNNNYVG LVLFLGLAMSYWHF" repeat_region 6024..6125 /note="REP (repetitive extragenic palindromic) element; contains 2 REP sequences" gene complement(6138..8621) /gene="plsB" /note="b4041" CDS complement(6138..8621) /gene="plsB" /EC_number="2.3.1.15" /function="enzyme; Macromolecule synthesis, modification: Phospholipids" /note="f827; 99 pct identical amino acid sequence and equal length to PLSB_ECOLI SW: P00482; CG Site No. 
382" /codon_start=1 /transl_table=11 /product="glycerol-3-phosphate acyltransferase" /protein_id="AAC77011.1" /db_xref="GI:1790474" /db_xref="PID:g1790474" /translation="MTFCYPCRAFALLTRGFTSFMSGWPRIYYKLLNLPLSILVKSKS IPADPAPELGLDTSRPIMYVLPYNSKADLLTLRAQCLAHDLPDPLEPLEIDGTLLPRY VFIHGGPRVFTYYTPKEESIKLFHDYLDLHRSNPNLDVQMVPVSVMFGRAPGREKGEV NPPLRMLNGVQKFFAVLWLGRDSFVRFSPSVSLRRMADEHGTDKTIAQKLARVARMHF ARQRLAAVGPRLPARQDLFNKLLASRAIAKAVEDEARSKKISHEKAQQNAIALMEEIA ANFSYEMIRLTDRILGFTWNRLYQGINVHNAERVRQLAHDGHELVYVPCHRSHMDYLL LSYVLYHQGLVPPHIAAGINLNFWPAGPIFRRLGAFFIRRTFKGNKLYSTVFREYLGE LFSRGYSVEYFVEGGRSRTGRLLDPKTGTLSMTIQAMLRGGTRPITLIPIYIGYEHVM EVGTYAKELRGATKEKESLPQMLRGLSKLRNLGQGYVNFGEPMPLMTYLNQHVPDWRE SIDPIEAVRPAWLTPTVNNIAADLMVRINNAGAANAMNLCCTALLASRQRSLTREQLT EQLNCYLDLMRNVPYSTDSTVPSASASELIDHALQMNKFEVEKDTIGDIIILPREQAV LMTYYRNNIAHMLVLPSLMAAIVTQHRHISRDVLMEHVNVLYPMLKAELFLRWDRDEL PDVIDALANEMQRQGLITLQDDELHINPAHSRTLQLLAAGARETLQRYAITFWLLSAN PSINRGTLEKESRTVAQRLSVLHGINAPEFFDKAVFSSLVLTLRDEGYISDSGDAEPA ETMKVYQLLAELITSDVRLTIESATQGEG" promoter 8682..8713 /note="factor Sigma70; predicted +1 start at 4254204" gene 8732..9100 /gene="dgkA" /note="b4042" CDS 8732..9100 /gene="dgkA" /EC_number="2.7.1.107" /function="enzyme; Fatty acid and phosphatidic acid biosynthesis" /note="o122; 100 pct identical to KDGL_ECOLI SW: P00556; CG Site No. 
862" /codon_start=1 /transl_table=11 /product="diacylglycerol kinase" /protein_id="AAC77012.1" /db_xref="GI:1790475" /db_xref="PID:g1790475" /translation="MANNTTGFTRIIKAAGYSWKGLRAAWINEAAFRQEGVAVLLAVV IACWLDVDAITRVLLISSVMLVMIVEILNSAIEAVVDRIGSEYHELSGRAKDMGSAAV LIAIIVAVITWCILLWSHFG" promoter complement(8740..8770) /note="factor Sigma70; predicted +1 start at 4254217" promoter 9146..9174 /note="factor Sigma70; promoter lexA; documented +1 at4254666" protein_bind 9163..9183 /note="central position to lexA promoter: -8.5" /bound_moiety="LexA documented site" protein_bind 9184..9204 /note="central position to lexA promoter:12.5" /bound_moiety="LexA documented site" gene 9210..9818 /gene="lexA" /note="b4043" CDS 9210..9818 /gene="lexA" /function="regulator; Global regulatory functions" /note="o202b; CG Site No. 558" /codon_start=1 /transl_table=11 /product="regulator for SOS(lexA) regulon" /protein_id="AAC77013.1" /db_xref="GI:1790476" /db_xref="PID:g1790476" /translation="MKALTARQQEVFDLIRDHISQTGMPPTRAEIAQRLGFRSPNAAE EHLKALARKGVIEIVSGASRGIRLLQEEEEGLPLVGRVAAGEPLLAQQHIEGHYQVDP SLFKPNADFLLRVSGMSMKDIGIMDGDLLAVHKTQDVRNGQVVVARIDDEVTVKRLKK QGNKVELLPENSEFKPIVVDLRQQSFTIEGLAVGVIRNGDWL" gene 9837..11216 /gene="dinF" /note="b4044" CDS 9837..11216 /gene="dinF" /function="factor; DNA - replication, repair, restriction/modification" /note="o459; 100 pct identical to DINF_ECOLI SW: P28303; CG Site No. 
854" /codon_start=1 /transl_table=11 /product="DNA-damage-inducible protein F" /protein_id="AAC77014.1" /db_xref="GI:1790477" /db_xref="PID:g1790477" /translation="MPPGVAVCFSSLFIRLVCMAFLTSSDKALWHLALPMIFSNITVP LLGLVDTAVIGHLDSPVYLGGVAVGATATSFLFMLLLFLRMSTTGLTAQAYGAKNPQA LARTLVQPLLLALGAGALIALLRTPIIDLALHIVGGSEAVLEQARRFLEIRWLSAPAS LANLVLLGWLLGVQYARAPVILLVVGNILNIVLDVWLVMGLHMNVQGAALATVIAEYA TLLIGLLMVRKILKLRGISGEMLKTAWRGNFRRLLALNRDIMLRSLLLQLCFGAITVL GARLGSDIIAVNAVLMTLLTFTAYALDGFAYAVEAHSGQAYGARDGSQLLDVWRAACR QSGIVALLFSVVYLLAGEHIIALLTSLTQIQQLADRYLIWQVILPVVGVWCYLLDGMF IGATRATEMRNSMAVAAAGFALTLLTLPWLGNHALWLALTVFLALRGLSLAAIWRRHW RNGTWFAAT" promoter 11190..11207 /gene="dinF" /note="factor Sigma54; predicted +1 start at 4256698" promoter 11211..11239 /note="factor Sigma70; predicted +1 start at 4256730" BASE COUNT 2757 a 2687 c 2937 g 2933 t ORIGIN 1 cacaaaacac acaaagcctg tcacaggtga tgtgaaaaaa gaaaagcaat gactcaggag 61 atagaatgat gattactctg cgcaaacttc ctctggcggt tgccgtcgca gcgggcgtaa 121 tgtctgctca ggcaatggct gttgatttcc acggctatgc acgttccggt attggttgga 181 caggtagcgg cggtgaacaa cagtgtttcc agactaccgg tgctcaaagt aaataccgtc 241 ttggcaacga atgtgaaact tatgctgaat taaaattggg tcaggaagtg tggaaagagg 301 gcgataagag cttctatttc gacactaacg tggcctattc cgtcgcacaa cagaatgact 361 gggaagctac cgatccggcc ttccgtgaag caaacgtgca gggtaaaaac ctgatcgaat 421 ggctgccagg ctccaccatc tgggcaggta agcgcttcta ccaacgtcat gacgttcata 481 tgatcgactt ctactactgg gatatttctg gtcctggtgc cggtctggaa aacatcgatg 541 ttggcttcgg taaactctct ctggcagcaa cccgctcctc tgaagctggt ggttcttcct 601 ctttcgccag caacaatatt tatgactata ccaacgaaac cgcgaacgac gttttcgatg 661 tgcgtttagc gcagatggaa atcaacccgg gcggcacatt agaactgggt gtcgactacg 721 gtcgtgccaa cttgcgtgat aactatcgtc tggttgatgg cgcatcgaaa gacggctggt 781 tattcactgc tgaacatact cagagtgtcc tgaagggctt taacaagttt gttgttcagt 841 acgctactga ctcgatgacc tcgcagggta aagggctgtc gcagggttct ggcgttgcat 901 ttgataacga aaaatttgcc tacaatatca acaacaacgg tcacatgctg cgtatcctcg 961 accacggtgc gatctccatg ggcgacaact 
gggacatgat gtacgtgggt atgtaccagg 1021 atatcaactg ggataacgac aacggcacca agtggtggac cgtcggtatt cgcccgatgt 1081 acaagtggac gccaatcatg agcaccgtga tggaaatcgg ctacgacaac gtcgaatccc 1141 agcgcaccgg cgacaagaac aatcagtaca aaattaccct cgcacaacaa tggcaggctg 1201 gcgacagcat ctggtcacgc ccggctattc gtgtcttcgc aacctacgcc aagtgggatg 1261 agaaatgggg ttacgactac accggtaacg ctgataacaa cgcgaacttc ggcaaagccg 1321 ttcctgctga tttcaacggc ggcagcttcg gtcgtggcga cagcgacgag tggaccttcg 1381 gtgcccagat ggaaatctgg tggtaatagc aaaacctggg ccggataagg cgtttacgcc 1441 gcattcggca accaacgcct gatgcgacgc ttgcgcgtct tatcaggcct acaacggctg 1501 tcaaatgtag gccggataag gcgtttacgc cgcatccggc ataaaaacag gttgtcatta 1561 tctgaaaggg gcgaaagccc ctctgattat cgggtttagc gcgctattgc ctggctaccg 1621 ctgagctcca gattttgagg tgaaaacaat gaaaatgaat aaaagtctca tcgtcctctg 1681 tttatcagca gggttactgg caagcgcgcc tggaattagc cttgccgatg ttaactacgt 1741 accgcaaaac accagcgacg cgccagccat tccatctgct gcgctgcaac aactcacctg 1801 gacaccggtc gatcaatcta aaacccagac cacccaactg gcgaccggcg gccaacaact 1861 gaacgttccc ggcatcagtg gtccggttgc tgcgtacagc gtcccggcaa acattggcga 1921 actgaccctg acgctgacca gcgaagtgaa caaacaaacc agcgtttttg cgccgaacgt 1981 gctgattctt gatcagaaca tgaccccatc agccttcttc cccagcagtt atttcaccta 2041 ccaggaacca ggcgtgatga gtgcagatcg gctggaaggc gttatgcgcc tgacaccggc 2101 gttggggcag caaaaacttt atgttctggt ctttaccacg gaaaaagatc tccagcagac 2161 gacccaactg ctcgacccgg ctaaagccta tgccaagggc gtcggtaact cgatcccgga 2221 tatccccgat ccggttgctc gtcataccac cgatggctta ctgaaactga aagtgaaaac 2281 gaactccagc tccagcgtgt tggtaggacc tttatttggt tcttccgctc cagctccggt 2341 tacggtaggt aacacggcgg caccagctgt ggctgcaccc gctccggcac cggtgaagaa 2401 aagcgagccg atgctcaacg acacggaaag ttattttaat accgcgatca aaaacgctgt 2461 cgcgaaaggt gatgttgata aggcgttaaa actgcttgat gaagctgaac gcctgggatc 2521 gacatctgcc cgttccacct ttatcagcag tgtaaaaggc aaggggtaat tacgccccac 2581 agtgctgatt ttgcaacaac tggtgcgtct cctggcgcac ctttttttat gcttccttcc 2641 tgggatatga gcgatttttt atagtaactc acttcttctt 
cactaagaat atccattatc 2701 tcaatgcctt atcagagatt cttttccttt cgccggtagt gtctggacat tcaggctact 2761 tttccaggtt attttatttc tgttatgcag aggttttatg ataagtcata tcctaaattc 2821 tggcggcaat aactctttga tgaaacatga tgtggtgcaa ggaaataata tagtagatct 2881 tgatttacta cgtaatttaa atggggtgcc aggtttaaac agagataact ttatttatat 2941 cagcaatatt ttttcaaata taaaacaacg gaacgaaaaa atcatgcaat aaatatgttt 3001 cgtgaagtct caatcagtaa tgatactata agtgtaaaat tctacagaaa tgaaaaaaat 3061 tgaatgcgct tgcaattttc tgatggataa agatgcgcag gggtatatcg acctgtctga 3121 tttggattta acaagttgtc attttaaagg tgacgttata tcgaaggtgt cttttttatc 3181 atcaaatcta caacatgtaa cattcgaatg taaagaaatt ggggattgca attttactac 3241 tgcaatagtt gataatgtca tatttagatg tcgacgttta cacaatgtga tttttatcaa 3301 agcgagtggt gaatgtgtcg atttcagcaa aaatattctt gatacagttg acttctcgca 3361 gagtcaactt ggtcatagta attttcgcga atgtcagatt agaaattcaa acttcgataa 3421 ttgttatctt tacgcttcgc acttcaccag agcagagttt ctgtctgcca aagaaatatc 3481 atttattaaa tcgaatttga cagctgttat gtttgattat gtgcgaatgt cgacagggaa 3541 ttttaaagat tgcattacag aacaattgga attaactatt gattattcag atatattttg 3601 gaatgaagat ctcgatggtt atatcaataa cattataaaa atgattgata cattgccaga 3661 taatgcaatg atattgaaat ccgttctggc cgtaaaactg gtgatgcaat taaaaatact 3721 taatattgtt aataaaaact ttattgagaa tatgaagaaa atatttagcc attgtcctta 3781 tataaaagat cccattatac gcagttatat ccattctgat gaagataaca agttcgatga 3841 ttttatgcgt caacatcgat tcagtgaggt gaatttcgat acccaacaga tgatcgattt 3901 tattaacaga tttaatacga ataaatggct aattgataaa aataacaatt tttttatcca 3961 acttatcgat caggccttac gatcaacgga tgatatgatc aaagcaaatg tttggcatct 4021 ttataaagag tggattcgta gtgatgatgt ttcacctata tttatagaaa ctgaagataa 4081 tttaagaacc tttaacacga atgaattaac acgaaacgat aatatcttta tcctgttctc 4141 ctcagtcgat gatgggccag ttatggtggt aagctcccag cgcttacatg atatgttgaa 4201 tcctacaaaa gataccaatt ggaattccac gtatatctac aaatccagac atgagatgtt 4261 gcctgttaat cttactcagg aaacactttt cagctccaaa tctcatggta aatatgcgct 4321 tttccccatt tttactgcga gttggcgagc tcatcgtata atgaataagg 
gtgtttaagt 4381 aaaggaaaac atcaccgttc ctggcatcct ggacggtgat gcccctacgg ttgccctcgc 4441 cagcacgggc atcggtaaag cgtaaggttc aacatcgttt taccacttca tgcgattgtt 4501 gcgtttttgt tgcgtattag atcacttaat ttgctttaca tctcccgtaa acacttttct 4561 gcgatacaat gcctttacgt tatgtaacgg agagttcggc atgtcacacc ccgcgttaac 4621 gcaactgcgt gcgctgcgct attgtaaaga gatccctgcc ctggatccgc aactgctcga 4681 ctggctgttg ctggaggatt ccatgacaaa acgttttgaa cagcagggaa aaacggtaag 4741 cgtgacgatg atccgcgaag ggtttgtcga gcagaatgaa atccccgaag aactgccgct 4801 gctgccgaaa gagtctcgtt actggttacg tgaaattttg ttatgtgccg atggtgaacc 4861 gtggcttgcc ggtcgtaccg tcgttcctgt gtcaacgtta agcgggccgg agctggcgtt 4921 acaaaaattg ggtaaaacgc cgttaggacg ctatctgttc acatcatcga cattaacccg 4981 ggactttatt gagataggcc gtgatgccgg gctgtggggg cgacgttccc gcctgcgatt 5041 aagcggtaaa ccgctgttgc taacagaact gtttttaccg gcgtcaccgt tgtactaaga 5101 ggaaaaaaat atggagtgga gtctgacgca gaataagctg ctggcgtttc atcgcttaat 5161 gcgtacggat aagccaattg gcgcgttact gctgctctgg ccaacattat gggcgttgtg 5221 ggtggcgaca ccgggcgttc cccagctctg gatcctggcg gtgtttgtcg cgggtgtctg 5281 gctgatgcgc gctgccggat gtgtggtgaa tgattatgct gaccgcaagt ttgatggtca 5341 tgttaagcgc acggcgaacc gaccacttcc cagcggcgcg gtaacagaga aagaggcgcg 5401 cgcgctgttt gtcgtgctgg tactgatttc gtttttactg gtgctgacgc tgaatacgat 5461 gaccattctg ttgtcgattg ccgcgctagc gctggcgtgg gtgtacccgt ttatgaagcg 5521 gtatacccat ctaccgcaag tggtgctggg cgcggcgttt ggctggtcga ttccaatggc 5581 ttttgccgct gtgagtgagt cggtgccatt gagttgctgg ttaatgttcc tcgccaatat 5641 tctctgggcg gtggcttacg acacgcagta tgcgatggtt gaccgcgatg atgatgtgaa 5701 gattggcatt aaatccacgg caatcctgtt cggccaatac gataaattga ttattggtat 5761 tttgcagatt ggcgtactgg cactgatggc gatcatcggt gagttaaatg gcttaggctg 5821 gggatattac tggtcaattc tggtggctgg cgcgctgttt gtttatcaac aaaaactgat 5881 tgccaaccgc gagcgtgaag cctgctttaa agcatttatg aataataact atgttggtct 5941 ggtactattt ttagggctgg caatgagtta ctggcatttc tgatgatgta aaaaagccgg 6001 atgatcatcc ggctttcttc tgggttgcct gatgcgcggc gcttctcagg cctacacaac 
6061 acatcgcaat ttattgaatt tgcagattat ggaaggccgg ataaggcgtt ttcgccgcat 6121 ccggcaattc tctctgatta cccttcgccc tgcgtcgcac tctcaatcgt caaacgcacg 6181 tctgatgtaa tcaactccgc cagcaactga taaaccttca tcgtttctgc cggttcggca 6241 tcgccgctat cgctgatata cccttcatca cgcagtgtca gcaccagaga actgaacacc 6301 gccttgtcga agaactccgg cgcgttgatg ccgtgcagca cggagagacg ttgcgcgacg 6361 gtgcggctct ctttctccag cgtaccgcgg ttgatcgacg ggttggcact caacaaccag 6421 aaggtgatgg cataacgttg cagcgtttcg cgcgcgcctg cggccagcag ctgtagcgtg 6481 cgagaatgcg ccgggttgat atgcaactca tcatcttgca gggtaatcag cccctgacgt 6541 tgcatctcat ttgccagcgc atcaataacg tccggcaact cgtcgcgatc ccagcgcagg 6601 aacagctccg ctttcagcat tgggtaaagc acattgacgt gctccatcaa tacgtcgcgg 6661 gagatgtggc gatgctgggt gacgattgcc gccatcagcg aaggcagcac caacatatgc 6721 gcaatgttgt tgcgatagta ggtcatcagc accgcttgct cgcgcggcag aatgatgatg 6781 tcgccgattg tgtctttctc gacttcaaac ttgttcattt gcagcgcgtg atcgataagc 6841 tcgctggcgc tggctgaagg aacggtagag tccgtggagt agggcacgtt gcgcatcaga 6901 tccaggtagc agttgagttg ctcggttaac tgctcgcggg tgagtgagcg ctgacgtgat 6961 gccagtagcg cagtacagca caggttcatg gcgtttgccg cgcctgcgtt gttaatgcgt 7021 accatcagat cggcagcaat attattgacc gtcggcgtta accatgccgg acgcaccgct 7081 tcgatgggat cgatagattc acgccagtca ggtacatgct ggttaaggta ggtcatcaac 7141 ggcattggtt caccgaagtt gacgtaaccc tgaccgagat tacgcagctt gcttaaaccg 7201 cgcagcatct gcggcaggct ctctttctct ttcgtcgcgc cgcgcagttc tttggcgtaa 7261 gtacccactt ccatgacgtg ctcataaccg atatagatcg gaatcagcgt aatcggacgc 7321 gtgccgccac gcagcatcgc ctgaatggtc atcgacagcg taccagtttt cggatccagc 7381 aaacgccccg tacgggaacg accgccttcc acgaagtact cgacggaata accacggctg 7441 aacagttcgc cgagatactc ccggaaaacg gtggaataaa gtttattgcc tttaaacgta 7501 cggcgaataa agaacgcccc cagacggcgg aaaatcggcc cggcaggcca gaaattcagg 7561 ttgatcccgg cggcgatatg cggcggcacc agcccctggt gatacagcac gtaagaaagc 7621 agcaggtagt ccatgtgact gcggtggcaa ggcacatata ccagctcatg accgtcgtgg 7681 gccagctggc gaacgcgctc agcgttatgg acgttgatgc cctggtaaag tcggttccag 7741 
gtgaagccca gaatacggtc agtcaggcga atcatctcgt aagagaaatt cgccgcaatc 7801 tcttccatca gtgcaatcgc gttctgctgc gctttttcat gggagatttt tttgctgcgc 7861 gcttcatctt ctaccgcttt ggcaatggcg cgggaggcga gcagcttatt aaacagatcc 7921 tgacgagcag gcaaacgtgg gcctacggca gccagacgtt gacgggcaaa gtgcatacgc 7981 gccacgcgcg ccagtttctg agcgatagtt ttatccgtgc cgtgttcatc cgccatacgg 8041 cgcagcgaaa ctgacggcga gaaacgcaca aaactgtcgc gaccgagcca cagtacagcg 8101 aaaaatttct gtacgccgtt aagcatacgc agcggcgggt tcacttcgcc tttttcacgc 8161 cccggcgcgc gaccaaacat caccgacact ggcaccatct gcacatccag atttgggttg 8221 ctacggtgca aatcgagata gtcgtggaac agcttaatag actcttcttt cggcgtgtaa 8281 taggtgaaca cacgcggccc gccgtgaatg aacacatagc gcggcagtag cgtgccgtcg 8341 atttccagcg gctctaacgg gtcaggcaag tcatgtgcca gacactgggc gcgcaacgtc 8401 agcaaatctg ctttcgagtt gtacggtaaa acgtacataa ttggacgaga ggtatccagc 8461 cccagttccg gggcaggatc tgccggaata gacttgcttt ttaccaggat gcttaatggt 8521 aaattcagta atttgtagta aattcgtggc cagccggaca taaacgatgt aaagcctctg 8581 gttaataatg caaatgcgcg gcaaggatag cagaaagtca tgggaaattc tgtggtatcc 8641 gctcatgttt cgcgcggcgc tacgcaaacc cgaatcatcg gatttaacgg tacactgata 8701 ttgacgctca taatgtaaaa aggttctttc aatggccaat aataccactg gattcacccg 8761 aattatcaaa gctgctggct attcctggaa aggtttacgc gctgcatgga tcaacgaagc 8821 ggcattccgt caggaaggcg tagcggtatt gttggcggtg gtcatcgcct gctggctgga 8881 tgtggacgcg attacccgcg tgctgcttat cagctccgtg atgctggtga tgattgtgga 8941 aatcctcaat agcgccatcg aagcagtggt tgaccgaatt ggctctgaat accatgagct 9001 ttccggacgc gcaaaagata tgggatccgc tgcggtgctg attgccatta tcgtcgccgt 9061 gattacctgg tgcattctgt tatggtcgca ttttggataa cccttccaga attcgataaa 9121 tctctggttt attgtgcagt ttatggttcc aaaatcgcct tttgctgtat atactcacag 9181 cataactgta tatacaccca gggggcggaa tgaaagcgtt aacggccagg caacaagagg 9241 tgtttgatct catccgtgat cacatcagcc agacaggtat gccgccgacg cgtgcggaaa 9301 tcgcgcagcg tttggggttc cgttccccaa acgcggctga agaacatctg aaggcgctgg 9361 cacgcaaagg cgttattgaa attgtttccg gcgcatcacg cgggattcgt ctgttgcagg 9421 aagaggaaga 
agggttgccg ctggtaggtc gtgtggctgc cggtgaacca cttctggcgc 9481 aacagcatat tgaaggtcat tatcaggtcg atccttcctt attcaagccg aatgctgatt 9541 tcctgctgcg cgtcagcggg atgtcgatga aagatatcgg cattatggat ggtgacttgc 9601 tggcagtgca taaaactcag gatgtacgta acggtcaggt cgttgtcgca cgtattgatg 9661 acgaagttac cgttaagcgc ctgaaaaaac agggcaataa agtcgaactg ttgccagaaa 9721 atagcgagtt taaaccaatt gtcgttgacc ttcgtcagca gagcttcacc attgaagggc 9781 tggcggttgg ggttattcgc aacggcgact ggctgtaaca tatctctgag accgcgatgc 9841 cgcctggcgt cgcggtttgt ttttcatctc tcttcatcag gcttgtctgc atggcattcc 9901 tcacttcatc tgataaagca ctctggcatc tcgccttacc catgattttc tccaatatca 9961 ccgttccgtt gctgggactg gtcgatacgg cggtaattgg tcatcttgat agcccggttt 10021 atttgggcgg cgtggcggtt ggtgcaacgg cgaccagctt tctctttatg ctgttgctgt 10081 ttttacgcat gagcaccacc gggctgactg cgcaggctta tggtgccaaa aatcctcagg 10141 cattagcccg tacgctggtg caaccgttgc tgttggcgtt gggggctggg gcgttaattg 10201 cgctgctgcg tacgccgatt atcgatctgg cgctgcatat tgttggcggt agtgaggcag 10261 tcctggaaca ggcgcggcgc tttcttgaaa tccgctggtt aagcgcaccg gcgtcgctgg 10321 cgaatctggt attactcggt tggttactcg gcgtgcaata tgcccgtgcg ccagtaattt 10381 tgttagtggt cggcaatatc ctcaacattg tgctggatgt ctggctggtg atggggctgc 10441 atatgaacgt gcagggcgcg gcgctggcga cggttattgc ggaatatgca acattgctga 10501 ttggtctgct aatggtgcgt aaaatcctca aactacgcgg aatttccggc gaaatgctga 10561 aaactgcctg gcgaggaaac ttccgtcgct tgctggcgct taaccgcgat atcatgctgc 10621 gttcgctgtt gttgcaactc tgtttcggcg cgatcaccgt acttggcgcg cgactgggga 10681 gtgacattat cgctgttaac gcggttctga tgacgctact cacctttacc gcctatgcgc 10741 tggatggttt tgcctacgcg gttgaagcgc actccggtca ggcatacggt gcgcgcgacg 10801 gtagccagtt gctggatgtc tggcgggcag cgtgccgcca gtcggggatc gtagcgttac 10861 tgttttcggt ggtttatttg ctggctgggg aacacatcat tgcgttactg acgtcgttaa 10921 cccagattca gcagctggct gaccgctatc ttatctggca ggtgattttg ccggtggttg 10981 gcgtctggtg ttatctgctg gacggcatgt ttataggcgc aacgcgtgcc accgaaatgc 11041 gtaacagtat ggcggtggcc gccgcaggtt ttgcgctgac gctccttacg ctgccgtggc 11101 
tgggtaatca tgctttgtgg ctggcattaa ccgtctttct ggcgttgcgc gggctttctc 11161 tggcggctat ctggcggcgt cactggcgca atggtacctg gtttgccgca acgtgacggt 11221 taaaaattct gaataaataa tcctaagcca aattgctgac tacacttaat ctcacgttca 11281 gaagaaaagt gaacgtactc tcattcacaa ccta //

>gi|2952523|gb|AF051356.1|AF051356 Streptococcus mutans YtqB (ytqB) gene, partial cds; ABC transporter (abcX), putative permease (perM), putative hemolysin (hlyX), pyruvate-formate lyase activating enzyme (pflC), D-alanine-D-alanyl carrier protein ligase (dltA), integral membrane protei>
Length = 11202
Score = 559 bits (1424), Expect = e-158
Identities = 291/310 (93%), Positives = 291/310 (93%)
Frame = +3

Query: 1     MSKILVFGHQNPDSDAIGSSMAYAYLKRQLGVDAQAVALGNPNEETAFVLDYFGIQAPPV 60
             MSKILVFGHQNPDSDAIGSSMAYAYLKRQLGVDAQAVALGNPNEETAFVLDYFGIQAPPV
Sbjct: 9780  MSKILVFGHQNPDSDAIGSSMAYAYLKRQLGVDAQAVALGNPNEETAFVLDYFGIQAPPV 9959

Query: 61    VKSAQAEGAKQVILTDHNEFQQSXXXXXXXXXXXXXXXXXXXNFETANPLYMRLEPVGSA 120
             VKSAQAEGAKQVILTDHNEFQQS                   NFETANPLYMRLEPVGSA
Sbjct: 9960  VKSAQAEGAKQVILTDHNEFQQSIADIREVEVVEVVDHHRVANFETANPLYMRLEPVGSA 10139

Query: 121   SSIVYRLYKENGVAIPKEIAGVMLSGLISDTLLLKSPTTHASDPAVAEDLAKIAGVDLQE 180
             SSIVYRLYKENGVAIPKEIAGVMLSGLISDTLLLKSPTTHASDPAVAEDLAKIAGVDLQE
Sbjct: 10140 SSIVYRLYKENGVAIPKEIAGVMLSGLISDTLLLKSPTTHASDPAVAEDLAKIAGVDLQE 10319

Query: 181   YGLAMLKAGTNLASKTAAQLVDIDAKTFELNGSQVRVAQVNTVDINEVLERQNEIEEAIK 240
             YGLAMLKAGTNLASKTAAQLVDIDAKTFELNGSQVRVAQVNTVDINEVLERQNEIEEAIK
Sbjct: 10320 YGLAMLKAGTNLASKTAAQLVDIDAKTFELNGSQVRVAQVNTVDINEVLERQNEIEEAIK 10499

Query: 241   ASQAANGYSDFVLMITDILNSNSEILALGNNTDKVEAAFNFTLKNNHAFLAGAVSRKKQV 300
             ASQAANGYSDFVLMITDILNSNSEILALGNNTDKVEAAFNFTLKNNHAFLAGAVSRKKQV
Sbjct: 10500 ASQAANGYSDFVLMITDILNSNSEILALGNNTDKVEAAFNFTLKNNHAFLAGAVSRKKQV 10679

Query: 301   VPQLTESFNG 310
             VPQLTESFNG
Sbjct: 10680 VPQLTESFNG 10709

1: AF051356.
Streptococcus muta...[gi:2952523] LOCUS AF051356 11202 bp DNA BCT 09-DEC-1999 DEFINITION Streptococcus mutans YtqB (ytqB) gene, partial cds; ABC transporter (abcX), putative permease (perM), putative hemolysin (hlyX), pyruvate-formate lyase activating enzyme (pflC), D-alanine-D-alanyl carrier protein ligase (dltA), integral membrane protein (dltB), D-alanyl carrier protein (dltC), extramembranal protein (dltD), and putative exopolyphosphatase (ppx1) genes, complete cds; and unknown gene. ACCESSION AF051356 VERSION AF051356.1 GI:2952523 KEYWORDS . SOURCE Streptococcus mutans. ORGANISM Streptococcus mutans Bacteria; Firmicutes; Bacillus/Clostridium group; Streptococcaceae; Streptococcus. REFERENCE 1 (bases 1 to 11202) AUTHORS Boyd,D.A., Hamilton,I.R., Cvitkovitch,D.G. and Bleiweis,A.S. TITLE Defects in D-alanyl-lipoteichoic acid synthesis in Streptococcus mutans leads to acid sensitivity JOURNAL Unpublished REFERENCE 2 (bases 1 to 11202) AUTHORS Boyd,D.A., Hamilton,I.R., Cvitkovitch,D.G. and Bleiweis,A.S. 
TITLE Direct Submission JOURNAL Submitted (27-FEB-1998) Oral Biology, University of Manitoba, 780 Bannatyne Avenue, Winnipeg, MB R3E 0W2, Canada FEATURES Location/Qualifiers source 1..11202 /organism="Streptococcus mutans" /strain="LT11" /db_xref="taxon:1309" gene <1..252 /gene="ytqB" CDS <1..252 /gene="ytqB" /note="similar to Bacillus subtilis YtqB protein" /codon_start=1 /transl_table=11 /product="YtqB" /protein_id="AAC05769.1" /db_xref="GI:2952524" /translation="SADKSVITQPATTLTAIKKILERLEIGGRLAIMVYYGHEGGDKE KYAVLNFVKELDQQHFTVMLYQPLNQINTPPFLVMIEKL" terminator 269..334 /evidence=not_experimental gene complement(387..1180) /gene="abcX" CDS complement(387..1169) /gene="abcX" /codon_start=1 /transl_table=11 /product="ABC transporter" /protein_id="AAC05770.1" /db_xref="GI:2952525" /translation="MALISMKNVTLKKQGKILLNNLNWKVKKGENWVILGLNGSGKTT LLKLIMAEYWSTQGQVEILNTRFGQGDIPNMRTKIGVVGSFIAERLPANMLAEKIVLT GKYKSSILYKEYDETELNEARQMLTVIGGKHLLGRIYSSLSQGEKQLLLIARSLMEDP EIIILDEATSGLDLFAREKLLTQVEKITELPHAPTILYVTHHAEEITDKMSHILLLRR GKIVAQGPKKDIITPQVLENFYESPVNIISIDDKRFFIKPQV" RBS complement(1176..1180) /gene="abcX" RBS 1235..1239 /gene="perM" gene 1235..2340 /gene="perM" CDS 1246..2340 /gene="perM" /note="PerM" /codon_start=1 /transl_table=11 /product="putative permease" /protein_id="AAC05771.1" /db_xref="GI:2952526" /translation="MFKSSKLFFWTVEILLVTLILFIWRQMGSIFNPFFSVAKTFFLP FLLGGFLYYITNPIVTFLENRFKIKRIWGITLIFAVLLSLLVFSITSLIPNLINQLTD LISASQNIYVGLQDLFNEWKSNPAFKNIDIPVLLKQFNLSYVDILTNVLDSVTVSVSS IVYMITNTVMILVLTPVILFYLLKDKDGLMPMLDRTILKNDRHNISQLLNQMNKTISR YISGVAIDAAFIFVFALIGYQIMGVQYAFLFALVAGITNVIPYVGPYLGLTPVVLAYV VSDPKKMIIAIIYIMTLQQIDGNIVYPRVVGSTMKIHPLTIMVLLVLGGNIAGLVGML VAVPAYAIIKEIVKFLVGVYDYHKKNKIVL" RBS 2362..2369 /gene="hlyX" gene 2362..3713 /gene="hlyX" CDS 2376..3713 /gene="hlyX" /note="HlyX" /codon_start=1 /transl_table=11 /product="putative hemolysin" /protein_id="AAC05772.1" /db_xref="GI:2952527" /translation="MEDPGSQSLILQFLLLLILTLCNAFFSATEMALVSLNRARVEQK 
AEEGEKKYIRLLKVLENPNNFLSTIQVGITLITLLSGASLADSLGREIAVWFGNSATA RTAGSLISLAFLTYISIVLGELYPKRIAMNLKENLAVLSAPVIIFLGKVVSPFVWLLS VSTNLLSRLTPMTFDDADEKMTRDEIEYMLTNSEETLDADEIEMLQGVFSLDELMARE VMVPRTDAFMVDINDDSSDIIQTILNERFSRIPVYDDDKDKIIGIIHTKNLLNAGFKE GFDHINLRRILQEPLFVPETIVVNDLLTALKNTQNQMAILLDEYGGVAGLVTLEDLLE EIVGEIDDETDKTAISVREIADNTYIVLGTMTLNDFNEYFETDLESDNVDTIAGFYLT GVGTIPSQEEKEHFEVESNGKHLELINDKVKDGRVTKLKILVSEVEEKEDEKD" RBS 3774..3781 /gene="pflC" gene 3774..4579 /gene="pflC" CDS 3788..4579 /gene="pflC" /EC_number="1.97.1.4" /note="PflC" /codon_start=1 /transl_table=11 /product="pyruvate-formate lyase activating enzyme" /protein_id="AAC05773.1" /db_xref="GI:2952528" /translation="MIEKVDYEKVTGLVNSTESFGSVDGPGIRFVVFMQGCQMRCQYC HNPDTWAMKNDRATERTAGDVFKEALRFKDFWGDTGGITVSGGEATLQMDFLIALFSL AKEKGIHTTLDTCALTFRNTPKYLEKYEKLMAVTDLVLLDIKEINPDQHKIVTGHSNK TILACARYLSDIGKPVWIRHVLVPGLTDRDEDLIKLGEYVKTLKNVQRFEILPYHTMG EFKWRELGIPYPLEGVKPPTPDRVRNAKKLMHTETYEEYKKRINH" terminator 4998..5033 /evidence=not_experimental RBS 5386..5391 /gene="dltA" gene 5386..6947 /gene="dltA" CDS 5397..6947 /gene="dltA" /note="DltA" /codon_start=1 /transl_table=11 /product="D-alanine-D-alanyl carrier protein ligase" /protein_id="AAC05774.1" /db_xref="GI:2952529" /translation="MANKKIKDMIATIENFAQEQAEFPVYNILGEIHTYGELKADSDS LAAHLDQLDLTAKSPVVVFGGQEYAMLASFVALTKSGHAYIPIDHHSALERIEAILEV AEPSLVIAVDDFPIDNLQVPVIQYSQLEEIFKQKLSYQINHAVKGDDTYYIIFTSGTT GKPKGVQISHDNLLSFTNWMINAEAFATPHRPQMLAQPPYSFDLSVMYWAPTLALGGT LFALPKEITADFKQLFTTINQLPIGVWTSTPSFVDMAMLSDDFNAQQLPHLTHFYFDG EELTVKTAKKLRQRFPQARIVNAYGPTEATVALSALAVTDKMLETCKRLPIGYTKPDS PTFIIDESGHKLANGQQGEIIVSGPAVSKGYLNNPERTAAAFFEFEGLPAYHTGDLGS MTDEGLLLYGGRMDFQIKFNGYRIELEEVSQNLNKSQYIASAVAVPRYNKDHKVQNLL AYVVLKDGVEEQFERALDITKAIKADLQDVMMDYMMPSKFLYRKDLPLTPNGKIDIKG LMSEVNKK" RBS 6930..6945 /gene="dltB" gene 6930..8206 /gene="dltB" CDS 6944..8206 /gene="dltB" /note="DltB" /codon_start=1 /transl_table=11 /product="integral membrane protein" /protein_id="AAC05775.1" /db_xref="GI:2952530" 
/translation="MIDFFKNLPHLEAYGNPQYFFYIILAVLPIFIGLFFKKRFPLYE AFVSLIFIVLMLTGEKSHQIFALFFYIIWQIFCVYSYKFYRKSRDNKWIFYLHVFMSI LPLSLVKITPAIWTNQQSLFGFLGISYLTFRSVGMIMEMRDGVLTSFTFWEFIRFMLF MPTFSSGPIDRFRRFNDDYEKIPDKDELLDMLEQSVHYIMLGFFYKFVLAQILGTMIL PGLKEMALQKGGWFNWPTLGVMYVYGLDLFFDFAGYSMFAIAISNFMGIKSPTNFNQP FKSQDLKEFWNRWHMSLSFWFRDFVFMRLVKVLVKNKVFKNRNVTSSVAYIVNMLIMG FWHGVTWYYITYGLFHGVGLVLNDAWLRKKKRLNKERKAKNLSPLPENGWTRALGIVI TFNVVMLSFLIFSGFLNDLWFADQLSKK" RBS 8208..8212 /gene="dltC" gene 8208..8460 /gene="dltC" CDS 8221..8460 /gene="dltC" /note="DltC" /codon_start=1 /transl_table=11 /product="D-alanyl carrier protein" /protein_id="AAC05776.1" /db_xref="GI:2952531" /translation="MDIKSEVLKIIDELFMEDVSDMMDEDLFDAGVLDSMGTVELIVE LENHFDITVPVSEFGRDDWNTANKIIEGITELRNA" RBS 8440..8445 /gene="dltD" gene 8453..9718 /gene="dltD" CDS 8453..9718 /gene="dltD" /note="DltD" /codon_start=1 /transl_table=11 /product="extramembranal protein" /protein_id="AAC05777.1" /db_xref="GI:2952532" /translation="MLKRLWLILGPVFCALVLVFSLIMFYPAKHLSHNYNEEKNDAVA LSPSSFKSTNKKMRALSDKRHLFVPFFGSSEWQRIDSMHPSVLAERYNRSYRPYLLGQ KGSTSLSHYFGMQQIGNQIKNKKAVYVISPQWFVPKGTSPIAFQQYFSSEQLADFLLN QTGSIADRYAAKRLLDIKPSSNLQGMIKKIAAGKTLNSFDRASLRLIKSFLKKEDALF GSLTFSDNYERRVLPHVKKLPKHFSYGTLSQIASKNGQRLTKTNQFEINDHFYNKRIK GQLKRLKGFQKQLSYLQSPEYNDLQLALTQLAKSKTFVIFVIPPVNAKWVEYTGLSQD MYQKTVEKIKYQLQSQGFDNIADLSKNGDQPYFMQDTIHLGWNGWLAFDKEVNPFLSK KQLQPAYKINNHFLSKKWATYTGNPFQFK" gene 9766..10712 /gene="ppx1" RBS 9766..9771 /gene="ppx1" CDS 9780..10712 /gene="ppx1" /note="Ppx1" /codon_start=1 /transl_table=11 /product="putative exopolyphosphatase" /protein_id="AAC05778.1" /db_xref="GI:2952533" /translation="MSKILVFGHQNPDSDAIGSSMAYAYLKRQLGVDAQAVALGNPNE ETAFVLDYFGIQAPPVVKSAQAEGAKQVILTDHNEFQQSIADIREVEVVEVVDHHRVA NFETANPLYMRLEPVGSASSIVYRLYKENGVAIPKEIAGVMLSGLISDTLLLKSPTTH ASDPAVAEDLAKIAGVDLQEYGLAMLKAGTNLASKTAAQLVDIDAKTFELNGSQVRVA QVNTVDINEVLERQNEIEEAIKASQAANGYSDFVLMITDILNSNSEILALGNNTDKVE AAFNFTLKNNHAFLAGAVSRKKQVVPQLTESFNG" terminator 10725..10746 
/evidence=not_experimental RBS 10889..10894 CDS 10919..>11202 /note="Orf11" /codon_start=1 /transl_table=11 /product="unknown" /protein_id="AAC05779.1" /db_xref="GI:2952534" /translation="MIKIYNGDKLTRQPFFIKLINYLQIHDDVTLRQIKRNFADTEHL ERSIEDYVQAGYVLRENKHYYNAFELLENLDGLTLDSQIFVDDQSSIYQDL" BASE COUNT 3453 a 1752 c 2160 g 3837 t ORIGIN 1 tcggcagaca agtcagtcat tactcagcct gctacaaccc tgacagctat taaaaagatt 61 ttagagagat tagaaattgg cggtcgtttg gcaattatgg tatattatgg tcatgagggt 121 ggcgataagg aaaaatatgc ggttctgaac tttgttaaag agctagatca acagcatttt 181 acagtcatgc tttatcaacc cttaaatcaa ataaataccc cacccttttt ggtgatgata 241 gagaaattat aaaaattttc attaaaaagc gatataaaaa agcttaaaat taaacgtttc 301 tgacaactta gttttaagct tttttagata tcacaatatt cttattttgg attagaaatt 361 gaaattttgc ccagactctt ttttgttcat acttgcggtt tgataaaaaa acgtttgtca 421 tcaatggaaa tgatattgac gggactttca taaaaatttt caaggacttg aggtgtaata 481 atatcttttt taggaccttg agccactatt ttacctctac gaaggaggag gatatgactc 541 attttatcag tgatttcttc agcatggtgg gtaacataaa ggatagttgg agcatgtggt 601 aactcagtaa tcttttcaac ttgtgttagc aatttttcac gggcaaaaag atccagtccg 661 ctggttgctt catccaaaat aatgatttca ggatcttcca taaggctgcg cgcaataagg 721 aggagttgtt tttcaccttg tgagaggctg ctatagatgc gaccaagcaa gtgttttccg 781 ccgatgacag taagcatttg gcgtgcttca ttaagttctg tttcgtcgta ttccttgtag 841 agaatgcttg atttgtattt accagttagc acgatctttt cagccaacat atttgcaggg 901 agtcgctcag caataaaaga gcccacgaca ccgattttag tccgcatatt gggaatatca 961 ccctgaccaa acctagtatt gagaatttca acctgtcctt gtgttgacca atactccgcc 1021 ataataagtt ttaaaagagt ggtttttcca gaaccgttaa gacctagaat aacccaattt 1081 tctccctttt tgactttcca attaagattg ttgagcaaga ttttgccctg tttttttaag 1141 gtcacatttt tcatggaaat gagtgccatt ttttacctct ttgttgctta cttgttattt 1201 attataacat aaaaatatgt taaaataagc gcaaaaggag ggagtatgtt taagagtagt 1261 aaattgtttt tctggacagt agaaatttta ctggtcactt taattctttt tatctggcgg 1321 caaatgggaa gtatctttaa tccttttttc agtgttgcta aaacgttttt tcttcctttt 1381 ttattgggag gttttcttta ttatattact aatcctattg 
tcactttttt agaaaaccgt 1441 tttaaaatta agcgtatttg ggggatcact cttatttttg ctgtattgct ttccttgctg 1501 gttttttcta ttaccagtct gattcccaat ttgattaatc agctaacaga tcttatttca 1561 gccagccaaa atatttatgt gggtttgcag gatttattca atgaatggaa aagcaatcct 1621 gcctttaaaa atattgatat ccctgttctt ttaaaacagt tcaatttatc ttatgttgat 1681 attttgacaa atgttttgga tagcgtgaca gttagtgtct caagtattgt ttatatgatt 1741 acaaatacgg tgatgattct ggttcttaca cccgttattc ttttttatct cctcaaggac 1801 aaagatggtt taatgcccat gttagatcgt actatattga aaaatgatag gcataatatc 1861 agtcaattac tgaatcaaat gaacaaaacc atttctcgtt atattagtgg tgtagctatt 1921 gatgctgcct tcatatttgt ttttgcttta attggttatc agattatggg cgtccagtat 1981 gctttcttat ttgctttagt tgctgggatt actaatgtta ttccttatgt tggcccttac 2041 ttaggtttga caccagttgt tttagcttat gtggtcagcg atcctaagaa gatgattatt 2101 gctattattt atattatgac cttgcagcaa attgatggga atattgtcta cccacgtgtt 2161 gtaggaagta ccatgaaaat tcatccttta acaataatgg ttctcttggt gttaggtggc 2221 aatattgctg gtttggttgg tatgttagtt gctgtaccag cttatgctat cattaaagaa 2281 attgtaaaat tccttgtagg tgtttatgat taccacaaga aaaataaaat tgtgttataa 2341 tagactaact atattaattt taggagagat aactaatgga agaccccggt agccagtcct 2401 tgatcttaca atttttatta ttactgattt taacgctttg caatgctttt ttctcagcta 2461 ctgaaatggc tcttgtatct cttaatcgtg cacgcgttga acaaaaggct gaagagggcg 2521 aaaaaaagta tatccgtctg ctaaaagttt tggaaaatcc taataatttt ctatcaacca 2581 ttcaggttgg tattactctt attacccttt tatccggagc gagtttggca gattctttag 2641 ggcgtgaaat tgctgtttgg tttggcaatt cagctactgc cagaacggca ggaagtctta 2701 tttctttagc ttttttgacc tatatttcta ttgttttggg tgaattgtat cctaaacgta 2761 ttgctatgaa tctcaaggaa aatttagctg tgctgtcagc acctgttatt atttttcttg 2821 gcaaagtggt tagccctttt gtttggctgc tttcggtctc aacaaatcta ttaagtcgcc 2881 tcactcctat gacctttgat gatgccgatg agaaaatgac ccgtgatgaa attgaataca 2941 tgttgacaaa cagtgaagag actctagatg ctgatgaaat tgaaatgctg caaggtgtct 3001 tttcactaga tgaattgatg gcacgtgaag tgatggttcc tcgtacagat gcttttatgg 3061 tggatattaa tgatgattct agtgatatta tccaaaccat tctcaatgaa 
agattttcgc 3121 ggattcctgt ttacgatgat gataaagata agattattgg aatcattcat actaaaaatt 3181 tattgaatgc tggtttcaag gaaggttttg atcacatcaa tcttcgccgt attttgcaag 3241 agccgctttt tgtaccagaa actattgttg taaatgacct tttgaccgct ttaaaaaata 3301 cccaaaatca gatggctatc ctacttgatg agtatggcgg tgtggcaggt cttgtcacct 3361 tagaagactt attagaagaa attgttggtg aaattgatga tgagacagac aaaacagcca 3421 tttctgttcg tgaaattgca gataatacct atattgtttt gggaacaatg acacttaatg 3481 attttaatga gtactttgaa actgaccttg aaagtgataa tgttgatacg attgctggtt 3541 tttatctgac tggtgtcgga acaattccaa gtcaagaaga aaaagaacac tttgaagtag 3601 agagtaatgg caaacacctt gaactgatta atgataaagt taaggatggt cgggttacca 3661 aactgaagat tctcgtttca gaggttgaag aaaaggaaga tgaaaaagac taaacataat 3721 gttagtcttt tttactttta aagtgttata ataataatca gaaaaacgtt tacaggaaag 3781 aaaaatcatg atagaaaaag ttgactacga aaaagtaaca ggacttgtta attctacaga 3841 atcttttggg tctgtagacg gacctggtat acgctttgtt gtttttatgc aagggtgcca 3901 aatgcgttgt caatattgcc acaatcctga tacttgggca atgaagaatg atagagcaac 3961 agaaaggact gcaggagatg tctttaaaga agctttacgt tttaaagatt tttggggaga 4021 tacaggaggt attactgttt ctggtggtga agcaacgctc cagatggatt ttttaattgc 4081 cctcttttct ttagcaaaag aaaagggaat tcatacgacc ttggatacct gtgctctgac 4141 ttttagaaac acaccaaaat atcttgaaaa atatgaaaag ttaatggctg tcactgattt 4201 agtattgtta gatattaaag agattaatcc tgaccaacat aaaattgtca ctggtcatag 4261 caataaaact attttagctt gtgcgcgtta tttatctgat attggaaaac ctgtttggat 4321 tcgccatgtc ttagtccctg gtctgactga tcgggatgaa gacttaataa agttgggtga 4381 gtatgtcaaa acactgaaga atgttcaacg gtttgaaatt cttccttatc atacaatggg 4441 tgaattcaaa tggcgtgaat tagggattcc ttatcctttg gaaggtgtta aaccgccaac 4501 accagatcgt gtgcgcaatg ctaaaaagtt aatgcatacg gaaacatatg aagaatataa 4561 aaagaggatt aatcattaaa agaaaaggat aaaagaaagt gaaagtttat tttgttatgc 4621 tactttttca agtttggtga agttaataag tcagtcagct tatccaatta aataatagtc 4681 ttgtttggaa acaaattaat ctttttatgg ctttttttaa gtaaggagag cgctaagctc 4741 atttaggtga cgttgtcatc ttagtagacc tatccgtcta caaaataaaa gcgatgggaa 
4801 tactgaccag ttgctgctgt gtcacagtat caataccttt ggtaacaata ctggcaaatt 4861 agagattaca agtttcgatt ttagcaataa aatgacggag tcggttgctc gcgtgctgat 4921 cctctaaagc taactggcaa aaaaagtcaa tcaatagatg caataggctt ctaaacaaat 4981 ctaagaaaaa gcttgaaagt tgcttttaga aagagctatt tcaagctttt tctttttaaa 5041 ttgtgattag aaaaaataaa agaatagaca gtgtattgag agataattat tagagcagag 5101 cgaactgcta tttaacaatt ttttgtcaaa aaaacaataa atttcctatc tttatagcat 5161 atttttgata taatggactt tgttattgaa ttattttaag gtatatttaa ataaagtatg 5221 gcaatctgaa tgaggaggct gtttgataaa ttaagatgaa aaaacgaaga acaatctata 5281 aatttttatt acaaacttta ttttattccg ttatattttt aattttactc tatttcttta 5341 gttaccttgg tcaaggtcag ggagaattta tctacaacga attttaggaa gtaacaatgg 5401 ctaataaaaa aataaaagat atgattgcaa caattgaaaa ttttgctcaa gaacaggcag 5461 aatttccggt ttataatatt ttaggagaaa tccataccta tggagaatta aaagctgatt 5521 ctgattcgct tgcagctcat cttgatcagt tagatttaac agcaaaatca ccagtagttg 5581 tctttggagg acaggaatat gccatgctgg ctagttttgt tgctctgaca aaatcagggc 5641 atgcctatat tcctattgat catcattcag ccttagaacg tattgaggct attttagagg 5701 tagcagagcc aagtttagtt attgctgttg atgatttccc aattgacaat cttcaagtcc 5761 cagtaattca gtatagtcaa ttagaagaga tttttaaaca aaagctatct tatcaaatca 5821 atcatgcggt taagggagat gatacctact atatcatctt tacttcaggg acaactggta 5881 aacctaaagg agtacagatt tcacatgaca atctgcttag ttttactaat tggatgatta 5941 atgcagaagc ttttgcaaca cctcataggc cgcaaatgct ggcacaaccg ccttactctt 6001 ttgatttatc agtgatgtac tgggcgccaa cattggcttt aggtggaacc ctttttgctc 6061 ttcctaaaga aataactgca gatttcaaac aattatttac aactattaac caattaccca 6121 ttggtgtgtg gacatcaaca ccttcctttg ttgatatggc tatgctgtca gatgacttta 6181 atgcacagca attgcctcat ttaactcatt tctattttga cggagaagag ttgacggtta 6241 agacggctaa aaaattgcgt cagcgttttc cgcaagcaag aattgtcaac gcttatgggc 6301 caacagaagc aactgttgct ttatcagctt tggctgtcac tgataaaatg cttgaaacat 6361 gcaaacgtct gccaattggc tatacaaaac cagattcgcc aacctttatt attgatgagt 6421 caggtcataa actggcaaat ggtcagcaag gagagattat tgtttccggt ccggcagtct 6481 
ctaaggggta tctcaataat cctgaacgaa cagcagcagc tttctttgaa tttgaaggtt 6541 tgccagctta tcatactggt gatttgggaa gtatgacaga tgaaggtctc ttgctctatg 6601 gcggtcgtat ggattttcag attaaattca atggctatcg tattgagttg gaagaagtct 6661 ctcaaaatct taacaaatcg caatatatcg catctgctgt agctgttccc cgttataata 6721 aagaccataa ggtgcaaaat cttttggctt atgttgtctt aaaagatggt gtagaagagc 6781 aatttgagag agcacttgac attaccaaag ctattaaggc tgatttgcaa gatgttatga 6841 tggattacat gatgccttct aagtttttgt accgtaaaga tttaccttta acacctaatg 6901 gtaaaattga tattaaaggt ttgatgagtg aggtaaacaa aaaatgattg attttttcaa 6961 gaatcttcct cacttagaag cttatggaaa tcctcaatat tttttctata ttattttggc 7021 tgtcttacca atttttatag gcctcttttt taagaaacgc tttcctcttt atgaggcttt 7081 tgttagtctg atttttattg ttttaatgtt gacaggtgaa aagtcgcatc aaatctttgc 7141 cttgtttttc tatatcattt ggcaaatttt ctgtgtctat agttataaat tttatagaaa 7201 atcacgggat aataagtgga ttttttatct tcatgtcttc atgtctatct tacctttatc 7261 tttggtaaag attactcctg cgatttggac aaatcaacaa tctttatttg gttttttggg 7321 tatatcctat cttacctttc gttcagtagg tatgattatg gaaatgcgag acggtgttct 7381 cacgtcattt acattttggg aatttatccg ttttatgctg tttatgccca ctttttcaag 7441 tgggcccatt gatcgtttca gaagatttaa tgatgattat gagaagattc ctgataaaga 7501 tgaattgcta gatatgttgg aacaatctgt tcactatatc atgcttggtt ttttctataa 7561 gtttgtttta gcgcaaatat tgggaacaat gattttaccg ggtttgaaag aaatggcctt 7621 gcaaaaaggt ggttggttca attggccgac tttaggagtc atgtatgttt atggcctaga 7681 cttatttttt gactttgctg gttattcaat gtttgccata gctatctcta actttatggg 7741 aatcaagagt ccaactaatt ttaaccaacc gtttaaatca caagatttga aggaattttg 7801 gaatcgctgg catatgagtt tgtctttttg gtttagagac tttgttttta tgcgtcttgt 7861 caaagtttta gtcaaaaata aggtcttcaa aaatcgcaat gttacatcaa gtgttgctta 7921 tattgtgaat atgctgatta tgggattttg gcatggcgta acttggtatt atattaccta 7981 tggtctattt cacggtgtcg gtttagtact caacgacgct tggcttcgta aaaagaaaag 8041 actgaacaaa gagagaaaag caaagaattt atcacctttg ccagaaaatg gctggacaag 8101 agcgttaggt attgtcatta cctttaatgt tgtcatgtta tcatttctga ttttctcagg 8161 attcttaaat 
gacttgtggt ttgcagatca attatcaaag aaataaaagg agtatttaca 8221 atggatataa aatcagaggt tttaaaaatt attgatgagc tttttatgga agatgtatca 8281 gatatgatgg atgaagatct atttgatgca ggtgttttag atagcatggg aactgttgaa 8341 ttaattgttg aattggaaaa tcattttgat atcactgttc ctgtttctga atttggtcgt 8401 gatgattgga atacagccaa taaaattatt gagggtataa cggagttacg aaatgcttaa 8461 acggttatgg ttgattttgg ggccagtttt ctgtgcccta gttttggttt tttctttaat 8521 tatgttctat ccagctaaac atttaagtca taattataat gaagaaaaaa acgatgctgt 8581 agctttatca cctagtagtt ttaaaagtac caataaaaaa atgcgtgcat tgtcagacaa 8641 aaggcacctc tttgtcccct tttttggttc tagtgagtgg caacgtattg atagtatgca 8701 tccatctgtt ctagcagagc gttataatcg ttcttatcgt ccctatttac tgggacaaaa 8761 agggtcaacc tctttgtcac attattttgg catgcagcaa attggtaatc agattaaaaa 8821 taaaaaggct gtctatgtta tttctcctca gtggttcgtt ccaaagggga caagtcctat 8881 cgcttttcag cagtatttta gttctgaaca attagctgat tttctcttaa atcaaacagg 8941 aagtatagcg gatcgttatg cagctaaacg tttattagac attaaaccga gttcgaattt 9001 gcaaggtatg ataaaaaaaa ttgcggctgg taaaacctta aatagctttg atagggcaag 9061 cctgcgcctt attaagagtt tcttgaaaaa agaagacgct ttatttggaa gtctgacctt 9121 tagtgataat tatgaacgtc gtgtattgcc gcatgtcaaa aaattgccca agcacttttc 9181 ttatggaacc ttaagtcaaa ttgctagcaa aaatggtcaa aggttaacaa aaacaaatca 9241 atttgaaatt aatgatcatt tttataataa acgtattaaa ggacaattga aaagactcaa 9301 aggcttccaa aagcaactgt cttatttaca gtctccagaa tacaatgatt tacagctggc 9361 gttaactcaa ttagcaaagt caaagacctt tgtcatattt gttattccgc cggttaatgc 9421 caaatgggtt gaatatacag gtctaagcca agatatgtat caaaagacgg ttgaaaaaat 9481 aaaatatcag ttacagagcc aaggttttga taacattgct gatttatcaa aaaatggtga 9541 tcagccttat ttcatgcagg ataccattca tcttggttgg aatggttggc tggcttttga 9601 taaagaggtc aatcccttct tatccaagaa acagctccaa ccggcttata agattaataa 9661 tcacttttta agtaaaaagt gggccactta tactggtaat ccttttcagt ttaagtgaga 9721 cctttttgcc taatgaaatc aaatttggta aaatgagagg aagaaaagag gtaattgtta 9781 tgtctaaaat tttagttttt ggtcatcaaa atcccgattc agatgcaatc ggttcatcaa 9841 tggcttatgc atatttaaaa 
cgtcaactcg gcgtagatgc tcaagcagta gctttaggaa 9901 atccaaatga ggagacagct tttgtacttg actattttgg tattcaagca ccgcctgttg 9961 ttaagtcagc tcaagcagaa ggtgccaaac aagtcatctt aacagaccat aatgaatttc 10021 agcaatctat cgcagatatc cgcgaagttg aagtcgttga agttgtagat catcatcgtg 10081 tcgctaattt tgaaacagca aatcctcttt atatgcgttt agaaccagtt ggatctgctt 10141 catcaatcgt ttatcggctt tacaaggaaa atggtgttgc tattcctaag gaaattgcag 10201 gagtgatgct gtcaggtttg atttcagata cacttctttt gaaatcacca acgactcatg 10261 cctctgatcc agctgttgca gaagatttag ccaaaattgc aggtgttgac ttgcaggaat 10321 atggtttggc tatgcttaag gctggtacca atttagcaag taaaacggct gcacaacttg 10381 ttgatattga tgctaaaaca tttgaactta atggtagtca agtacgtgta gctcaagtca 10441 atacggttga tatcaatgaa gttttggaac gtcaaaatga aattgaagaa gccattaaag 10501 catcacaagc agctaatgga tactctgatt ttgtcttgat gattactgat attttaaatt 10561 caaattcgga aatcttagca ttgggcaaca atactgataa agtagaagca gcctttaatt 10621 tcacacttaa aaataaccat gccttcttag caggtgctgt atcacgtaag aaacaagttg 10681 tacctcaatt aacggaaagt tttaatgggt aaaaaattta aatgaaagac ggtgagagct 10741 gtcttttttc tttgtaaact tttttggagt acattataga tgaatcaatg caactattgt 10801 tctaatcctc attttttata aaaaaattga agatactatc tgttgttttc gtatcttaat 10861 tggccatact tttatggaaa attagtctag aagaccgaat ttttgctata atattttcat 10921 gattaaaatt tataatggcg ataaattaac gcgtcagcca tttttcataa aacttatcaa 10981 ctatttgcaa atacatgatg atgtcacttt gcggcagatt aagaggaatt ttgctgatac 11041 tgagcatcta gaacgatcta ttgaggacta tgtgcaggcg ggttatgttt tacgtgaaaa 11101 taaacattat tacaatgctt ttgagctgtt agagaacctt gatggtttga cactggatag 11161 tcagattttt gtcgatgacc agtcatctat ttatcaagat ct // >gi|1146335|gb|L43374.1|ECOADHESIN Escherichia coli adhesin 20K (ADH20K) gene, complete cds Length = 1109 Score = 327 bits (829), Expect = 2e-88 Identities = 156/156 (100%), Positives = 156/156 (100%) Frame = +2 Query: 1 AVSFIGSTENDVGPSQGSYSSTHAMDNLPFVYNTGYNIGYQNANVWRISGGFCVGLDGKV 60 AVSFIGSTENDVGPSQGSYSSTHAMDNLPFVYNTGYNIGYQNANVWRISGGFCVGLDGKV Sbjct: 77 
AVSFIGSTENDVGPSQGSYSSTHAMDNLPFVYNTGYNIGYQNANVWRISGGFCVGLDGKV 256 Query: 61 DLPVVGSLDGQSIYGLTEEVGLLIWMGDTNYSRGTAMSGNSWENVFSGWCVGNYVSTQGL 120 DLPVVGSLDGQSIYGLTEEVGLLIWMGDTNYSRGTAMSGNSWENVFSGWCVGNYVSTQGL Sbjct: 257 DLPVVGSLDGQSIYGLTEEVGLLIWMGDTNYSRGTAMSGNSWENVFSGWCVGNYVSTQGL 436 Query: 121 SVHVRPVILKRNSSAQYSVQKTSIGSIRMRPYNGSS 156 SVHVRPVILKRNSSAQYSVQKTSIGSIRMRPYNGSS Sbjct: 437 SVHVRPVILKRNSSAQYSVQKTSIGSIRMRPYNGSS 544 1: L43374. Escherichia coli a...[gi:1146335] LOCUS ECOADHESIN 1109 bp DNA BCT 03-JAN-1996 DEFINITION Escherichia coli adhesin 20K (ADH20K) gene, complete cds. ACCESSION L43374 VERSION L43374.1 GI:1146335 KEYWORDS adhesin. SOURCE Escherichia coli (strain 31A/o6) DNA. ORGANISM Escherichia coli Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia. REFERENCE 1 (bases 1 to 1109) AUTHORS Martin,C., Rousset,E. and De Greve,H. TITLE Human uropathogenic and bovine septicemic E.coli strains carry an identical F17-related adhesin JOURNAL Unpublished FEATURES Location/Qualifiers source 1..1109 /organism="Escherichia coli" /strain="31A/o6" /db_xref="taxon:562" gene 11..1042 /gene="ADH20K" CDS 11..1042 /gene="ADH20K" /note="putative" /codon_start=1 /transl_table=11 /product="adhesin 20K" /protein_id="AAA85086.1" /db_xref="GI:1146336" /translation="MTNFYKVFLAVFILVCCNISHAAVSFIGSTENDVGPSQGSYSST HAMDNLPFVYNTGYNIGYQNANVWRISGGFCVGLDGKVDLPVVGSLDGQSIYGLTEEV GLLIWMGDTNYSRGTAMSGNSWENVFSGWCVGNYVSTQGLSVHVRPVILKRNSSAQYS VQKTSIGSIRMRPYNGSSAGSVQTTVNFSLNPFTLNDTVTSCRLLTPSAVNVSLAAIS AGQLPSSGDEVVAGTTSLKLQCDAGVTVWATLTDATTPSNRSDILTLTGASTATGVGL RIYKNTDSTPLKFGPDSPVKGNENQWQLSTGTETSPSVRLYVKYVNTGEGINPGTVNG ISTFTFSYQ" BASE COUNT 309 a 224 c 296 g 280 t ORIGIN 1 aggcaataat atgacaaatt tttataaggt ctttctggct gtattcattc tggtttgctg 61 caatatcagc catgcggcag tttcatttat tggcagtacg gagaatgatg ttggaccgtc 121 tcagggctct tattccagca ctcatgcaat ggataacctg ccatttgtct ataataccgg 181 ttacaacatt ggatatcaga atgcaaatgt ctggcgtatt agtggcgggt tttgtgttgg 241 tctggacggg aaagtggatt tacccgtggt 
tggcagtctt gacgggcaga gtatttatgg 301 gctgacggag gaggtgggac tccttatatg gatgggggac acgaattatt ccaggggtac 361 cgcgatgagt ggaaactcat gggagaatgt cttttccgga tggtgcgtgg gaaattatgt 421 atcaacgcag ggactgtctg ttcacgtaag accggtaatt ttaaaaagaa attcctctgc 481 gcaatacagt gtacagaaaa ccagtatcgg gagtatcaga atgaggccct ataacggttc 541 atctgcaggc agtgttcaga ccacagtgaa tttcagcctg aatccattta cgctgaatga 601 cacagtaaca tcgtgcagat tactgacacc ttccgcagtc aatgtcagcc tggctgcaat 661 ttctgccgga caactgccat catccggtga tgaagttgtt gccgggacaa catcactgaa 721 attacagtgt gatgccggag taacagtatg ggcaacactg actgatgcga ccacaccgtc 781 caacagaagc gatatactca cactgacggg ggcatcgact gcaaccggag tcgggctgag 841 aatatacaaa aacactgaca gtacgcccct gaagtttgga cctgattcgc cggtaaaggg 901 aaatgaaaac cagtggcagt tatcgacagg aacggaaacg tcaccctcag tccggttgta 961 tgtaaagtat gtgaatactg gtgagggaat taatccgggt acggttaacg gaatatcaac 1021 atttacgttt tcctatcagt aacagcgagt tccgggaggg gagagcaggt cagacagtaa 1081 taacaaaatg atttgcgttg gcgaaagca // . . . 20000519 14.20 Leszek: Dear Alexander, I am trying to find the DNA sequence for T0089. In the meantime please run the prediction for T0090 (file attached). 
Leszek T0090 >gi|1789405|gb|AE000385.1|AE000385 Escherichia coli K-12 MG1655 section 275 of 400 of the complete genome Length = 10180 Score = 390 bits (990), Expect = e-107 Identities = 197/209 (94%), Positives = 197/209 (94%) Frame = -3 Query: 1 MLKPDNLPVTFGKNDVEIIARETLYRGFFSLDLYRFRHRLFNGQMSHEVRREIFERGHAA 60 MLKPDNLPVTFGKNDVEIIARETLYRGFFSLDLYRFRHRLFNGQMSHEVRREIFERGHAA Sbjct: 5570 MLKPDNLPVTFGKNDVEIIARETLYRGFFSLDLYRFRHRLFNGQMSHEVRREIFERGHAA 5391 Query: 61 VLLPFDPVRDEVVLIEQIRIAAYDTSETPWLLEMVAGMIEEGESVEDVARREAIEEAGLI 120 VLLPFDPVRDEVVLIEQIRIAAYDTSETPWLLEMVAGMIEEGESVEDVARREAIEEAGLI Sbjct: 5390 VLLPFDPVRDEVVLIEQIRIAAYDTSETPWLLEMVAGMIEEGESVEDVARREAIEEAGLI 5211 Query: 121 VKRTKPVLSFLASPGGTSERSSIMVGEVDATTASGIHGLADENEDIRVHVVSREQAYQWV 180 VKRTKPVLSFLASPGGTSERSSIMVGEVDATTASGIHGLADENEDIRVHVVSREQAYQWV Sbjct: 5210 VKRTKPVLSFLASPGGTSERSSIMVGEVDATTASGIHGLADENEDIRVHVVSREQAYQWV 5031 Query: 181 EEGKIDNAASVIXXXXXXXXXXXXKNEWA 209 EEGKIDNAASVI KNEWA Sbjct: 5030 EEGKIDNAASVIALQWLQLHHQALKNEWA 4944 1: AE000385. Escherichia coli K...[gi:1789405] LOCUS AE000385 10180 bp DNA BCT 12-NOV-1998 DEFINITION Escherichia coli K-12 MG1655 section 275 of 400 of the complete genome. ACCESSION AE000385 U00096 VERSION AE000385.1 GI:1789405 KEYWORDS . SOURCE Escherichia coli. ORGANISM Escherichia coli Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia. REFERENCE 1 (bases 1 to 10180) AUTHORS Blattner,F.R., Plunkett,G. III, Bloch,C.A., Perna,N.T., Burland,V., Riley,M., Collado-Vides,J., Glasner,J.D., Rode,C.K., Mayhew,G.F., Gregor,J., Davis,N.W., Kirkpatrick,H.A., Goeden,M.A., Rose,D.J., Mau,B. and Shao,Y. TITLE The complete genome sequence of Escherichia coli K-12 JOURNAL Science 277 (5331), 1453-1474 (1997) MEDLINE 97426617 PUBMED 97426617 REFERENCE 2 (bases 1 to 10180) AUTHORS Blattner,F.R. TITLE Direct Submission JOURNAL Submitted (16-JAN-1997) Guy Plunkett III, Laboratory of Genetics, University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA. 
Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax: 608-263-7459 REFERENCE 3 (bases 1 to 10180) AUTHORS Blattner,F.R. TITLE Direct Submission JOURNAL Submitted (02-SEP-1997) Guy Plunkett III, Laboratory of Genetics, University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA. Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax: 608-263-7459 REFERENCE 4 (bases 1 to 10180) AUTHORS Plunkett,G. III. TITLE Direct Submission JOURNAL Submitted (13-OCT-1998) Laboratory of Genetics, University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA COMMENT This sequence was determined by the E. coli Genome Project at the University of Wisconsin-Madison (Frederick R. Blattner, director). Supported by NIH grants HG00301 and HG01428 (from the Human Genome Project and NCHGR). The entire sequence was independently determined from E. coli K-12 strain MG1655. Predicted open reading frames were determined using GeneMark software, kindly supplied by Mark Borodovsky, Georgia Institute of Technology, Atlanta, GA, 30332 [e-mail: mark@amber.gatech.edu]. Open reading frames that have been correlated with genetic loci are being annotated with CG Site Nos., unique ID nos. for the genes in the E. coli Genetic Stock Center (CGSC) database at Yale University, kindly supplied by Mary Berlyn. A public version of the database is accessible (http://cgsc.biology.yale.edu). Annotation of the genome is an ongoing task whose goal is to make the genome sequence more useful by correlating it with other data. Comments to the authors are appreciated. Updated information will be available at the E. coli Genome Project's World Wide Web site (http://www.genetics.wisc.edu). *** The E. coli K-12 sequence and its annotations are periodically updated; this is version M54. No sequence changes. Annotation updates: updated gene identifications and products; all new functional assignments courtesy of Monica Riley; added promoters, protein binding sites, and repeated sequences described in reference 1. 
The unique numeric identifiers beginning with a lowercase 'b' assigned to each gene (protein- or RNA-encoding) are now designated as gene synonyms instead of labels. This should allow them to be searched for in Entrez as gene names. FEATURES Location/Qualifiers source 1..10180 /organism="Escherichia coli" /strain="K-12" /sub_strain="MG1655" /db_xref="taxon:562" promoter complement(13..42) /note="factor Sigma70; predicted +1 start at 3170362" promoter 34..60 /note="factor Sigma70; predicted +1 start at 3170423" promoter 129..157 /note="factor Sigma70; predicted +1 start at 3170520" gene 190..771 /gene="mdaB" /note="b3028" CDS 190..771 /gene="mdaB" /function="phenotype; Not classified" /note="o193; 100 pct identical to MDAB_ECOLI SW: P40717" /codon_start=1 /transl_table=11 /product="modulator of drug activity B" /protein_id="AAC76064.1" /db_xref="GI:1789406" /db_xref="PID:g1789406" /translation="MSNILIINGAKKFAHSNGQLNDTLTEVADGTLRDLGHDVRIVRA DSDYDVKAEVQNFLWADVVIWQMPGWWMGAPWTVKKYIDDVFTEGHGTLYASDGRTRK DPSKKYGSGGLVQGKKYMLSLTWNAPMEAFTEKDQFFHGVGVDGVYLPFHKANQFLGM EPLPTFIANDVIKMPDVPRYTEEYRKHLVEIFG" gene 802..1116 /gene="ygiN" /note="b3029" CDS 802..1116 /gene="ygiN" /function="orf; Unknown" /note="o104; 100 pct identical to YGIN_ECOLI SW: P40718" /codon_start=1 /transl_table=11 /product="orf, hypothetical protein" /protein_id="AAC76065.1" /db_xref="GI:1789407" /db_xref="PID:g1789407" /translation="MLTVIAEIRTRPGQHHRQAVLDQFAKIVPTVLKEEGCHGYAPMV DCAAGVSFQSMAPDSIVMIEQWESIAHLEAHLQTPHMKAYSEAVKGDVLEMNIRILQP GI" gene complement(1164..3056) /gene="parE" /note="b3030" CDS complement(1164..3056) /gene="parE" /EC_number="5.99.1.-" /function="enzyme; DNA - replication, repair, restriction/modification" /note="f630; 100 pct identical to PARE_ECOLI SW: P20083" /codon_start=1 /transl_table=11 /product="DNA topoisomerase IV subunit B" /protein_id="AAC76066.1" /db_xref="GI:1789408" /db_xref="PID:g1789408" /translation="MTQTYNADAIEVLTGLEPVRRRPGMYTDTTRPNHLGQEVIDNSV 
DEALAGHAKRVDVILHADQSLEVIDDGRGMPVDIHPEEGVPAVELILCRLHAGGKFSN KNYQFSGGLHGVGISVVNALSKRVEVNVRRDGQVYNIAFENGEKVQDLQVVGTCGKRN TGTSVHFWPDETFFDSPRFSVSRLTHVLKAKAVLCPGVEITFKDEINNTEQRWCYQDG LNDYLAEAVNGLPTLPEKPFIGNFAGDTEAVDWALLWLPEGGELLTESYVNLIPTMQG GTHVNGLRQGLLDAMREFCEYRNILPRGVKLSAEDIWDRCAYVLSVKMQDPQFAGQTK ERLSSRQCAAFVSGVVKDAFILWLNQNVQAAELLAEMAISSAQRRMRAAKKVVRKKLT SGPALPGKLADCTAQDLNRTELFLVEGDSAGGSAKQARDREYQAIMPLKGKILNTWEV SSDEVLASQEVHDISVAIGIDPDSDDLSQLRYGKICILADADSDGLHIATLLCALFVK HFRALVKHGHVYVALPPLYRIDLGKEVYYALTEEEKEGVLEQLKRKKGKPNVQRFKGL GEMNPMQLRETTLDPNTRRLVQLTIDDEDDQRTDAMMDMLLAKKRSEDRRNWLQEKGD MAEIEV" promoter complement(3062..3088) /note="factor Sigma70; predicted +1 start at 3173411" gene complement(3085..3666) /gene="yqiA" /note="b3031" CDS complement(3085..3666) /gene="yqiA" /function="orf; Unknown" /note="f193; 100 pct identical to 134 residues of YZZI_ECOLI SW: P36653 (137 aa) but contains 56 additional C-term residues" /codon_start=1 /transl_table=11 /product="orf, hypothetical protein" /protein_id="AAC76067.1" /db_xref="GI:1789409" /db_xref="PID:g1789409" /translation="MSTLLYLHGFNSSPRSAKASLLKNWLAEHHPDVEMIIPQLPPYP SDAAELLESIVLEHGGDSLGIVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDY LGQNENPYTGQQYVLESRHIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYA SCRQTVIEGGNHAFTGFEDYFNPIVDFLGLHHL" promoter complement(3104..3131) /gene="yqiA" /note="factor Sigma70; predicted +1 start at 3173453" promoter complement(3188..3219) /gene="yqiA" /note="factor Sigma70; predicted +1 start at 3173537" gene complement(3666..4493) /gene="icc" /note="b3032" CDS complement(3666..4493) /gene="icc" /function="regulator; Degradation of small molecules: Carbon compounds" /note="f275; 100 pct identical to ICC_ECOLI SW: P36650; TTGstart" /codon_start=1 /transl_table=11 /product="regulator of lacZ" /protein_id="AAC76068.1" /db_xref="GI:1789410" /db_xref="PID:g1789410" /translation="MESLLTLPLAGEARVRILQITDTHLFAQKHEALLGVNTWESYQA VLEAIRPHQHEFDLIVATGDLAQDQSSAAYQHFAEGIASFRAPCVWLPGNHDFQPAMY 
SALQDAGISPAKRVFIGEQWQILLLDSQVFGVPHGELSEFQLEWLERKLADAPERHTL LLLHHHPLPAGCSWLDQHSLRNAGELDTVLAKFPHVKYLLCGHIHQELDLDWNGRRLL ATPSTCVQFKPHCSNFTLDTIAPGWRTLELHADGTLTTEVHRLADTRFQPDTASEGY" gene complement(4518..4940) /gene="yqiB" /note="b3033" CDS complement(4518..4940) /gene="yqiB" /function="putative enzyme; Not classified" /note="f140; 100 pct identical to YZZH_ECOLI SW: P36652" /codon_start=1 /transl_table=11 /product="putative enzyme" /protein_id="AAC76069.1" /db_xref="GI:1789411" /db_xref="PID:g1789411" /translation="MKRYTPDFPEMMRLCEMNFSQLRRLLPRNDAPGETVSYQVANAQ YRLTIVESTRYTTLVTIEQTAPAISYWSLPSMTVRLYHDAMVAEVCSSQQIFRFKARY DYPNKKLHQRDEKHQINQFLADWLRYCLAHGAMAIPVY" gene complement(4941..5570) /gene="yqiE" /note="b3034" CDS complement(4941..5570) /gene="yqiE" /function="orf; Unknown" /note="f209; This 209 aa ORF is 54 pct identical (3 gaps) to 198 residues of an approx. 224 aa protein YZZG_HAEIN SW: P44684" /codon_start=1 /transl_table=11 /product="orf, hypothetical protein" /protein_id="AAC76070.1" /db_xref="GI:1789412" /db_xref="PID:g1789412" /translation="MLKPDNLPVTFGKNDVEIIARETLYRGFFSLDLYRFRHRLFNGQ MSHEVRREIFERGHAAVLLPFDPVRDEVVLIEQIRIAAYDTSETPWLLEMVAGMIEEG ESVEDVARREAIEEAGLIVKRTKPVLSFLASPGGTSERSSIMVGEVDATTASGIHGLA DENEDIRVHVVSREQAYQWVEEGKIDNAASVIALQWLQLHHQALKNEWA" protein_bind complement(5598..5620) /note="No predicted promoter" /bound_moiety="GlpR predicted site" promoter 5620..5652 /note="factor Sigma70; predicted +1 start at 3176015" gene 5769..7256 /gene="tolC" /note="b3035" CDS 5769..7256 /gene="tolC" /function="putative membrane; Cell division" /note="o495; 99 pct identical to TOLC_ECOLI SW: P02930; CG Site No. 
97; alternate names colE1-i, mtcB, refI, tol-8" /codon_start=1 /transl_table=11 /product="outer membrane channel; specific tolerance to colicin E1; segregation of daughter chromosomes" /protein_id="AAC76071.1" /db_xref="GI:1789413" /db_xref="PID:g1789413" /translation="MQMKKLLPILIGLSLSGFSSLSQAENLMQVYQQARLSNPELRKS AADRDAAFEKINEARSPLLPQLGLGADYTYSNGYRDANGINSNATSASLQLTQSIFDM SKWRALTLQEKAAGIQDVTYQTDQQTLILNTATAYFNVLNAIDVLSYTQAQKEAIYRQ LDQTTQRFNVGLVAITDVQNARAQYDTVLANEVTARNNLDNAVEQLRQITGNYYPELA ALNVENFKTDKPQPVNALLKEAEKRNLSLLQARLSQDLAREQIRQAQDGHLPTLDLTA STGISDTSYSGSKTRGAAGTQYDDSNMGQNKVGLSFSLPIYQGGMVNSQVKQAQYNFV GASEQLESAHRSVVQTVRSSFNNINASISSINAYKQAVVSAQSSLDAMEAGYSVGTRT IVDVLDATTTLYNAKQELANARYNYLINQLNIKSALGTLNEQDLLALNNALSKPVSTN PENVAPQTPEQNAIADGYAPDSPAPVVQQTSARTTTSNGHNPFRN" gene 7256..7516 /gene="ygiA" /note="b3036" CDS 7256..7516 /gene="ygiA" /function="orf; Unknown" /note="o86; 100 pct identical to YGIA_ECOLI SW: P21862" /codon_start=1 /transl_table=11 /product="orf, hypothetical protein" /protein_id="AAC76072.1" /db_xref="GI:1789414" /db_xref="PID:g1789414" /translation="MTTTGLRPRLNVRQRKDTGYLPHSSPFSLQFRPAILYSDGYLPL VPEDKNETDKIHTPRIVPQKLERTPSDTSRSRGCHCFYAGWL" gene 7371..8075 /gene="ygiB" /note="b3037" CDS 7371..8075 /gene="ygiB" /function="orf; Unknown" /note="o234; 100 pct identical to YGIB_ECOLI SW: P24195" /codon_start=1 /transl_table=11 /product="orf, hypothetical protein" /protein_id="AAC76073.1" /db_xref="GI:1789415" /db_xref="PID:g1789415" /translation="MGIYHWSRKTKMKRTKSIRHASFRKNWSARHLTPVALAVATVFM LAGCEKSDETVSLYQNADDCSAANPGKSAECTTAYNNALKEAERTAPKYATREDCVAE FGEGQCQQAPAQAGMAPENQAQAQQSSGSFWMPLMAGYMMGRLMGGGAGFAQQPLFSS KNPASPAYGKYTDATGKNYGAAQPGRTMTVPKTAMAPKPATTTTVTRGGFGESVAKQS TMQRSATGTSSRSMGG" gene 8081..9241 /gene="ygiC" /note="b3038" CDS 8081..9241 /gene="ygiC" /function="putative enzyme; Not classified" /note="o386; 100 pct identical to YGIC_ECOLI SW: P24196" /codon_start=1 /transl_table=11 /product="putative synthetase/amidase" /protein_id="AAC76074.1" 
/db_xref="GI:1789416" /db_xref="PID:g1789416" /translation="MERVSITERPDWREKAHEYGFNFHTMYGEPYWCEDAYYKLTLAQ VEKLEEVTAELHQMCLKVVEKVIASDELMTKFRIPKHTWSFVRQSWLTHQPSLYSRLD LAWDGTGEPKLLENNADTPTSLYEAAFFQWIWLEDQLNAGNLPEGSDQFNSLQEKLID RFVELREQYGFQLLHLTCCRDTVEDRGTIQYLQDCATEAEIATEFLYIDDIGLGEKGQ FTDLQDQVISNLFKLYPWEFMLREMFSTKLEDAGVRWLEPAWKSIISNKALLPLLWEM FPNHPNLLPAYFAEDDHPQMEKYVVKPIFSREGANVSIIENGKTIEAAEGPYGEEGMI VQQFHPLPKFGDSYMLIGSWLVNDQPAGIGIREDRALITQDMSRFYPHIFVE" gene complement(9279..10094) /gene="ygiD" /note="b3039" CDS complement(9279..10094) /gene="ygiD" /function="orf; Unknown" /note="f271; 100 pct identical 242 residues from YGID_ECOLISW: P24197 (255 aa) but contains 16 additional C-term residues" /codon_start=1 /transl_table=11 /product="orf, hypothetical protein" /protein_id="AAC76075.1" /db_xref="GI:1789417" /db_xref="PID:g1789417" /translation="MTPLVKDIIMSSTRMPALFLGHGSPMNVLEDNLYTRSWQKLGMT LPRPQAIVVVSAHWFTRGTGVTAMETPPTIHDFGGFPQALYDTHYPAPGSPALAQRLV ELLAPIPVTLDKEAWGFDHGSWGVLIKMYPDADIPMVQLSIDSSKPAAWHFEMGRKLA ALRDEGIMLVASGNVVHNLRTVKWHGDSSPYPWATSFNEYVKANLTWQGPVEQHPLVN YLDHEGGTLSNPTPEHYLPLLYVLGAWDGQEPITIPVEGIEMGSLSMLSVQIG" promoter 10069..10097 /note="factor Sigma70; predicted +1 start at 3180460" promoter complement(10107..10134) /note="factor Sigma70; predicted +1 start at 3180456" BASE COUNT 2535 a 2769 c 2522 g 2354 t ORIGIN 1 tgcgagtaag aattatgagg aatggctatc agtattgtca ttttcagaaa atatttatcc 61 tgcatcggtg agtcagagta agatcagact tttgctaaat tcgcaaaaga ctttgcacat 121 tttgctaatt tcaccgtacc gctctgtgac gtactatagt cggcaaacgt ctcaccttga 181 ggttaaaaaa tgagcaacat cctgattatc aacggcgcga aaaaattcgc ccactccaat 241 ggtcaactga acgacaccct gaccgaagtc gcggatggca cactgcgcga ccttgggcat 301 gatgtccgca tcgttcgcgc cgacagcgac tacgatgtca aagcggaagt acaaaacttt 361 ctctgggctg atgtggtgat ctggcagatg ccaggctggt ggatgggcgc gccgtggaca 421 gtgaaaaaat acattgatga tgtattcacc gaaggtcacg ggacgctgta tgccagcgat 481 ggtcgtaccc gcaaagatcc gtcgaaaaaa tacggttccg gcggcctggt acagggcaaa 541 aaatatatgc tttctctgac 
ctggaacgca ccaatggaag ccttcaccga aaaagatcag 601 ttcttccacg gcgttggcgt tgacggtgtg tatctgccgt tccataaagc aaaccaattc 661 ctcggtatgg aaccgctgcc gacatttatc gctaatgacg tgataaaaat gcctgatgtt 721 ccccgctata ctgaagaata tcgcaagcat cttgtggaaa tttttggtta actagagctc 781 aggctttaga aggagttaac catgcttacc gtaatcgcag aaatccgtac tcgtcctggt 841 caacatcacc gtcaggcggt attggatcag tttgctaaaa tcgttccaac cgtactgaaa 901 gaagaaggtt gccacggcta tgcgccaatg gtggattgcg cagctggcgt gagtttccag 961 tctatggcac cggattctat cgtgatgatt gagcagtggg aaagcatcgc gcatcttgaa 1021 gcgcatctgc aaaccccgca catgaaggcg tatagcgaag ccgtaaaagg tgacgtgctg 1081 gagatgaata tccgtattct gcagccaggg atttaatcct gccttgtttg cccggccatc 1141 ctgaccgggc aatgttcttt cctttaaacc tcaatctccg ccatgtcgcc tttctcttgc 1201 aaccagttgc ggcgatcttc cgagcgtttc ttcgccagca gcatatccat catcgcgtca 1261 gtacgctgat cgtcttcatc atcgatagtc aactgcacca gacggcgagt gttcggatca 1321 agcgtggttt cgcgcaattg catcgggttc atttccccca gacctttaaa acgctggacg 1381 ttcggcttgc ctttcttgcg ttttaattgc tcaagtacgc cctctttctc ttcttccgtc 1441 agcgcgtaat aaacctcttt cccgagatca atacggtaga gcggtggcag tgcgacgtaa 1501 acgtgaccgt gtttcaccaa cgcgcggaaa tgttttacga acaaagcgca gagcagcgtg 1561 gcaatgtgca gaccatcaga gtccgcatcc gcgaggatac agattttgcc ataacgaagc 1621 tggctcagat cgtcgctgtc aggatcgata ccgatcgcta ccgaaatatc gtgcacttcc 1681 tgcgaagcca gcacttcgtc ggaagagact tcccaggtgt taaggatctt acctttcagt 1741 ggcatgatcg cctgatattc gcgatcgcgc gcctgcttgg cagatccgcc tgcggagtca 1801 ccttccacaa ggaacagctc ggtacggtta aggtcctgcg cggtacaatc agccagtttg 1861 ccaggcaacg ccgggccgct ggtcagcttt ttacgcacca cttttttggc cgcacgcata 1921 cggcgctggg cgctggaaat cgccatctcc gccagcagtt cagccgcctg aacgttctgg 1981 ttcagccaca ggataaaggc atctttcacc acgccagaaa cgaatgccgc gcattgacgc 2041 gaagagagac gctctttcgt ctgcccggca aactgcggat cctgcatttt tactgacagc 2101 acataggcgc agcgatccca gatatcttcc gccgacagct ttacaccgcg cggcagaata 2161 ttgcggtatt cacagaactc acgcatcgcg tccaacaggc cctgacgcag accattaaca 2221 tgggtaccgc cctgcatcgt tgggataagg 
ttgacgtagc tttcggtcag cagttcaccg 2281 ccttccggca gccacagtag cgcccagtcc acagcttcag tatcaccagc gaaattaccg 2341 ataaacggtt tttccggcag cgtcggcaga ccatttaccg cttccgccag gtaatcattc 2401 agaccgtcct gatagcacca gcgttgttcg gtattgttga tctcatcttt aaaagtgatc 2461 tcaacgccag ggcacaatac cgctttggct ttcagcacat gcgtcaggcg tgaaacagaa 2521 aatcgcgggc tgtcaaagaa ggtttcatcc ggccagaagt gcacactggt accagtattg 2581 cgtttaccgc aagtgccgac aacctgtaaa tcctgcacct tttcgccatt ttcaaaggcg 2641 atgttataaa cctgaccatc gcggcgcacg ttaacttcta cgcgcttcga cagggcgtta 2701 accaccgaaa tccccacgcc atgcaggccg ccagagaact ggtaattttt gttagagaat 2761 ttaccgcctg catgcagacg gcaaagaatc agttcaaccg ccggtacacc ctcttccggg 2821 tgaatatcca ccggcatccc gcgcccatcg tcaataactt ctaacgactg gtcagcatgt 2881 aaaataacgt ccacgcgttt tgcgtgaccc gccagtgctt catccacact gttatcaatg 2941 acttcttgcc ccaaatggtt agggcgagtg gtatcggtat acatccccgg acggcggcga 3001 accggctcaa gcccggtgag tacctcaatg gcatcagcgt tataagtttg cgtcatggtt 3061 taagttagta attcgagttg atcgtcagag atggtgcaga ccaagaaaat cgacgatcgg 3121 gttgaaataa tcttcgaagc ccgtgaatgc gtggttgccg ccttctatga cagtctggcg 3181 gcaggaagcg tagtacgcca ccgcctggcg gtaatccagc acttcatctc ccgtctgttg 3241 cagcagccag atcaaatccg gcgcttccag cgggtcaatc tgcatgactt taagatcgta 3301 aatatggcgt gactctagca catattgctg cccggtgtag gggttctcgt tctgaccgag 3361 atagtccgtc agcagttcaa acgggcgcac cgccgggttt accaccactg cgggcagcat 3421 aaaacattgt gacaaccagg tggcgtaata tccccccagt gacgaaccga caatacccag 3481 cgaatcaccg ccatgttcca ggacaatgga ttccagcagc tctgccgcgt cggaaggata 3541 cggcggcaac tgcggaatga tcatctcaac gtcagggtga tgttccgcca gccagttttt 3601 taacaagctc gcttttgcag agcgcggcga gctgttgaaa ccgtgtaaat aaagaagcgt 3661 agacatcagt agccttctga agcggtatca ggttggaaac gtgtgtccgc caggcgatgc 3721 acctcggtgg tcagcgtgcc atcagcatgt aactcgagag tacgccagcc gggcgcgatg 3781 gtatccagcg taaagttgga acagtgcggc ttaaactgca cacaggtcga cggcgttgcc 3841 agcaggcggc gaccattcca gtcgagatcc agctcctgat gaatatgacc gcacagcaag 3901 tatttgacgt gcggaaactt cgccagcacg gtatccagtt 
cgcccgcgtt acgcagactg 3961 tgttgatcga gccaactaca acccgcaggt agcggatgat gatgcagcag cagcaacgta 4021 tggcgttctg gcgcatcggc cagtttacgt tccagccact caagctgaaa ctcgctcagc 4081 tcaccgtgcg gcacgccaaa cacctggcta tccagcaaca ggatttgcca ttgctcacca 4141 ataaacacgc gcttcgccgg ggagataccc gcatcctgta acgcgctgta catcgcgggc 4201 tggaaatcgt ggttgcccgg cagccagacg cagggcgcac gaaaacttgc gatgccttca 4261 gcgaaatgct gataggccgc agaggattga tcctgcgcta aatcacctgt cgcgacaatc 4321 aggtcgaatt cgtgctggtg tggccgaatc gcctccagca ccgcctggta actctcccag 4381 gtgtttaccc ctaacagggc ttcgtgcttt tgtgcaaaca ggtgagtgtc ggtaatttgt 4441 aaaatcctga ctctggcctc accagccaga ggaagggtta acaggctttc caaatggtgt 4501 ccttaggttt cacgacgcta ataaaccgga atcgccatcg ctccatgtgc taaacagtat 4561 cgcaaccagt ccgctaaaaa ctgattaatt tgatgctttt cgtcgcgttg atgcaacttt 4621 ttattaggat aatcataccg cgctttgaag cgaaaaatct gctggcttga acacacttca 4681 gccaccatcg cgtcatgata cagacgcacc gtcattgacg gaaggctcca gtaactgatc 4741 gcgggcgcag tctgttctat tgtcaccagg gtagtgtatc gggtcgattc cacaatcgtc 4801 agccgatatt gtgcgtttgc cacctgatag cttacagttt cgccgggtgc gtcattgcgc 4861 ggtaacaaac ggcgcaattg tgaaaagttc atctcgcaca ggcgcatcat ttcaggaaag 4921 tcaggtgtgt aacgcttcat ttatgcccac tcatttttta acgcttgatg atgcagctgc 4981 agccattgca aagcgatgac cgacgctgcg ttgtcgattt tcccctcttc tacccactgg 5041 tatgcctgtt cccggcttac cacatgaacg cgaatatctt cgttttcatc agccagaccg 5101 tgaataccgc ttgcggtcgt ggcgtccact tcgcccacca taattgacga acgctcactg 5161 gtgccccccg ggcttgccag gaaacttaac accggtttgg tccgtttgac tatcagtccc 5221 gcctcttcaa tcgcttcgcg acgggcaaca tcttccacac tttcaccctc ttcaatcatc 5281 ccggcaacca tctccagtag ccaaggggtt tcgctggtgt cgtacgcggc aatccgaatc 5341 tgctcaatca gcacaacttc atcacgcact gggtcaaagg gtagcaagac tgcggcgtga 5401 ccgcgctcaa aaatttcccg ccgcacctca tgactcattt gcccgttgaa tagacgatga 5461 cgaaatctat aaagatctaa tgaaaaaaag ccgcgataaa gtgtttctcg tgcaataatt 5521 tctacatcgt ttttgccaaa tgtaacgggc aggttgtctg gcttaagcat tgttaatgtc 5581 ctggcactaa tagtgaatta aatgtgaatt tcagcgacgt ttgactgccg 
tttgagcagt 5641 catgtgttaa attgaggcac attaacgccc tatggcacgt aacgccaacc ttttgcggta 5701 gcggcttctg ctagaatccg caataatttt acagtttgat cgcgctaaat actgcttcac 5761 cacaaggaat gcaaatgaag aaattgctcc ccattcttat cggcctgagc ctttctgggt 5821 tcagttcgtt gagccaggcc gagaacctga tgcaagttta tcagcaagca cgccttagta 5881 acccggaatt gcgtaagtct gccgccgatc gtgatgctgc ctttgaaaaa attaatgaag 5941 cgcgcagtcc attactgcca cagctaggtt taggtgcaga ttacacctat agcaacggct 6001 accgcgacgc gaacggcatc aactctaacg cgaccagtgc gtccttgcag ttaactcaat 6061 ccatttttga tatgtcgaaa tggcgtgcgt taacgctgca ggaaaaagca gcagggattc 6121 aggacgtcac gtatcagacc gatcagcaaa ccttgatcct caacaccgcg accgcttatt 6181 tcaacgtgtt gaatgctatt gacgttcttt cctatacaca ggcacaaaaa gaagcgatct 6241 accgtcaatt agatcaaacc acccaacgtt ttaacgtggg cctggtagcg atcaccgacg 6301 tgcagaacgc ccgcgcacag tacgataccg tgctggcgaa cgaagtgacc gcacgtaata 6361 accttgataa cgcggtagag cagctgcgcc agatcaccgg taactactat ccggaactgg 6421 ctgcgctgaa tgtcgaaaac tttaaaaccg acaaaccaca gccggttaac gcgctgctga 6481 aagaagccga aaaacgcaac ctgtcgctgt tacaggcacg cttgagccag gacctggcgc 6541 gcgagcaaat tcgccaggcg caggatggtc acttaccgac tctggattta acggcttcta 6601 ccgggatttc tgacacctct tatagcggtt cgaaaacccg tggtgccgct ggtacccagt 6661 atgacgatag caatatgggc cagaacaaag ttggcctgag cttctcgctg ccgatttatc 6721 agggcggaat ggttaactcg caggtgaaac aggcacagta caactttgtc ggtgccagcg 6781 agcaactgga aagtgcccat cgtagcgtcg tgcagaccgt gcgttcctcc ttcaacaaca 6841 ttaatgcatc tatcagtagc attaacgcct acaaacaagc cgtagtttcc gctcaaagct 6901 cattagacgc gatggaagcg ggctactcgg tcggtacgcg taccattgtt gatgtgttgg 6961 atgcgaccac cacgttgtac aacgccaagc aagagctggc gaatgcgcgt tataactacc 7021 tgattaatca gctgaatatt aagtcagctc tgggtacgtt gaacgagcag gatctgctgg 7081 cactgaacaa tgcgctgagc aaaccggttt ccactaatcc ggaaaacgtt gcaccgcaaa 7141 cgccggaaca gaatgctatt gctgatggtt atgcgcctga tagcccggca ccagtcgttc 7201 agcaaacatc cgcacgcact accaccagta acggtcataa ccctttccgt aactgatgac 7261 gacgacgggg cttcggcccc gtctgaacgt aaggcaacgt aaagatacgg gttatctgcc 
7321 gcattcttcc cccttctcgc ttcaatttcg accagccatc ctctattctg atgggtattt 7381 accactggtc ccggaagaca aaaatgaaac ggacaaaatc catacgccac gcatcgttcc 7441 gcaaaaactg gagcgcacgc catctgacac cagtcgctct cgcggttgcc actgttttta 7501 tgctggctgg ctgtgaaaag agtgatgaaa cagtgtctct ctatcaaaat gctgacgact 7561 gttcagctgc aaacccaggc aaaagcgccg aatgtaccac cgcgtacaac aatgcgctga 7621 aagaagccga acgtactgcg ccgaaatacg ccacccgtga agactgtgtt gctgaatttg 7681 gtgaaggtca gtgccagcag gcaccagccc aggctggcat ggcaccagaa aaccaggcgc 7741 aggcccagca atccagcggg agtttctgga tgccgctgat ggccggttac atgatggggc 7801 gtctgatggg cggcggcgcg ggatttgcac agcagccgct gttctcctcg aaaaacccag 7861 ccagtccggc ttacggtaaa tataccgacg cgacgggtaa aaactatggc gcagcccagc 7921 caggccgcac catgaccgta ccgaagacgg caatggcacc aaaaccggcg accaccacta 7981 ccgttacccg tggcggtttt ggtgaatctg ttgccaaaca aagcactatg cagcgtagtg 8041 caaccggtac ctcttctcgt tcaatgggtg gctgataccg atggaaagag tcagtattac 8101 cgagcgcccg gactggcgtg agaaagccca cgaatacggt ttcaattttc acaccatgta 8161 cggcgagccg tactggtgtg aagatgctta ctacaagttg accctcgccc aggttgaaaa 8221 gctggaagaa gtcaccgccg aactgcacca gatgtgcctg aaagtggtgg aaaaagtgat 8281 cgccagcgat gagctgatga ccaaattccg cattccaaaa cacacctgga gttttgtgcg 8341 ccagtcatgg ctgacgcacc agccatcgct ttattcgcgt cttgatctgg cgtgggatgg 8401 cactggtgaa cctaaacttc tggaaaataa cgccgatacg ccaacgtcac tatacgaggc 8461 ggcgttcttt cagtggatct ggctggaaga tcagcttaac gccggtaact tgccggaggg 8521 cagcgaccag tttaacagtc tgcaagaaaa actgatcgat cgcttcgttg agctgcgtga 8581 acagtatggc ttccagttgc tgcatctcac ctgctgtcgc gacacggtgg aagatcgcgg 8641 aaccattcag tatttgcagg actgcgcaac ggaagctgaa attgctactg agttcctcta 8701 catcgatgat atcgggttag gtgaaaaagg tcagttcacg gatttacagg atcaggtaat 8761 ttccaacctg ttcaaactgt atccgtggga atttatgttg cgtgagatgt tctcaaccaa 8821 gctggaggat gcaggcgtac gctggctgga accggcgtgg aagagcatta tctccaacaa 8881 ggcacttcta ccgctactgt gggagatgtt cccgaatcac ccgaacctgc tgcccgctta 8941 ttttgcggaa gatgatcatc cgcaaatgga aaaatatgtg gttaaaccga tcttctcccg 9001 
tgaaggcgca aacgtgtcga tcattgagaa cggcaaaacc attgaagcag cggaaggtcc 9061 gtatggcgaa gaagggatga ttgttcagca attccacccg ttaccgaaat tcggcgacag 9121 ctatatgctg attggtagct ggctggtgaa cgatcaaccc gccggaattg gcattcgtga 9181 agaccgtgca ttgatcaccc aggatatgtc tcggttttat ccacatattt ttgttgaata 9241 agccacgata ccggatggca ctcgccatcc ggtaattgtt agcctatctg caccgacagc 9301 atactcaggc tgcccatttc tataccctca accggaatgg taattggctc ctgcccatcc 9361 cacgcaccta acacatacaa caacggcaaa taatgctctg gcgttgggtt cgataacgtg 9421 ccaccttcat ggtcgaggta attcaccaga ggatgttgtt ccactggccc ttgccacgtc 9481 agattcgctt tcacatactc attaaacgac gtcgcccacg gatacggtga actatcaccg 9541 tgccacttca ctgtgcgcag gttatgcacc acgttaccgc tggcgaccaa cattattcct 9601 tcatctcgca gcgctgccag tttgcgcccc atttcgaaat gccaggcggc aggtttgcta 9661 ctgtcgatac tcaactgcac catcgggata tcagcgtcag gatacatctt aatcagcacg 9721 ccccacgagc cgtggtcaaa gccccaggct tctttatcca gcgtcaccgg gatcggcgct 9781 aacagctcaa ccagacgctg tgccagcgca ggcgaacccg gagcaggata atgcgtatcg 9841 tacagcgcct gcgggaagcc accaaagtca tgaatcgtgg gcggcgtctc catcgcggtc 9901 actcctgttc cacgggtaaa ccagtgagcc gaaaccacca caatcgcttg cgggcgtggc 9961 aatgtcatcc ccaacttctg ccagctgcgg gtatacaaat tatcttccag cacgttcatc 10021 ggactaccgt gacctaaaaa caatgctggc atacgtgttg aagacatgat gatatcctta 10081 actaaaggtg tcattttgat atcctcacaa tacgcttgtt cggcggagta agaacccgga 10141 taacaatgat gatgatcatc agttattttg acgatctgcc // . . . 20000519 14.51 Leszek: Target T0089 as attachment. 
Leszek T0089 >gi|4981356|gb|AE001750.1|AE001750 Thermotoga maritima section 62 of 136 of the complete genome Length = 13551 Score = 724 bits (1848), Expect = 0.0 Identities = 376/419 (89%), Positives = 377/419 (89%) Frame = +1 Query: 1 MIDLSKTVFYTSIDIGSRYIKGLVLGKRDQEWEALAFSSVKSRGLDEGEIKDAIAFKESV 60 +IDLSKTVFYTSIDIGSRYIKGLVLGKRDQEWEALAFSSVKSRGLDEGEIKDAIAFKESV Sbjct: 8281 VIDLSKTVFYTSIDIGSRYIKGLVLGKRDQEWEALAFSSVKSRGLDEGEIKDAIAFKESV 8460 Query: 61 NTXXXXXXXXXXXSLRXXXXXXXXXXXXEREDTVIERDFGEEKRSITLDILSEMQSEALE 120 NT SLR EREDTVIERDFGEEKRSITLDILSEMQSEALE Sbjct: 8461 NTLLKELEEQLQKSLRSDFVISFSSVSFEREDTVIERDFGEEKRSITLDILSEMQSEALE 8640 Query: 121 KLKENGKTPLHIFSKRYLLDDERIVFNPLDMKASKIAIEYTSIVVPLKVYEMFYNFLQDT 180 KLKENGKTPLHIFSKRYLLDDERIVFNPLDMKASKIAIEYTSIVVPLKVYEMFYNFLQDT Sbjct: 8641 KLKENGKTPLHIFSKRYLLDDERIVFNPLDMKASKIAIEYTSIVVPLKVYEMFYNFLQDT 8820 Query: 181 VKSPFQLKSSLVSTAEGVLTTPEKDRGVVVVNLGYNFTGLIAYKNGVPIKISYVPVGMKH 240 VKSPFQLKSSLVSTAEGVLTTPEKDRGVVVVNLGYNFTGLIAYKNGVPIKISYVPVGMKH Sbjct: 8821 VKSPFQLKSSLVSTAEGVLTTPEKDRGVVVVNLGYNFTGLIAYKNGVPIKISYVPVGMKH 9000 Query: 241 VIKDVSAVLDTSFEESERLIITHGNAVYNDLKEEEIQYRGLDGNTIKTTTAKKLSVIIHA 300 VIKDVSAVLDTSFEESERLIITHGNAVYNDLKEEEIQYRGLDGNTIKTTTAKKLSVIIHA Sbjct: 9001 VIKDVSAVLDTSFEESERLIITHGNAVYNDLKEEEIQYRGLDGNTIKTTTAKKLSVIIHA 9180 Query: 301 RLREIMSKSKKFFREVEAKXXXXXXXXXXXXXXXXXXXAKIPRINELATEVFKSPVRTGC 360 RLREIMSKSKKFFREVEAK AKIPRINELATEVFKSPVRTGC Sbjct: 9181 RLREIMSKSKKFFREVEAKIVEEGEIGIPGGVVLTGGGAKIPRINELATEVFKSPVRTGC 9360 Query: 361 YANSDRPSIINADEVANDPSFAAAFGNVFAVSENPYEETPVKSENPLKKIFRLFKELME 419 YANSDRPSIINADEVANDPSFAAAFGNVFAVSENPYEETPVKSENPLKKIFRLFKELME Sbjct: 9361 YANSDRPSIINADEVANDPSFAAAFGNVFAVSENPYEETPVKSENPLKKIFRLFKELME 9537 1: AE001750. Thermotoga maritim...[gi:4981356] LOCUS AE001750 13551 bp DNA BCT 02-JUN-1999 DEFINITION Thermotoga maritima section 62 of 136 of the complete genome. ACCESSION AE001750 AE000512 VERSION AE001750.1 GI:4981356 KEYWORDS . SOURCE Thermotoga maritima. 
ORGANISM Thermotoga maritima Bacteria; Thermotogales; Thermotoga. REFERENCE 1 (bases 1 to 13551) AUTHORS Nelson,K.E., Clayton,R.A., Gill,S.R., Gwinn,M.L., Dodson,R.J., Haft,D.H., Hickey,E.K., Peterson,J.D., Nelson,W.C., Ketchum,K.A., McDonald,L., Utterback,T.R., Malek,J.A., Linher,K.D., Garrett,M.M., Stewart,A.M., Cotton,M.D., Pratt,M.S., Phillips,C.A., Richardson,D., Heidelberg,J., Sutton,G.G., Fleischmann,R.D., White,O., Salzberg,S.L., Smith,H.O., Venter,J.C. and Fraser,C.M. TITLE Evidence for lateral gene transfer between Archaea and Bacteria from genome sequence of Thermotoga maritima JOURNAL Nature 399, 323-329 (1999) REFERENCE 2 (bases 1 to 13551) AUTHORS Nelson,K.E., Clayton,R.A., Gill,S.R., Gwinn,M.L., Dodson,R.J., Haft,D.H., Hickey,E.K., Peterson,J.D., Nelson,W.C., Ketchum,K.A., McDonald,L., Utterback,T.R., Malek,J.A., Linher,K.D., Garrett,M.M., Stewart,A.M., Cotton,M.D., Pratt,M.S., Phillips,C.A., Richardson,D., Heidelberg,J., Sutton,G.G., Fleischmann,R.D., White,O., Salzberg,S.L., Smith,H.O., Venter,J.C. and Fraser,C.M. 
TITLE Direct Submission JOURNAL Submitted (01-JUN-1999) The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD 20850, USA FEATURES Location/Qualifiers source 1..13551 /organism="Thermotoga maritima" /db_xref="taxon:2336" gene complement(72..1043) /gene="TM0825" CDS complement(72..1043) /gene="TM0825" /note="similar to GB:AE000782 percent identity: 55.52; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="astB/chuR-related protein" /protein_id="AAD35907.1" /db_xref="GI:4981357" /translation="MRNPPFIVEYELTLRCNFRCKHCYCEAGKPHLEELSFEEIKELI LDMKELGTWALDIVGGEPLLHPQILDILAFGKEVGQRLMINTNGSLATKEMVQKIKKA NPDVLIGVSLEGPDPETNDFVRGTGNFKRAVQGIKNFVDEGFQVTILNVINKRNWRKF EDMVKLALELGVNALYVDRFIPVGRGMIHARELDMNPEEWRVAIKHVLGVIENYKNHL TFYVEESISGKPCSAGITHASVLADGTVVPCGHFRYRKEFYMGNVREKKFSEIWHEYT PIPSPASCQQCPILNECGGGCKAYYLLREHEKDEAICFLNKERYNIK" gene 1171..1341 /gene="TM0826" CDS 1171..1341 /gene="TM0826" /note="similar to percent identity: 0.00; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="hypothetical protein" /protein_id="AAD35908.1" /db_xref="GI:4981358" /translation="MEKKVSRNPRVLSQIDLLKLMTDGDESALREIKRRLGLDEKKER DNDEKQKVSDKI" gene 1310..2014 /gene="TM0827" CDS 1310..2014 /gene="TM0827" /note="similar to PID:2108228 percent identity: 57.14; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="ABC transporter, ATP-binding protein, putative" /protein_id="AAD35909.1" /db_xref="GI:4981359" /translation="MKSRKFPIRFENFSLSVDGKSVLENITLSFVEGMNVLYGPRGSG KSSLLRSIVKLNTEIFNEISRSGSVYLFDQNVEDLDDIYVRKNALYLDTSFIDAMNQY TFDEFLKLALKRKISLENFSEKLDDLGILRMLIRGQKTPLSVFSPAEKISLLLFILEQ KKPRVILMDCLLDHLDDENLEKIMDMFLKMKEERTFVISTRILQRFLYIADLLVILNN GRINYTGSPKDFVLRM" gene 2044..3003 /gene="TM0828" CDS 2044..3003 /gene="TM0828" /note="similar to SP:P11099 GB:X14827 PID:46605 percent identity: 54.05; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="sugar kinase, 
pfkB family" /protein_id="AAD35910.1" /db_xref="GI:4981360" /translation="MVLTVTLNPALDREIFIEDFQVNRLYRINDLSKTQMSPGGKGIN VSIALSKLGVPSVATGFVGGYMGKILVEELRKISKLITTNFVYVEGETRENIEIIDEK NKTITAINFPGPDVTDMDVNHFLRRYKMTLSKVDCVVISGSIPPGVNEGICNELVRLA RERGVFVFVEQTPRLLERIYEGPEFPNVVKPDLRGNHASFLGVDLKTFDDYVKLAEKL AEKSQVSVVSYEVKNDIVATREGVWLIRSKEEIDTSHLLGAGDAYVAGMVYYFIKHGA NFLEMAKFGFASALAATRRKEKYMPDLEAIKKEYDHFTVERVK" gene 3004..3456 /gene="TM0829" CDS 3004..3456 /gene="TM0829" /note="similar to GB:L77117 SP:Q58821 PID:1592076 percent identity: 60.77; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="conserved hypothetical protein" /protein_id="AAD35911.1" /db_xref="GI:4981361" /translation="MRVKDAVIYDISAVFEDETVETVIKLLSRQNLSGVPVVDHDMRV VGFVSESDLIKALVPSYFSLLRSASFIPDTNQLIRNVVKIKDRPVSEFMNKPPVVVKE DDPLIVAADYLIRHGFKSLPVVDEAMQLVGIVRRIDILRVVSEGKLEI" gene 3453..4757 /gene="TM0830" CDS 3453..4757 /gene="TM0830" /note="similar to SP:P54462 PID:1303812 PID:1890061 GB:AL009126 percent identity: 63.10; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="conserved hypothetical protein" /protein_id="AAD35912.1" /db_xref="GI:4981362" /translation="MKTVRIETFGCKVNQYESEYMAEQLEKAGYVVLPDGNAAYYIVN SCAVTKEVEKKVKRLIKSIRNRNKNAKIILTGCFAQLSPDEAKNLPVDMVLGIDEKKH IVDHINSLNGKQQVVVSEPGRPVYEKVKGSFEDRTRSYIKVEDGCDNTCTYCAIRLAR GTRIRSKPLEIFKEEFAEMVMKGYKEIVITGVNLGKYGKDMGSSLAELLKVIEKVPGD YRVRLSSINVEDVNDEIVKAFKRNPRLCPHLHISVQSGSDDVLKRMGRKYKISDFMRV VDKLRSIDPDFSITTDIIVGFPGETDADFQRTLELVEKVEFSRVHIFRFSPRPGTPAS RMEGGVPESKKKERLDVLKEKAKDVSIRYRKRIIGKERKVLAEWYVMKGVLSGYDEYY VKHEFVGNRVGEFHSVRVKSLSEEGVISCRADMVEGKVPARG" gene 4717..5538 /gene="TM0831" CDS 4717..5538 /gene="TM0831" /note="similar to PID:1622791 GB:AE000512 percent identity: 99.03; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="branched-chain amino acid aminotransferase, putative" /protein_id="AAD35913.1" /db_xref="GI:4981363" 
/translation="MLIWWRGKFRRADEISLDFSLFEKSLQGAVYETLRTYSRAPFAA YKHYTRLKRSADFFNLPLSLSFDEFTKVLKAGADEFKQEVRIKVYLFPDSGEVLFVFS PLNIPDLETGVEVKISNVRRIPDLSTPPALKITGRTDIVLARREIVDCYDVILLGLNG QVCEGSFSNVFLVKEGKLITPSLDSGILDGITRENVIKLAKSLEIPVEERVVWVWELF EADEMFLTHTSAGVVPVRRLNEHSFFEEEPGPVTATLMENFEPFVLNLEENWVGI" gene 5535..5840 /gene="TM0832" CDS 5535..5840 /gene="TM0832" /note="similar to percent identity: 0.00; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="hypothetical protein" /protein_id="AAD35914.1" /db_xref="GI:4981364" /translation="MILLRQLLDELSTKQPLFFELKIKMLLSEWDKIVGPVIARHTKV EKVENGTVYIVCDDSLWMTELTMQKDRLLKILNERSGKELFRDIKFRRGKVDGKVLR" gene 5821..7731 /gene="TM0833" CDS 5821..7731 /gene="TM0833" /note="similar to PID:1622792 GB:AE000512 percent identity: 100.00; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="DNA gyrase, subunit B" /protein_id="AAD35915.1" /db_xref="GI:4981365" /translation="MEKYSAESIKVLKGLEPVRMRPGMYIGSTGKRGLHHLVYEVVDN SVDEALAGYCDWIRVTLHEDGSVEVEDNGRGIPVDIHPEEGRSALEVVFTVLHAGGKF SKDSYKISGGLHGVGVSVVNALSEWLEVRVHRDGKIYRQRYERGKPVTPVEVIGETDK HGTIVRFKPDPLIFSETEFDPDILEHRLREIAFLVPGLKIEFEDRINGEKKTFKFDGG IVEYVKYLNRGKKALHDVIHIKRTEKVKTKNGEDEVIVEIAFQYTDSYSEDIVSFANT IKTVDGGTHVTAFKSTLTRLMNEYGKKHNFLKKDDSFQGEDVREGLTAVISVYVKNPE FEGQTKSKLGNEEVKEAVTKAMREELKKIFDANPELVKTILSKIMSTKQAREAAKRAR EMVRRKNVLQNTTLPGKLADCSSTHREKTELFIVEGDSAGGSAKQARDREFQAVLPIR GKILNVEKSSLDRLLKNEQISDIIVAVGTGIGDDFDESKLRYGRIIIMTDADIDGAHI RTLLLTLFYRYMRPLIEQGRVYIALPPLYRIKAGREEFYVYSDQELAEYKEKLQGKRI EIQRYKGLGEMNPEQLWETTMNPETRKIIRVTIEDAEEADRLFEILMGNDPSSRREFI ERHALKVKELDI" gene 7734..8288 /gene="TM0834" CDS 7734..8288 /gene="TM0834" /note="similar to percent identity: 0.00; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="hypothetical protein" /protein_id="AAD35916.1" /db_xref="GI:4981366" /translation="MRSLRVLLITMIVLYILFFVNSFFQSRREHVRVPEKVYSYLIEN FNISPKSIIIDSKKAIGIVFYEGNYYLCAEDGSLVASLSKKDLFKFYPVFLEVNLEGL 
RLSKSDREILEMLIPILKSSVVSAVFFESKEVVLLKGSRIMFEEWKDLVENFQVIMEQ SEKMKAKERYFLTDDGRLMWIRGD" gene 8281..9540 /gene="TM0835" CDS 8281..9540 /gene="TM0835" /note="similar to GB:AE000657 percent identity: 51.28; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="cell division protein FtsA, putative" /protein_id="AAD35917.1" /db_xref="GI:4981367" /translation="MIDLSKTVFYTSIDIGSRYIKGLVLGKRDQEWEALAFSSVKSRG LDEGEIKDAIAFKESVNTLLKELEEQLQKSLRSDFVISFSSVSFEREDTVIERDFGEE KRSITLDILSEMQSEALEKLKENGKTPLHIFSKRYLLDDERIVFNPLDMKASKIAIEY TSIVVPLKVYEMFYNFLQDTVKSPFQLKSSLVSTAEGVLTTPEKDRGVVVVNLGYNFT GLIAYKNGVPIKISYVPVGMKHVIKDVSAVLDTSFEESERLIITHGNAVYNDLKEEEI QYRGLDGNTIKTTTAKKLSVIIHARLREIMSKSKKFFREVEAKIVEEGEIGIPGGVVL TGGGAKIPRINELATEVFKSPVRTGCYANSDRPSIINADEVANDPSFAAAFGNVFAVS ENPYEETPVKSENPLKKIFRLFKELME" gene 9557..10612 /gene="TM0836" CDS 9557..10612 /gene="TM0836" /note="similar to PID:2104497 GB:AE000512 percent identity: 99.72; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="cell division protein FtsZ" /protein_id="AAD35918.1" /db_xref="GI:4981368" /translation="MGFDLDVEKKKENRNIPQANNLKIKVIGVGGAGNNAINRMIEIG IHGVEFVAVNTDLQVLEASNADVKIQIGENITRGLGAGGRPEIGEQAALESEEKIREV LQDTHMVFITAGFGGGTGTGASPVIAKIAKEMGILTVAIVTTPFYFEGPERLKKAIEG LKKLRKHVDTLIKISNNKLMEELPRDVKIKDAFLKADETLHQGVKGISELITKRGYIN LDFADIESVMKDAGAAILGIGVGKGEHRAREAAKKAMESKLIEHPVENASSIVFNITA PSNIRMEEVHEAAMIIRQNSSEDADVKFGLIFDDEVPDDEIRVIFIATRFPDEDKILF PEGDIPAIYRYGLEGLL" gene 10612..12312 /gene="TM0837" CDS 10612..12312 /gene="TM0837" /note="similar to GB:AE000657 percent identity: 67.38; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="general secretion pathway protein E" /protein_id="AAD35919.1" /db_xref="GI:4981369" /translation="MLRRYRKLGEILLEKGFITREELDKALEIQKEERKPLGEVLIET GYITEDQLLEALSEQYGVPILKELPKNIPLNVVGSIPKNIIESLHVIPVEKKEDGTLV VVTDNGTNIPRIKQEIRFLTGKNPEIYLVTSRDFSVLYQTYVLGVPLELFEEPYVAIE 
ETPEQVEEEEEEEREVEEAPIVRLVNNIVNRAIEMGASDIHIEPMKRTVRVRFRIDGI LRKVLEYQKPQHNSVVARIKIMSGLDVSERRLPQDGKFYTIKGGEQYDFRVSTMPSTF GEKVVMRILKVSDANKRLEELGYSEYNLKRILSLLEKPYGIILVTGPTGSGKSTTLVA MINYLKSESVNIVTAEDPVEYTIEGVTQCQVFPEIGLTFARYLRAFLRQDPDIIMVGE IRDRETAQLAVEASLTGHLVLSTLHTNTAAGAVSRLIEMGIDPHLLGASLIGIIGQRL VRKLCDECKMPGEVRDEQVKSYFEQFFGKVPDQIYYPSEEGCPACKGMRYRGRMAIGE VLIVDEELRELISSKASETEIAKLAVKKGMRTMFQDALEKVLLGQTSIEEVFRVTTPL " gene 12309..13514 /gene="TM0838" CDS 12309..13514 /gene="TM0838" /note="similar to percent identity: 0.00; identified by sequence similarity; putative" /codon_start=1 /transl_table=11 /product="hypothetical protein" /protein_id="AAD35920.1" /db_xref="GI:4981370" /translation="MKYFSKSPEEWQKRDEEKEKEFRNRKNRIRRRANFILVVNLVIV VFLVFFTKAFFSNKPEGVIGPFQLVIETKESYLPNDPLDVRVKVFNREKKKENLVLED FVFSIKRENDTVYEFHFPQRVEKEMEAFESVLVFDLLREEELSNLPGGNYTITVSVKL NGQRVVISKVVSVIEKWQIEVEDLKDFYFPYENVHFFVYLENISSKSRKIRVESIGLI ILKGNEAVFERDIPIEKDFVINPMMVEQIHEVSFSAPKESGDYIIKLKLKTESSLIEK SIPLFVTREYQKDLKGLSLVIEGKKFVASGERYDFSVKLLNEEKKRKYIVLKNIMIVL THKEPVFSYAYSEEYRMTIEGYSSREIFKTTSYDIIKLEDPGTYKLIVVIESEEDRLM KEMEIVVSE" BASE COUNT 4200 a 2560 c 3385 g 3406 t ORIGIN 1 ccatattttt ttacttaccg gtcggtaaaa atattctatg cacacataat aacaatcgag 61 tttcatttaa gttactttat gttgtaccgt tctttgttca ggaaacaaat agcttcgtct 121 ttttcgtgct ctctcagcaa gtagtaagcc ttgcaacctc caccgcattc gttcaggatg 181 ggacactgct ggcaggaagc cggggaagga atgggagtgt attcgtgcca gatttccgag 241 aatttctttt ctctaacgtt ccccatgtag aattcctttc tgtatctgaa gtgcccacag 301 gggacgactg tgccatcggc cagaaccgat gcatgtgtga ttcctgcgga acacggcttt 361 cctgatatag attcttccac gtaaaatgtc aaatggtttt tgtaattttc tatcactcca 421 aggacgtgtt ttatagccac tctccactcc tcgggattca tatccagttc tcttgcatgg 481 atcatgcccc ttccaaccgg tatgaaccta tccacataca aagcattaac tcccaattcc 541 agtgcgagtt ttaccatatc ttcgaacttt ctccagtttc tcttgtttat cacgttcaaa 601 atggtgacct gaaacccttc atccacgaag tttttgatgc cttgaactgc tcttttgaag 661 tttccggttc ctctcacaaa gtcgttggtc tcaggatctg gtccttcaag ggaaactcca 721 ataagcacgt 
cgggattggc cttctttatt ttctgtacca tttccttggt ggcgagcgaa 781 ccgttggtgt tgatcatgag acgctgtccg acctcttttc cgaaggcaag aatatccagt 841 atttgagggt gaagaagagg ttcgccaccc acaatatcga gagcccaggt gcccaattct 901 ttcatatcga gtatcagttc ctttatttct tcgaaactca gctcttcaag atggggtttt 961 ccggcctcgc agtagcagtg tttacatctg aagttgcatc gcagtgtcag ttcgtattcc 1021 acaataaaag gtgggtttcg catccaatca cctccaggat agattttaac cctggagacg 1081 ttttattaca ttctccgggt tttaattgta atgatccata caagtccaat taaaggtttg 1141 gggttagaat caaaaacgag gtgatgcctc ttggagaaga aagtatccag aaatccccgt 1201 gttttatcgc agatagattt gctcaaattg atgaccgatg gagatgaaag tgctctcagg 1261 gaaataaaga gacgtttggg ccttgacgag aagaaagaaa gagataacga tgaaaagcag 1321 aaagtttccg ataagatttg aaaacttttc cttatccgtg gatggtaaaa gtgtgcttga 1381 gaatattact ctttcctttg ttgaaggaat gaacgtcctt tacggtccaa gaggatctgg 1441 gaaatcttcc ctcctcagat caattgtcaa gttgaacacc gaaatattca atgagatttc 1501 cagaagcggt tccgtctatt tgttcgatca aaacgttgag gatcttgatg acatttacgt 1561 tagaaaaaac gctctgtatc tcgatacgag cttcatcgat gcgatgaacc aatacacatt 1621 cgatgaattt cttaaactcg ctctgaaaag aaagatttct ctggagaatt tttccgagaa 1681 actcgatgat cttggaatac tgagaatgct tattcgcggt caaaagaccc ccctttctgt 1741 gttttcgcct gccgaaaaga tctccctcct tcttttcatt ctcgagcaga aaaagcctcg 1801 cgttatttta atggactgtc tgctggatca cctggacgat gaaaatctgg agaagataat 1861 ggacatgttc ctcaagatga aggaagaacg aacttttgtc atatccaccc gcattcttca 1921 gaggtttctc tacatcgccg atttgttggt gatactgaat aatggtagaa tcaattatac 1981 ggggagtcca aaggattttg ttttgagaat gtgactcctc acttttgact tcgaggtgat 2041 cgtatggtcc tgacggtgac cctcaatccc gctttagaca gggaaatttt catagaagat 2101 ttccaggtga acaggctgta caggataaat gatctttcta aaactcaaat gtccccaggt 2161 ggaaagggta taaacgtttc tatagctctt tccaaattgg gagtgccgtc agtcgccacc 2221 ggattcgtgg gtgggtacat gggaaagatt cttgtggaag aattgagaaa aatttcaaag 2281 ctgatcacaa cgaattttgt ttacgtggag ggagaaacga gggaaaacat cgaaataatc 2341 gatgaaaaga ataaaacaat aaccgctatc aattttcccg gtccggatgt tacggatatg 2401 gatgtgaatc actttctgag 
aagatacaag atgacgcttt cgaaagtcga ttgcgttgtt 2461 atatcgggga gtattccacc gggagtaaac gagggaatat gcaacgaact ggtgagactc 2521 gcgcgtgaaa gaggagtctt tgtctttgtc gaacagactc ccagattgct tgaaagaatt 2581 tatgaggggc cagaatttcc caacgtcgtg aaaccagatt tgagaggtaa tcatgcttct 2641 tttcttggtg ttgatttgaa aacctttgac gattacgtga aactcgccga aaagctcgcc 2701 gaaaaatcgc aggtttccgt ggtgtcatac gaagtgaaga acgatattgt tgcaacaaga 2761 gaaggggtct ggcttataag atccaaagaa gaaatagaca catcacatct tctgggtgcg 2821 ggagacgcgt atgtagctgg tatggtctat tacttcatta aacacggagc caacttcctc 2881 gaaatggcaa aatttggttt tgcctccgca ctggctgcca ctagaagaaa ggaaaaatac 2941 atgcccgacc ttgaggcgat aaagaaggaa tacgatcact tcaccgtaga gagggtgaag 3001 taaatgaggg taaaggatgc cgtcatctat gatatttccg ctgtttttga agatgaaacg 3061 gttgaaacgg tgataaagct cctctccagg cagaatcttt ctggagttcc cgttgttgac 3121 catgatatga gggtggttgg ctttgtcagt gaaagcgacc tcataaaggc tcttgtaccc 3181 agttatttct cgcttttgag atctgcttcg ttcattccag acaccaacca gctgataaga 3241 aacgtagtca aaataaagga cagacccgta tctgaattta tgaacaagcc tcccgttgtt 3301 gtgaaagaag acgatccgct cattgtagca gctgattatc tcataagaca cggctttaaa 3361 tcgcttcctg ttgttgacga agctatgcag ctcgttggta tagtgagaag aatagatatt 3421 ctgagggtag tttcggaggg gaagctggaa atatgaaaac tgtcaggata gagaccttcg 3481 gctgtaaggt gaatcagtac gaaagtgaat acatggctga acagttagaa aaagccgggt 3541 acgtggtttt acccgacggg aatgctgctt actatatagt gaactcgtgt gccgttacca 3601 aggaagtcga gaagaaagtg aagagattga taaagagtat cagaaacagg aacaagaacg 3661 caaaaatcat cctcaccggt tgttttgccc agctttctcc ggatgaggca aagaatcttc 3721 cggtggacat ggttctggga atcgatgaga agaaacacat agtggatcat attaactcct 3781 tgaatggcaa acaacaggtt gttgtatctg aacccggccg gcccgtttac gagaaagtga 3841 aaggaagttt cgaagacaga acccgttctt acataaaggt tgaagacggg tgtgacaaca 3901 cctgcaccta ctgtgctatc aggttggccc gtggcacaag aatcagaagt aagccgctcg 3961 aaatattcaa agaggaattc gcggaaatgg taatgaaagg ttacaaagag atcgtgataa 4021 ccggggtcaa tcttggaaaa tatggaaagg acatgggttc ttcacttgct gaacttctaa 4081 aggtcatcga gaaagttcct ggggattatc 
gagtgcgtct cagctccata aatgtggaag 4141 acgtgaacga cgaaattgtg aaggctttca agagaaaccc cagactgtgt ccgcatcttc 4201 atatttcggt tcagagcgga tccgatgatg tgttgaaaag gatgggaaga aaatacaaaa 4261 tctctgattt catgcgtgtt gttgacaaat tgagaagtat cgatccagat ttttcaataa 4321 caacggacat aatcgttggt ttccctggag aaacggatgc cgacttccag agaactcttg 4381 aactggtgga gaaagtggag ttcagcagag tacacatttt ccgattctca ccacgccccg 4441 gaactccagc gagcagaatg gaaggcggtg ttcccgaatc gaagaagaaa gagcgtctgg 4501 atgttttaaa agagaaagcg aaagatgttt ctatcaggta caggaagaga atcattggga 4561 aagaaagaaa agtactcgct gagtggtacg ttatgaaagg tgttctttct ggctatgacg 4621 agtactacgt gaagcacgag tttgtaggaa atagagtggg agaatttcat agtgtgaggg 4681 tgaaatccct ttcagaagag ggagtgattt cctgtcgtgc tgatatggtg gaggggaaag 4741 ttccggcgcg cggatgaaat ttcactggat ttctctctgt ttgaaaagtc gctccaggga 4801 gcggtgtatg aaactctgag aacttacagt agggccccgt ttgcggctta caaacattac 4861 actaggctga aaagatctgc cgatttcttt aatcttcctc tttctctgag cttcgatgaa 4921 ttcacaaagg tgctgaaggc gggagctgat gagttcaaac aagaagtgag aatcaaagtc 4981 tatcttttcc ctgattccgg ggaggttctc tttgtgttct caccgctcaa tataccagac 5041 ctggagacag gagtcgaggt gaaaatatcg aacgttcgaa ggataccgga tctttctact 5101 ccaccagctc taaaaataac tggacgcacc gacatagtgc ttgcaagaag ggaaatagtg 5161 gactgctacg atgtgatttt gctcggtttg aacgggcagg tctgtgaggg aagtttcagc 5221 aatgtgtttc tggtgaaaga aggcaaattg attacaccat cgcttgacag cggtattctg 5281 gatggcatca cacgggaaaa cgttataaag ctcgcaaaaa gtctcgagat acctgttgaa 5341 gaaagagtag tgtgggtgtg ggaactgttc gaagcggatg aaatgtttct cacacacaca 5401 agtgcgggag tggttcctgt cagaaggttg aacgagcact cgtttttcga ggaagagcca 5461 ggacctgtga cagcgacttt gatggaaaac tttgagcctt ttgtcttaaa tctggaggaa 5521 aactgggtgg gaatatgatt cttctcagac agctgcttga tgagctttcc accaaacaac 5581 ctcttttttt tgaactgaag ataaagatgc tcctttctga atgggacaag atagttggac 5641 ctgtgatagc gaggcacacg aaggtggaaa aagtggagaa cgggacagtt tacattgttt 5701 gcgatgattc gctgtggatg acggaactca ccatgcagaa agatcgcttg ctcaaaattc 5761 tgaatgaaag atcgggcaaa gaactgttca gagatataaa 
gttcaggagg ggaaaagtag 5821 atggaaaagt actccgctga aagtataaag gttttaaaag gactggaacc tgtccgaatg 5881 cgacccggaa tgtacatagg atccacgggc aaacgtggat tgcatcacct cgtgtacgaa 5941 gtggttgaca acagtgttga tgaggcactc gctggatact gcgactggat acgtgtgact 6001 ctccatgaag atggaagtgt ggaagtcgag gacaacggaa ggggaatccc cgttgacata 6061 catccagaag agggaagaag cgctctggaa gtggttttca cagttcttca tgcgggcgga 6121 aaattctcca aggattccta caagataagc ggtggcctgc acggtgttgg tgtatcggtg 6181 gtgaacgctc tttcggagtg gctggaagtg cgggtacatc gtgacgggaa gatctacaga 6241 caaagatacg aaaggggtaa accggtcaca cctgtggaag tgataggaga aaccgataag 6301 cacggcacga tcgttcgatt caaacccgat cctctcatat tttcggagac agagttcgat 6361 cccgacatac tcgaacacag attgagagag atagctttcc tagtaccagg actcaaaata 6421 gaattcgaag acagaataaa cggagagaag aaaaccttca aattcgacgg aggcatagtg 6481 gagtatgtga agtatttgaa ccgtgggaaa aaagctctac acgatgtgat acacataaag 6541 agaacggaaa aagtgaaaac aaaaaatggt gaagacgaag tgatagtgga gatcgccttt 6601 caatacactg attcgtattc tgaagatatc gtgtcttttg caaacaccat aaaaactgtc 6661 gatggtggaa ctcatgttac agcctttaag agcaccctca caagactcat gaacgagtat 6721 gggaaaaaac ataattttct caagaaagac gattccttcc agggtgagga tgttagagaa 6781 ggcctcacgg cggttatcag tgtttacgtg aagaaccccg aattcgaagg gcaaacaaag 6841 tcaaaactcg gcaacgaaga agtcaaggaa gcggtaacca aggctatgag agaagagctg 6901 aaaaaaatat tcgacgcgaa tcccgagctc gtgaaaacaa ttctttcaaa gatcatgagc 6961 accaaacagg caagagaagc ggcaaagaga gctcgagaaa tggtgagaag gaaaaacgtt 7021 cttcaaaaca caacacttcc cggaaaactg gcagactgta gttccacaca tagagagaaa 7081 acggagcttt tcatcgtgga aggtgactct gccgggggat cggcaaaaca ggccagagac 7141 agagaatttc aggctgttct tcccataaga ggaaagattt tgaacgtgga aaaatcatct 7201 ctcgatagat tactgaaaaa cgagcagatc agtgacataa ttgttgcagt gggaacgggc 7261 ataggagatg atttcgacga aagcaagctc agatacggta ggattatcat catgaccgac 7321 gccgatatag acggtgcaca cataagaact ctccttctga cactcttcta tagatacatg 7381 agacccctca tcgaacaggg cagggtatac atagcactcc cacctctcta caggatcaaa 7441 gctggaagag aagagttcta cgtttacagt gaccaggagc tggcggaata 
caaagagaaa 7501 ctccagggaa aaaggatcga aattcaacgg tacaaaggac ttggcgagat gaatcctgag 7561 cagctctggg aaacaacgat gaatccagaa acgagaaaaa tcataagggt aactatagaa 7621 gatgccgaag aagcggatag actctttgaa attctcatgg gtaacgatcc atccagtaga 7681 agggagttca tagaaaggca cgctctgaaa gtgaaagaac tggatatcta gtcatgagat 7741 ccctgagagt tcttctcatc acaatgattg ttctttacat tctgttcttt gtgaattctt 7801 tttttcaaag tcgtagagaa cacgtgaggg ttccagaaaa agtgtattca tatttgatag 7861 aaaattttaa catttctcca aaaagtatta taatagattc gaaaaaggct atagggatag 7921 ttttctatga agggaattat tatttatgcg ctgaagatgg gagtttggta gcttcgttat 7981 cgaaaaagga cctattcaaa ttctacccag tgtttctgga agttaatctt gaagggcttc 8041 gactttcaaa gagcgacagg gaaattctgg agatgttgat tccgattctg aaaagttctg 8101 tagtgtccgc tgtttttttt gagtcaaaag aagttgttct gctgaaggga tccagaatca 8161 tgtttgaaga atggaaagat cttgtcgaga actttcaggt aataatggag caatctgaga 8221 aaatgaaagc gaaggaaagg tatttcctga cggatgatgg aagattgatg tggataaggg 8281 gtgattgact tgtcaaaaac tgtcttttat acctcgattg atataggttc acgatatata 8341 aaaggactcg ttctcggaaa acgtgatcaa gaatgggaag cactggcctt ttcaagtgtg 8401 aaatcgagag gattggatga aggtgaaata aaagacgcca tcgctttcaa agaatctgtg 8461 aacacactgc ttaaagaact cgaagaacag ctgcaaaagt ctctgagaag tgattttgtc 8521 atttctttca gcagcgtgag ttttgaaaga gaagacaccg tgatcgaaag ggacttcggt 8581 gaagagaaac gatctatcac cctggacatc ttgagtgaga tgcaatcgga agctcttgag 8641 aaattgaaag agaacgggaa aactcctctt cacatatttt cgaaaagata tctcctggac 8701 gacgaaagga tcgttttcaa tcctctcgat atgaaggctt cgaagatagc catcgagtac 8761 acatccattg tcgtaccttt gaaagtttat gagatgttct acaatttctt gcaggatacc 8821 gttaaaagtc cgtttcagtt gaagtcttct cttgtttcaa ctgccgaggg agttctcacc 8881 acgcctgaga aagaccgagg tgtggtggtt gtgaacctgg gatacaattt caccgggttg 8941 atagcgtaca aaaacggtgt tcccataaag atatcctacg ttcctgtggg gatgaagcat 9001 gttataaaag atgtttctgc agttctcgac acctcctttg aagaatcgga aagactcatt 9061 ataactcatg gaaacgcagt ttataacgat ttgaaagagg aggaaataca gtacagaggg 9121 ctcgatggaa acaccataaa aacgactact gcgaagaaac tttccgtcat tatacacgca 
9181 cgtctcagag agataatgag caaatccaag aaattcttca gggaagttga agcgaagata 9241 gtagaagaag gggaaatagg gatacctggt ggtgtggttc tcaccggggg aggggccaag 9301 attccaagaa taaacgaact ggccacggag gtgtttaaat ctccggtgag aacgggttgt 9361 tacgcgaatt ccgacagacc atccatcatc aatgcggatg aagttgccaa tgacccttcc 9421 ttcgctgctg cgtttggaaa tgttttcgcc gtttcggaaa atccttatga ggaaactccg 9481 gtgaaaagtg aaaatccgct gaagaaaatt ttcagactct tcaaagaatt gatggagtga 9541 gtaggaggtt aagagtatgg gatttgacct tgatgttgag aagaaaaagg aaaatagaaa 9601 catccctcag gcgaacaatt tgaagataaa agtcataggt gtaggcggtg ccggaaacaa 9661 tgccataaac agaatgatag aaataggaat acacggtgtt gaatttgtcg cggtgaacac 9721 cgatcttcag gtactggagg cttctaacgc tgacgtgaag attcagatcg gtgagaacat 9781 cacacgaggc cttggtgcag ggggacgtcc cgagatagga gaacaagctg ctcttgagag 9841 cgaggaaaaa atcagagaag tgctccagga tacccacatg gtcttcataa ctgcagggtt 9901 cggaggaggg acgggaacag gtgcttcccc tgtcatagca aagatagcca aagaaatggg 9961 aatcctgact gtagccatag tgacgactcc cttctatttc gaaggcccgg agagactcaa 10021 aaaagcgata gaggggctca aaaaacttcg aaaacatgtg gataccctca taaagatatc 10081 caacaacaaa ctcatggaag aactcccgag ggatgtcaag ataaaggatg cttttctgaa 10141 ggctgacgag actcttcacc agggagtgaa aggaatttcg gaactgataa cgaagagggg 10201 ttacataaat ctcgactttg cagacataga gtcggtaatg aaggacgccg gtgctgcgat 10261 ccttggcata ggtgttggaa agggagaaca cagagcgaga gaagctgcta agaaggctat 10321 ggaaagcaag ttgatagagc atcctgtaga aaatgcgagt tcgattgtgt tcaatataac 10381 tgctcccagc aatataagaa tggaggaagt gcacgaagca gctatgatca taaggcagaa 10441 cagcagtgag gacgcggacg tcaaattcgg tctcatcttc gatgatgaag tacccgatga 10501 cgaaatacgt gtgatcttca tagctacaag attcccagat gaagacaaga ttctattccc 10561 ggaaggtgac ataccggcca tttacagata cggtctggag gggcttcttt aatgctgaga 10621 agatacagaa aattgggtga aattcttctg gaaaaaggat tcataaccag agaagagttg 10681 gataaggctc ttgaaattca aaaagaggaa aggaaaccgc tcggagaagt tctaattgag 10741 acagggtata tcacggagga tcagctgctg gaggctttga gtgaacaata cggagtccca 10801 atcctgaagg aactaccaaa aaacatacca ctcaacgtgg tgggttctat 
tccaaagaac 10861 atcatcgaat ctctgcacgt cattcctgta gaaaagaagg aagacggaac tctcgtggtg 10921 gtgacggata acggaacgaa cattccaagg atcaaacagg agatacgttt tctcacgggg 10981 aagaatccgg agatttatct ggtgacgagc agagattttt ctgttttgta tcaaacgtat 11041 gttcttggtg ttcctctcga acttttcgag gagccttacg ttgctataga agaaacacca 11101 gaacaggttg aggaagaaga ggaagaagaa agggaagtag aagaagctcc catcgtccgg 11161 ctcgtaaaca atatcgtaaa ccgtgccata gaaatgggag cgagtgatat ccatatagaa 11221 ccgatgaaaa gaaccgtgag ggtgagattc agaatagatg gtattttgag aaaagttctt 11281 gagtatcaaa agccacagca caactcagtt gtggcacgaa ttaaaattat gagtggtctc 11341 gatgtgtcgg aaagaagact tccacaggac gggaagtttt acacgataaa aggcggggaa 11401 cagtacgatt ttcgcgtgtc aaccatgcct tccactttcg gagaaaaagt cgtgatgaga 11461 atactgaagg tgtccgatgc aaacaaacgc ctcgaggaac ttggatacag tgaatacaat 11521 ctgaaacgaa ttctatcact gttagaaaaa ccctacggta tcatcctggt caccggacca 11581 acgggaagcg gaaaatccac cacacttgtc gctatgataa attatttgaa gagtgaaagt 11641 gtaaacatag taacagcgga agaccctgtg gaatacacca tcgaaggggt tacccaatgt 11701 caggttttcc cagaaatagg tctcacgttt gcacggtact tgagagcttt cctcaggcag 11761 gaccctgaca tcatcatggt gggtgagatc agagacagag aaaccgcaca gcttgcggtg 11821 gaagcttccc tcacgggaca cttagttctg agtacacttc acaccaacac cgctgctggt 11881 gctgtttcaa ggctcataga aatgggaata gatcctcacc ttctcggagc ctcccttatc 11941 ggaatcatag gtcagaggct ggtgagaaag ctctgtgatg agtgtaaaat gcctggagag 12001 gttcgagatg aacaggtaaa atcttatttc gaacaattct ttggaaaggt accagatcag 12061 atttactatc cttcagaaga aggatgtcct gcctgcaagg ggatgagata cagagggaga 12121 atggctatag gggaagtttt gattgtggat gaagagttga gagaattgat ctcttccaaa 12181 gcgagtgaaa cagagattgc aaagctcgca gtgaaaaaag gtatgcgcac catgttccag 12241 gatgccctcg aaaaggttct tcttgggcag acgagtatcg aagaggtttt cagggtgacc 12301 acaccgctat gaagtacttt tctaaaagcc ctgaagaatg gcagaagaga gacgaagaaa 12361 aggaaaaaga gttcagaaac agaaaaaata gaattcgccg cagggcgaat tttattcttg 12421 tcgtgaatct tgtaatagtg gttttcctgg tgtttttcac gaaagcgttt ttttccaaca 12481 aaccggaagg tgtgattgga cccttccagc 
tggtgataga aacaaaagag tcctaccttc 12541 caaacgatcc cctcgatgtg cgtgtaaaag tgttcaacag agagaagaaa aaagaaaatc 12601 ttgtgctgga agatttcgta ttttccatca aaagggagaa tgacacggtg tacgagtttc 12661 actttccaca gcgtgtggaa aaagagatgg aagctttcga aagtgtcctg gtttttgatc 12721 ttttgaggga ggaagagctg tcaaatctac ctggagggaa ttacacgatc acagtctctg 12781 tcaagttgaa cggccagagg gtggttatca gtaaggtcgt ttcggtgatt gaaaagtggc 12841 aaatcgaggt tgaagacctg aaagattttt atttccctta cgaaaacgtg catttctttg 12901 tctatctgga aaacataagc tcaaagagtc ggaaaattcg tgtggagtcg atcggcctta 12961 ttatccttaa agggaatgag gctgtctttg aaagagatat tcctatagaa aaagacttcg 13021 tgataaatcc gatgatggtc gaacagatcc acgaagtgag tttttccgcc ccaaaagaat 13081 cgggagatta catcatcaaa ttgaaattga agacagaaag cagtctgata gaaaaatcaa 13141 ttcccctttt cgtcacgcga gagtatcaaa aagatttgaa gggactctcc cttgtcatcg 13201 aagggaagaa gtttgtagca tccggtgaaa gatacgattt ctctgtgaag cttttgaacg 13261 aagaaaagaa acgaaagtac atagttttga aaaacatcat gattgttttg acgcacaaag 13321 aaccggtgtt ttcgtatgct tattctgaag agtacagaat gacgatcgaa ggatactcct 13381 caagggaaat ttttaagacg accagttacg atatcataaa gctcgaggat cccgggactt 13441 acaaactcat cgttgtgata gaaagtgaag aagacaggct gatgaaggag atggagatag 13501 tcgtttcaga gtgaggtggg accataccac acgagaacaa acgaatagac t //
20000521 13.56 Leszek: Dear Alexander, I cannot send you the DNA sequences for the 3 new targets yet because the BLAST search at NCBI is not working. I will try again later. Yours, Leszek . . .
20000521 15.01 Leszek: T0092
>emb|U32717|HI32717 Haemophilus influenzae Rd section 32 of 163 of the complete genome.
Length = 11661
Score = 486 bits (1238), Expect = e-135
Identities = 241/241 (100%), Positives = 241/241 (100%)
Frame = +3

Query: 1    MVKDTLFSTPIAKLGDFIFDENVAEVFPDMIQRSVPGYSNIITAIGMLAERFVTADSNVY 60
            MVKDTLFSTPIAKLGDFIFDENVAEVFPDMIQRSVPGYSNIITAIGMLAERFVTADSNVY
Sbjct: 4635 MVKDTLFSTPIAKLGDFIFDENVAEVFPDMIQRSVPGYSNIITAIGMLAERFVTADSNVY 4814

Query: 61   DLGCSRGAATLSARRNINQPNVKIIGIDNSQPMVERCRQHIAAYHSEIPVEILCNDIRHV 120
            DLGCSRGAATLSARRNINQPNVKIIGIDNSQPMVERCRQHIAAYHSEIPVEILCNDIRHV
Sbjct: 4815 DLGCSRGAATLSARRNINQPNVKIIGIDNSQPMVERCRQHIAAYHSEIPVEILCNDIRHV 4994

Query: 121  EIKNASMVILNFTLQFLPPEDRIALLTKIYEGLNPNGVLVLSEKFRFEDTKINHLLIDLH 180
            EIKNASMVILNFTLQFLPPEDRIALLTKIYEGLNPNGVLVLSEKFRFEDTKINHLLIDLH
Sbjct: 4995 EIKNASMVILNFTLQFLPPEDRIALLTKIYEGLNPNGVLVLSEKFRFEDTKINHLLIDLH 5174

Query: 181  HQFKRANGYSELEVSQKRTALENVMRTDSIETHKVRLKNVGFSQVELWFQCFNFGSMIAV 240
            HQFKRANGYSELEVSQKRTALENVMRTDSIETHKVRLKNVGFSQVELWFQCFNFGSMIAV
Sbjct: 5175 HQFKRANGYSELEVSQKRTALENVMRTDSIETHKVRLKNVGFSQVELWFQCFNFGSMIAV 5354

Query: 241  K 241
            K
Sbjct: 5355 K 5357

ID   HI32717 standard; DNA; PRO; 11661 BP.
XX
AC   U32717; L42023;
XX
SV   U32717.1
XX
DT   09-AUG-1995 (Rel. 44, Created)
DT   15-JUN-1998 (Rel. 56, Last updated, Version 9)
XX
DE   Haemophilus influenzae Rd section 32 of 163 of the complete genome.
XX
KW   .
XX
OS   Haemophilus influenzae Rd
OC   Bacteria; Proteobacteria; gamma subdivision; Pasteurellaceae; Haemophilus;
OC   Haemophilus influenzae.
XX
RN   [1]
RP   1-11661
RX   MEDLINE; 95350630.
RA   Fleischmann R.D., Adams M.D., White O., Clayton R.A., Kirkness E.F.,
RA   Kerlavage A.R., Bult C.J., Tomb J., Dougherty B.A., Merrick J.M.,
RA   McKenney K., Sutton G.G., FitzHugh W., Fields C.A., Gocayne J.D.,
RA   Scott J.D., Shirley R., Liu L.I., Glodek A., Kelley J.M., Weidman J.F.,
RA   Phillips C.A., Spriggs T., Hedblom E., Cotton M.D., Utterback T.,
RA   Hanna M.C., Nguyen D.T., Saudek D.M., Brandon R.C., Fine L.D.,
RA   Fritchman J.L., Fuhrmann J.L., Geoghagen N.S., Gnehm C.L., McDonald L.A.,
RA   Small K.V., Fraser C.M., Smith H.O., Venter J.C.;
RT   "Whole-genome random sequencing and assembly of Haemophilus influenzae Rd";
RL   Science 269(5223):496-512(1995).
XX
RN   [2]
RP   1-11661
RX   MEDLINE; 96398784.
RA   Tatusov R.L., Mushegian A.R., Bork P., Brown N.P., Hayes W.S.,
RA   Borodovsky M., Rudd K.E., Koonin E.V.;
RT   "Metabolism and evolution of Haemophilus influenzae deduced from a
RT   whole-genome comparison with Escherichia coli";
RL   Curr. Biol. 6(3):279-291(1996).
XX
RN   [3]
RP   1-11661
RA   White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D.;
RT   ;
RL   Submitted (25-JUL-1995) to the EMBL/GenBank/DDBJ databases.
RL   The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD
RL   20850, USA
XX
RN   [4]
RP   1-11661
RA   White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D.;
RT   ;
RL   Submitted (27-SEP-1997) to the EMBL/GenBank/DDBJ databases.
RL   The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD
RL   20850, USA
XX
RN   [5]
RP   1-11661
RA   White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D., Peterson J.,
RA   Hickey E., Dodson R., Gwinn M.;
RT   ;
RL   Submitted (28-MAY-1998) to the EMBL/GenBank/DDBJ databases.
RL   The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD
RL   20850, USA
XX
DR   SWISS-PROT; P43771; EFP_HAEIN.
DR   SWISS-PROT; P43817; SYD_HAEIN.
DR   SWISS-PROT; P43984; Y318_HAEIN.
DR   SWISS-PROT; P43985; YECO_HAEIN.
DR   SWISS-PROT; P43987; Y326_HAEIN.
DR   SWISS-PROT; P44633; RUVC_HAEIN.
DR   SWISS-PROT; P44634; YEBC_HAEIN.
DR SWISS-PROT; P44635; NTPA_HAEIN. DR SWISS-PROT; P44638; LGUL_HAEIN. DR SWISS-PROT; P44639; RNT_HAEIN. DR SWISS-PROT; P44640; Y325_HAEIN. DR SWISS-PROT; P44641; YJEK_HAEIN. DR SWISS-PROT; Q57122; Y322_HAEIN. DR SWISS-PROT; Q57534; Y321_HAEIN. XX FH Key Location/Qualifiers FH FT source 1..11661 FT /db_xref="taxon:71421" FT /organism="Haemophilus influenzae Rd" FT CDS complement(62..634) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44633" FT /note="similar to GB:D10165 SP:P24239 PID:216653 PID:42175 FT GB:U00096 percent identity: 78.53; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0314" FT /product="crossover junction endodeoxyribonuclease (ruvC)" FT /protein_id="AAC21978.1" FT /translation="MSIILGIDPGSRVTGYGVIRQTGKHLEYLGSGAIRTQVEDLPTRL FT KRIYAGVTEIITQFQPNMFAIEQVFMAKNADSALKLGQARGTAIVAAVNNDLPVFEYAA FT RLVKQTVVGIGSADKVQVQEMVTRILKLSDKPQADAADALAIAITHAHSIQHSLHIANS FT VKMTETQEKMTALLKTRYSRGRFRLKI" FT CDS complement(681..1421) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44634" FT /note="similar to GB:D10165 SP:P24237 PID:216652 PID:42173 FT GB:U00096 percent identity: 75.20; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0315" FT /product="conserved hypothetical protein" FT /protein_id="AAC21979.1" FT /translation="MAGHSKWANIKHRKAAQDAQRGKIFTKLIRELVTAAKIGGGDVSA FT NPRLRAAVDKALSNNMTRDTINRAIDRGVGGGDDTNMETKIYEGYGPGGTAVMVECLSD FT NANRTISQVRPSFTKCGGNLGTEGSVGYLFSKKGLILIAEADEDALTEAAIEAGADDIQ FT PQDDGSFEIYTAWEDLGSVRDGIEAAGFKVQEAEVTMIPSTTVDLDIETAPKLLRLIDM FT LEDCDDVQNVYHNGEICDEVASQL" FT CDS complement(1581..2057) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44635" FT /note="similar to GB:D10165 SP:P24236 PID:581152 PID:669113 FT PID:912430 percent identity: 48.95; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0316" FT /product="datP pyrophosphohydrolase (ntpA)" FT /protein_id="AAC21980.1" FT /translation="MRSDLTAFLMMQYKNNQSVLVVIYTKDTNRVLMLQRQDDPDFWQS FT 
VTGTIESDETPKKTAIRELWEEVRLDISENSTALFDCNESIEFEIFPHFRYKYAPNITH FT CKEHWFLCEVEKEFIPVLSEHLDFCWVSAKKAVEMTKSQNNAEAIKKYLFNLRR" FT CDS complement(2079..3845) FT /codon_start=1 FT /db_xref="SWISS-PROT:P43817" FT /note="similar to SP:P21889 GB:X53863 GB:X53984 PID:41015 FT PID:43085 percent identity: 76.24; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0317" FT /product="aspartyl-tRNA synthetase (aspS)" FT /protein_id="AAC21981.1" FT /translation="MMRTHYCGALNRNNIGQDVTLSGWVHRRRDLGGLIFIDMRDRDGI FT VQVCFDPKYQDALTAAAGLRNEFCIQIKGEVIARPDNQINKNMATGEVEVLAKELRIYN FT ASDVLPLDFNQNNTEEQRLKYRYLDLRRPEMAQRLKTRAKITSFVRRFMDDNGFLDIET FT PMLTKATPEGARDYLVPSRVHKGKFYALPQSPQLFKQLLMMSGFDRYYQIVKCFRDEDL FT RADRQPEFTQIDVETSFLTAPEVREIMERMVHGLWLDTIGVDLGKFPVMTWQEAMRRFG FT SDKPDLRNPLEMVDVADIVKDVEFKVFNEPANNPNGRVAVIRVPNGAEITRKQIDEYTQ FT FVGIYGAKGLAWAKVNDINAGLEGVQSPIAKFLNEDVWKGLAERVNAQTGDILFFGADK FT WQTTTDAMGALRLKLGRDLGLTRLDEWQPLWVIDFPMFERDEEGNLAAMHHPFTSPKDF FT SPEQLEADPTSAVANAYDMVINGYEVGGGSVRIFDPKMQQTVFRILGIDEEQQREKFGF FT LLDALKFGTPPHAGLAFGLDRLTMLLTGTENIRDVIAFPKTTAAACLMTEAPSFANPQA FT LEELAISVVKAE" FT CDS 4064..4582 FT /codon_start=1 FT /db_xref="SWISS-PROT:P43984" FT /note="similar to SP:P54158 PID:1256620 GB:AL009126 percent FT identity: 23.78; identified by sequence similarity; FT putative" FT /transl_table=11 FT /gene="HI0318" FT /product="conserved hypothetical protein" FT /protein_id="AAC21982.1" FT /translation="MLFINITFACILAIRFYSLSISIRHEKALIAKGAIQYGKRNSTLL FT SIAHVAFYFAAIIEANKQNLSFNSTSQIGLAILIFAIAMLFYVIYELKEIWTVKIYILP FT EHQINRSFLFKYVRHPNYFLNIIPELIGLSLFCQAKYTALVGLPIYLLILAVRIKQEES FT AMSHLFPKS" FT CDS 4635..5360 FT /codon_start=1 FT /db_xref="SWISS-PROT:P43985" FT /note="similar to GB:U00096 PID:1788177 percent identity: FT 70.71; identified by sequence similarity; putative" FT /transl_table=11 FT /gene="HI0319" FT /product="conserved hypothetical protein" FT /protein_id="AAC21983.1" FT /translation="MVKDTLFSTPIAKLGDFIFDENVAEVFPDMIQRSVPGYSNIITAI FT 
GMLAERFVTADSNVYDLGCSRGAATLSARRNINQPNVKIIGIDNSQPMVERCRQHIAAY FT HSEIPVEILCNDIRHVEIKNASMVILNFTLQFLPPEDRIALLTKIYEGLNPNGVLVLSE FT KFRFEDTKINHLLIDLHHQFKRANGYSELEVSQKRTALENVMRTDSIETHKVRLKNVGF FT SQVELWFQCFNFGSMIAVK" FT CDS 5454..5690 FT /codon_start=1 FT /db_xref="SWISS-PROT:Q57534" FT /note="similar to GB:L22308 GB:L31763 GB:M74565 PID:145069 FT PID:438535 percent identity: 42.37; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0321" FT /product="virulence associated protein B (vapB)" FT /protein_id="AAC21984.1" FT /translation="MLTKVFQSGNSQAVRIPMDFRFDVDTVEIFRKENGDVVLRPVSKK FT TDDFLALFEGFDETFIQALEARDDLPPQERENL" FT CDS 5687..6091 FT /codon_start=1 FT /db_xref="SWISS-PROT:Q57122" FT /note="similar to GB:L22308 GB:L31763 GB:M74565 PID:145068 FT PID:438534 percent identity: 35.38; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0322" FT /product="virulence associated protein C, putative" FT /protein_id="AAC21985.1" FT /translation="MIYMLDTNIIIYLMKNRPKIIAERVSQLLPNDRLVMSFITYAELI FT KGAFGSQNYEQSIRAIELLTERVNVLYPNEQICLHYGKWANTLKKQGRPIGNNDLWFAC FT HALSLNAVLITHNVKEFQRITDLQWQDWTK" FT CDS 6158..6565 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44638" FT /note="similar to GB:U00096 SP:Q59384 PID:1354845 FT PID:1711245 PID:1787940 percent identity: 74.07; identified FT by sequence similarity; putative" FT /transl_table=11 FT /gene="HI0323" FT /product="lactoylglutathione lyase (gloA)" FT /protein_id="AAC21986.1" FT /translation="MQILHTMLRVGDLDRSIKFYQDVLGMRLLRTSENPEYKYTLAFLG FT YEDGESAAEIELTYNWGVDKYEHGTAYGHIAIGVDDIYATCEAVRASGGNVTREAGPVK FT GGSTVIAFVEDPDGYKIEFIENKSTKSGLGN" FT CDS 6639..7328 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44639" FT /note="similar to GB:L01622 SP:P30014 PID:147688 GB:U00096 FT PID:1742725 percent identity: 66.18; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0324" FT /product="ribonuclease T (rnt)" FT /protein_id="AAC21987.1" FT 
/translation="MSDSQEIPYHNQLKNRFRGYFPVIIDVETAGFDAKKDALLELAAI FT TLKMDENGYLHPDQKCHFHIKPFEGANINPESLKFNGIDIHNPLRGAVSELDAITGLFQ FT MVRRGQKDADCQRSIIVAHNAAFDQSFVMAAAERTGVKRNPFHPFGMFDTASLAGLMFG FT QTVLVKACQAAKIPFDGKQAHSALYDTERTAKLFCYMVNHLKDLGGFPHIASELEQEKT FT TEKETAL" FT CDS 7642..8994 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44640" FT /note="similar to GB:AL009126 percent identity: 52.27; FT identified by sequence similarity; putative" FT /transl_table=11 FT /gene="HI0325" FT /product="conserved hypothetical protein" FT /protein_id="AAC21988.1" FT /translation="MLSNPVVISIIVLLALSLLRINVIIALVIAALTAGFIGDLGLTKT FT IETFTGGLGGGAEVAMNYAILGAFAIAISKSGITDLIAYKIITKMNKTPTAGNLTWFKY FT FIFAVLALFAISSQNLLPVHIAFIPIVVPPLLSIFNRLKIDRRAVACVLTFGLTATYIL FT LPVGFGKIFIESILVKNINQAGATLGLQTNVAQVSLAMLLPVIGMILGLLTAIFITYRK FT PREYNINVEEATTKDIEAHIANIKPKQIVASLIAIVATFATQLVTSSTIIGGLIGLIIF FT VLCGIFKLKESNDIFQQGLRLMAMIGFVMIAASGFANVINATTGVTDLVQSLSSGVVQS FT KGIAALLMLVVGLLITMGIGSSFSTVPIITSIYVPLCLSFGFSPLATISIVGVAAALGD FT AGSPASDSTLGPTSGLNMDGKHDHIWDSVVPTFIHYNIPLLVFGWIAAMYL" FT CDS 9027..9290 FT /codon_start=1 FT /db_xref="SWISS-PROT:P43987" FT /note="hypothetical protein; identified by GeneMark; FT putative" FT /transl_table=11 FT /gene="HI0326" FT /product="H. 
influenzae predicted coding region HI0326" FT /protein_id="AAC21991.1" FT /translation="MTVQQLIQRLDQKVQQLYQAHLSKREEKIFAKFDRTLFSENGQNV FT SFYLKEINQTLDRIKTLESNDSNHYNFLAERLLANVPFFRKL" FT CDS complement(9964..10530) FT /codon_start=1 FT /db_xref="SWISS-PROT:P43771" FT /note="similar to GB:U14003 SP:P33398 GB:X61676 PID:433670 FT PID:536991 percent identity: 75.00; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0328" FT /product="elongation factor P (efp)" FT /protein_id="AAC21989.1" FT /translation="MATYTTSDFKPGLKFMQDGEPCVIVENEFVKPGKGQAFTRTRIRK FT LISGKVLDVNFKSGTSVEAADVMDLNLTYSYKDDAFWYFMHPETFEQYSADAKAVGDAE FT KWLLDQADCIVTLWNGAPITVTPPNFVELEIVDTDPGLKGDTAGTGGKPATLSTGAVVK FT VPLFVQIGEVIRVDTRSGEYVSRVK" FT CDS 10568..11584 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44641" FT /note="similar to GB:U14003 SP:P39280 PID:536990 GB:U00096 FT PID:1790589 percent identity: 62.04; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0329" FT /product="conserved hypothetical protein" FT /protein_id="AAC21990.1" FT /translation="MRILPQEPVIREEQNWLTILKNAISDPKLLLKALNLPEDDFEQSI FT AARKLFSLRVPQPFIDKIEKGNPQDPLFLQVMCSDLEFVQAEGFSTDPLEEKNANAVPN FT ILHKYRNRLLFMAKGGCAVNCRYCFRRHFPYDENPGNKKSWQLALDYIAAHSEIEEVIF FT SGGDPLMAKDHELAWLIKHLENIPHLQRLRIHTRLPVVIPQRITDEFCTLLAETRLQTV FT MVTHINHPNEIDQIFAHAMQKLNAVNVTLLNQSVLLKGVNDDAQILKILSDKLFQTGIL FT PYYLHLLDKVQGASHFLISDIEAMQIYKTLQSLTSGYLVPKLAREIAGEPNKTLYAE" XX SQ Sequence 11661 BP; 3571 A; 2316 C; 2104 G; 3670 T; 0 other; attcttccga atgataaaaa aattaacgca tagaataaaa taaaactgga cgaacatcca 60 gttaaatttt taatctaaag cgtcctcggc tatatctggt ctttaaaagt gcggtcattt 120 tttcttgcgt ttctgtcatt ttcacagaat tggcaatatg taaagaatgt tgaatagaat 180 gcgcgtgtgt aatcgcaata gccaacgcat ccgctgcatc ggcttgaggt ttatctgaca 240 acttcaaaat acgagtcacc atttcttgca cttgcacttt atcagcggaa ccaatgccca 300 ccactgtttg ttttactaaa cgtgcggcat attcaaaaac aggtaaatca ttatttaccg 360 ctgcaacaat cgccgtgccg cgagcctgcc caagtttcaa tgctgaatcc gcattcttcg 420 
ccataaacac ttgctcaatc gcaaacatat taggttgaaa ttgcgtgatg atttcagtta 480 ctccagcata aatgcgtttc aaacgggtgg gtaaatcttc aacttgagta cgaattgcgc 540 cactaccgag atattctaaa tgttttcctg tttgacgaat cacaccgtaa ccagttacgc 600 gagagcctgg gtcaatacct aaaataatgc tcataatttc tataaaaaat gggtaattaa 660 attacccatt tattgcatca ttaaagttga gatgccactt cgtcacagat ttctccgttg 720 tgatatacgt tttgtacgtc atcacaatct tccaacatat caattaaacg aagtaatttt 780 ggtgcagttt caatatcaag atcgacggtt gttgatggaa tcatcgttac ttcagcttct 840 tgtactttaa aaccagctgc ttcaatgcca tcgcgcactg aacctaaatc ttcccaagca 900 gtatagattt caaatgaacc atcatcttgt ggttgaatat catcagcacc cgcttcgatt 960 gccgcttcag ttaaggcatc ttcatccgct tctgcgatta agattaaacc ttttttgcta 1020 aataaataac ccacagaacc ttctgttccc aagttaccac cacatttagt aaaacttggg 1080 cgtacttgtg agatcgtacg gtttgcatta tcacttaaac attccaccat aaccgctgta 1140 ccgcctgggc cataaccttc atagattttg gtttccatat tggtatcatc gccaccgcct 1200 acaccacgat caatagcgcg gttgatagta tcgcgcgtca tattgttgct aagtgcttta 1260 tctactgctg cacgtaaacg tgggttagca ctcacatcgc caccaccaat tttagctgcg 1320 gtaacaagtt cacgaattaa tttagtaaaa attttaccgc gttgtgcatc ttgtgctgct 1380 ttgcggtgtt taatattagc ccacttacta tgacctgcca ttttatttcc ttctgttttg 1440 atcttattga tcttgattta attaaacgtt ttgacctagg ctaggtatag cctaggttag 1500 taaaagtgtt accccgttgg gagtggatgt agcttttccc aaaagagaac cgctatcacc 1560 ataattagct gtgattatac ttacctacgg agattaaaaa gatatttttt aatagcttcc 1620 gcattattct gggatttcgt catttctact gcttttttcg ccgacaccca gcaaaaatct 1680 aaatgctcac tcaatactgg aataaattct ttttccacct cacacaaaaa ccaatgttct 1740 ttgcagtgag taatattcgg tgcgtattta tagcggaaat gtggaaaaat ttcaaattct 1800 atgctctcat tgcaatcaaa aagtgcggtg gaattttccg aaatgtctaa ccgcacttct 1860 tcccatagct cacgaattgc tgtttttttt ggtgtctcat cactttcaat agtgccagta 1920 acagactgcc aaaaatcagg atcatcttgg cgttggagca ttaaaactct gtttgtatct 1980 ttagtgtaaa ttacaacgag aacagattga ttatttttgt attgcatcat taaaaatgcg 2040 gtcaaatctg accgcacttc attcctaata tttcaaactt attctgcttt aacgacagaa 2100 attgccaact 
cttccaatgc ttgcggattt gcaaaactcg gggcttccgt cattaagcac 2160 gctgctgccg tggttttcgg aaaggcaatc acatcacgaa tgttttccgt gcctgttaaa 2220 agcatggtta agcggtctaa accgaatgct aaaccagcat gtggtggagt accaaatttt 2280 aacgcatcta ataagaaacc gaatttctct cgttgttgtt cttcgtcaat accaagaata 2340 cggaacacag tttgttgcat tttcggatca aaaatacgca cagaaccacc acccacttcg 2400 tagccgttga tgaccatatc gtaagcattg gctaccgcac ttgttggatc agcctctaat 2460 tgctctgggc tgaaatcttt tggtgaagtg aatggatggt gcattgctgc aagattacct 2520 tcttcatcac gttcaaacat tgggaaatca atgacccaaa gcggttgcca ttcgtctaaa 2580 cgagttaagc caagatcacg acctaatttc aaacgtaacg cacccattgc atcagtggta 2640 gtttgccatt tgtctgcacc aaagaataaa atatcgccag tttgtgcatt cacacgttct 2700 gctaaccctt tccatacatc ttcatttaag aatttcgcaa tcggactttg cacgccttca 2760 agaccagcat taatatcgtt tacttttgcc caagccaaac ctttcgcacc gtagatgcct 2820 acaaattgtg tatattcatc aatttgttta cgagtaattt ctgcaccatt tggtactcga 2880 atgactgcaa cacggccatt tggattgttt gctggctcat taaatacttt aaattcaaca 2940 tctttgacaa tatctgctac atctaccatt tctaatggat tacgcaaatc tggcttatca 3000 gaaccaaaac gacgcattgc ttcttgccaa gtcatcactg ggaatttacc taagtccaca 3060 ccaatggtat caagccataa gccgtgcacc atgcgttcca taatttcgcg gacttctggc 3120 gcagttagga aagaagtttc cacatcgatt tgagtaaact caggctgacg atctgctcgt 3180 aaatcttcat cacggaaaca ttttacgatt tgataatagc ggtcaaaacc agacatcatt 3240 aaaagctgtt tgaaaagctg tggtgattgc ggcaatgcat agaatttgcc tttatgcaca 3300 cggcttggca ctaaatagtc gcgcgcgcct tctggcgttg ctttggttag cattggggtt 3360 tcaatatcaa gaaaaccatt atcatccata aagcgacgca caaagctggt gatttttgca 3420 cgggttttca aacgctgcgc catttcagga cgacgtaaat ctaaataacg gtattttaaa 3480 cgttgttctt cggtgttatt ttggttgaag tctaatggta aaacgtcaga agcattgtag 3540 atgcgtaatt ctttcgctaa cacttccact tcgcctgttg ccatattttt attaatttga 3600 ttatcaggac gagcgatcac ctcgccttta atttgaatac agaattcatt acgtaaccca 3660 gcagccgctg tcaatgcatc ttgatattta ggatcaaaac aaacttgcac aataccatca 3720 cgatcgcgca tatcaataaa aatcaagcca cctaaatcac gacggcgatg aacccaaccg 3780 cttaatgtta cgtcttgtcc 
gatattgtta cggtttaatg ctccgcaata atgtgtacgc 3840 atcatttcaa aatcctttgt actcttttaa ttatgctcaa aaatctgtcg caattatagc 3900 ggaaaaccaa gagagataaa aaggttattt atgctgacaa atcaatcaaa aaagcaaaga 3960 aaagtaataa atattcaaaa taaaggcttt tttgtgccat atcatcagat tttttgttat 4020 attatgccgc ttgttgagat taaattacgt taaggataac cacatgttat ttatcaatat 4080 tacttttgcc tgtattttag cgatccgttt ttacagtctg tctatttcaa ttcgtcatga 4140 aaaagcactc attgcaaagg gtgcaataca gtatggtaaa cgcaattcca cactgttgtc 4200 cattgcccat gttgcatttt attttgcggc gatcattgag gctaataagc aaaacctgtc 4260 atttaatagc acttcacaaa tagggttggc gattttaatt tttgctattg caatgctatt 4320 ttatgtgatt tatgagctaa aagagatttg gacagtaaaa atttatattt taccagaaca 4380 tcaaattaat cgatctttct tgtttaaata cgtacgtcac ccaaattatt tcttaaacat 4440 tattcccgaa ttaattggtt tatcgctttt ctgtcaggct aaatataccg cattggttgg 4500 actacctatt tatttgctta tcttagccgt gcgtattaaa caagaagaga gtgccatgtc 4560 gcatttattt cctaaatcat aatttataaa tttcttctaa aagaacgact ctttctctct 4620 gcctttaaca cattatggta aaagatactc tattttctac ccccattgct aaattggggg 4680 atttcatctt tgacgaaaac gttgctgaag tctttccaga tatgattcaa cgttccgtgc 4740 cgggctattc taacattatt actgcaatcg gtatgctggc tgaacgtttc gtcacggctg 4800 atagtaacgt ttatgatcta ggttgctcac gaggagctgc cacactttct gcacgtcgaa 4860 atattaatca acccaacgta aaaattattg gtatcgataa ttctcaaccg atggttgaac 4920 gttgtcgcca acatattgcg gcatatcata gtgagatacc agtagaaatt ctctgtaatg 4980 atattcgcca cgttgaaatt aaaaatgcct caatggtcat tctcaacttc accttgcaat 5040 ttttaccgcc tgaagatcgc atcgcattgc ttaccaaaat ctatgaaggt ttaaatccaa 5100 atggcgtatt agtactttct gaaaaattcc gttttgaaga tactaaaatt aatcatttac 5160 tcattgactt gcaccatcaa ttcaaacgtg ccaatggtta tagcgaatta gaagtgagcc 5220 aaaaacgcac cgcacttgaa aatgtgatgc gtacagattc tatcgagaca cacaaagtgc 5280 ggttaaaaaa cgtaggattt tcacaagtag aactttggtt ccaatgcttt aattttggct 5340 cgatgattgc ggttaaataa aacaacggta atttgatctt cttgcttgca tacagcaatt 5400 gaaatgatta gtatatactt atttaaatac atagtatata cgagagggta aatatgctta 5460 ctaaagtgtt tcaaagtggt aacagccaag 
ctgttcggat cccgatggac tttcgttttg 5520 acgtcgatac cgtagaaatt ttccgaaagg aaaatgggga tgtggtatta cgcccagttt 5580 ctaaaaaaac agatgatttt cttgcgttat ttgaaggatt tgatgagacc tttattcaag 5640 cacttgaagc gcgtgatgat ttaccgcctc aggagcgaga aaatttatga tttatatgtt 5700 agacaccaat atcattattt atttaatgaa aaatcgcccc aaaattattg ccgaacgagt 5760 atcacaatta ttgcctaatg atcgcttagt tatgagcttt attacttatg ctgaacttat 5820 taaaggcgcc tttggtagtc aaaattatga gcaatcaata cgagcaatag aattacttac 5880 tgaacgagtg aatgtactat atcccaatga acaaatctgt ttacattatg gcaaatgggc 5940 aaatacactc aaaaaacaag ggcgacctat cggaaataat gatctatggt tcgcttgtca 6000 cgcattgagt ttaaatgccg ttcttattac acataatgta aaagaatttc agcgaattac 6060 agatcttcag tggcaagatt ggacaaaata gataattctg attttttatt ttacttatga 6120 cataatctgt ttaactcaac taaaaaagga caaaccaatg caaattttac acactatgtt 6180 acgcgtgggc gatttagatc gttcaatcaa attttatcaa gatgttttag gtatgcgctt 6240 attacgcacc agtgaaaacc cagaatacaa atatacgctg gcttttttag gttacgaaga 6300 tggcgaaagt gcggcagaaa ttgaattaac atacaactgg ggcgtagata aatatgaaca 6360 tggcacagca tatggacata tcgctatcgg cgtggatgat atttatgcga cttgtgaagc 6420 cgttcgcgcg agcggtggta acgttactcg cgaagcaggg ccagttaaag gtggctcaac 6480 cgtcattgct tttgttgaag atccagatgg ttataaaatt gaatttatcg aaaacaaaag 6540 tactaaatct ggtttaggca actaaagtgc ggtcaatttt tattgtattt ttaataacgt 6600 agggattcct acgttatttt atttttaaga gatagattat gtccgattct caagaaatcc 6660 cttatcacaa ccaattaaaa aatcgctttc gtggctattt ccccgtcatt attgatgttg 6720 aaactgcagg ttttgatgca aaaaaagatg cgttactcga actagccgcc atcacattaa 6780 aaatggatga aaacggttat ttgcatcctg atcagaaatg ccattttcat atcaagcctt 6840 ttgaaggcgc aaatattaat ccagaatcct tgaaatttaa tggtattgat attcacaatc 6900 cactacgtgg cgcggtgtct gaacttgatg caatcacggg attatttcaa atggtgcgtc 6960 gaggacaaaa agatgcggat tgccaacgtt ctatcattgt ggcgcacaat gccgcttttg 7020 accaaagttt tgtgatggct gccgctgaac gcactggcgt aaagcgtaat cccttccacc 7080 cttttggcat gtttgatacc gcaagtttag cgggcttaat gtttggtcag actgttctcg 7140 ttaaagcctg ccaagcggca aaaattcctt ttgatggtaa 
acaagcccat tcagcgttat 7200 atgataccga acgcacggcg aaattattct gctatatggt caatcattta aaagatttag 7260 gtggtttccc acatattgcg tctgaactag aacaggaaaa aacaactgaa aaagagaccg 7320 cactttaaga aaaataaaaa actggttgca atcgttttat ttttccagta gtttaaaagc 7380 aattcttatt tggagataaa aaatgaacat tatccgtcgt caccatcatc accattcaac 7440 tgaataatct ttcaggtttt tggtgtgctt ggaagatttc ttccgagctg acaggataat 7500 aaaaatctta cccctcggaa ggcaactttc gaggggtttt gttttattta cgatatagga 7560 ggcattaatt tagcgttatt ctttattcaa taaaaatttt aaagtaaagt atttatcatt 7620 taatttacaa ggaaaaacac aatgttatca aaccctgttg ttatctcaat catcgttttg 7680 ctggcactta gtttgctacg catcaacgtt attatcgcac tggtgattgc tgcgcttacc 7740 gcaggattta tcggtgattt aggtttaact aaaaccatcg aaacctttac aggcggctta 7800 ggtggcggtg ctgaagtcgc aatgaactat gcaatacttg gagcgttcgc cattgctatt 7860 tctaaatcag gaattactga tttaatcgct tataaaatca tcacaaaaat gaataaaaca 7920 ccaactgctg gcaatttaac ttggtttaaa tatttcattt ttgcggtatt ggcattgttt 7980 gccatctcct ctcaaaactt acttccagtg catattgctt ttattccaat cgttgtacca 8040 ccactacttt caatctttaa ccgtttaaaa attgaccgcc gtgcagtagc ttgcgtactt 8100 acatttggtt taactgctac ctatatatta ttaccagtag gctttggaaa aatctttatt 8160 gaaagtattt tagttaaaaa cattaatcaa gctggcgcaa ccttaggttt acaaacaaat 8220 gtagcacaag tttctttagc aatgttgtta cctgttattg gaatgatttt aggcttactc 8280 acagccatat ttattaccta ccgtaaacca cgtgaataca atatcaatgt tgaagaggca 8340 acaactaaag atattgaagc tcacattgct aatattaaac caaaacaaat tgttgcaagt 8400 ttaatcgcga ttgtcgcgac cttcgcaaca caattagtca caagttcaac tatcattggc 8460 ggtttaattg gcttaattat tttcgtacta tgtggcattt tcaaactaaa agaaagcaac 8520 gatatttttc aacaaggctt gcgcttaatg gcaatgattg gctttgtaat gattgctgct 8580 tcaggttttg caaatgtgat taatgccacc actggcgtaa cagatttagt acaaagccta 8640 agttctggtg ttgtgcaaag taaaggtatt gccgcattat taatgcttgt cgttggttta 8700 ttaattacaa tgggtattgg ctcatcattc tctacagtac cgattattac atcaatttac 8760 gtaccactct gtttatcttt tggtttttct ccattagcga caatttctat tgtaggcgta 8820 gcagcggctt taggagatgc aggttcgccc gcatcagact caacattagg 
cccaacatct 8880 ggcttgaata tggatggcaa acacgatcac atttgggatt ctgttgtacc tacctttatt 8940 cactataata ttccattatt ggtattcggc tggattgcag caatgtacct ttaacgatta 9000 taatccccca cttgggggat ttttttatga ctgttcaaca acttatccaa cgcttagatc 9060 aaaaagttca acaactttat caagctcatt tatccaaacg agaagaaaaa atcttcgcca 9120 aatttgaccg cactttgttc agtgaaaatg gacaaaatgt ttccttttat ctaaaagaaa 9180 tcaatcaaac gctagataga ataaagacat tagagtctaa cgattctaat cactataatt 9240 ttctagctga gcgtttactc gccaatgttc cgttctttcg gaagctttag ttcgcaaaaa 9300 tactcatcta actgaatccc aaacaacaac gaaacaaacc attcaaaaat ctcaacatag 9360 cattcataaa ttaccaccaa gagaacgcct agaaaaatat tacgaagcac gagaacaatt 9420 gaataatctt tatcgacagc ataaagattt agcacaagct gaaaaaaata atgatgagaa 9480 aatacgttat gctcaacttg cagaagttta taaaaaacgc cagcaaaaat gccaagatgc 9540 aatcgatctt ttagaagaat atttggtgtt taaagaagaa gtggaaaacc gtgaaaatac 9600 ggaaaacaaa tgattagaaa taaaaaacta attcttccga attagttagc tcgtaaaaat 9660 tgatggcgga ggaataggga ttcgaaccct aggagggcgt aaaccctcgc cggttttcaa 9720 gaccggtgcc ttcaaccact cgaccattcc tccgtgattt aacgagcgtg aataatacgt 9780 tctctaataa aaaccgtcaa gtagaaataa tctcattttt atcaagtgat tatttttacc 9840 acagcttata aaaaatgcct ttcctattta tgaacggaaa ggcatttttg ttgaatgaat 9900 ttcagtgctt tttattacta caatatatcg cagaaattaa ccgcactttt gtcatctaaa 9960 caattatttt acgcgagaga cgtattcgcc agagcgggta tcaactctga taacttcgcc 10020 gatttggacg aaaagcggca cttttacaac tgcgcctgtg cttaatgttg caggtttacc 10080 gccagtgcct gcggtatcgc ctttaagacc tgggtctgta tcgacgattt ctaattcaac 10140 aaagtttggt ggcgtaacag taattggtgc gccattccat aaagtcacga tacaatctgc 10200 ttggtctaat aaccattttt ctgcatcgcc tacggctttt gcatctgcag agtactgttc 10260 aaatgtttct gggtgcataa agtaccagaa tgcatcgtct ttgtatgaat aagttaagtt 10320 aagatccata acatcagctg cttcaaccga agtgccagat ttaaagttta cgtctaatac 10380 tttgcctgaa attaatttac gaatacgagt acgagtgaat gcctgacctt tgcctggttt 10440 tacaaattca ttttcaacga tgacacaagg ctcgccgtct tgcataaatt ttagacctgg 10500 tttgaaatca ctggtagtat atgtagccat atttgaaata tcctaaaaat 
gagagatggt 10560 ttgaaaagtg cgtattttac cccaagaacc cgtcattaga gaagaacaaa attggctcac 10620 aattctaaaa aatgccattt cagatcctaa attattacta aaagccttaa atttaccaga 10680 agatgatttt gagcaatcca ttgctgcgcg gaaacttttt tcgctccgcg tgccacaacc 10740 tttcattgat aaaatagaaa aaggtaatcc gcaagatccc cttttcttgc aagtgatgtg 10800 ttctgattta gagtttgtgc aagcggaggg atttagtacg gatcccttag aagaaaaaaa 10860 tgccaatgcg gtgccaaata ttcttcataa atatagaaat cgcttgctct ttatggcaaa 10920 aggcggttgt gcggtgaatt gtcgttattg ctttcgccga cattttcctt acgatgaaaa 10980 cccaggaaat aaaaaaagct ggcaactggc gttagattac attgcggcac attctgaaat 11040 agaagaagtg attttttcag gtggcgatcc tttaatggcg aaagatcacg aattagcgtg 11100 gttaataaaa catttggaaa atataccgca cttacaacgt ttgcgtattc acacccgttt 11160 gcctgttgtg attccgcaac ggattactga tgaattttgc actttattag cagaaactcg 11220 tttgcaaaca gttatggtga cacacattaa tcacccgaat gaaattgatc aaatttttgc 11280 tcatgcgatg caaaaattaa acgccgtgaa tgtcacgctt ttgaatcaat ctgttttgct 11340 aaaaggcgtg aatgatgatg cgcaaattct aaaaatattg agcgataaac tttttcaaac 11400 aggcattttg ccttattact tgcatttgct ggataaagtt caaggggcga gccatttttt 11460 gattagcgat attgaagcta tgcaaatcta taaaaccttg caatctctga cttctggcta 11520 tcttgttcct aaacttgcac gagaaattgc gggcgagcca aataagactt tatacgcaga 11580 ataagatccg ataaaataca cataattttt tcgccgcact tttgagcttc tcaattttgt 11640 ggctaattaa tataaataag a 11661 // T0093 >emb|U32760|HI32760 Haemophilus influenzae Rd section 75 of 163 of the complete genome. 
Length = 13532

Score = 332 bits (842), Expect = 3e-89
Identities = 160/160 (100%), Positives = 160/160 (100%)
Frame = -2

Query: 1    MLDIVLYEPEIPQNTGNIIRLCANTGFRLHLIEPLGFTWDDKRLRRSGLDYHEFAEIKRH 60
            MLDIVLYEPEIPQNTGNIIRLCANTGFRLHLIEPLGFTWDDKRLRRSGLDYHEFAEIKRH
Sbjct: 9949 MLDIVLYEPEIPQNTGNIIRLCANTGFRLHLIEPLGFTWDDKRLRRSGLDYHEFAEIKRH 9770

Query: 61   KTFEAFLESEKPKRLFALTTKGCPAHSQVKFKLGDYLMFGPETRGIPMSILNEMPMEQKI 120
            KTFEAFLESEKPKRLFALTTKGCPAHSQVKFKLGDYLMFGPETRGIPMSILNEMPMEQKI
Sbjct: 9769 KTFEAFLESEKPKRLFALTTKGCPAHSQVKFKLGDYLMFGPETRGIPMSILNEMPMEQKI 9590

Query: 121  RIPMTANSRSMNLSNSVAVTVYEAWRQLGYKGAVNLPEVK 160
            RIPMTANSRSMNLSNSVAVTVYEAWRQLGYKGAVNLPEVK
Sbjct: 9589 RIPMTANSRSMNLSNSVAVTVYEAWRQLGYKGAVNLPEVK 9470

ID   HI32760    standard; DNA; PRO; 13532 BP.
XX
AC   U32760; L42023;
XX
SV   U32760.1
XX
DT   09-AUG-1995 (Rel. 44, Created)
DT   15-JUN-1998 (Rel. 56, Last updated, Version 9)
XX
DE   Haemophilus influenzae Rd section 75 of 163 of the complete genome.
XX
KW   .
XX
OS   Haemophilus influenzae Rd
OC   Bacteria; Proteobacteria; gamma subdivision; Pasteurellaceae; Haemophilus;
OC   Haemophilus influenzae.
XX
RN   [1]
RP   1-13532
RX   MEDLINE; 95350630.
RA   Fleischmann R.D., Adams M.D., White O., Clayton R.A., Kirkness E.F.,
RA   Kerlavage A.R., Bult C.J., Tomb J., Dougherty B.A., Merrick J.M.,
RA   McKenney K., Sutton G.G., FitzHugh W., Fields C.A., Gocayne J.D.,
RA   Scott J.D., Shirley R., Liu L.I., Glodek A., Kelley J.M., Weidman J.F.,
RA   Phillips C.A., Spriggs T., Hedblom E., Cotton M.D., Utterback T.,
RA   Hanna M.C., Nguyen D.T., Saudek D.M., Brandon R.C., Fine L.D.,
RA   Fritchman J.L., Fuhrmann J.L., Geoghagen N.S., Gnehm C.L., McDonald L.A.,
RA   Small K.V., Fraser C.M., Smith H.O., Venter J.C.;
RT   "Whole-genome random sequencing and assembly of Haemophilus influenzae Rd";
RL   Science 269(5223):496-512(1995).
XX
RN   [2]
RP   1-13532
RX   MEDLINE; 96398784.
RA   Tatusov R.L., Mushegian A.R., Bork P., Brown N.P., Hayes W.S.,
RA   Borodovsky M., Rudd K.E., Koonin E.V.;
RT   "Metabolism and evolution of Haemophilus influenzae deduced from a
RT   whole-genome comparison with Escherichia coli";
RL   Curr. Biol. 6(3):279-291(1996).
XX
RN   [3]
RP   1-13532
RA   White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D.;
RT   ;
RL   Submitted (25-JUL-1995) to the EMBL/GenBank/DDBJ databases.
RL   The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD
RL   20850, USA
XX
RN   [4]
RP   1-13532
RA   White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D.;
RT   ;
RL   Submitted (27-SEP-1997) to the EMBL/GenBank/DDBJ databases.
RL   The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD
RL   20850, USA
XX
RN   [5]
RP   1-13532
RA   White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D., Peterson J.,
RA   Hickey E., Dodson R., Gwinn M.;
RT   ;
RL   Submitted (28-MAY-1998) to the EMBL/GenBank/DDBJ databases.
RL   The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD
RL   20850, USA
XX
DR   SWISS-PROT; P44048; YGGX_HAEIN.
DR   SWISS-PROT; P44049; MLTC_HAEIN.
DR   SWISS-PROT; P44050; Y762_HAEIN.
DR   SWISS-PROT; P44308; NADR_HAEIN.
DR   SWISS-PROT; P44320; MUTY_HAEIN.
DR   SWISS-PROT; P44367; RL31_HAEIN.
DR   SWISS-PROT; P44864; YIBP_HAEIN.
DR   SWISS-PROT; P44865; PMG_HAEIN.
DR   SWISS-PROT; P44866; RIBB_HAEIN.
DR   SWISS-PROT; P44868; YIBK_HAEIN.
DR   SWISS-PROT; P44869; YHHF_HAEIN.
DR   SWISS-PROT; P44870; FTSY_HAEIN.
DR   SWISS-PROT; P44871; FTSE_HAEIN.
DR   SWISS-PROT; P44872; FTSX_HAEIN.
DR   SWISS-PROT; Q57125; Y765_HAEIN.
XX FH Key Location/Qualifiers FH FT source 1..13532 FT /db_xref="taxon:71421" FT /organism="Haemophilus influenzae Rd" FT CDS complement(104..1336) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44864" FT /note="similar to SP:P37690 PID:466751 GB:U00096 FT PID:1790042 percent identity: 40.55; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0756" FT /product="conserved hypothetical protein" FT /protein_id="AAC22415.1" FT /translation="MLRFGVNQKTSLLLTALLSCGLLIFSPVSQSSDLNQIQKQIKQQE FT SKIEKQKREQAKLQANLKKHESKINSVEGELLETEISLKEIRKQIADADKQLKQLEKQE FT REQKARLTKQIDIIYRSGINPSLIERMFAQDPTKAERMKVYYQHLNQVRIEMINNLKAT FT QAQIAVQKKAILSQQKNHRNQLSTQKKQQQALQKAQQEHQSTLNELNKNLALDQDKLNT FT LKANEQALRQEIQRAEQAAREQEKREREALAQRQKAEEKRTSKPYQPTVQERQLLNSTS FT GLGAAKKQYSLPVSGSILHTFGSIQAGEVRWKGMVIGASAGTPVKAIAAGRVILAGYLN FT GYGYMVIVKHGETDLSLYGFNQAVSVKVGQLVSAGQVIAQVGNTGEISRSALYFGISRK FT GTPVNPAGWVR" FT CDS 1512..2195 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44865" FT /note="similar to GB:L09651 SP:P30798 PID:155611 percent FT identity: 59.19; identified by sequence similarity; FT putative" FT /transl_table=11 FT /gene="HI0757" FT /product="phosphoglycerate mutase (gpmA)" FT /protein_id="AAC22416.1" FT /translation="MELVFIRHGFSEWNAKNLFTGWRDVNLTERGVEEAKTAGKKLLDK FT GYEFDIAFTSVLTRAIKTCNIVLEESHQLWIPQVKNWRLNERHYGALQGLDKKATAEQY FT GDEQVHIWRRSYDISPPDLDPQDPNSAHNDRRYANIPSDVVPNAENLKLTLERALPFWE FT DQIAPAMLSGKRVLVVAHGNSLRALAKHIIGISDAEIMDFEIPTGQPLVLKLDDKLNYV FT EHYYL" FT CDS complement(2273..2485) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44367" FT /note="similar to SP:Q59450 PID:1388150 PID:2198845 percent FT identity: 91.43; identified by sequence similarity; FT putative" FT /transl_table=11 FT /gene="HI0758" FT /product="ribosomal protein L31 (rpL31)" FT /protein_id="AAC22417.1" FT /translation="MKQGIHPEYKEITATCSCGNVIKTRSTLGKDINLDVCGNCHPFYT FT GKQRVVDTGGRVERFNSRFKIPSTK" FT CDS 2662..3798 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44320" FT /note="similar to GB:M59471 SP:P17802 
GB:X52391 PID:146864 FT PID:42073 percent identity: 61.58; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0759" FT /product="A/G-specific adenine glycosylase (mutY)" FT /protein_id="AAC22418.1" FT /translation="MLAKSSINAPFAKSVLAWYDKFGRKHLPWQQNKTLYGVWLSEVML FT QQTQVATVIPYFERFIKTFPNITALANASQDEVLHLWTGLGYYARARNLHKAAQKVRDE FT FNGNFPTNFEQVWALSGVGRSTAGAILSSVLNQPYPILDGNVKRVLARYFAVEGWSGEK FT KVENRLWALTEQVTPTTRVADFNQAMMDIGAMVCMRTKPKCDLCPLNIDCLAYKNTNWE FT KFPAKKPKKAMPEKTTYFLILSKNGKVCLEQRENSGLWGGLFCFPQFEDKSSLLHFLAQ FT EKVTHYQEWPSFRHTFSHFHLDIHPIYAEMESTLCVEQANLDWRKVMESTKEYQSNLSS FT AVKYWYDPQNPEPIGLAQPVKNLLIQFVRNHYGKNSIL" FT CDS 3776..4048 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44048" FT /note="similar to PID:882491 SP:P52065 GB:U00096 FT PID:1789332 percent identity: 78.16; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0760" FT /product="conserved hypothetical protein" FT /protein_id="AAC22419.1" FT /translation="MARTVFCEYLKKEAEGLDFQLYPGELGKRIFDSVSKQAWGEWIKK FT QTMLVNEKKLNMMNAEHRKLLEQEMVNFLFEGKDVHIEGYVPPSN" FT CDS 4063..5136 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44049" FT /note="similar to PID:882493 SP:P52066 GB:U00096 FT PID:1495826 PID:2367180 percent identity: 55.99; identified FT by sequence similarity; putative" FT /transl_table=11 FT /gene="HI0761" FT /product="membrane-bound lytic murein transglycosylase C FT (mltC)" FT /protein_id="AAC22420.1" FT /translation="MKKYLLLALLPFLYACSNSSNQGINYDEAFAKDTQGLDILTGQFS FT HNIDRIWGVNELLVASRKDYVKYTDSFYTRSHVSFDEGNIVIETQQDLNRLHNAIVHTL FT LMGADAKGIDLFASGDVPISSRPFLLGQVVDHQGQHIANQVIASNFATYLIQNKLQTRR FT LQNGHTVQFVSVPMIANHVEVRARKYLPLIRKAAQRYGIDESLILGIMQTESSFNPYAI FT SYANAIGLMQVVPHTAGRDVFAMKGKGGQPSTRYLYDPANNIDAGVSYLWILQNQYLDG FT ITNPTSKRFAMISAYNSGAGAVLRVFDNDKDTAIYKINQMYPEQVYRILTTVHPSSQAR FT NYLLKVDKAQKKFRVRR" FT CDS complement(5747..6427) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44050" FT /note="hypothetical protein; identified by GeneMark; FT putative" FT 
/transl_table=11 FT /gene="HI0762" FT /product="H. influenzae predicted coding region HI0762" FT /protein_id="AAC22429.1" FT /translation="MILFAGDPHGSYDHIYPFIKEQENVALIILGDLQLTTSDELDKLA FT KHCDIWFIHGNHDSKTISAFDSIWGSEWQSRNLHNRVVDIQGTRIAGLGGVFRGQIWMP FT PNRPMFFDPIHYCQYSPQEKIWRGGVPLRHRTSIFPSDIEILENQQADVLICHEAPKPH FT PMGFQVINDLAMKMGVKLVFHGHHHENFTYRTKYPYKITNVGFRSLADAEGNYLLQTID FT DREK" FT CDS complement(6424..7689) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44308" FT /note="similar to GB:U14003 SP:P27278 PID:537230 GB:U00096 FT PID:1790851 percent identity: 54.57; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0763" FT /product="transcriptional regulator (nadR)" FT /protein_id="AAC22421.1" FT /translation="MGFTTGREFHPALRMRAKYNAKYLGTKSEREKYFHLAYNKHTQFL FT RYQEQIMSKTKEKKVGVIFGKFYPVHTGHINMIYEAFSKVDELHVIVCSDTVRDLKLFY FT DSKMKRMPTVQDRLRWMQQIFKYQKNQIFIHHLVEDGIPSYPNGWQSWSEAVKTLFHEK FT HFEPSIVFSSEPQDKAPYEKYLGLEVSLVDPDRTFFNVSATKIRTTPFQYWKFIPKEAR FT PFFAKTVAILGGESSGKSVLVNKLAAVFNTTSAWEYGREFVFEKLGGDEQAMQYSDYPQ FT MALGHQRYIDYAVRHSHKIAFIDTDFITTQAFCIQYEGKAHPFLDSMIKEYPFDVTILL FT KNNTEWVDDGLRSLGSQKQRQQFQQLLKKLLDKYKVPYIEIESPSYLDRYNQVKAVIEK FT VLNEEEISELQNTTFPIKGTSQ" FT CDS 7908..8555 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44866" FT /note="similar to SP:P24199 GB:X66720 PID:455174 PID:49100 FT PID:882571 percent identity: 70.09; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0764" FT /product="3,4-dihydroxy-2-butanone 4-phosphate synthase FT (ribB)" FT /protein_id="AAC22422.1" FT /translation="MNQSILSPFGNTAEERVLNAINAFKNGTGVLVLDDEDRENEGDLI FT FPAETITPEQMAKLIRYGSGIVCLCITDERCQQLDLPPMVEHNNSVNKTAFTVTIEAAK FT GVSTGVSAADRVTTIQTAIADNAVLTDLHRPGHVFPLRAANGGVLTRRGHTEASVDLAR FT LAGFKEAGVICEITNDDGTMARAPEIVEFAKKFGYSVLTIEDLVEYRLAHNI" FT CDS 8570..9418 FT /codon_start=1 FT /db_xref="SWISS-PROT:Q57125" FT /note="similar to GB:M37913 GB:L19441 PID:1573535 percent FT identity: 26.32; identified by sequence similarity; FT putative" FT 
/transl_table=11 FT /gene="HI0765" FT /product="lipooligosaccharide biosynthesis protein" FT /protein_id="AAC22423.1" FT /translation="MKISMIFLPHFLYYTVPTFYLFGLLIMHNAAQHNYVISLTTEQKR FT RKHITEEFGKQNIPFEFFDAITPDIIEETAKKFNITLDRSPKAKLSDGEIGCALSHIVL FT WDLALENNLNYINIFEDDIHLGENAKELLEIDYISDDIHVLKLEANGKMFFKQPKSVKC FT DRNVYPMTVKQSGCAGYTVTAKGAKYLLELVKNKPLDVAVDSLVFEDFLHFKDYKIVQL FT SPGICVQDFVLHPDNPFESSLQEGRDRVHGNQRKSSILEKIKNEFGRVKIKMFGKQVPF FT K" FT CDS complement(9467..9949) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44868" FT /note="similar to SP:P33899 PID:466744 GB:U00096 FT PID:1790034 percent identity: 76.62; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0766" FT /product="rRNA methylase, putative" FT /protein_id="AAC22424.1" FT /translation="MLDIVLYEPEIPQNTGNIIRLCANTGFRLHLIEPLGFTWDDKRLR FT RSGLDYHEFAEIKRHKTFEAFLESEKPKRLFALTTKGCPAHSQVKFKLGDYLMFGPETR FT GIPMSILNEMPMEQKIRIPMTANSRSMNLSNSVAVTVYEAWRQLGYKGAVNLPEVK" FT CDS complement(9960..10541) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44869" FT /note="similar to SP:P10120 GB:U00039 GB:X04398 PID:41497 FT PID:466601 percent identity: 55.26; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0767" FT /product="conserved hypothetical protein" FT /protein_id="AAC22425.1" FT /translation="MKKIQTPNAKGEVRIIAGLWRGRKLPVLNSEGLRPTGDRVKETLF FT NWLMPYIHQSECLDGFAGSGSLGFEALSRQAKKVTFLELDKTVANQLKKNLQTLKCSSE FT QAEVINQSSLDFLKQPQNQPHFDVVFLDPPFHFNLAEQAISLLCENNWLKPNALIYVET FT EKDKPLITPENWTLLKEKTTGIVSYRLYQN" FT CDS 10596..11840 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44870" FT /note="similar to SP:P10121 GB:U00039 GB:X04398 PID:41498 FT PID:466600 percent identity: 66.00; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0768" FT /product="cell division protein (ftsY)" FT /protein_id="AAC22426.1" FT /translation="MNDIFIGLQLREHFMAEENKKGGFWASLFGRNKKQDEPKIEPIIE FT EEKIKDIEPSIEKFEANDLVEEEKIQEISTALEPIEEIIEAKNLEDEFQPVVEIETREK FT 
PSEGGFFSRLVKGLLKTKQNIGAGFRGFFLGKKIDDELFEELEEQLLIADIGVPTTSKI FT IKNLTEHASRKELQDAELLYQQLKVEMADILEPVAQPLEIDSTKKPYVILMVGVNGVGK FT TTTIGKLARKFQAEGKSVMLAAGDTFRAAAVEQLQVWGERNHIPVVAQSTGSDSASVIF FT DAMQSAAARNIDILIADTAGRLQNKNNLMDELKKIVRVMKKYDETAPHEIMLTLDAGTG FT QNAISQAKLFNEAVGLTGISLTKLDGTAKGGVIFAIADQFKLPIRYIGVGEKIEDLREF FT NAKEFIEALFVHEEE" FT CDS 11859..12515 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44871" FT /note="similar to SP:P10115 GB:U00039 GB:X04398 PID:41499 FT PID:466599 percent identity: 64.06; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0769" FT /product="cell division ATP-binding protein (ftsE)" FT /protein_id="AAC22427.1" FT /translation="MIKFSNVSKAYHGATQPALQGLNFHLPVGSMTYLVGHSGAGKSTL FT LKLIMGMEKANAGQIWFNGHDITRLSKYEIPFLRRQIGMVHQDYRLLTDRTVVENVALP FT LIIAGMHPKDANTRAMASLDRVGLRNKAHYLPPQISGGEQQRVDIARAIVHKPQLLLAD FT EPTGNLDDELSLGIFNLFEEFNRLGMTVLIATHDINLIQQKPKPCLVLEQGYLRY" FT CDS 12525..13457 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44872" FT /note="similar to SP:P10122 GB:U00039 GB:X04398 PID:41500 FT PID:466598 percent identity: 43.49; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0770" FT /product="cell division protein (ftsX)" FT /protein_id="AAC22428.1" FT /translation="MSRSTDASVFVQTAYTLRAVWADLWQRKFGTLLTILVIAVSLTIP FT TVSYLMWKNLHLATTQFYPESELTIYLHKNLSEENANLVVEKIRQQKGVESLNYVSRQE FT SLKEFKSWSGFGEELEILDDNPLPAVVIVKPTSEFNVSEKRDELRTNLNKIKGVQEVRL FT DNDWMEKLTALSWLIAHVAIFCTVLMTIAVFLVIGNSIRSDVYSSRSSIDVMKLLGATD FT QFILRPYLYTGMIYALLGGLVAAIFSSFIISYFTSAVKYVTDIFAVQFSLNGLGVGEFV FT FLLVCCLIMGYVGAWIAATRHIAMMERKE" XX SQ Sequence 13532 BP; 4140 A; 2347 C; 2701 G; 4344 T; 0 other; tcaatcacaa tagcaagttt actttgagca aaagaaaaag aggtataaag tgcggtagaa 60 aaaacaatga aattttttac cgcacttttt atcagaatat tcatcaacga acccaccctg 120 caggatttac tggcgttcct ttacggctaa taccaaaata aagcgcagaa cgtgatattt 180 cccctgtatt tcctacttga gcaataacct gccctgctga aacaagctga ccaactttca 240 ctgatacagc ttgattgaag ccatataaac ttaaatcagt ttcgccgtgt 
ttaacaataa 300 ccatataacc ataaccattt aaatatcccg ctaaaatgac gcgtccagca gcaattgctt 360 taacaggcgt gcctgctgat gcgccaatta ccataccttt ccaacgtact tcgcctgctt 420 ggatagaacc aaaagtatgc aaaattgaac cagaaactgg taaggaatat tgtttttttg 480 ccgcccctaa accgcttgta ctattaagta attggcgttc ttgcacagtt ggttgataag 540 gttttgatgt tcgtttttct tcagcttttt ggcgttgagc aagtgcctct ctttcacgtt 600 tttcttgttc gcgcgctgct tgttcagctc gttgaatttc ttgacgaagt gcttgttcgt 660 ttgcttttag tgtattcaat ttatcttgat ctagggctaa atttttattg agttcattca 720 gcgtagattg atgctcttgc tgtgcttttt gcaatgcttg ttgttgtttt ttttgtgtgg 780 aaagttgatt tcggtgattc ttttgttgag agagaatcgc ctttttttgt actgcaattt 840 gtgcttgcgt tgcttttaaa ttattaatca tttcaatccg aacttgattt aaatgctgat 900 aataaacttt cattcgctct gcttttgtcg gatcttgggc aaacattcgt tcaatcagcg 960 atggattaat gcctgaacga taaattatat ctatttgttt ggttaatcgt gctttttgct 1020 cacgttcctg tttttctaat tgtttgagtt gcttatctgc atcggcaatt tgcttacgaa 1080 tttcctttaa acttatttct gtttcaagca gttcgccctc aacactgtta attttactct 1140 cgtgtttttt taaatttgct tgtaacttag cttgctcacg tttttgcttc tctattttag 1200 attcttgttg cttaatttgt ttttgaattt gattgagatc ggaagattgg ctgacaggcg 1260 aaaatattaa taaaccgcag cttaaaagtg cggttaataa taatgatgtt ttttgattaa 1320 cgccaaaacg caacataatt catcctaaat tcgtaaaact cgcaaatctt atcatatttg 1380 ttaaaaggcg aaaaagtttg atggttattt tttgctaaag atcaggaaat catccctttt 1440 aacttttaaa gctaaggatt ttttgctata aaatgtgcgt tctttttatt tgaaaaaatt 1500 aggagattct tatggaatta gtatttatcc gtcacggttt tagtgaatgg aatgcgaaaa 1560 acttattcac aggctggcgt gatgtgaatt taactgaacg tggtgtggaa gaagcaaaaa 1620 ctgcgggtaa aaaactgtta gacaaaggtt atgaatttga tatcgcattt acctctgttt 1680 taactcgagc aatcaaaact tgtaacatcg tgttagaaga atcccatcaa ttatggattc 1740 cgcaagtaaa aaactggcgt ttaaatgaac gtcattatgg tgctttacaa ggtttagata 1800 aaaaagcgac tgcggagcaa tacggtgacg aacaagttca tatttggcgt cgttcttacg 1860 acatttctcc accagattta gatccacaag atccaaattc tgcacacaat gaccgtcgct 1920 acgcaaatat tccatctgat gttgtgccaa acgcagaaaa tttaaaatta acattagaac 1980 
gtgcattacc attctgggaa gatcaaattg caccagcaat gctttctggc aaacgtgttt 2040 tagtggttgc tcacggaaat tcacttcgtg cgttggcaaa acatattatc ggtatttctg 2100 atgctgaaat tatggatttt gaaattccaa caggccaacc tttagtatta aaactagatg 2160 ataaattaaa ttacgtagaa cattactatc tttaattaag ttcaacgtat tgttgttcta 2220 actgtataaa acaaaacccc gcaaattagc ggggttttta agcaagtcca aattattttg 2280 tgcttggaat tttgaaacgg ctgttgaaac gctcgacacg accaccagta tcaacaacac 2340 gttgtttacc agtatagaat gggtggcagt taccacacac atcaagattg atgtctttgc 2400 ctaaagttga acgtgttttg atcacattac cgcaagaaca agtcgcagta atttctttat 2460 attctggatg aataccttgt ttcatagaaa aacctcaaaa tgaagccacg ccgctatagg 2520 aactttctcc ccacaccgcg tgagtgaata atacgccata attggcacca aataggttgc 2580 gaattatact gaaaaaatgc agataaacaa gctctatttt cgtgtaaaat catcagcatt 2640 tttaaccgca cttttctttt tatgttagca aaatcttcca tcaatgcgcc atttgccaaa 2700 tctgttttag cttggtacga caagtttgga cgcaagcatt taccttggca gcaaaataaa 2760 acgctttatg gtgtatggct ttctgaagtg atgttgcaac aaacgcaagt tgcgacggta 2820 attccttatt ttgagcggtt tatcaaaaca tttccaaata tcaccgcact tgccaatgct 2880 tcacaagatg aagtgttaca tttatggacg ggcttaggct attatgcacg agcgcgtaat 2940 ttacataaag ccgctcaaaa agtgcgggat gaatttaatg gaaatttccc aacaaatttt 3000 gagcaagttt gggcattatc tggtgtggga cgttcaactg ctggtgccat tttatcttct 3060 gttttaaatc agccttatcc aattttagat ggtaatgtga aacgtgtgtt ggcacgttat 3120 tttgctgttg agggatggag tggcgagaaa aaagtagaaa atcgcttgtg ggcattaacg 3180 gagcaagtta cacctacgac gcgtgtggca gattttaatc aagcaatgat ggacattggt 3240 gcaatggttt gcatgcgaac taagcccaaa tgcgatcttt gtccgttaaa tatagattgt 3300 ttggcttata aaaatacaaa ctgggaaaaa tttccagcta aaaaacctaa aaaagcgatg 3360 ccagaaaaaa ctacctattt tttaattttg tcgaaaaatg gcaaagtatg tttggaacag 3420 cgagaaaact caggattatg gggcggatta ttttgttttc cacaatttga agataaatcc 3480 tcgttgcttc attttttagc ccaagaaaaa gtcacgcatt atcaagaatg gccgagtttt 3540 cgacatacat tcagccattt tcatttagat attcatccaa tttacgccga aatggagagt 3600 actttatgcg tagagcaggc taatttagac tggcgaaaag tgatggaaag cacaaaagaa 3660 taccaatcaa 
acctatcaag tgcggtcaaa tattggtatg atccacaaaa tcctgaaccg 3720 atcgggctgg ctcagcctgt gaaaaatctt ttaatacaat ttgtaaggaa tcattatggc 3780 aagaacagta ttttgtgaat atctcaaaaa agaagcggaa ggcttagatt ttcaacttta 3840 tcctggagag ttgggcaagc gtatttttga ttcagtaagt aaacaggctt ggggcgaatg 3900 gattaagaaa caaacgatgc ttgtgaatga aaaaaaactt aatatgatga atgcagagca 3960 ccgtaaatta ttagaacaag aaatggttaa ttttttgttt gaaggtaaag atgttcatat 4020 tgaaggttat gttccaccat caaattagtt atctatccta caatgaaaaa atatttatta 4080 ttggcattat tgcctttttt gtatgcttgt agtaattcat cgaatcaagg tattaactat 4140 gatgaagcct ttgctaaaga cacgcaaggg ttagatattc tcacagggca attctcgcat 4200 aatattgacc gtatttgggg cgtcaatgaa ttgttagtgg caagccgtaa agattatgtg 4260 aaatatacag attcttttta tacgcgtagc catgtgagtt ttgatgaagg taatatcgtt 4320 attgaaaccc agcaagattt aaatcgatta cataatgcta ttgttcatac cttgttaatg 4380 ggagcggatg caaaaggtat tgatttattt gcatctggtg atgtgccgat tagctctcgc 4440 ccattccttt tggggcaggt tgtagatcat caagggcaac acattgctaa tcaagttatc 4500 gcaagtaatt tcgccactta tttgattcaa aataaattgc aaacacgtcg attacaaaac 4560 gggcataccg tgcaatttgt ctctgttcct atgattgcaa accacgtaga agtgcgtgca 4620 cgaaaatatt taccattgat tcgtaaagct gcacaacgtt atggcattga tgaaagtttg 4680 attttgggca ttatgcaaac agaatcaagt tttaacccat atgcgattag ctacgcaaat 4740 gctattggtt taatgcaagt cgtgcctcat acagcaggtc gagatgtatt tgcaatgaaa 4800 ggcaaaggtg gacagccatc aacgcgttat ttatatgatc ctgcgaataa tattgatgct 4860 ggtgtatctt atttgtggat tttacaaaat caatatttag atggaattac gaatccaacc 4920 tcaaaacgtt ttgccatgat ttctgcgtat aatagtggtg caggcgcagt gttacgtgtt 4980 tttgataatg ataaggatac ggcgatttac aaaatcaacc aaatgtatcc agaacaagtt 5040 tatcgcattc taacgacggt tcacccatca tcacaagcac gcaattattt gttgaaagta 5100 gataaagcac agaaaaaatt ccgtgttaga cgataattga attttttcag ataatcaaaa 5160 atgctcgctg gctaacacca ttgagcattt tttattaaaa aagtgcggta aattttaaaa 5220 tagtttttat tatttcgttg ttttaataaa catattgatg attaactatt caaatgaaat 5280 gttttttcca ttttttttgc atttagttat tgaccgaaaa ccgaaaaaat agtttaatac 5340 gccccgttgt caggcagtag 
cctataacga taacgcctcg atagctcagt cggtagagca 5400 ggggattgaa aatccccgtg tcggtggttc gattccgcct cgaggcacca tttcctcctt 5460 agttcagtcg gtagaacggt ggactgttaa tccatatgtc gcaggttcga gtcccgcagg 5520 aggagccaaa tttaaaaaag ccgctaaaga gatttagcgg ctttttgctt tcttatatat 5580 aaaatatata atcttctcta gtcttgtaac tataactagg catcaaaatt caaaatggcg 5640 tatcaatgag cgagagcctt cttatttttg atgttagcaa gcgaccttag tcgcttgcgt 5700 ttaacttaaa gaagaaaaat gagtcgactt ttattttgga atagatttat ttttcacgat 5760 catcgatggt ttgcagcaaa taatttccct ctgcatcagc aagactgcgg aaacctacat 5820 tcgtgatttt gtagggatac tttgttcgat aagtaaagtt ttcgtgatga tgtccgtgga 5880 atactagctt aacccccatc ttcattgcca aatcattaat tacttggaaa cccatcggat 5940 gtggtttagg ggcttcgtga caaatcaaca catctgcttg ttggttttct aagatttcaa 6000 tatctgatgg aaaaattgac gtgcgatgac gcaaaggcac accgcctcgc caaatcttct 6060 cctgtgggct atattggcaa taatgaatag gatcaaaaaa catggggcga tttggtggca 6120 tccaaatttg cccgcggaat acaccgccta atcccgcaat gcgagttcct tgaatatcaa 6180 ctacgcgatt gtgtaaattg cgtgattgcc actcagagcc ccaaattgaa tcaaaagcac 6240 taatggtttt gctatcatga ttaccatgaa taaaccaaat atcacaatgt ttcgcaagtt 6300 tatccaactc atctgatgtc gtgagctgta aatcgcctaa aattatcagt gcgacatttt 6360 cctgttcttt tataaaaggg taaatgtgat catagcttcc gtgcggatca ccagcaaata 6420 aaatcattga gatgtccctt ttataggaaa ggttgtgttt tgtaattcac tgatttcctc 6480 ttcatttaac actttctcaa tgactgcttt aacttggtta tagcgatcaa gataacttgg 6540 tgattcaatc tctatataag gaactttata tttatctaac agttttttga gtagttgttg 6600 aaattgttgg cgttgttttt gtgagcctaa gctacgcaag ccatcatcca cccattcagt 6660 attgttttta agtaaaatag tgacatcgaa gggatattct ttaatcattg agtctaaaaa 6720 tggatgggct tttccttcat attgaatgca gaatgcttgc gtggtgatga aatccgtatc 6780 aataaatgca attttatgag aatggcgcac ggcataatca atgtatcgtt gatgaccaag 6840 cgccatttgc ggatagtcag aatattgcat cgcttgctcg tcgccaccga gcttttcaaa 6900 tacaaattca cgcccgtatt cccacgcaga agtggtatta aatacggcgg ctaacttatt 6960 aactagcacg cttttaccac tgctttctcc ccctaaaatc gccaccgttt tggcaaagaa 7020 aggacgagct tctttcggaa taaacttcca 
atattggaat ggagtggtgc gaattttggt 7080 ggcggacaca ttaaagaaag tgcggtcagg atcgactaac gaaacttcta aacctaagta 7140 tttctcgtaa ggcgctttat cttgaggttc gctactaaat acgattgaag gctcaaaatg 7200 tttttcatga aatagggttt taactgcttc actccaagat tgccagccgt ttggataact 7260 cggaataccg tcttcaacca aatgatgaat aaaaatctga tttttttgat atttgaaaat 7320 ttgctgcatc caacgcaaac gatcttgcac ggttggcatg cgtttcattt tactatcgta 7380 aaataatttc aaatcgcgca cagtgtcact acacacgata acgtgtagtt catcgacttt 7440 actgaacgct tcataaatca tatttatatg acctgtgtgt acaggataaa atttcccgaa 7500 aatgacaccg acttttttct cttttgtttt tgacataatc tgctcttgat agcgtaagaa 7560 ttgagtgtgt ttattgtagg ctaaatgaaa atatttttca cgttctgatt tagtgcctag 7620 gtattttgcg ttatacttag ctcgcattct cagggcaggg tgaaattccc taccggtggt 7680 aaagcccacg agcgtttaaa agtgcggtca atttttggca aatttttcct gcgaaattgt 7740 cctgattttc acttttaaag tcagcagatt tggtgaaatt ccaaagccga cagtaaagtc 7800 tggatgaaag agaataaaac gagttgggtt cccaaatcgt ttattttatt tgcatttctc 7860 agccctgatt ctggtattta attgaaatct caaattagga aattactatg aatcagtcaa 7920 ttttatctcc attcggcaac accgctgaag aacgcgtact taatgcaatt aatgccttta 7980 aaaatggaac tggcgtatta gttttagatg atgaagatcg tgaaaatgaa ggcgatttaa 8040 ttttcccagc agaaacaatc acgccagaac aaatggcaaa attaattcgt tatggcagtg 8100 gcattgtttg tttatgcatt actgatgaac gttgccaaca actcgattta ccgccaatgg 8160 tagaacacaa taatagcgta aacaaaacgg cttttactgt aacgattgaa gccgcaaaag 8220 gtgtttctac aggcgtatct gccgcagacc gagtaacaac cattcaaacg gcgattgcgg 8280 ataatgctgt tctgacagat ttacatcgtc ctggtcatgt tttcccactt cgtgcagcaa 8340 atggtggcgt acttactcgc cgcggacaca ctgaagcttc cgttgattta gcacgtctgg 8400 caggatttaa agaagcaggc gttatttgtg aaattactaa cgatgatggc acaatggctc 8460 gcgcccctga gattgtagaa tttgcgaaaa aatttggtta ttccgtgcta accattgaag 8520 atttagttga atatcgttta gcacataata tttaaaatgc aaaagtgcgg tgaaaatttc 8580 tatgattttt ttaccgcact ttttatatta tactgttcca actttttatt tatttgggtt 8640 acttattatg cataatgcag ctcagcacaa ttatgttatc agtttaacta ctgaacaaaa 8700 acgccgaaaa catattaccg aagaattcgg taagcagaat 
attcctttcg aattttttga 8760 tgctattacg cccgacatta ttgaagaaac cgctaaaaaa tttaatatta cattagatcg 8820 ctctcctaaa gccaagttgt cggatgggga aattggttgt gcattaagcc atattgtttt 8880 atgggattta gcattagaaa ataatttaaa ctatatcaat atctttgaag atgatattca 8940 tttgggggaa aatgccaaag aattattaga aattgattat atttctgatg atattcatgt 9000 tttaaaatta gaagcaaatg gcaagatgtt ctttaaacaa ccaaaatctg taaaatgcga 9060 tagaaatgtt tatcccatga cggtaaagca atcaggatgt gcaggatata ctgttacagc 9120 aaaaggggct aaatatttgc ttgaattagt aaaaaataaa ccacttgacg tggcggttga 9180 ttcacttgtt tttgaggatt ttttacattt taaagattat aaaatagtac aactttctcc 9240 tggtatttgc gttcaagatt ttgtgttaca tccagataat ccttttgaaa gcagtttaca 9300 agaaggacga gatagagtac acggaaatca acgcaagtcc tctattttag aaaaaataaa 9360 aaatgaattt ggacgagtaa aaataaaaat gtttggaaaa caagttccat ttaaataata 9420 aaagtgcggt taaaaatctt attaaatttt aaccgcactt ttttgtttat tttacttctg 9480 gtaaatttac cgcaccttta tagcctaatt gtcgccacgc ttcataaacc gttactgcaa 9540 cagaattaga caagttcata ctgcggctat tggcagtcat tggaatgcgg attttttgtt 9600 ccatcggcat ttcatttaaa attgacatcg gaatgccacg agtttctggg ccaaacatta 9660 aataatcccc aagtttaaat ttcacttggc tatgtgctgg acagcctttt gttgtaagtg 9720 caaaaaggcg tttgggtttt tcgctttcta aaaaggcttc aaaggttttg tggcgtttga 9780 tttccgcaaa ttcgtgataa tccaaacctg agcggcgtaa gcgtttatca tcccaagtga 9840 aacctagtgg ttcaattaag tgcaagcgaa atcctgtgtt ggcacaaagt cgaataatgt 9900 taccagtatt ttgtgggatt tctggttcat ataaaacgat atctaacata attttctctc 9960 taattctgat ataaacgata gctcactatt cccgtcgttt tttctttcaa taacgtccag 10020 ttttcaggag tgatgagcgg tttatctttc tctgtttcta cgtaaattaa cgcattgggt 10080 tttagccagt tattttcaca aagcaaacta attgcttgtt ctgccaaatt aaaatgaaat 10140 ggaggatcta aaaataccac atcaaagtgc ggttgatttt gtggctgttt tagaaaatcc 10200 aaactacttt gattaattac ttcagcctgt tctgatgagc atttgagtgt ttgtaaattc 10260 ttttttaatt gattggcgac agttttatct agttctaaaa aagtcacttt tttcgcttgg 10320 cgagaaagtg cttcaaaacc taatgaacca ctccccgcaa agccatctaa acattcagat 10380 tggtggatat aaggcattaa ccaattaaaa agcgtttctt 
tcactctatc gccagttggg 10440 cgtaagcctt cagaatttaa tacgggaagt tttcgccctc gccatagacc tgcaataatg 10500 cgaacctcgc cctttgcatt tggcgtttgt atttttttca taaagattcg atttttattg 10560 aaaattgtgc ttagtttaga cgattttttg ctagaatgaa cgacattttt atcggattac 10620 agttaagaga gcattttatg gctgaagaaa ataaaaaagg cggattttgg gcatcgttat 10680 ttgggcgtaa taaaaagcaa gatgaaccta aaattgagcc gataattgaa gaagaaaaga 10740 taaaagatat cgagccgtct attgaaaaat ttgaagctaa tgatttagtc gaagaagaaa 10800 aaatacagga aatttcaacc gcacttgagc cgattgaaga aatcatcgaa gcgaaaaatt 10860 tagaagatga gtttcagcct gttgttgaaa ttgaaactcg cgaaaaacca agtgaagggg 10920 gcttttttag tcgtttagtt aagggattac tgaaaactaa acaaaatatt ggcgcgggtt 10980 ttcgtggttt tttcttagga aagaaaattg atgatgagtt gtttgaagaa ttagaagaac 11040 aacttttaat tgcggatatt ggtgtgccga caacaagtaa aatcatcaaa aatttgactg 11100 aacacgcaag ccgcaaagaa ttacaagatg ccgagctttt atatcaacaa ttaaaagtag 11160 aaatggcgga tattcttgag ccagtagcac aaccgttaga gattgatagc acgaaaaaac 11220 cttatgtgat tttaatggtg ggcgtaaatg gcgtgggtaa aacaacgaca attggtaagc 11280 tcgcacgtaa atttcaggct gaaggaaaat cagtgatgct tgcagcgggc gatactttcc 11340 gagcggcggc tgttgagcag cttcaagttt ggggagaacg caatcatatt ccagttgttg 11400 cgcaaagcac gggttctgat tctgcgtctg tgatttttga tgcgatgcaa tcagcggctg 11460 cacgcaatat tgatattctt attgcggata cggcgggtcg tttacaaaat aaaaataact 11520 taatggacga actgaaaaaa attgttcgtg ttatgaaaaa atatgatgaa actgcgccac 11580 atgaaattat gcttacgctg gatgctggca cgggacaaaa tgctattagc caagcgaagt 11640 tatttaatga ggctgtgggt ttaacaggaa tctctctaac taaattagat ggcacggcaa 11700 aaggcggcgt gatttttgcc atagccgatc agtttaagtt gcctattcgc tatatcggtg 11760 taggcgagaa aattgaagat ttacgtgagt ttaatgcgaa agaatttatt gaggcattat 11820 tcgttcatga agaagaataa aaagaataag gaagaattgt gattaagttc tcaaatgttt 11880 ctaaagccta tcatggcgca acacagcccg ccttacaagg cttgaatttt catcttcctg 11940 tgggaagtat gacttactta gttgggcatt caggcgcagg gaaaagtaca ttactcaagc 12000 ttattatggg aatggaaaaa gccaatgcgg gtcaaatttg gtttaacggg catgatatta 12060 ctcgcttgtc gaaatatgaa 
attccatttc tgcgtcgcca aattggtatg gttcaccaag 12120 attatcgttt attaacagat cgtactgtag tggaaaatgt ggcattgccg ttgattattg 12180 caggtatgca tccaaaagat gcgaatactc gagcaatggc atctttagat cgtgtaggat 12240 tgcgtaataa agcccactat ttgccaccgc aaatttctgg cggagaacag caacgcgttg 12300 atattgcgcg tgcgattgtg cataaacctc aactcttatt ggcagatgaa ccaacgggta 12360 atttagacga tgaactttct ttagggattt ttaatttgtt tgaagaattt aatcgtctcg 12420 gaatgactgt gttaattgct actcacgaca ttaatttaat tcaacaaaaa ccaaaacctt 12480 gtcttgtgct tgaacaaggt tatttacgtt attaaggaat aatcatgagc cgatcaacgg 12540 atgcatctgt ttttgttcaa acagcctata ctttgcgtgc agtatgggcg gatttgtggc 12600 aacgtaaatt tggcacgctg ctcacgattt tagtgattgc cgtttcgctt acgatcccga 12660 ctgtgagcta tttaatgtgg aaaaatttgc atttagccac cactcaattt tatcctgaaa 12720 gtgaattgac tatttatctt cataaaaatt taagtgagga aaacgcaaat ttagtggtag 12780 agaaaatccg tcagcaaaaa ggcgtggaat cactgaatta tgtctctcgt caagaaagtt 12840 taaaagaatt caaaagttgg tctggctttg gtgaagaatt ggaaatttta gatgataatc 12900 cattgcctgc ggtcgttata gtaaaaccga cgagcgaatt taatgtttca gaaaaacgtg 12960 acgaattacg aaccaattta aataaaatta aaggcgtgca agaagttcgc ttagataacg 13020 attggatgga aaaattgacc gcactttctt ggttaatcgc ccatgtggca attttctgca 13080 ctgtattaat gacgatagcg gtatttttag ttattggtaa cagtattcgc tctgatgtgt 13140 atagtagccg atcaagtatt gatgtgatga aattgcttgg tgcaacagat caatttatcc 13200 ttcgtcctta tctttataca gggatgattt atgccctgtt aggtgggtta gttgcagcaa 13260 tttttagtag ctttattatc agctatttta cttctgcggt gaaatatgta acggatattt 13320 ttgccgtcca attcagtttg aatggattag gcgttggcga atttgtattc ttattagtgt 13380 gctgtttaat tatgggctat gttggggcgt ggattgccgc gacaaggcat attgcgatga 13440 tggagaggaa agaataattg gatatcaaaa gaaaagcgaa tttcttaaaa gaaattcgct 13500 tttttagtta gaagtgaaat atttcgctgt tg 13532 // 20000521 15.54 Leszek: T0091 >emb|U32727|HI32727 Haemophilus influenzae Rd section 42 of 163 of the complete genome. 
Length = 11878 Score = 219 bits (553), Expect = 1e-55 Identities = 109/109 (100%), Positives = 109/109 (100%) Frame = +1 Query: 1 MFGKGGLGGLMKQAQQMQEKMQKMQEEIAQLEVTGESGAGLVKITINGAHNCRRIDIDPS 60 MFGKGGLGGLMKQAQQMQEKMQKMQEEIAQLEVTGESGAGLVKITINGAHNCRRIDIDPS Sbjct: 3760 MFGKGGLGGLMKQAQQMQEKMQKMQEEIAQLEVTGESGAGLVKITINGAHNCRRIDIDPS 3939 Query: 61 LMEDDKEMLEDLIAAAFNDAVRRAEELQKEKMASVTAGMPLPPGMKFPF 109 LMEDDKEMLEDLIAAAFNDAVRRAEELQKEKMASVTAGMPLPPGMKFPF Sbjct: 3940 LMEDDKEMLEDLIAAAFNDAVRRAEELQKEKMASVTAGMPLPPGMKFPF 4086 ID HI32727 standard; DNA; PRO; 11878 BP. XX AC U32727; L42023; XX SV U32727.1 XX DT 09-AUG-1995 (Rel. 44, Created) DT 15-JUN-1998 (Rel. 56, Last updated, Version 9) XX DE Haemophilus influenzae Rd section 42 of 163 of the complete genome. XX KW . XX OS Haemophilus influenzae Rd OC Bacteria; Proteobacteria; gamma subdivision; Pasteurellaceae; Haemophilus; OC Haemophilus influenzae. XX RN [1] RP 1-11878 RX MEDLINE; 95350630. RA Fleischmann R.D., Adams M.D., White O., Clayton R.A., Kirkness E.F., RA Kerlavage A.R., Bult C.J., Tomb J., Dougherty B.A., Merrick J.M., RA McKenney K., Sutton G.G., FitzHugh W., Fields C.A., Gocayne J.D., RA Scott J.D., Shirley R., Liu L.I., Glodek A., Kelley J.M., Weidman J.F., RA Phillips C.A., Spriggs T., Hedblom E., Cotton M.D., Utterback T., RA Hanna M.C., Nguyen D.T., Saudek D.M., Brandon R.C., Fine L.D., RA Fritchman J.L., Fuhrmann J.L., Geoghagen N.S., Gnehm C.L., McDonald L.A., RA Small K.V., Fraser C.M., Smith H.O., Venter J.C.; RT "Whole-genome random sequencing and assembly of Haemophilus influenzae Rd"; RL Science 269(5223):496-512(1995). XX RN [2] RP 1-11878 RX MEDLINE; 96398784. RA Tatusov R.L., Mushegian A.R., Bork P., Brown N.P., Hayes W.S., RA Borodovsky M., Rudd K.E., Koonin E.V.; RT "Metabolism and evolution of Haemophilus influenzae deduced from a RT whole-genome comparison with Escherichia coli"; RL Curr. Biol. 6(3):279-291(1996). 
XX RN [3] RP 1-11878 RA White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D.; RT ; RL Submitted (25-JUL-1995) to the EMBL/GenBank/DDBJ databases. RL The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD RL 20850, USA XX RN [4] RP 1-11878 RA White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D.; RT ; RL Submitted (27-SEP-1997) to the EMBL/GenBank/DDBJ databases. RL The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD RL 20850, USA XX RN [5] RP 1-11878 RA White O., Clayton R.A., Kerlavage A.R., Fleischmann R.D., Peterson J., RA Hickey E., Dodson R., Gwinn M.; RT ; RL Submitted (28-MAY-1998) to the EMBL/GenBank/DDBJ databases. RL The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD RL 20850, USA XX DR SWISS-PROT; P31776; PBPA_HAEIN. DR SWISS-PROT; P31777; YHIR_HAEIN. DR SWISS-PROT; P43704; TOP3_HAEIN. DR SWISS-PROT; P44330; K1PF_HAEIN. DR SWISS-PROT; P44711; YBAB_HAEIN. DR SWISS-PROT; P44712; RECR_HAEIN. DR SWISS-PROT; P44713; SECG_HAEIN. DR SWISS-PROT; P44714; PTFB_HAEIN. DR SWISS-PROT; P44715; PTFA_HAEIN. 
XX FH Key Location/Qualifiers FH FT source 1..11878 FT /db_xref="taxon:71421" FT /organism="Haemophilus influenzae Rd" FT CDS 80..2674 FT /codon_start=1 FT /db_xref="SWISS-PROT:P31776" FT /note="similar to SP:P02918 GB:X02164 PID:581194 PID:606330 FT GB:U00096 percent identity: 55.24; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0440" FT /product="penicillin-binding protein 1A (ponA)" FT /protein_id="AAC22099.1" FT /translation="MRIAKLILNTLLTLCILGLVAGGMLYFHLKSELPSVETLKTVELQ FT QPMQIYTADGKLIGEVGEQRRIPVKLADVPQRLIDAFLATEDSRFYDHHGLDPIGIARA FT LFVAVSNGGASQGASTITQQLARNFFLTSEKTIIRKAREAVLAVEIENTLNKQEILELY FT LNKIFLGYRSYGVAAAAQTYFGKSLNELTLSEMAIIAGLPKAPSTMNPLYSLKRSEERR FT NVVLSRMLDEKYISKEEYDAALKEPIVASYHGAKFEFRADYVTEMVRQEMVRRFGEENA FT YTSGYKVFTTVLSKDQAEAQKAVRNNLIDYDMRHGYRGGAPLWQKNEAAWDNDRIVGFL FT RKLPDSEPFIPAAVIGIVKGGADILLASGEKMTLSTNAMRWTGRSNPVKVGEQIWIHQR FT ANGEWQLGQIPAANSALVSLNSDNGAIEAVVGGFSYEQSKFNRATQSLVQVGSSIKPFI FT YAAALEKGLTLSSVLQDSPISIQKPGQKMWQPKNSPDRYDGPMRLRVGLGQSKNIIAIR FT AIQTAGIDFTAEFLQRFGFKRDQYFASEALALGAASFTPLEMARAYAVFDNGGFLIEPY FT IIEKIQDNTGKDLFIANPKIACIECNDIPVIYGETKDKINGFANIPLGENALKPTDDST FT NGEELDQQPETVPELPELQSNMTALKEDAIDLMAAAKNASSKIEYAPRVISGELAFLIR FT SALNTAIYGEQGLDWKGTSWRIAQSIKRSDIGGKTGTTNSSKVAWYAGFGANLVTTTYV FT GFDDNKRVLGRGEAGAKTAMPAWITYMKTALSDKPERKLSLPPKIVEKNIDTLTGLLSP FT NGGRKEYFIAGTEPTRTYLSEMQERGYYVPTELQQRLNNEGNTPATQPQELF" FT CDS 2762..3607 FT /codon_start=1 FT /db_xref="SWISS-PROT:P31777" FT /note="similar to GB:M62809 SP:P31777 PID:1573417 percent FT identity: 100.00; identified by sequence similarity; FT putative" FT /transl_table=11 FT /gene="HI0441" FT /product="orfJ protein" FT /protein_id="AAC22100.1" FT /translation="MLSYHHSFHAGNHADVLKHIVLMLILENLKLKEKGFFYLDTHSGV FT GRYRLSSNESEKTGEYKEGIGRLWDQTDLPEDIARYVKMIKKLNYGGKELRYYAGSPLI FT AAELLRSQDRALLTELHPSDYPILRNNFSDDKNVTVKCDNGFQQVKATLPPKERRGLVL FT IDPPYELKDDYDLVVKAIEEGYKRFATGTYAIWYPVVLRQQTKRIFKGLEATGIRKILK FT IELAVRPDSDQRGMTASGMVVINPPWTLETQMKEILPYLTKTLVPEGTGSWTVEWITPE FT 
" FT CDS 3724..4089 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44711" FT /note="similar to GB:M38777 SP:P17577 GB:X04487 PID:145298 FT PID:43322 percent identity: 87.16; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0442" FT /product="conserved hypothetical protein" FT /protein_id="AAC22101.1" FT /translation="MHAIFDLIKDNIMFGKGGLGGLMKQAQQMQEKMQKMQEEIAQLEV FT TGESGAGLVKITINGAHNCRRIDIDPSLMEDDKEMLEDLIAAAFNDAVRRAEELQKEKM FT ASVTAGMPLPPGMKFPF" FT CDS 4151..4753 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44712" FT /note="similar to GB:M38777 SP:P12727 GB:X15761 PID:145299 FT PID:42697 percent identity: 75.38; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0443" FT /product="recombination protein (recR)" FT /protein_id="AAC22102.1" FT /translation="MQSSPLLEHLIENLRCLPGVGPKSAQRMAYHLLQRNRSGGMNLAR FT ALTEAMSKIGHCSQCRDFTEEDTCNICNNPRRQNSGLLCVVEMPADIQAIEQTGQFSGR FT YFVLMGHLSPLDGIGPREIGLDLLQKRLVEESFHEVILATNPTVEGDATANYIAEMCRQ FT HNIKVSRIAHGIPVGGELETVDGTTLTHSFLGRRQID" FT CDS 4769..6724 FT /codon_start=1 FT /db_xref="SWISS-PROT:P43704" FT /note="similar to GB:J05076 SP:P14294 PID:148026 GB:U00096 FT PID:1742870 percent identity: 65.94; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0444" FT /product="DNA topoisomerase III (topB)" FT /protein_id="AAC22103.1" FT /translation="MRLFIAEKPSLARAIADVLPKPHQRGDGFIKCGDNDVVTWCVGHL FT LEQAEPDAYDPKFKQWRLEHLPIIPEKWQLLPRKEVKKQLSVVEKLIHQADTLVNAGDP FT DREGQLLVDEVFSYANLSAEKRDKILRCLISDLNPSAVEKAVKKLQPNRNFIPLATSAL FT ARARADWLYGINMTRAYTIRGRQTGYDGVLSVGRVQTPVLGLIVRRDLEIEHFQPKDFF FT EVQAWVNPESKEEKTPEKSTALFSALWQPSKACEDYQDDDGRVLSKGLAEKVVKRITNQ FT PAEVTEYKDVREKETAPLPYSLSALQIDAAKRFGMSAQAVLDTCQRLYETHRLITYPRS FT DCRYLPEEHFAERHNVLNAISTHCEAYQVLPNVILTEQRNRCWNDKKVEAHHAIIPTAK FT NRPVNLTQEERNIYSLIARQYLMQFCPDAEYRKSKITLNIAGGTFIAQARNLQTAGWKE FT LLGKEDDTENQEPLLPIVKKGQILHCERGEVMSKKTQPPKPFTDATLLSAMTGIARFVQ FT DKELKKILRETDGLGTEATRAGIIELLFKRGFLTKKGRNIHSTETGRILIQALPNIATQ FT 
PDMTAHWESQLTDISQKQATYQQFMHNLNQILPDLVRFVDLNALRQLSRIKMIKSDRAK FT PKSAVKKSSKSNGETD" FT CDS 6833..7171 FT /codon_start=1 FT /db_xref="SWISS-PROT:P44713" FT /note="similar to GB:D16463 SP:P33582 GB:U01376 PID:431135 FT PID:606113 percent identity: 58.93; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0445" FT /product="protein-export membrane protein (secG)" FT /protein_id="AAC22104.1" FT /translation="MYQVLLFIYVVVAIALIGFILVQQGKGANAGASFGGGASGTMFGS FT AGAGNFLTRTSAILATAFFVIALVLGNMNSHKGNVQKGTFDDLSQAAEQVQQQAAPAKD FT NKNSDIPQ" FT tRNA 7195..7277 FT /gene="tRNA-Leu-3" FT /product="Leu GAG" FT CDS complement(7700..9370) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44714" FT /note="similar to SP:P20966 GB:M23196 PID:405893 PID:450372 FT GB:U00096 percent identity: 57.17; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0446" FT /product="PTS system, fructose-specific IIBC component FT (fruA)" FT /protein_id="AAC22105.1" FT /translation="MKLFLTQSANVGDVKAYLLHEVFRAAAQKANVSIVGTPAEADLVL FT VFGSVLPNNPDLVGKKVFIIGEAIAMISPEVTLANALANGADYVAPKSAVSFTGVSGVK FT NIVAVTACPTGVAHTFMSAEAIEAYAKKQGWNVKVETRGQVGAGNEITVEEVAAADLVF FT VAADIDVPLDKFKGKPMYRTSTGLALKKTEQEFDKAFKEAKIFDGGNNAGTKEESREKK FT GVYKHLMTGVSHMLPLVVAGGLLIAISFMFSFNVIENTGVFQDLPNMLINIGSGVAFKL FT MIAVFAGYVAFSIADRPGLAVGLIAGMLASEAGAGILGGIIAGFLAGYVVKGLNVIIRL FT PASLTSLKPILILPLLGSMIVGLTMIYLINPPVAEIMKELSNWLTSMGEVNAIVLGAII FT GAMMCIDMGGPVNKAAYTFSVGLIASQVYTPMAAAMAAGMVPPIGMTVATWIARNKFTV FT SQCDAGKASFVLGLCFISEGALPFVAADPIRVIISSVIGGAVAGAISMGLNITLQAPHG FT GLFVIPFVSEPLKYLGAIAIGALSTGVVYAIIKSKNNAE" FT CDS complement(9372..10313) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44330" FT /note="similar to SP:P23539 GB:X53948 PID:405894 PID:41487 FT GB:U00096 percent identity: 55.41; identified by sequence FT similarity; putative" FT /transl_table=11 FT /gene="HI0447" FT /product="1-phosphofructokinase (fruK)" FT /protein_id="AAC22106.1" FT /translation="MASVVTITLNAAYDLVGRLNRIQLGEVNTVETLGLFPAGKGINVA FT 
KVLKDLGVNVAVGGFLGKDNSADFEQMFNQHGLEDKFHRVDGKTRINVKITETEADVTD FT LNFLGYQISPQVWQQFVTDSLAYCLNYDIVAVCGSLPRGVSPELFADWLNQLHQAGVKV FT VLDSSNAALTAGLKAKPWLVKPNHRELEAWVGHPLNSLEEIIAAAQQLKAEGIENVIIS FT MGAKGSLWINNEGVLKAEPAQCENVVSTVGAGDSMVAGLIYGFEKGLSKTETLAFATAV FT SAFAVSQSNVGVSDLSLLDPILEKVQITMIEG" FT CDS complement(10315..11814) FT /codon_start=1 FT /db_xref="SWISS-PROT:P44715" FT /note="similar to PID:619247 SP:P24217 GB:U00096 FT PID:1736835 PID:1788494 percent identity: 51.47; identified FT by sequence similarity; putative" FT /transl_table=11 FT /gene="HI0448" FT /product="PTS system, fructose-specific IIA/FPr component FT (fruB)" FT /protein_id="AAC22107.1" FT /translation="MLELSESNIHLNANAIDKQQAIEMAVSALVQAGNVENGYLQGMLA FT RELQTSTFLGNGIAIPHGTLDTRLMVKKTGVQVFQFPQGIEWGEGNIAYVVIGIAARSD FT EHLSLLRQLTHVLSDEDTAAKLAKITDVAEFCAILMGETIDPFEIPAANISLDVNTQSL FT LTLVAINAGQLQVQSAVENRFISEVINNAALPLGKGLWVTDSVVGNVKNALAFSRAKTI FT FSHNGKAVKGVITVSAVGDQINPTLVRLLDDDVQTTLLNGNSTEILTALLGSSSDVETQ FT SVEGAVVGTFTIRNEHGLHARPSANLVNEVKKFTSKITMQNLTRESEVVSAKSLMKIVA FT LGVTQGHRLRFVAEGEDAKQAIESLGKAIANGLGENVSAVPPSEPDTIEIMGDQIHTPA FT VTEDDNLPANAIEAVFVIKNEQGLHARPSAILVNEVKKYNASVAVQNLDRNSQLVSAKS FT LMKIVALGVVKGTRLRFVATGEEAQQAIDGIGAVIESGLGE" XX SQ Sequence 11878 BP; 3756 A; 2532 C; 2274 G; 3316 T; 0 other; aggctgaatc ttttatactt aacagccaaa tcataggtct cttgacaaga atattcatta 60 acgaagcgag aattttacga tgcggatcgc aaaattaata ttaaacaccc tattaacttt 120 atgtatttta ggtttagtag caggcggaat gttgtatttc cacttaaaat ctgaattgcc 180 ctcagtagaa acattaaaaa ccgttgaatt acagcaacca atgcagattt atacggctga 240 cggtaaatta attggcgaag tgggtgagca acgccgtatt ccagtgaaat tagccgatgt 300 gccacaacgc ttaattgacg catttttagc gacggaagac agtcgttttt acgatcatca 360 cggattagac cctatcggca ttgcccgtgc attgtttgtc gcagtgagta atggcggtgc 420 atcacaaggc gcaagtacga ttactcaaca attagcgcgt aactttttct taacctcaga 480 aaaaaccatt attcgtaaag ctcgtgaagc cgtgcttgcg gtagaaatcg aaaatactct 540 caacaaacaa gaaatattag agctttattt aaacaaaatc tttttaggct atcgttctta 600 tggtgttgca gcggcagcac 
aaacctattt cggtaaatca ttgaatgaat tgaccttatc 660 ggaaatggcg attattgctg gtttacctaa agcaccttca acaatgaacc cgctttattc 720 tttaaaacgt tcagaagaac gccgcaatgt ggtgctaagc cgtatgttag atgaaaaata 780 catcagcaaa gaagaatatg atgctgcatt gaaagagccg attgtggcga gctatcacgg 840 cgcaaaattt gaatttcgag ccgattatgt cactgaaatg gtgcgtcaag aaatggtgcg 900 tcgttttggc gaagaaaatg cttacaccag tggttataaa gtatttacca ctgtactttc 960 aaaagaccaa gctgaagccc aaaaagctgt gcgtaataac ttgattgatt acgatatgcg 1020 tcacggttat cgcggtggcg cgccattatg gcaaaaaaat gaagccgctt gggacaatga 1080 tcgcattgtc ggttttctac gcaaactacc tgattcagag ccatttattc ctgcggcagt 1140 gattggaatt gtaaaaggcg gtgctgatat attgctcgct tctggggaaa aaatgacctt 1200 atcaaccaat gcaatgcgtt ggacaggcag aagcaatcct gtgaaagtcg gcgagcaaat 1260 ttggattcat cagcgtgcta atggggaatg gcaattagga caaattcccg cagcaaattc 1320 agcattagtt tctcttaatt cagataatgg tgcgattgaa gcagtggtcg gtggctttag 1380 ctatgaacaa agtaaattca atcgagccac acagtcttta gttcaagtgg gttcttctat 1440 caaaccattt atttacgcgg cagcattaga aaaaggctta acactttcaa gcgtattaca 1500 agacagcccg atttctattc aaaaaccggg acaaaaaatg tggcaaccga aaaactcgcc 1560 tgatcgttat gatggcccga tgcgtttacg cgtaggatta ggtcaatcca aaaatataat 1620 tgctattcgt gctatccaaa cggcaggtat tgatttcaca gcagaatttt tacaacgttt 1680 tggttttaaa cgtgatcaat attttgccag tgaagcctta gcacttggcg cagcctcttt 1740 cacaccatta gaaatggcgc gagcttatgc ggtgtttgat aatggtggct tcctcattga 1800 accttatatc attgaaaaaa ttcaagataa cacgggtaaa gacttattta ttgcaaaccc 1860 taaaattgct tgcattgaat gtaatgatat acctgtaatt tatggcgaaa ccaaagacaa 1920 aatcaacggc tttgccaata ttcctttagg cgagaatgcc ttaaaaccaa cagatgacag 1980 caccaatggc gaagaattag atcaacaacc tgaaactgtg cctgaactgc cagaattaca 2040 atcaaatatg accgcactta aagaagatgc gattgattta atggctgctg caaaaaatgc 2100 ttcgtcgaaa atagaatatg cgccacgtgt cattagtggc gaacttgctt ttctcattcg 2160 tagtgcctta aatacggcaa tttatggcga acaaggttta gactggaaag gcaccagctg 2220 gcgtattgca caaagcatta aacgtagcga tataggcggt aaaacaggta ctaccaacag 2280 ttcaaaagtg gcttggtatg cgggatttgg 
tgcaaactta gtaaccacaa cttatgtcgg 2340 gtttgatgat aacaaacgag tacttgggcg tggagaagca ggagcaaaaa cagcaatgcc 2400 tgcttggatc acttatatga aaacggcttt gagtgataag ccagaacgta aattgtcgct 2460 accgccaaaa attgtggaaa aaaacattga tactttgaca ggtttgcttt ctccaaacgg 2520 tgggcgcaaa gaatatttta ttgcgggaac agaaccgaca cgtacctatc tttcggaaat 2580 gcaagagcgt ggttattatg ttcctactga gttacagcaa cgcttaaata atgaaggcaa 2640 tacgccagcg acgcaaccac aagaactctt ctaaaaatga ccgcacttta gtgtgtaatc 2700 gggcgagtca attggctcgc cttcgttatt aaacaaataa ggaatatata aggaatatat 2760 tatgctgagt tatcatcact catttcacgc tggcaatcat gccgatgtct tgaaacatat 2820 tgttttaatg ctcattttgg aaaatcttaa actcaaagaa aaaggctttt tttatttgga 2880 tacgcactct ggtgtggggc gttatcgttt atcctcaaat gaatcagaaa aaacggggga 2940 atataaagaa ggtattggac gcctgtggga tcaaacagat ttacccgaag atattgctcg 3000 ttatgtaaaa atgatcaaaa aactcaatta tggtggcaaa gaactacgtt attacgcggg 3060 ttctccatta attgccgcgg aattgttgcg ctcacaagat cgcgcactat tgaccgagct 3120 tcatcctagc gattatccaa ttcttcgcaa taattttagc gacgacaaaa atgtcaccgt 3180 aaaatgtgac aatggctttc aacaagtcaa agcaacgctt ccgccaaaag aacgccgagg 3240 cttagtactc atcgatccgc cttatgaatt aaaagatgat tatgatctcg ttgttaaagc 3300 cattgaagag ggctataaac gttttgccac tggcacttat gcgatttggt atcctgttgt 3360 attacgccaa caaactaaac gtatttttaa gggtttagaa gcaacgggaa ttagaaaaat 3420 tctaaaaatt gaactcgccg ttcgcccaga tagcgatcaa cgaggaatga ctgcgagcgg 3480 tatggtggta attaatccac cttggacatt agaaacccaa atgaaagaaa ttttgcctta 3540 tctcacaaaa acattagttc cagaaggtac tggaagctgg acagtcgaat ggattacacc 3600 agagtaattc tccattataa acgccactaa atttcaccgc actttatagc ttaaaagtgc 3660 ggtaagattt tctaagattt taaataaatc ccacagttcc caaaagccca ttttttcgct 3720 agaatgcacg caatttttga cttaataaag gataacatta tgtttggaaa aggcggttta 3780 ggcggcttaa tgaaacaagc ccaacaaatg caagaaaaaa tgcaaaaaat gcaggaagaa 3840 atcgcgcaac tagaagtaac gggtgaatct ggtgcaggtt tagtgaaaat cacaattaat 3900 ggtgcacata actgccgtcg cattgacatt gatccgtctt taatggaaga cgataaagaa 3960 atgttagaag atttaatcgc cgccgctttt aacgatgcgg 
tacgtcgtgc agaagaatta 4020 caaaaagaaa aaatggcatc agttaccgct ggaatgccat taccaccagg aatgaagttt 4080 ccgttttaat tattaggcgc gtgaatataa gattcacgcg tcatctctca gtttcctctc 4140 attaatcact atgcaaagca gtccactttt agaacacctt attgaaaact tacgttgtct 4200 tccaggcgta gggcctaaat ctgcgcaacg tatggcttat catcttttac agcgtaatcg 4260 tagcggtgga atgaatttag ctcgagcact cacagaagcc atgtctaaaa ttggtcattg 4320 ttcacagtgt cgagacttta cggaagaaga cacttgcaac atttgcaata atccacgccg 4380 tcaaaattca ggtttgcttt gtgtcgttga aatgccagca gatattcaag cgattgagca 4440 aacggggcaa ttttcaggac gttattttgt tttaatggga catttgtctc cacttgatgg 4500 tattggacct cgtgaaattg gcttagattt actgcaaaaa cgcttagtag aagaatcttt 4560 ccacgaagtg attcttgcaa caaatccaac ggtggaaggc gatgcgacgg caaactacat 4620 tgctgaaatg tgccgccaac ataatatcaa agtgagtcgt atcgctcacg gtatccctgt 4680 cggtggtgaa ctggaaactg tggacggcac aacgcttact cactcttttc ttggtcgtcg 4740 tcaaatcgac taatcttctc ataccattat gcgtttattt atcgccgaaa aaccaagctt 4800 agctcgcgct attgctgatg tgctcccgaa accccatcaa cgtggggatg gttttattaa 4860 atgcggcgat aatgatgtgg taacttggtg tgtagggcat ttgctcgagc aagcagaacc 4920 cgatgcctat gatcctaaat tcaaacaatg gcgtttagaa catctcccta tcattcctga 4980 aaaatggcaa cttttaccgc gaaaagaagt aaaaaaacaa ctttctgtgg tagaaaaact 5040 gatccatcaa gcggatacac ttgttaatgc aggagaccct gatagagaag gacaattact 5100 tgtagatgaa gtgttcagtt atgccaattt atctgccgaa aaacgcgata aaattttacg 5160 ctgtttgatt agcgacctaa atccaagtgc ggtagaaaaa gcggttaaaa aattacaacc 5220 caaccgtaat tttattcctc ttgccacatc cgctctcgca cgagctcgcg ccgattggct 5280 ttatggcatt aatatgacca gagcttatac cattcgaggt agacaaactg gctatgatgg 5340 cgtgctttct gttggacgag tgcaaacccc tgtgctaggt ttaattgtac gtcgagattt 5400 agaaattgag catttccagc ccaaagactt ctttgaagta caggcttggg ttaatccaga 5460 gagcaaagaa gaaaaaactc cagaaaaatc aaccgcactt ttcagtgcct tatggcaacc 5520 tagcaaagct tgcgaggatt accaagatga tgacggcagg gtgctctcta aaggattagc 5580 agaaaaagtt gtaaaacgta ttacaaatca acctgcagaa gtcaccgaat acaaagacgt 5640 tcgtgaaaaa gaaacggcgc ccttgcctta ttcactttct gcgttacaaa 
ttgatgcagc 5700 aaagcgattt ggtatgtccg ctcaagccgt attagatacg tgccaacgcc tatatgaaac 5760 tcatcgtttg attacttatc cgcgttctga ttgtcgctat ttgcccgaag aacattttgc 5820 cgaacgccat aatgtattaa atgcaatttc aacccattgt gaagcttatc aagtcttacc 5880 aaatgttatt ttaactgaac agagaaatcg ctgttggaat gataaaaaag tagaggctca 5940 ccatgcaatc attcctactg ccaaaaatcg tccagtaaat ctaacacaag aagaacgtaa 6000 tatttacagc cttatcgctc gccaatattt aatgcaattt tgccctgacg cagaatatcg 6060 taaaagtaaa atcacattaa atatcgctgg cggtactttt atagctcaag cacgaaattt 6120 acaaacggct gggtggaaag aattgttagg caaagaagac gacacagaaa atcaagaacc 6180 cttattacca atagtaaaga aaggacaaat tttgcattgt gaacgcggag aagtgatgag 6240 taaaaaaact cagccaccaa agcccttcac tgatgcaacc ttgctttcag caatgacggg 6300 cattgcgcga tttgtgcagg ataaagaact caaaaaaatt ctgcgagaaa ccgatggatt 6360 gggtacagag gctactcgcg ctggcatcat tgaattactt tttaaacgcg gttttttaac 6420 taaaaaaggg cgaaatattc atagcacaga aacaggaaga attttaattc aggcattacc 6480 aaatattgcc acccaaccag atatgacggc tcattgggaa tcgcaactca cagacattag 6540 tcaaaaacaa gcaacctatc aacaatttat gcataacttg aatcaaatat tgcctgattt 6600 agtccgtttt gtagatctta atgcattaag acaattaagc cgcattaaaa tgattaaatc 6660 cgatagagcg aaacctaaaa gtgcggtaaa aaaatcgagt aaatcaaacg gtgagactga 6720 ttaaagttaa accaattaat atttttttta tgcaaagcac ttgcaagaaa attgaatttt 6780 ctataacatt cacgaccagt tttcttgcaa aattgcaaat tagagagaaa aaatgtatca 6840 agttctttta tttatttatg ttgttgtcgc gattgccttg atcgggttta ttctcgttca 6900 acaaggtaaa ggagcgaacg cgggcgcatc ttttggtggt ggtgcatcag gtacaatgtt 6960 tggctctgct ggtgcaggta actttttaac acgtactagt gcgattttag caaccgcatt 7020 ttttgttatt gctcttgttt taggtaatat gaattcacat aaaggcaatg ttcaaaaagg 7080 tacttttgac gatttatcac aagctgcaga gcaagttcaa caacaagcag ctccagcgaa 7140 agacaataaa aatagcgaca tcccacaata aaataggatt agcataaaat taacgctctg 7200 gtggtggaat tggtagacac gctatcttga gggggtagtg accacaggtc gtgcgagttc 7260 aagtctcgcc cagagcacca aattaaaaat atacagcact aaaagcatta caaagccttg 7320 attctactga aatcaaggct tttttattcc ctatactcca ctataaaata ctatcgaacc 
7380 ttagaatttt agtaatagac ttaataatat attcacgcta aggcaaaaaa atcatagaga 7440 cttgtaaata ccaggattat tacctagttt gctatattac taagtagggc tattttatgg 7500 catacttaac aaatccgctt tctgacaccc aaattaaaaa agcaaaactg aaagaaaaag 7560 attgctccta cttttacaaa cgattcacga atacatccat aactaaaagg actaagccct 7620 gtagattaca gagcctagtc cttgatttca tagtctaact ttttcaatga aaattatttt 7680 ttacattcaa agtgcggtat tattccgcat tgtttttcga tttaataatg gcgtaaacca 7740 cacctgttga taaagcacca atcgcaattg caccaagata tttcaacggc tcagacacaa 7800 atggaatgac aaataatccg ccgtgcggag cttgtaaggt aatatttaat cccattgaaa 7860 tcgcaccagc caccgcacca ccaatcacag agctaataat gactcgaatt ggatctgctg 7920 ccacaaatgg taatgctcct tcagaaataa aacataaacc aagaacaaat gatgctttcc 7980 ccgcatcgca ttggctaacg gtaaatttat tgcgagcaat ccaagtcgca acagtcatac 8040 caattggagg aaccatacct gcggccattg ccgcagccat tggtgtataa acttgggaag 8100 caatcaatcc aaccgaaaaa gtgtacgctg ctttatttac tggcccgccc atatcaatac 8160 acatcattgc gccgataatt gcgcctaaca caatcgcatt cacttcgccc attgaagtaa 8220 gccaattact tagctctttc ataatttcag ccactggtgg attgatgagg taaatcatcg 8280 tcaaaccaac aatcatcgaa cctaagagcg gtaaaattaa aatcggtttt aatgaagtca 8340 gactcgctgg caatcgaata atcacattta gccctttcac cacatagccc gctaagaaac 8400 cagcaataat accgccaaga atccctgcac ccgcttcact tgccagcatc cctgcaatca 8460 aaccaactgc aagtcctgga cgatcagcga tagaaaatgc aacgtagcca gcaaacacgg 8520 caatcattaa cttgaacgcc acaccactac cgatatttat caacatattt ggaagatctt 8580 gaaaaactcc tgtattttca attacattaa aactgaacat aaaggaaatc gcgatcaaca 8640 atccaccagc aaccactaac ggtaacatat gagaaacacc agtcattaag tgtttgtaca 8700 cacctttttt ctcacggctt tcttctttcg ttcccgcatt atttccacca tcaaaaattt 8760 tggcttcttt aaatgcttta tcaaattctt gctcagtttt ctttaaggct aagcctgttg 8820 aagtgcggta cattggttta cctttaaatt tatctaaagg cacgtcaata tctgctgcaa 8880 caaataccaa atcagccgct gcgacttctt ccacagtaat ttcattgcca gccccaactt 8940 gaccacgggt ttcaactttt acattccaac cttgcttttt cgcataggct tcaatcgctt 9000 cagccgacat aaacgtatgt gcaacgccag ttggacaagc agttaccgca acaatatttt 9060 
ttacaccaga aacacctgta aaactcaccg cacttttcgg tgcaacataa tcagccccat 9120 tggctaaagc atttgccagc gtgacttcag gagaaatcat ggcgatagct tcgccgataa 9180 taaagacttt tttaccaact aaatcggggt tatttggtaa tacagaacca aatacgagaa 9240 caagatctgc ttcggctggt gttcctacaa tagagacatt tgctttttgt gcagctgctc 9300 gaaaaacttc atgtaataaa taggctttca catcgcctac attagctgat tgggttaaaa 9360 ataacttcat actatccttc aatcatcgtt atttgtactt tttctaaaat agggtcaagt 9420 aaactcaaat cactcacacc gacgttactc tgagatacag caaaagcgga cacagccgta 9480 gcaaaagcta aagtttctgt tttcgataaa cctttttcaa aaccatagat taagcctgct 9540 accattgaat cccctgctcc aacggtactc acgacatttt cacattgtgc tggttcggct 9600 ttgagcacgc cttcattatt aatccataaa gaaccttttg cgcccattga aataatcacg 9660 ttttcaatgc cttccgcttt gagttgctgt gcagcggcaa taatttcttc aagactattt 9720 aatggatgac caacccaagc ctcaagttct cgatgattgg gttttactag ccaaggtttc 9780 gcttttaatc cagcggtaag cgcagcattg ctactatcaa gcaccacttt aacgccagct 9840 tgatgaagct gattcaacca atcagcaaac aattcaggag acacacctcg aggtaaactg 9900 ccacaaactg cgacaatgtc ataatttaaa caataagcca aagaatccgt cacaaattgt 9960 tgccaaactt gtgggcttat ttgatacccc aagaaattta aatctgtcac atccgcttca 10020 gtctcggtaa tttttacatt gatacgcgtt ttgccatcaa cacggtggaa tttatcttcc 10080 aaaccgtgct gattaaacat ttgctcaaaa tcagctgaat tatccttgcc caagaaaccg 10140 cctactgcaa cattaacgcc taaatctttt agtacttttg cgacattaat gccttttcca 10200 gcagggaata agcccaatgt ttctaccgta tttacttcgc caagttgaat acgatttaaa 10260 cgcccaacta aatcataggc ggcatttaag gtaatggtta cgacgcttgc catactattc 10320 ccctaaaccc gattcgatca cagcaccgat tccgtcaata gcttgttgcg cttcttcgcc 10380 cgtcgcaaca aaacgtaaac gagtcccttt gactacgcct aatgccacaa ttttcatcaa 10440 acttttagca ctgactaatt gagaattacg atcaaggttt tgaactgcta cggaagcatt 10500 atatttcttc acttcgttta ccaatattgc acttgggcga gcgtgcaaac cttgctcgtt 10560 tttaatcaca aatacggctt caatggcatt tgctggcaaa ttatcatctt ctgttaccgc 10620 aggtgtgtga atttgatcgc ccataatctc aatggtatcg ggttcagacg gtggcactgc 10680 cgaaacattc tctcccaagc cattggctat cgctttgccc aatgattcaa ttgcttgttt 
10740 tgcatcttcc ccctctgcca caaaacgtaa acgatgtcct tgtgttacac caagcgcaac 10800 aatcttcatc aaactttttg cgctgacgac ttcactttca cgagtaagat tttgcatggt 10860 tattttagaa gtgaattttt tcacttcatt aaccaaattt gcacttgggc gtgcatgtaa 10920 gccgtgttca ttacgaattg taaaagttcc tacaacagca ccttctacag attgagtttc 10980 aacatcactc gatgaaccca aaagtgcggt caaaatttct gttgaatttc cattgagcaa 11040 ggttgtttgg acatcgtcat ctaataaacg caccaaagtt ggattaattt gatcaccgac 11100 ggctgaaacg gttatcacgc cttttacggc cttgccattg tggctaaaaa ttgttttagc 11160 acgactaaat gctaaggcat ttttcacatt tcctacgaca gagtccgtta cccataaccc 11220 tttgccaagc ggtaatgccg cattatttat cacttcagaa ataaagcgat tttctaccgc 11280 actttgcacc tgtaattgcc ccgcattaat cgcaactaaa gttaataaac tttgggtatt 11340 tacatctaaa ctgatattgg cagcaggaat ttcaaatgga tcaattgtct cgcccattaa 11400 aatcgcgcaa aattcagcca catcagttat ttttgctaat tttgctgcgg tatcttcatc 11460 actcaaaacg tgggtaagtt gacgtaataa ggacaaatgc tcatcagatc gcgccgcaat 11520 accaatcacg acataagcaa tatttccttc gccccattct atgccttgag gaaattgaaa 11580 cacttgcacg cccgttttct ttaccatcaa acgagtatcc aaggtgccgt gaggaatagc 11640 gatgccattg cctaaaaagg tcgaggtttg cagctcgcgc gctaacattc cttgcaagta 11700 accattttca acattgcctg cttgaaccaa tgcggacact gccatttcga tcgcttgttg 11760 tttatcaatt gcattagcat ttaaatgaat gttgctttcc gaaagttcta acattcttgc 11820 tccttaataa tttaaaaaga aagctgcgga tgaaacaaat cgaccagcag cttattga 11878 // 20000525 22.55 Leszek: T0094 >emb|Y11650.1|ATCYCPHOS A.thaliana mRNA for cyclic phosphodiesterase Length = 741 Score = 375 bits (952), Expect = e-103 Identities = 181/181 (100%), Positives = 181/181 (100%) Frame = +1 Query: 1 MEEVKKDVYSVWALPDEESEPRFKKLMEALRSEFTGPRFVPHVTVAVSAYLTADEAKKMF 60 MEEVKKDVYSVWALPDEESEPRFKKLMEALRSEFTGPRFVPHVTVAVSAYLTADEAKKMF Sbjct: 94 MEEVKKDVYSVWALPDEESEPRFKKLMEALRSEFTGPRFVPHVTVAVSAYLTADEAKKMF 273 Query: 61 ESACDGLKAYTATVDRVSTGTFFFQCVFLLLQTTPEVMEAGEHCKNHFNCSTTTPYMPHL 120 ESACDGLKAYTATVDRVSTGTFFFQCVFLLLQTTPEVMEAGEHCKNHFNCSTTTPYMPHL Sbjct: 274 
ESACDGLKAYTATVDRVSTGTFFFQCVFLLLQTTPEVMEAGEHCKNHFNCSTTTPYMPHL 453 Query: 121 SLLYAELTEEEKKNAQEKAYTLDSSLDGLSFRLNRLALCKTDTEDKTLETWETVAVCNLN 180 SLLYAELTEEEKKNAQEKAYTLDSSLDGLSFRLNRLALCKTDTEDKTLETWETVAVCNLN Sbjct: 454 SLLYAELTEEEKKNAQEKAYTLDSSLDGLSFRLNRLALCKTDTEDKTLETWETVAVCNLN 633 Query: 181 P 181 P Sbjct: 634 P 636 1: Y11650. A.thaliana mRNA fo...[gi:2065012] LOCUS ATCYCPHOS 741 bp mRNA PLN 27-MAY-1997 DEFINITION A.thaliana mRNA for cyclic phosphodiesterase. ACCESSION Y11650 VERSION Y11650.1 GI:2065012 KEYWORDS cyclic phosphodiesterase. SOURCE thale cress. ORGANISM Arabidopsis thaliana Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; Rosidae; Capparales; Brassicaceae; Arabidopsis. REFERENCE 1 (bases 1 to 741) AUTHORS Genschik,P., Hall,J. and Filipowicz,W. TITLE Cloning and characterization of the Arabidopsis cyclic phosphodiesterase which hydrolyzes ADP-ribose 1'',2''-cyclic phosphate and nucleoside 2',3'-cyclic phosphates JOURNAL J. Biol. Chem. 272 (20), 13211-13219 (1997) MEDLINE 97294733 REFERENCE 2 (bases 1 to 741) AUTHORS Filipowicz,W. TITLE Direct Submission JOURNAL Submitted (05-MAR-1997) W. 
Filipowicz, Friedrich Miescher Institut, PO Box 2543, 4002 Basel, SWITZERLAND FEATURES Location/Qualifiers source 1..741 /organism="Arabidopsis thaliana" /sub_species="Columbia C0" /db_xref="taxon:3702" /cell_type="leaf strip culture" /tissue_type="leaf" /clone_lib="lambda ZAP" CDS 94..639 /function="hydrolysis of ADP-ribose 1,2-cyclic phosphate and nucleoside 2,3-cyclic phosphates" /codon_start=1 /product="cyclic phosphodiesterase" /protein_id="CAA72363.1" /db_xref="GI:2065013" /db_xref="SPTREMBL:O04147" /translation="MEEVKKDVYSVWALPDEESEPRFKKLMEALRSEFTGPRFVPHVT VAVSAYLTADEAKKMFESACDGLKAYTATVDRVSTGTFFFQCVFLLLQTTPEVMEAGE HCKNHFNCSTTTPYMPHLSLLYAELTEEEKKNAQEKAYTLDSSLDGLSFRLNRLALCK TDTEDKTLETWETVAVCNLNP" polyA_signal 725..730 /note="putative" polyA_site 741 BASE COUNT 222 a 168 c 152 g 199 t ORIGIN 1 attaacgtat tccctgatta gtacaactaa tataaaatgt taatcttctt ctcataatat 61 tcttgaagca aattagcact gaccgatcga tccatggaag aggtgaagaa ggatgtatac 121 tcggtttggg cattaccaga tgaggaatcg gagccccgat tcaaaaagct aatggaagct 181 ttgagatccg aattcactgg cccaagattc gttcctcacg tcaccgtcgc cgtatctgct 241 tatctgacgg cagacgaagc caagaagatg ttcgaatcag cttgcgacgg tcttaaagct 301 tacaccgcca ccgttgatcg cgtctccacc ggaactttct tctttcaatg cgttttcttg 361 cttctccaaa ccacccctga ggtaatggaa gctggtgaac actgtaagaa ccatttcaat 421 tgttccacta ccacacctta catgccgcat ttgagcctgc tttacgctga gttgacagag 481 gaagagaaga agaatgcgca ggagaaagct tacacgctcg atagcagcct cgatggactc 541 agtttccggt taaaccgact tgctctatgc aaaaccgata ccgaggacaa gactctagag 601 acatgggaga cagtggctgt atgcaatctc aatccttaag aaagtcaaaa ctttggttct 661 gaatctgtaa caaaagatca taatgaactt gcttttgaat aatgtatcgt tttctctaaa 721 actgaataat aatgttacaa t // . . . 20000526 17.56 Leszek: Nanoworld wrote: The first atomic number must be 1... We corrected "0" to "1". See attachment, please! Dear Alexander, I have added the data to the files (I cannot replace them).
The new data is not important, however, because You are still obliged to submit the CASP-formatted results to CASP before the CASP deadline. The data stored on the meta server does not need to follow any particular format and will be used only to validate Your later CASP-formatted submission. Yours, Leszek . . . 20000530 19.12 Leszek: Nanoworld wrote: Ok! My programmers are doing a Perl variant for the server... Yours, Alexander Great, but I still have to translate the amino acid sequence into DNA, so I will not be able to automate the submission. I will still have to send You E-mails. Leszek . . . 20000605 21.12 Leszek: new target T0095 >dbj|D13866.1|HUMACA Human mRNA for alpha-catenin, complete cds Length = 3429 Score = 476 bits (1212), Expect = e-133 Identities = 244/244 (100%), Positives = 244/244 (100%) Frame = +1 Query: 1 DHVSDSFLETNVPLLVLIEAAKNGNEKEVKEYAQVFREHANKLIEVANLACSISNNEEGV 60 DHVSDSFLETNVPLLVLIEAAKNGNEKEVKEYAQVFREHANKLIEVANLACSISNNEEGV Sbjct: 1228 DHVSDSFLETNVPLLVLIEAAKNGNEKEVKEYAQVFREHANKLIEVANLACSISNNEEGV 1407 Query: 61 KLVRMSASQLEALCPQVINAALALAAKPQSKLAQENMDLFKEQWEKQVRVLTDAVDDITS 120 KLVRMSASQLEALCPQVINAALALAAKPQSKLAQENMDLFKEQWEKQVRVLTDAVDDITS Sbjct: 1408 KLVRMSASQLEALCPQVINAALALAAKPQSKLAQENMDLFKEQWEKQVRVLTDAVDDITS 1587 Query: 121 IDDFLAVSENHILEDVNKCVIALQEKDVDGLDRTAGAIRGRAARVIHVVTSEMDNYEPGV 180 IDDFLAVSENHILEDVNKCVIALQEKDVDGLDRTAGAIRGRAARVIHVVTSEMDNYEPGV Sbjct: 1588 IDDFLAVSENHILEDVNKCVIALQEKDVDGLDRTAGAIRGRAARVIHVVTSEMDNYEPGV 1767 Query: 181 YTEKVLEATKLLSNTVMPRFTEQVEAAVEALSSDPAQPMDENEFIDASRLVYDGIRDIRK 240 YTEKVLEATKLLSNTVMPRFTEQVEAAVEALSSDPAQPMDENEFIDASRLVYDGIRDIRK Sbjct: 1768 YTEKVLEATKLLSNTVMPRFTEQVEAAVEALSSDPAQPMDENEFIDASRLVYDGIRDIRK 1947 Query: 241 AVLM 244 AVLM Sbjct: 1948 AVLM 1959 1: D13866. Human mRNA for alp...[gi:433410] LOCUS HUMACA 3429 bp mRNA PRI 03-FEB-1999 DEFINITION Human mRNA for alpha-catenin, complete cds. ACCESSION D13866 VERSION D13866.1 GI:433410 KEYWORDS alpha-catenin. SOURCE Homo sapiens (library: lambda ZAP II cDNA library of T.
Oda) adult colon tumor epitherial cell cell line SW 1116 cDNA to mRNA, clone HAC-b2. ORGANISM Homo sapiens Eukaryota; Metazoa; Chordata; Vertebrata; Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. REFERENCE 1 (bases 1 to 3429) AUTHORS Hirohashi,S. TITLE Direct Submission JOURNAL Submitted (08-DEC-1992) to the DDBJ/EMBL/GenBank databases. Setsuo Hirohashi, National Cancer Center, Research Institute, Pathology Division; 5-1-1 Tsukiji, Chuo-ku, Tokyo 104, Japan (Tel:03-3542-2511(ex.4200), Fax:03-3248-2737) REFERENCE 2 (bases 1 to 3429) AUTHORS Oda,T., Kanai,Y., Shimoyama,Y., Nagafuchi,A., Tsukita,S. and Hirohashi,S. TITLE Cloning of the human alpha-catenin cDNA and its aberrant mRNA in a human cancer cell line JOURNAL Biochem. Biophys. Res. Comm. 193, 897-904 (1993) COMMENT Submitted (08-DEC-1992) to DDBJ by: Setsuo Hirohashi Department of Pathology Division National Cancer Center Research Institute 5-1-1 Tsukiji, Chuo-ku Tokyo 104 Japan Phone: 03-3542-2511 Fax: 03-3248-2737. FEATURES Location/Qualifiers source 1..3429 /organism="Homo sapiens" /db_xref="taxon:9606" /cell_line="SW 1116" /cell_type="epithelial cell" /tissue_type="colon tumor" /clone_lib="lambda ZAP II cDNA library of T. 
Oda" /dev_stage="adult" CDS 67..2787 /codon_start=1 /product="alpha-catenin" /protein_id="BAA02979.1" /db_xref="GI:433411" /translation="MTAVHAGNINFKWDPKSLEIRTLAVERLLEPLVTQVTTLVNTNS KGPSNKKRGRSKKAHVLAASVEQATENFLEKGDKIAKESQFLKEELVAAVEDVRKQGD LMKAAAGEFADDPCSSVKRGNMVRAARALLSAVTRLLILADMADVYKLLVQLKVVEDG ILKLRNAGNEQDLGIQYKALKPEVDKLNIMAAKRQQELKDVGHRDQMAAARGILQKNV PILYTASQACLQHPDVAAYKANRDLIYKQLQQAVTGISNAAQATASDDASQHQGGGGG ELAYALNNFDKQIIVDPLSFSEERFRPSLEERLESIISGAALMADSSCTRDDRRERIV AECNAVRQALQDLLSEYMGNAGRKERSDALNSAIDKMTKKTRDLRRQLRKAVMDHVSD SFLETNVPLLVLIEAAKNGNEKEVKEYAQVFREHANKLIEVANLACSISNNEEGVKLV RMSASQLEALCPQVINAALALAAKPQSKLAQENMDLFKEQWEKQVRVLTDAVDDITSI DDFLAVSENHILEDVNKCVIALQEKDVDGLDRTAGAIRGRAARVIHVVTSEMDNYEPG VYTEKVLEATKLLSNTVMPRFTEQVEAAVEALSSDPAQPMDENEFIDASRLVYDGIRD IRKAVLMIRTPEELDDSDFETEDFDVRSRTSVQTEDDQLIAGQSARAIMAQLPQEQKA KIAEQVASFQEEKSKLDAEVSKWDDSGNDIIVLAKQMCMIMMEMTDFTRGKGPLKNTS DVISAAKKIAEAGSRMDKLGRTIADHCPDSACKQDLLAYLQRIALYCHQLNICSKVKA EVQNLGGELVVSGVDSAMSLIQAAKNLMNAVVQTVKASYVASTKYQKSQGMASLNLPA VSWKMKAPEKKPLVKREKQDETQTKIKRASQKKHVNPVQALSEFKAMDSI" polyA_signal 3390..3395 polyA_site 3429 BASE COUNT 982 a 773 c 894 g 780 t ORIGIN 1 gagacaaagc agcgcccgtc tgcttcgggc ctctggaatt tagcgctcgc ccagctagcc 61 gcagaaatga ctgctgtcca tgcaggcaac ataaacttca agtgggatcc taaaagtcta 121 gagatcagga ctctggcagt tgagagactg ttggagcctc ttgttacaca ggttacaacc 181 cttgtaaaca ccaatagtaa agggccctct aataagaaga gaggtcgttc taagaaggcc 241 catgttttgg ctgcatctgt tgaacaagca actgagaatt tcttggagaa gggggataaa 301 attgcgaagg agagccagtt tctcaaggag gagcttgtgg ctgctgtaga agatgttcga 361 aaacaaggtg atttgatgaa ggctgctgca ggagagttcg cagatgatcc ctgctcttct 421 gtgaagcgag gcaacatggt tcgggcagct cgagctttgc tctctgctgt tacccggttg 481 ctgattttgg ctgacatggc agatgtctac aaattacttg ttcagctgaa agttgtggaa 541 gatggtatct tgaagttgag gaatgctggc aatgaacaag acttaggaat ccagtataaa 601 gccctaaaac ctgaagtgga taagctgaac attatggcag ccaaaagaca acaggaattg 661 aaagatgttg gccatcgtga tcagatggct gcagctagag gaatcctgca gaagaacgtt 721 ccgatcctct 
atactgcatc ccaggcatgc ctacagcacc ctgatgtcgc agcctataag 781 gccaacaggg acctgatata caagcagctg cagcaggcgg tcacaggcat ttccaatgca 841 gcccaggcca ctgcctcaga cgatgcctca cagcaccagg gtggaggagg aggagaactg 901 gcatatgcac tcaataactt tgacaaacaa atcattgtgg accccttgag cttcagcgag 961 gagcgcttta ggccttccct ggaggagcgt ctggaaagca tcattagtgg ggctgccttg 1021 atggccgact cgtcctgcac gcgtgatgac cgtcgtgagc gaattgtggc agagtgtaat 1081 gctgtccgcc aggccctgca ggacctgctt tcggagtaca tgggcaatgc tggacgtaaa 1141 gaaagaagtg atgcactcaa ttctgcaata gataaaatga ccaagaagac cagggacttg 1201 cgtagacagc tccgcaaagc tgtcatggac cacgtttcag attctttcct ggaaaccaat 1261 gttccacttt tggtattgat tgaagctgca aagaatggaa atgagaaaga agttaaggag 1321 tatgcccaag ttttccgtga acatgccaac aaattgattg aggttgccaa cttggcctgt 1381 tccatctcaa ataatgaaga aggtgtaaag cttgttcgaa tgtctgcaag ccagttagaa 1441 gccctctgtc ctcaggttat taatgctgca ctggctttag cagcaaaacc acagagtaaa 1501 ctggcccaag agaacatgga tctttttaaa gaacaatggg aaaaacaagt ccgtgttctc 1561 acagatgctg tcgatgacat tacttccatt gatgacttct tggctgtctc agagaatcac 1621 attttggaag atgtgaacaa atgtgtcatt gctctccaag agaaggatgt ggatggcctg 1681 gaccgcacag ctggtgcaat tcgaggccgg gcagcccggg tcattcacgt agtcacctca 1741 gagatggaca actatgagcc aggagtctac acagagaagg ttctggaagc cactaagctg 1801 ctctccaaca cagtcatgcc acgttttact gagcaagtag aagcagccgt ggaagccctc 1861 agctcggacc ctgcccagcc catggatgag aatgagttta tcgatgcttc ccgcctggta 1921 tatgatggca tccgggacat caggaaagca gtgctgatga taaggacccc tgaggagttg 1981 gatgactctg actttgagac agaagatttt gatgtcagaa gcaggacgag cgtccagaca 2041 gaagacgatc agctgatagc tggccagagt gcccgggcga tcatggctca gcttccccag 2101 gagcaaaaag cgaagattgc ggaacaggtg gccagcttcc aggaagaaaa gagcaagctg 2161 gatgctgaag tgtccaaatg ggacgacagt ggcaatgaca tcattgtgct ggccaagcag 2221 atgtgcatga ttatgatgga gatgacagac tttacccgag gtaaaggacc actcaaaaat 2281 acatcggatg tcatcagtgc tgccaagaaa attgctgagg caggatccag gatggacaag 2341 cttggccgca ccattgcaga ccattgcccc gactcggctt gcaagcagga cctgctggcc 2401 tacctgcaac gcatcgccct 
ctactgccac cagctgaaca tctgcagcaa ggtcaaggcc 2461 gaggtgcaga atctcggcgg ggagcttgtt gtctctgggg tggacagcgc catgtccctg 2521 atccaggcag ccaagaactt gatgaatgct gtggtgcaga cagtgaaggc atcctacgtc 2581 gcctctacca aataccaaaa gtcacagggt atggcttccc tcaaccttcc tgctgtgtca 2641 tggaagatga aggcaccaga gaaaaagcca ttggtgaaga gagagaaaca ggatgagaca 2701 cagaccaaga ttaaacgggc atctcagaag aagcacgtga acccggtgca ggccctcagc 2761 gagttcaaag ctatggacag catctaagtc tgcccaggcc ggccgccccc acccctcggg 2821 gctcctgaat atcagtcact gttcgtcact caaatgaatt tgctaaatac aacactgata 2881 ctagattcca cagggaaatg ggcagactga accagtccag gtggtgaatt ttccaagaac 2941 atagtttaag ttgattaaaa atgcttttag aatgcaggag cctacttcta gctgtatttt 3001 ttgtatgctt aaataaaaat aaaaattcat aaccaaagag aatcccacat tagcttgtta 3061 gtaatgctct gaccaagccg agatgcccat tctcttagtg atggcggcgt tagggtttga 3121 gagaagggaa tttggctcaa cttcagttga gagggtgcag tccagacagc ttgactgctt 3181 ttaaatgacc aaagatgacc tgtggtaagc aacctgggca tcttagaagc agtccctgga 3241 gaaggcatgt tcccagaaag gtctctggag ggacaaactc actcagtaaa acataatgta 3301 tcatcatgaa gaaaactgat tctctatgac atgaaatgaa aattttaatg cattgttata 3361 attactaatg tacgctgctg caggacatta ataaagttgc ttttttaggc tacagtgtct 3421 cgatgccat // . . . During this interval the correspondence was conducted by Sergey Polishchuk, who sent the results to the CAFASP-2 competition *************************************************************************** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * . . . 20000818 21.43 Leszek: ----- Original Message ----- From: Leszek Rychlewski To: pikoworld Sent: Wednesday, August 16, 2000 6:10 PM Subject: Re: answer Dear Alexander, I am back in the office and I can send You more detailed info about CASP and CAFASP. CAFASP is only a part of CASP. CASP evaluates the predictions; CAFASP stores the raw results. Please read (!!!)
the instructions: http://www.cs.bgu.ac.il/~dfischer/CAFASP2/announce3.html (attached). The CASP format is described in: http://predictioncenter.llnl.gov/casp4/doc/casp4-format.html (attached)

> pikoworld wrote:
>
> Dear Leszek!
> I sent to you target T0098 at 14.06.2000 (t0098,t0099,t0100,t0101).

We have it. It is stored at the CAFASP site.

> I send now this target once more (see attachment, please!)

No need to do this.

> Full list of files, which I have sent:
> T0089  PDB  225 248  19.05.00  18:03  t0089.pdb
> T0090  PDB  113 524  19.05.00  18:04  t0090.pdb
> T0086  PDB   89 452  26.05.00  18:00  t0086.pdb
> T0087  PDB  154 732  26.05.00  18:01  t0087.pdb
> T0088  PDB   80 272  26.05.00  18:01  t0088.pdb
> T0091  PDB   56 472  21.05.00  22:40  t0091.pdb
> T0092  PDB  130 864  21.05.00  22:41  t0092.pdb
> T0093  PDB   87 956  21.05.00  22:41  t0093.pdb
> T0096  PDB  129 708  12.06.00  13:44  t0096.pdb
> T0095  PDB  128 892  06.06.00  12:56  t0095.pdb
> T0094  PDB   97 816  26.05.00  10:52  t0094.pdb
> T0097  PDB   55 724  12.06.00  13:44  t0097.pdb
> T0098  PDB   64 632  14.06.00   9:25  t0098.pdb
> T0099  PDB   31 084  14.06.00   9:26  t0099.pdb
> T0100  PDB  177 444  14.06.00   9:27  t0100.pdb
> T0101  PDB  205 800  14.06.00   9:27  t0101.pdb
> T0102  PDB   34 372  17.06.00  10:12  t0102.pdb
> T0104  PDB   85 984  29.06.00  14:37  t0104.pdb
> T0103  PDB  184 720  29.06.00  14:35  t0103.pdb
> T0105  PDB   52 120  07.07.00  19:42  t0105.pdb
> T0106  PDB   67 624  07.07.00  19:41  t0106.pdb
> T0107  PDB  101 826  13.07.00   0:13  T0107.pdb
> T0108  PDB  104 818  13.07.00   0:13  T0108.pdb
> T0109  PDB  100 942  13.07.00   0:12  T0109.pdb
> T0110  PDB   68 302  13.07.00   0:13  T0110.pdb
> T0111  PDB  226 402  13.07.00   0:13  T0111.pdb
> T0112  PDB  181 250  14.07.00  19:46  T0112.pdb
> T0113  PDB  130 250  14.07.00  19:47  T0113.pdb
> T0114  PDB   47 290  14.07.00  19:47  T0114.pdb
> T0115  PDB  156 772  28.07.00  14:46  T0115.pdb
> T0116  PDB  429 860  26.07.00  14:30  T0116.pdb
> T0117  PDB  139 024  26.07.00  14:30  T0117.pdb
> T0118  PDB   82 176  28.07.00  14:53  T0118.pdb
> T0119  PDB  179 756  28.07.00  14:54  T0119.pdb
> T0120  PDB  181 524  28.07.00  14:54  T0120.pdb
> 36 files 4 384 632 bytes

We have all the files.

> I don't understand why I must register in CASP and CAFASP once more.

You don't need to.

> You have written that the deadline of CASP is 15 August

This was the deadline for target T0098.

> Our Predictor Registration Code in CASP 1546-5466-5112. What must I do with it now?

You need to format the predictions according to CASP and send them there. Please check the CASP site: http://predictioncenter.llnl.gov/casp4/targets/cgi/casp4-view.cgi to see the CASP deadlines for all targets.

> I consider the standard SS incorrect, so I have sent you only PDB.

This is enough for CAFASP, but we don't evaluate the predictions. They are evaluated by CASP. You need to translate the predictions, otherwise how are they going to evaluate them?

> There is no simple way from the real secondary structure which is built by our program to the incorrect standard SS. So we can only intuitively compare our format with SS format. It is a very laborious task.

Who is going to evaluate the predictions then?

> I would like to use only PDB format.

You can use the stored CAFASP results and evaluate the results yourself after the true secondary (and tertiary) structure is known. If there is no simple way to evaluate it, then CASP is not able to do it.

> If I understand you correctly PDB files (t0089...t0120.PDB) had to be sent not only to your address but to some other simultaneously.

No. You need to reformat the predictions and send them later, before the CASP deadline (for each target), to CASP.

> We haven't made a server for technical reasons, so we can take part in CAFASP only through your E-mail.
> You have written that you include our PDB files in the archive of the competition. Must we send PDB files through the official channel?

No. All files are deposited at the CAFASP server. From the CAFASP point of view You don't need to do anything else. But You probably want Your predictions to be evaluated somehow.
You have two options:
1) You evaluate the predictions: CAFASP will provide the proof that they were blind predictions (all results are stored earlier at our site).
2) You ask CASP to evaluate the predictions. Then You have to format the predictions according to CASP rules and send them there.

CAFASP is only a facility to store results. We do not evaluate secondary structure prediction. Our goal is to prove that the results are blind predictions obtained in an automated way. For example: You can use CAFASP stored results to write a paper about blind predictions with Your method.

Sincerely
Leszek

CAFASP2 RULES AND INFORMATION

ANNOUNCEMENT NO. 3.

This announcement overrides our previous two announcements. This announcement contains new rules and modifications of the procedures announced in our previous announcement. The main differences are:
* 1. There will be two types of submissions for each target:
    * 1. Available and registered servers.
    * 2. Late-comers and unavailable servers.
* 2. The main evaluation will be carried out only on the type 1 submissions.
* 3. The evaluation will be carried out on 90% of the targets.
* 4. The deadline for submitting the formatted model has been changed from two weeks, as previously stated, to the regular casp4 deadline.

What follows is the complete CAFASP2 announcement, containing both the rules that have not changed and those that have been modified.

GOAL

The goal of CAFASP2 is to evaluate the performance of fully automatic structure prediction servers available to the community. In contrast to the normal CASP procedure, CAFASP2 will answer the question of how well servers do without any intervention of experts, i.e. how well ANY user can predict protein structure. As in CAFASP1, CAFASP2 assesses the performance of methods without the user intervention allowed in CASP. CAFASP2 will take place as an official section of CASP4, which will be held by mid-2000.
All developers of automatic prediction servers are invited to participate in CAFASP2. Interested parties are invited to send an e-mail to dfischer@cs.bgu.ac.il so they will be included in the CAFASP2 mailing list. Predictors wishing to include their servers in CAFASP2 should register as soon as possible at the CAFASP2 meta-server.

TERMINOLOGY

A "server" is the automated structure prediction server that participates in CAFASP. A server will have an identification name and its corresponding url. Servers for any aspect of structure prediction can participate in CAFASP.
A "server's person" is the person in charge of the maintenance and function of a server.
A "raw server output" is the output obtained by a server.
A "server's submission" is the prediction that a server's person files to CAFASP based on the raw server output (and is not necessarily the same as the raw server output; see below).

REGISTRATION TO CAFASP2

A server's person willing to include his/her server in CAFASP2 must register as a predictor to CASP4, and state that this registration corresponds to his/her server. The server's person will receive a participant's identification. A server's person can also register as a regular CASP4 participant. Servers that run different methods should register each of the methods separately. In addition, servers intended to participate in CAFASP also need to register at the CAFASP site, and receive an explicit acknowledgment that the server has been included in the list of CAFASP participants. Only those servers that have registered at both the Prediction Center and at the CAFASP center will be evaluated.

CAFASP2 SUBMISSION PROCEDURE

As each target sequence is released, the CAFASP meta-server will submit it to all of the prediction servers and archive the raw server output (in the servers' native formats).
In addition, each server's person will be responsible to reformat their archived results into CASP4 format and to submit them to CASP4 by the standard procedures, and subsequently to CAFASP2. We encourage (but do not require) that server persons make their best efforts to have their raw server output as close to the casp4 format as possible. Only the submitted entries in the CASP4 format will be evaluated, but these will be compared with the archived results to ensure that the content is identical. Any discrepancies will be publicly announced and will result in disqualification. Notice that in the above procedure, the server's person is responsible to submit the prediction to CASP4, using the regular CASP4 procedure. This submission must be identical in content to the output produced by his/her server, but may vary in format. Because it may be difficult to enforce that a server produces valid CASP4 formats, we allow a server's person to take the raw server output and transform it to valid CASP4 format, as long as it is identical in content. As the raw servers' output of the registered servers is collected, it will be made available to all, in the CAFASP2 web-site. This will allow anybody to use the predictions from the automated servers for other purposes. Let's repeat the above: Upon release of a target by the casp4 people, the cafasp2 metaserver submits it to every registered server. The metaserver compiles the raw results of each server and stores them. The compiled results of the meta-server will be available to all. Then the "server person" can look at what the cafasp2 metaserver has collected from his/her server, and produce by whatever means he/she wants, a casp4 compatible format which should be identical in content to the info stored at the meta-server. It is the server person's responsibility to submit the formatted prediction directly to casp4 and again to cafasp2. 
The validation process will check that the formatted submissions are identical in content to the ones stored at the metaserver. SUBMISSION DEADLINES The CAFASP deadline for receiving the automated "raw-output" from the servers will be 48 hours after the target was sent. The CAFASP deadline for submission of the formatted server's submission will be the same as the regular CASP4 deadline. Notice that the server's person is responsible to format the prediction, to submit it to CASP4 using the regular procedure, and finally to submit the validated prediction to CAFASP2. Notice that in addition to the formatted submission, the CAFASP meta-server will collect the results of each registered server immediately after the publication of each casp4 target, and that the formatted submission need be identical in content to the stored results. The automated results can be in any format, as long as they contain all the information needed to verify that the submitted prediction is identical in content to the automated results. In addition, ALL the results collected by the CAFASP meta-server will be made immediately available to the public. SUBMITTING TO THE PREDICTION CENTER AND TO THE CAFASP CENTER As stated before, the "raw-server output" will be collected by the CAFASP metaserver, and no human will intervene in this process. However, the valid-format CASP4 submission file must be prepared by the server-person. This file needs to be first submitted to the Prediction Center by the CASP4 deadline AND SUBSEQUENTLY to the CAFASP center. The CAFASP submission need include the exact CASP submission file, plus the CASP submission identification number received. Both submissions must be identical (dissimilar submissions will be disqualified) and identical in content to the raw-server output stored at CAFASP. Only those submissions submitted at both centers will be evaluated. TYPES OF SUBMISSIONS For each target there will be two main types of submissions: * 1. 
AVAILABLE AND REGISTERED SERVERS: Those registered servers for which the CAFASP2 meta-server was able to compile their raw-output within the 48 hours after each target is released. * 2. LATE-COMERS/UNAVAILABLE SERVERS: Those servers that did not succeed to send their raw-output to the CAFASP2 meta-server within the 48 hours. For the main CAFASP evaluation, only predictions of type 1 will be considered. Validation of the formatted submissions will only be carried out for predictions of type 1. The type 2 predictions are aimed at servers that for any reason were not available within the 48 hours deadline, and thus no raw-output could be collected. Predictors submitting in this category implicitly state that the prediction reflects a fully automated process, but this will not be validated by CAFASP. To enter a prediction in this type, predictors need send to CAFASP only the valid, formatted, regular CASP4 submission. An additional CAFASP evaluation will be carried out for predictions of type 2, based on the manually prepared, formatted submissions. Because these can not be considered fully automated predictions, they will not be validated. UN-REGISTRATION Any cafasp participant can ask to withdraw his/her participation at any time, which will mean that his/her server's url will be removed from cafasp's meta-server. CATEGORIES FOR CAFASP2 Any automated server can participate in CAFASP2, including Homology Modeling, Threading, Ab-Initio, Secondary Structure Prediction and Contacts Prediction. ASSESSMENT OF PREDICTIONS Predictions submitted to CAFASP will undergo the exact same evaluation procedure as the normal CASP submission. In addition, the CAFASP sub-committees may decide to apply additional evaluation procedures. The latter will include only automatic evaluators and the measures used by them will be made public as soon as possible. More details about the additional cafasp evaluation are available here . 
A comparison of the performance of CAFASP2 versus the regular CASP4 submission will also be carried out. The additional automatic evaluation of the CAFASP2 results is another difference between CASP and CAFASP. The automatic evaluation has the advantages of being reproducible, quantitative and objective. However, as it is difficult to agree upon a single evaluation measure, we encourage all server persons to understand the evaluation methods BEFORE they register, and if they find them inadequate, they may choose not to participate in CAFASP. The automatic evaluation frees CAFASP2 from the "assessment" problem, as it will be a program (described in advance) that will do the rankings. Please check this site soon to see the description of the evaluation methods. CAFASP COMMITTEES Currently the people involved in the CAFASP committee are Leszek Rychlewski, (leszek@bioinfo.pl), Arne Elofsson (arne@razor.biokemi.su.se), Burkhard Rost (rost@columbia.edu), Adam Zemla (adamz@llnl.gov), Krzysztof Fidelis (fidelis@llnl.gov), Naomi Siew (nomsiew@cs.bgu.ac.il) and Daniel Fischer (dfischer@cs.bgu.ac.il). As advisors we currently have Steven Brenner (brenner@compbio.berkeley.edu). Sub-committees for homology modeling, threading, ab-initio and secondary structure servers will be coordinated by the CAFASP committee. The Sub-committees will be in charge of the additional automated evaluation (if required) and of the comparative analysis of the results. As of today these are the sub-committees and people assigned to them: Threading sub-committee: Leszek Rychlewski (leszek@bioinfo.pl), Arne Elofsson (arne@razor.biokemi.su.se) and Daniel Fischer (dfischer@cs.bgu.ac.il). Ab initio sub-committee: Angel Ortiz (ortiz@scripps.edu) Homology Modeling sub-committee: Roland L. Dunbrack (RL_Dunbrack@fccc.edu) Secondary Structure sub-committee: Burkhard Rost (rost@columbia.edu) and James Cuff (james@ebi.ac.uk). Contacts Prediction sub-committee: Alfonso Valencia (valencia@cnb.uam.es). 
We invite interested parties to be involved in the various sub-committees.

CAFASP2 url: http://www.cs.bgu.ac.il/~dfischer/CAFASP2

Fourth Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction

Formats for Prediction Submission

This document describes the format required for submissions of CASP4 predictions.

General rules

* Models should be submitted by e-mail to submit@predictioncenter.llnl.gov. Note: the CASP4 model submission facility is available here.
* Predictions for CASP4 may be submitted in four separate formats:
    TS  # 3D atomic coordinates (Tertiary Structure) prediction
    AL  # Format to express unambiguous ALignments to PDB entries
    SS  # Secondary Structure prediction
    RR  # Residue-Residue separation distance prediction
* One team may make a prediction of a target by submitting up to five models in TS/AL, SS, and RR formats (models in AL format are considered equivalent to those in TS format and will be translated to TS internally before evaluation). Most of the evaluation and assessment will focus on the model labeled '1' (model index 1, see MODEL record).
* Each submission may contain only one of the four format categories.
* Submission of each model begins with PFRMAT and ends with END record.
* Each submission may contain only one model, beginning with the MODEL record, ending with END, and no target residue repetitions.
* Submission of a duplicate model (same target, format category, group, model index) will replace previously accepted model, provided it is received before the target has expired. Note: models in AL format are considered equivalent to those in TS format.
* Each submitted model is automatically verified by the format verification server. Only accepted models will be assigned an ACCESSION CODE. A unique ACCESSION CODE is composed from the number of the target, prediction format category, prediction group number, and model index.

Examples:

Accession code T0045SS067_1 has the following components:
    T0045  target number
    SS     Secondary Structure prediction (PFRMAT SS)
    067    prediction group 67
    1      model index 1 (see MODEL record)

Accession code T0044TS005_2 has the following components:
    T0044  target number
    TS     Tertiary Structure (3D atoms coordinates, PFRMAT TS)
    005    prediction group 5
    2      model index 2 (by default considered as FINAL/REFINED)

Accession code T0044TS005_2u has the following components:
    T0044  target number
    TS     Tertiary Structure (3D atoms coordinates, PFRMAT TS)
    005    prediction group 5
    2u     model index 2 UNREFINED set of coordinates

Note: the CASP4 model list viewer is available here.

Format description

All submissions contain the records described below. Each of these records must begin with a standard keyword. In all submissions the standard keywords must begin in the first column of a record. The keyword set is as follows:

    PFRMAT  Format specification # TS , SS , RR or AL
    TARGET  Target identifier from the CASP4 target table
    AUTHOR  XXXX-XXXX-XXXX # The registration code of the Group Leader
    REMARK  Comment record (may appear anywhere, optional)
    METHOD  Records describing the methods used
    MODEL   Beginning of the data section for the submitted model
    PARENT  Specifies structure template used to generate the TS/AL model
    TER     Terminates independent segments of structure in the TS/AL model
    END     End of the submitted model

Record PFRMAT is used for all submissions.
    PFRMAT TS  TS indicates that the submission contains 3D atomic coordinates in standard PDB format
    PFRMAT SS  SS indicates that the submission contains secondary structure prediction
    PFRMAT RR  RR indicates that the submission contains residue-residue separation distance prediction
    PFRMAT AL  AL indicates that the submission contains unambiguous alignments to PDB entries

Record TARGET is used for all submissions.
    TARGET Txxxx  Txxxx indicates the id of the target predicted. Targets from the CASP4 Target list are valid.
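The accession code layout shown in the examples earlier in this section (target number, format category, three-digit group number, model index, optional "u") can be sketched as a small parser. This is only an illustration: the regular expression is inferred from the three example codes, and the names `ACCESSION_RE`, `Accession` and `parse_accession` are hypothetical, not part of any CASP4 software.

```python
import re
from typing import NamedTuple

# Assumed pattern, inferred only from the three example codes in the text:
# "T" + 4 digits, 2-letter format category, 3-digit group number,
# "_", model index 1-5, optional "u" for an UNREFINED model.
ACCESSION_RE = re.compile(r"^(T\d{4})(TS|AL|SS|RR)(\d{3})_([1-5])(u?)$")


class Accession(NamedTuple):
    target: str        # e.g. "T0044"
    category: str      # PFRMAT category: TS, AL, SS or RR
    group: int         # prediction group number
    model_index: int   # 1..5
    unrefined: bool    # True when the index carries the "u" suffix


def parse_accession(code: str) -> Accession:
    """Split an accession code into its components (hypothetical helper)."""
    match = ACCESSION_RE.match(code)
    if match is None:
        raise ValueError("not a recognised accession code: %r" % code)
    target, category, group, index, flag = match.groups()
    return Accession(target, category, int(group), int(index), flag == "u")


if __name__ == "__main__":
    for example in ("T0045SS067_1", "T0044TS005_2", "T0044TS005_2u"):
        print(example, parse_accession(example))
```

Since AL models are translated to TS before evaluation, accepting "AL" in the category field is a guess; the examples only show SS and TS.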
Note: for some targets the residue numbering can be not standard. Please check residue numbering in "Template PDB file" provided for each Txxxx target from CASP4 "Target list". Record AUTHOR is used for all submissions. AUTHOR XXXX-XXXX-XXXX XXXX-XXXX-XXXX indicates the Group Leader's registration code. This code is the prediction submission code obtained upon registration at the CASP4 WEB sites (Prediction Center). Members of prediction groups who intend to submit predictions should use the registration code of the Group Leader for all predictions submitted by that group. REMARK Optional. PDB style 'REMARK' records may be used anywhere in the submission. These records may contain any text and will in general not influence evaluation. Records METHOD are used for all submissions. These records describe the methods used. Predictors are urged to provide as full a description of the methods as possible, including references, data libraries used, and values of non-default parameters. These descriptions will be made available via the Prediction Center WEB pages as well as printed along with the other materials distributed at the meeting. Length of 100 - 500 words is suggested. Record MODEL is used for all submissions. Signifies the beginning of model data (3D atomic coordinates, an unambiguous alignment to a PDB entry, residue-residue separation distance prediction, or secondary structure prediction). MODEL n [REFINED|UNREFINED] n Model index n is used to indicate predictor's ranking according to her/his belief which model is closest to the target structure (1 <= n <= 5). Model index is included automatically in the ACCESSION CODE. REFINED The set of coordinates labeled REFINED will be considered as a final model (to allow the evaluation of the results of an automated refinement process, such as molecular dynamics). Models submitted without any label: REFINED or UNREFINED will be considered by default as final. 
UNREFINED Coordinates labeled UNREFINED will be compared only to the final set (REFINED) with the same model index n, to evaluate the effectiveness of the refinement method. If UNREFINED model is submitted, a REFINED model must be submitted as well. The letter "u" will be added to the model index in the ACCESSION CODE of the UNREFINED model. Record PARENT is used for all submissions in the TS (and AL) format. PARENT record indicates structure templates used to generate any independent segment of MODEL (see description of the TS format below). The PARENT record should be placed as the first record of any such independent segment. Only one PARENT record per structure segment is allowed. PARENT N/A Indicates an ab initio prediction, not directly based on any known structure. Note that this is the only indication in the file that the prediction is ab initio, so is a critical piece of information. PARENT NONE [n1 n2] Indicates that the predictor believes that there is no structure in the present PDB that is close enough to be used as a template. This is an entry requested by those predictors who use threading and sequence comparison methods. With structural genomics projects being designed to determine the structure of proteins with novel folds, the ability to predict when a fold is unknown is becoming increasingly important, and predictors are urged to make such submissions. Delimiters n1 n2 indicate the range of the target sequence predicted as having no homologue in the current PDB. Omission of n1 n2 indicates the entire target (see Example 1 (C)). PARENT mabc_A Indicates that the model or the independent segment of structure is based on a single PDB entry mabc chain A (use _A to indicate chain A). Most threading and sequence search submissions would now be submitted with this form of the PARENT record. A comparative modeler using a single parent structure would also use this form. Note that, in order to be accepted, the code must correspond to a current PDB entry. 
PARENT mcdc ndef_g [ohij_k ...] Is used only in comparative modeling and indicates that the model is based on more than one structure template. Up to five PDB chains may be listed here with additional detailed information included in the METHOD records. In threading and sequence search, subdomains of the target structure found to correspond to different known folds should be submitted as independent segments of structure with reference to only one PDB chain per segment. Record TER is used to terminate an independent segment of structure (PFRMAT TS and PFRMAT AL). TER 3D atomic coordinates (PFRMAT TS). Standard PDB atom records are used for the atomic coordinates. Format of the submission requires that 80 column long records are used. These may be spaces when needed (see target template PDB files as provided in specific target descriptions available through the CASP4 target table). This requirement is necessitated by some of the software used in the evaluation of predictions. Coordinates for each model or an independent structure segment should begin with a single PARENT record and terminate with a TER record (see above). It is requested that coordinate data be supplied for at least all non-hydrogen main chain atoms, i.e. the N, CA, C and O atoms of every residue. Specifically, if only CA atoms are predicted by the method, predictors are encouraged to build the main chain atoms for every residue before submission to CASP. One program that can make such a conversion is Maxsprout server of Liisa Holm and co-workers. (If only CA atoms were submitted it would not be possible to run most of the analysis software, which would severely limit the evaluation of that prediction.) When multiple independent segments of structure are used in a prediction, they will be evaluated separately with no assumption of a common frame of reference between the segments. For any given MODEL, no target residue may be repeated among all such independent structure segments. 
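The TS requirements above (80-column records, N, CA, C and O atoms for every residue, no target residue repeated across independent segments) can be checked before submission with a sketch like the following. It is not the Prediction Center's verification server; the function names are hypothetical, and the column positions assume the standard fixed-column PDB ATOM record layout.

```python
from collections import Counter

MAIN_CHAIN = {"N", "CA", "C", "O"}


def atom_line(serial, name, resname, chain, resseq, x, y, z, occ=1.0, err=0.0):
    """Build one ATOM record padded with spaces to 80 columns (hypothetical helper)."""
    record = "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, chain, resseq, x, y, z, occ, err)
    return record.ljust(80)


def residue_atoms(atom_lines):
    """Map (chain ID, residue number + insertion code) to the atom names present."""
    residues = {}
    for line in atom_lines:
        if line.startswith("ATOM"):
            key = (line[21], line[22:27].strip())
            residues.setdefault(key, set()).add(line[12:16].strip())
    return residues


def missing_main_chain(atom_lines):
    """Residues that lack one or more of N, CA, C, O, with the missing names."""
    return {key: sorted(MAIN_CHAIN - atoms)
            for key, atoms in residue_atoms(atom_lines).items()
            if not MAIN_CHAIN <= atoms}


def repeated_residues(segments):
    """Residues that occur in more than one independent segment of a model."""
    counts = Counter(key for seg in segments for key in residue_atoms(seg))
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    full = [atom_line(i + 1, n, "ALA", "A", 1, 0.0, 0.0, 0.0)
            for i, n in enumerate(["N", "CA", "C", "O"])]
    partial = [atom_line(5, "N", "GLY", "A", 2, 1.0, 1.0, 1.0),
               atom_line(6, "CA", "GLY", "A", 2, 2.0, 2.0, 2.0)]
    print(missing_main_chain(partial))
    print(repeated_residues([full, partial]))
```

Here each segment is simply a list of ATOM lines taken from between one PARENT record and its TER record.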
In comparative modeling and in threading, potential multi-domain nature of targets will be addressed in the evaluation even if the prediction is made in a single frame of reference (i.e. without separation into multiple segments of structure). For such predictions segmentation should only be used to allow multiple model predictions (effectively up to 5 predictions for each such domain). Notes: - atoms for which a prediction has been made must contain "1.0" in the occupancy field; those for which no prediction is made must either contain "0.0" in that field or be skipped altogether - error estimates, in Angstroms, must be provided in the temperature factor field An unambiguous alignment to a PDB entry used for threading predictions (PFRMAT AL). Alignment for each model or an independent structure segment should begin with a single PARENT record and terminate with a TER record (see above). The (four column) alignment data records provide: target residue one letter symbol, target residue sequence number, PDB residue one letter symbol, and PDB residue sequence number with an insertion code if necessary (see Example 4): aa1 n1 aa2 n2 Note: - residues for which no prediction is made must be skipped - if a chain ID is specified in the PDB template of the target, then the target residue sequence number should be composed of a chain ID and residue number, e.g. A2, B44 The PDB code with chain extension of the structure the alignment is based on should be placed in the PARENT record. Only one PDB code per independent structure segment is allowed. PDB codes should refer to structures containing at least main chain atomic coordinates (see TS format). As in the coordinate submissions, when multiple independent segments of structure are used in a prediction, they will be evaluated separately with no assumption of a common frame of reference between the segments. For any given MODEL, no target residue may be repeated among all such independent structure segments. 
Potential multi-domain nature of targets will be addressed in the evaluation even if the prediction is made in a single frame of reference (i.e. without separation into multiple segments of structure). For such predictions segmentation should only be used to allow multiple model predictions (effectively up to 5 predictions for each such domain).

Note: The facility to translate sequence-structure alignments (AL format) into standard PDB atom records (TS format) is available as an additional AL2TS service.

Secondary structure prediction (PFRMAT SS).
Data in this format is inserted between the MODEL and END records of the submission file. The (three column) format record consists of residue code, secondary structure assignment code, and a number specifying the associated confidence level:

aa ss p

The symbols for the 3 state secondary structure are 'H'=helix, 'E'=strand, 'C'=coil. Confidence level is the probability of a residue being predicted correctly, with values in the range 0.0 - 1.0. The entire sequence of the target should always be given. If parts cannot be predicted, a probability of 0.0 should be used.

Residue-Residue separation prediction (PFRMAT RR).
Data in this format is inserted between the MODEL and END records of the submission file. This is the format for the predicted separation distance between pairs of residues. The distance is defined as the separation between C-beta atoms (C-alpha for glycine residues).

Because the RR contact prediction evaluation issues are still not finalized by the Consultancy Group and the deadline for the first target is approaching, the CASP3 format will be extended for CASP4. However, it is STRONGLY recommended that the full flexibility of the format is not used, to allow a simple and uniform evaluation. Thus values of d1 = 0 and d2 = 8 are recommended.
If it is planned to submit using other distance ranges (d1,d2), we request that a corresponding prediction with only the (0,8) ranges is submitted as model "1", and the original prediction as model "2", with the appropriate explanation in the REMARK field regarding the relation to model "1".

The (five column) RR format:

i j d1 d2 p

Notes (see Example 3):
- entire target sequence should be split over multiple lines with a maximum of 50 residues per line
- for intrachain residue-residue contacts, residue number indices i and j should be used for distance specification (i < j), i.e. only one diagonal of the separation matrix should be supplied
- the distances d1 and d2 (real numbers) should indicate the range of Cb-Cb distance predicted for the residue pair (C-alpha for glycines)
- the real number p should range from 0.0 to 1.0 to indicate the probability of the distance falling within the predicted range
- residue 'contacts' (defined here, as in CASP2, as Cb-Cb<8A) can be predicted with this format as: i j 0 8 p
- any pair NOT listed is assumed to be NOT considered by the predictor
- to evaluate the subset of residue-residue separation distances that represent 'contacts', 4 separation interval bins will be used (as in CASP2); separation is calculated along the chain as the number of residues between the residues in contact:
    1 residue or more : 1-9999
    from 1 to 4 residues : 1-4
    from 5 to 8 residues : 5-8
    9 residues or more : 9-9999
  Example: Let's assume that two residues are in contact and there are 6 residues in between them along the chain. This contact will be classified as belonging to the 5-8 separation interval bin and will not be counted in the 1-4 and 9-9999 bins
- in addition, in the evaluation of each prediction, 'p' will be compared to what would be expected at random, i.e. the likelihood observed in the database of protein structures for a pair of residues with residue separation (distance) d1-d2; residue separation (sequence) j-i; protein size; types of residue i, j.
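The five-column record and the separation bins described above can be sketched as follows. The names `parse_rr_record` and `separation_bins` are hypothetical, only intrachain records with integer indices are handled, and the separation is taken as j - i - 1, which is one reading of "number of residues between the residues in contact" that agrees with the worked example (6 residues in between falls in the 5-8 bin). The actual evaluation code may differ.

```python
# Hypothetical helpers for intrachain PFRMAT RR records: "i j d1 d2 p".
# Bin limits are copied from the notes; dict order is the listed order.
BINS = {
    "1-9999": (1, 9999),
    "1-4": (1, 4),
    "5-8": (5, 8),
    "9-9999": (9, 9999),
}


def parse_rr_record(line):
    """Parse one record and enforce i < j and 0.0 <= p <= 1.0."""
    i, j, d1, d2, p = line.split()
    i, j = int(i), int(j)
    d1, d2, p = float(d1), float(d2), float(p)
    if not i < j:
        raise ValueError("only one diagonal is allowed: need i < j")
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability p must lie in 0.0 - 1.0")
    return i, j, d1, d2, p


def separation_bins(i, j):
    """Names of the bins a contact (i, j) is counted in.

    Assumption: separation = residues strictly between i and j.
    """
    between = j - i - 1
    return [name for name, (lo, hi) in BINS.items() if lo <= between <= hi]
```

Under this reading, a contact between residues 10 and 17 has 6 residues in between and is counted in the overall 1-9999 bin and in the 5-8 bin only.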
END record is used for all predictions and indicates the end of a single model submission.

Predictions of multichain targets.
Atomic coordinates should contain chain IDs as provided in template files. In residue-residue contact predictions residue indices should be composed of chain ID and residue number, e.g. A2, B44 (see Example 5).

Example 1. Atomic coordinates (Tertiary Structure)
The primary CASP4 format used for comparative modeling, threading and ab initio submission categories.

(A) An example of comparative modeling prediction. As this model is labeled UNREFINED, submission of a REFINED model is also required.

PFRMAT TS
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1 UNREFINED
PARENT 1abc 1def_A
ATOM 1 N GLU 1 10.982 -9.774 1.377 1.00 0.50
ATOM 2 CA GLU 1 9.623 -9.833 1.984 1.00 0.50
ATOM 3 C GLU 1 8.913 -11.104 1.521 1.00 0.50
ATOM 4 O GLU 1 9.187 -11.630 0.461 1.00 0.50
ATOM 5 CB GLU 1 8.814 -8.614 1.546 1.00 0.50
ATOM 6 CG GLU 1 7.372 -8.754 2.039 1.00 0.50
ATOM 7 CD GLU 1 7.339 -8.625 3.562 1.00 0.50
ATOM 8 OE1 GLU 1 8.370 -8.307 4.131 1.00 0.50
ATOM 9 OE2 GLU 1 6.284 -8.846 4.132 1.00 0.50
ATOM 10 N THR 2 7.998 -11.599 2.304 1.00 1.60
ATOM 11 CA THR 2 7.266 -12.832 1.907 1.00 1.60
ATOM 12 C THR 2 6.096 -12.456 1.005 1.00 1.60
ATOM 13 O THR 2 5.008 -12.217 1.466 1.00 1.60
ATOM 14 CB THR 2 6.731 -13.533 3.157 1.00 1.60
ATOM 15 OG1 THR 2 7.662 -13.379 4.220 1.00 1.60
ATOM 16 CG2 THR 2 6.526 -15.019 2.864 1.00 1.60
ATOM 17 N VAL 3 6.308 -12.396 -0.278 1.00 1.70
ATOM 18 CA VAL 3 5.190 -12.030 -1.187 1.00 1.70
ATOM 19 C VAL 3 3.954 -12.870 -0.844 1.00 1.70
ATOM 20 O VAL 3 2.834 -12.471 -1.090 1.00 1.70
ATOM 21 CB VAL 3 5.608 -12.274 -2.641 1.00 1.70
ATOM 22 CG1 VAL 3 5.542 -13.771 -2.959 1.00 1.70
ATOM 23 CG2 VAL 3 4.664 -11.514 -3.573 1.00 1.70
ATOM 24 N GLU 4 4.146 -14.029 -0.272 1.00 1.70
ATOM 25 CA GLU 4 2.976 -14.882 0.086 1.00 1.60
ATOM 26 C GLU 4 2.153 -14.190 1.175 1.00 1.50
ATOM 27 O GLU 4 0.942 -14.141 1.109 1.00 1.40
ATOM 28 CB GLU 4 3.465 -16.238 0.597 1.00 1.30
ATOM 29 CG GLU 4 2.336 -17.264 0.479 1.00 1.20
ATOM 30 CD GLU 4 2.929 -18.671 0.391 1.00 1.10
ATOM 31 OE1 GLU 4 4.056 -18.846 0.823 1.00 1.00
ATOM 32 OE2 GLU 4 2.246 -19.551 -0.108 1.00 0.90
TER
END

(B) A model consisting of 2 independent structure segments (could be a target modeled from two PDB domains, where relative orientation is unknown; could be 2 fragments predicted by ab initio methods; ab initio example shown). In a single MODEL no residue should appear twice among all such segments.

PFRMAT TS
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
PARENT N/A
ATOM 1 N GLU 1 10.982 -9.774 1.377 1.00 0.50
ATOM 2 CA GLU 1 9.623 -9.833 1.984 1.00 0.50
ATOM 3 C GLU 1 8.913 -11.104 1.521 1.00 0.50
ATOM 4 O GLU 1 9.187 -11.630 0.461 1.00 0.50
ATOM 5 CB GLU 1 8.814 -8.614 1.546 1.00 0.50
ATOM 6 CG GLU 1 7.372 -8.754 2.039 1.00 0.50
ATOM 7 CD GLU 1 7.339 -8.625 3.562 1.00 0.50
ATOM 8 OE1 GLU 1 8.370 -8.307 4.131 1.00 0.50
ATOM 9 OE2 GLU 1 6.284 -8.846 4.132 1.00 0.50
ATOM 10 N THR 2 7.998 -11.599 2.304 1.00 1.60
ATOM 11 CA THR 2 7.266 -12.832 1.907 1.00 1.60
ATOM 12 C THR 2 6.096 -12.456 1.005 1.00 1.60
ATOM 13 O THR 2 5.008 -12.217 1.466 1.00 1.60
ATOM 14 CB THR 2 6.731 -13.533 3.157 1.00 1.60
ATOM 15 OG1 THR 2 7.662 -13.379 4.220 1.00 1.60
ATOM 16 CG2 THR 2 6.526 -15.019 2.864 1.00 1.60
ATOM 24 N GLU 4 4.146 -14.029 -0.272 1.00 1.70
ATOM 25 CA GLU 4 2.976 -14.882 0.086 1.00 1.60
ATOM 26 C GLU 4 2.153 -14.190 1.175 1.00 1.50
ATOM 27 O GLU 4 0.942 -14.141 1.109 1.00 1.40
ATOM 28 CB GLU 4 3.465 -16.238 0.597 1.00 1.30
ATOM 29 CG GLU 4 2.336 -17.264 0.479 1.00 1.20
ATOM 30 CD GLU 4 2.929 -18.671 0.391 1.00 1.10
ATOM 31 OE1 GLU 4 4.056 -18.846 0.823 1.00 1.00
ATOM 32 OE2 GLU 4 2.246 -19.551 -0.108 1.00 0.90
TER
PARENT N/A
ATOM 17 N VAL 3 6.308 -12.396 -0.278 1.00 1.70
ATOM 18 CA VAL 3 5.190 -12.030 -1.187 1.00 1.70
ATOM 19 C VAL 3 3.954 -12.870 -0.844 1.00 1.70
ATOM 20 O VAL 3 2.834 -12.471 -1.090 1.00 1.70
ATOM 21 CB VAL 3 5.608 -12.274 -2.641 1.00 1.70
ATOM 22 CG1 VAL 3 5.542 -13.771 -2.959 1.00 1.70
ATOM 23 CG2 VAL 3 4.664 -11.514 -3.573 1.00 1.70
TER
END

(C) Threading/Fold Recognition prediction stating that target has no homologue in the current PDB.

PFRMAT TS
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
PARENT NONE
TER
END

Example 2. Secondary structure prediction
Example of secondary structure prediction. Note to predictors: it may be interesting to predict the secondary structure of proteins even when a clear structural homologue is known for the target. In cases where the target sequence is divergent from the template, secondary structure prediction may be more accurate than that implied by the template, and vice versa.

PFRMAT SS
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
H E 0.70 # <- residue code,
L E 0.80 # <- secondary structure assignment code,
E E 0.80 # <- the number specifying the associated
G E 0.60 #    confidence level
S C 0.90
I E 0.50
G E 0.40
I E 0.60
L E 0.70
L C 0.50
K C 0.50
K H 0.90
H H 0.90
E H 0.90
I H 0.80
V H 0.70
F C 0.90
D C 0.90
G H 0.40
C C 0.40
END

Example 3. Residue-Residue contact prediction
The flexibility offered by the new format allows algorithms parameterized to predict any distance range to be used. Below is an example of how to use the new residue-residue separation distance format to submit a prediction of residue contacts defined as Cb-Cb distances < 8 A.
PFRMAT RR
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
HLEGSIGILLKKHEIVFDGC # <- entire target sequence (up to 50
HDFGRTYIWQMSD # residues per line)
1 9 0 8 0.70
1 10 0 8 0.70 # <- indices of residues: i and j (integers),
1 12 0 8 0.60 # <- the range of Cb-Cb distance predicted
1 14 0 8 0.20 #    for the residue pair: d1 and d2 (real),
1 15 0 8 0.10 # <- probability of the distance between
1 17 0 8 0.30 #    Cb atoms being within the specified
1 19 0 8 0.50 #    range: p (real)
2 8 0 8 0.90
3 7 0 8 0.70
3 12 0 8 0.40
3 14 0 8 0.70
3 15 0 8 0.30
4 6 0 8 0.90
7 14 0 8 0.30
9 14 0 8 0.50
END

Example 4. An alternative alignment format for Threading/Fold Recognition predictions
Alignments will be converted into 3D structures.

(A) Format to express unambiguous alignments to PDB entries 'mabc_A' and 'nefg'.

PFRMAT AL
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
PARENT mabc_A
M 21 V 11
P 22 D 12
N 23 A 12A
F 24 F 12B
A 25 L 13
P 32 D 22
N 33 A 23
F 34 F 24
A 35 L 25
TER
PARENT nefg
E 75 T 73
T 76 T 74
V 77 A 75
D 78 D 76
G 79 D 77
R 80 R 78
TER
END

(B) Format to express unambiguous alignments to PDB entry 'mabc_D'. An example of how to use the AL format to submit a prediction of the target with a chain name of 'A'.

PFRMAT AL
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
PARENT mabc_D
M A21 V 11
P A22 D 12
N A23 A 12A
F A24 F 12B
A A25 L 13
P A32 D 22
N A33 A 23
F A34 F 24
A A35 L 25
TER
END

Example 5. Predictions of multichain targets (chains A and B)

(A) An example of 3D atomic coordinates model prediction.
PFRMAT TS
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
PARENT N/A
ATOM 17 N VAL A 3 6.308 -12.396 -0.278 1.00 1.70
ATOM 18 CA VAL A 3 5.190 -12.030 -1.187 1.00 1.70
ATOM 19 C VAL A 3 3.954 -12.870 -0.844 1.00 1.70
ATOM 20 O VAL A 3 2.834 -12.471 -1.090 1.00 1.70
ATOM 21 CB VAL A 3 5.608 -12.274 -2.641 1.00 1.70
ATOM 22 CG1 VAL A 3 5.542 -13.771 -2.959 1.00 1.70
ATOM 23 CG2 VAL A 3 4.664 -11.514 -3.573 1.00 1.70
ATOM 24 N GLU A 4 4.146 -14.029 -0.272 1.00 1.70
ATOM 25 CA GLU A 4 2.976 -14.882 0.086 1.00 1.60
ATOM 26 C GLU A 4 2.153 -14.190 1.175 1.00 1.50
ATOM 27 O GLU A 4 0.942 -14.141 1.109 1.00 1.40
ATOM 28 CB GLU A 4 3.465 -16.238 0.597 1.00 1.30
ATOM 29 CG GLU A 4 2.336 -17.264 0.479 1.00 1.20
ATOM 30 CD GLU A 4 2.929 -18.671 0.391 1.00 1.10
ATOM 31 OE1 GLU A 4 4.056 -18.846 0.823 1.00 1.00
ATOM 32 OE2 GLU A 4 2.246 -19.551 -0.108 1.00 0.90
REMARK
REMARK NOTE: Predictor should NOT use TER separator between chains
REMARK if multichain independent segment of structure has to
REMARK be evaluated as a one fragment
REMARK
ATOM 1 N GLU B 1 10.982 -9.774 1.377 1.00 0.50
ATOM 2 CA GLU B 1 9.623 -9.833 1.984 1.00 0.50
ATOM 3 C GLU B 1 8.913 -11.104 1.521 1.00 0.50
ATOM 4 O GLU B 1 9.187 -11.630 0.461 1.00 0.50
ATOM 5 CB GLU B 1 8.814 -8.614 1.546 1.00 0.50
ATOM 6 CG GLU B 1 7.372 -8.754 2.039 1.00 0.50
ATOM 7 CD GLU B 1 7.339 -8.625 3.562 1.00 0.50
ATOM 8 OE1 GLU B 1 8.370 -8.307 4.131 1.00 0.50
ATOM 9 OE2 GLU B 1 6.284 -8.846 4.132 1.00 0.50
ATOM 10 N THR B 2 7.998 -11.599 2.304 1.00 1.60
ATOM 11 CA THR B 2 7.266 -12.832 1.907 1.00 1.60
ATOM 12 C THR B 2 6.096 -12.456 1.005 1.00 1.60
ATOM 13 O THR B 2 5.008 -12.217 1.466 1.00 1.60
ATOM 14 CB THR B 2 6.731 -13.533 3.157 1.00 1.60
ATOM 15 OG1 THR B 2 7.662 -13.379 4.220 1.00 1.60
ATOM 16 CG2 THR B 2 6.526 -15.019 2.864 1.00 1.60
TER
END

(B) An example of how to use the RR format to submit a
prediction of interchain (chains A and B) residue-residue contacts defined as Cb-Cb distances < 8 A.

PFRMAT RR
TARGET Txxxx
AUTHOR xxxx-xxxx-xxxx
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
METHOD Description of methods used
MODEL 1
HLEGSIGILLKKHEIVFDGC # <- entire target sequence (up to 50
HDFGRTYIWQMSD # residues per line)
A1 B9 0 8 0.70
A1 B10 0 8 0.70 # <- indices of residues: Ai and Bj,
A1 B12 0 8 0.60 # <- the range of Cb-Cb distance predicted
A1 B14 0 8 0.20 #    for the residue pair: d1 and d2 (real),
A1 B15 0 8 0.10 # <- probability of the distance between
A1 B17 0 8 0.30 #    Cb atoms being within the specified
A1 B19 0 8 0.50 #    range: p (real)
A2 B8 0 8 0.90
A3 B7 0 8 0.70
A3 B12 0 8 0.40
A3 B14 0 8 0.70
A3 B15 0 8 0.30
A4 B6 0 8 0.90
A7 B14 0 8 0.30
A9 B14 0 8 0.50
END

CASP4 organizers 06/24/2000

Protein Structure Prediction Center
Fourth Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction

Targets: ranking by Target Number
Targets expire on specified date at midnight (24:00) local time in California (GMT - 7 hours).

Tar-id  Name  Nres  Status  Entry-date  Expiry-date  Description
T0086  UBIC  164  Solved by X-ray  11 May  20 Jul (Expired)  Chorismate lyase, E. coli
T0087  PPX1  310  Diffracting crystals available  12 May  1 Sep  PPase, S. mutans
T0088  GAFD  156  Diffracting crystals available  12 May  1 Sep  GafD, E. coli
T0089  FTSA  419  Solved by X-ray  18 May  1 Sep  FtsA, T. maritima
T0090  YQIE  209  Solved by X-ray  18 May  6 Jul (Expired)  ADP-ribose pyrophosphatase, E. coli
T0091  YBAB  109  Diffracting crystals available  20 May  1 Sep  Hypothetical protein HI0442, H. influenzae
T0092  YECO  241  Solved by X-ray  20 May  22 Jul (Expired)  Hypothetical protein HI0319, H. influenzae
T0093  YIBK  160  X-ray structure determination in progress  20 May  22 Jul (Expired)  Hypothetical protein HI0766, H. influenzae
T0094  CPDase  181  Solved by X-ray  25 May  1 Aug (Expired)  Cyclic phosphodiesterase, A.thaliana
T0095  CTN1  244  Solved by X-ray  5 Jun  1 Sep  Alpha(E)-catenin fragment, mouse
T0096  FADR  239  Solved by X-ray  8 Jun  1 Sep  FadR, E. coli
T0097  ER29  105  Solved by NMR  8 Jun  31 Aug  C-terminal domain of ERp29, rat
T0098  SP0A  121  Solved by X-ray  12 Jun  15 Aug  C-terminal domain of Spo0A, B. stearothermophilus
T0099  -  56  Solved by NMR  12 Jun  25 Jul (Expired)  -
T0100  PMEA  342  Solved by X-ray  13 Jun  3 Jul (Expired)  Pectin Methylesterase, E. chrysanthemi PDB code 1QJV
T0101  PELL  400  Solved by X-ray  13 Jun  1 Sep  Pectate lyase PelL, E. chrysanthemi
T0102  AS48  70  Solved by NMR  16 Jun  30 Aug  Bacteriocin AS-48, E. faecalis
T0103  PICP  372  Solved by X-ray  27 Jun  1 Sep  Pepstatin insensitive carboxyl proteinase, Pseudomonas sp.
T0104  YJEE  158  Solved by X-ray  27 Jun  1 Sep  Hypothetical protein HI0065, H. influenzae
T0105  SP100  94  Solved by NMR  5 Jul  31 Aug  Protein Sp100b, human
T0106  SFRP3  128  X-ray structure determination in progress  5 Jul  25 Jul (Expired)  Secreted frizzled protein 3, mouse
T0107  CBD9  188  Solved by X-ray  10 Jul  29 Aug  Family 9 carbohydrate binding module, T. maritima
T0108  CBD17  206  Solved by X-ray  10 Jul  1 Sep  Family 17 carbohydrate binding module, C. cellulovorans
T0109  ORN  182  Solved by X-ray  10 Jul  1 Sep  Oligoribonuclease, H. influenzae
T0110  RBFA  128  Solved by X-ray  10 Jul  1 Sep  Ribosome-binding factor A, H. influenzae
T0111  ENO  431  Solved by X-ray  12 Jul  1 Sep  Enolase, E. coli
T0112  DHSO  352  Solved by X-ray  13 Jul  31 Aug  Ketose Reductase / Sorbitol Dehydrogenase, B. argentifolii
T0113  HCD2  261  Solved by X-ray  13 Jul  4 Aug (Expired)  Short chain 3-hydroxyacyl-coa dehydrogenase, rat
T0114  AFP1  87  Solved by NMR  13 Jul  4 Aug (Expired)  Antifungal protein AFP-1, S. tendae
T0115  KHSE  300  Solved by X-ray  18 Jul  1 Sep  Homoserine kinase, M. jannaschii
T0116  MUTS  811  Solved by X-ray  24 Jul  31 Aug  MutS, T. aquaticus
T0117  DNK  250  Solved by X-ray  25 Jul  1 Sep  Deoxyribonucleoside kinase, D. melanogaster
T0118  ENRN  149  Solved by X-ray  26 Jul  1 Sep  Endodeoxyribonuclease I, Bacteriophage T7
T0119  BENC  338  X-ray structure determination in progress  26 Jul  1 Sep  Benzoate dioxygenase reductase, Acinetobacter sp.
T0120  XRCC4  336  Solved by X-ray  26 Jul  1 Sep  DNA repair protein XRCC4, human

CASP4 organizers
. . .
20001227 14.22 Leszek:
"Alexander Yu. Kushelev" wrote:
Dear Sir, I would greatly appreciate it if you could inform us about our results in CAFASP2, or tell us where we can get this information. With best regards, A.Kushelev

Dear Alexander. All results are at the main CAFASP pages: http://www.cs.bgu.ac.il/~dfischer/CAFASP2/ Secondary structure prediction results are done by Burkhard Rost: http://cubic.bioc.columbia.edu/~rost/Var/cafasp/sec/index.html But You will not be ranked if You did not submit CASP formatted predictions. If You can generate the CASP formatted predictions, You can talk with Burkhard Rost to generate Your ranking, or You can try to compare Your evaluation with the results of the other servers, which are now public. Best regards and a happy new year, Leszek
-- Bioinformatics Laboratory International Institute of Molecular and Cell Biology ks. Trojdena 4, 02-109 Warszawa, Poland tel: +48-22 668 5384 fax: +48-22 668 5288
. . .
*****************************************************************************
20010111 Kushelev goes online to the address given by Leszek: http://www.cs.bgu.ac.il/~dfischer/CAFASP2/ and cannot find the table of participants, which was visible until recently (see 20010112 Protein competition). The following information was found on the cafasp2 server:
*****************************************************************************
CAFASP2
* What is CAFASP?
* Main conclusions from the alignment accuracy evaluation.
* CAFASP2 evaluation results.
* Targets.
* The latest CAFASP2 announcement.
* CAFASP2 evaluation procedures.
* Summaries of the servers' results.

Sponsors wanted! Our automation efforts are looking for sponsors. Please contact us!

The Most Wanted. Methods for protein structure prediction have matured to the point where models produced by prediction algorithms can be used to understand and test hypotheses about biological function. The goal of this community wide effort is to provide structural and functional insights into biologically important proteins, particularly those that are currently intractable to experimental structural determination. See also the Jan. 4, 2001, report in Nature.

LiveBench, the continuous, large-scale evaluation of automatic structure prediction servers, extends the scope of CAFASP in that it is a continuous experiment carried out using a large number of targets.

CAFASP meta server url: http://cafasp.bioinfo.pl/.
Meta server mirrors:
* in bioinbgu.
* http://USA-CAFASP.BioInfo.PL/.

***************************************************************
CAFASP-2 EVALUATION RESULTS

DISCLAIMER: As of today, it is widely believed that in general, human-expert, computer-aided protein structure prediction can be more powerful than fully automated predictions. Similarly, we believe that human-expert assessment can be significantly more accurate than automatic evaluation. All CAFASP-2 evaluations were produced using fully automated methods. Consequently, there are a number of limitations and shortcomings in the evaluations presented below, including the lack of an expert interpretation of the numerical data. However, this was one of the goals of CAFASP: to not only evaluate fully automated predictions, but also carry out the evaluation using fully automated methods. These methods were described in advance before the experiment began, and participation in CAFASP implied that participants agreed with these methods.
One of the advantages of running CAFASP in parallel with CASP is that CAFASP participants can also benefit from the human-expert assessment provided by the casp assessors. We believe that as of today, the human-expert assessment will in most cases be a better indicator of the servers' performance than the one presented below.

Although all evaluations in CAFASP-2 were performed by fully automated programs, the full responsibility for running the programs and presenting the results lay entirely with each category's coordinator. The procedures applied by these programs were delineated in advance. Participation in CAFASP-2 implied acceptance of these procedures and rules. The results of the automatic evaluation are presented here and are independent of any other human assessment that may be carried out.

In addition to the automated evaluation of CAFASP-2 predictions, two comparative analyses of the automated results filed to CAFASP-2 with those filed at CASP-4 may be carried out. These are described at the bottom of this page.

CAFASP-2 EVALUATION RESULTS:
* Fold Recognition Evaluation Results (Fischer).
* Secondary Structure Evaluation Results (Rost).
* Homology Modeling Evaluation Rules and Results (Dunbrack). Preliminary raw data only.
* Ab initio Evaluation Rules and Results (Ortiz). Preliminary raw data only.
* Contacts Prediction Evaluation and Results (Valencia).

CAFASP-CASP PERFORMANCE COMPARISONS:
The two comparative analyses of the automated results filed to CAFASP-2 with those filed at CASP-4 that will be carried out are:
* 1. CAFASP-CASP automated evaluation comparison. After the CASP-4 predictions become available to all, we may apply to them the exact same procedures as those described above.
* 2. CAFASP-CASP human assessor comparison. The CASP-4 assessor of each category may apply his/her assessing procedures to both the CASP-4 and CAFASP-2 predictions.
It is obvious that a CAFASP-CASP performance comparison is not 100% fair, for several reasons:
* The CAFASP-2 predictions are filed months before the CASP-4 predictions are filed. Thus, it is likely that for some cases, the availability of more sequence or structure entries in the databases may make a particular target easier to predict.
* The manual casp4 predictions can make use of the CAFASP results (but not vice-versa).

In summary, although the above comparisons may be somewhat unfair towards the automatic results, we believe that they may provide important insights about the servers' capabilities. Most importantly, this comparison will provide valuable insights about the humans' capabilities; identifying the latter is essential in order to be able to incorporate new features into the automatic programs in the future. We will make our best efforts to assess how much of a role the "time factor" plays.

***************************************************
CAFASP2 ALIGNMENT ACCURACY EVALUATION

This document contains four sections:
* 1. Evaluation Description.
* 2. Main Results.
* 3. Additional Automatic Evaluations.
* 4. Conclusions.

DISCLAIMER
The results of the automatic Fold Recognition evaluation are presented here and are independent of any other human assessment that may be carried out.

1. Evaluation Description
In this section we briefly summarize the evaluation procedure applied:
* The Scoring Functions used, and representative galleries (of publicly released targets and of CASP4 targets) of superpositions of pairs of models and targets at different scores.
* Multiple models considered for each target.
* Multi-domain partitioning.
* Target classification into Homology Modeling Targets, Easy, and Hard Fold Recognition Targets.
* The "N-1" ranking procedure.
* The categories of predictions: on-time and late.
* Links to the targets, models and all the data used in the evaluation.

2.
Main Results

WARNING
In the results below, the servers are identified using their casp-id number. The name and casp-id number of each server, is given here.

These results correspond to the MaxSub evaluation applied on the on-time, first models only. The results are shown in a number of partitions. The main result of the evaluation is based on the MaxSub evaluation on the 26 Fold Recognition Targets, but we also present the evaluation on the 15 Homology Modeling Targets.

Set  # Targets  Maximum Correct  Best Servers  Second Best Servers  Third Best Servers  Summary
Homology Modeling Targets (All)  15  15  093 106 107 132 260 108 111 259 103 395B  HM Table
Fold Recognition Targets (All) Main Result  26  5  093 106 108 395B 103 259 132 260  FR Table

Servers Appearing at the top ranks:
* 093 106 107 108 132 260 395B.

We subdivide the FR Targets into 5 easy targets (T0100, T0101, T0104, T0109 and T0127) and 21 hard targets, and present the subtotals for the easy and hard targets separately:

Set  # Targets  Maximum Correct  Best Servers  Second Best Servers  Third Best Servers  Summary
Easy Fold Recognition Targets (Easy).  5  4  093 106 108 395B 259 260 395  Easy FR
Hard Fold Recognition Targets (Hard).  21  2 (results not significant)  103 127 132 216 108 280  Hard FR

As this subdivision shows, it is hard to derive significant conclusions from the evaluation on the hard FR targets alone, because our evaluation awards credit to very few models. To gain a better insight on the servers' performance on the hard FR targets, other evaluation methods are needed, including those with more lenient thresholds, sequence independent ones, or others (see below).

A further subdivision of the HM targets into 4 harder HM targets (T0089, T0090 T0092 and T0103) and 11 easy HM targets, and a partition of the 4 harder HM targets + the 5 easy FR targets are presented here.
In addition, for each target we present the detailed results of the MaxSub evaluation, which includes individual statistics on all models (including models 2 to 5 and late submissions).

Specificity Analysis.
Fold recognition servers usually report the list of top hits for each prediction, along with an assigned score. We used this information to compute the specificity or selectivity of the servers along the lines of CAFASP-1. We classified the hits into correct and incorrect ones using the MaxSub scores (FR Table above). Then, we computed the number of correct hits that a server produced above the server's score of its largest incorrect hit. We did the same for the scores of the following incorrect hits.

The specificity or selectivity results show that the servers with the best specificity (106, 107, 259 and 260) were able to identify most of the easy FR targets with better scores than their first false positive. The most selective server had 4 correct predictions before its first false positive. Using the classification of correct and incorrect predictions according to the MaxSub scores at the larger threshold of 5.0 A (see the Additional Automatic Evaluations section below), a similar result is obtained, but at this threshold, servers 108 and 395 ranked higher. Allowing one false positive, servers 132, 260 and 395 are at the top with 6 correct answers. (See also the specificity analysis using a CAFASP1-like scoring system below).

3. Additional Automatic Evaluations
In addition to the above "official" evaluation, we have performed various different evaluations, including:
* MaxSub evaluation using
  * the best of the top 5 models and
  * normalization by Structural Alignment
  * first models but at a MaxSub threshold of 5.0 A
* lgscore evaluation
* CAFASP1-like evaluation
* touch evaluation (experimental)

The results from these additional evaluations were very similar to the official result above, showing only minor differences in the final ranking of the servers.

4.
Conclusions

What did the servers succeed in predicting?
It is clear that with the strict MaxSub evaluation criteria, which considered only on-time models no.1, good models were predicted for all 15 HM targets and for the 5 easy FR targets. However, there was much lower success among the 21 hard FR targets. When it is clearly determined which of these 21 targets correspond to new folds, we will be able to know how many of these targets could not possibly be predicted by fold recognition. Nevertheless, even for a target with a known fold, the fact that MaxSub scored a prediction with a zero does not necessarily imply that the prediction is totally wrong. In some cases, a prediction may have identified the correct parent, but due to alignment errors, only small regions were modeled accurately. Because of the relatively low sensitivity of our automated evaluation, a "human-expert" evaluation is required to learn more about the prediction capabilities of the servers among the 21 hard FR targets. The casp human assessor may provide some additional insights when he/she assesses the servers' models of these hard FR targets.

In addition, we searched for any predictions that may have captured something about the true fold, and we have found that:
* No potentially valuable prediction could be found for targets T0086, T0094, T0096.2, T0105, T0106, T0118, T0120, T0126.
* Valuable predictions that may have captured (at least in part) the correct fold or a correct motif were found for targets T0087.1, T0087.2, T0091, T0097, T0098, T0102, T0107, T0108, T0110, T0114, T0115, T0116, T0121.2.
* Some of the especially interesting predictions not scored in the above tables were:
  * Model no.2 from server 132 on target T0108
  * The late model no. 1 from server 216 on target T0087.2
  * Model no.3 from server 389 on target T0110
  * Model no.5 from server 220 on target T0114
  * Model no.3 from server 105 on target T0115
  * Model no.1 from servers 093 and 107, and Model no.
2 from server 108 on target T0121.2

Here is a list of main conclusions that we can draw from CAFASP2:
* Hard to assess beyond the 15 HM targets and the 5 easy FR targets.
* Hard to use automatic evaluation on hard cases, especially when only a few hard targets exist. To discriminate borderline predictions, more accurate automatic evaluation methods are needed, although it is not clear how useful such predictions might be.
* At this point, the conclusions listed here are mainly based on the evaluation within the easier FR targets.
* 4 servers better than the rest: ffas (395B), threaders, 3dpssm and inbgus, but fugue is not far behind. These servers are significantly better than pdbblast, even within the HM targets alone. sam-t99 also appears to follow closely after the top 4 servers, also showing excellent performance in the HM targets, although with lgscore it ranked second.
* HM servers not better than FR servers on HM targets.
* The additional automatic evaluations generally confirmed the above findings, with very minor exceptions.
* Selectivities as bad as in cafasp1, but the difficulty of targets has increased significantly. Selectivity on the 5 easy FR targets is good.
* Among the new servers that did not participate in CAFASP1, fugue is approaching the performance of the top 4 servers.
* The ab initio server isites appears to give interesting, promising models for the targets where FR fails.
* For future CAFASP experiments, the raw output will be required to be in PDB format containing at least C-alpha atoms.
* Taken together, the servers as a group identified roughly twice as many correct targets as the best single server. To determine how useful the servers as a group might have been for a human predictor, it would be interesting to evaluate human participants at casp who used the servers' results, as well as the cafasp-consensus group predictions filed at casp.

CAFASP2 url: http://www.cs.bgu.ac.il/~dfischer/CAFASP2
. . .
***********************************************************************
In short, we disappeared from the list of participants. There were 33 participants in total (we were number 33), and now only 28 remain...
***********************************************************************
. . .
20001229 15.42 Leszek: "Alexander Yu. Kushelev" wrote: Dear Sir, I haven't found the CASP id number of our server, so I don't know if our results were taken into consideration. Would you be so kind as to inform me what the matter is? With best regards A.Kushelev
Dear Sir, What is Your CASP id? We can contact the CASP organizers and ask if they have Your files. If they send us Your submissions we can talk with Burkhard to include You in the evaluation. I could not find any CASP formatted files from Your server. Best regards, Leszek -- Bioinformatics Laboratory International Institute of Molecular and Cell Biology ks. Trojdena 4, 02-109 Warszawa, Poland tel: +48-22 668 5384 fax: +48-22 668 5288
. . .
20010105 14.20 Leszek: "Alexander Yu. Kushelev" wrote: Dear Sir, We sent our results in the CASP format, but they were rejected because the amino acid sequence derived from your nucleotide sequence differed from the amino acid sequence of the targets. By the time we understood the situation, the submission of data had closed. So we sent our results only to you. We understood that you included our results in the database under number 33 (they are in the CAFASP database). I would greatly appreciate it if you could clarify the situation. With best regards
Dear Sir, To evaluate secondary structure prediction You need to provide results formatted in the "secondary structure prediction format" as described at the CASP site. Otherwise there is no way to evaluate them. If they were rejected by the CASP server, that's not good. Can You send me the predictions You sent to the CASP server? Leszek -- Bioinformatics Laboratory International Institute of Molecular and Cell Biology ks.
Trojdena 4, 02-109 Warszawa, Poland tel: +48-22 668 5384 fax: +48-22 668 5288
. . .
20010105 17.33 "Alexander Yu. Kushelev" wrote: Dear Sir, We have predicted the 3D structure of the targets in PDB format. As I understood, you personally put our results into the CAFASP database under number 33. I greatly appreciate your willingness to help us. It is very important for us to know how our program works. With best regards A.Kushelev
Do You predict 3D structure or secondary structure? Leszek -- Bioinformatics Laboratory International Institute of Molecular and Cell Biology ks. Trojdena 4, 02-109 Warszawa, Poland tel: +48-22 668 5384 fax: +48-22 668 5288
. . .
3D-structure only Alexander Kushelev 20010105
. . .
Now we wait for a reply... 20010111
. . .