CONSTRUCTION OF A SCALE MODEL OF THE SPATIAL STRUCTURE OF A PROTEIN FROM ITS NUCLEOTIDE SEQUENCE.

A.Y.Kushelev, S.A.Pisarzhevsky

Laboratory “Nanoworld”

The Bauman Moscow State Technical University

Address for correspondence:

Alexander Kushelev
The Nanoworld Laboratory
The Bauman Moscow State Technical University
The second Bauman pr.,5
107005, Moscow
Russia
Tel/fax 7-095-263-6608
E-mail:

RUNNING TITLE: COMPOSITION GENETICAL CODE

TOTAL NUMBER OF MANUSCRIPT PAGES: 16

TABLES: 1

FIGURES: 3

NOTE: A DISKETTE OF THE MANUSCRIPT IS INCLUDED

ENVIRONMENT: WINDOWS 98

WORD PROCESSOR: WORD

ABSTRACT.

A strong correlation between the spatial structure of a protein and its nucleotide sequence was predicted theoretically by physical modelling, discovered experimentally and confirmed statistically.

During biosynthesis the third nucleotide of the codon controls the orientation of the amino acid. This produces one particular spatial isomer, that is, one conformation of the protein molecule, and cuts off the competing pathways of formation of the 2D and 3D structures.

KEY WORDS: nanoworld, circular model of the electron, composition genetical code, prediction of protein structure

ABBREVIATIONS: TCGC - table of the composition genetical code

 

1. INTRODUCTION

We created the circular model of the electron, which allows the laws of classical physics to be applied to “nonclassical” objects, in particular to build scale geometrical models of the electron shells of atoms and molecules. In our classical models of micro-objects the electrons are represented by circular magnets. To simplify the reasoning that follows, we note that the inner structure of the electron, the dependence of the radius of the electron ring on the strength of the nuclear field, and other physical nuances of the circular model of the electron lie beyond the scope of this article.

2. RESULTS.

The first experimental result was obtained in a model experiment with circular magnets, carried out to verify the hypothesis: “The ring electrons form an electron shell in the shape of a polyhedron.” Indeed, 8 circular magnets maintain the form of an octahedron made of rings packed on the sphere of electrostatic equilibrium of the electrons. This model of the electron shell corresponds to the experimentally discovered Fermi surface (fig.1).

Replacing four of the ring electrons with incomplete octahedra (each lacking one ring) turns the octahedral form of the atom into the familiar tetrahedral form of a molecule. The electron surfaces of the carbon tetrachloride and phosphoric acid molecules have this form.
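The geometry of the two paragraphs above can be checked numerically. The following is only a minimal sketch, assuming that each of the 8 rings sits over one face of a regular octahedron, so that the ring axes point along the 8 face normals, and that the four substituted rings occupy alternate faces. The function names are hypothetical and the code is not part of our modelling procedure.

```python
import math
from itertools import product

def octahedron_face_normals():
    """Unit normals of the 8 faces of a regular octahedron: (+-1, +-1, +-1)/sqrt(3)."""
    s = 1.0 / math.sqrt(3.0)
    return [(x * s, y * s, z * s) for x, y, z in product((1, -1), repeat=3)]

def tetrahedral_subset(normals):
    """Keep the 4 normals whose sign product is positive (alternate faces)."""
    return [n for n in normals if n[0] * n[1] * n[2] > 0]

def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

rings = octahedron_face_normals()
tetra = tetrahedral_subset(rings)
print(len(rings), len(tetra))
print(round(angle_deg(tetra[0], tetra[1]), 2))
```

Under these assumptions the four alternate directions are separated by the tetrahedral angle of about 109.47 degrees, which is the sense in which the octahedral arrangement contains a tetrahedral one.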

We have constructed more than 1500 circular models of various chemical compounds.

Conjugated systems form a special class of chemical compounds. The model experiment showed that the mutual influence of carbon-like atoms causes the electron structure of the molecule to rebuild from a polyhedron into a polylayer, which leads to the formation of “flat” conjugated systems, for example the benzene molecule.

Combining polyhedral and polylayer atoms and molecules allowed us to compose the ring-faced structure of the DNA nucleotides (fig.2). The structure of our DNA model does not contradict the generally accepted model.

We constructed a model of the CCON group. The different versions of the mutual arrangement of CCON groups in the protein structure are shown in figure 3.

t-RNA holds the amino acid by its ACC-end. The feature of the t-RNA structure that we found is the loop structure of the ACC-end. The last nucleotide of the ACC triplet is turned on a hinge, a phosphoric acid molecule, so that its free PO3 group can form an additional diester bond with the PO3 group of the complementary RNA chain.

The last of the two nucleotides forming the 3’-end of t-RNA has inverted magnetic properties, which distinguish all nucleotides of the reverse chains of DNA and RNA from the nucleotides of the direct chains.

Cytosine-N ends in the northern magnetic poles of two electrons, one from oxygen and one from nitrogen (in cytosine-N the northern poles of the electrons forming hydrogen bonds with the complementary base are turned outward). Cytosine-S ends in the southern magnetic poles of two electrons (in cytosine-S the southern poles of the electrons forming hydrogen bonds with the complementary base are turned outward). These poles are positioned complementarily to the four electrons of the peptide group, which is oriented by its nitrogen group along the rotation axis of t-RNA. The t-RNA, acting as a manipulator, can place the amino acid in the growing protein chain at different angles of rotation about this axis. The angle of rotation is regulated by the variable loop of t-RNA and by the distance between the third nucleotide of the m-RNA codon and the first nucleotide of the t-RNA anticodon. The distance between the first nucleotide of the anticodon (inosine) and the third nucleotide of the codon depends on the nature of the base complementary to inosine (A, C, G, T).

We have constructed ring-faced models of proteins with known structure, in particular myoglobin, insulin, troponin and oxytocin. An exhaustive search helped us to choose the versions of connection of the CCON group models that reproduce the forms of these proteins. It turned out, for example, that the amino acid glycine is encoded by the codon GGC when it is a member of a beta-layer, by GGG when it is a member of a 3/10 alpha-helix, in which the hydrogen bond is formed not with the fourth but with the third amino acid residue, and by GGT when it is the second amino acid residue of a beta-turn. In this way the composition genetical code was compiled. It shows how an amino acid is positioned relative to the previous one depending on its codon (table 1).

The regular repetition of version 1 leads to the formation of an alpha-helix.

The repetition of version 2 leads to the formation of a beta-layer. A turn of the chain in a beta-layer requires the sequence of versions 1-4-1.

The repetition of version 3 leads to the formation of a xi-helix, which divides the alpha and beta parts of proteins.

The repetition of version 4 leads to the formation of a diminished alpha-helix. Its structure differs from that of the ordinary alpha-helix: the hydrogen bond between the CO group and the nitrogen group closes not on the fourth but on the third amino acid residue.

A special case is the helix of collagen, in which versions 1, 2, 3 and 4 alternate. Collagen has an unusual structure: its peculiarity is a soft axis of symmetry that is itself rolled up into a helix. This helix has elastic properties.
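The rules above, in which a repeated version gives a named structure, can be sketched as a run-length pass over a list of version numbers. This is an illustration only: the labels are copied from the four statements above, while the function name `label_runs` and the threshold `min_run` (how many repeats count as a "repetition") are hypothetical choices that the text does not fix.

```python
from itertools import groupby

# Labels copied from the statements above about repeating one version.
REPEAT_LABEL = {
    1: "alpha-helix",
    2: "beta-layer",
    3: "xi-helix",
    4: "diminished alpha-helix",
}

def label_runs(versions, min_run=2):
    """Collapse a list of version numbers into (version, length, label) runs.

    min_run is an assumed threshold: shorter runs are labelled "single residue".
    """
    runs = []
    for version, group in groupby(versions):
        length = len(list(group))
        label = REPEAT_LABEL[version] if length >= min_run else "single residue"
        runs.append((version, length, label))
    return runs

print(label_runs([1, 1, 1, 2, 2, 4]))
```

A sequence in which versions 1, 2, 3 and 4 alternate, as described for collagen, produces only runs of length one under this sketch.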

We built the geometry of the amino acids and the TCGC into the program “Pikotechnology”, which assembles the structure of a protein molecule on the PC screen. Using the nucleotide sequences available to us, the program assigned 30 varieties of keratin to the alpha-structural proteins and 3 types of fibroin to the beta-structural proteins.

We compared the data on the 2D structure of proteins from the Brookhaven Protein Data Bank with the data obtained by our program from the nucleotide sequences of the same proteins in the Oxford Bank of Nucleotide Sequences.

Eight corresponding sequences from the Oxford Bank were found for 314 structures of the Brookhaven Bank. The percentage of alpha-codons that fall within the alpha-helix parts (according to X-ray analysis) is given below.

Cholesterol oxidase (1.1.3.6) 92.43

Arabinose-binding protein 57.52

Lysozyme (3.2.1.17) 41.54

Aldolase (4.1.2.13) 73.00

Aminotransferase (2.6.1.1) 72.02

Trp-repressor (E. coli) 61.68

Glycogen phosphorylase (2.4.1.1) 85.75

Myoglobin 70.59
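A percentage of this kind can be computed as in the following minimal sketch. It rests on one reading of the measure, stated here as an assumption: among the residues that X-ray analysis places in alpha-helices, count those whose codon has the alpha version. The function name, the boolean mask `in_helix` and the default `alpha_version=1` are hypothetical and are not taken from our program.

```python
def alpha_codon_percent(versions, in_helix, alpha_version=1):
    """Percentage of helix residues whose codon version equals alpha_version.

    versions : one version number per residue (from the codon table)
    in_helix : one boolean per residue, True where X-ray data show alpha-helix
    """
    if len(versions) != len(in_helix):
        raise ValueError("versions and in_helix must have the same length")
    helix_versions = [v for v, h in zip(versions, in_helix) if h]
    if not helix_versions:
        raise ValueError("no residues are marked as alpha-helix")
    hits = sum(1 for v in helix_versions if v == alpha_version)
    return 100.0 * hits / len(helix_versions)

# Made-up toy input: four helix residues, three of them with version 1.
print(alpha_codon_percent([1, 1, 3, 1, 2, 4], [True, True, True, True, False, False]))
```

The toy input is invented for illustration and does not correspond to any of the proteins listed above.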

The data we obtained call for a revision of the existing views on protein structures, in particular on the interpretation of X-ray structural analysis data. According to our model experiments, many globular proteins consist of alpha-helical strips sewn together, which are currently interpreted as beta-strips.

3. DISCUSSION.

When specialists in protein structure compare the structures produced by this program with X-ray structural analysis data, they find discrepancies: the ends of the alpha-helices are displaced, sometimes by as many as five amino acid residues.

Can X-ray analysis make such a mistake?

Let's reason. What is an alpha-helix? It is a helix each turn of which contains approximately four amino acid residues.

Consequently, residues 1 and 4 lie side by side and are linked by a hydrogen bond.

The residues adjacent to residue 4 (3 and 5) also lie very close to residue 1 (say, not at 1.5 Å but at 2 Å).

It turns out that an error of 1 - 1.5 residues in space may in fact be an error of 5 residues along the chain.

So if our program places all the turns of the protein alpha-helices with "errors of up to 5 amino acid residues", we may conclude that it works faultlessly, whereas X-ray analysis, erring by one and a half residues in space, may err by 5 residues along the chain.
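The closeness of residues 3, 4 and 5 to residue 1 can be illustrated with a short calculation. The sketch below assumes the textbook parameters of an ideal alpha-helix (3.6 residues per turn, a rise of 1.5 Å per residue, a C-alpha radius of about 2.3 Å). These values are not taken from this article, and the function name is hypothetical. The distances are between C-alpha atoms, not hydrogen bond lengths, so they are not directly comparable with the figures quoted above.

```python
import math

# Textbook parameters of an ideal alpha-helix (assumed, not from this article).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5   # angstroms along the helix axis
CA_RADIUS = 2.3          # angstroms, radius of the C-alpha trace

def ca_distance(k):
    """Distance in angstroms between C-alpha atoms k residues apart on an ideal helix."""
    angle = 2.0 * math.pi * k / RESIDUES_PER_TURN
    chord = 2.0 * CA_RADIUS * abs(math.sin(angle / 2.0))  # separation across the axis
    return math.hypot(chord, RISE_PER_RESIDUE * k)        # combined with the axial rise

# k = 3 is residue 1 to residue 4; k = 2 and k = 4 are its neighbours 3 and 5.
for k in range(1, 6):
    print(k, round(ca_distance(k), 2))
```

With these assumed parameters the separations for k = 2, 3 and 4 all fall between about 5 Å and a little over 6 Å, so residues 3, 4 and 5 are at comparable distances from residue 1.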

During the discussion the following arguments were raised against the existence of the composition genetical code.

    1. It is known that in some cases proteins can restore their native conformation after denaturation. What is the significance of the composition code in this case?
    2. It is known that the 2D structure of a polypeptide may be predicted from its amino acid sequence (rather than from its nucleotide sequence). How could this be done without invoking the composition code?
    3. Various organisms have different G-C content, and some codons are practically unused when the G-C content is high. Some proteins of different organisms fulfil the same function, have high homology and form similar 2D and 3D structures. Can the composition code be applied in this case?

Let’s consider the first argument.

Indeed, proteins can restore their conformation. This property does not contradict the existence of the composition code, which, we suppose, appeared in the course of evolution as a mechanism that duplicates the physico-chemical processes and can accelerate the formation of the protein molecule. The structure can be stable if the physico-chemical interactions duplicate the program of formation of the protein molecule. The two could have been brought into agreement by natural selection of functional molecules, that is, in the evolution of the mechanisms of biosynthesis.

Let’s consider the second argument.

Indeed, the 2D and even the 3D structure of a protein may be predicted from its amino acid sequence, and this has been done without the composition code. But this does not contradict the existence of the latter. Does arithmetic contradict algebra? The amino acid sequence is coordinated with the physico-chemical interactions of the groups composing the protein. The nucleotide sequence contains the information about the primary structure and, in addition, information about the spatial structure, which is coordinated with the physico-chemical interactions of the groups composing the protein molecule. The difference between the arithmetic of the amino acid sequence and the algebra of the nucleotide sequence is this: the reliability of 2D structure prediction from the latter exceeds that from the former by several orders of magnitude.

Let’s consider the third argument.

Indeed, the G+C content of the DNA of different organisms varies from 35% for the sea urchin to 52% for Escherichia coli. However, proteins fulfilling similar functions may be assembled from the codons used by these organisms in full accordance with the TCGC.
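The G+C content referred to here is a simple ratio of base counts. A minimal sketch follows; the function name `gc_percent` is hypothetical, and skipping symbols other than A, C, G, T and U is an assumption made for the example.

```python
def gc_percent(sequence):
    """Percentage of G and C among the recognised bases of a nucleotide sequence."""
    bases = [b for b in sequence.upper() if b in "ACGTU"]  # ignore gaps, N, whitespace
    if not bases:
        raise ValueError("sequence contains no recognised bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)

print(gc_percent("ATGC"))   # two of the four bases are G or C
```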

4. METHODS.

The main research method was physical modelling. It consisted of model experiments with circular magnets representing the electrons. The number of circular magnets corresponded to the number of valence electrons of the molecule.
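The number of magnets needed for a given molecule can be counted as in the sketch below. It assumes the standard valence electron counts of main-group elements and is only an illustration: the table `VALENCE` covers just the elements used in the example, and the function name is hypothetical.

```python
# Standard valence electron counts for a few main-group elements.
VALENCE = {"H": 1, "C": 4, "N": 5, "O": 6, "P": 5, "Cl": 7}

def valence_electrons(composition):
    """Total valence electrons for a composition such as {"C": 1, "Cl": 4}."""
    return sum(VALENCE[element] * count for element, count in composition.items())

print(valence_electrons({"C": 1, "Cl": 4}))          # carbon tetrachloride: 32
print(valence_electrons({"H": 3, "P": 1, "O": 4}))   # phosphoric acid: 32
print(valence_electrons({"C": 6, "H": 6}))           # benzene: 30
```

On this count both carbon tetrachloride and phosphoric acid would be modelled with 32 circular magnets.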

Auxiliary surfaces imitating the compensation of the electrostatic influence of the atomic nuclei and of the other electrons of the molecule were used in the experiments.

Some models were made by symmetrically placing a given number of rings, with due regard for all the properties of real electrons.

Exhaustive enumeration of versions, analogy, optimization, systematization and other methods were also used.

The databases PDB and EMBL were used.

TABLE 1. THE TABLE OF THE COMPOSITION GENETICAL CODE (TCGC)

N   codon  letter  name  subname  version  freq  midi
00  AAA    K       Lys            3        746   66
01  AAC    N       Asn            1        995   71
02  AAG    K       Lys            1        746   66
03  AAT    N       Asn            3        995   71
04  ACA    T       Thr            2        1328  76
05  ACC    T       Thr            1        1328  76
06  ACG    T       Thr            4        1328  76
07  ACT    T       Thr            3        1328  76
08  AGA    R       Arg   bottom   3        591   62
09  AGC    S       Ser   teta     1        1773  81
10  AGG    R       Arg   bottom   1        591   62
11  AGT    S       Ser   teta     3        1773  81
12  ATA    I       Ile            2        1054  72
13  ATC    I       Ile            1        1054  72
14  ATG    M       Met            1        790   67
15  ATT    I       Ile            3        1054  72
16  CAA    Q       Gln            3        790   67
17  CAC    H       His            1        887   69
18  CAG    Q       Gln            1        790   67
19  CAT    H       His            3        887   69
20  CCA    P       Pro            2        3546  81
21  CCC    P       Pro            1        3546  81
22  CCG    P       Pro            4        3546  81
23  CCT    P       Pro            3        3546  81
24  CGA    R       Arg   top      2        591   62
25  CGC    R       Arg   top      1        591   62
26  CGG    R       Arg   top      4        591   62
27  CGT    R       Arg   top      3        591   62
28  CTA    L       Leu   lambda   2        995   71
29  CTC    L       Leu   lambda   1        995   71
30  CTG    L       Leu   lambda   4        995   71
31  CTT    L       Leu   lambda   3        995   71
32  GAA    E       Glu            3        790   67
33  GAC    D       Asp            1        995   71
34  GAG    E       Glu            1        790   67
35  GAT    D       Asp            3        995   71
36  GCA    A       Ala            2        3546  93
37  GCC    A       Ala            1        3546  93
38  GCG    A       Ala            4        3546  93
39  GCT    A       Ala            3        3546  93
40  GGA    G       Gly            2        395   55
41  GGC    G       Gly            1        395   55
42  GGG    G       Gly            4        395   55
43  GGT    G       Gly            3        395   55
44  GTA    V       Val            2        1579  79
45  GTC    V       Val            1        1579  79
46  GTG    V       Val            4        1579  79
47  GTT    V       Val            3        1579  79
48  TAA            TKD            1        0     115
49  TAC    Y       Tyr            1        591   62
50  TAG            TKD            1        0     115
51  TAT    Y       Tyr            3        591   62
52  TCA    S       Ser   lambda   2        1773  81
53  TCC    S       Ser   lambda   1        1773  81
54  TCG    S       Ser   lambda   4        1773  81
55  TCT    S       Ser   lambda   3        1773  81
56  TGA            TKD            1        0     115
57  TGC    C       Cys            1        1328  76
58  TGG    W       Trp            1        591   62
59  TGT    C       Cys            3        1328  76
60  TTA    L       Leu   teta     3        995   71
61  TTC    F       Phe            1        790   67
62  TTG    L       Leu   teta     1        995   71
63  TTT    F       Phe            3        790   67

Version 1 corresponds to the entry of the residue into the 4/10 alpha-helix.

Version 2 corresponds to the entry of the residue into the 3/10 alpha-helix.

Version 3 corresponds to the entry of the residue into the classical beta-layer.

Version 4 corresponds to the entry of the residue into the beta-turn.
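Table 1 can be used as a plain lookup from codon to version. The following is a minimal sketch, not the program “Pikotechnology”: the string `VERSIONS` is the version column of Table 1 transcribed in table order (AAA, AAC, ..., TTT), and the names `TCGC` and `versions_for` are hypothetical. Reading U as T and rejecting sequences whose length is not a multiple of three are assumptions made for the example.

```python
from itertools import product

BASES = "ACGT"

# Version column of Table 1 in table order; one digit per codon.
VERSIONS = (
    "3113214331132113"   # codons AAA .. ATT
    "3113214321432143"   # codons CAA .. CTT
    "3113214321432143"   # codons GAA .. GTT
    "1113214311133113"   # codons TAA .. TTT (stop codons carry version 1 in the table)
)

# product() yields AAA, AAC, AAG, AAT, ACA, ... in the same order as the table.
TCGC = {"".join(codon): int(v) for codon, v in zip(product(BASES, repeat=3), VERSIONS)}

def versions_for(sequence):
    """Return the list of version numbers for a coding nucleotide sequence."""
    seq = sequence.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [TCGC[seq[i:i + 3]] for i in range(0, len(seq), 3)]

print(versions_for("ATGGGCGGGGGT"))   # codons ATG GGC GGG GGT
```

The result is one version number per codon, which can then be read against the four version descriptions above.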

FIGURE LEGENDS.

FIG.1

THE STRUCTURE OF THE 8-ELECTRON SHELL, WHICH THE MAJORITY OF THE ATOMS OF THE MENDELEEV TABLE TEND TO COMPLETE.

FIG.2

THE STRUCTURE OF THE NUCLEOTIDES

FIG.3

THE MODEL OF THE SUPERPOSITION OF THE CCON GROUP ON THE PREVIOUS CCON GROUP IN ACCORDANCE WITH THE COMPOSITION GENETICAL CODE.

Version 1: point F coincides with point B (rotated counter-clockwise by a few degrees).

Version 2: point F coincides with point A.

Version 3: point F coincides with point C.

Version 4: point F coincides with point B (rotated clockwise by a few degrees).

 


http://ftp.decsy.ru/nanoworld/index.htm