Project Objectives:
1) Use the Human Genome Sequence together with the known Human Protein
Sequences to accurately identify the nucleotide sequences within the Human
Genome that encode those proteins.
2) Once the nucleotide sequences have been identified, derive accurate
consensus sequences and summary statistics from them.
Methodology:
The project uses the Protein Databases from NCBI and EMBL together with the
Human Genome Sequence (HGS), and proceeds in three stages:
Stage 1 Locate all nucleotide sequences within the HGS by reverse mapping
Protein sequences to their locations in the Genome. By definition, the
Hypothetical Protein Sequences are not used to identify consensus
sequences.
Stage 2 Analyze the Nucleotide Sequences to determine accurate consensus
sequences and statistics.
Stage 3 Using the information from Stage 2, scan the Genome for previously
unidentified Genes.
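
To make Stages 1 and 2 concrete, here is a minimal Python sketch. It is not the project's actual pipeline, which this page does not describe. It assumes that "reverse mapping" can be approximated by translating the genome in all six reading frames and searching for exact protein matches, and that a "consensus sequence" means the most common base at each column of equal-length, already aligned sequences. The names `locate_protein` and `consensus` are hypothetical. Real genes are usually interrupted by introns, so a practical Stage 1 would need a spliced, mismatch-tolerant alignment rather than this exact search.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case ACGT string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def locate_protein(protein, genome):
    """Assumed Stage 1 stand-in: exact six-frame search, no introns.

    Returns (strand, start, end) tuples with 0-based, half-open
    coordinates on the forward strand of `genome`.
    """
    hits = []
    n = len(genome)
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            aa = translate(seq[frame:])
            pos = aa.find(protein)
            while pos != -1:
                start = frame + 3 * pos
                end = start + 3 * len(protein)
                if strand == "-":
                    # Convert reverse-strand offsets back to forward coordinates.
                    start, end = n - end, n - start
                hits.append((strand, start, end))
                pos = aa.find(protein, pos + 1)
    return hits


def consensus(aligned):
    """Assumed Stage 2 stand-in: majority base per column plus column counts."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    columns = [Counter(col) for col in zip(*aligned)]
    return "".join(c.most_common(1)[0][0] for c in columns), columns


if __name__ == "__main__":
    print(locate_protein("MAW", "CCATGGCTTGGTAACC"))
    print(consensus(["ATGGCT", "ATGGCA", "ATGACT"])[0])
```

The per-column counts returned by `consensus` are one simple form the Stage 2 "statistics" could take. A protein that matches at more than one location yields several hits, which is one way a count of found sequences can come to exceed the number of distinct proteins.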
Current Status: Stage 1, as of 3/10/2010 6:32:30 AM
Database Status:
| Chromosome | Unique | Non-Unique | Non-Hypothetical | Hypothetical | Finished on            |
| CHR1       | 2734   | 4928       | 1997             | 2931         | 11/3/2002 10:46:13 AM  |
| CHR10      | 1624   | 2660       | 1434             | 1226         | 10/31/2002 9:25:43 PM  |
| CHR11      | 1784   | 3142       | 1379             | 1763         | 10/30/2002 11:47:36 PM |
| CHR12      | 1436   | 2466       | 913              | 1553         | 10/30/2002 1:23:36 AM  |
| CHR13      | 1014   | 2914       | 1591             | 1323         | 10/29/2002 4:02:29 AM  |
| CHR14      | 1030   | 1754       | 725              | 1029         | 10/28/2002 12:18:36 AM |
| CHR15      | 795    | 1756       | 817              | 939          | 10/27/2002 10:09:25 AM |
| CHR16      | 1173   | 1971       | 957              | 1014         | 10/26/2002 11:10:10 AM |
| CHR17      | 1284   | 1807       | 818              | 989          | 10/25/2002 7:48:14 PM  |
| CHR18      | 887    | 1484       | 731              | 753          | 10/25/2002 7:40:47 AM  |
| CHR19      | 1179   | 1601       | 657              | 944          | 10/24/2002 4:26:58 AM  |
| CHR2       | 2362   | 4109       | 1821             | 2288         | 10/23/2002 7:04:59 PM  |
| CHR20      | 749    | 1061       | 491              | 570          | 10/22/2002 6:18:41 AM  |
| CHR21      | 403    | 562        | 284              | 278          | 10/21/2002 3:34:15 PM  |
| CHR22      | 603    | 840        | 412              | 428          | 10/20/2002 11:08:11 PM |
| CHR3       | 1782   | 3504       | 1373             | 2131         | 10/21/2002 12:51:14 AM |
| CHR4       | 1515   | 3093       | 1191             | 1902         | 10/19/2002 7:02:40 AM  |
| CHR5       | 1697   | 3316       | 1415             | 1901         | 10/18/2002 3:33:35 AM  |
| CHR6       | 1980   | 3702       | 1748             | 1954         | 10/17/2002 12:40:45 AM |
| CHR7       | 1784   | 3142       | 1640             | 1502         | 10/15/2002 9:50:08 PM  |
| CHR8       | 1453   | 2707       | 1242             | 1465         | 10/14/2002 9:33:31 PM  |
| CHR9       | 1335   | 2302       | 1024             | 1278         | 10/14/2002 12:30:48 AM |
| CHRX       | 1325   | 3248       | 1056             | 2192         | 10/13/2002 7:03:29 AM  |
| CHRY       | 265    | 821        | 234              | 587          | 10/12/2002 1:55:25 PM  |
Genome-Wide Status:
Total Unique Proteins found: 28507 of 38050 Proteins (74.9%)
Total Protein Sequences found: 58890, including duplicates.
Total Hypothetical Sequences: 32940, including duplicates.
Total Non-Hypothetical Sequences: 25950, including duplicates.
Total Unique Hypothetical Sequences: 17605 of 23440 (75.1%)
Total Unique Non-Hypothetical Sequences: 10902 of 14610 (74.6%)
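
The genome-wide figures can be cross-checked against the per-chromosome table. The short Python sketch below copies the four count columns from that table and recomputes the totals and percentages. The dictionary layout and the helper name `percent` are illustrative choices, not part of the project's software.

```python
# (unique, non_unique, non_hypothetical, hypothetical) per chromosome,
# copied from the Database Status table.
ROWS = {
    "CHR1": (2734, 4928, 1997, 2931), "CHR10": (1624, 2660, 1434, 1226),
    "CHR11": (1784, 3142, 1379, 1763), "CHR12": (1436, 2466, 913, 1553),
    "CHR13": (1014, 2914, 1591, 1323), "CHR14": (1030, 1754, 725, 1029),
    "CHR15": (795, 1756, 817, 939), "CHR16": (1173, 1971, 957, 1014),
    "CHR17": (1284, 1807, 818, 989), "CHR18": (887, 1484, 731, 753),
    "CHR19": (1179, 1601, 657, 944), "CHR2": (2362, 4109, 1821, 2288),
    "CHR20": (749, 1061, 491, 570), "CHR21": (403, 562, 284, 278),
    "CHR22": (603, 840, 412, 428), "CHR3": (1782, 3504, 1373, 2131),
    "CHR4": (1515, 3093, 1191, 1902), "CHR5": (1697, 3316, 1415, 1901),
    "CHR6": (1980, 3702, 1748, 1954), "CHR7": (1784, 3142, 1640, 1502),
    "CHR8": (1453, 2707, 1242, 1465), "CHR9": (1335, 2302, 1024, 1278),
    "CHRX": (1325, 3248, 1056, 2192), "CHRY": (265, 821, 234, 587),
}


def percent(found, total):
    """Percentage of `found` out of `total`, rounded to one decimal place."""
    return round(100.0 * found / total, 1)


# Each row's Non-Unique count should split into its two categories.
for name, (_, non_unique, non_hyp, hyp) in ROWS.items():
    assert non_unique == non_hyp + hyp, name

total_found = sum(r[1] for r in ROWS.values())
total_non_hyp = sum(r[2] for r in ROWS.values())
total_hyp = sum(r[3] for r in ROWS.values())

print("Sequences found (with duplicates):", total_found)
print("Hypothetical:", total_hyp, "Non-Hypothetical:", total_non_hyp)
print("Unique proteins found:", percent(28507, 38050), "%")
print("Unique hypothetical:", percent(17605, 23440), "%")
print("Unique non-hypothetical:", percent(10902, 14610), "%")
```

The Non-Unique, Hypothetical and Non-Hypothetical columns sum to the genome-wide totals of 58890, 32940 and 25950. The Unique column is not summed here: its per-chromosome values add up to more than 28507, presumably because a protein found on several chromosomes is counted once on each.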