ENCODE Target Regions (January 2004)

For background information, please see: NHGRI: ENCODE Target Selection Process

Stratified Random Picks

To make the stratified picks, the human genome was divided into top 20%, middle 30%, and bottom 50% strata along each of two axes: gene density and non-exonic conservation. Crossing the two axes gives the nine combined strata listed below. Three random picks were taken from each stratum, and a fourth pick was made from the strata that were under-represented in the manual picks. One additional backup pick was made in each stratum as a contingency for unforeseen technical problems within the region. The backup pick is the last (parenthesized) entry listed in each table section.
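The stratification described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure actually used: the window list, the rank-based cut-offs, and the function names `stratum_by_rank` and `stratified_picks` are hypothetical, and the fourth pick for under-represented strata is not modeled.

```python
import random
from collections import defaultdict

def stratum_by_rank(values):
    """Assign 1 (bottom 50%), 2 (middle 30%) or 3 (top 20%) to each value by rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    strata = [0] * n
    for rank, i in enumerate(order):
        # Integer comparisons avoid floating-point edge cases at the cut-offs.
        if 10 * rank < 5 * n:
            strata[i] = 1
        elif 10 * rank < 8 * n:
            strata[i] = 2
        else:
            strata[i] = 3
    return strata

def stratified_picks(windows, picks_per_stratum=3, backups=1, seed=0):
    """windows: list of (name, conservation, gene_density) tuples (hypothetical input).

    Returns {(conservation_stratum, density_stratum): {"picks": [...], "backup": [...]}}.
    """
    cons = stratum_by_rank([w[1] for w in windows])
    dens = stratum_by_rank([w[2] for w in windows])
    cells = defaultdict(list)
    for w, c, d in zip(windows, cons, dens):
        cells[(c, d)].append(w[0])
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    result = {}
    for cell, names in sorted(cells.items()):
        k = min(len(names), picks_per_stratum + backups)
        chosen = rng.sample(names, k)
        result[cell] = {
            "picks": chosen[:picks_per_stratum],
            "backup": chosen[picks_per_stratum:],
        }
    return result
```

Sampling the picks and the backup in a single draw keeps the backup distinct from the primary picks within a stratum.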

For an explanation of how gene density and non-exonic conservation were determined, see the Methods section.

Non-Exonic Conservation 0% - 50%, Gene Density 0% - 50%
November 2002 April 2003 July 2003 Region Stats
chr13:24500016-25000015 chr13:29450016-29950015 chr13:28318016-28818015 ENr111 Non-Exonic Conservation 2.8%, Gene Density 0.5%
chr2:51837455-52337454 chr2:51616414-52116413 chr2:51633239-52133238 ENr112 Non-Exonic Conservation 3.8%, Gene Density 0.0%
chr4:118527386-119027385 chr4:118639860-119139859 chr4:118705475-119205474 ENr113 Non-Exonic Conservation 3.9%, Gene Density 0.0%
chr10:54489120-54989119 chr10:55376221-55876220 chr10:54828416-55328415 ENr114 Non-Exonic Conservation 2.8%, Gene Density 1.2%
chr5:16187472-16687471 chr5:15942554-16442553 chr5:15962554-16462553 (ENr115) Non-Exonic Conservation 5.1%, Gene Density 1.7%

Non-Exonic Conservation 0% - 50%, Gene Density 50% - 80%
November 2002 April 2003 July 2003 Region Stats
chr2:116215329-116715328 chr2:118201388-118701387 chr2:118389719-118889718 ENr121 Non-Exonic Conservation 6.2%, Gene Density 2.3%
chr18:61234622-61734621 chr18:61046295-61546294 chr18:59410290-59910289 ENr122 Non-Exonic Conservation 3.4%, Gene Density 3.4%
chr12:40239443-40739442 chr12:40056957-40556956 chr12:38626477-39126476 ENr123 Non-Exonic Conservation 1.7%, Gene Density 3.1%
chr2:197214044-197714043 chr2:198465410-198965409 chr2:198703930-199203929 (ENr124) Non-Exonic Conservation 5.4%, Gene Density 3.3%

Non-Exonic Conservation 0% - 50%, Gene Density 80% - 100%
November 2002 April 2003 July 2003 Region Stats
chr2:233173598-233673597 chr2:234508167-235008166 chr2:234778639-235278638 ENr131 Non-Exonic Conservation 1.3%, Gene Density 4.6%
chr13:107927238-108427237 chr13:112376702-112876701 chr13:111238065-111738064 ENr132 Non-Exonic Conservation 1.1%, Gene Density 5.5%
chr21:36983033-37483032 chr21:39242992-39742991 chr21:39242993-39742992 ENr133 Non-Exonic Conservation 2.3%, Gene Density 5.2%
chr4:48032776-48532775 chr4:47573442-48073441 chr4:47639061-48139060 (ENr134) Non-Exonic Conservation 1.9%, Gene Density 4.4%

Non-Exonic Conservation 50% - 80%, Gene Density 0% - 50%
November 2002 April 2003 July 2003 Region Stats
chr16:25969826-26469825 chr16:25800363-26300362 chr16:25839478-26339477 ENr211 Non-Exonic Conservation 9.7%, Gene Density 0.5%
chr5:142482586-142982585 chr5:141883116-142383115 chr5:141928468-142428467 ENr212 Non-Exonic Conservation 6.7%, Gene Density 1.7%
chr18:25196197-25696196 chr18:25353226-25853225 chr18:23717221-24217220 ENr213 Non-Exonic Conservation 7.4%, Gene Density 0.9%
chr4:124166677-124666676 chr4:124280349-124780348 chr4:124345964-124845963 (ENr214) Non-Exonic Conservation 6.3%, Gene Density 0.9%

Non-Exonic Conservation 50% - 80%, Gene Density 50% - 80%
November 2002 April 2003 July 2003 Region Stats
chr5:57392856-57892855 chr5:55805775-56305774 chr5:55851135-56351134 ENr221 Non-Exonic Conservation 7.9%, Gene Density 2.2%
chr6:132023965-132523964 chr6:132111977-132611976 chr6:132157417-132657416 ENr222 Non-Exonic Conservation 6.9%, Gene Density 2.1%
chr6:73699933-74199932 chr6:73683390-74183389 chr6:73728830-74228829 ENr223 Non-Exonic Conservation 6.4%, Gene Density 3.6%
chr4:53859184-54359183 chr4:53728692-54228691 chr4:53794311-54294310 (ENr224) Non-Exonic Conservation 9.0%, Gene Density 2.1%

Non-Exonic Conservation 50% - 80%, Gene Density 80% - 100%
November 2002 April 2003 July 2003 Region Stats
chr1:146905332-147405331 chr1:147933156-148433155 chr1:148374643-148874642 ENr231 Non-Exonic Conservation 10.2%, Gene Density 8.4%
chr9:123331831-123831830 chr9:125138972-125638971 chr9:127061347-127561346 ENr232 Non-Exonic Conservation 8.3%, Gene Density 5.9%
chr15:36628619-37128618 chr15:41311935-41811934 (manually placed) (manually placed) ENr233 Non-Exonic Conservation 9.7%, Gene Density 10.6%
chr17:35665792-36165791 chr17:33478638-33978637 chr17:33775531-34275530 (ENr234) Non-Exonic Conservation 7.7%, Gene Density 6.1%

Non-Exonic Conservation 80% - 100%, Gene Density 0% - 50%
November 2002 April 2003 July 2003 Region Stats
chr14:47673341-48173340 chr14:51867364-52367363 chr14:51867364-52367363 ENr311 Non-Exonic Conservation 14.9%, Gene Density 0.1%
chr11:132612235-133112234 chr11:131133068-131633067 chr11:130637240-131137239 ENr312 Non-Exonic Conservation 13.5%, Gene Density 0.3%
chr16:62362206-62862205 chr16:62010885-62510884 chr16:62051662-62551661 ENr313 Non-Exonic Conservation 15.4%, Gene Density 0.0%
chrX:42149253-42649252 chrX:42714870-43214869 chrX:42934523-43434522 (ENr314) Non-Exonic Conservation 13.4%, Gene Density 0.7%

Non-Exonic Conservation 80% - 100%, Gene Density 50% - 80%
November 2002 April 2003 July 2003 Region Stats
chr8:118874200-119374199 chr8:118481838-118981837 chr8:118769628-119269627 ENr321 Non-Exonic Conservation 11.4%, Gene Density 3.2%
chr14:93204045-93704044 chr14:97378512-97878511 chr14:97378512-97878511 ENr322 Non-Exonic Conservation 15.9%, Gene Density 2.9%
chr6:108287568-108787567 chr6:108264834-108764833 chr6:108310274-108810273 ENr323 Non-Exonic Conservation 18.6%, Gene Density 2.3%
chrX:119675382-120175381 chrX:120734591-121234590 chrX:121480070-121980069 ENr324 Non-Exonic Conservation 10.7%, Gene Density 2.0%

Non-Exonic Conservation 80% - 100%, Gene Density 80% - 100%
November 2002 April 2003 July 2003 Region Stats
chr2:218998720-219498719 chr2:220241365-220741364 chr2:220479885-220979884 ENr331 Non-Exonic Conservation 13.3%, Gene Density 9.1%
chr11:65865884-66365883 chr11:64434365-64934364 chr11:63959673-64459672 ENr332 Non-Exonic Conservation 13.4%, Gene Density 9.0%
chr20:33559944-34059943 chr20:34509944-35009943 chr20:34556944-35056943 ENr333 Non-Exonic Conservation 11.5%, Gene Density 9.2%
chr6:41294331-41794330 chr6:41299332-41799331 chr6:41344772-41844771 ENr334 Non-Exonic Conservation 15.2%, Gene Density 4.8%
chr9:124831831-125331830 chr9:126638972-127138971 chr9:128561347-129061346 (ENr335) Non-Exonic Conservation 11.4%, Gene Density 5.4%
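
The ranges in these tables appear to be 1-based and inclusive: read that way, the November 2002 range for ENr111 spans exactly 500,000 bases. A minimal Python sketch for parsing a range and checking its length follows. The function names `parse_range` and `span_length` are hypothetical, and the inclusive-coordinate reading is an assumption based on the listed values.

```python
import re

# Matches ranges such as "chr13:24500016-25000015" or "chrX:42149253-42649252".
_RANGE = re.compile(r"^(chr\w+):(\d+)-(\d+)$")

def parse_range(text):
    """Split 'chrN:start-end' into (chrom, start, end)."""
    m = _RANGE.match(text.strip())
    if m is None:
        raise ValueError(f"not a chrom:start-end range: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))

def span_length(text):
    """Number of bases in the range, assuming 1-based inclusive coordinates."""
    _, start, end = parse_range(text)
    return end - start + 1

print(span_length("chr13:24500016-25000015"))  # 500000
```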

Stratification of Manual Picks

The following targets were manually selected as regions of biological interest. See the Methods section for information on strata boundaries.

November 2002 April 2003 July 2003 Region Interest
chr7:114288155-116165580 chr7:115351222-117228647 chr7:115365024-117242449 ENm001 CFTR
chr5:131703638-132703637 chr5:131287278-132287277 chr5:131332631-132332630 ENm002 Interleukin_Cluster
chr11:117969240-118469239 chr11:116491019-116991018 chr11:115994758-116494757 ENm003 Apo_Cluster
chr22:28500001-30200000 chr22:30128508-31828507 chr22:30128508-31828507 ENm004 Chr22
chr21:30406794-32102778 chr21:32666762-34362746 chr21:32666763-34362747 ENm005 Chr21
chrX:149572309-150846234 chrX:150700001-151950000 (manually placed) ENm006 ChrX
chr19:54724484-55728861 chr19:59007794-60008669 (manually placed) ENm007 Chr19
chr16:10001-510000 chr16:1-500000 (manually placed) ENm008 Alpha_Globin
chr11:5076527-6078118 chr11:4733457-5735048 chr11:4738729-5740320 ENm009 Beta_Globin
chr7:26599801-27099800 chr7:26665793-27165792 chr7:26699793-27199792 ENm010 HOXA_cluster
chr11:1941933-2547980 (manually placed) chr11:1702703-2308750 (manually placed) ENm011 IGF2/H19
chr7:112410791-113410790 chr7:113473834-114473833 chr7:113487636-114487635 ENm012 FOXP2

Semi-Manual Picks

These targets were manually selected from regions that have been extensively studied and help balance the stratification.

November 2002 April 2003 July 2003 Region Chrom band
chr7:88318937-89433360 chr7:89381916-90496339 chr7:89395718-90510141 ENm013 7q21.13
chr7:91589027-92559435 chr7:92652026-93622434 chr7:92665828-93636236 (ENm015) 7q21.3
chr7:93650512-94868626 chr7:94713518-95931632 chr7:94727320-95945434 (ENm016) 7q21.3
chr7:124556244-125719432 chr7:125619343-126782531 chr7:125633145-126796333 ENm014 7q31.33
chr7:126427507-127330461 chr7:127490599-128393553 chr7:127504401-128407355 (ENm017) 7q32.1


Methods

Gene density is defined as the percentage of bases covered either by Ensembl genes or by human mRNA best BLAT alignments in the UCSC Genome Browser database.
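
The gene density definition amounts to taking the union of the annotation intervals inside a window and dividing by the window length. A rough Python sketch follows. The function name `percent_covered` and the 0-based half-open `(start, end)` interval convention are assumptions made for the illustration, not details from the database.

```python
def percent_covered(region_start, region_end, intervals):
    """Percentage of bases in [region_start, region_end) covered by the union of intervals.

    intervals: iterable of (start, end) pairs, half-open; overlaps are counted once.
    """
    # Keep only intervals that touch the region, clipped to its bounds.
    clipped = sorted(
        (max(s, region_start), min(e, region_end))
        for s, e in intervals
        if s < region_end and e > region_start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Gap before this interval: close the previous merged run.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent: extend the current run.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / (region_end - region_start)

# Two overlapping gene models plus one that runs past the window edge.
print(percent_covered(0, 1000, [(100, 200), (150, 250), (900, 1100)]))  # 25.0
```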

Non-exonic conservation was measured by a more elaborate process. Each 500,000-base window was divided into non-overlapping 125-base sub-windows. Sub-windows with less than 75% of their bases in a mouse alignment were discarded. Of the remaining sub-windows, those with at least 80% base identity were counted toward the conservation score. To restrict the score to non-exonic sequence, the mouse alignments in regions corresponding to the following were discarded: Ensembl genes, all GenBank mRNA BLASTZ alignments, Fgenesh++ gene predictions, Twinscan gene predictions, spliced EST alignments, and repeats.
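
The sub-window procedure can be sketched in Python as follows. Several details are assumptions, because the description above leaves them open: masked (exonic or repeat) bases are treated as unaligned, identity is measured over aligned bases only, and the score is reported as a percentage of all sub-windows in the window. The function name and the per-base input representation are hypothetical.

```python
def non_exonic_conservation(human, mouse, masked, sub=125,
                            min_aligned=0.75, min_identity=0.80):
    """Percentage of sub-windows that pass both the alignment and identity thresholds.

    human, mouse: equal-length strings; mouse holds the aligned mouse base at each
                  human position, or '-' where there is no alignment.
    masked:       per-base booleans, True where alignments are discarded
                  (genes, mRNAs, gene predictions, spliced ESTs, repeats).
    """
    assert len(human) == len(mouse) == len(masked)
    n_sub = len(human) // sub
    conserved = 0
    for w in range(n_sub):
        lo, hi = w * sub, (w + 1) * sub
        aligned = identical = 0
        for i in range(lo, hi):
            if masked[i] or mouse[i] == "-":
                continue  # assumption: masked bases count as unaligned
            aligned += 1
            if human[i].upper() == mouse[i].upper():
                identical += 1
        if aligned < min_aligned * sub:
            continue  # too little of the sub-window is in a mouse alignment
        if identical >= min_identity * aligned:
            conserved += 1
    # Assumption: the denominator is every sub-window, not only the retained ones.
    return 100.0 * conserved / n_sub if n_sub else 0.0
```

Changing the denominator to the number of retained sub-windows would give a different scale, so the percentages listed in the tables should not be compared directly against output from this sketch.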

The following table shows the non-exonic conservation and gene density of non-overlapping 500 kb regions in the manual picks. The boundaries between strata are:

                          low 50%    middle 30%   high 20%
Gene Density              0.0-1.9%   1.9-4.2%     4.2-100%
Non-Exonic Conservation   0.0-6.3%   6.3-10.6%    10.6-100%
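
As a worked example, the boundaries above can be used to assign a region to its stratum. In the random-pick tables, the first digit of each ENr identifier tracks the conservation stratum and the second digit the gene density stratum; this is an observation from the listed data rather than a documented rule. The sketch below assumes a value equal to a boundary belongs to the higher stratum, which matches ENr214 (6.3% conservation, listed in the 50% - 80% stratum). The published percentages are rounded, so edge cases may differ.

```python
import bisect

# Lower bounds of the middle and high strata, taken from the boundary table.
GENE_DENSITY_CUTS = (1.9, 4.2)
CONSERVATION_CUTS = (6.3, 10.6)

def stratum(value, cuts):
    """Return 1 (low 50%), 2 (middle 30%) or 3 (high 20%) for a percentage value."""
    # bisect_right places a value equal to a cut in the higher stratum (assumption).
    return bisect.bisect_right(cuts, value) + 1

def region_prefix(conservation, gene_density):
    """Hypothetical helper: the 'ENrXY' prefix implied by the two strata."""
    return f"ENr{stratum(conservation, CONSERVATION_CUTS)}{stratum(gene_density, GENE_DENSITY_CUTS)}"

print(region_prefix(2.8, 0.5))   # ENr11 (cf. ENr111)
print(region_prefix(10.7, 2.0))  # ENr32 (cf. ENr324)
```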

See also: Previous version of this web page's data

