FB2024_02, released April 23, 2024
Result: mE2_34_mRNA_expression_clusters
General Information
Name
mE2_34_mRNA_expression_clusters
Species
D. melanogaster
Result type
FlyBase ID
FBlc0000358
Project
Data Provider
Title
Clustering of genes into 34 groups based on similar expression dynamics throughout development, as determined from modENCODE RNA-Seq data.
Status
Current
Accessions
    Biosample Source
    Overview
    Other tissues studied
    Cell component
    Cell line
    Key genes
    Methods
    Sample preparation

    Adult fly population cages were maintained at 24 °C on a 24-hour light cycle (14 hours light/10 hours dark). After a 2-hour pre-lay, embryos were collected on agar plates for a 2-hour interval during a light cycle, aged appropriately, dechorionated, and frozen on dry ice.

    Biosamples analyzed by this result (30)
    Biosample
    Type
    Title
    D. melanogaster, iso-1 strain, embryo (0-2hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (10-12hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (12-14hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (14-16hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (16-18hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (18-20hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (2-4hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (4-6hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (6-8hr AEL), source for RNA.
    D. melanogaster, iso-1 strain, embryo (8-10hr AEL), source for RNA.
    Showing 10 of 30 records.
    Data Analyzed
    Key genes
    Protocol

    Frozen samples were homogenized and extracted using the TRIzol reagent protocol (Invitrogen). RNA was purified on an RNeasy spin column (Qiagen) and DNase treated. Polyadenylated RNAs were purified from total RNA extracts via oligo(dT) binding, using the standard Illumina protocol. The poly(A)+ RNA was fragmented using divalent cations under elevated temperature, followed by first and second strand cDNA synthesis primed with random hexamers. The cDNA fragments were end-repaired using T4 DNA polymerase and Klenow DNA polymerase, and phosphorylated at their 5' ends with T4 polynucleotide kinase. After adding A bases to the 3' end of the DNA fragments, Illumina adaptor oligonucleotides were ligated to the ends, and ~300 bp fragments were isolated from an agarose gel, enriched by PCR amplification, and gel-purified again.

    Mode of Assay

    Read length (bases): 76

    The samples were quantitated using a Nanodrop, and loaded onto a flow cell for cluster generation and sequenced on an Illumina Genome Analyzer II using either single read or paired end protocols (Illumina).

    Raw Data Analyzed (30)
    Assay / Reagent collection
    Type
    Title
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (0-2hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (10-12hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (12-14hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (14-16hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (16-18hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (18-20hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (2-4hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (4-6hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (6-8hr AEL), modENCODE, unstranded.
    RNA-Seq of D. melanogaster, iso-1 strain, embryo (8-10hr AEL), modENCODE, unstranded.
    Showing 10 of 30 records.
    Processed Data Analyzed (0)
    Result
    Type
    Title
    Analysis
    Methods
    Reference Genome
    Reference Annotation
    Data analysis

    Reads were aligned to Dmel_Release_6 using the STAR aligner v2.3.0e (Linux x86_64) with default parameters on the FASTQ files to generate multiply-mapped BAM files. These were filtered to include reads with only 1 aligned hit (NH:i:1 attribute) to generate uniquely-mapped BAM files. A custom script (bam2bedgraph.cc) was used to convert BAM files into bedgraph files.
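
The uniqueness filter described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: it assumes the alignments are available as SAM-format text lines (for example, after decompressing a BAM file with an external tool), and the function names `is_uniquely_mapped` and `filter_unique` are hypothetical.

```python
def is_uniquely_mapped(sam_line: str) -> bool:
    """Return True if a SAM alignment line carries the exact tag NH:i:1."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column (index 11) of a SAM record.
    # Matching whole fields avoids treating NH:i:10 as NH:i:1.
    return "NH:i:1" in fields[11:]


def filter_unique(sam_lines):
    """Yield header lines and alignments reported with exactly one hit."""
    for line in sam_lines:
        if line.startswith("@") or is_uniquely_mapped(line):
            yield line


if __name__ == "__main__":
    # Illustrative records only; these are not real modENCODE reads.
    example = [
        "@HD\tVN:1.4\n",
        "r1\t0\t2L\t100\t255\t76M\t*\t0\t0\tACGT\tIIII\tNH:i:1\tHI:i:1\n",
        "r2\t0\t2L\t200\t3\t76M\t*\t0\t0\tACGT\tIIII\tNH:i:2\tHI:i:1\n",
    ]
    for kept in filter_unique(example):
        print(kept, end="")
```

Because each line is tested on its own, the two mates of a paired-end read are kept or dropped independently of each other.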

    Note that for each pair of paired-end reads, the two reads were mapped independently, and only those reads mapping uniquely to the genome were included in the data submission to FlyBase. In other words, information from one read was not used to resolve ambiguous mapping of its paired read.

    FlyBase reports gene expression levels calculated from RNA-Seq coverage data as RPKM (reads per kilobase of exon model per million mapped reads). The RPKM value is calculated as follows. The uniquely transcribed region(s) for each gene are determined by taking the regions covered by exons of the gene and excluding transcribed regions from any overlapping genes: genes lying on the same strand (for calculations using strand-specific RNA-Seq coverage data) or genes on either strand (for calculations using unstranded RNA-Seq coverage data). RNA-Seq coverage read-count data were then correlated by location with the uniquely transcribed region(s) of each gene to produce the sum of reads over the entire uniquely transcribed region for the gene. RPKM was then calculated using the method from Mortazavi et al., Nat. Methods 5, 621-628 (2008): RPKM = 10^9 * C / (N * L * R), where C = number of reads in gene, N = number of uniquely mappable reads in the experiment, L = sum of uniquely transcribed bases in bp, and R = read length in bp.
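
The two steps in this paragraph (deriving the uniquely transcribed region, then applying the RPKM formula) can be sketched in Python as follows. This is an illustration, not FlyBase's implementation. It makes three assumptions: intervals are half-open, 0-based `(start, end)` pairs; a gene's exon intervals have already been merged so that they do not overlap one another; and the denominator of the formula is read as the product N * L * R. The function names are hypothetical.

```python
def subtract_intervals(exons, blocked):
    """Return the parts of `exons` not covered by any `blocked` interval.

    Intervals are half-open (start, end) pairs; `exons` is assumed to be
    non-overlapping (an assumption of this sketch).
    """
    blocked = sorted(blocked)
    result = []
    for start, end in sorted(exons):
        cursor = start  # left edge of the part of the exon not yet handled
        for b_start, b_end in blocked:
            if b_end <= cursor or b_start >= end:
                continue  # no overlap with the remaining part of this exon
            if b_start > cursor:
                result.append((cursor, b_start))
            cursor = max(cursor, b_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


def region_length(intervals):
    """Total number of bases covered by non-overlapping intervals (L)."""
    return sum(end - start for start, end in intervals)


def rpkm(c, n, l, r):
    """RPKM = 10^9 * C / (N * L * R), with symbols as defined in the text."""
    if n <= 0 or l <= 0 or r <= 0:
        raise ValueError("N, L and R must be positive")
    return 1e9 * c / (n * l * r)


if __name__ == "__main__":
    # Illustrative numbers only; not taken from the modENCODE data.
    unique = subtract_intervals([(100, 200), (300, 400)], [(150, 320)])
    print(unique, region_length(unique))
    print(rpkm(c=76_000, n=10_000_000, l=1_000, r=76))
```

For unstranded data, `blocked` would hold the transcribed regions of overlapping genes on either strand; for strand-specific data, only those on the same strand.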

    The RPKM values are binned into eight expression levels (percentiles are approximate):
    Bin 0: No/Extremely low expression (0)
    Bin 1: Very low expression (1-3), percentiles 1-25
    Bin 2: Low expression (4-10), percentiles 26-50
    Bin 3: Moderate expression (11-25), percentiles 51-75
    Bin 4: Moderately high expression (26-50), percentiles 76-85
    Bin 5: High expression (51-100), percentiles 86-95
    Bin 6: Very high expression (101-1000), percentiles 96-99
    Bin 7: Extremely high expression (>1000), the 100th percentile
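
The binning can be expressed as a lookup against the upper bound of each bin. The Python sketch below is illustrative: it assumes integer RPKM values, as the printed ranges suggest, and the names `BIN_UPPER_BOUNDS` and `expression_bin` are hypothetical. A fractional value that falls between two printed ranges (for example 3.5) would land in the higher bin, which is a choice of this sketch and is not stated in the source.

```python
from bisect import bisect_left

# Inclusive upper RPKM bound of bins 0 through 6; anything above 1000 is bin 7.
BIN_UPPER_BOUNDS = [0, 3, 10, 25, 50, 100, 1000]


def expression_bin(rpkm_value):
    """Map a non-negative RPKM value to its expression bin (0 to 7)."""
    if rpkm_value < 0:
        raise ValueError("RPKM cannot be negative")
    # bisect_left returns the index of the first bound >= rpkm_value,
    # which is the bin whose inclusive upper bound covers the value.
    return bisect_left(BIN_UPPER_BOUNDS, rpkm_value)


if __name__ == "__main__":
    for value in (0, 1, 3, 4, 25, 26, 100, 101, 1000, 1001):
        print(value, expression_bin(value))
```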

    FlyBase RPKM data for all genes can be downloaded from the FlyBase Downloads page (link in the blue navigation bar at the top of all FlyBase web pages).

    Comments

    GO term enrichment and motif enrichment for each cluster are presented in fig. S15 (FBrf0213506).

    Associated Data
    Size
    Files
    Additional Information
    Synonyms and Secondary IDs (6)
    Reported As
    Symbol Synonym
    34 expression clusters
    Gene co-expression clusters
    clustered expression profiles
    co-expression clusters
    mE2_34_mRNA_expression_clusters
    Name Synonyms
    Clustering of genes into 34 groups based on similar expression dynamics throughout development, as determined from modENCODE RNA-Seq data.
    Secondary FlyBase IDs
      References (4)