FB2024_03, released June 25, 2024
Reference Report
Reference
Citation
dos Santos, G. (2023.2.8). Assessment of genes associated with sequence targeting reagents. 
FlyBase ID
FBrf0255493
Publication Type
FlyBase analysis
Abstract
PubMed ID
PubMed Central ID
DOI
Associated Information
Comments

The association of genes with sequence targeting reagents (e.g., RNAi_reagent, sgRNA) is re-assessed with each FlyBase release to identify discrepancies in input data, errors in data processing, or changes in the predicted targets of reagents caused by changes in transcript annotations. These reagents include long dsRNAs, short dsRNAs and short guide RNAs (sgRNAs). An sgRNA may target a gene for knockdown or overexpression; some sgRNAs are expressed as a set from a single construct, and all sgRNAs in the set target the same gene(s). Existing gene associations may be removed, and new gene associations may be created, based on tests for overlap between the sequence targeting reagent and various components of a gene. Criteria for making new reagent-gene associations are more stringent than criteria for validating existing reagent-gene associations. Curators review all changes to reagent-gene associations, along with the necessary updates to the related alleles.

Existing reagent-gene associations can be validated in several ways, giving deference to the original curated association. For reagents that target a gene for knockdown, the reagent-gene association is validated if any of the following is true:
1) the entire reagent sequence perfectly matches at least one of the gene's transcripts;
2) for longer dsRNA reagents, at least 21 nt of the reagent sequence perfectly matches at least one of the gene's transcripts;
3) there is at least 10 nt of overlap between the reagent's genomic location and at least one of the gene's exons;
4) the gene is a paralog of another gene whose association has been validated by any of the previous three criteria;
5) for sgRNAs that are expressed as a set from a single construct, all sgRNA-gene associations are validated if any one sgRNA of the set is validated by the previous criteria.
For sgRNAs that target a gene for overexpression, the sgRNA must have at least 10 nt of overlap with at least one transcription start site (TSS) region of the gene, defined as the region from 1250 nt upstream to 350 nt downstream of an annotated TSS; for a set of sgRNAs expressed from a single construct, all sgRNA-gene associations are validated if any one of the sgRNAs in the set meets this TSS overlap criterion. Any reagent-gene associations that are not validated are deleted.
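The TSS overlap test for overexpression sgRNAs can be illustrated with a minimal Python sketch. It is not FlyBase's actual code: the function names, the use of 1-based inclusive coordinates, and the way the window is flipped for minus-strand genes are all assumptions made for illustration. Only the window sizes (1250 nt upstream, 350 nt downstream) and the 10 nt minimum overlap come from the text above.

```python
# Hypothetical sketch of the TSS-overlap validation rule for overexpression sgRNAs.
# Intervals are (start, end) tuples, assumed 1-based and inclusive.

def overlap_len(a, b):
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def tss_window(tss, strand, upstream=1250, downstream=350):
    """Genomic interval around a TSS; 'upstream' follows the gene's strand (assumption)."""
    if strand == "+":
        return (tss - upstream, tss + downstream)
    return (tss - downstream, tss + upstream)

def sgrna_validates(sgrna, tss_sites, strand, min_overlap=10):
    """True if the sgRNA overlaps at least one TSS window by min_overlap nt or more."""
    return any(
        overlap_len(sgrna, tss_window(tss, strand)) >= min_overlap
        for tss in tss_sites
    )

def construct_validates(sgrnas, tss_sites, strand):
    """Set rule: every sgRNA of a construct validates if any one of them does."""
    return any(sgrna_validates(s, tss_sites, strand) for s in sgrnas)
```

For example, with a plus-strand TSS at position 10000 the window spans 8750 to 10350, so an sgRNA at 8741-8760 shares 11 nt with it and validates, while one at 8740-8758 shares only 9 nt and does not.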

After validation of existing gene associations, if a reagent lacks any validated gene associations, new gene associations are searched for by comparing the reagent's genomic location to current gene annotations. The criteria for creating new gene associations are more stringent than the criteria for validating existing gene associations. For knockdown reagents, a new gene association is made if there is at least 18 nt of overlap between the reagent's genomic location and at least one of the gene's exons. When a new gene association is made, the gene's paralogs are also checked, and gene associations are made where there is at least 21 nt of identity between the reagent and the paralog. For sgRNAs that target genes for overexpression, a new gene association is made if there is at least 18 nt of overlap between the reagent's genomic location and at least one TSS region of the gene, defined as the region from 750 nt upstream to 250 nt downstream of an annotated TSS.
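The two-stage logic for knockdown reagents (validate first, search for new associations only if nothing validates) might be sketched as follows. This is a simplified illustration, not the FlyBase pipeline: it covers only the exon-overlap criteria (10 nt to validate, 18 nt to create) and omits sequence matching, paralog checks and sgRNA sets. The function names, the dictionary of exons keyed by gene, and the 1-based inclusive coordinates are assumptions.

```python
# Hypothetical sketch: re-assess the gene associations of one knockdown reagent
# using exon overlap only. Intervals are (start, end), assumed 1-based inclusive.

VALIDATE_MIN_OVERLAP = 10  # threshold for keeping an existing association
CREATE_MIN_OVERLAP = 18    # stricter threshold for making a new association

def overlap_len(a, b):
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def best_exon_overlap(reagent, exons):
    """Largest overlap between the reagent and any single exon."""
    return max((overlap_len(reagent, exon) for exon in exons), default=0)

def reassess_knockdown(reagent, current_genes, exons_by_gene):
    """Return the set of genes associated with the reagent after re-assessment."""
    validated = {
        gene for gene in current_genes
        if best_exon_overlap(reagent, exons_by_gene.get(gene, [])) >= VALIDATE_MIN_OVERLAP
    }
    if validated:
        # Associations that did not validate are dropped; no new search is run.
        return validated
    # No validated association remains: search all genes with the stricter cutoff.
    return {
        gene for gene, exons in exons_by_gene.items()
        if best_exon_overlap(reagent, exons) >= CREATE_MIN_OVERLAP
    }
```

Under these thresholds, a reagent that overlaps an exon by 12 nt keeps an association it already has, but the same overlap would not be enough to create that association from scratch.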

Note: For 121 RNAi_reagents of the NIG_RNAi_Fly-1 collection, the exact dsRNA sequence encoded is unknown (it was unavailable at the time of curation). To allow for validation of these RNAi_reagents, FlyBase deduced the possible dsRNA sequence by in silico PCR, using the primers known to have been used to generate the reagent and, as templates, the cDNAs and annotated transcripts of the gene associated with the reagent. Where multiple in silico PCR products were possible, products were ranked and the highest-ranking product was taken as the dsRNA sequence: a PCR product common to cDNA and annotated transcripts was ranked highest, followed by a PCR product obtained from annotated transcripts, and finally a PCR product obtained from a cDNA associated with the gene. Where multiple PCR products were found for a given template type, products were ranked by size: an in silico PCR product of exactly 500 nt was ranked highest, followed by products between 475 nt and 525 nt in size, and finally products of any size. In this way, a deduced dsRNA sequence was obtained for all 121 of these NIG_RNAi_Fly-1 reagents. In over two thirds of cases, only a single PCR product of exactly 500 nt was found. For the rest, please be aware that the deduced dsRNA sequence may be only one of several possible dsRNA sequences. In all cases, the annotated transcripts/cDNAs from which the dsRNA sequence was deduced are listed on the reagent's FlyBase web report.
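The ranking of in silico PCR products described in the note can be expressed as a two-part sort key: template type first, then size tier. The sketch below is illustrative only. The PcrProduct type, the template labels ("both", "transcript", "cdna") and the inclusive reading of the 475-525 nt range are assumptions, and the note does not say how ties within the same tier were broken (here the first product listed wins).

```python
# Hypothetical sketch of ranking in silico PCR products by template type, then size.
from typing import NamedTuple

class PcrProduct(NamedTuple):
    template: str  # "both" (cDNA and transcript), "transcript", or "cdna"
    length: int    # product size in nt

# Lower rank is better: common to both > annotated transcript > cDNA only.
TEMPLATE_RANK = {"both": 0, "transcript": 1, "cdna": 2}

def size_rank(length):
    """Exactly 500 nt is best, then 475-525 nt (assumed inclusive), then any size."""
    if length == 500:
        return 0
    if 475 <= length <= 525:
        return 1
    return 2

def pick_product(products):
    """Return the highest-ranking product; ties go to the earliest in the list."""
    return min(products, key=lambda p: (TEMPLATE_RANK[p.template], size_rank(p.length)))
```

Because template type is compared before size, a 610 nt product found on both cDNA and annotated transcripts outranks a 500 nt product found on only one template type.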

Associated Files
Other Information
Secondary IDs
    Language of Publication
    Additional Languages of Abstract
    Parent Publication
    Publication Type
    Abbreviation
    Title
    ISBN/ISSN
    Data From Reference
    Alleles (456)
    Genes (434)
    Sequence Features (451)
    Natural transposons (1)
    Experimental Tools (4)
    Transgenic Constructs (443)