Frequently Asked Questions (FAQs)
Pooled Lentiviral shRNA Libraries
- shRNA / Bar-code Design
- What are the details of Cellecta's shRNA design algorithm? How has it been optimized and validated?
- How many constructs went into the assessment of shRNA structure and length?
- What is the length of the bar-code sequence? How are they designed / vetted? What is the extent of cross-reactivity in microarrays?
- Can you use an alternative bar-code design?
- Why did you choose short hairpin over mir30-based libraries?
- How does your knockdown efficiency compare to that of other designs?
- Oligo Preparation
- How many oligos can be prepared per chip?
- What are the limits on the oligo length?
- Is there an optimal length for downstream PCR and cloning?
- What is the oligo mutation rate?
- What is the optimal number of shRNAs per chip (i.e. shRNAs/library)?
- Can we provide our own shRNA designs for oligo preparation?
- Can Cellecta deliver arrayed shRNA libraries?
- Can Cellecta deliver pooled oligos and/or arrayed oligos?
- What are the advantages of pooled format vs. arrayed format shRNA libraries?
- Library Preparation: QC and Quantity
- What is the "mutation" or mis-synthesis rate? What percentage of constructs are correct as designed?
- What percentage of designed constructs are actually in your plasmid DNA library?
- What is the representation after the PCR step as determined by sequencing? What is the error rate?
- What is the representation after cloning and after the plasmid DNA prep?
- What percentage of designed constructs are actually present in your viral library?
- How are you determining viral titer?
- What is the typical representation of the library after infection?
- What quality control measures are in place at each stage of the process to determine errors or dropouts of desired triggers?
- Vector Design
- Sequencing
- Pooled Screens
- Delivered product
- With the pooled plasmids, how many transfections can we do to package more virus? With the virus provided, what is the titer and how much is provided?
- How many screens can be performed with the virus that you provide?
- How many packaging reactions can be done with the DNA that is provided?
- Will you perform the repeat viral packaging for an additional fee?
- Licensing/IP
- Experience
Pooled Lentiviral shRNA Libraries
- shRNA / Bar-code Design
- What are the details of Cellecta's shRNA design algorithm? How has it been optimized and validated?
We have improved the shRNA design for pooled format RNAi screens.
Pooled format screens require uniform shRNA representation in the library and reliable knockdown of target genes by shRNAs expressed at low levels from a single copy of integrated shRNA construct in transduced cells. In addition to our efforts to identify highly active shRNAs in HT validation screens (see below), we were able to optimize the vector and shRNA design for the highest knockdown activity and most uniform representation in the library. Based on recent publications and our own results from RNAi screens in several in vitro cell models (see below), we believe that the design of a lentiviral vector with expression of shRNAs from an internal U6 promoter provides the highest knockdown efficiency in comparison with the H1 promoter or expression of shRNAs in a mir30-based backbone (An D.S., et al. Mol Ther. 2006 October;14(4); Boudreau R.L., et al. Mol Ther. 2009 Jan;17(1):169-75).
In order to optimize the structure of the shRNA, we developed an shRNA-target library in an shRNA validation vector (U6Tet-sh-PGK-GFP-Target) for 150 shRNAs, with each shRNA constructed in 40 different designs described in the literature. We performed RNAi screens in CHO-TRex cells transduced with this 150x40 bar-coded shRNA-Target library and measured representation (in the library and transduced cells) and knockdown efficiency of each shRNA construct. The ten best designs were selected based on those having the highest target knockdown activity, equal amplification efficiency of same-design pooled oligonucleotides, and equal representation of same-design shRNAs in the pooled RNAi libraries. The knockdown efficiency of three p53 shRNAs constructed with these 10 best designs in a lentiviral vector was further analyzed by RT-PCR after transduction into mouse fibroblast cells.
- How many constructs went into the assessment of shRNA structure and length?
See above.
- What is the length of the bar-code sequence? How are they designed / vetted? What is the extent of cross-reactivity in microarrays?
We have designed several types of bar-code sequences with sizes of 18, 25, and 34 nucleotides. Based on the results of more than 25 screens performed in the last 3 years, HT sequencing significantly outperforms the hybridization-based approach for identification of shRNA/bar-codes in genetic screen experiments. The main disadvantages of the hybridization-based approach are (1) limited dynamic range of bar-code detection (approximately 100-fold), (2) loss of hybridization signals for at least 30% of bar-codes due to a minimum 100-fold sequence-dependent difference in hybridization efficiency for different bar-codes, and (3) significant cross-hybridization between different bar-codes. HT sequencing provides digital data, which doesn't have these limitations. As a result, we have mainly focused our efforts on developing bar-codes which are compatible with HT sequencing. For shRNA libraries with complexities of up to 55K, our current 18nt bar-codes provide enough discrimination power to reliably identify each specific bar-code using the Illumina-Solexa platform with a mutation rate of approximately 0.2%. This high discrimination power was achieved by developing a unique algorithm for designing bar-codes with maximum differences in sequence and "resistance" to random mutations introduced by oligo synthesis and Illumina-Solexa HT sequencing technology.
We also designed 34 nucleotide bar-codes which can be used for identification of shRNAs by both hybridization and HT sequencing. In addition to maximum distance in sequence similarity, these 34nt bar-codes are also designed with similar GC content (50-70%) and can be used for hybridization with Agilent microarrays. However, we have not tested their performance with microarrays due to the superior performance of HT sequencing.
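The two design criteria mentioned above, maximum sequence difference between bar-codes and a bounded GC content, can be illustrated with a short Python sketch. This is not Cellecta's design algorithm, which is not described here. The function names, the three example bar-codes, and the one-mismatch tolerance are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def min_pairwise_distance(barcodes: Sequence[str]) -> int:
    """Smallest Hamming distance between any two bar-codes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def decode(read: str, barcodes: Sequence[str], max_mismatches: int = 1) -> Optional[str]:
    """Return the single bar-code within max_mismatches of the read, else None."""
    hits = [bc for bc in barcodes if hamming(read, bc) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 18-nt bar-codes, invented for this example only.
BARCODES = [
    "ACGTACGTACGTACGTAC",
    "TGCATGCATGCATGCATG",
    "GGCCAATTGGCCAATTGG",
]

if __name__ == "__main__":
    print("minimum pairwise distance:", min_pairwise_distance(BARCODES))
    for bc in BARCODES:
        print(bc, "GC fraction:", round(gc_fraction(bc), 2))
    # A read carrying one substitution is still assigned to its bar-code.
    print(decode("ACGTACGTACGTACGTAA", BARCODES))
```

A set whose minimum pairwise distance is d can still assign a read unambiguously after up to (d - 1) // 2 substitutions, which is why larger distances make bar-codes more tolerant of synthesis and sequencing errors. Hamming distance counts substitutions only; tolerating deletions would need an edit-distance measure instead.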
- Can you use an alternative bar-code design?
Yes. We are constantly improving our bar-code design based on the specific application and complexity of the library. It will not be a problem to design specific bar-codes which will work optimally for your custom libraries.
- Why did you choose short hairpin over mir30-based libraries?
The opinion that short hairpins have better knockdown efficiency than mir30-based shRNAs came from several sources:
- Only one large-scale validation study of the knockdown activity of mir30-derived shRNAs has been published. In 2008, the NCI (Cancer Genome Anatomy Project) contracted Open Biosystems to test mir30-derived constructs (3 shRNAs per gene in replicates) against 136 cancer-related genes (http://cgap.nci.nih.gov/RNAi/shRNAValidation/) in OVCAR-8 and MCF-7 cell lines. The results demonstrated that approximately 46% (in OVCAR-8 cells) and only 25% (in MCF-7 cells) of mir30 shRNA constructs gave at least 70% knockdown of target mRNAs. The TRC collection of short (GG21-S6) shRNAs was validated in several published studies [7,20]. The reported percentages of TRC-type shRNA constructs giving more than 70% knockdown are: 35% (54 shRNA constructs in both A549 and HT29 cells [7]), 38% (256 shRNA constructs in A549 and HT29 cells [20]), and 57-68% (340 shRNA constructs, mainly in A549 cells; http://www.sigmaaldrich.com/life-science/functional-genomics-and-rnai/shrna/learning-center/mission-application-data.html). We also analyzed unpublished data from large-scale validation studies performed by Sigma. Of approximately 3,000 shRNA constructs tested, the percentages with more than 70% knockdown are: 66% (A549 cells), 39% (HT29 cells), 51% (HeLa cells), and 38% (HepG2 cells).
- Several published articles from independent academic groups (e.g. Boudreau, R.L. et al. Mol. Therapy (2009)17:169-175; McBride, J.L. et al. Proc. Natl. Acad. Sci. (2008)105:5868-5873) recently compared levels of processed (mature) siRNA and levels of knockdown using a set of shRNA and mir30-based constructs designed against several genes (with the same mature sequence) and found that the level of knockdown and concentration of processed siRNA in the cells is consistently better (approximately 1.5-2-fold) for shRNA constructs.
- Based on very promising results published by developers of mir30-derived shRNAs, we tested knockdown activity of mir-based shRNA constructs (at least 10 different mir-based designs) for several control genes in the last three years. Unfortunately, in all of our studies, the standard shRNA design outperformed mir30-based designs in terms of knockdown activity by at least 50%.
In summary, we believe that the mir30-based design is a very promising tool for in vivo, ex vivo, and primary cell studies that need regulated expression of siRNAs and knockdown without the significant toxicity reported for U6-expressed shRNA constructs. But for high-throughput RNAi screens, the standard shRNA design is more reliable, with at least 50% better performance. Please also note that we chose the RNAi consortium platform (TRC collection) because of the large-scale efforts (from MIT, Sigma, and Cellecta) to validate an existing collection of in silico-predicted shRNAs. We are not aware of any serious effort from Cold Spring Harbor Laboratory (mir30 collection) or Thermo Fisher to perform a functional validation project on a genome-wide scale.
- How does your knockdown efficiency compare to that of other designs?
Based on the literature referenced above and on recent experiments to functionally test 120 of our design-validated shRNA constructs, we estimated and compared the percentage of shRNAs that were functionally validated (>70% mRNA knockdown):
- miR30 design: 20%
- 21-mer shRNA design: 50%
- Cellecta design: 65%
- Oligo Preparation
- How many oligos can be prepared per chip?
The current oligo pools can be synthesized with complexities of 6.5K, 13K, 27K, and 55K. In practice, the upper limit is set by the library complexity that can be used effectively in a screen (in our experience, no more than 27K to 55K) rather than by any limitation of oligonucleotide synthesis.
- What are the limits on the oligo length?
The maximum length is 180 nucleotides, but the best quality oligos require a length of not more than 160 nucleotides.
- Is there an optimal length for downstream PCR and cloning?
Amplification and cloning are not affected by oligo sizes of up to 200-300bp. However, our research indicates that it is important to destabilize the stem-loop structure of shRNA constructs. Otherwise, more stable structures will be depleted in the course of PCR. More stable structures are also depleted during propagation of plasmid libraries in bacteria.
- What is the oligo mutation rate?
Approximately 0.2%, i.e. two mutations/deletions in approximately 1,000 nucleotides. We have a special way to reduce the mutation rate to approximately 0.1%, but for shRNA library construction, it is not necessary.
- What is the optimal number of shRNAs per chip (i.e. shRNAs/library)?
The optimal complexity of the library is up to 27K in order to obtain statistically reliable data using the current Illumina-Solexa sequencing platform with approximately 10 x 10^6 reads per sample. Libraries with a complexity of 55K are acceptable, but duplicate samples should be used at the HT sequencing step.
- Can we provide our own shRNA designs for oligo preparation?
Yes, you can, if you believe that your design will have the best performance. Keep in mind that pilot experiments will be required in order to test the performance of your shRNA design in our HT sequencing-based pooled library platform.
- Can Cellecta deliver arrayed shRNA libraries?
Yes, we can provide you with shRNA libraries in any format you wish, but libraries in arrayed format are significantly more expensive and require a significantly longer time to develop.
- Can Cellecta deliver pooled oligos and/or arrayed oligos?
We can deliver pooled oligos synthesized by Agilent, but not arrayed.
- What are the advantages of pooled format vs. arrayed format shRNA libraries?
Advantages of pooled format RNAi screen with bar-coded shRNA libraries:
- A genome-wide RNAi screen (a set of at least 10,000 genes with at least 3 shRNA per gene) in pooled format is at least 100-fold less expensive than a similar screen in arrayed format. Realistically, it means that RNAi screens in pooled format are the only practical option for individual researchers. Even large-scale experiments with dozens of cell lines could be easily accomplished in pooled format (please refer to recent publications from David Root's group).
- Pooled format screens don't require special infrastructure (HT screening facilities with robotics) and can be performed by a single researcher.
- Pooled format screens can be used for drug target discovery in cell model systems with pathogenic organisms (at least Biosafety Level 3). It is unrealistic to perform similar screens in arrayed format.
- Digital data generated by HT sequencing of bar-codes in biological samples are ideal for statistical analysis and development of databases in a format compatible between different research groups.
- Pooled formats allow development of shRNA libraries with the same set of shRNAs in practically any vector with a choice of fluorescent marker (GFP, RFP, etc.), drug selection (Puro, Neo, Bleo, etc.), shRNA promoter (H1, U6, etc.), tet-regulated shRNA expression, or even the modification of the design of shRNA and generation of new shRNA libraries. This flexibility in the design and set of shRNAs is a clear advantage of pooled format versus arrayed format libraries.
Advantages of arrayed format RNAi screen:
- There are many biological assays (developed for HTS) that can be used only with shRNA libraries in arrayed format. For example, high content analysis can only be performed in arrayed format.
- The quality of data generated from an arrayed format screen is usually better than that from a pooled format screen. The main reason is the difficulty in maintaining complexity of shRNA constructs in pooled format screens.
- Arrayed format screens can use the established infrastructure of existing core facilities for HT screening of small molecule libraries.
Conclusions: We see the shRNA library in pooled format as a complementary tool to the shRNA library in arrayed format. Based on price considerations, it looks rather logical to perform primary (first round) screens in pooled format (if possible, due to limitations of the model system). Hits identified in the pooled format screen could be validated in the secondary screen in arrayed format and used for follow-up experiments and validation studies in animal models. We believe that a combination of pooled and arrayed strategies would provide the most flexible and cost-effective way to perform large-scale RNAi screens.
- Library Preparation: QC and Quantity
- What is the "mutation" or mis-synthesis rate? What percentage of constructs are correct as designed?
The mutation rate is approximately 0.2%. This means that for 25nt shRNAs, approximately 95% of shRNAs have the correct sequence. It is also typical to have an insert rate of at least 95% (single expected insert).
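The 95% figure follows from the per-nucleotide error rate if errors are assumed to occur independently at each position. That independence is an assumption made here for illustration, and `fraction_correct` is a hypothetical helper name.

```python
def fraction_correct(error_rate_per_nt: float, length_nt: int) -> float:
    """Expected fraction of error-free sequences of the given length.

    Assumes each nucleotide is wrong independently with the same probability.
    """
    if not 0.0 <= error_rate_per_nt <= 1.0:
        raise ValueError("error rate must be between 0 and 1")
    return (1.0 - error_rate_per_nt) ** length_nt


if __name__ == "__main__":
    # 0.2% error rate over a 25-nt shRNA sequence, as quoted above.
    print(f"{fraction_correct(0.002, 25):.1%}")
```

The same function shows why longer inserts are more exposed to synthesis errors: the expected error-free fraction falls as the length grows.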
- What percentage of designed constructs are actually in your plasmid DNA library?
We can identify approximately 90-95% of all designed inserts (each with at least 10 reads) when a sample is sequenced to a total of 10 x 10^6 reads.
- What is the representation after the PCR step as determined by sequencing? What is the error rate?
It is possible to estimate representation of individual oligos in the pool after the PCR step, but it is best to perform these analyses after the vector cloning step (see below).
- What is the representation after cloning and after the plasmid DNA prep?
The distribution in abundance level between different shRNA constructs after the plasmid library construction step is approximately 100-fold (depends on shRNA design, with better distribution for destabilized hairpins). The error rate is approximately 0.2%, i.e. 2 mutations/deletions per 1,000 nucleotides.
- What percentage of designed constructs are actually present in your viral library?
- What is the representation after virus production? How is that determined?
Representation of bar-coded shRNA constructs after the packaging step can be measured by amplifying bar-codes from viral RNA by RT-PCR followed by HT sequence analysis. However, we find it more informative to measure shRNA representation after the transduction step by amplifying bar-codes from integrated lentivectors by PCR followed by HT sequence analysis.
- What are the limitations on pool size for packaging virus (i.e. How many constructs per transfection)?
No limitation in pool size exists at the packaging step for shRNA libraries with complexities of up to 100,000. We simply increase the number of packaging cells, up to 1 x 10^8, if we need to package complex libraries.
- How did you determine the maximum complexity of the library for packaging?
By comparing representation of bar-codes (shRNAs) at plasmid library construction and after the transduction step.
- How are you determining viral titer?
If the vector has a marker (fluorescent protein or drug resistance), we measure viral titer by infection and marker selection. If the vector has no marker, we determine viral titer by measuring the number of integrated viral copies in infected cells by PCR using vector primers specific to integrated DNA.
- What is the typical representation of the library after infection?
Usually, the representation of the transduced library is very similar to that of the plasmid library. The only exceptions are constructs with cytotoxic shRNA that quickly disappear from the transduced pool if cells are grown for several days and shRNAs are expressed from wild-type (non tet-regulated) promoters.
- What quality control measures are in place at each stage of the process to determine errors or dropouts of desired triggers?
We measure quality of the plasmid library by PCR amplification, sequence analysis of 20 random clones (to determine percentage of correct inserts), and mutation rate. Additionally, we can measure the representation of shRNA constructs in the plasmid library and in the viral library after the transduction step by amplification of all bar-codes followed by HT sequence analysis.
- Vector Design
- Do you have data showing the efficiency of tet control?
Our main goal was to develop a vector that expresses a Tet repressor, marker (Puro/Bleo or GFP/RFP), and an shRNA under the H1Tet or U6Tet promoter. We successfully developed a set of Tet-regulated vectors and demonstrated at least a 10-50-fold regulation of shRNA expression with or without dox treatment in transduced cells. Tet-dependent promoters are less active than wt promoters, however. As a result, the percentage of active shRNAs will be less for Tet-regulated versus wt promoters, so we suggest having at least 10 different shRNAs per target gene in a tet-regulated shRNA library in order to minimize the chance of false negative hits. Currently, we are performing site-directed mutagenesis of the U6-Tet promoter (about 6,000 mutants) with the goal of developing a modified U6-Tet promoter with activity similar to the U6wt promoter.
- Can we use our own vector instead?
Yes, we can certainly use your vector backbone. However, we will probably need to modify the shRNA expression cassette (U6 or H1 promoter) in order to incorporate shRNA cloning and Gex primer binding sites for HT sequencing. If not already present, selection markers would have to be incorporated as well as the Tet Repressor gene if Tet-regulated promoters are used.
- How do various vectors compare?
All lentivectors are very similar to each other and have very similar backbone structures. The main differences are what elements are cloned inside (see above).
- Sequencing
- What is the bottleneck step for multiplexing?
The bottleneck for multiplexing at the sequencing step is the number of reads. For a library of 27K complexity and 10 x 10^6 reads per sample, you will get approximately 350 reads per shRNA/bar-code on average. Because the abundance of different constructs always varies (see above) and about 50 reads is the minimum for a statistically significant count, a 27K library is probably the limit for a genetic screen with the current technology. A 27K bar-coded library will consist of either 27,000 mono-barcoded or 9,000 triple-barcoded shRNAs.
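The read-depth arithmetic behind this limit can be sketched in a few lines of Python. The function names are hypothetical, and the inputs are the figures quoted above: 10 million reads per sample and a minimum of 50 reads per bar-code.

```python
def mean_reads_per_barcode(total_reads: int, complexity: int) -> float:
    """Average number of reads per bar-code if reads were spread evenly."""
    return total_reads / complexity


def fold_margin(total_reads: int, complexity: int, min_reads: int = 50) -> float:
    """How many fold below the mean a bar-code can fall and still reach min_reads."""
    return mean_reads_per_barcode(total_reads, complexity) / min_reads


if __name__ == "__main__":
    total_reads = 10_000_000  # reads per sample, as quoted above
    for complexity in (27_000, 55_000):
        mean = mean_reads_per_barcode(total_reads, complexity)
        margin = fold_margin(total_reads, complexity)
        print(f"{complexity} bar-codes: {mean:.0f} reads on average, "
              f"{margin:.1f}-fold margin above 50 reads")
```

With these inputs a 27K library averages about 370 reads per bar-code, close to the figure of roughly 350 given above, so a construct can sit about 7-fold below the mean before it drops under 50 reads. At 55K the margin shrinks to under 4-fold, which fits the recommendation to sequence duplicate samples for libraries of that size.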
- Can you provide us with raw data for triplicate bar-codes, so that we can see if this is useful?
Triplicate bar-codes and biological repeats are very useful and important in the discrimination between false-positive and true-positive hits when performing statistical analysis. The trade-offs, however, are either an increased cost for HT sequencing or using a lower complexity shRNA library (for example 5-10K).
- Have you sequenced hairpins?
Yes, we have sequenced hairpins for QC (see Library Preparation above). For genetic screens, however, we prefer to sequence unique bar-codes which are present in each shRNA construct and don't require amplification through the stem-loop structure or use of a primer complementary to the loop sequence.
- Do you have a head to head comparison of sequencing vs. hybridization?
Yes. Our first generation shRNA libraries were based on design of shRNAs complementary to Affymetrix HG-U133 or Exon 1.0 arrays. We generated a large amount of data, and it revealed serious problems in dynamic range, cross-hybridization, and low efficiency of hybridization due to differences in sequence-dependent hybridization efficiency. HT sequencing overcomes all of these hybridization-based limitations. The limitations of HT sequencing are the number of reads per sample (which is being constantly increased due to new technologies) and, in positive-selection screens, the difficulty of revealing less-abundant hits if a few hits are extremely abundant in the screen. For example, in screens for TGF-β-resistant cells, the bar-codes for TGF-β receptor shRNAs typically comprise approximately 95% of all sequences, making it difficult to reveal other effectors of TGF-β signaling. Such a problem can be solved at the PCR amplification step before HT sequencing and re-sequencing of samples.
- Do you offer sequencing services? What is the price?
Yes, we can perform HT sequencing for a fee, including sample preparation, HT sequencing, conversion of raw data to number of reads, and statistical data analysis.
- Pooled Screens
- What are the limits on complexity in a pooled screen? How was this determined?
As we have discussed above, the practical limit in shRNA library complexity is approximately 25K-50K, primarily determined by 1) the need for maintaining even representation of shRNA constructs in the library by keeping the ratio of cells/shRNA at or above 100 and 2) the HT sequencing limit of 10 x 10^6 reads/sample.
- What is the recommended MOI and enrichment in a screen (or multiplicity of clones)?
We usually infect cells at an MOI of 0.2 (negative selection screens) to 0.5 (positive selection screens) with at least 100-200 infected cells/shRNA, followed by marker selection and then selection for specific phenotype with about 1000 infected cells/shRNA. At the phenotypic selection step, the enrichment factor ultimately depends on the screen. In positive selection screens, over 1000-fold enrichments are easily achieved. In negative selection screens, the enrichment level is mostly affected by the percentage of infected cells by a given shRNA construct with a high enough shRNA expression level to give the "counter-selectable" phenotype (e.g., loss of viability). If 90% of cells carrying the bar-code for a given shRNA express such shRNA at a high enough level to give the phenotype, the maximum achievable enrichment is 10-fold. If only 50% of cells carrying the bar-code give the phenotype, the maximum achievable enrichment is 2-fold.
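The numbers in this answer can be reproduced with a short Python sketch. The enrichment limit follows directly from the text: if a fraction f of cells carrying a bar-code show the phenotype, at most 1 / (1 - f) depletion is possible. The MOI part assumes that integrations per cell follow a Poisson distribution, a common simplification that is not stated in the answer above. All function names are hypothetical.

```python
import math


def max_fold_change(responding_fraction: float) -> float:
    """Largest achievable depletion of a bar-code in a negative selection screen.

    responding_fraction is the share of cells carrying the bar-code that
    express enough shRNA to show the counter-selectable phenotype.
    """
    if not 0.0 <= responding_fraction < 1.0:
        raise ValueError("responding_fraction must be in [0, 1)")
    return 1.0 / (1.0 - responding_fraction)


def fraction_infected(moi: float) -> float:
    """Share of cells with at least one integration (Poisson assumption)."""
    return 1.0 - math.exp(-moi)


def single_copy_fraction(moi: float) -> float:
    """Share of infected cells carrying exactly one integration (Poisson assumption)."""
    return moi * math.exp(-moi) / fraction_infected(moi)


if __name__ == "__main__":
    for f in (0.9, 0.5):
        print(f"{f:.0%} responding cells: up to {max_fold_change(f):.0f}-fold")
    for moi in (0.2, 0.5):
        print(f"MOI {moi}: {fraction_infected(moi):.0%} infected, "
              f"{single_copy_fraction(moi):.0%} of those with a single copy")
```

Under the Poisson assumption, an MOI of 0.2 leaves roughly 90% of infected cells with a single integrated construct, while an MOI of 0.5 lowers that to under 80%. This gives one way to read the lower MOI suggested for negative selection screens, where each cell's phenotype should trace back to one shRNA.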
- What cell lines have you screened?
To date, we have performed more than 25 screens in a wide range of common human and mouse cancer cell lines.
- How does library representation vary across cell lines?
We haven't seen significant differences in library representation between different cell lines. The representation drops when fewer than 100 original clones/shRNA are transduced at the infection step or when infected cells are passaged in culture at a ratio of fewer than 500 infected cells/shRNA.
- Delivered product
- With the pooled plasmids, how many transfections can we do to package more virus? With the virus provided, what is the titer and how much is provided?
We usually provide 0.5 mg of plasmid library DNA which is enough material to package viral particles for at least 100-200 screens. We can provide any amount of packaged virus you would need. In order to perform a triplicate screen with a 27K complexity library, you would need approximately 1-2 x 10^7 ifu.
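The amount of virus quoted for a triplicate screen is consistent with the coverage of 100-200 infected cells per shRNA recommended under Pooled Screens. A minimal sketch, assuming that at low MOI one infectious unit gives roughly one transduced cell (a simplification; `infectious_units_needed` is a hypothetical name):

```python
def infectious_units_needed(complexity: int, infected_cells_per_shrna: int,
                            replicates: int) -> int:
    """Approximate ifu required to reach the target coverage in every replicate.

    Assumes one infectious unit yields about one transduced cell (low MOI).
    """
    return complexity * infected_cells_per_shrna * replicates


if __name__ == "__main__":
    for coverage in (100, 200):
        ifu = infectious_units_needed(27_000, coverage, replicates=3)
        print(f"{coverage} cells/shRNA: {ifu:.1e} ifu")
```

For a 27K library in triplicate this gives about 0.8 x 10^7 to 1.6 x 10^7 ifu, in line with the estimate above.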
- How many screens can be performed with the virus that you provide?
See above.
- How many packaging reactions can be done with the DNA that is provided?
See above.
- Will you perform the repeat viral packaging for an additional fee?
Yes, we can.
- Licensing/IP
- Will there be any rights associated with any findings from our experiments, or is this strictly fee-for-service?
This is strictly fee-for-service, so all libraries and data belong to you. Our business strategy is to help you in your research using our expertise in genetic screens.
- Is the oligo printing Agilent technology? Is there reach-through by Agilent?
We have a licensing agreement with Agilent to use their oligo pool technology and develop custom libraries. The cost we pay for oligo pools includes the licensing rights for you to use these libraries for research purposes without transfer to a third party. Please read the Cellecta-Agilent License Statement for the terms of use.
- Experience
- How many libraries has your company made?
We have developed more than 200 shRNA, mir30, peptide, and shRNA-target libraries in dozens of different vectors. To our knowledge, we are the first and probably the most experienced group developing shRNA libraries.
- How many repeat customers do you currently have?
We consistently work with 5-7 research groups. Here are some references:
Andrei Gudkov, Cleveland BioLabs, Inc.
Costas (Gus) Frangou, Fred Hutchinson Cancer Research Center (FHCRC)
Peiqing Sun, The Scripps Research Institute
Yutaka Eguchi, Osaka University
To protect the privacy of our collaborators and prevent spam from contaminating their email boxes, we don't display any contact information online. Please contact us for any collaborator email or phone numbers.
We also work with many pharmaceutical and biotech companies but do not have permission to disclose any information about them.
If you need an answer to a question not listed here, please email us or call us at 650-938-3910.
