"Recovery of DNA and RNA from Microorganisms in Water Samples" by Catherine Dougherty

by Catherine Dougherty, Fairleigh Dickinson University

Abstract: This study utilized a systematically narrowing method of detection to identify the presence of Escherichia, Shigella, and Salmonella in a local source of surface water fed by wastewater effluent. A technique was developed to isolate DNA and RNA from microorganisms in those samples and to amplify regions of the genetic material by polymerase chain reaction (PCR) using species-specific primers. Preliminary research found that adapting the commercial DNA/RNA AllPrep Kit to incorporate DNeasy PowerWater bead-beating technology should provide the greatest amount of quality DNA and RNA from water samples with consideration of cost and ease of use. Initial trials successfully isolated DNA from E. coli-spiked water using the PowerWater Kit. Because environmental samples have low biomass, PCR was investigated as a method of nucleic acid amplification. Previously validated species-specific primers targeting organisms in the Enterobacteriaceae family were selected to target the lacZ, lamB, tuf, invA, eaeA, and SLT-I genes to differentially detect Escherichia, Shigella, and Salmonella spp. All primers were tested using E. coli-spiked samples, though only those for the lacZ and lamB genes successfully produced products identifying the target organism. No primers successfully amplified DNA from the environmental samples collected. Future studies aim to establish a limit of detection to determine the minimum concentration required to visualize a PCR product through gel electrophoresis. Primers will also be optimized so the novel protocol can be implemented. Ultimately, this protocol will be used to monitor fluctuations in freshwater biodiversity over time with respect to environmental and anthropogenic variables.


Freshwater ecosystems are essential for the growth and fitness of some of the most diverse and abundant organisms in the biosphere. These species are organized into three domains, Bacteria, Archaea, and Eukarya, based on differences in the cell's ribosomal RNA nucleotide sequence, membrane lipid structure, and sensitivity to antibiotics (Kaiser, 2018). This study focuses on Bacteria, which include the first and most abundant life forms on Earth and can be found in nearly every environment on the planet. Though about 5% of bacteria are pathogenic, most are either harmless or beneficial to ecosystems, as they play a crucial role in stimulating and replenishing nutrients in freshwater food webs. As primary decomposers and mineralizers, they convert organic compounds into their inorganic components, thus cycling biologically active elements throughout the aquatic ecosystem (Newton et al., 2011). They also contribute to biomass production, creating a nutrient sink which can be utilized by other organisms for energy (Newton et al., 2011; Sanderman & Amundson, 2003), and to trophic coupling, by which microbial trophic interactions are linked to those of aquatic macroorganisms (Moore et al., 2018; Newton et al., 2011). Lastly, bacteria are highly sensitive to biotic and abiotic changes in their environment, such as extreme temperatures or pollution, so they can be utilized to monitor elemental fluctuations, water quality, and climate change over time.

All bacteria are anatomically similar in that their cell walls contain peptidoglycan, a substance made of complex polysaccharide chains interlinked with short peptides that coats the outside of the plasma membrane. Bacteria can be broadly divided into two groups, gram-positive and gram-negative, based on differences in this cell wall structure. Gram-positive bacterial cell walls contain a thick layer of peptidoglycan, while gram-negative bacterial cell walls have a much thinner layer of peptidoglycan and are surrounded by an outer membrane (Sizar & Unakal, 2019). Gram-negative bacteria can be further classified as coliform or non-coliform. Coliform bacteria are members of the family Enterobacteriaceae and are considered facultatively anaerobic, non-spore-forming rods that contain the lacZ gene for β-galactosidase, the enzyme used to ferment lactose with the production of acid and gas (Gerba, 2015; Octavia & Lan, 2014). Non-coliform bacteria lack this gene and thus cannot cleave lactose into glucose and galactose (Gerba, 2015). Total coliform bacteria comprise a wide variety of generally harmless bacteria found in soil, water, vegetation, and animal or human waste.

Fecal coliforms are a subdivision of total coliforms commonly found in the intestines and in animal and human waste (Figure I). Presence of fecal coliforms in drinking water suggests potential fecal contamination, which poses a significant health threat to consumers (U.S. Environmental Protection Agency, 2012). Escherichia coli is a species of fecal coliform commonly found in the gastrointestinal tract of mammals, where it exists in a mutualism with its host (Figure I). Over time, some E. coli clones evolved virulence factors which led to the development of new pathotypes of bacteria which can cause disease. The most common of these pathotypes is enterohemorrhagic E. coli serotype O157:H7, which produces shiga-like cytotoxin (Imtiaz et al., 2013; Mead & Griffin, 1998) that results in serious illnesses including diarrhea, hemorrhagic colitis (Cohen & Giannella, 1992), hemolytic uremic syndrome (Canpolat, 2015), and, in the most severe cases, death (Mead & Griffin, 1998).

Two other genera within the Enterobacteriaceae family commonly found in aquatic environments are Shigella and Salmonella. Shigella are gram-negative, facultatively anaerobic, non-spore-forming rods that do not ferment lactose (Hale & Keusch, 1996). Research has shown that the nucleotide sequences of Shigella and E. coli are an estimated 80-90% similar, thus they are often treated as a single genetic species and cannot easily be distinguished based on DNA sequences alone (Brenner et al., 1972; Devanga Ragupathi et al., 2018; Maheux et al., 2009, 2011). Despite the many similarities, E. coli and Shigella remain in two separate genera based on differences in biochemical and pathogenicity tests (Maheux et al., 2011). Shigella can be transmitted to water through fecal contamination and, if consumed, the pathogen can cause shigellosis or bacillary dysentery (Devanga Ragupathi et al., 2018; Hale & Keusch, 1996). Like Shigella, Salmonella are gram-negative, facultatively anaerobic, non-spore-forming rods, and they are a major cause of foodborne illness (Kasturi & Drgon, 2017). The consumption of raw or undercooked meat or eggs, as well as drinking water contaminated with Salmonella, can lead to a variety of serious illnesses including gastroenteritis, enteric fevers, bacteremia, septicemia, and focal infections (Bush & Perez, 2020; Giannella, 1996).

In summary, microorganisms have been found to play substantial roles in natural water sources through nutrient cycling, trophic coupling, and stimulating aquatic food webs (Newton et al., 2011). This microbial activity emits and absorbs greenhouse gases and fluctuates with environmental changes, thus making microorganisms vital to the study of climate change (Dutta & Dutta, 2016). Furthermore, microbes have also been used to monitor water quality in the ecosystem. More specifically, Enterobacteriaceae is a family of bacteria commonly used as indicators in environmental microbiology. As described previously, total coliforms such as Escherichia are used to indicate contamination from an outside source, while specific species and pathotypes can suggest fecal contamination or particular waterborne pathogens (U.S. Environmental Protection Agency, 2012). Similarly, non-coliforms such as Shigella and Salmonella can be transmitted through the water and cause serious diseases. These attributes make these genera of microorganisms ideal measures of the efficacy of wastewater treatment. As such, this research aims to monitor the presence of Enterobacteriaceae in local sources of surface water fed by wastewater effluent, and to map out fluctuations in diversity due to climate change and anthropogenic effects (Figure II).

Analysis of Commercial Extraction Methods

Research on the diversity of waterborne organisms in natural sources has revealed a myriad of microbial species in varying abundance; however, work remains to better understand how this diversity varies over time due to environmental and anthropogenic impacts. Because these microorganisms are so small, it can prove especially difficult to survey and monitor for signs of contamination. Meanwhile, the need to develop a cost-effective, replicable methodology becomes increasingly urgent. As of 2019 it was reported that approximately 2.1 billion people, or 29% of the world's population, do not have access to safe drinking water, an issue which is responsible for an estimated 1.2 million deaths each year (Ritchie & Roser, 2019). This is particularly serious for low-income countries, where 6% of deaths are the result of water contamination yet water quality cannot be affordably monitored over extended periods of time, even with assistance from nonprofit organizations (Ritchie & Roser, 2019).

To overcome the size challenge microorganisms pose, scientists utilize extraction techniques which isolate nucleic acids from samples and use this genetic material to characterize species of microorganisms and to hypothesize community relationships with macroorganisms. Traditional methods for DNA or RNA extraction are time-consuming, require large quantities of reagents, and are prone to contamination (Guillén-Navarro et al., 2015; Purdy, 2005; Tan & Yiap, 2009; Yang et al., 2010), while many modern methods, though effective, are expensive and complex to perform (Tan & Yiap, 2009; Triant & Whitehead, 2009; Yang et al., 2010). Through detailed analyses of each of these nucleic acid extraction protocols, it has been concluded that the most widely accepted extraction methods utilize commercial extraction kits which employ multiple lysis and purification techniques. Given this information, the first objective of this research was to perform a meta-analysis of common commercial extraction techniques and evaluate each with respect to sample type, cost, ease of use, and nucleic acid extraction ability. The ultimate goal was to identify a method that allowed for the simultaneous extraction of DNA and RNA to identify a wide variety of microorganisms, particularly Enterobacteriaceae, in a single trial. This would minimize time, resources, cost, and the likelihood of cross-contamination. All kits examined are shown in Table I.

Ultimately, it was confirmed that no single kit allows for the simultaneous extraction of both DNA and RNA from water samples in a cost-effective manner, as supported by Yang et al. (2010). Based on these findings a novel approach was proposed that combines the protocols of the AllPrep DNA/RNA Kit (Qiagen) and the DNeasy PowerWater Kit (Qiagen). The AllPrep DNA/RNA Kit allows for the simultaneous extraction and purification of DNA and RNA isolated from tissue samples. Through this procedure cell or tissue samples are first lysed by a denaturing buffer which deactivates DNases and RNases. The lysate then passes through an AllPrep spin column which allows for the selection and binding of genomic DNA from the sample. This can then be washed and eluted to isolate pure DNA. RNA is purified from the AllPrep column flow-through using an RNeasy Mini spin column (Qiagen, 2005).

Following a similar protocol, the DNeasy PowerWater Kit also isolates genomic DNA, but from water samples. The sample is first filtered onto a filter membrane. This is then added to a specialized PowerWater bead-beating tube where the filter and sample are lysed through vortexing in a lysis buffer designed to enhance microorganism isolation. Total genomic DNA can be gathered in an MB spin column where it is washed and eluted to complete the pure DNA extraction process (Qiagen, 2017). Because the AllPrep DNA/RNA Kit allows for simultaneous extraction of DNA and RNA, and the PowerWater Kit is optimized for isolation from water samples, one possible methodology that will be considered for this experiment is adapting the AllPrep Kit to incorporate the PowerWater bead-beating tube. The tube could be used to break apart the filter and sample prior to extraction following the AllPrep protocol. Once the DNA has been extracted, a polymerase chain reaction (PCR) can be performed using species-specific primers to target and identify the presence of bacterial species expected to be present in the samples.

Materials and Methods

Water Samples

Experiments were performed using both water spiked with E. coli in a blind study and natural water samples collected from the Whippany River. Control samples were prepared by spiking deionized water with Escherichia coli HB101 K12 at a concentration high enough to ensure a positive result. Environmental water samples were collected during rainfall in mid-September 2018 and early November 2019 from three sample sites as follows: effluent samples were collected directly from the surface water of the source, an output pipe running from the Morristown Wastewater Treatment Plant in Morristown, New Jersey and emptying into the river; upstream samples were collected 50 feet upstream of the source, directly underneath a road overpass; and downstream samples were collected 50 feet downstream of the source in the middle of the river. All surface water was gathered in sterile 500 ml Pyrex media bottles and stored at 4°C until nucleic acid extraction.

Recovery of DNA

Control and environmental DNA extractions from all 500 ml samples were performed using the DNeasy PowerWater Kit (Qiagen) according to the manufacturer’s recommendations using 0.45 µm filters. Following the initial filtration step, filter membranes from all environmental samples were lifted from opposite edges using sterile forceps and placed face down on appropriately labelled Trypticase soy agar (TSA) or Hektoen enteric agar (HEK) plates to visualize bacterial growth. Membranes were lightly patted to ensure transfer of microorganisms; plates were incubated at 30°C or 37°C, respectively, for 24 hours and then stored at 4°C. Membranes were then used to continue the extraction, and at the conclusion of the protocol, extracted DNA from all samples was frozen at -20°C in 20 µl aliquots.

PCR Amplification

PCR amplification was performed using a DNA thermal cycler and Invitrogen Platinum Taq DNA polymerase (ThermoFisher Scientific) following the manufacturer’s suggested protocol for 25 µl reaction mixtures. Briefly, the PCR master mix was prepared using 2.5 µl 10x PCR reaction buffer, 0.75 µl 50 mM MgCl2, 0.5 µl 10 mM dNTP mix, 0.1 µl Platinum Taq polymerase, 1 µl isolated DNA, 0.5 µl each of the forward and reverse primers, and RNase-free water to bring the total volume to 25 µl. Primer sequences and their respective gene and species targets are shown in Table II. In order to identify multiple species in a single run of the thermal cycler, annealing temperatures were identified between those reported in literature for the primers of interest in each trial. All PCR mixtures were subjected to an initial denaturation at 94°C for 3 minutes. This was followed by 25 PCR cycles of the following conditions: denaturation at 94°C for 30 seconds; primer annealing at 50°C (lacZ and lamB gene primer pairs) or 56°C (tuf, eaeA, and invA gene primer pairs) for 30 seconds; and DNA extension at 72°C for 2 minutes. PCR products were stored at -20°C prior to analysis with gel electrophoresis.
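
The reaction set-up described above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the original protocol: it takes the per-reaction volumes and stock concentrations stated in the text, derives the volume of water needed to reach 25 µl, the final MgCl2 and dNTP concentrations (C1V1 = C2V2), and the total programmed hold time of the cycling conditions. The function and variable names are hypothetical, and the hold time ignores ramping between temperatures, which depends on the thermal cycler.

```python
# Per-reaction volumes (µl) as stated in the text for a 25 µl reaction.
REACTION_VOLUME_UL = 25.0
COMPONENTS_UL = {
    "10x PCR reaction buffer": 2.5,
    "50 mM MgCl2": 0.75,
    "10 mM dNTP mix": 0.5,
    "Platinum Taq polymerase": 0.1,
    "isolated DNA": 1.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
}


def water_volume_ul(components=COMPONENTS_UL, total=REACTION_VOLUME_UL):
    """Volume of RNase-free water needed to reach the total reaction volume."""
    return round(total - sum(components.values()), 2)


def final_concentration(stock, stock_volume_ul, total=REACTION_VOLUME_UL):
    """Final concentration after dilution (C1 * V1 = C2 * V2), in the stock's units."""
    return stock * stock_volume_ul / total


def programmed_hold_seconds(initial_denaturation_s=180, cycles=25,
                            denature_s=30, anneal_s=30, extend_s=120):
    """Sum of programmed hold times only; ramp times are not included."""
    return initial_denaturation_s + cycles * (denature_s + anneal_s + extend_s)


if __name__ == "__main__":
    print("Water per reaction (µl):", water_volume_ul())
    print("Final MgCl2 (mM):", final_concentration(50, 0.75))
    print("Final dNTP mix (mM):", final_concentration(10, 0.5))
    print("Programmed hold time (min):", programmed_hold_seconds() / 60)
```

Under these stated volumes the listed components account for 5.85 µl, so the remainder of each reaction is water.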

Visualization of Products

PCR-amplified products were visualized using gel electrophoresis. A 0.9% agarose gel was prepared by mixing 0.90 g of agarose with 100 ml of 1X TAE buffer and stained with 10 µl SYBR DNA gel safe stain. PCR reaction mixtures were thawed and 4.5 µl of 6x loading dye was added to each. From each mixture, 12 µl was loaded into a well, and the gel was run in 1X TAE buffer at 200 V and 110 mA for approximately 45 minutes. Final gels were visualized under a UV transilluminator and photographed.
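
As a quick arithmetic check on the gel preparation above, the Python sketch below computes the agarose percentage (w/v) and the final strength of the loading dye once it is mixed with a reaction. It is a minimal illustration, not part of the original methods; the function names are hypothetical, and it assumes the dye was added to the full 25 µl reaction volume.

```python
def percent_w_v(mass_g, volume_ml):
    """Weight/volume percentage: grams of solute per 100 ml of solution."""
    return 100.0 * mass_g / volume_ml


def agarose_mass_g(percent, volume_ml):
    """Mass of agarose needed for a gel of the given percentage and volume."""
    return percent * volume_ml / 100.0


def final_dye_strength(stock_x, dye_ul, sample_ul):
    """Final dye strength (in 'x') after mixing dye into a sample."""
    return stock_x * dye_ul / (dye_ul + sample_ul)


if __name__ == "__main__":
    # Values from the text: 0.90 g agarose in 100 ml of 1X TAE.
    print("Gel percentage (% w/v):", percent_w_v(0.90, 100))
    # Assumption: 4.5 µl of 6x dye is added to the full 25 µl reaction.
    print("Loading dye strength (x):", round(final_dye_strength(6, 4.5, 25), 2))
```

With these inputs the dye ends up slightly below 1x, which is generally close enough for loading.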

Results and Discussion

Preliminary research focused on studying and comparing methodologies for nucleic-acid extraction and purification. Because the most popular and efficient techniques utilize commercial extraction kits, the most common were investigated to determine which would yield the best results with consideration of cost, ease, and ability to extract both DNA and RNA from water samples over an extended period of time. Unfortunately, from the final list depicted in Table I, it was concluded that no single kit allows for the simultaneous extraction of DNA and RNA from water samples in a cost-effective manner. The two most viable options included the AllPrep DNA/RNA Kit (Qiagen) and the DNeasy PowerWater Kit (Qiagen). The AllPrep Kit allows for the simultaneous extraction of both DNA and RNA from samples which proves cost-effective and increases the ease of the experiment (Table I). However, it is designed to isolate material from cell and tissue samples, not water.

Alternatively, the PowerWater Kit is optimized to increase yields from low biomass samples and presents an easy-to-follow protocol which allows for the isolation of large quantities of high-quality DNA from samples, though it does not allow for the simultaneous extraction of both DNA and RNA (Table I). As such, it was determined that the best long-term option would be a novel extraction technique which combines the protocols of the AllPrep DNA/RNA and DNeasy PowerWater Kits to meet all of the extraction criteria for this experiment. With consideration of the protocols for each kit, it was planned to adapt the AllPrep Kit to incorporate the PowerWater bead-beating tube to break apart the filter and sample prior to extraction using the AllPrep protocol, thus effectively extracting both DNA and RNA from water samples.

Spiked Samples

Initial tests focused on perfecting a single nucleic acid extraction kit prior to experimenting with a novel technique. As such, the first experiment tested the efficacy of the DNeasy PowerWater Kit using deionized water spiked with Escherichia coli HB101 K12 in a blind test. One sample was spiked with E. coli while the other was left unaltered. The extraction was performed following the manufacturer’s instructions and the results were visualized using gel electrophoresis (Figure III). Only lanes filled with DNA isolated from Sample A showed fragments, which indicates that Sample A was spiked with such a high concentration of E. coli HB101 K12 that the gel was unable to separate out the fragments, resulting in a single blurry band for both digested and undigested DNA (Lanes 1-3). Lanes with material from Sample B showed no DNA fragments, confirming that Sample B was not spiked with E. coli. These results confirm that the DNeasy PowerWater Kit effectively extracts DNA from high concentrations and support the decision to incorporate this protocol into the novel extraction technique.

While the spiked sample contains a very high concentration of bacteria, environmental samples are likely much more dilute, and thus require amplification through polymerase chain reactions prior to downstream applications. The literature investigation identified six well-researched genes of interest which can be used to test for the presence of three genera within the Enterobacteriaceae family, Escherichia, Shigella, and Salmonella, in water samples. The selected primer pairs were: ZL-1675 and ZR-2025 (Bej et al., 1990), used to target the lacZ gene and detect total coliform bacteria; BL-4910 and BR-5219 (Bej et al., 1990), used to target the lamB gene and detect Salmonella, Escherichia, and Shigella; TEcol553 and TEcol754 (Maheux et al., 2009), used to target the tuf gene and detect Escherichia and Shigella; InvA F and InvA R (Kasturi & Drgon, 2017), used to target the invA gene and detect Salmonella enterica; ConceaeA F and ConceaeA R (Barak et al., 2005), used to target the eaeA gene and detect pathogenic E. coli; and SLT-I F and SLT-I FR (Imtiaz et al., 2013), used to target the SLT-I gene and detect the pathogenic serogroup of E. coli O157:H7 (Table II).

To test the effectiveness of these primer sets, a PCR was performed on the DNA isolated from Sample A in the blind E. coli study, and products were visualized through gel electrophoresis (Figure IV). Sample B was included as a control. As anticipated, no PCR products were observed from material extracted from Sample B (Lanes 2-12). Two fragments less than 500 bp were noted from Sample A amplified using the ZL-1675 and ZR-2025, and BL-4910 and BR-5219 primer pairs (Lanes 2-3), consistent with the expected 264 and 309 bp lengths (Table II). The PCR products in Lane 2 demonstrate that the lacZ gene was successfully targeted and confirm that the water was spiked with a coliform bacterium. Likewise, the amplified DNA in Lane 3 demonstrates that the lamB gene was also successfully targeted and narrows the options of potential species to a member of the genera Salmonella, Escherichia, or Shigella. This agrees with the known target species of Escherichia coli HB101 K12.
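
The narrowing logic applied to these results can be sketched in a few lines of Python. This is an illustrative reading of the primer panel as described in the text, not the author's analysis pipeline: the genus assignments are taken from the stated primer targets, the names `GENUS_TARGETS` and `interpret` are hypothetical, and only positive bands are used, since a missing band may reflect a failed primer rather than a true absence.

```python
# Genus-level targets of each gene, as described in the text.
# lacZ is handled separately because it flags total coliforms, not a genus.
GENUS_TARGETS = {
    "lamB": {"Salmonella", "Escherichia", "Shigella"},
    "tuf": {"Escherichia", "Shigella"},
    "invA": {"Salmonella"},
    "eaeA": {"Escherichia"},
    "SLT-I": {"Escherichia"},
}


def interpret(positive_genes):
    """Narrow candidate genera by intersecting the targets of positive genes.

    Returns a dict with a coliform flag and a sorted list of candidate
    genera, or None for the genera if no genus-level gene was positive.
    """
    positives = set(positive_genes)
    candidates = None
    for gene in positives:
        targets = GENUS_TARGETS.get(gene)
        if targets is None:
            continue  # lacZ or an unrecognised gene carries no genus information
        candidates = set(targets) if candidates is None else candidates & targets
    return {
        "coliform": "lacZ" in positives,
        "candidate_genera": sorted(candidates) if candidates is not None else None,
    }


if __name__ == "__main__":
    # Bands observed for Sample A: lacZ (Lane 2) and lamB (Lane 3).
    print(interpret(["lacZ", "lamB"]))
```

For the two bands observed in Sample A, the sketch reports a coliform belonging to Escherichia, Salmonella, or Shigella, matching the interpretation given above; a positive tuf band would have narrowed this further to Escherichia or Shigella.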

With this in mind, the tuf gene should have also been targeted by the TEcol553 and TEcol754 primers, thus showing positive detection of Escherichia and Shigella. Contrary to this expectation, no bands appeared in Lane 4, showing that no DNA was amplified using the primer pair targeting the tuf gene. Additionally, no DNA was amplified using primer pairs targeting the eaeA or invA genes in Sample A (Lanes 5-6); this was expected given that these genes identify Salmonella enterica and pathogenic E. coli, neither of which describes the known target species. It is important to consider that, though these genes are not found in E. coli HB101 K12, it can only be assumed based on literature that this is the reason bands were not observed; in order to confirm that this is the true cause, the primers would need to be validated through other spiked samples with known concentrations of the appropriate bacteria to guarantee a positive test. At this time such tests cannot be performed as this is a BSL1 teaching facility which lacks access to pathogenic species for PCR validation.

In order to identify multiple species in a single run of the thermal cycler, annealing temperatures were identified between those reported in literature for the primers of interest in each trial. The annealing temperature tested for TEcol553 and TEcol754 was 56°C, though the reported ideal annealing temperature for this primer pair was 58°C (Maheux et al., 2009). Annealing temperature is one factor that can influence the specificity and sensitivity of DNA amplification. If the annealing temperature is too high, primers fail to bind to the template, and if it is too low, the primers may bind non-specifically to the template (Bio-Rad Laboratories, 2020; Sipos et al., 2007). The annealing temperature should be relatively close to the melting temperature (Tm) of the primers. The Tm values for TEcol553 and TEcol754 are approximately 52.6°C and 48.9°C, respectively (Table II). Based on this information, the PCRs of DNA isolated from water spiked with Escherichia coli HB101 K12 using tuf gene primers were repeated using temperatures surrounding the ideal temperature and the melting temperatures of the primers. Results were visualized through gel electrophoresis (Figure V). No PCR products were found for samples run with an annealing temperature of 50°C (Lanes 2-3), 58°C (Lanes 6-7), or 60°C (Lanes 8-9), nor for a repeat trial using 56°C (Lanes 4-5). These results indicate that the annealing temperature likely was not the factor impeding detection of Escherichia and Shigella. Other variables that impact DNA amplification through polymerase chain reaction include primer concentration, Mg2+ concentration, and the type of DNA polymerase used (Imtiaz et al., 2013). Future studies will examine these factors to determine if any impeded the ability of TEcol553 and TEcol754 primers to target the tuf gene and correctly identify Escherichia and Shigella as the target species.
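
To make the temperature comparison above easier to follow, the Python sketch below tabulates how far each tested annealing temperature sits from the two primer melting temperatures given in the text. It is a bookkeeping aid only, with hypothetical names; it does not predict amplification, and Tm estimates themselves vary with the calculation method and buffer conditions.

```python
# Values stated in the text for the tuf gene primer pair.
PRIMER_TM_C = {"TEcol553": 52.6, "TEcol754": 48.9}
TESTED_ANNEALING_C = [50, 56, 58, 60]
REPORTED_IDEAL_C = 58  # Maheux et al., 2009, as cited in the text


def offsets_from_tm(tested, primer_tm):
    """For each tested temperature, return its offset (°C) from each primer Tm."""
    return {
        temp: {name: round(temp - tm, 1) for name, tm in primer_tm.items()}
        for temp in tested
    }


if __name__ == "__main__":
    for temp, offsets in offsets_from_tm(TESTED_ANNEALING_C, PRIMER_TM_C).items():
        marker = " (reported ideal)" if temp == REPORTED_IDEAL_C else ""
        print(f"{temp}°C{marker}: {offsets}")
```

Every tested temperature lies above the lower of the two stated Tm values, which is one observation future optimization trials could take into account alongside primer and Mg2+ concentration.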

Environmental Studies

Having confirmed the efficacy of the DNeasy PowerWater Kit, as well as the accuracy of the primer sets, the method could then be used to extract and analyze nucleic acids from environmental samples. Freshwater samples were gathered in mid-September 2018 and early November 2019 from three sites in the Whippany River. Effluent samples were collected from an output pipe running from the Morristown Wastewater Treatment Plant into the river, while the remaining samples were collected 50 feet upstream and