Human DNA Analysis

Human DNA sequence analysis has broad applications. Advances in post-sequence processing and analysis could help solve critical public health and law enforcement problems.

The law enforcement community often deals with complex evidentiary samples. The inability to accurately process samples containing multiple contributors impairs the interpretation of sequence data and hinders efforts to generate data that meet current DNA profiling standards.

Similarly, the healthcare industry faces challenges posed by another type of mixed DNA sample: detecting and identifying microbial pathogens in the presence of human host DNA. Generally, the pathogen DNA concentration in a sample is so low relative to the human host DNA that it is difficult to unambiguously identify the pathogen in the raw sequence reads and, therefore, to diagnose the disease.

Increasing the impact of human DNA analysis across these diverse missions calls for critical improvements to current DNA sequence analysis processes. Noblis’ BioVelocity tool offers an effective solution for identifying multiple contributors within a DNA sample and isolating unique genetic indicators in whole genomes.


CHALLENGES IN MULTI-CONTRIBUTOR DNA ANALYSIS: A NOBLIS AREA OF RESEARCH

Any two individuals share about 99.9% of the same genetic code. This high degree of similarity poses a problem for the accurate analysis of samples containing mixed DNA. Determining whether a DNA sample contains genetic information from more than one individual therefore depends on isolating and analyzing the roughly 0.1% that differs.

Current BioVelocity capabilities include:

  • Rapidly and accurately detecting SNPs
  • Providing data about genome sequencing coverage and base call distributions

Moving forward, our research efforts will develop a tool to determine the number of individual contributors in a mixed human DNA sample using whole genome sequencing data. This tool will:

  • Use allele frequency calculations to determine whether a given human DNA sample has the specific indicators of a multi-contributor sample
  • Give an estimate of the number of contributors in a mixed DNA sample
  • Validate algorithms using open-source machine learning tools such as TensorFlow
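
The allele frequency idea in the list above can be illustrated with a minimal Python sketch. It is not BioVelocity's implementation. It assumes biallelic SNP sites with per-site read counts for the reference and alternate allele, and it relies on the observation that a single diploid contributor tends to show minor-allele read fractions near 0 (homozygous) or 0.5 (heterozygous), while a mixture tends to produce intermediate fractions. The function names, the 0.10 to 0.35 window, the minimum depth of 20, and the 5% cutoff are all illustrative assumptions, not published thresholds.

```python
from typing import Iterable, Tuple

Site = Tuple[int, int]  # (reads supporting reference allele, reads supporting alternate allele)


def minor_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads supporting the less-observed allele at one site."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("site has no read coverage")
    return min(ref_reads, alt_reads) / total


def imbalanced_site_share(sites: Iterable[Site], low: float = 0.10,
                          high: float = 0.35, min_depth: int = 20) -> float:
    """Share of well-covered sites whose minor-allele fraction lies in [low, high].

    The window and depth filter are assumed values chosen for illustration.
    """
    usable = 0
    imbalanced = 0
    for ref_reads, alt_reads in sites:
        if ref_reads + alt_reads < min_depth:
            continue  # too few reads to judge allele balance
        usable += 1
        if low <= minor_allele_fraction(ref_reads, alt_reads) <= high:
            imbalanced += 1
    return imbalanced / usable if usable else 0.0


def looks_mixed(sites: Iterable[Site], max_share: float = 0.05) -> bool:
    """Flag a sample when too many sites show intermediate allele balance."""
    return imbalanced_site_share(sites) > max_share


# Made-up read counts, for illustration only.
single_source = [(30, 0), (15, 15), (0, 40), (22, 18), (35, 1), (3, 2)]
mixture = [(35, 5), (30, 10), (15, 15), (32, 8), (40, 0)]

print(looks_mixed(single_source))
print(looks_mixed(mixture))
```

This sketch only flags a possible mixture. Estimating the number of contributors would need a statistical model of the expected allele fractions for each candidate contributor count, which is outside the scope of this example.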

USING BIOVELOCITY TO DETECT PATHOGENS IN HUMAN CLINICAL SAMPLES

The healthcare industry faces the challenge of detecting microbial pathogens in samples dominated by human host DNA. This is especially difficult when the target represents a very low percentage of the total DNA in the sample. Noblis’ BioVelocity tool addresses this challenge by following a robust approach that:

  • Identifies and removes the host’s DNA sequence reads from a complex sample
  • Aligns the remaining target sequence reads to large indexes of bacterial, viral, and parasite whole genome sequences to determine if there are any matches

This process can be applied to clinical cases, such as looking for pathogens in cerebrospinal fluid.
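
The two-step approach described above (remove host reads, then match what remains against pathogen references) can be sketched in a few lines of Python. This is a toy illustration, not BioVelocity's method: it swaps true sequence alignment for exact k-mer matching, and the reference sequences, the function names, the k-mer length of 8, and the 0.5 thresholds are all assumptions made up for the example.

```python
from typing import Dict, List, Optional, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All overlapping substrings of length k (empty if the sequence is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_share(read: str, index: Set[str], k: int) -> float:
    """Fraction of the read's distinct k-mers that also occur in a reference index."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & index) / len(read_kmers)


def remove_host_reads(reads: List[str], host_index: Set[str],
                      k: int = 8, threshold: float = 0.5) -> List[str]:
    """Step 1: drop reads that mostly consist of host k-mers."""
    return [r for r in reads if shared_kmer_share(r, host_index, k) < threshold]


def best_pathogen_match(read: str, pathogen_indexes: Dict[str, Set[str]],
                        k: int = 8, min_share: float = 0.5) -> Optional[str]:
    """Step 2: name of the best-matching reference, or None if nothing matches well."""
    best_name, best_share = None, 0.0
    for name, index in pathogen_indexes.items():
        share = shared_kmer_share(read, index, k)
        if share >= min_share and share > best_share:
            best_name, best_share = name, share
    return best_name


# Toy sequences invented for this example; real references are whole genomes.
K = 8
host_ref = "ACGTTGCAAGCTTAGGCTAACCGGTTAACGATCGATTGCA"
pathogen_ref = "TTTTCCCCGGGGAAAATCTCTGAGAGACACATGTGTCCAA"
host_index = kmers(host_ref, K)
pathogen_indexes = {"hypothetical_pathogen": kmers(pathogen_ref, K)}

reads = [host_ref[5:25], pathogen_ref[10:30], "G" * 20]
non_host = remove_host_reads(reads, host_index, K)
for read in non_host:
    print(read, best_pathogen_match(read, pathogen_indexes, K))
```

Exact k-mer matching keeps the example short but does not tolerate sequencing errors or strain variation the way an aligner does. A production workflow would typically use a dedicated read aligner or classifier for both steps.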

USING BIOVELOCITY TO IDENTIFY ANCESTRY

Research from the 1000 Genomes Project shows that genetic variants shared among individuals and populations can provide useful information about population history. As more data becomes available, identifying ancestry from whole genome sequencing instead of traditional single nucleotide polymorphism (SNP) chips will likely become more accurate and informative. Currently, Noblis’ BioVelocity tool can:

  • Identify millions of SNPs quickly and accurately
  • Compare identified SNPs to public databases of human genetic variation to statistically infer the likely ancestral origin of the DNA sample contributor
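
One simple way to statistically compare identified SNPs against population data is a likelihood ranking, sketched below in Python. This is an illustrative sketch, not BioVelocity's algorithm. It assumes each SNP is biallelic, that genotypes follow Hardy-Weinberg proportions within each population, and that SNPs are independent of one another. The SNP identifiers, population labels, and allele frequencies are invented for the example. A real analysis would draw frequencies from a public resource such as the 1000 Genomes Project.

```python
import math
from typing import Dict, List, Tuple


def genotype_log_likelihood(alt_count: int, alt_freq: float) -> float:
    """Log-probability of carrying 0, 1, or 2 alternate alleles under Hardy-Weinberg."""
    if alt_count not in (0, 1, 2):
        raise ValueError("alt_count must be 0, 1, or 2")
    # Clamp the frequency so an unobserved allele never yields log(0).
    p = min(max(alt_freq, 1e-6), 1.0 - 1e-6)
    probabilities = ((1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2)
    return math.log(probabilities[alt_count])


def rank_populations(genotypes: Dict[str, int],
                     population_freqs: Dict[str, Dict[str, float]]
                     ) -> List[Tuple[str, float]]:
    """Score each population by summed log-likelihood, best first.

    SNPs missing from a population's frequency table are skipped for that population.
    """
    scores = {}
    for population, freqs in population_freqs.items():
        scores[population] = sum(
            genotype_log_likelihood(alt_count, freqs[snp])
            for snp, alt_count in genotypes.items()
            if snp in freqs
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical alternate-allele frequencies and sample genotypes.
population_freqs = {
    "POP_A": {"snp1": 0.9, "snp2": 0.1, "snp3": 0.5},
    "POP_B": {"snp1": 0.2, "snp2": 0.7, "snp3": 0.5},
}
sample_genotypes = {"snp1": 2, "snp2": 0, "snp3": 1}

ranking = rank_populations(sample_genotypes, population_freqs)
for population, score in ranking:
    print(population, round(score, 3))
```

With only a handful of SNPs the ranking carries little weight. The approach becomes informative as the number of ancestry-informative SNPs grows, which is where whole genome sequencing data helps.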