Pokala Genome Analysis
Summary: Analyze TAS2R38 PTC population data. Analyze Dr. Pokala's SNPs.
BRING YOUR LAPTOP TO CLASS.
Bring your family's PTC tasting pedigree. Empty symbol = 0 taste, half-filled symbol = 0.5 taste, filled symbol = 1.0 taste
If you did not have access to biological family members over Thanksgiving, please let Dr. Pokala know ... he will assign you some data for analysis.
Analyze class TAS2R38 PTC taste data
Anonymized and sorted PTC data from your class are on the website as a Google Sheet
Plot a bar graph of genotype on the x-axis vs. the average taste score for your class on the y-axis.
Plot the trend-line and calculate the correlation coefficient.
You can copy the data to your own Excel sheet or other program for plotting.
Or use Google Sheets:
Create a new Google Sheets file of your own (File --> New --> Spreadsheet).
Highlight and copy and paste the contents of the Google Sheet file to your own sheet.
Put the cursor in the "Average taste" cell corresponding to a genotype.
Type " =AVERAGE( " in that cell.
Highlight the "PTC taste score" cells corresponding to individuals with that genotype.
Type " ) " and hit return
Repeat for all three genotypes (tt, Tt, TT).
Put the cursor in the "Average taste scores" cell
Go to "Insert --> Chart"
Click on "Chart type" and go to "column chart"
Click on "Customize" and go to "Series"
Click "Trendline" and "Show R2". R2 is the square of the correlation coefficient; it shows how strongly the genotype and phenotype are correlated (the number of T alleles versus taste strength).
Click on the plot.
Click on the three dots in the upper right corner of the plot and "Save Image" to download "chart.png"
Print this graph for your lab worksheet
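If you would rather check the spreadsheet result in code, the same calculation can be sketched in Python. This is only an illustration: the taste scores below are made up (they are not your class data), and the helper names average_by_genotype and pearson_r are invented for this example. Like the trendline on the column chart, it fits the three genotype averages against the number of T alleles.

```python
from statistics import mean

def average_by_genotype(rows):
    """Group (genotype, taste_score) pairs and average the scores per genotype."""
    groups = {}
    for genotype, score in rows:
        groups.setdefault(genotype, []).append(score)
    return {genotype: mean(scores) for genotype, scores in groups.items()}

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Made-up example data: (genotype, PTC taste score). Replace with the class sheet.
rows = [("tt", 0.0), ("tt", 0.0), ("Tt", 0.5), ("Tt", 1.0), ("TT", 1.0), ("TT", 1.0)]

averages = average_by_genotype(rows)

# x = number of T alleles, y = average taste score for that genotype
t_alleles = {"tt": 0, "Tt": 1, "TT": 2}
order = ["tt", "Tt", "TT"]
xs = [t_alleles[g] for g in order]
ys = [averages[g] for g in order]

r = pearson_r(xs, ys)
r_squared = r ** 2  # comparable to the "Show R2" value on the chart

print(averages)
print(round(r, 3), round(r_squared, 3))
```

The value of r is the correlation coefficient, and r_squared should be close to the R2 that Google Sheets reports for a linear trendline through the same three averages.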
Analyze Dr. Pokala's SNPs.
To analyze Dr. Pokala's single nucleotide polymorphism (SNP) data, download, uncompress, and open genome_Navin_Pokala.txt in a text editor like TextEdit (Mac), Notepad (Windows), or Text.app (Chromebook). The SNP data were obtained from 23andMe.com, via a variant of the microarray-based GoldenGate genotyping technologies. It may take a while for the entire file to open.
The columns in genome_Navin_Pokala.txt are as follows:
ID chromosome base-position genotype
For example:
rs11240777 1 798959 AG
means SNP ID rs11240777 on Chromosome 1 at base 798959 has an AG genotype (one homolog has A, the other homolog has G; heterozygote at this position). Some positions do not show a pair of bases: the letters I and D mark insertions and deletions, and positions where the genotyping failed or was ambiguous are shown as dashes (--).
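The column layout above can also be read with a few lines of Python. This is a minimal sketch, assuming the four columns are separated by whitespace (tabs or spaces); the function names parse_snp_line and describe_genotype are invented for this example and are not part of any 23andMe tool.

```python
def parse_snp_line(line):
    """Split one data line into its four columns.

    Assumes whitespace-separated columns: ID, chromosome, base position, genotype.
    """
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
    snp_id, chromosome, position, genotype = fields
    return {
        "id": snp_id,
        "chromosome": chromosome,   # kept as text, since not every chromosome label is a number
        "position": int(position),
        "genotype": genotype,
    }

def describe_genotype(genotype):
    """Classify a two-base genotype; anything else is reported as 'other'."""
    if len(genotype) == 2 and set(genotype) <= set("ACGT"):
        return "homozygous" if genotype[0] == genotype[1] else "heterozygous"
    # e.g. entries containing letters such as I or D
    return "other"

# The example line from the handout
snp = parse_snp_line("rs11240777 1 798959 AG")
print(snp["id"], "chr", snp["chromosome"], "pos", snp["position"],
      snp["genotype"], describe_genotype(snp["genotype"]))
```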
For your lab report, you will pick two polymorphic loci (some suggestions listed below), and write a brief summary for each (~ one paragraph, 12 point font, 1.5-2 spacing; no more than one page for each). The target audience is Genetics students at another school, so assume your reader has some basic knowledge of genetics, but is not an expert in genetics or medicine. Discuss the following:
1. What are the molecular and phenotypic effects of the different alleles? For some, the polymorphism may be just a marker that is physically near a yet-unknown genetic variation responsible for the phenotypic effect. If the causative mechanism is known, briefly discuss it (missense mutation, deletion, nonsense mutation resulting in a truncated protein, promoter mutation, splicing mutation, etc). What does the gene encode? If the causative mechanism is unknown, describe hypotheses put forth in the literature based on neighboring genes, etc.
2. How strong is the effect of alleles at this position? How good a predictor of phenotype is the genotype? Is this a high penetrance allele or a mild modifier of risk? By how much is the risk increased or decreased (odds ratio)? How statistically significant is this effect?
3. What should homozygotes for each allele and heterozygotes and their offspring do or be concerned about (if anything)? What is Dr. Pokala's genotype, and what is your advice to him and his family, especially for a disease-risk-modifying allele? How much should he worry?
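For question 2, it may help to see how an odds ratio is calculated from a case-control table. The sketch below uses invented counts (not taken from any real study), and the function names are made up for this example. The confidence interval uses the standard log odds ratio approximation; if the 95% interval includes 1, the effect is not statistically significant at the 5% level.

```python
import math

def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds of carrying the allele among cases divided by the odds among controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)

def odds_ratio_ci95(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Approximate 95% confidence interval using the standard error of log(OR)."""
    log_or = math.log(odds_ratio(case_carriers, case_noncarriers,
                                 control_carriers, control_noncarriers))
    se = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                   + 1 / control_carriers + 1 / control_noncarriers)
    return math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)

# Invented counts: 30 of 100 cases and 20 of 100 controls carry the risk allele.
counts = (30, 70, 20, 80)
or_value = odds_ratio(*counts)
ci_low, ci_high = odds_ratio_ci95(*counts)

print(f"odds ratio = {or_value:.2f}, 95% CI = {ci_low:.2f} to {ci_high:.2f}")
```

With these invented counts the odds ratio is above 1, but the interval spans 1, so a study of this size could not rule out "no effect". Published odds ratios for real SNPs usually come from much larger samples.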
The following websites will be especially useful for finding information by SNP ID number:
Start with Online Mendelian Inheritance in Man (OMIM) ( http://omim.org/ ), a professionally written and curated Wikipedia-like site for genetic variants.
The National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM) and the National Institutes of Health (NIH) has the most useful website in the world for biomedical researchers ( http://www.ncbi.nlm.nih.gov ). Your taxes at work ( for now ... ).
These sub-sites are especially useful for molecular information about SNP alleles:
ClinVar: http://www.ncbi.nlm.nih.gov/clinvar/
dbSNP: http://www.ncbi.nlm.nih.gov/projects/SNP/
Like OMIM, SNPedia ( http://www.snpedia.com/ ) is a Wikipedia-like site. Unlike OMIM, anyone can edit it ... there is little editorial control, so treat it the same way you treat Wikipedia (trust but verify!). On the plus side, it has information that might be more "cutting edge" than OMIM, and has a lot of links to primary literature sources, as well as to relevant OMIM and the NCBI pages.
While you can pick any of the SNPs Dr. Pokala has been genotyped for, these SNPs might be especially worth considering:
rs4481887 asparagus odor detection
rs713598 PTC tasting
rs505922 ABO blood type
rs34598529 or rs63750783 beta-thalassemia
rs17822931 earwax type, sweating and body odor
rs4988235 lactose intolerance
i4000436 or rs121907954 Tay-Sachs
i3000001 or rs113993960 Cystic fibrosis
rs1815739 muscle performance
rs53576 social behavior and personality
rs7412 and rs429358 risk of Alzheimer's disease
rs1333049 risk of coronary heart disease
rs662799 weight gain from high fat diets
rs333 resistance to HIV
i3003137 sickle cell anemia
rs1160312 male pattern baldness
rs801114 risk of basal cell carcinoma
rs210138 risk of testicular cancer
rs3814113 risk of ovarian cancer
rs1805007 red hair color and sensitivity to anesthetics
rs9939609 obesity and type-two diabetes risk
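Because the file is large, searching it by hand for each ID can be slow. A short Python script can pull out the genotypes for a handful of IDs in one pass. This is a sketch under assumptions: it assumes whitespace-separated columns and that any header lines begin with "#", and the function name find_genotypes is invented here. The sample lines below are placeholders for illustration only, apart from the rs11240777 line quoted earlier; their positions and genotypes are NOT Dr. Pokala's data.

```python
def find_genotypes(lines, wanted_ids):
    """Scan the lines once and return {SNP ID: genotype} for the requested IDs."""
    wanted = set(wanted_ids)
    found = {}
    for line in lines:
        # Skip blank lines and (assumed) comment/header lines starting with '#'
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 4 and fields[0] in wanted:
            found[fields[0]] = fields[3]
    return found

# Placeholder lines for illustration; positions and genotypes are made up
# except for the rs11240777 example line quoted in the handout.
sample_lines = [
    "# ID chromosome base-position genotype",
    "rs11240777 1 798959 AG",
    "rs713598 7 100 CG",
    "rs4988235 2 200 GG",
]

result = find_genotypes(sample_lines, ["rs713598", "rs4988235", "rs333"])
print(result)  # IDs that are absent from the data are simply left out

# With the real file (path assumed to be in the current folder):
# with open("genome_Navin_Pokala.txt") as handle:
#     print(find_genotypes(handle, ["rs713598", "rs4988235"]))
```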