林勇欣 副教授 國立陽明交通大學 生物資訊及系統生物研究所

Yeong-Shin Lin, Associate Professor
Institute of Bioinformatics and Systems Biology
National Yang Ming Chiao Tung University
Hsinchu, Taiwan
Main Page Lab Members Lectures Resources Bioinformatics Center

Biodiversity and Ecology

Time: 10:10 ~ 12:00 on Wednesday
Room: BI310, Jan Qi Building (賢齊館), Po-Ai Campus
Textbook: "Biology, 12th Edition" by Campbell, Urry, Cain, Wasserman, Minorsky, and Orr; Pearson 2021
Grading: Open-book final exam 100%
Office hours: 15:00 ~ 17:00 on Tuesday and Thursday

2/21  Introduction to Viruses
3/6   The Origin and Evolution of Eukaryotes
3/13  Nonvascular and Seedless Vascular Plants
3/20  Seed Plants
3/27  Introduction to Fungi
4/10  An Introduction to Animal Diversity
5/1   An Overview of Ecology
5/1   Behavioral Ecology
5/8   Populations and Life History Traits
5/15  Biodiversity and Communities
5/22  Energy Flow and Chemical Cycling in Ecosystems
5/29  Conservation and Global Ecology
6/5   Final exam (open book)

Molecular Evolution

Time: 13:20 ~ 16:20 on Monday
Room: BI305, Jan Qi Building (賢齊館)
Grading: Homework 100%
Office hours: 15:00 ~ 17:00 on Tuesday and Thursday
Reference: The following books can be borrowed from the university's Hao Ran Library (浩然圖書館); links point to e-books:
  • "Molecular evolution" by Wen-Hsiung Li; Sinauer Associates, 1997
  • "Molecular evolution: a phylogenetic approach" by Roderic D.M. Page & Edward C. Holmes; Blackwell Science, 1998
  • "Fundamentals of molecular evolution" by Dan Graur & Wen-Hsiung Li; Sinauer Associates, 2000
  • "Molecular evolution and phylogenetics" by Masatoshi Nei & Sudhir Kumar; Oxford University Press, 2000
  • "Data analysis in molecular biology and evolution" by Xuhua Xia; Kluwer Academic, 2000
  • "Bioinformatics and molecular evolution" by Paul G. Higgs & Teresa K. Attwood; Blackwell Pub., 2005
  • "Statistical methods in molecular evolution" by Rasmus Nielsen; Springer, 2005
  • "Computational molecular evolution" by Ziheng Yang; Oxford University Press, 2006

    2/19  Introduction (PPT)
    2/26  Dynamics of Genes in Populations (PPT)
    3/4   Dynamics of Genes in Populations (PPT)
    3/11  Dynamics of Genes in Populations
    3/18  Models of Nucleotide Substitution (PPT)
    3/25  Models of Nucleotide Substitution (PPT)
    4/1   Models of Amino Acid and Codon Substitution (PPT)
    4/8   Models of Amino Acid and Codon Substitution (PPT)
    4/22  Phylogeny Reconstruction: Distance Methods (PPT)
    4/29  Phylogeny Reconstruction: Maximum Parsimony (PPT)
    5/6   Phylogeny Reconstruction: Maximum Likelihood (PPT)
    5/13  Comparison of Methods and Tests on Trees (PPT)
    5/20  Molecular Clock and Estimation of Species Divergence Times (PPT)
    5/27  Neutral and Adaptive Protein Evolution (PPT)
    6/3   Bayesian Methods (PPT)
    6/3   DNA Polymorphism in Populations


    E-mail your homework to me directly before the due date.
    HW 1.
    (due on 2/22)
    Collect the coding sequences of the HLA class I family and their homologous sequences in other species (including at least human, chimpanzee, and macaque). Build a FASTA file (*.fas). Use MEGA to perform a multiple sequence alignment and export a MEGA file (*.meg).
    Collect the coding sequences of the mitochondrial cytochrome b genes for as many mammalian species as you can (at least 30 species, including some closely related species, and some divergent species pairs). Also export a MEGA file.
    HW 2.
    (due on 2/29)
    Show the changes in allele frequencies over time for recessive, dominant, codominant, overdominant, and underdominant alleles under different selection coefficients and different initial allele frequencies.
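A minimal sketch of one way to generate these trajectories, assuming an infinite, randomly mating population and a single locus with two alleles. The fitness schemes, the value s = 0.1, and the function names are illustrative assumptions and are not taken from the lecture slides.

```python
def next_freq(p, w11, w12, w22):
    """Frequency of allele A1 after one generation of selection.

    w11, w12, w22 are the relative fitnesses of A1A1, A1A2 and A2A2.
    """
    q = 1.0 - p
    mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return p * (p * w11 + q * w12) / mean_w


def trajectory(p0, w11, w12, w22, generations):
    """List of allele frequencies from generation 0 to `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_freq(freqs[-1], w11, w12, w22))
    return freqs


s = 0.1  # assumed selection coefficient; try several values
# Assumed fitness schemes (w11, w12, w22) for the allele A1 being tracked.
schemes = {
    "dominant": (1 + s, 1 + s, 1.0),
    "recessive": (1 + s, 1.0, 1.0),
    "codominant": (1 + 2 * s, 1 + s, 1.0),
    "overdominant": (1.0, 1 + s, 1.0),
    "underdominant": (1.0, 1 - s, 1.0),
}

for name, fitness in schemes.items():
    for p0 in (0.01, 0.4, 0.9):
        final = trajectory(p0, *fitness, generations=500)[-1]
        print(f"{name:14s} p0={p0:.2f} -> p after 500 generations = {final:.4f}")
```

Each list returned by `trajectory` can be passed to any plotting tool (Excel, R, matplotlib) to draw frequency against generation.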
    HW 3.
    (due on 3/7)
    If you can program: (a) draw a figure showing the changes in allele frequencies under random genetic drift in populations of different sizes (say, 10 different sizes), and try different initial allele frequencies; (b) draw figures showing the probability distributions of allele frequencies in a diploid population of N=100 (with 10,000 replicates) at generations 1, 5, 20, 100, 500, and 2000, again trying different initial allele frequencies. If you cannot program, use Excel to do part (b); you may use N=5 (2N=10) with 100 replicates and survey generations 1, 3, 5, and 20 instead.
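One possible starting point is a Wright-Fisher simulation, sketched below using only the Python standard library. It assumes discrete generations, no selection, and no mutation; the function names and the small demo parameters are illustrative choices. Pure Python is slow for N=100 with 10,000 replicates over 2000 generations, so a vectorized library may be preferable for the full assignment.

```python
import random


def wright_fisher(n_diploid, p0, generations, rng):
    """Allele-frequency trajectory under pure drift in a diploid population."""
    copies = 2 * n_diploid
    p = p0
    freqs = [p]
    for _ in range(generations):
        # Binomial sampling: each of the 2N gene copies is drawn independently.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        freqs.append(p)
    return freqs


def final_distribution(n_diploid, p0, generations, replicates, rng):
    """Histogram: index = allele count, value = number of replicates."""
    copies = 2 * n_diploid
    hist = [0] * (copies + 1)
    for _ in range(replicates):
        p = wright_fisher(n_diploid, p0, generations, rng)[-1]
        hist[round(p * copies)] += 1
    return hist


rng = random.Random(2024)  # fixed seed so the run is repeatable

# Part (a): trajectories for several population sizes.
for n in (10, 100, 1000):
    traj = wright_fisher(n, 0.5, 50, rng)
    print(f"N={n:5d}  frequency after 50 generations: {traj[-1]:.3f}")

# Part (b), small version: N=5 (2N=10), 100 replicates.
for gen in (1, 3, 5, 20):
    print(gen, final_distribution(5, 0.5, gen, 100, rng))
```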
    HW 4.
    (due on 3/14)
    (a) Repeat the last homework with selection included. (b) Calculate the probability of fixation in slide 20.
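Slide 20 is not reproduced on this page. As an assumption, the sketch below uses Kimura's diffusion approximation for genic selection (fitnesses 1, 1+s, 1+2s), which may or may not be the exact formula on the slide. The function name and the example numbers are illustrative.

```python
import math


def fixation_probability(ne, s, p0):
    """Kimura's approximation: P = (1 - e^(-4*Ne*s*p0)) / (1 - e^(-4*Ne*s)).

    ne: effective population size; s: selective advantage of the allele
    (assumed genic selection); p0: initial frequency of the allele.
    """
    if s == 0:
        return p0  # neutral allele: fixation probability equals its frequency
    return (1 - math.exp(-4 * ne * s * p0)) / (1 - math.exp(-4 * ne * s))


n = 1000  # assumed population size, with Ne = N
new_mutant = 1 / (2 * n)
for s in (-0.001, 0.0, 0.001, 0.01):
    print(f"s={s:+.3f}  P(fix)={fixation_probability(n, s, new_mutant):.6f}")
```

These values can be compared with the fraction of replicates that reach fixation in the simulation from part (a).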
    HW 5.
    (due on 3/21)
    Use the "general substitution model" (take the parameters from the substitution numbers observed in pseudogenes, as shown in the PPT file) to display how the nucleotide (A, T, C, G) frequencies and the similarity, I, change with time. You can define different initial frequencies for A, T, C, and G.
    HW 6.
    (due on 3/28)
    Plot the transitional difference (ts) and the transversional difference (tv) over time.
    Calculate the number of nucleotide differences, the proportion of nucleotide differences, JC69 one-parameter distance, and K80 two-parameter distance for the mitochondrial cytochrome b sequences you constructed in HW1. Compare your results with what MEGA computes for you.
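The distance calculations can be sketched as follows, using the standard JC69 and K80 formulas. The function names and the two short test sequences are made up for illustration; sites with gaps or ambiguous bases are skipped, which corresponds to pairwise deletion and is an assumption about how the comparison with MEGA should be set up.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (compared sites, transitions, transversions) for aligned sequences."""
    sites = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return sites, ts, tv


def p_distance(sites, ts, tv):
    """Proportion of sites that differ."""
    return (ts + tv) / sites


def jc69(sites, ts, tv):
    """JC69: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(sites, ts, tv)
    return -0.75 * math.log(1 - 4 * p / 3)


def k80(sites, ts, tv):
    """K80: d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q)."""
    P, Q = ts / sites, tv / sites
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)


# Made-up example: one transition (A/G) and one transversion (A/C) in 10 sites.
counts = count_differences("AAAAAAAAAA", "GAAAAAAAAC")
print("sites, ts, tv:", counts)
print("p =", p_distance(*counts))
print("JC69 =", round(jc69(*counts), 4))
print("K80 =", round(k80(*counts), 4))
```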
    HW 7.
    (due on 4/4)
    Calculate S0, S2, S4, V0, V2, and V4 between human HLA-A and HLA-B genes for the first 240 nucleotides.
    HW 8.
    (due on 4/11)
    Use MEGA to calculate different genetic distances (number of transitions, number of transversions, JC69 one-parameter distance, K80 two-parameter distance, synonymous distance, nonsynonymous distance, and amino acid distance, etc.) for the mitochondrial cytochrome b sequences you constructed in HW1. Draw figures to compare these distances.
    HW 9.
    (due on 4/18)
    Align the two sequences manually with identity score = 5, transition score = -1, transversion score = -3, gap penalty = -7. Try different parameters.

    What happens if 2 nucleotides are added to the second sequence?
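To check a manual alignment, a global dynamic-programming alignment (Needleman-Wunsch) with the scores above can be sketched as below. The two sequences from the assignment are not reproduced on this page, so the example sequences are made up; the function names are also illustrative. Only one of possibly several equally good alignments is returned.

```python
PURINES = {"A", "G"}


def score_pair(a, b, match=5, transition=-1, transversion=-3):
    """Score for aligning nucleotide a with nucleotide b."""
    if a == b:
        return match
    same_class = (a in PURINES) == (b in PURINES)
    return transition if same_class else transversion


def global_align(s1, s2, gap=-7):
    """Return (best score, aligned s1, aligned s2) with a linear gap penalty."""
    n, m = len(s1), len(s2)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score_pair(s1[i - 1], s2[j - 1]),
                F[i - 1][j] + gap,  # gap in s2
                F[i][j - 1] + gap,  # gap in s1
            )
    # Trace back from the bottom-right corner to recover one optimal path.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score_pair(s1[i - 1], s2[j - 1]):
            a1.append(s1[i - 1]); a2.append(s2[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            a1.append(s1[i - 1]); a2.append("-"); i -= 1
        else:
            a1.append("-"); a2.append(s2[j - 1]); j -= 1
    return F[n][m], "".join(reversed(a1)), "".join(reversed(a2))


score, top, bottom = global_align("ACGT", "AGT")  # made-up sequences
print(score)
print(top)
print(bottom)
```

Changing the `gap` argument or the keyword defaults of `score_pair` is one way to explore how the parameters alter the alignment.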
    HW 10.
    (due on 4/25)
    Build a UPGMA tree and an NJ tree manually based on the mitochondrial cytochrome b sequence alignment you constructed in HW 1 (you can select 6 ~ 10 sequences). You can select any distance model you like. Compare your results with what MEGA builds for you.
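A hand-built UPGMA tree can be cross-checked with a short script such as the sketch below (NJ is not covered here). The input format, the function name, and the four-taxon distance matrix are assumptions chosen for illustration. Ties between equally close pairs are broken arbitrarily.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon names.
    dist: dict mapping a pair (a, b) to the distance between a and b.
    Returns (Newick-like string, dict of node -> height above the tips).
    """
    sizes = {name: 1 for name in names}
    d = {frozenset(pair): value for pair, value in dist.items()}
    heights = {}
    node = names[0]
    while len(sizes) > 1:
        pair = min(d, key=d.get)  # closest pair of clusters
        a, b = sorted(pair)
        node = f"({a},{b})"
        heights[node] = d[pair] / 2
        # Distance to every other cluster: size-weighted average.
        for c in sizes:
            if c not in pair:
                d[frozenset((node, c))] = (
                    sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        sizes[node] = sizes.pop(a) + sizes.pop(b)
    return node, heights


# Made-up distances for four taxa.
example = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
tree, heights = upgma(["A", "B", "C", "D"], example)
print(tree)
print(heights)
```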
    HW 11.
    (due on 5/2)
    First build an NJ tree for the mitochondrial cytochrome b sequence alignment you constructed in HW 1. Use this topology and the parsimony principle to assign possible nucleotides to each internal node. You can just use the first 5 informative sites. Count the total number of substitutions on this tree. Compare this result with the Maximum Parsimony tree generated by MEGA. If they differ, explain what the reason might be.
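The bottom-up pass of Fitch's parsimony algorithm, sketched below for a single site, is one way to check the hand count of substitutions on a fixed topology. The nested-tuple tree encoding, the taxon names, and the example states are illustrative assumptions; the tree is treated as rooted and strictly bifurcating.

```python
def fitch(tree, states):
    """Return (possible states at this node, minimum substitutions below it).

    tree: a taxon name (leaf) or a 2-tuple of subtrees.
    states: dict mapping taxon name -> nucleotide at the site considered.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


def total_substitutions(tree, columns):
    """Sum the Fitch cost over several sites (one dict of states per site)."""
    return sum(fitch(tree, site)[1] for site in columns)


# Made-up example with four taxa and two sites.
topology = ((("human", "chimpanzee"), "macaque"), "mouse")
sites = [
    {"human": "A", "chimpanzee": "A", "macaque": "G", "mouse": "G"},
    {"human": "C", "chimpanzee": "T", "macaque": "C", "mouse": "T"},
]
for site in sites:
    print(fitch(topology, site))
print("total:", total_substitutions(topology, sites))
```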
    HW 12.
    (due on 5/9)
    Calculate and compare the log likelihood values for the two topologies in the last slide.
    HW 13.
    (due on 5/16)
    Build a phylogenetic tree based on the mitochondrial cytochrome b sequence alignment you constructed in HW 1 with 100 bootstrap repeats. Repeat this process 10 times. Construct another tree with 1000 bootstrap repeats, and again repeat this process 10 times. Compare these trees and their bootstrap support values. Identify the nodes with bootstrap values less than 80 (or the few nodes with the least support). Based on these nodes, redraw a tree topology with polytomies. Try to list all possible bifurcating tree topologies based on these polytomies.
    HW 14.
    (due on 5/23)
    Use synonymous distances (Ks) and nonsynonymous distances (Ka) to build two NJ trees for the mitochondrial cytochrome b sequences. Compare the topologies and branch lengths of these two trees. Select some branches which may have different evolutionary rates. Perform the relative rate test on them. Build one ML tree for 6~10 of the species, using the same parameters as in Timetree. Construct its timetree manually. Assuming human and chimpanzee diverged 6 million years ago, estimate the divergence times of the other nodes.
    HW 15.
    (due on 5/30)
    Use the HLA sequence alignment you constructed in HW 1. Calculate Ka and Ks. Can you show evidence of positive selection? Can you identify which region is most strongly under positive selection? Does positive selection occur between different loci? Does positive selection occur between different species?
    HW 16.
    (due on 6/6)
    Try different priors for Treeleft and Treeright and calculate their posterior probabilities (the trees in the last slide, as in HW 12).
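With only two candidate topologies, the posterior follows directly from Bayes' theorem: posterior is proportional to likelihood times prior. The sketch below assumes the log likelihoods from HW 12 are already available; the two log-likelihood values shown are placeholders, not results from the slides, and the function name is illustrative.

```python
import math


def posteriors(log_likelihoods, priors):
    """Posterior probability of each tree given its log likelihood and prior."""
    largest = max(log_likelihoods)
    # Subtract the largest log likelihood before exponentiating to avoid underflow.
    weights = [math.exp(lnl - largest) * prior
               for lnl, prior in zip(log_likelihoods, priors)]
    total = sum(weights)
    return [w / total for w in weights]


# Placeholder log likelihoods for Treeleft and Treeright.
lnl_left, lnl_right = -1234.5, -1236.0
for prior_left in (0.1, 0.5, 0.9):
    post_left, post_right = posteriors(
        [lnl_left, lnl_right], [prior_left, 1 - prior_left]
    )
    print(f"prior(left)={prior_left:.1f}  "
          f"posterior(left)={post_left:.3f}  posterior(right)={post_right:.3f}")
```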


    Computational Biology Lab.

    Time: 13:20 ~ 16:20 on Wednesday
    Room: BI305, Jan Qi Building (賢齊館)
    Grading: Homework 100%
    Office hours: 15:00 ~ 17:00 on Tuesday and Thursday

    4/24  Retrieve sequences from database
    5/1   Sequence alignment -- dot matrix (PPT)
    5/8   Sequence alignment -- dynamic programming
    5/15  Calculate pairwise distances
    5/22  Construct a phylogenetic tree
    5/29  Calculate codon usage bias
    6/5   Protein structure data (PPT)


    1. Retrieve the protein sequences of human hemoglobin (alpha 1) and hemoglobin (beta) from database
    2. Align these two sequences manually
    3. Build a dot matrix for these two sequences
    4. Use dynamic programming to align these two sequences
    5. Repeat the alignment using BLOSUM 62; compare the result with the previous one and discuss
    6. Repeat using local alignment; compare the result with the previous one and discuss
    7. Repeat using two types of gap penalty; compare the result with the previous one and discuss
    8. Align the protein sequences of human hemoglobin (alpha 1) and hemoglobin (zeta). To generate the alignment presented in our textbook, what range of gap penalties should be assigned?
    9. Retrieve all the protein sequences of human and mouse (Mus musculus) hemoglobin from database, and align them based on the alignment result of hemoglobin (alpha 1) and hemoglobin (beta)
    10. Calculate pairwise distances
    11. Based on the calculated pairwise distances, construct a phylogenetic tree; Make some discussion
    12. Retrieve the DNA coding sequences and their corresponding intron sequences of all human hemoglobins, and human TBPL1 (TATA-box binding protein like 1 gene) from the database
    13. Calculate GC content for the coding sequences and intron sequences (GCi)
    14. Calculate GC1, GC2, and GC3 for the coding sequences
    15. Compare GC3 and GCi among these genes
    16. Calculate codon usage frequencies for the coding sequences
    17. Calculate RSCU values for the coding sequences
    18. Retrieve the DNA coding sequences of the virus RaTG13 and Human betaherpesvirus 5 strain SOMA from the database
    19. Calculate their GC1, GC2, GC3, and RSCU values
    20. Compare the GC1, GC2, GC3, and RSCU values between the two viruses and human genes
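Steps 13 to 17 can be sketched in a few functions, as below. The sketch assumes the standard genetic code, an in-frame coding sequence, and the usual definition RSCU = observed codon count divided by the count expected under equal usage of the synonymous codons. The function names and the short example sequence are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_content(seq):
    """Overall GC fraction (usable for coding or intron sequences)."""
    seq = [b for b in seq.upper() if b in BASES]
    return sum(b in "GC" for b in seq) / len(seq)


def split_codons(cds):
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def gc_by_position(cds):
    """Return (GC1, GC2, GC3), ignoring codons with ambiguous bases."""
    codons = [c for c in split_codons(cds) if c in CODON_TABLE]
    return tuple(sum(c[pos] in "GC" for c in codons) / len(codons)
                 for pos in range(3))


def rscu(cds):
    """Relative synonymous codon usage for every sense codon of observed amino acids."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    values = {}
    for aa in set(AMINO) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


example = "ATGGCCGCCGCGTAA"  # made-up sequence: ATG GCC GCC GCG TAA
print(gc_content(example))
print(gc_by_position(example))
print({codon: round(v, 3) for codon, v in sorted(rscu(example).items())})
```

Codon usage frequencies (step 16) are the `Counter` values inside `rscu` divided by the total number of codons.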


    Office: +886-3-5712121 # 56960
    R415, Jan Qi Building (賢齊館), 75 Po-Ai Street, Hsinchu, Taiwan 30068

    Lab: +886-3-5712121 # 56961


    General Biology (I)

    Computational Biology Lab.


    Biodiversity and Ecology

    Evolutionary Biology

    Molecular Evolution


    Genome OnLine Database
    Approved Sequencing Targets
    UCSC Genome Bioinformatics
    Stanford Genomic Resources
    TGI - The Gene Index
    J. Craig Venter Institute
    Broad Institute
    ExPASy - SwissProt - PROSITE
    CE - Combinatorial Extension

    Structure (population)
    MCL - a cluster algorithm for graphs
    The R Manuals
    Chi-square Test
    Fisher's Exact Test
    Kolmogorov-Smirnov Test

    PLoS Biology
    Current Biology
    Nature Ecology & Evolution
    Nature Genetics
    Nature Biotechnology
    Trends in Genetics
    Genome Research
    Genome Biology
    Molecular Biology & Evolution
    Nucleic Acids Research
    Journal of Molecular Biology
    Journal of Molecular Evolution

    Journal Citation Reports