Assignment 9, Character-Based Evolutionary Tree Reconstruction

Due Date: 11:59pm, June 22 (Sun), 2025

 

Purpose
  • Practice of understanding the small parsimony problem
  • Practice of implementing the character-based evolutionary tree reconstruction algorithm

Description
Write a python code to implement Fitch algorithm as a character-based evolutionary tree reconstruction algorithm to solve the small parsimony problem. Your python script should take the input file and output file names as command-line arguments, read 2^n DNA sequences (where n >= 1) in a FASTA format from the input file, implement the Fitch algorithm to find optimal sequences for internal nodes of an evolutionary tree, print the sequences for all nodes in the evolutionary tree to the output file, and print the parsimony score to screen. Please assign the DNA sequences in the input file to the leaf nodes of the evolutionary tree from left to right. Please assume all input DNA sequences have the same length. Please print the sequences of all nodes of the evolutionary tree (including leaf nodes), from left to right and top-down, to the output file.

Submission
Submit your python code "Assignment9.py" via LearnUs.

Note
  • The input sequence is case-insensitive.
  • If the input file in a command-line argument does not exist, then print "No input file".
  • If the input file has nothing, then print "No DNA sequence".
  • If any sequence in the input file is not a DNA sequence, then print "No DNA sequence".
  • If any sequence in the input file does not follow the fasta format, then print "No correct format".
  • If the input file does not contain 2^n sequences, then print "Incorrect number of sequences".
  • If the sequences in the input file have different length, then print "Incorrect length of sequences".
  • Please print the sequence of each node per line.