- Again change to the
exercises directory (Section 2), and read the
files 3.fasta and 5.fasta. Both contain sequences in FASTA
format:
read ("3.fasta", "5.fasta")
- The cross_references argument of the report command
is quite useful for checking the completeness of data files relative to
one another. It produces a presence/absence table of terminals versus
files, giving us an idea of our data completeness (to a level we may
not want to know!):
report (cross_references)
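To make the idea of a presence/absence table concrete, here is a minimal Python sketch that builds one outside POY. It is not POY's implementation, and its output layout is not meant to match what report (cross_references) prints. The function names fasta_names and presence_table are hypothetical, and the sketch assumes the terminal name is the first whitespace-delimited word on each '>' header line.

```python
def fasta_names(text):
    """Return the set of terminal names found on '>' header lines.

    Assumption: the name is the first whitespace-delimited word of the header.
    """
    names = set()
    for line in text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip headers that carry no name
                names.add(fields[0])
    return names


def presence_table(files):
    """Map each terminal to a list of booleans, one per file.

    `files` maps a file name to that file's FASTA text; the booleans follow
    the iteration order of `files`.
    """
    per_file = {fname: fasta_names(text) for fname, text in files.items()}
    terminals = sorted(set().union(*per_file.values()))
    return {t: [t in per_file[fname] for fname in files] for t in terminals}


def format_table(files, table):
    """Render the table as text, with '+' for present and '-' for absent."""
    lines = ["terminal\t" + "\t".join(files)]
    for terminal, row in table.items():
        marks = "\t".join("+" if present else "-" for present in row)
        lines.append(terminal + "\t" + marks)
    return "\n".join(lines)
```

To try it on the tutorial files, one could load them with something like files = {p: open(p).read() for p in ("3.fasta", "5.fasta")} and then print format_table(files, presence_table(files)). A terminal with a '-' in any column is missing from that file.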
- We will now construct 10 Wagner trees with the
command build (10 trees is the default), then select the best unique trees
resulting from the Wagner builds and report the trees in parenthetical notation:
build ()
select ()
report (trees:(total))
- Another useful way to view the data is to report the
implied alignment of the molecular data currently loaded. Implied
alignments can be used to discover problems in your data and unexpected
results before running the complete analysis:
report (ia)
- Implied alignments also show us where sequence lengths vary, as is the
case with t16 (5.fasta). Note, however, that sequence length is not
problematic for t18 (5.fasta). In POY the characters N and X are symbols
used to represent any nucleotide base (as the IUPAC code specifies), while
a question mark '?' represents any base or a gap. For missing sequences,
however, the implied alignment always shows them as gap-only sequences.
This way your files will remain readable by other programs.
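As a rough way to spot length variation and ambiguity symbols before loading files into POY, the following Python sketch summarizes each sequence in a FASTA text. It is an illustration only and does not reproduce POY's behavior. The names parse_fasta and summarize are hypothetical, and the counts simply follow the symbol meanings described above (N and X as any base, '?' as any base or a gap).

```python
def parse_fasta(text):
    """Return {name: sequence} from FASTA text, joining multi-line sequences.

    Assumption: the name is the first whitespace-delimited word of the header.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def summarize(records):
    """Report the length and ambiguity-symbol counts of each sequence."""
    summary = {}
    for name, seq in records.items():
        upper = seq.upper()
        summary[name] = {
            "length": len(seq),
            # N and X both stand for any nucleotide base
            "any_base": upper.count("N") + upper.count("X"),
            # '?' stands for any base or a gap
            "base_or_gap": seq.count("?"),
        }
    return summary
```

Calling summarize(parse_fasta(open("5.fasta").read())) and comparing the "length" values across terminals gives a quick view of which sequences are unusually short or long relative to the rest.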
- We are done for now with this tutorial. Close the
interactive console:
exit ()