TLDR Portfolio Assistance Week 4
A/Prof Olin Silander
Purpose
TLDR summary of how to approach the Portfolio Assessment from Week 4.
- Find a way to get the GC content and any other useful info from the fastq.gz sequence files using fx2tab.
- Write the output to a file; this will automatically be in tabbed column format.
- Import the data (i.e. the file) into R; read.table() is one possibility. By default, read.table() splits columns on whitespace, which includes tabs, so it reads tabbed column format directly.
- Make sure you know what the columns mean. You can rename them in the text file or using col.names=c("sequence-name", "length", "etc") within the read.table() line. You can also rename them after having read in the data using colnames(mydata) <- c("sequence-name", "length", "etc"). Note that spaces are highly discouraged in variable names and in column / row names.
- Look for correlations or patterns in the data. If you are not sure where to start, search the web (e.g. in Chrome) for "How can I look for correlations in data".
- Produce visualisations of your findings and explain/interpret them in a caption below the visualisation.
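The import, rename, correlate and plot steps above can be sketched in R as follows. This is a minimal illustration under stated assumptions, not the required approach: the file here is a small made-up stand-in written to a temporary location, and the three-column layout (name, length, GC percentage) is an assumption, since the actual columns depend on the options you gave fx2tab. The column names use underscores rather than hyphens or spaces, because R's default name checking would alter names that are not syntactically valid.

```r
# Toy stand-in for the tab-separated file written from the fx2tab output.
# The column layout and every value below are made up for illustration only.
stats_file <- tempfile(fileext = ".tsv")
writeLines(c(
  "read1\t1200\t41.5",
  "read2\t8500\t48.2",
  "read3\t300\t35.0",
  "read4\t15000\t50.1",
  "read5\t4200\t44.7"
), stats_file)

# Read the file, splitting on tabs only, and name the columns on the way in.
mydata <- read.table(stats_file, sep = "\t", header = FALSE,
                     col.names = c("sequence_name", "length", "gc_percent"),
                     stringsAsFactors = FALSE)

# Check that each column holds what you expect before analysing it.
str(mydata)
summary(mydata)

# One way to look for a relationship: a rank correlation of length and GC.
length_gc_cor <- cor(mydata$length, mydata$gc_percent, method = "spearman")
print(length_gc_cor)

# One possible visualisation: GC content against read length (log x-axis).
plot(mydata$length, mydata$gc_percent, log = "x",
     xlab = "Read length (bp)", ylab = "GC content (%)",
     main = "GC content vs read length")
```

With your own data, replace stats_file with the path to the file you wrote, and adjust col.names to match the columns that file actually contains. Spearman correlation and a scatter plot are just one reasonable starting point; other summaries and plot types may suit your data better.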