Friday, 9 August 2013

Counting bases in FASTA format in C


I have a file in FASTA format which looks like:
>gi|15706275|emb|aj344566.1| paralabrax maculatofasciatus partial mrna for trypsin
caggtgtctctgaactctggctaccacttctgtggtggctccctggtcaacgagaactgg
gttgtgtctg ctgctcactgctacaagtcccgcgttgaggtgcgtcttggagagcacaa
catcagggtcaccgagggaag cgagcagttcatcagctcctcccgcgtcatccgccacc
ccaggtacagctcctacaacatcgacaatgac atcatgctgatcaagctgagcaagccc
gccaccctcaaccagtacgtgaagcccgtggctctgcccacca gctgtgcccccgctgg
caccatgtgcaaagtctccggctggggaaacaccatgagctccactgctgacag gaaca
agctccagtgcctggacctccccatcctgtctttccaggactgtgacaactcctaccctg
gcatg atcaccgacgccatgtactgcgctggatacctggagggaggcaaggactcttac
cagggtgactctggtg gccccgtcgtgtgcaacggtgagctgcagggtgttgtgtcctg
gg
I want to count the bases (i.e. A, G, T and C) in the file, excluding the
header line that holds the identifier and protein name. I want to do this in C.
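
One possible approach, as a minimal sketch rather than a definitive answer: read the file one character at a time, skip every line that begins with '>', ignore whitespace, and tally the rest case-insensitively. The type name base_counts and the function name count_bases are made up for this illustration. The sketch assumes each header occupies exactly one line, so a description that wraps onto a second line would be counted as sequence.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper type: one counter per base, plus a catch-all. */
struct base_counts {
    unsigned long a, c, g, t;
    unsigned long other; /* non-whitespace sequence characters other than A/C/G/T, e.g. N */
};

/*
 * Count the bases in a FASTA stream.
 * Assumption: a header is any line whose first character is '>',
 * and it ends at the next newline.
 */
struct base_counts count_bases(FILE *in)
{
    struct base_counts n = {0, 0, 0, 0, 0};
    int ch;
    int at_line_start = 1; /* true until the first character of a line is seen */
    int in_header = 0;     /* true while inside a '>' header line */

    assert(in != NULL);

    while ((ch = fgetc(in)) != EOF) {
        if (ch == '\n') {
            /* A newline ends the current line and any header with it. */
            at_line_start = 1;
            in_header = 0;
            continue;
        }
        if (at_line_start && ch == '>')
            in_header = 1;
        at_line_start = 0;

        /* Skip header text and any spaces, tabs or carriage returns. */
        if (in_header || isspace(ch))
            continue;

        switch (toupper(ch)) {
        case 'A': n.a++; break;
        case 'C': n.c++; break;
        case 'G': n.g++; break;
        case 'T': n.t++; break;
        default:  n.other++; break;
        }
    }
    return n;
}
```

A caller would open the file with fopen(path, "r"), check the result for NULL, pass the stream to count_bases, and print the four fields with printf and the %lu format. Because whitespace is skipped, stray spaces inside sequence lines like the ones in the sample above do not affect the totals.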
