To obtain the name of an amino acid, just send #asAminoacidName to a String, for example:
'a' asAminoacidName. " --> 'Alanine' "
'G' asAminoacidName. " --> 'Glycine' "
We often copy and paste sequences from sources that aren't properly formatted, so to remove all whitespace characters from a sequence you could use:
' ATGCTAGT
CAG
C
AGTTAGCGACA ' asCondensedString.
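For comparison, here is a minimal sketch of the same clean-up written only with the standard collection protocol (#reject: and Character>>#isSeparator). It is an illustration of the idea, not necessarily how #asCondensedString is implemented.

```smalltalk
| raw condensed |
"A sequence pasted with leading/trailing spaces and line breaks."
raw := ' ATGCTAGT
CAG
C
AGTTAGCGACA '.
"Keep every character that is not whitespace (space, tab, CR, LF)."
condensed := raw reject: [ :each | each isSeparator ].
condensed. " --> 'ATGCTAGTCAGCAGTTAGCGACA' "
```

Evaluate it in a Playground (or Workspace) with Print It to see the resulting String.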
Both messages are sent to a String and answer a String. Another basic kind of object is the Boolean, represented by two objects: true and false. For example, to determine whether a DNA sequence contains ambiguous letters:
'atcggtcggctta' hasAmbiguousDNABases. " -> false "
'atcggfcggctta' hasAmbiguousDNABases. " -> true "
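To show how such a test can answer a Boolean, here is a rough sketch using standard messages only. It assumes, purely for illustration, that any letter other than a, c, g or t counts as ambiguous; the library's own rule may differ. The block name isAmbiguous is hypothetical.

```smalltalk
| isAmbiguous |
"Hypothetical helper block, not part of the library.
 Assumption: anything outside a, c, g, t is treated as ambiguous."
isAmbiguous := [ :sequence |
	(sequence asLowercase
		detect: [ :base | ('acgt' includes: base) not ]
		ifNone: [ nil ]) notNil ].
isAmbiguous value: 'atcggtcggctta'. " -> false "
isAmbiguous value: 'atcggfcggctta'. " -> true "
```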
An important part of working in Smalltalk is the Collection classes, Array being one of the most used. One message that answers an Array instance gets the positions of the gaps (i.e. the - characters) in a DNA sequence:
'ATCGAT-CAGTGCA--CAGTCA-TTC' indicesOfAscii: $- asciiValue.
" --> #(7 15 16 23) "
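The same positions can also be collected with plain collection messages. The sketch below selects, from the interval of valid indices, those whose character is a gap; it only illustrates the idea and is not taken from the library.

```smalltalk
| sequence gapPositions |
sequence := 'ATCGAT-CAGTGCA--CAGTCA-TTC'.
"Walk the indices 1..size and keep those that hold a gap character."
gapPositions := (1 to: sequence size)
	select: [ :index | (sequence at: index) = $- ].
gapPositions. " --> #(7 15 16 23) "
```

Sending #select: to an Interval answers an Array, so the result has the same shape as the one shown above.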
And of course, you can use the impressive number of features of the String hierarchy:
"Get the sequence size:"
'AATGATCGATGCTAGTCGACA' size. " -> 21 (a SmallInteger) "
"Compare two sequences:"
'AATGATCGATGCTAGTCGACA' = 'AATGATCGATCCTAGTCGACA'. " -> false (a False) "
"Find the position of the first occurrence of the subsequence passed as parameter (answers 0 if it is not found):"
'AATGATCGATGCTAGTCGACATGCTA' findString: 'TGCTA'. " -> 10 "
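A few more standard messages that come in handy with sequences are sketched below. They are ordinary String and collection protocol as found in Pharo and Squeak, not something specific to the library.

```smalltalk
"Count how many times a base appears:"
'AATGATCGATGCTAGTCGACA' occurrencesOf: $A. " -> 7 "
'AATGATCGATGCTAGTCGACA' occurrencesOf: $G. " -> 5 "
"Normalize the case before comparing:"
'aatg' asUppercase. " -> 'AATG' "
"Reverse a sequence:"
'AATG' reversed. " -> 'GTAA' "
```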
And that's all for now. I don't like long posts, so we will stop here and I will check the feedback, if any. Hopefully I will receive comments about what people would like to use.