On preprocessing of protein sequences for neural network prediction of polyproline type II secondary structures
Polyproline type II stretches are somewhat rare in proteins. The backbone of this secondary structural element folds into a triangular form rather than the usual alpha-helix with 3.6 residues per turn. Detecting these stretches computationally from the protein sequence is a very challenging task. Here, we have studied the preprocessing phase in particular, which is important for any machine learning met