Using the protein sequences and electrophysiological data in VKCDB, we have been looking for a model that can predict the half-activation voltage (Va) of a given VKC from its amino acid sequence alone. Using wrapper feature selection and a k-nearest neighbor (KNN) classifier, we built a Va predictor with a mean absolute error (MAE) of 7.0 mV. The residues selected during training are highlighted in the prediction result. Please keep in mind that, although all results were evaluated with cross-validation, some risk of over-fitting the data remains.
Note: The learner will attempt to predict a Va value for any protein sequence it is given. The prediction will be unreliable if the sequence is not a VKC sequence.
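
For readers unfamiliar with the approach, the following is a minimal sketch of wrapper feature selection around a KNN predictor, scored by cross-validated MAE. It is not the code behind this predictor. The function names, the Hamming distance over aligned columns, the greedy forward search, the leave-one-out scheme, and the toy sequences and Va values are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy data: (aligned sequence, Va in mV). Not taken from VKCDB.
TOY = [
    ("AKLE", -40.0),
    ("AKLD", -42.0),
    ("GKLE", -38.0),
    ("ARVE", 10.0),
    ("GRVD", 12.0),
    ("ARVD", 8.0),
]

def hamming(a, b, cols):
    """Count mismatches between two aligned sequences at the chosen columns."""
    return sum(a[c] != b[c] for c in cols)

def knn_predict(query, train, cols, k=1):
    """Average the Va of the k training sequences closest to the query."""
    nearest = sorted(train, key=lambda item: hamming(query, item[0], cols))[:k]
    return mean(va for _, va in nearest)

def loo_mae(data, cols, k=1):
    """Leave-one-out cross-validated mean absolute error, in mV."""
    errors = []
    for i, (seq, va) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out sequence i
        errors.append(abs(knn_predict(seq, train, cols, k) - va))
    return mean(errors)

def forward_select(data, k=1):
    """Greedy wrapper: add the column that lowers the CV error most.

    Stops as soon as no remaining column improves the error.
    """
    n_cols = len(data[0][0])
    selected, best = [], float("inf")
    while True:
        candidates = [c for c in range(n_cols) if c not in selected]
        scored = [(loo_mae(data, selected + [c], k), c) for c in candidates]
        if not scored:
            break
        score, col = min(scored)
        if score >= best:
            break
        selected.append(col)
        best = score
    return selected, best

selected, best = forward_select(TOY)
print("selected columns:", selected, "cross-validated MAE (mV):", best)
```

The selected columns play the role of the highlighted residues. Because the same cross-validation score both guides the selection and reports the final error, the reported error tends to be optimistic, which is the over-fitting risk mentioned above.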