Note that the "Comp." sets are subsets of the "Train" sets. SignalP 4.0 was evaluated using nested cross-validation, in which different partitions were used for training, optimization, and evaluation; see the Supplementary Materials for details.
The format is:

166 AJL2_ANGJA Evaluation
MVSFKLPAFLCVAVLSSMALVSHGAVLGLCEGACPEGWVEHKNRCYLHVAEKKTWLDAELNCLHHGGNLASEHSEDEHQF
LKDLHKGSDDPFWIGLSAVHEGRSWLWSDGTSASAEGDFSMWNPGEPNDAGGKEDCVHDNYGGQKHWNDIKCDLLFPSIC
VLRMVE
SSSSSSSSSSSSSSSSSSSSSSSS........................................................
................................................................................
......

503 A1BG_BOVIN Evaluation Train
MSAWAALLLLWGLSLSPVTEQATFFDPRPSLWAEAGSPLAPWADVTLTCQSPLPTQEFQLLKDGVGQEPVHLESPAHEHR
FPLGPVTSTTRGLYRCSYKGNNDWISPSNLVEVTGAEPLPAPSISTSPVSWITPGLNTTLLCLSGLRGVTFLLRLEGEDQ
FLEVAEAPEATQATFPVHRAGNYSCSYRTHAAGTPSEPSATVTIEELDPPPAPTLTVDRESAKVLRPGSSASLTCVAPLS
GVDFQLRRGAEEQLVPRASTSPDRVFFRLSALAAGDGSGYTCRYRLRSELAAWSRDSAPAELVLSDGTLPAPELSAEPAI
LSPTPGALVQLRCRAPRAGVRFALVRKDAGGRQVQRVLSPAGPEAQFELRGVSAVDSGNYSCVYVDTSPPFAGSKPSATL
ELRVDGPLPRPQLRALWTGALTPGRDAVLRCEAEVPDVSFLLLRAGEEEPLAVAWSTHGPADLVLTSVGPQHAGTYSCRY
RTGGPRSLLSELSDPVELRVAGS
SSSSSSSSSSSSSSSSSSSSS...........................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
.......................
Eukaryota sequence data
Gram positive sequence data
Gram negative sequence data
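
For readers who want to load these files programmatically, the following is a minimal sketch of a reader for the entry layout illustrated on this page. It rests on assumptions inferred from the two example entries rather than on a published specification: each entry starts with a header line holding the sequence length, the entry name, and zero or more dataset labels; the header is followed by the sequence and then by an annotation string of the same length, both wrapped over several lines. Reading "S" as a signal peptide position is likewise an assumption. The names `Entry`, `parse_entries`, and `leading_s` are hypothetical helpers, not part of SignalP.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Tuple


class Entry(NamedTuple):
    name: str
    labels: Tuple[str, ...]  # dataset labels from the header, e.g. ("Evaluation", "Train")
    sequence: str
    annotation: str          # one character per residue (assumed: "S" = signal peptide)


# Assumed header layout: "<length> <name> [label ...]"
HEADER = re.compile(r"^\s*(\d+)\s+(\S+)\s*(.*)$")


def parse_entries(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per header line found in `lines`."""
    it = iter(lines)
    for line in it:
        if not line.strip():
            continue  # skip blank lines between entries
        match = HEADER.match(line)
        if match is None:
            raise ValueError(f"expected a header line, got: {line!r}")
        length = int(match.group(1))
        name = match.group(2)
        labels = tuple(match.group(3).split())

        # The sequence and annotation together span 2 * length characters,
        # wrapped over an unknown number of lines.
        body = ""
        while len(body) < 2 * length:
            chunk = next(it, None)
            if chunk is None:
                raise ValueError(f"{name}: input ended before the entry was complete")
            body += chunk.strip()
        if len(body) != 2 * length:
            raise ValueError(f"{name}: expected {2 * length} characters, got {len(body)}")

        yield Entry(name, labels, body[:length], body[length:])


def leading_s(annotation: str) -> int:
    """Count the run of 'S' characters at the start of an annotation string."""
    return len(annotation) - len(annotation.lstrip("S"))


if __name__ == "__main__":
    # Made-up toy entry, for illustration only (not taken from the datasets).
    toy = ["10 TOY_ENTRY Evaluation Train", "MKLLAV", "AGHK", "SSSS..", "...."]
    for entry in parse_entries(toy):
        print(entry.name, entry.labels, leading_s(entry.annotation))
```

Because the reader counts characters instead of relying on a fixed line width, it does not depend on the 80-column wrapping seen in the examples. The length check raises an error if the header length and the entry body disagree.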