Progress

We have created ANNs for the females of all species in this group. Identification accuracy varies widely, depending mostly on the number of unique individuals used to train the ANN. When sufficient numbers (more than 10 specimens) were available, accuracy fell in the 90–99% range. We have developed tools to partially compensate for the lack of replicate specimens, but it is difficult to predict how effective these techniques will be for some very rare groups, because the amount of intraspecific and interspecific variation differs considerably between groups. We are currently training ANNs for the males. Early results indicate that identification accuracy may be higher than for the females, owing to the extra information provided by the two views we submit to the system.

SPIDA-web: The Internet Interface

SPIDA-web is up and running for female specimens only, allowing registered users to submit images to the networks for species-level identification (see Instructions for details). Users submit minimally processed images and receive identifications in return, along with information on the matching species from a database, including images, drawings and distribution maps. Submitted images need only be 256x256 pixels, grayscale, and cropped square. We hope to have the males finished by early fall 2005.
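The image requirements above (256x256 pixels, grayscale, cropped square) can be illustrated with a minimal sketch. This is not the SPIDA-web code: the function names, the center-crop choice, the luminance weights and the nearest-neighbor resizing are all assumptions made for illustration, and in practice a user would crop around the relevant structure by hand in an image editor.

```python
# Hypothetical sketch: prepare an RGB image (a list of rows of (r, g, b)
# tuples) as a 256x256 grayscale square. Not the actual SPIDA-web pipeline.

def to_grayscale(pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    # The 0.299/0.587/0.114 weights are a common convention, assumed here.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in pixels
    ]

def center_crop_square(gray):
    """Crop the largest centered square out of a grayscale image."""
    height, width = len(gray), len(gray[0])
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return [row[left:left + side] for row in gray[top:top + side]]

def resize_nearest(gray, size=256):
    """Resize a square grayscale image to size x size by nearest neighbor."""
    side = len(gray)
    return [
        [gray[(y * side) // size][(x * side) // size] for x in range(size)]
        for y in range(size)
    ]

def prepare_image(pixels, size=256):
    """Grayscale, crop square, then resize: the three stated requirements."""
    return resize_nearest(center_crop_square(to_grayscale(pixels)), size)
```

Calling `prepare_image` on an image of any dimensions returns 256 rows of 256 gray values each. A real workflow would more likely use an image library or editor; plain lists are used here only to keep the sketch free of dependencies.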

Every image submitted to SPIDA-web will be stored in a database for future supervised inclusion in the training sets for underrepresented species. This will allow the system to evolve and improve as people use it. In the future, we would like to collect locality data for each image as well, so we can expand the distribution maps in our database.
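The idea of storing submissions for later supervised inclusion can be sketched as follows. The class names, fields and the review step are hypothetical, not the actual SPIDA-web database design. The sketch also assumes that "underrepresented" means a species with 10 or fewer training specimens, borrowing the threshold mentioned under Progress.

```python
# Hypothetical sketch of a submission store with expert review.
# Names and structure are illustrative assumptions, not SPIDA-web's schema.
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Submission:
    image_id: str
    predicted_species: str                    # identification returned by the ANN
    confirmed_species: Optional[str] = None   # set only after expert review

@dataclass
class SubmissionStore:
    submissions: List[Submission] = field(default_factory=list)

    def add(self, image_id: str, predicted_species: str) -> Submission:
        """Record every submitted image together with the ANN's answer."""
        sub = Submission(image_id, predicted_species)
        self.submissions.append(sub)
        return sub

    def confirm(self, image_id: str, species: str) -> None:
        """Supervision step: an expert confirms or corrects the species."""
        for sub in self.submissions:
            if sub.image_id == image_id:
                sub.confirmed_species = species
                return
        raise KeyError(image_id)

    def training_candidates(
        self, training_counts: Dict[str, int], min_specimens: int = 10
    ) -> List[Submission]:
        """Confirmed submissions of species with min_specimens or fewer
        specimens currently in the training set (assumed threshold)."""
        return [
            s for s in self.submissions
            if s.confirmed_species is not None
            and training_counts.get(s.confirmed_species, 0) <= min_specimens
        ]
```

Only images that an expert has confirmed are returned as candidates, so unreviewed identifications never feed back into training. For example, with training counts of 3 for a placeholder `species_a` and 25 for `species_b`, only confirmed `species_a` submissions would be offered for inclusion.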

The flowchart illustrates the sequence of events.

Future Work

*My [Dr. K. Russell] vision is to create a non-profit center or institute for automated species identification. This center could be housed anywhere, but preferably at a museum or university, and would consist of myself, Mr. Martin Do (the computer scientist/programmer on the project) and a few programmers and technicians. Scientists or other organizations interested in developing an identification system for their particular group (or location) would contract us for the work. Through consultation with them, we would make the necessary input modifications (one image, many images, sound, etc.) and build the system. Once built, it could either be housed on the center's servers or taken by the contracting institution. ANN-based systems should ideally be dynamic, so that the system improves as people use it by making use of the incoming information. In addition, because taxonomy is still rather fluid for many groups, taxonomic changes would need to be incorporated. A certain amount of maintenance is therefore necessary. Part of the center's purpose could be to maintain the identification systems or to train others to do so. I think it makes more sense to have a central location for this work than to move from institution to institution as part of various individual grants to work on different groups. The center would also serve as a place to bring disparate initiatives together: different taxa and different approaches to automated species identification.