Record Details
Field | Value |
---|---|
Title | Creating, Understanding and Applying Machine Learning Models of Multiple Species |
Names | Griffioen, Arwen Twinkle E. (creator); Dietterich, Thomas (advisor) |
Date Issued | 2015-06-03 (iso8601) |
Note | Graduation date: 2015 |
Abstract | Many problems in ecology and conservation biology can be formulated and solved using machine learning algorithms for multi-label classification. This dissertation addresses three topics related to predicting the distributions of multiple species. It improves existing methods and proposes a new modeling paradigm to address the multi-species, multi-label problem. The first topic involves the calibration of multi-species distribution models. Species distribution models (SDMs) have become important inputs to conservation modeling and decision making. However, this dissertation shows that current approaches to the calibration of multiple individual SDMs do not result in the best outcome. To develop better methods, individual and joint model tuning approaches and evaluation metrics, both traditional and novel, are tested on survey and reserve design tasks. Experiments show that careful calibration can produce substantial improvements on these tasks. The second topic concerns the task of discovering floristic communities from presence/absence data. A novel mixed membership modeling approach is introduced and applied to floristic survey data from Wilsons Promontory (Victoria, Australia). The results are evaluated by a panel of domain experts and are found to be valid community groups. The mixed membership approach produces community definitions that are biologically meaningful and achieve a better fit to data than a standard clustering approach. The third topic introduces a new algorithm for multi-label classification in which latent group variables represent higher-order positive co-occurrence information in conjunction with traditional covariates. This novel approach creates a model with improved predictive performance compared to existing methods. The latent group variables identify sets of species whose co-occurrences cannot be explained by the covariates. These can provide insights regarding mutualistic species interactions. |
Genre | Thesis/Dissertation |
Access Condition | http://creativecommons.org/licenses/by/3.0/us/ |
Topic | machine learning |
Identifier | http://hdl.handle.net/1957/56049 |