Wednesday, 25 February 2015

Burai et al. (2015) Remote Sensing

Burai, P., Deák, B., Valkó, O., Tomor, T. (2015): Classification of herbaceous vegetation using airborne hyperspectral imagery. Remote Sensing 7: 2046-2066.


Abstract
Alkali landscapes hold an extremely fine-scale mosaic of several vegetation types, thus it seems challenging to separate these classes by remote sensing. Our aim was to test the applicability of different image classification methods of hyperspectral data in this complex situation. To reach the highest classification accuracy, we tested traditional image classifiers (maximum likelihood classifier—MLC), machine learning algorithms (support vector machine—SVM, random forest—RF) and feature extraction (minimum noise fraction (MNF)-transformation) on training datasets of different sizes. Digital images were acquired from an AISA EAGLE II hyperspectral sensor of 128 contiguous bands (400–1000 nm), a spectral sampling of 5 nm bandwidth and a ground pixel size of 1 m. For the classification, we established twenty vegetation classes based on the dominant species, canopy height, and total vegetation cover. Image classification was applied to the original and MNF (minimum noise fraction) transformed dataset with various training sample sizes between 10 and 30 pixels. In order to select the optimal number of the transformed features, we applied SVM, RF and MLC classification to 2–15 MNF transformed bands. In the case of the original bands, SVM and RF classifiers provided high accuracy irrespective of the number of the training pixels. We found that SVM and RF produced the best accuracy when using the first nine MNF transformed bands; involving further features did not increase classification accuracy. SVM and RF provided high accuracies with the transformed bands, especially in the case of the aggregated groups. Even MLC provided high accuracy with 30 training pixels (80.78%), but the use of a smaller training dataset (10 training pixels) significantly reduced the accuracy of classification (52.56%). Our results suggest that in alkali landscapes, the application of SVM is a feasible solution, as it provided the highest accuracies compared to RF and MLC. 
SVM was not sensitive to the training sample size, which makes it an adequate tool when only a limited number of training pixels are available for some classes.
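
The drop in MLC accuracy with only 10 training pixels is consistent with a general property of Gaussian maximum likelihood classifiers: each class needs its own mean vector and full covariance matrix, and a sample covariance estimated from n pixels has rank at most n - 1. The short Python sketch below only works through that arithmetic. It is background on the classifier, not an analysis taken from the paper, and the function name `mlc_parameter_budget` is hypothetical.

```python
def mlc_parameter_budget(n_bands, n_train_pixels):
    """Parameter count for one class of a Gaussian maximum likelihood classifier.

    Assumption (not from the paper): the classifier fits a full, unregularised
    covariance matrix per class from the training pixels of that class.
    """
    # A symmetric n_bands x n_bands covariance has this many free entries.
    cov_params = n_bands * (n_bands + 1) // 2
    # A sample covariance from n pixels has rank at most n - 1.
    max_rank = min(n_bands, n_train_pixels - 1)
    return {
        "mean_params": n_bands,
        "cov_params": cov_params,
        "max_cov_rank": max_rank,
        # The covariance can only be inverted if it can reach full rank.
        "invertible_possible": max_rank == n_bands,
    }


if __name__ == "__main__":
    # Band counts and sample sizes mentioned in the abstract.
    for bands in (9, 15, 128):
        for pixels in (10, 30):
            print(bands, pixels, mlc_parameter_budget(bands, pixels))
```

Under this assumption, 10 pixels can just support 9 transformed bands but not 15 or the 128 original bands, whereas SVM and RF do not depend on inverting a per-class covariance matrix.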


Keywords
grassland, habitat mapping, hyperspectral, maximum likelihood classifier, minimum noise fraction, nature conservation, open landscape, random forest, support vector machine

Wednesday, 18 February 2015

Lengyel & Podani (2015) Journal of Vegetation Science



Abstract
Questions: What is the relative importance of our methodological decisions concerning sampling (plot size) and data analysis (data transformation, resemblance coefficient, hierarchical clustering strategy and number of clusters) in vegetation classification? Are there differences between the conclusions when the full range or only a more practical narrow range of methodological choices is tested? What is the difference between results for actual and random data?
Location: Rock grassland in Hungary.
Methods: The full procedure of vegetation classification was simulated using actual and random data. Variation in classification results was partitioned using distance-based redundancy analysis. The RDA models were subjected to variation partitioning to determine the relative importance of methodological decisions.
Results: RDA models explained more variation in classifications of random than of real data. Classification algorithm, cluster level, data transformation and mean plot size were always among the most significant variables; however, the other variables also had a considerable effect in certain situations.
Conclusions: As the adjusted R² values suggest, the overall effect of methodological decisions on classifications is larger for randomly structured than for actual data, possibly due to a stronger clustering tendency in the latter. The clustering algorithm, cluster level, data transformation and plot size should be chosen most carefully before classification analyses, but any of the examined decisions can significantly affect the result. In addition to the mean, the range of plot sizes should also be carefully delimited during relevé selection for classification studies. The main decision about the classification algorithm is whether a chain-forming or group-forming method is used. Data transformation had a more significant effect on real data than on simulations with random variation, which supports the ability of different abundance scales to reveal different facets of biologically relevant patterns in community composition. The resemblance measure had a relatively weak effect, suggesting that it is not as influential as previously thought.
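
The conclusions rest on adjusted R² values from the RDA models. As a reminder of what the adjustment does, the Python sketch below applies Ezekiel's formula, which is commonly used for redundancy analysis. Whether the paper used exactly this form is an assumption, the function name `adjusted_r2` is hypothetical, and the numbers in the example are invented for illustration.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Ezekiel's adjusted R-squared.

    r2           -- unadjusted proportion of variation explained (0..1)
    n_obs        -- number of observations (here: classifications compared)
    n_predictors -- number of explanatory variables in the model
    """
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors + 1")
    # Penalise the unexplained fraction by the ratio of total to residual
    # degrees of freedom, so adding predictors is not rewarded automatically.
    return 1.0 - (1.0 - r2) * (n_obs - 1) / residual_df


if __name__ == "__main__":
    # Invented example: the same raw R-squared with few vs. many predictors.
    print(adjusted_r2(0.5, n_obs=101, n_predictors=5))
    print(adjusted_r2(0.5, n_obs=101, n_predictors=40))
```

Because the penalty grows with the number of predictors, adjusted values allow models with different numbers of methodological variables to be compared on a common footing.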


Keywords
Data transformation, Flexible clustering, Model selection, Multivariate analysis, Plot size, Resemblance measure